Why Finance Analysts Need SQL
There comes a point in every finance analyst’s career when the data you need is not in a spreadsheet. It lives in a database, and getting it requires either waiting three days for the data team to run a query or learning to do it yourself. The second option is almost always faster.
SQL is not programming in the traditional sense. It is a structured way to ask a database for specific information. If you can write a SUMIFS formula in Excel, you can learn SQL. The logic is the same: filter rows, group data, and aggregate numbers. The syntax is just different.
This guide covers the queries that finance analysts use most frequently. It is not a comprehensive SQL course. It is the 20 percent of SQL that handles 80 percent of finance use cases.
Setting Up Your Environment
Most finance teams connect to databases through one of these tools:
- SQL Server Management Studio (SSMS) for Microsoft SQL Server databases
- pgAdmin or DBeaver for PostgreSQL databases
- MySQL Workbench for MySQL databases
- Mode, Metabase, or Looker for browser-based SQL access
Ask your IT or data engineering team for read-only access to the relevant databases. Emphasize “read-only” because finance should never have write access to production systems. You only need to query data, not modify it.
The Core Query Structure
Every SQL query follows the same basic pattern:
SELECT columns_you_want
FROM table_name
WHERE conditions
GROUP BY grouping_columns
HAVING aggregate_conditions
ORDER BY sort_columns
Think of it this way: SELECT is like choosing which columns to show in your Excel report. FROM is the table you are pulling from. WHERE is your filter. GROUP BY is how you subtotal. ORDER BY is your sort.
Essential Queries for Finance
Revenue by Month
The most common finance query aggregates revenue by time period:
SELECT
DATE_TRUNC('month', transaction_date) AS revenue_month,
SUM(amount) AS total_revenue
FROM revenue_transactions
WHERE transaction_date >= '2025-01-01'
AND transaction_date < '2026-01-01'
GROUP BY DATE_TRUNC('month', transaction_date)
ORDER BY revenue_month
DATE_TRUNC groups dates into months, similar to using MONTH() and YEAR() in an Excel pivot table. The WHERE clause sets the date range, and SUM does the aggregation.
Revenue by Product and Region
Add more dimensions by including additional columns in SELECT and GROUP BY:
SELECT
product_category,
region,
DATE_TRUNC('month', transaction_date) AS revenue_month,
SUM(amount) AS total_revenue,
COUNT(DISTINCT customer_id) AS customer_count
FROM revenue_transactions
WHERE transaction_date >= '2025-01-01'
GROUP BY product_category, region, DATE_TRUNC('month', transaction_date)
ORDER BY revenue_month, product_category, region
COUNT(DISTINCT customer_id) gives you unique customer counts, something that is surprisingly difficult to calculate correctly in Excel with large datasets.
Budget vs. Actual Comparison
When actuals and budget data live in different tables, use a JOIN to bring them together:
SELECT
a.department,
a.account_code,
a.period,
a.actual_amount,
b.budget_amount,
a.actual_amount - b.budget_amount AS variance,
CASE
WHEN b.budget_amount = 0 THEN NULL
ELSE (a.actual_amount - b.budget_amount) / b.budget_amount * 100
END AS variance_pct
FROM (
SELECT department, account_code, period, SUM(amount) AS actual_amount
FROM gl_actuals
GROUP BY department, account_code, period
) a
LEFT JOIN (
SELECT department, account_code, period, SUM(amount) AS budget_amount
FROM budget_data
GROUP BY department, account_code, period
) b ON a.department = b.department
AND a.account_code = b.account_code
AND a.period = b.period
ORDER BY a.department, a.account_code, a.period
The LEFT JOIN ensures you see all actual line items even if there is no corresponding budget entry. The CASE statement handles division by zero, similar to using IFERROR in Excel.
Running Totals (Year-to-Date)
Window functions are one of SQL’s most powerful features for finance. A running total calculates cumulative values across an ordered set:
SELECT
period,
monthly_revenue,
SUM(monthly_revenue) OVER (ORDER BY period) AS ytd_revenue
FROM (
SELECT
DATE_TRUNC('month', transaction_date) AS period,
SUM(amount) AS monthly_revenue
FROM revenue_transactions
WHERE transaction_date >= '2026-01-01'
GROUP BY DATE_TRUNC('month', transaction_date)
) monthly
ORDER BY period
The OVER (ORDER BY period) clause tells SQL to compute a cumulative sum ordered by period. This replaces the manual running total formulas you would build in Excel.
Year-Over-Year Comparison
Use the LAG window function to pull the prior year’s value alongside the current year:
SELECT
period,
revenue,
LAG(revenue, 12) OVER (ORDER BY period) AS prior_year_revenue,
revenue - LAG(revenue, 12) OVER (ORDER BY period) AS yoy_change
FROM (
SELECT
DATE_TRUNC('month', transaction_date) AS period,
SUM(amount) AS revenue
FROM revenue_transactions
GROUP BY DATE_TRUNC('month', transaction_date)
) monthly
ORDER BY period
LAG(revenue, 12) looks back 12 rows, which corresponds to 12 months when data is grouped monthly. This is the SQL equivalent of an OFFSET or EDATE reference in Excel.
Top Expense Categories
Identify the largest cost drivers with a ranked query:
SELECT
expense_category,
SUM(amount) AS total_expense,
SUM(amount) * 100.0 / SUM(SUM(amount)) OVER () AS pct_of_total
FROM gl_actuals
WHERE account_type = 'Expense'
AND period >= '2026-01-01'
GROUP BY expense_category
ORDER BY total_expense DESC
LIMIT 20
The window function SUM(SUM(amount)) OVER () calculates the grand total across all categories, allowing you to compute each category’s share as a percentage.
Duplicate Detection for Reconciliation
Finding duplicate transactions is a common reconciliation task:
SELECT
vendor_name,
invoice_number,
amount,
transaction_date,
COUNT(*) AS occurrence_count
FROM ap_transactions
GROUP BY vendor_name, invoice_number, amount, transaction_date
HAVING COUNT(*) > 1
ORDER BY occurrence_count DESC
The HAVING clause filters groups after aggregation, showing only combinations that appear more than once. This is far faster than using COUNTIFS and conditional formatting in Excel to find duplicates in a large dataset.
Common Table Expressions for Complex Analysis
When a query gets complex, break it into named steps using CTEs (Common Table Expressions):
WITH monthly_revenue AS (
SELECT
DATE_TRUNC('month', transaction_date) AS period,
SUM(amount) AS revenue
FROM revenue_transactions
GROUP BY DATE_TRUNC('month', transaction_date)
),
monthly_expenses AS (
SELECT
DATE_TRUNC('month', transaction_date) AS period,
SUM(amount) AS expenses
FROM gl_actuals
WHERE account_type = 'Expense'
GROUP BY DATE_TRUNC('month', transaction_date)
)
SELECT
r.period,
r.revenue,
e.expenses,
r.revenue - e.expenses AS net_income
FROM monthly_revenue r
JOIN monthly_expenses e ON r.period = e.period
ORDER BY r.period
CTEs make complex queries readable and maintainable. Think of them as named ranges in Excel: they do not change the result, but they make the logic much easier to follow.
From Query to Report
Once you write a query, you need to get the results into a usable format. The common approaches are:
- Copy and paste into Excel for one-time analysis
- Export to CSV and link to your model via Power Query
- Connect directly from Power BI or Tableau using the query as a data source
- Save as a view in the database so it refreshes automatically
For recurring reports, saving the query as a database view or connecting it directly to your BI tool eliminates the manual export step entirely.
Practical Tips for Finance SQL Users
Always filter by date. Querying an entire table without a date range can return millions of rows and slow down the database for everyone. Apply a WHERE clause on the date column first.
Test with LIMIT. Add LIMIT 100 (or TOP 100 in SQL Server) to the end of your query while developing. This returns only the first 100 rows so you can verify the logic before running the full query.
Comment your queries. Use double dashes for single-line comments. If you save queries in a file for reuse, add comments explaining the business logic, not just the technical steps.
Start simple and add complexity. Write the most basic version of your query first. Verify the numbers make sense. Then add joins, window functions, and additional filters one at a time. Debugging a 50-line query is much harder than building it up in stages.