Automating Financial Reports with Python

The Case for Automation in Finance

Every finance team has at least one report that consumes a disproportionate amount of time. Maybe it is the weekly cash position summary that requires logging into three bank portals, downloading CSVs, consolidating them in Excel, and emailing the result to the treasury team. Maybe it is the monthly department spending report that involves pulling GL data, matching it to budgets, building variance tables, and formatting everything for presentation.

These reports follow the same steps every time. The logic does not change. The format does not change. Only the data changes. This is exactly the kind of work that Python handles exceptionally well.

You do not need to become a software engineer to automate financial reports. You need to learn a handful of Python libraries, understand the basic workflow, and be willing to invest a few hours upfront to save dozens of hours over the following months.

The Automation Workflow

Every automated financial report follows the same four-stage pipeline:

Extract data from source systems
Transform the data into the required structure
Generate the output report
Distribute the report to stakeholders

Python has mature libraries for each stage. The key is connecting them into a script that runs end to end with minimal manual intervention.

Essential Python Libraries for Finance

pandas

The workhorse of data manipulation in Python. pandas provides DataFrames, which are essentially programmable spreadsheets. If you can use Excel pivot tables and SUMIFS, you can learn pandas. It handles reading CSVs, filtering, grouping, joining tables, and calculating new columns.

openpyxl

This library reads and writes Excel files with full formatting support. It lets you populate a pre-formatted Excel template with fresh data, preserving charts, conditional formatting, and cell styles. This is critical for finance because stakeholders expect polished Excel output, not raw CSV files.

xlsxwriter

An alternative to openpyxl focused on creating new Excel files from scratch. It offers fine-grained control over formatting, charts, and layout. Use it when you are building a report template programmatically rather than filling in an existing one.

smtplib and email

Built-in Python libraries for sending emails. After generating a report, you can attach it to an email and send it automatically, completing the full automation loop.

sqlalchemy

Connects Python to databases. If your source data lives in SQL Server, PostgreSQL, or MySQL, sqlalchemy lets you run queries directly from your Python script and load results into pandas DataFrames.

Step-by-Step: Automating a Monthly Variance Report

Step 1: Extract Data

Start by pulling the data you need. In this example, we read actuals from a database and budget from an Excel file:

import pandas as pd
from sqlalchemy import create_engine

# Connect to the database
engine = create_engine('postgresql://user:password@host:5432/finance_db')

# Pull actuals
actuals = pd.read_sql("""
    SELECT department, account_code, period, SUM(amount) as actual_amount
    FROM gl_actuals
    WHERE period >= '2026-01-01'
    GROUP BY department, account_code, period
""", engine)

# Pull budget from Excel
budget = pd.read_excel('budget_2026.xlsx', sheet_name='Monthly')

Store database credentials in environment variables or a configuration file rather than hardcoding them in the script. This is a basic security practice that also makes the script portable across environments.

Step 2: Transform the Data

Clean and merge the datasets, then calculate variances:

# Merge actuals and budget
report = actuals.merge(
    budget,
    on=['department', 'account_code', 'period'],
    how='left'
)

# Calculate variance
report['variance'] = report['actual_amount'] - report['budget_amount']
report['variance_pct'] = report['variance'] / report['budget_amount']

# Create summary by department
dept_summary = report.groupby('department').agg(
    total_actual=('actual_amount', 'sum'),
    total_budget=('budget_amount', 'sum')
).reset_index()

dept_summary['total_variance'] = dept_summary['total_actual'] - dept_summary['total_budget']
dept_summary['variance_pct'] = dept_summary['total_variance'] / dept_summary['total_budget']

This replaces the SUMIFS, INDEX-MATCH, and manual aggregation steps that you would normally do in Excel. The logic is explicit and repeatable.

Step 3: Generate the Report

Write the results to a formatted Excel file:

from openpyxl import load_workbook
from openpyxl.styles import Font, PatternFill, numbers

# Load the template
wb = load_workbook('variance_template.xlsx')
ws = wb['Summary']

# Write department summary data
for idx, row in dept_summary.iterrows():
    excel_row = idx + 3  # Start at row 3 to leave room for headers
    ws.cell(row=excel_row, column=1, value=row['department'])
    ws.cell(row=excel_row, column=2, value=row['total_actual'])
    ws.cell(row=excel_row, column=3, value=row['total_budget'])
    ws.cell(row=excel_row, column=4, value=row['total_variance'])

    # Apply red fill for negative variances
    variance_cell = ws.cell(row=excel_row, column=4)
    if row['total_variance'] < 0:
        variance_cell.fill = PatternFill(start_color='FFCCCC', fill_type='solid')

# Write the detail tab
ws_detail = wb['Detail']
for idx, row in report.iterrows():
    excel_row = idx + 2
    ws_detail.cell(row=excel_row, column=1, value=row['department'])
    ws_detail.cell(row=excel_row, column=2, value=row['account_code'])
    ws_detail.cell(row=excel_row, column=3, value=row['period'].strftime('%Y-%m'))
    ws_detail.cell(row=excel_row, column=4, value=row['actual_amount'])
    ws_detail.cell(row=excel_row, column=5, value=row['budget_amount'])
    ws_detail.cell(row=excel_row, column=6, value=row['variance'])

# Save the report
output_filename = f"variance_report_{pd.Timestamp.now().strftime('%Y%m')}.xlsx"
wb.save(output_filename)

Using a template file is the key insight here. Design your template once in Excel with all the formatting, headers, logos, and chart placeholders. The Python script populates the data cells while everything else stays intact.

Step 4: Distribute the Report

Send the report via email automatically:

import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.base import MIMEBase
from email.mime.text import MIMEText
from email import encoders

msg = MIMEMultipart()
msg['From'] = '[email protected]'
msg['To'] = '[email protected]'
msg['Subject'] = f'Monthly Variance Report - {pd.Timestamp.now().strftime("%B %Y")}'

body = """Hi team,

Please find attached the monthly variance report. Key highlights:

- Total actual spend: ${:,.0f}
- Total budget: ${:,.0f}
- Net variance: ${:,.0f}

Reach out with any questions.

Best,
Finance Team
""".format(
    dept_summary['total_actual'].sum(),
    dept_summary['total_budget'].sum(),
    dept_summary['total_variance'].sum()
)

msg.attach(MIMEText(body, 'plain'))

# Attach the Excel file
with open(output_filename, 'rb') as f:
    attachment = MIMEBase('application', 'octet-stream')
    attachment.set_payload(f.read())
    encoders.encode_base64(attachment)
    attachment.add_header('Content-Disposition', f'attachment; filename={output_filename}')
    msg.attach(attachment)

# Send
server = smtplib.SMTP('smtp.company.com', 587)
server.starttls()
server.login('[email protected]', 'app_password')
server.send_message(msg)
server.quit()

Scheduling the Script

A script that you have to remember to run is only half-automated. Use a scheduler to run it automatically:

Windows Task Scheduler for scripts running on a local machine
cron on Linux or Mac servers
Airflow or Prefect for more complex workflows with dependencies
Cloud Functions (AWS Lambda, Azure Functions) for serverless execution

For most finance teams, Windows Task Scheduler or a simple cron job is sufficient. Set the script to run on the second business day of each month, after the close is complete, and the report generates and distributes itself.

Error Handling and Logging

Automated scripts need to fail gracefully. Add try-except blocks around critical operations and log the results:

import logging

logging.basicConfig(
    filename='report_log.txt',
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)

try:
    actuals = pd.read_sql(query, engine)
    logging.info(f"Pulled {len(actuals)} rows of actuals data")
except Exception as e:
    logging.error(f"Failed to pull actuals: {e}")
    # Send alert email to the finance team
    raise

If the database is down or the budget file is missing, the script should log the error and notify someone rather than silently producing a broken report.

Building Your Automation Portfolio

Start with one report. Pick the most time-consuming, least intellectually stimulating report you produce. Automate it end to end. Once it runs reliably for two or three cycles, move to the next one.

Common candidates for early automation include:

Weekly cash position summaries
Monthly department spending reports
Quarterly board deck data preparation
Daily revenue flash reports
Headcount reconciliation between HR and finance systems

Each automated report frees up hours that you can redirect toward analysis, business partnering, and strategic work. Over time, the compounding effect is significant. A finance team that automates five recurring reports saves hundreds of hours per year, time that translates directly into higher-value output.