How automating a global ETF research firm's data pipeline freed up the team to focus on analysis instead of production.
A global ETF research firm produced 20+ recurring data products every week — market summaries, flow reports, asset class breakdowns — across a $10 trillion+ dataset spanning thousands of funds worldwide. Each one was assembled by hand: pull data, clean it, format it, check it, send it.
The team was spending most of their time producing reports rather than analysing them. Clients were waiting. Errors were creeping in from manual handling. And as the client list grew, the process wasn't scaling.
We rebuilt the entire data pipeline in Python — automated ingestion, cleaning, validation, transformation, and output formatting. Each product became a scheduled job: it runs, produces the output in the right format, validates it against expected ranges, and delivers it.
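To make the shape of this concrete, here is a minimal sketch of one such job in Python. All names (`run_job`, `EXPECTED_RANGES`, the metric labels) are illustrative, not the firm's actual code — the point is the pattern: validate against expected ranges before anything leaves the pipeline.

```python
# Sketch of one scheduled pipeline job: ingest -> clean -> validate -> deliver.
# Names and thresholds here are hypothetical, for illustration only.

def validate(rows, expected_ranges):
    """Return rows whose value falls outside the expected range for their metric."""
    anomalies = []
    for row in rows:
        low, high = expected_ranges[row["metric"]]
        if not (low <= row["value"] <= high):
            anomalies.append(row)
    return anomalies

def run_job(rows, expected_ranges):
    """Validate the cleaned data; deliver only if every value is in range."""
    anomalies = validate(rows, expected_ranges)
    if anomalies:
        # Hold the product for human review rather than sending bad data out.
        return {"status": "needs_review", "anomalies": anomalies}
    return {"status": "delivered", "rows": len(rows)}

# Hypothetical expected range: weekly net flows between -$50bn and +$50bn.
EXPECTED_RANGES = {"weekly_flow_usd_bn": (-50.0, 50.0)}

rows = [{"metric": "weekly_flow_usd_bn", "value": 12.3}]
print(run_job(rows, EXPECTED_RANGES))  # in range -> delivered
```

A value outside the range (say, 120.0) would flip the status to `needs_review`, which is exactly the behaviour the firm wanted: the job fails safe instead of shipping an anomaly to clients.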
We also built a lightweight monitoring layer so the team could see at a glance which products had run, which had flagged anomalies, and which needed human review.
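The monitoring layer can be as simple as rolling each job's result into a status summary. This is a hedged sketch, assuming each job reports a status like the one above — the field names and products are invented for illustration:

```python
# Sketch of the at-a-glance monitoring roll-up: count products by status.
# Product names and statuses are hypothetical examples.
from collections import Counter

def summarise(job_results):
    """Count job outcomes so the team can see run/flagged/review at a glance."""
    return Counter(r["status"] for r in job_results)

results = [
    {"product": "market_summary",  "status": "delivered"},
    {"product": "flow_report",     "status": "needs_review"},
    {"product": "asset_breakdown", "status": "delivered"},
]
print(summarise(results))  # Counter({'delivered': 2, 'needs_review': 1})
```

In practice this feeds a small dashboard, but the underlying idea is just this roll-up: every product reports its state, and anything flagged `needs_review` gets a human before it gets a client.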
“Automation without validation is dangerous. The most important part of this build wasn't the automation itself — it was building in the checks that caught when something unexpected happened in the data before it went out to clients.”
Book a free 30-minute call — we'll map out exactly what's possible for your business.