How to Build a Crypto Funding Rate Data Pipeline: Q&A with a Developer

Need to track funding rates across multiple exchanges in one place? You're not alone. Many crypto traders want historical rates, cross-exchange comparisons, and accurate annualized calculations—but free tools often fall short. In this Q&A, I share how I built a complete data pipeline using Python, TimescaleDB, and a static site generator. The system collects data every 8 hours from Bybit and Binance, computes properly compounded annualized rates, and serves snapshots via static JSON files—all hosted for free on Cloudflare Pages.

Why build a custom funding rate data pipeline instead of using existing tools?
What architecture did you choose for the pipeline?
How do you collect funding rate data from exchanges?
How do you correctly annualize funding rates?
How does the snapshot generator work?
What would you do differently if starting over?
What was the final result?

Why build a custom funding rate data pipeline instead of using existing tools?

Existing free tools often fail to deliver what serious traders need: historical data over weeks or months, the ability to compare rates side-by-side across exchanges, and accurate annualized calculations. Most platforms either show only the raw 8-hour rate or annualize it by simple linear multiplication, which understates the compounded figure whenever the rate is positive. I needed a solution that stored every rate change, computed properly compounded annualized figures, and let me compare, say, Bybit's BTC rate against Binance's in one view. Nothing free met those requirements, so I built a custom pipeline.

How to Build a Crypto Funding Rate Data Pipeline: Q&A with a Developer — Source: dev.to

What architecture did you choose for the pipeline?

The pipeline follows a collect-store-serve pattern. A scheduled Python collector runs every 8 hours, pulls the latest funding rates from exchange APIs (no authentication needed for reads), and writes them into a PostgreSQL database with TimescaleDB extensions. Then a separate job generates static JSON snapshots from the database—also three times a day—and triggers a rebuild of the site on Cloudflare Pages. The frontend is built with Astro and reads those JSON files at build time. The key design decision: never query the database from the frontend. This keeps page loads under 50ms globally, imposes zero backend load, and avoids API rate limits entirely.
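As a rough illustration of the scheduling side, here is a minimal sketch of how a job could work out when its next run is due. It assumes settlements fall at 00:00, 08:00, and 16:00 UTC, which is an assumption on my part rather than something the pipeline description states; the name next_settlement is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: settlements fall at 00:00, 08:00 and 16:00 UTC.
SETTLEMENT_HOURS = (0, 8, 16)


def next_settlement(now: datetime) -> datetime:
    """Return the first settlement time strictly after `now`, in UTC."""
    now = now.astimezone(timezone.utc)
    day_start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    for hour in SETTLEMENT_HOURS:
        candidate = day_start + timedelta(hours=hour)
        if candidate > now:
            return candidate
    # Past 16:00 UTC: the next settlement is midnight of the following day.
    return day_start + timedelta(days=1)
```

A scheduler could sleep until that time, then run the collector followed by the snapshot job. A plain cron entry would do the same thing with less code.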

How do you collect funding rate data from exchanges?

Both Bybit and Binance expose public REST endpoints for funding rate history. No API key is required. The collector runs on a cron-like schedule, hits those endpoints for each asset (BTC, ETH, SOL, etc.), and inserts the raw data into TimescaleDB. Each row stores the asset symbol, the exchange name, the raw 8-hour funding rate, the annualized rate (computed before storing), and a timestamp. The database uses hypertables (TimescaleDB's time-based partitioning) so that time-range queries touch only the relevant chunks, which made them noticeably faster than on a plain PostgreSQL table.
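The collection step could look something like the sketch below. It turns each exchange's response into rows with the five fields listed above. The endpoint URLs and response field names reflect my reading of the exchanges' public documentation and should be verified before use. FundingRow, parse_binance, parse_bybit, and fetch_json are hypothetical names, not the pipeline's actual code.

```python
import json
from datetime import datetime, timezone
from typing import NamedTuple
from urllib.request import urlopen

# Assumed public endpoints; check the current exchange docs before relying on them.
BINANCE_URL = "https://fapi.binance.com/fapi/v1/fundingRate?symbol={symbol}&limit=100"
BYBIT_URL = "https://api.bybit.com/v5/market/funding/history?category=linear&symbol={symbol}&limit=100"


class FundingRow(NamedTuple):
    symbol: str
    exchange: str
    rate_8h: float         # decimal fraction, e.g. 0.0001 for 0.01%
    annualized_pct: float  # compounded, in percent
    ts: datetime


def annualize(rate_8h: float) -> float:
    """Compound an 8-hour rate over 3 * 365 periods and return a percentage."""
    return ((1 + rate_8h) ** (3 * 365) - 1) * 100


def _row(symbol: str, exchange: str, rate, ts_ms) -> FundingRow:
    # Both exchanges are assumed to send rates as strings and times in milliseconds.
    rate = float(rate)
    ts = datetime.fromtimestamp(int(ts_ms) / 1000, tz=timezone.utc)
    return FundingRow(symbol, exchange, rate, annualize(rate), ts)


def parse_binance(payload: list) -> list[FundingRow]:
    # Assumed shape: a JSON array of {symbol, fundingRate, fundingTime}.
    return [_row(r["symbol"], "binance", r["fundingRate"], r["fundingTime"]) for r in payload]


def parse_bybit(payload: dict) -> list[FundingRow]:
    # Assumed shape: {"result": {"list": [{symbol, fundingRate, fundingRateTimestamp}]}}.
    items = payload["result"]["list"]
    return [_row(r["symbol"], "bybit", r["fundingRate"], r["fundingRateTimestamp"]) for r in items]


def fetch_json(url: str):
    """Fetch and decode a JSON document from a public endpoint (no API key)."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

For example, parse_binance(fetch_json(BINANCE_URL.format(symbol="BTCUSDT"))) would yield rows ready for insertion. Keeping the parsing separate from the network call makes the normalization easy to test offline.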

How do you correctly annualize funding rates?

Most tools miscalculate by simply multiplying the 8-hour rate by 3 (settlements per day) and then by 365. That ignores compounding. The correct formula treats each 8-hour period as a growth factor: annualized = ((1 + rate_8h) ** (3 * 365) - 1) * 100, where rate_8h is a decimal fraction (0.0001 for 0.01%) and the result is a percentage. The exponent, 3 * 365 = 1,095, is the number of compounding periods per year. For a raw rate of 0.01% per 8 hours, simple linear multiplication gives 10.95% annually, but compounding yields about 11.57%. The gap widens as rates rise, which matters when comparing spreads: raw rates of 0.03% and 0.05% annualize to 32.85% and 54.75% linearly (a 21.9-point spread) but to roughly 38.9% and 72.9% with compounding (a spread of about 34 points). Storing the properly annualized figure upfront ensures every downstream calculation uses the same basis.
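To make the comparison concrete, here is a small sketch that applies both methods to the same raw rates. The function names are illustrative. Rates are passed as decimal fractions and returned as percentages.

```python
PERIODS_PER_YEAR = 3 * 365  # 1,095 eight-hour settlements per year


def annualize_linear(rate_8h: float) -> float:
    """Simple multiplication: ignores compounding."""
    return rate_8h * PERIODS_PER_YEAR * 100


def annualize_compound(rate_8h: float) -> float:
    """Treat each 8-hour period as a growth factor and compound it."""
    return ((1 + rate_8h) ** PERIODS_PER_YEAR - 1) * 100


for pct in (0.01, 0.03, 0.05):
    rate = pct / 100  # convert percent to a decimal fraction
    print(
        f"{pct:.2f}% per 8h -> "
        f"linear {annualize_linear(rate):.2f}%, "
        f"compound {annualize_compound(rate):.2f}%"
    )
```

Running it shows the compounded figure pulling further away from the linear one as the raw rate increases.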

How does the snapshot generator work?

Every 8 hours (aligned with exchange settlement times), a scheduled job queries TimescaleDB for the latest rates across all assets and exchanges. It computes cross-exchange spreads (e.g., difference between Binance and Bybit for the same asset), then writes a separate JSON file per asset. These files are placed in a directory that Astro reads at build time. The snapshot generator also triggers a Cloudflare Pages redeployment so the static site gets the new data. During a build, Astro imports each JSON file as a data module—no runtime database queries, no API calls. The result is a purely static site that loads instantly anywhere in the world.
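The spread-and-write step might be sketched as follows. It assumes the latest rates have already been read from the database into a plain dictionary. The JSON layout, the file naming, and the function names are assumptions for illustration, and the Cloudflare Pages redeploy trigger is left out.

```python
import json
from pathlib import Path


def build_snapshots(latest: dict) -> dict:
    """Map asset -> snapshot dict, given asset -> {exchange: annualized_pct}."""
    snapshots = {}
    for asset, rates in latest.items():
        spread = None
        # Only compute a spread when both exchanges reported a rate.
        if "binance" in rates and "bybit" in rates:
            spread = rates["binance"] - rates["bybit"]
        snapshots[asset] = {
            "asset": asset,
            "rates": rates,
            "spread_binance_minus_bybit": spread,
        }
    return snapshots


def write_snapshots(snapshots: dict, out_dir: Path) -> list:
    """Write one JSON file per asset and return the paths written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for asset, snap in snapshots.items():
        path = out_dir / f"{asset.lower()}.json"
        path.write_text(json.dumps(snap, indent=2, sort_keys=True))
        written.append(path)
    return written
```

Calling write_snapshots(build_snapshots(latest), Path("src/data")) would leave one file per asset in a directory the site build can import from. The src/data path is only an example.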

What would you do differently if starting over?

Three things stand out. First, start with TimescaleDB from day one. I began with plain PostgreSQL and later migrated to hypertables, after which time-series queries like rolling averages became dramatically faster. Second, collect more frequently than you display. I collect every 8 hours but could go hourly. Having higher-resolution data lets you spot intraday patterns even if you only show 8-hour snapshots to users. Third, add WebSocket support for liquidation data early. REST polling works fine for funding rates (they settle only every 8 hours), but liquidations happen in real time and require a persistent connection. Including WebSocket support from the start would avoid a later retrofit.
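For reference, a rolling average of the kind mentioned above can be written as a standard SQL window query. The sketch below runs it against an in-memory SQLite database purely so it works without a server. The table and column names are assumptions, and on PostgreSQL with TimescaleDB the same window clause would apply with that driver's placeholder style.

```python
import sqlite3

# Rolling mean over the current and two preceding settlements (one day of 8h rates).
ROLLING_SQL = """
SELECT ts, rate_8h,
       AVG(rate_8h) OVER (
           PARTITION BY symbol, exchange
           ORDER BY ts
           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
       ) AS rolling_avg
FROM funding_rates
WHERE symbol = ? AND exchange = ?
ORDER BY ts
"""


def rolling_average(conn, symbol: str, exchange: str) -> list:
    """Return (ts, rate_8h, rolling_avg) tuples for one symbol on one exchange."""
    return conn.execute(ROLLING_SQL, (symbol, exchange)).fetchall()


# Stand-in data: SQLite replaces the real database for this illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE funding_rates (ts TEXT, symbol TEXT, exchange TEXT, rate_8h REAL)"
)
conn.executemany(
    "INSERT INTO funding_rates VALUES (?, ?, ?, ?)",
    [
        ("2024-01-01T00:00Z", "BTC", "bybit", 0.0001),
        ("2024-01-01T08:00Z", "BTC", "bybit", 0.0002),
        ("2024-01-01T16:00Z", "BTC", "bybit", 0.0003),
        ("2024-01-02T00:00Z", "BTC", "bybit", 0.0004),
    ],
)
result = rolling_average(conn, "BTC", "bybit")
```

With hourly collection, the window would simply cover more rows per day, which is one reason to store data at a higher resolution than the site displays.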

What was the final result?

The finished product, FundingKai, tracks 10 major assets across multiple exchanges. The data pipeline runs autonomously—no manual intervention since launch. The static site serves funding rate snapshots and comparisons with sub-50ms response times worldwide, hosted entirely on Cloudflare Pages at zero cost. If you're building something similar, the core insight is: separate collection from presentation. Collect into a proper time-series database, then serve static snapshots. This approach scales effortlessly, avoids backend costs, and delivers a fast, reliable user experience.