Data Pipeline
Data ingestion, normalization, feature engineering, and data quality controls
Sources
Exchange market data (candles, trades, order books)
DEX pool states and on-chain events (swaps, liquidity changes)
Reference data (asset metadata, oracles)
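A minimal sketch of one record shape the source types listed above could be normalized into before feature engineering; the IngestRecord type, its field names, and the example source strings are illustrative assumptions, not the project's actual schema.

```python
from dataclasses import dataclass, field

# Hypothetical normalized record that CEX market data, DEX pool events,
# and reference data could all be mapped into on ingestion.
@dataclass(frozen=True)
class IngestRecord:
    source: str          # e.g. "exchange_x", "dex_pool_y" (placeholder names)
    kind: str            # "candle" | "trade" | "order_book" | "swap" | "liquidity" | "reference"
    asset: str           # canonical asset identifier
    ts: int              # event timestamp, epoch milliseconds (UTC)
    ingested_at: int     # pipeline receive time, used later for clock-skew checks
    payload: dict = field(default_factory=dict)  # source-specific fields kept verbatim for lineage
    schema_version: str = "1.0.0"                # ties the record to a versioned schema (see Quality and lineage)
```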
Normalization
Time alignment to canonical intervals; clock skew handling (see the alignment sketch after this list)
Missing data handling (gap-filling with confidence flags)
Outlier detection (z-score and robust estimators)
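A sketch of the time-alignment and gap-filling items above, assuming pandas, a UTC-indexed tick frame with price/volume columns, and a simple forward-fill policy; the column names, the skew tolerance, and the numeric confidence flag are assumptions.

```python
import pandas as pd

def align_to_interval(df: pd.DataFrame, interval: str = "1min",
                      max_skew: pd.Timedelta = pd.Timedelta("2s")) -> pd.DataFrame:
    """Align raw ticks to canonical bars and flag gap-filled rows."""
    # Snap timestamps that sit within `max_skew` of a bar boundary onto that
    # boundary (a crude clock-skew correction; a fuller pipeline would also
    # compare against the venue's server time).
    snapped = df.index.round(interval)
    skew = (df.index - snapped).abs()
    df = df.set_axis(snapped.where(skew <= max_skew, df.index))

    bars = df.resample(interval).agg({"price": "last", "volume": "sum"})

    # Gap-fill prices forward, but record which bars were synthesized so
    # downstream consumers can discount them.
    filled = bars["price"].isna()
    bars["price"] = bars["price"].ffill()
    bars["volume"] = bars["volume"].fillna(0.0)
    bars["confidence"] = (~filled).astype(float)  # 1.0 = observed, 0.0 = gap-filled
    return bars
```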
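A sketch of the outlier-detection item, contrasting a plain z-score with a median/MAD-based robust score; the threshold values are illustrative.

```python
import numpy as np

def zscore_outliers(x: np.ndarray, threshold: float = 4.0) -> np.ndarray:
    """Classical z-score: simple, but sensitive to the very outliers it hunts."""
    z = (x - x.mean()) / x.std(ddof=1)
    return np.abs(z) > threshold

def robust_outliers(x: np.ndarray, threshold: float = 4.0) -> np.ndarray:
    """Median/MAD score; 1.4826 rescales MAD to a std-dev-like unit."""
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    modified_z = (x - med) / (1.4826 * mad + 1e-12)
    return np.abs(modified_z) > threshold
```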
Feature engineering
Price/volume transforms (returns, volatility, microstructure metrics); see the feature sketch after this list
Order book features (depth imbalance, spread dynamics)
On-chain flow features (pool TVL changes, swap volume bursts)
Sentiment features (level, velocity, decay)
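A sketch, assuming pandas, of the price/volume and order-book items above: log returns, rolling volatility, a volume z-score, top-of-book spread, and depth imbalance. The column names (price, volume, best_bid, best_ask, bid_depth, ask_depth) and the window length are assumptions.

```python
import numpy as np
import pandas as pd

def price_volume_features(bars: pd.DataFrame, window: int = 30) -> pd.DataFrame:
    """Log returns, rolling realized volatility, and a rolling volume z-score."""
    out = pd.DataFrame(index=bars.index)
    out["log_return"] = np.log(bars["price"]).diff()
    out["volatility"] = out["log_return"].rolling(window).std()
    vol_mean = bars["volume"].rolling(window).mean()
    vol_std = bars["volume"].rolling(window).std()
    out["volume_z"] = (bars["volume"] - vol_mean) / vol_std
    return out

def order_book_features(book: pd.DataFrame) -> pd.DataFrame:
    """Spread (in basis points) and depth imbalance from top-of-book snapshots."""
    out = pd.DataFrame(index=book.index)
    mid = (book["best_bid"] + book["best_ask"]) / 2
    out["spread_bps"] = (book["best_ask"] - book["best_bid"]) / mid * 1e4
    out["depth_imbalance"] = (book["bid_depth"] - book["ask_depth"]) \
        / (book["bid_depth"] + book["ask_depth"])
    return out
```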
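A sketch of the sentiment item: level as an exponentially weighted (time-decayed) average of scored events on the canonical grid, velocity as its first difference. The 30-minute half-life and the input of pre-scored events are assumptions.

```python
import pandas as pd

def sentiment_features(events: pd.Series, interval: str = "1min",
                       half_life: str = "30min") -> pd.DataFrame:
    """Turn irregular sentiment scores (indexed by event time) into a
    decayed level and its velocity on the canonical interval grid."""
    # Sum event scores per bar, then apply a time-based exponential weighting
    # so older sentiment fades instead of persisting forever.
    per_bar = events.resample(interval).sum()
    level = per_bar.ewm(halflife=pd.Timedelta(half_life), times=per_bar.index).mean()
    out = pd.DataFrame({"sentiment_level": level})
    out["sentiment_velocity"] = out["sentiment_level"].diff()
    return out
```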
Quality and lineage
Schema versioning for backwards-compatible model inputs
Data quality SLAs with alerting; quarantine on breach (see the SLA sketch after this list)
Reproducible snapshots for training and backtesting
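A sketch of the quality-SLA item: a few per-batch threshold checks, with the batch quarantined and an alert raised on any breach. The thresholds, the alert/quarantine callables, and the reuse of the confidence column from the alignment sketch are assumptions; the index is assumed to be tz-aware UTC.

```python
from dataclasses import dataclass
from typing import Callable

import pandas as pd

@dataclass
class QualitySLA:
    max_gap_ratio: float = 0.05      # tolerated share of gap-filled bars
    max_staleness_s: float = 120.0   # newest bar must be at most this old
    min_rows: int = 100

def check_and_route(bars: pd.DataFrame, sla: QualitySLA,
                    alert: Callable[[str], None],
                    quarantine: Callable[[pd.DataFrame, str], None]) -> bool:
    """Return True if the batch passes; otherwise alert and quarantine it."""
    breaches = []
    if len(bars) < sla.min_rows:
        breaches.append(f"too few rows: {len(bars)} < {sla.min_rows}")
    confidence = bars.get("confidence", pd.Series(1.0, index=bars.index))
    gap_ratio = 1.0 - confidence.mean()
    if gap_ratio > sla.max_gap_ratio:
        breaches.append(f"gap ratio {gap_ratio:.2%} exceeds {sla.max_gap_ratio:.2%}")
    staleness = (pd.Timestamp.now(tz="UTC") - bars.index.max()).total_seconds()
    if staleness > sla.max_staleness_s:
        breaches.append(f"stale data: newest bar is {staleness:.0f}s old")

    if breaches:
        msg = "; ".join(breaches)
        alert(msg)
        quarantine(bars, msg)
        return False
    return True
```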
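A sketch of the reproducible-snapshot item: persist the training/backtest slice and write a manifest with a content hash and the schema version, so a run can be tied back to exactly the data it saw. The file layout, manifest fields, and parquet format (which needs a pyarrow or fastparquet engine) are assumptions.

```python
import hashlib
import json
from pathlib import Path

import pandas as pd

def snapshot(bars: pd.DataFrame, out_dir: Path, name: str,
             schema_version: str = "1.0.0") -> dict:
    """Persist a data slice plus a manifest describing it."""
    out_dir.mkdir(parents=True, exist_ok=True)
    data_path = out_dir / f"{name}.parquet"
    bars.to_parquet(data_path)

    digest = hashlib.sha256(data_path.read_bytes()).hexdigest()
    manifest = {
        "name": name,
        "rows": len(bars),
        "start": str(bars.index.min()),
        "end": str(bars.index.max()),
        "schema_version": schema_version,
        "sha256": digest,   # pins the exact bytes used for training/backtests
    }
    (out_dir / f"{name}.manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest
```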