# Futures Foundation Model (FFM) – Pretrained Backbone v2

A pretrained transformer backbone for futures market structure and regime classification. Trained on ~2.3M bars across 6 instruments (Oct 2020–Apr 2026).
## Model Files

| File | Description |
|---|---|
| `best_backbone.pt` | Backbone weights only (no pretraining heads); use for fine-tuning |
| `best_pretrained.pt` | Backbone + 4 frozen context heads (regime/vol/structure/range); recommended for fine-tuning with context heads |
## Links

- GitHub: [johnamcruz/Futures-Foundation-Model](https://github.com/johnamcruz/Futures-Foundation-Model)
- Fine-tuning framework docs: README
## Usage

```bash
pip install git+https://github.com/johnamcruz/Futures-Foundation-Model.git
```

```python
from huggingface_hub import hf_hub_download
from futures_foundation import FFMConfig, FFMBackbone

# Download the pretrained checkpoint from the Hugging Face Hub
path = hf_hub_download(
    repo_id="johnamcruz/futures-foundation-model",
    filename="best_pretrained.pt",
)

# Build the backbone and load the pretrained weights
config = FFMConfig()
backbone = FFMBackbone(config)
backbone.load_pretrained(path)

embeddings = backbone(features_tensor)  # (batch, 256)
```
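Here `features_tensor` is whatever your feature pipeline produces. As a quick smoke test, a minimal sketch with random data, assuming the backbone takes a `(batch, seq_len, n_features)` float tensor matching the pretraining setup (96 bars × 69 features; see Pretraining Details below). The exact feature layout, including how the `candle_type` index is passed, is defined by the repo's feature builder:

```python
import torch

# Shape assumption only: (batch, seq_len=96, n_features=69). Random floats
# suffice for a smoke test, but the real candle_type column is a
# categorical index, not a continuous value.
features_tensor = torch.randn(8, 96, 69)

backbone.eval()
with torch.no_grad():
    embeddings = backbone(features_tensor)

print(embeddings.shape)  # expected: torch.Size([8, 256])
```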
## Fine-Tuning a Strategy

```python
from futures_foundation.finetune import StrategyLabeler, TrainingConfig, run_walk_forward

class MyStrategyLabeler(StrategyLabeler):
    @property
    def name(self):
        return "my_strategy"

    @property
    def feature_cols(self):
        return ["zone_height", "entry_depth", "risk_norm"]

    def run(self, df_raw, ffm_df, ticker):
        # Derive per-bar strategy features and labels from your own signal logic
        features_df, labels_df = my_signal_logic(df_raw, ffm_df)
        return features_df, labels_df
```

See the full documentation for the complete fine-tuning framework.
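The imported `run_walk_forward` is presumably the entry point that drives walk-forward fine-tuning over the labeled data. The call below is a hypothetical sketch only: the argument order and `TrainingConfig` usage are assumptions, not the framework's confirmed API, so check the docs before copying it:

```python
# Hypothetical wiring; the run_walk_forward signature and TrainingConfig
# fields assumed here are illustrative, not the confirmed framework API.
labeler = MyStrategyLabeler()
config = TrainingConfig()  # defaults assumed; see the docs for real fields

results = run_walk_forward(labeler, config)
```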
## Pretraining Details

- Architecture: 6-layer transformer encoder, 8 heads, 256-dim model, 512-dim FFN (see the sketch after this list)
- Input: 68 ATR-normalized continuous features + a `candle_type` embedding (69 total)
- Sequence length: 96 bars (8 hours of 5-minute data)
- Instruments: ES, NQ, RTY, YM, GC, SI
- Training data: ~2.3M bars, Oct 2020–Apr 2026
- Pretraining tasks: Regime (4-class), Volatility State (4-class), Market Structure (2-class), Range Position (5-class)
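For intuition, those dimensions map onto a standard PyTorch encoder stack as sketched below. This is not the actual FFM implementation (which handles the `candle_type` embedding and output pooling its own way); the linear projection and mean pooling here are simplifying assumptions:

```python
import torch
import torch.nn as nn

# Same-shape stack: 6 layers, 8 attention heads, 256-dim model, 512-dim FFN.
layer = nn.TransformerEncoderLayer(
    d_model=256, nhead=8, dim_feedforward=512, batch_first=True
)
encoder = nn.TransformerEncoder(layer, num_layers=6)

# Project the 69 per-bar features up to the 256-dim model width.
proj = nn.Linear(69, 256)

x = torch.randn(8, 96, 69)   # (batch, 96 bars, 69 features)
h = encoder(proj(x))         # (8, 96, 256) per-bar representations
emb = h.mean(dim=1)          # (8, 256) pooled sequence embedding
```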
## Context Heads (best_pretrained.pt)

The pretrained checkpoint includes 4 frozen context heads that expose explicit market state at inference time:
| Head | Classes | Output |
|---|---|---|
| Regime | Trending Up / Trending Down / Rotational / Volatile | 4-dim softmax |
| Volatility | Low / Normal / Elevated / Extreme | 4-dim softmax |
| Structure | Bullish / Bearish | 2-dim softmax |
| Range Position | 5 quintiles | 5-dim softmax |
These 15 probability dimensions (4 + 4 + 2 + 5) are concatenated into the fusion layer during fine-tuning, giving the signal head explicit, named handles on market state instead of relying on it being implicitly encoded in the embedding.
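Concretely, the fusion input looks like the concatenation below. Tensor names and the ordering of embedding versus context are illustrative assumptions; the real wiring lives in the fine-tuning framework:

```python
import torch

batch = 8
# Stand-ins for the frozen heads' softmax outputs at inference.
regime    = torch.softmax(torch.randn(batch, 4), dim=-1)  # Regime
vol       = torch.softmax(torch.randn(batch, 4), dim=-1)  # Volatility
structure = torch.softmax(torch.randn(batch, 2), dim=-1)  # Structure
range_pos = torch.softmax(torch.randn(batch, 5), dim=-1)  # Range position

context = torch.cat([regime, vol, structure, range_pos], dim=-1)  # (batch, 15)

backbone_emb = torch.randn(batch, 256)  # stand-in for the backbone embedding
fusion_input = torch.cat([backbone_emb, context], dim=-1)  # (batch, 271)
```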
## Why Use the Pretrained Backbone

- Regime changes don't require retraining: the context heads adapt at inference as market conditions shift
- Adding new data is just a re-run: backbone representations are stable, so only the strategy fine-tune needs re-running when new bars arrive
- One backbone, unlimited strategies: CISD, SuperTrend, ORB, and breaker-block strategies all fine-tune on the same backbone
- The 5+ year training window covers the COVID crash, 2021 melt-up, 2022 bear market, 2023 recovery, and 2025 volatility, so new regimes map into the existing embedding space
## License

Apache 2.0