Regime Model Specification
This document defines the preparatory specification for the regime-aware model layer. It covers ENG-5241, ENG-5242, ENG-5243, ENG-5244, ENG-5245, ENG-5251, and ENG-5252.
It is a specification and review artifact only. It does not implement model changes or change current UI behavior.
Purpose
The model layer should convert Gordon's market thesis into auditable diagnostics:
Where is spot-vol behavior changing, does the options surface reflect that
change, and should the conditional model be trusted right now?
The model should not simply label BTC as being in one global regime. The sponsor discussion made clear that the effect may live in specific parts of the surface: front month, a strike band, a delta bucket, or a smile segment.
Gordon's Thesis To Encode
Recent BTC behavior may have shifted locally from:
spot up -> vol down
spot down -> vol up
to:
spot up -> vol up
The important divergence is that the risk reversal may still show the older equity-style skew pattern while realized local spot-vol behavior is starting to differ. That gap can be alpha, but it can also be noise. The model therefore needs evidence, confidence, and surface-location detail.
Ticket Coverage
| Ticket | Model concern | Preparatory output |
|---|---|---|
| ENG-5241 | Lookback and post-IBIT policy. | Configurable lookback policy and historical relevance rules. |
| ENG-5242 | Spot-vol regime sensitivity by surface location. | Segmented classification by tenor, delta, moneyness, and smile segment. |
| ENG-5243 | Smile-deviation matrix audit. | Decomposition of broad vol level, skew, curvature, tenor, and sparse-cell effects. |
| ENG-5244 | Risk-reversal versus realized spot-vol divergence. | Divergence states and decision interpretation. |
| ENG-5245 | Conditional-vs-vanilla danger zones. | Strike-band probability gap and warning contract. |
| ENG-5251 | Model Lab provenance and scenario tuning. | Parameter provenance and scenario review contract. |
| ENG-5252 | Mixed thesis and low confidence. | First-class low-confidence state instead of forced conviction. |
Lookback Policy
Current pages use inconsistent historical windows, including approximately 730-day, 365-day, 130-day, and 90-day views. This inconsistency needs to become explicit model configuration.
Gordon's comment on post-IBIT history implies this policy:
- older BTC option history is not useless
- older history should not be treated as structurally identical to the current IBIT/liquid-options era
- Model Lab should allow the user to compare longer history, post-IBIT history, and recency-weighted history
- production defaults should be stable and documented
- scenario overrides should be visible and not silently promoted to production defaults
Recommended model modes:
| Mode | Use | Interpretation |
|---|---|---|
| production_default | Normal dashboard read. | Uses approved defaults and shows source/version. |
| post_ibit_focus | Test modern options-market structure. | Discounts or filters older market-structure observations. |
| long_history_context | Stress broader historical behavior. | Useful for context, but may dilute current regime. |
| recent_local | Detect fast regime transitions. | Higher sensitivity, lower sample size, more fragile. |
| scenario_override | Gordon/user changes assumptions. | Exploratory only unless approved as default. |
Every regime output should show:
- lookback window
- sample size
- data start and end dates
- whether old/pre-IBIT history is included
- weighting method if recency-weighted
- scenario override state
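The disclosure list above could travel with every regime output as a typed record. The following is a minimal Python sketch under stated assumptions: the names `LookbackMode`, `LookbackDisclosure`, `window_days`, `includes_pre_ibit`, and `weighting_method` are hypothetical, the validation checks are not defined by this specification, and the example values are illustrative only.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class LookbackMode(Enum):
    """Recommended model modes from the table above."""
    PRODUCTION_DEFAULT = "production_default"
    POST_IBIT_FOCUS = "post_ibit_focus"
    LONG_HISTORY_CONTEXT = "long_history_context"
    RECENT_LOCAL = "recent_local"
    SCENARIO_OVERRIDE = "scenario_override"


@dataclass(frozen=True)
class LookbackDisclosure:
    """Hypothetical record of what every regime output should show."""
    mode: LookbackMode
    window_days: int
    sample_size: int
    data_start: date
    data_end: date
    includes_pre_ibit: bool
    weighting_method: Optional[str]  # None means equal weighting (assumption)
    scenario_override: bool

    def __post_init__(self) -> None:
        # Assumed sanity checks; the spec does not define validation rules.
        if self.window_days <= 0:
            raise ValueError("window_days must be positive")
        if self.sample_size < 0:
            raise ValueError("sample_size must be non-negative")
        if self.data_start > self.data_end:
            raise ValueError("data_start must not be after data_end")


# Illustrative values only, not real production settings.
example = LookbackDisclosure(
    mode=LookbackMode.RECENT_LOCAL,
    window_days=90,
    sample_size=62,
    data_start=date(2025, 1, 2),
    data_end=date(2025, 4, 1),
    includes_pre_ibit=False,
    weighting_method=None,
    scenario_override=False,
)
```

Freezing the record is a design choice in this sketch: a disclosure attached to an output should not be mutated after the output is produced.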
Surface-Location Spot-Vol Regime Detection
The spot-vol regime detector should classify behavior by surface location, not only globally.
Dimensions:
| Dimension | Examples |
|---|---|
| Tenor / DTE | 7d, 14d, 30d, 60d, 90d, front month, next listed expiry. |
| Moneyness | OTM put, ATM, OTM call, strike band such as 82k-95k. |
| Delta bucket | 10d put, 25d put, ATM, 25d call, 10d call. |
| Smile segment | downside wing, put skew, belly, call wing, upside call area. |
| Time slice | trailing 30d, trailing 90d, post-event window, scenario window. |
Classification states:
| State | Meaning |
|---|---|
| spot_up_vol_down | Classic recent regime; vol falls as spot rises. |
| spot_down_vol_up | Crash-risk insurance behavior dominates. |
| spot_up_vol_up | Local positive spot-vol sensitivity. |
| decorrelated | Spot and vol relationship is weak or unstable. |
| mixed_by_surface | Different surface regions produce conflicting classifications. |
| insufficient_data | Not enough observations for this surface location. |
| stale_source | Inputs exist but are too stale to trust. |
Minimum evidence fields:
surface_regime = {
    currency,
    as_of,
    tenor_bucket,
    dte_range,
    delta_bucket,
    moneyness_bucket,
    smile_segment,
    lookback_mode,
    sample_size,
    spot_return_measure,
    iv_change_measure,
    correlation,
    slope,
    classification,
    p_value_or_confidence_proxy,
    source_quality,
    caveat
}
The exact estimator can evolve. The product need is stable: show whether the observed behavior is global or local, and whether it is robust enough to affect trade warnings.
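The global-versus-local question can be illustrated independently of the estimator. The sketch below rolls per-location classifications up into one surface-level read. It is a hedged illustration, not a prescribed method: the function name `summarize_surface`, the location keys, and the rule that any disagreement among usable locations yields `mixed_by_surface` are all assumptions.

```python
from collections import Counter
from typing import Dict, Mapping, Tuple

# States that carry no usable evidence for a location (from the table above).
UNUSABLE_STATES = frozenset({"insufficient_data", "stale_source"})


def summarize_surface(by_location: Mapping[str, str]) -> Tuple[str, Dict[str, int]]:
    """Roll per-location classifications up into one surface-level read.

    Returns (state, counts), where counts tallies the usable classifications.
    """
    usable = [s for s in by_location.values() if s not in UNUSABLE_STATES]
    counts = dict(Counter(usable))
    if not usable:
        # Assumption: with no usable location, report insufficient_data.
        return "insufficient_data", counts
    if len(counts) == 1:
        # Every usable location agrees, so the behavior looks global.
        return usable[0], counts
    # Usable locations disagree, so the behavior is local rather than global.
    return "mixed_by_surface", counts


# Hypothetical location keys in "tenor/delta bucket" form.
locations = {
    "30d/25d_call": "spot_up_vol_up",
    "30d/25d_put": "spot_up_vol_down",
    "90d/atm": "insufficient_data",
}
state, counts = summarize_surface(locations)
```

Returning the tally alongside the state keeps the read auditable: a reviewer can see how many locations supported each classification.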
Risk-Reversal Versus Realized Spot-Vol Divergence
The model should explicitly compare what the risk reversal implies with what realized local spot-vol behavior is doing.
| State | Interpretation |
|---|---|
| old_regime_confirmed | Risk reversal and realized behavior both imply classic spot-down/vol-up, spot-up/vol-down behavior. |
| new_regime_confirmed | Risk reversal and realized behavior both support spot-up/vol-up or upside-vol repricing. |
| divergence | Risk reversal still reflects old regime while realized behavior shows local spot-up/vol-up. |
| surface_mixed | Divergence exists in some tenors/delta buckets but not others. |
| insufficient_data | Cannot compare reliably. |
The divergence state should not automatically mean "trade now." It means the model found a potentially important mismatch that needs confidence, surface location, and trust-engine gating.
Conditional-Vs-Vanilla Danger Zones
The conditional distribution page resonated with Gordon because it translated the thesis into a clear question:
Where does the conditional model assign materially more probability than
the vanilla market-implied surface, and are we short options there?
For the sponsor discussion, the notable example was the approximate 85k-90k area on a 30-day tenor. The model should generalize this as a strike-band detection problem.
Minimum danger-zone output:
danger_zone = {
    currency,
    as_of,
    tenor_bucket,
    strike_low,
    strike_high,
    vanilla_probability,
    conditional_probability,
    probability_gap,
    gap_materiality,
    trust_state,
    source_quality,
    message,
    warning_severity
}
Interpretation rules:
| Condition | Output |
|---|---|
| Conditional probability materially above vanilla and trust is acceptable. | Warn that shorting options in the band is dangerous. |
| Conditional probability above vanilla but trust is weak. | Show low-confidence caveat, not a strong warning. |
| Conditional probability close to vanilla. | No edge or neutral read. |
| Vanilla probability above conditional. | Market may already overprice that band, subject to trust. |
| Inputs are missing or unstable. | Suppress strong wording and show unavailable/low-confidence reason. |
Warning language should remain risk-control language, not trade advice.
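The interpretation rules can be read as a small decision function. The sketch below is one possible encoding, assuming that only the `trusted` state counts as acceptable trust and using a placeholder materiality threshold of 0.05; the final threshold is an open decision. The severity labels (`warning`, `low_confidence`, `neutral`, `market_rich`, `unavailable`) and the message strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DangerZoneRead:
    probability_gap: Optional[float]
    warning_severity: str
    message: str


def read_danger_zone(
    vanilla_probability: Optional[float],
    conditional_probability: Optional[float],
    trust_state: str,
    materiality_threshold: float = 0.05,  # placeholder; final value needs review
) -> DangerZoneRead:
    """Apply the interpretation rules to one strike band."""
    if vanilla_probability is None or conditional_probability is None:
        return DangerZoneRead(None, "unavailable", "Inputs missing; no read.")
    gap = conditional_probability - vanilla_probability
    trust_ok = trust_state == "trusted"  # assumption: only "trusted" is acceptable
    if abs(gap) < materiality_threshold:
        return DangerZoneRead(gap, "neutral", "Conditional is close to vanilla; no edge.")
    if gap > 0 and trust_ok:
        return DangerZoneRead(gap, "warning", "Short options in this band carry elevated risk.")
    if gap > 0:
        return DangerZoneRead(gap, "low_confidence", "Conditional above vanilla, but trust is weak.")
    return DangerZoneRead(gap, "market_rich", "Market may already overprice this band, subject to trust.")


# Illustrative probabilities for a single band, not model output.
read = read_danger_zone(0.08, 0.15, "trusted")
```

The neutral check runs first in this sketch so that an immaterial gap never produces warning wording, whatever the trust state.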
Smile-Deviation Matrix Audit
Gordon thought the smile-deviation matrix was useful in principle but suspicious because every cell appeared negative. That output may mean IV is broadly near the bottom of the recent range, or it may mean the baseline logic is too crude.
The audit should decompose the matrix into separate effects:
| Effect | Question |
|---|---|
| Broad level | Is the entire surface low versus recent history? |
| Skew / risk reversal | Are puts rich or calls cheap relative to ATM? |
| Curvature | Is the smile belly or wing shape unusual? |
| Tenor normalization | Are short tenors being compared to long-tenor baselines? |
| Delta comparability | Are equivalent delta buckets being compared through time? |
| Moneyness drift | Are strike buckets moving as spot moves? |
| Sparse cells | Are some cells driven by too few observations? |
| Source staleness | Are current or baseline rows stale or incomplete? |
The matrix should eventually distinguish:
- "all vol is cheap versus history"
- "calls are cheap relative to ATM"
- "puts are rich relative to calls"
- "front month is cheap versus later tenors"
- "cell is unreliable because history is sparse"
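A small worked example shows why an all-negative matrix need not indicate a skew or curvature signal. The sketch below uses the common 25-delta risk reversal (call vol minus put vol) and butterfly (average wing vol minus ATM vol) as stand-ins for skew and curvature. That choice, the `SmilePoint` and `decompose` names, and the numbers are assumptions for illustration; the audit may settle on a different decomposition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmilePoint:
    """Implied vols (in vol points) for one tenor; field names are hypothetical."""
    put_25d: float
    atm: float
    call_25d: float

    @property
    def risk_reversal(self) -> float:
        # 25-delta risk reversal: call vol minus put vol.
        return self.call_25d - self.put_25d

    @property
    def butterfly(self) -> float:
        # 25-delta butterfly: average wing vol minus ATM vol.
        return 0.5 * (self.call_25d + self.put_25d) - self.atm


def decompose(current: SmilePoint, baseline: SmilePoint) -> dict:
    """Split the current-minus-baseline deviation into level, skew, curvature."""
    return {
        "level": current.atm - baseline.atm,
        "skew": current.risk_reversal - baseline.risk_reversal,
        "curvature": current.butterfly - baseline.butterfly,
    }


# Illustrative numbers: every cell sits 5 vol points below its baseline.
baseline = SmilePoint(put_25d=55.0, atm=50.0, call_25d=48.0)
current = SmilePoint(put_25d=50.0, atm=45.0, call_25d=43.0)
effects = decompose(current, baseline)
# effects == {"level": -5.0, "skew": 0.0, "curvature": 0.0}
```

In this example all three cells are negative, yet the whole deviation is a level shift: skew and curvature are unchanged. That is the "all vol is cheap versus history" case rather than a statement about calls or puts.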
Trust Engine And Mixed Thesis
The forecasting layer asks:
What does the conditional setup imply?
The trust engine asks:
Should we trust that implication right now?
Low confidence is a valid output. If elevated-vol and lower-vol evidence are balanced, or stationarity is unstable, the system should say so instead of forcing a directional read.
Trust inputs:
- stationarity KS p-value threshold
- mean-shift threshold
- gamma structure stability
- spot-vol correlation trend
- source freshness
- surface-location sample size
- signal agreement
- lookback policy
- mixed-thesis balance
- proxy/estimated/unavailable states
Trust states:
| State | Meaning | Downstream behavior |
|---|---|---|
| trusted | Evidence is coherent and source quality is acceptable. | Conditional outputs may be shown with normal caveats. |
| discounted | Model produces a read but trust inputs are weak. | Show reduced confidence and avoid strong warning language. |
| mixed_thesis | Elevated and lower-vol evidence are both material. | State that evidence is mixed; do not force direction. |
| unstable | Stationarity or relationship stability is poor. | Discount conditional probabilities. |
| insufficient | Not enough usable data. | Withhold decision statements. |
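One way to resolve these states from a subset of the trust inputs is a fixed precedence order. The sketch below is hypothetical throughout: the precedence (insufficient, then unstable, then mixed_thesis, then discounted, then trusted), the evidence scores on a 0-to-1 scale, and every default threshold are assumptions rather than approved values. It also assumes that a KS p-value below the threshold signals a stationarity failure.

```python
def resolve_trust_state(
    sample_size: int,
    ks_p_value: float,
    elevated_vol_score: float,
    lower_vol_score: float,
    source_fresh: bool,
    *,
    min_sample: int = 30,            # assumed minimum usable sample
    ks_p_threshold: float = 0.05,    # assumed stationarity threshold
    material_floor: float = 0.3,     # assumed floor for "material" evidence
    balance_tolerance: float = 0.1,  # assumed tolerance for "balanced" evidence
) -> str:
    """Return one trust state using an assumed precedence order."""
    if sample_size < min_sample:
        return "insufficient"
    if ks_p_value < ks_p_threshold:
        # Low p-value: recent and reference samples look different.
        return "unstable"
    both_material = min(elevated_vol_score, lower_vol_score) >= material_floor
    balanced = abs(elevated_vol_score - lower_vol_score) <= balance_tolerance
    if both_material and balanced:
        return "mixed_thesis"
    if not source_fresh:
        return "discounted"
    return "trusted"


# Illustrative inputs: both sides of the thesis carry similar, material weight.
state = resolve_trust_state(120, 0.40, 0.50, 0.45, True)
```

Here `state` is `mixed_thesis`: the sample is large enough and stationarity passes, but the two evidence scores are both material and nearly equal, so no directional read is forced.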
Model Lab Provenance
Model Lab should stay central because Gordon wants to inspect and tune assumptions.
Every displayed model result should expose:
- risk-free rate
- density method
- lookback window
- post-IBIT inclusion or weighting
- stationarity KS p-value threshold
- mean-shift threshold
- blend weights
- gamma-conditioning parameters
- candidate filters
- source-quality exclusions
- scenario override flags
- calculation version
Model Lab should clearly distinguish:
| Setting source | Meaning |
|---|---|
| production_default | Approved current default. |
| user_override | Scenario value changed by the operator. |
| unsupported | Parameter shown but not used by the current method. |
| excluded_by_quality | Parameter or signal excluded because source quality failed. |
Scenario outputs should be exportable or reviewable, but they should not become production defaults unless explicitly approved.
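The approval rule can be made explicit in code so that a scenario value never becomes a default by accident. The following sketch assumes a hypothetical `ParameterProvenance` record and `promote_to_default` helper; the parameter name, value, and version string in the example are placeholders, and how approval is actually granted is outside this sketch.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Any


class SettingSource(Enum):
    """Setting sources from the table above."""
    PRODUCTION_DEFAULT = "production_default"
    USER_OVERRIDE = "user_override"
    UNSUPPORTED = "unsupported"
    EXCLUDED_BY_QUALITY = "excluded_by_quality"


@dataclass(frozen=True)
class ParameterProvenance:
    """Hypothetical provenance record for one displayed parameter."""
    name: str
    value: Any
    source: SettingSource
    calculation_version: str


def promote_to_default(param: ParameterProvenance, approved: bool) -> ParameterProvenance:
    """Return a copy marked as production default, only with explicit approval."""
    if param.source is not SettingSource.USER_OVERRIDE:
        raise ValueError("only user overrides can be promoted")
    if not approved:
        raise PermissionError("scenario values need explicit approval")
    return replace(param, source=SettingSource.PRODUCTION_DEFAULT)


# Placeholder parameter for illustration.
override = ParameterProvenance("lookback_window_days", 90, SettingSource.USER_OVERRIDE, "v0-example")
promoted = promote_to_default(override, approved=True)
```

Because the record is frozen, promotion returns a new record and leaves the original override intact for review.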
Decision Points
| Decision | Why it matters | Can proceed now? |
|---|---|---|
| Should post-IBIT history be a hard filter, a toggle, or a recency weighting? | Controls relevance of older BTC vol history. | Expose options and document default; final production policy needs sponsor review. |
| What default lookback should each model family use? | Prevents the 730d/365d/130d/90d inconsistency. | Centralize defaults in spec and Model Lab. |
| What materiality threshold defines a danger-zone probability gap? | Controls warning sensitivity. | Build parameter and documentation; final threshold needs review. |
| What confidence level is required for strong accident warnings? | Affects trader behavior. | Define severity tiers; strong wording needs review. |
| Should the smile matrix use absolute IV z-score, relative skew, or both? | Determines whether "everything cheap" is meaningful. | Audit and document decomposition first. |
| Should Model Lab scenarios be saved, exported, or only temporary? | Affects governance. | Document provenance now; storage policy later. |
Preparatory Acceptance
This specification is complete when:
- lookback and post-IBIT relevance are explicit model settings
- spot-vol regime is defined by surface location
- risk-reversal divergence has named states
- conditional danger zones have a strike-band output contract
- the smile matrix audit separates broad level from skew and curvature
- Model Lab provenance is documented
- mixed thesis and low-confidence states are treated as valid model outputs