Regime Model Specification

This document defines the preparatory specification for the regime-aware model layer. It covers ENG-5241, ENG-5242, ENG-5243, ENG-5244, ENG-5245, ENG-5251, and ENG-5252.

It is a specification and review artifact only. It does not implement model changes or change current UI behavior.

Purpose

The model layer should convert Gordon's market thesis into auditable diagnostics:

Where is spot-vol behavior changing, does the options surface reflect that
change, and should the conditional model be trusted right now?

The model should not simply label BTC as being in one global regime. The sponsor discussion made clear that the effect may live in specific parts of the surface: front month, a strike band, a delta bucket, or a smile segment.

Gordon's Thesis To Encode

Recent BTC behavior may have shifted locally from:

spot up -> vol down
spot down -> vol up

to:

spot up -> vol up

The important divergence is that the risk reversal may still show the older equity-style skew pattern while realized local spot-vol behavior is starting to differ. That gap can be alpha, but it can also be noise. The model therefore needs evidence, confidence, and surface-location detail.

Ticket Coverage

Ticket	Model concern	Preparatory output
`ENG-5241`	Lookback and post-IBIT policy.	Configurable lookback policy and historical relevance rules.
`ENG-5242`	Spot-vol regime sensitivity by surface location.	Segmented classification by tenor, delta, moneyness, and smile segment.
`ENG-5243`	Smile-deviation matrix audit.	Decomposition of broad vol level, skew, curvature, tenor, and sparse-cell effects.
`ENG-5244`	Risk-reversal versus realized spot-vol divergence.	Divergence states and decision interpretation.
`ENG-5245`	Conditional-vs-vanilla danger zones.	Strike-band probability gap and warning contract.
`ENG-5251`	Model Lab provenance and scenario tuning.	Parameter provenance and scenario review contract.
`ENG-5252`	Mixed thesis and low confidence.	First-class low-confidence state instead of forced conviction.

Lookback Policy

Current pages use inconsistent historical windows, including approximately 730-day, 365-day, 130-day, and 90-day views. The lookback window needs to become explicit model configuration so that each choice is visible and documented.

Gordon's comment on post-IBIT history implies this policy:

older BTC option history is not useless
older history should not be treated as structurally identical to the current IBIT/liquid-options era
Model Lab should allow the user to compare longer history, post-IBIT history, and recency-weighted history
production defaults should be stable and documented
scenario overrides should be visible and not silently promoted to production defaults

Recommended model modes:

Mode	Use	Interpretation
`production_default`	Normal dashboard read.	Uses approved defaults and shows source/version.
`post_ibit_focus`	Test modern options-market structure.	Discounts or filters older market-structure observations.
`long_history_context`	Stress broader historical behavior.	Useful for context, but may dilute current regime.
`recent_local`	Detect fast regime transitions.	Higher sensitivity, lower sample size, more fragile.
`scenario_override`	Gordon/user changes assumptions.	Exploratory only unless approved as default.

Every regime output should show:

lookback window
sample size
data start and end dates
whether old/pre-IBIT history is included
weighting method if recency-weighted
scenario override state
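
The mode table and the per-output fields above could be carried by a small configuration object. The following is a minimal sketch, not the approved design: `LookbackPolicy`, `recency_weights`, `lookback_provenance`, and the exponential half-life weighting are all illustrative assumptions, and no default window is implied for any mode.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Mode names come from the table above; everything else here is hypothetical.
LOOKBACK_MODES = {
    "production_default",
    "post_ibit_focus",
    "long_history_context",
    "recent_local",
    "scenario_override",
}


@dataclass(frozen=True)
class LookbackPolicy:
    mode: str
    window_days: int
    include_pre_ibit: bool
    half_life_days: Optional[float] = None  # None means equal weighting (assumption)

    def __post_init__(self) -> None:
        if self.mode not in LOOKBACK_MODES:
            raise ValueError(f"unknown lookback mode: {self.mode}")
        if self.window_days <= 0:
            raise ValueError("window_days must be positive")


def recency_weights(ages_days, half_life_days):
    """Return normalized weights; exponential decay by age if a half-life is set."""
    if not ages_days:
        return []
    if half_life_days is None:
        raw = [1.0] * len(ages_days)
    else:
        raw = [0.5 ** (age / half_life_days) for age in ages_days]
    total = sum(raw)
    return [w / total for w in raw]


def lookback_provenance(policy, sample_size, data_start, data_end):
    """Build the lookback fields that every regime output should show."""
    if policy.half_life_days is None:
        weighting = "equal"
    else:
        weighting = f"exp_half_life_{policy.half_life_days:g}d"
    return {
        "lookback_mode": policy.mode,
        "lookback_window_days": policy.window_days,
        "sample_size": sample_size,
        "data_start": data_start.isoformat(),
        "data_end": data_end.isoformat(),
        "includes_pre_ibit": policy.include_pre_ibit,
        "weighting": weighting,
        "scenario_override": policy.mode == "scenario_override",
    }
```

Keeping the override flag derived from the mode, rather than stored separately, is one way to make it hard for a scenario run to be displayed without its override state.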

Surface-Location Spot-Vol Regime Detection

The spot-vol regime detector should classify behavior by surface location, not only globally.

Dimensions:

Dimension	Examples
Tenor / DTE	7d, 14d, 30d, 60d, 90d, front month, next listed expiry.
Moneyness	OTM put, ATM, OTM call, strike band such as 82k-95k.
Delta bucket	10d put, 25d put, ATM, 25d call, 10d call.
Smile segment	downside wing, put skew, belly, call wing, upside call area.
Time slice	trailing 30d, trailing 90d, post-event window, scenario window.

Classification states:

State	Meaning
`spot_up_vol_down`	Classic equity-style regime; vol falls as spot rises.
`spot_down_vol_up`	Crash-risk insurance behavior dominates.
`spot_up_vol_up`	Local positive spot-vol sensitivity.
`decorrelated`	Spot and vol relationship is weak or unstable.
`mixed_by_surface`	Different surface regions produce conflicting classifications.
`insufficient_data`	Not enough observations for this surface location.
`stale_source`	Inputs exist but are too stale to trust.

Minimum evidence fields:

surface_regime = {
  currency,
  as_of,
  tenor_bucket,
  dte_range,
  delta_bucket,
  moneyness_bucket,
  smile_segment,
  lookback_mode,
  sample_size,
  spot_return_measure,
  iv_change_measure,
  correlation,
  slope,
  classification,
  p_value_or_confidence_proxy,
  source_quality,
  caveat
}

The exact estimator can evolve. The product need is stable: show whether the observed behavior is global or local, and whether it is robust enough to affect trade warnings.
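
As one possible illustration of a per-location classifier, the sketch below labels a single surface bucket from paired spot returns and IV changes, then rolls bucket labels up to a surface-level read. It is a placeholder estimator under stated assumptions: the thresholds `MIN_OBS` and `MIN_ABS_CORR`, the use of Pearson correlation, and the rule for splitting the two negative-correlation states are all hypothetical, and staleness is assumed to be decided upstream.

```python
from math import sqrt
from statistics import mean

MIN_OBS = 30        # assumption: minimum observations per surface bucket
MIN_ABS_CORR = 0.2  # assumption: below this the relationship counts as weak


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def classify_bucket(spot_returns, iv_changes,
                    min_obs=MIN_OBS, min_abs_corr=MIN_ABS_CORR):
    """Classify one surface location from paired spot returns and IV changes."""
    if len(spot_returns) != len(iv_changes):
        raise ValueError("series must have equal length")
    if len(spot_returns) < min_obs:
        return "insufficient_data"
    corr = pearson(spot_returns, iv_changes)
    if abs(corr) < min_abs_corr:
        return "decorrelated"
    if corr > 0:
        # Simplification: positive co-movement is labeled spot_up_vol_up even
        # though it also covers spot-down/vol-down observations.
        return "spot_up_vol_up"
    # Negative co-movement: pick the side with the larger average IV response.
    up = [abs(iv) for r, iv in zip(spot_returns, iv_changes) if r > 0]
    down = [abs(iv) for r, iv in zip(spot_returns, iv_changes) if r < 0]
    up_response = mean(up) if up else 0.0
    down_response = mean(down) if down else 0.0
    if down_response > up_response:
        return "spot_down_vol_up"
    return "spot_up_vol_down"


def classify_surface(bucket_states):
    """Roll bucket labels (dict of location -> state) up to one surface read."""
    unusable = ("insufficient_data", "stale_source")
    usable = {s for s in bucket_states.values() if s not in unusable}
    if not usable:
        return "insufficient_data"
    if len(usable) > 1:
        return "mixed_by_surface"
    return next(iter(usable))
```

For example, `classify_surface({"30d_25d_call": "spot_up_vol_up", "30d_25d_put": "spot_down_vol_up"})` would report `mixed_by_surface`, which is the case the thesis cares about: the effect is local, not global.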

Risk-Reversal Versus Realized Spot-Vol Divergence

The model should explicitly compare what the risk reversal implies with what realized local spot-vol behavior is doing.

State	Interpretation
`old_regime_confirmed`	Risk reversal and realized behavior both imply classic spot-down/vol-up and spot-up/vol-down behavior.
`new_regime_confirmed`	Risk reversal and realized behavior both support spot-up/vol-up or upside-vol repricing.
`divergence`	Risk reversal still reflects old regime while realized behavior shows local spot-up/vol-up.
`surface_mixed`	Divergence exists in some tenors/delta buckets but not others.
`insufficient_data`	Cannot compare reliably.

The divergence state should not automatically mean "trade now." It means the model found a potentially important mismatch that needs confidence, surface location, and trust-engine gating.
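
The state table above can be read as a small decision function per surface location plus a roll-up. The sketch below is illustrative only. It assumes the risk-reversal read has already been reduced to a hypothetical boolean `rr_implies_old_regime` (or `None` when unavailable) and that the realized read uses the surface-regime classification names. The table does not name the reverse case (risk reversal implies the new regime while realized behavior looks old), so the sketch conservatively reports it as not comparable.

```python
from typing import Optional

OLD_REGIME_STATES = {"spot_up_vol_down", "spot_down_vol_up"}
NEW_REGIME_STATES = {"spot_up_vol_up"}


def bucket_divergence(rr_implies_old_regime: Optional[bool],
                      realized_state: str) -> str:
    """Compare the risk-reversal read with the realized read for one bucket."""
    known = OLD_REGIME_STATES | NEW_REGIME_STATES
    if rr_implies_old_regime is None or realized_state not in known:
        return "insufficient_data"
    realized_old = realized_state in OLD_REGIME_STATES
    if rr_implies_old_regime and realized_old:
        return "old_regime_confirmed"
    if not rr_implies_old_regime and not realized_old:
        return "new_regime_confirmed"
    if rr_implies_old_regime and not realized_old:
        return "divergence"
    # Reverse mismatch is not named in the state table; placeholder choice.
    return "insufficient_data"


def surface_divergence(bucket_states) -> str:
    """Roll bucket-level states up; any disagreement is reported as mixed."""
    usable = [s for s in bucket_states if s != "insufficient_data"]
    if not usable:
        return "insufficient_data"
    if len(set(usable)) == 1:
        return usable[0]
    # Assumption: broader than the table, which only describes divergence
    # present in some buckets and absent in others.
    return "surface_mixed"
```

Neither function emits a trade signal. The result is meant to be passed on to confidence and trust-engine gating, consistent with the note above.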

Conditional-Vs-Vanilla Danger Zones

The conditional distribution page resonated with Gordon because it translated the thesis into a clear question:

Where does the conditional model assign materially more probability than
the vanilla market-implied surface, and are we short options there?

For the sponsor discussion, the notable example was the approximate 85k-90k area on a 30-day tenor. The model should generalize this as a strike-band detection problem.

Minimum danger-zone output:

danger_zone = {
  currency,
  as_of,
  tenor_bucket,
  strike_low,
  strike_high,
  vanilla_probability,
  conditional_probability,
  probability_gap,
  gap_materiality,
  trust_state,
  source_quality,
  message,
  warning_severity
}

Interpretation rules:

Condition	Output
Conditional probability materially above vanilla and trust is acceptable.	Warn that shorting options in the band is dangerous.
Conditional probability above vanilla but trust is weak.	Show low-confidence caveat, not a strong warning.
Conditional probability close to vanilla.	No edge or neutral read.
Vanilla probability above conditional.	Market may already overprice that band, subject to trust.
Inputs are missing or unstable.	Suppress strong wording and show unavailable/low-confidence reason.

Warning language should remain risk-control language, not trade advice.
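
The interpretation rules above could be sketched as a single function over one strike band. This is a hedged illustration, not the warning contract itself: `GAP_MATERIALITY`, `ACCEPTABLE_TRUST`, the severity labels, and the treatment of the `insufficient` trust state as unavailable are all placeholder assumptions, and the final materiality threshold remains a review decision.

```python
from dataclasses import dataclass
from typing import Optional

GAP_MATERIALITY = 0.05          # assumption: placeholder threshold, needs review
ACCEPTABLE_TRUST = {"trusted"}  # assumption: only this state allows a strong warning


@dataclass(frozen=True)
class DangerZoneRead:
    probability_gap: Optional[float]
    gap_materiality: str   # "material", "immaterial", or "unavailable"
    warning_severity: str  # hypothetical labels, see evaluate_band


def evaluate_band(vanilla_probability: Optional[float],
                  conditional_probability: Optional[float],
                  trust_state: str,
                  threshold: float = GAP_MATERIALITY) -> DangerZoneRead:
    """Apply the interpretation rules to one strike band."""
    missing = vanilla_probability is None or conditional_probability is None
    if missing or trust_state == "insufficient":
        # Missing or unusable inputs: suppress strong wording.
        return DangerZoneRead(None, "unavailable", "unavailable")
    gap = conditional_probability - vanilla_probability
    if abs(gap) < threshold:
        return DangerZoneRead(gap, "immaterial", "neutral")
    if gap > 0:
        if trust_state in ACCEPTABLE_TRUST:
            return DangerZoneRead(gap, "material", "warn")
        return DangerZoneRead(gap, "material", "low_confidence")
    # Vanilla above conditional: the market may already overprice the band.
    return DangerZoneRead(gap, "material", "market_rich")
```

The function returns a severity label, not message text, so that the wording shown to the user can stay risk-control language and be reviewed separately.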

Smile-Deviation Matrix Audit

Gordon thought the smile-deviation matrix was useful in principle but suspicious because every cell appeared negative. That output may mean IV is broadly near the bottom of the recent range, or it may mean the baseline logic is too crude.

The audit should decompose the matrix into separate effects:

Effect	Question
Broad level	Is the entire surface low versus recent history?
Skew / risk reversal	Are puts rich or calls cheap relative to ATM?
Curvature	Is the smile belly or wing shape unusual?
Tenor normalization	Are short tenors being compared to long-tenor baselines?
Delta comparability	Are equivalent delta buckets being compared through time?
Moneyness drift	Are strike buckets moving as spot moves?
Sparse cells	Are some cells driven by too few observations?
Source staleness	Are current or baseline rows stale or incomplete?
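
To make the first three effects in the table concrete, the sketch below splits one tenor's smile into level, skew, and curvature using the conventional 25-delta risk-reversal and butterfly definitions, then scores each component against its own history. It is a minimal sketch under assumptions: the function names, the `(put_25d_iv, atm_iv, call_25d_iv)` row layout, and `MIN_HISTORY` are hypothetical, and a z-score is only one candidate baseline.

```python
from statistics import mean, pstdev

MIN_HISTORY = 20  # assumption: fewer rows than this marks the cell unreliable


def decompose(put_25d_iv, atm_iv, call_25d_iv):
    """Split one smile into broad level, skew, and curvature."""
    return {
        "level": atm_iv,
        "skew": call_25d_iv - put_25d_iv,                      # 25-delta risk reversal
        "curvature": (call_25d_iv + put_25d_iv) / 2 - atm_iv,  # 25-delta butterfly
    }


def zscore(current, history, min_history=MIN_HISTORY):
    """Z-score versus history; None when history is sparse or has no variance."""
    if len(history) < min_history:
        return None
    spread = pstdev(history)
    if spread == 0:
        return None
    return (current - mean(history)) / spread


def audit_cell(current_row, history_rows):
    """Score each component of the current smile against its own history.

    Rows are (put_25d_iv, atm_iv, call_25d_iv) tuples for a single tenor.
    """
    now = decompose(*current_row)
    past = [decompose(*row) for row in history_rows]
    return {name: zscore(now[name], [p[name] for p in past]) for name in now}
```

If only `level` comes back strongly negative while `skew` and `curvature` stay near zero, an all-negative matrix would be explained by broadly low vol rather than by cheap calls or rich puts.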

The matrix should eventually distinguish:

"all vol is cheap versus history"
"calls are cheap relative to ATM"
"puts are rich relative to calls"
"front month is cheap versus later tenors"
"cell is unreliable because history is sparse"

Trust Engine And Mixed Thesis

The forecasting layer asks:

What does the conditional setup imply?

The trust engine asks:

Should we trust that implication right now?

Low confidence is a valid output. If elevated-vol and lower-vol evidence are balanced, or stationarity is unstable, the system should say so instead of forcing a directional read.

Trust inputs:

stationarity KS p-value threshold
mean-shift threshold
gamma structure stability
spot-vol correlation trend
source freshness
surface-location sample size
signal agreement
lookback policy
mixed-thesis balance
proxy/estimated/unavailable states

Trust states:

State	Meaning	Downstream behavior
`trusted`	Evidence is coherent and source quality is acceptable.	Conditional outputs may be shown with normal caveats.
`discounted`	Model produces a read but trust inputs are weak.	Show reduced confidence and avoid strong warning language.
`mixed_thesis`	Elevated and lower-vol evidence are both material.	State that evidence is mixed; do not force direction.
`unstable`	Stationarity or relationship stability is poor.	Discount conditional probabilities.
`insufficient`	Not enough usable data.	Withhold decision statements.
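
One way to resolve a single trust state from the inputs is a fixed precedence check, sketched below. The precedence order, the three thresholds, and the `TrustInputs` fields (including the 0-to-1 evidence scores) are all assumptions for illustration, and only a subset of the listed trust inputs is represented. The KS p-value is read in the usual way, with a low value taken as evidence that the distribution has shifted.

```python
from dataclasses import dataclass

MIN_SAMPLE = 30        # assumption: minimum usable observations
KS_P_THRESHOLD = 0.05  # assumption: below this, stationarity is treated as poor
MATERIAL_SCORE = 0.4   # assumption: evidence score that counts as material


@dataclass(frozen=True)
class TrustInputs:
    usable_sample_size: int
    ks_p_value: float          # stationarity KS test p-value
    elevated_vol_score: float  # hypothetical 0..1 weight of elevated-vol evidence
    lower_vol_score: float     # hypothetical 0..1 weight of lower-vol evidence
    source_fresh: bool


def resolve_trust_state(t: TrustInputs) -> str:
    """Return one trust state; earlier checks take precedence (assumed order)."""
    if t.usable_sample_size < MIN_SAMPLE:
        return "insufficient"
    if t.ks_p_value < KS_P_THRESHOLD:
        return "unstable"
    both_material = (t.elevated_vol_score >= MATERIAL_SCORE
                     and t.lower_vol_score >= MATERIAL_SCORE)
    if both_material:
        return "mixed_thesis"
    if not t.source_fresh:
        return "discounted"
    return "trusted"


# Downstream behavior, paraphrased from the trust-state table.
DOWNSTREAM = {
    "trusted": "show conditional outputs with normal caveats",
    "discounted": "show reduced confidence; avoid strong warning language",
    "mixed_thesis": "state that evidence is mixed; do not force direction",
    "unstable": "discount conditional probabilities",
    "insufficient": "withhold decision statements",
}
```

Because `mixed_thesis` is returned as its own state, balanced evidence never falls through to a directional read.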

Model Lab Provenance

Model Lab should stay central because Gordon wants to inspect and tune assumptions.

Every displayed model result should expose:

risk-free rate
density method
lookback window
post-IBIT inclusion or weighting
stationarity KS p-value threshold
mean-shift threshold
blend weights
gamma-conditioning parameters
candidate filters
source-quality exclusions
scenario override flags
calculation version

Model Lab should clearly distinguish:

Setting source	Meaning
`production_default`	Approved current default.
`user_override`	Scenario value changed by the operator.
`unsupported`	Parameter shown but not used by the current method.
`excluded_by_quality`	Parameter or signal excluded because source quality failed.

Scenario outputs should be exportable or reviewable, but they should not become production defaults unless explicitly approved.
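
The rule that scenario values must not silently become defaults could be enforced at the point of promotion. The sketch below is illustrative only: `ParameterSetting`, `promote_to_default`, and the `approved_by` argument are hypothetical names, and the actual approval workflow and storage policy are still open decisions.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Setting sources come from the table above.
SETTING_SOURCES = {
    "production_default",
    "user_override",
    "unsupported",
    "excluded_by_quality",
}


@dataclass(frozen=True)
class ParameterSetting:
    name: str
    value: object
    source: str

    def __post_init__(self) -> None:
        if self.source not in SETTING_SOURCES:
            raise ValueError(f"unknown setting source: {self.source}")


def promote_to_default(setting: ParameterSetting,
                       approved_by: Optional[str]) -> ParameterSetting:
    """Turn a scenario override into a production default, only with approval."""
    if setting.source != "user_override":
        raise ValueError("only user overrides can be promoted")
    if not approved_by:
        raise PermissionError("explicit approval is required for promotion")
    return replace(setting, source="production_default")
```

Returning a new record instead of mutating the override keeps the original scenario value available for export or review.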

Decision Points

Decision	Why it matters	Can proceed now?
Should post-IBIT history be a hard filter, a toggle, or a recency weighting?	Controls relevance of older BTC vol history.	Expose options and document default; final production policy needs sponsor review.
What default lookback should each model family use?	Prevents the 730d/365d/130d/90d inconsistency.	Centralize defaults in spec and Model Lab.
What materiality threshold defines a danger-zone probability gap?	Controls warning sensitivity.	Build parameter and documentation; final threshold needs review.
What confidence level is required for strong accident warnings?	Affects trader behavior.	Define severity tiers; strong wording needs review.
Should the smile matrix use absolute IV z-score, relative skew, or both?	Determines whether "everything cheap" is meaningful.	Audit and document decomposition first.
Should Model Lab scenarios be saved, exported, or only temporary?	Affects governance.	Document provenance now; storage policy later.

Preparatory Acceptance

This specification is complete when:

lookback and post-IBIT relevance are explicit model settings
spot-vol regime is defined by surface location
risk-reversal divergence has named states
conditional danger zones have a strike-band output contract
the smile matrix audit separates broad level from skew and curvature
Model Lab provenance is documented
mixed thesis and low-confidence states are treated as valid model outputs