Brier-calibrated multi-thesis ensemble across US, Korea, FX, and crypto markets. Every prediction carries a full evidence chain — source provenance, per-thesis AUC, confidence intervals, and a deterministic replay guarantee.
Every prediction is traceable. Every thesis is calibrated with empirical data. Every data response is reproducible to the byte.
Outputs Prediction(p_up, p_up_lower, p_up_upper, contributing…) with Brier-derived ensemble weights per thesis. No BUY/SELL output — ever.
Turns any thesis into a calibration row — Brier score, AUC, Sharpe, OOS degradation — with explicit IS/OOS split and full reproducibility guarantees.
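To make two of those columns concrete, here is a minimal sketch of a Brier score and a rank-based AUC in plain Python. The function names and sample numbers are hypothetical illustrations, not glostat's API or real calibration data.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def auc(probs, outcomes):
    """Chance that a random 'up' sample is scored above a random 'down' sample.

    Ties count as half a win.
    """
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative outcome")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Made-up forecasts and realized directions (1 = up, 0 = down).
probs = [0.9, 0.7, 0.4, 0.2]
outcomes = [1, 0, 1, 0]
print(f"Brier={brier_score(probs, outcomes):.3f}  AUC={auc(probs, outcomes):.3f}")
```

A lower Brier score is better; a forecaster that always says 0.5 scores exactly 0.25, which is a convenient reference point when reading the table.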
Every external API response is persisted as a parquet shard + SQLite index + Merkle leaf. Any prediction can be replayed bit-for-bit months later (INV-GS-022).
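The idea behind the Merkle leaf can be sketched with the standard library alone. This is an illustration under assumptions: `leaf_hash`, `merkle_root`, the domain-separation prefixes, and the sample payloads are hypothetical, and glostat's actual shard layout and hashing scheme may differ.

```python
import hashlib


def leaf_hash(payload: bytes) -> bytes:
    """Hash one persisted response. The 0x00 prefix keeps leaves distinct from inner nodes."""
    return hashlib.sha256(b"\x00" + payload).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold leaf hashes pairwise into a single root; an unpaired node is promoted as-is."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = list(leaves)
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            nxt.append(hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest())
        if len(level) % 2 == 1:
            nxt.append(level[-1])
        level = nxt
    return level[0]


# Made-up response bodies standing in for cached API payloads.
responses = [b'{"close": 101.2}', b'{"close": 99.8}', b'{"close": 100.5}']
root = merkle_root([leaf_hash(r) for r in responses])

# Replaying the same bytes reproduces the root; changing any byte does not.
replayed = merkle_root([leaf_hash(r) for r in responses])
altered = [responses[0], b'{"close": 99.9}', responses[2]]
tampered = merkle_root([leaf_hash(r) for r in altered])
print(root == replayed, root == tampered)
```

Storing only the root alongside a prediction is enough to detect later whether any cached response it depended on has changed.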
MIT license. Fork-friendly. Designed so third-party thesis authors can plug in their own modules and contribute calibration data to the shared table.
Broadcasting to Telegram or by mass email raises ComplianceError permanently and unconditionally (INV-GS-024). Every Prediction carries a non-empty disclaimer field.
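A guard of this kind could look roughly like the sketch below. Only the name `ComplianceError` comes from the text above; `broadcast` and `PredictionSketch` are hypothetical stand-ins, not glostat's real functions or types.

```python
from dataclasses import dataclass


class ComplianceError(RuntimeError):
    """Raised when a forbidden distribution channel is requested."""


def broadcast(channel: str, message: str) -> None:
    """Hypothetical guard: there is no flag or argument that turns this on."""
    raise ComplianceError(f"broadcast to {channel!r} is disabled")


@dataclass(frozen=True)
class PredictionSketch:
    """Hypothetical stand-in showing a disclaimer check at construction time."""

    p_up: float
    disclaimer: str

    def __post_init__(self) -> None:
        if not self.disclaimer.strip():
            raise ValueError("disclaimer must be non-empty")


try:
    broadcast("telegram", "p_up=0.61")
except ComplianceError as exc:
    print(f"blocked: {exc}")
```

Putting the check in the constructor means an object without a disclaimer can never exist, so downstream code does not need to re-validate it.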
Every LLM call is pinned to a sha256 hash so the prompt graph is auditable across versions. This is backed by 45+ numbered invariants, each mapped 1:1 to a unit test.
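One way to pin a call is to hash a canonical encoding of everything that defines it. The sketch below assumes the pin covers model name, prompt text, and sampling parameters; `pin_llm_call` and the example values are hypothetical, and glostat may hash different fields.

```python
import hashlib
import json


def pin_llm_call(model: str, prompt: str, params: dict) -> str:
    """Return a sha256 hex digest over a canonical JSON encoding of the call."""
    canonical = json.dumps(
        {"model": model, "prompt": prompt, "params": params},
        sort_keys=True,          # key order must not change the hash
        separators=(",", ":"),   # no incidental whitespace
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


pin_a = pin_llm_call("example-model", "Summarize the filing.", {"temperature": 0, "top_p": 1})
pin_b = pin_llm_call("example-model", "Summarize the filing.", {"top_p": 1, "temperature": 0})
pin_c = pin_llm_call("example-model", "Summarize the filing!", {"temperature": 0, "top_p": 1})
print(pin_a == pin_b, pin_a == pin_c)
```

Sorting keys before hashing is the important design choice: two logically identical calls get the same pin regardless of how the parameter dictionary was built.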
Free-stack data sources only in MVP — yfinance, SEC EDGAR, Naver Finance, CCXT. Paid sources (Bigdata MCP) are phase-gated behind explicit activation.
v0.6 labeled these results "8 thesis FAIL" against a rigid Sharpe gate and shut down. v1.0 reframes the same data as the calibration baseline: weak signals carry near-zero Brier weight, and strong signals carry proportionally more.
Brier-derived weights are illustrative; actual values computed at run time from cache/calibration_table.parquet.
Full table and interpretation: docs/CALIBRATION.md
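To show how a weak signal can end up with near-zero weight, here is one plausible weighting scheme, sketched under assumptions. It scores each thesis by its Brier improvement over the 0.25 coin-flip baseline and normalizes. This is not necessarily the formula glostat uses; `brier_weights`, the thesis names, and the scores are all made up for illustration.

```python
def brier_weights(brier_by_thesis: dict[str, float], baseline: float = 0.25) -> dict[str, float]:
    """Weight each thesis by its Brier skill over a baseline, normalized to sum to 1.

    A thesis at or worse than the baseline gets weight 0. If no thesis beats
    the baseline, every weight is 0.
    """
    skill = {name: max(0.0, baseline - score) for name, score in brier_by_thesis.items()}
    total = sum(skill.values())
    if total == 0.0:
        return {name: 0.0 for name in skill}
    return {name: s / total for name, s in skill.items()}


# Hypothetical Brier scores: one clear signal, one marginal, one worse than a coin flip.
scores = {"thesis_a": 0.210, "thesis_b": 0.249, "thesis_c": 0.260}
weights = brier_weights(scores)
for name, w in weights.items():
    print(f"{name}: {w:.3f}")
```

Under this scheme a thesis is never deleted for failing a gate; it simply stops influencing the ensemble until its calibration improves.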
Requires Python ≥ 3.11. No paid API keys needed in default MVP mode.
```bash
# Install (uv preferred)
pip install glostat

# Run a prediction (mock mode, no network)
glostat predict AAPL --horizon 5d --mock

# Live KR prediction (free, no Bigdata MCP needed)
GLOSTAT_SEC_USER_AGENT="Your Name your-email@example.com" \
glostat predict 005930   # Samsung Electronics
glostat predict 096770   # SK Innovation

# Quarterly recalibration
glostat calibrate --all-thesis --window 365d
glostat calibrate --update-table
```
```python
from glostat.predictor.composite import composite_p_up
from glostat.core.types import Prediction

# Get a calibrated probability prediction.
# `pipeline` is assumed to be an already-constructed prediction pipeline;
# its construction is not shown in this snippet.
prediction: Prediction = pipeline.predict("AAPL", horizon="5d")

print(f"p_up = {prediction.p_up:.3f}  "
      f"90%CI=[{prediction.p_up_lower:.3f}, {prediction.p_up_upper:.3f}]")

# One line per contributing thesis
for c in prediction.contributing:
    print(f"  {c.thesis_name:24} "
          f"dir={c.direction:4} "
          f"weight={c.brier_weight:.3f} "
          f"AUC={c.auc:.3f} "
          f"n={c.n_calibration_samples}")

print(prediction.disclaimer)  # Always non-empty (INV-GS-104)
```
The infrastructure is independent of which thesis you screen. New thesis modules just need calibration data attached to the PR.
Subclass the Thesis protocol in src/glostat/experts/. Return a typed (direction, raw_score, sources) tuple. See docs/EXAMPLES.md.
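A thesis module might look roughly like the sketch below. The protocol shape here is an assumption: the method name `evaluate`, its arguments, the `"up"`/`"down"` direction values, and `MomentumThesis` are hypothetical. Check docs/EXAMPLES.md for the real signature.

```python
from typing import Protocol

# (direction, raw_score, sources), as described above.
ThesisResult = tuple[str, float, list[str]]


class Thesis(Protocol):
    """Hypothetical shape of the protocol; the real one lives in src/glostat/experts/."""

    name: str

    def evaluate(self, ticker: str, closes: list[float]) -> ThesisResult: ...


class MomentumThesis:
    """Toy thesis: direction follows the sign of the trailing 5-bar return."""

    name = "momentum_5d"

    def evaluate(self, ticker: str, closes: list[float]) -> ThesisResult:
        if len(closes) < 6:
            raise ValueError("need at least 6 closes for a 5-bar return")
        raw_score = closes[-1] / closes[-6] - 1.0
        direction = "up" if raw_score > 0 else "down"
        # Sources record where each input came from, for the evidence chain.
        return direction, raw_score, [f"mock:{ticker}:closes"]


thesis: Thesis = MomentumThesis()
direction, raw_score, sources = thesis.evaluate("AAPL", [100, 101, 102, 103, 104, 105])
print(direction, round(raw_score, 4), sources)
```

The thesis only emits a direction and a raw score; turning that into a probability is left to calibration, which is what keeps thesis modules small.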
Add a routing entry in data_router.py. The DataRouter enforces phase gating so paid sources stay blocked until you explicitly opt in.
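Phase gating can be pictured as a lookup that refuses paid tiers unless an explicit opt-in is set. The sketch below is an assumption about the general pattern: `DataRouterSketch`, `PhaseGateError`, and the tier table are hypothetical, and the real data_router.py may be organized differently.

```python
class PhaseGateError(PermissionError):
    """Raised when a paid source is requested without explicit activation."""


class DataRouterSketch:
    """Hypothetical router: maps a source name to a tier and enforces the gate."""

    def __init__(self, routes: dict[str, str], paid_enabled: bool = False) -> None:
        self._routes = dict(routes)        # source name -> "free" or "paid"
        self._paid_enabled = paid_enabled  # off unless the caller opts in

    def resolve(self, source: str) -> str:
        tier = self._routes.get(source)
        if tier is None:
            raise KeyError(f"no routing entry for {source!r}")
        if tier == "paid" and not self._paid_enabled:
            raise PhaseGateError(f"{source!r} is a paid source and is not activated")
        return source


routes = {
    "yfinance": "free",
    "sec_edgar": "free",
    "naver_finance": "free",
    "ccxt": "free",
    "bigdata_mcp": "paid",
}
router = DataRouterSketch(routes)
print(router.resolve("yfinance"))
try:
    router.resolve("bigdata_mcp")
except PhaseGateError as exc:
    print(f"blocked: {exc}")
```

Defaulting `paid_enabled` to `False` means a new routing entry for a paid source is inert until someone deliberately turns it on.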
Configure Hindcast, point at a universe, get an IS/OOS report with AUC, Sharpe, and Brier score. Minimum n=50 required (INV-GS-026).
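The split and the sample-size floor can be sketched as follows. Only the minimum of 50 comes from the text above; `split_is_oos`, the 70/30 default, and the choice to apply the floor to the full sample are assumptions, not Hindcast's actual behavior.

```python
MIN_SAMPLES = 50  # minimum n stated above (INV-GS-026)


def split_is_oos(samples: list, is_fraction: float = 0.7) -> tuple[list, list]:
    """Split time-ordered samples into in-sample and out-of-sample parts.

    The split is chronological (no shuffling), so the out-of-sample part
    contains only observations that come after the in-sample part.
    """
    if len(samples) < MIN_SAMPLES:
        raise ValueError(f"need at least {MIN_SAMPLES} samples, got {len(samples)}")
    if not 0.0 < is_fraction < 1.0:
        raise ValueError("is_fraction must be strictly between 0 and 1")
    cut = round(len(samples) * is_fraction)
    return samples[:cut], samples[cut:]


# 100 made-up observations, already sorted by date.
observations = list(range(100))
in_sample, out_of_sample = split_is_oos(observations)
print(len(in_sample), len(out_of_sample))
```

Comparing a metric such as AUC between the two parts gives the OOS degradation figure: a large drop suggests the thesis was fitted to noise in the in-sample window.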
Append the result to calibration_table.parquet. The Brier-weighted ensemble picks the weight automatically at the next recalibration run.