v2.0.1 · MIT License · Python ≥ 3.11

Evidence-based Probability Predictor for Global Equities

Brier-calibrated multi-thesis ensemble across US, Korea, FX, and crypto markets. Every prediction carries a full evidence chain — source provenance, per-thesis AUC, confidence intervals, and a deterministic replay guarantee.

$ pip install glostat

Information tool only. GLOSTAT outputs probability distributions with explicit confidence intervals and source provenance — not investment recommendations, not securities solicitations, not financial advice. Past calibration data does not guarantee future predictive performance. Users are solely responsible for their own decisions.

What GLOSTAT is

A research framework, not a black box

Every prediction is traceable. Every thesis is calibrated with empirical data. Every data response is reproducible to the byte.

📐

Calibrated Probability Predictor

Outputs Prediction(p_up, p_up_lower, p_up_upper, contributing…) with Brier-derived ensemble weights per thesis. No BUY/SELL output — ever.
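As a rough illustration of how per-thesis weights could combine into a single p_up, here is a minimal sketch. The ThesisSignal type, the combine function, and the per-thesis probabilities are hypothetical and are not GLOSTAT's actual API; the weights reuse the illustrative values from the calibration table further down.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThesisSignal:
    """Hypothetical per-thesis input; not GLOSTAT's actual type."""
    name: str
    p_up: float          # thesis-level probability of an up move
    brier_weight: float  # non-negative ensemble weight


def combine(signals: list[ThesisSignal], prior: float = 0.5) -> float:
    """Weighted average of thesis probabilities.

    Falls back to the prior when every weight is zero (or the list is empty).
    """
    total = sum(s.brier_weight for s in signals)
    if total <= 0.0:
        return prior
    return sum(s.brier_weight * s.p_up for s in signals) / total


# Probabilities below are made up for illustration only.
signals = [
    ThesisSignal("E_PEAD", 0.62, 0.18),
    ThesisSignal("E_INSIDER_CLUSTER", 0.55, 0.05),
    ThesisSignal("E_FOMC_DRIFT", 0.30, 0.00),  # zero weight: ignored
]
p_up = combine(signals)
print(f"p_up = {p_up:.3f}")
```

A zero-weight thesis drops out of the average entirely, which matches the idea that weak signals should not move the ensemble.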

🔁

Deterministic Hindcast Harness

Turns any thesis into a calibration row — Brier score, AUC, Sharpe, OOS degradation — with explicit IS/OOS split and full reproducibility guarantees.
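The metrics named above can be sketched in plain Python. This is an illustration under stated assumptions, not the harness itself: the function names, the chronological 75/25 split, and the toy (probability, outcome) rows are all hypothetical.

```python
def brier_score(probs: list[float], outcomes: list[int]) -> float:
    """Mean squared error between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def auc(probs: list[float], outcomes: list[int]) -> float:
    """Probability that a random positive outranks a random negative.

    Ties count as half a win.
    """
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def split_is_oos(rows: list, is_fraction: float = 0.75) -> tuple[list, list]:
    """Chronological split: earliest rows in-sample, the rest out-of-sample."""
    cut = int(len(rows) * is_fraction)
    return rows[:cut], rows[cut:]


# Toy rows in time order: (predicted p_up, realised outcome).
rows = [(0.8, 1), (0.3, 0), (0.6, 1), (0.4, 0),
        (0.7, 0), (0.2, 0), (0.9, 1), (0.5, 0)]
in_sample, out_of_sample = split_is_oos(rows)

for label, part in (("IS", in_sample), ("OOS", out_of_sample)):
    probs = [p for p, _ in part]
    ys = [y for _, y in part]
    print(f"{label}: n={len(part)} "
          f"brier={brier_score(probs, ys):.3f} auc={auc(probs, ys):.3f}")
```

Splitting by time rather than at random keeps future observations out of the in-sample set.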

🗄️

Snapshot Broker

Every external API response is persisted as a parquet shard + SQLite index + Merkle leaf. Any prediction can be replayed bit-for-bit months later (INV-GS-022).
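To make the Merkle idea concrete, the sketch below hashes a few stored payloads into leaves and folds them into a root. It uses only hashlib; the function names, the one-byte domain-separation prefixes, and the duplicate-last-node rule for odd levels are assumptions for illustration, not the Snapshot Broker's actual scheme.

```python
import hashlib


def leaf_hash(payload: bytes) -> bytes:
    """Hash one stored response; the 0x00 prefix marks a leaf."""
    return hashlib.sha256(b"\x00" + payload).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    """Hash two child digests; the 0x01 prefix marks an inner node."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold leaf digests pairwise until a single root remains."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = list(leaves)  # copy so the caller's list is not mutated
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # odd count: pair the last node with itself
        level = [node_hash(level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


# Stand-ins for persisted API responses (contents are made up).
payloads = [b'{"ticker":"AAPL"}', b'{"ticker":"005930"}', b'{"ticker":"BTC"}']
root = merkle_root([leaf_hash(p) for p in payloads])

tampered = [payloads[0], b'{"ticker":"XXXXXX"}', payloads[2]]
tampered_root = merkle_root([leaf_hash(p) for p in tampered])

print(root.hex())
print("tamper detected:", root != tampered_root)
```

Changing any single stored byte changes the root, which is what makes a replay checkable against the original run.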

🔓

Open-Source Framework

MIT license. Fork-friendly. Designed so third-party thesis authors can plug in their own modules and contribute calibration data to the shared table.

🛡️

Compliance Gate

Broadcast to Telegram or mass-email raises ComplianceError permanently and unconditionally (INV-GS-024). Every Prediction carries a non-empty disclaimer field.
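A gate of this kind can be very small. The sketch below is a hypothetical stand-in: only the ComplianceError name comes from the text above, while the channel names, the send function, and the exception's base class are assumptions.

```python
class ComplianceError(RuntimeError):
    """Raised when output is routed to a broadcast channel (stand-in class)."""


# Hypothetical channel identifiers for the blocked broadcast paths.
BLOCKED_CHANNELS = frozenset({"telegram", "mass_email"})


def send(channel: str, message: str) -> str:
    """Deliver a message, refusing broadcast channels.

    There is deliberately no override flag: the block is unconditional.
    """
    if channel.lower() in BLOCKED_CHANNELS:
        raise ComplianceError(f"broadcast to {channel!r} is not permitted")
    return f"[{channel}] {message}"


print(send("stdout", "p_up = 0.605"))
try:
    send("telegram", "p_up = 0.605")
except ComplianceError as exc:
    print("blocked:", exc)
```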

🔑

Prompt Registry

Every LLM call is pinned to a sha256 so the prompt graph is auditable across versions. Combined with 45+ numbered invariants and 1:1 unit-test mapping.
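Pinning a prompt to a sha256 can be sketched with hashlib alone. The PromptRegistry class, its pin and verify methods, and the prompt name below are hypothetical; they illustrate the idea rather than the registry's real interface.

```python
import hashlib


def prompt_digest(text: str) -> str:
    """sha256 hex digest of the UTF-8 encoded prompt text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class PromptRegistry:
    """Hypothetical in-memory registry mapping prompt names to digests."""

    def __init__(self) -> None:
        self._pins: dict[str, str] = {}

    def pin(self, name: str, text: str) -> str:
        """Record the digest for a prompt and return it."""
        digest = prompt_digest(text)
        self._pins[name] = digest
        return digest

    def verify(self, name: str, text: str) -> bool:
        """True only if the text still matches the pinned digest."""
        return self._pins.get(name) == prompt_digest(text)


registry = PromptRegistry()
registry.pin("summarise_filing", "Summarise the filing in three bullets.")

print(registry.verify("summarise_filing", "Summarise the filing in three bullets."))
print(registry.verify("summarise_filing", "Summarise the filing in five bullets."))
```

Any edit to the prompt text, however small, changes the digest and fails verification.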

Supported Markets

US, Korea, FX, Crypto

Free-stack data sources only in MVP — yfinance, SEC EDGAR, Naver Finance, CCXT. Paid sources (Bigdata MCP) are phase-gated behind explicit activation.
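Phase gating of this sort could look like the sketch below. The source identifiers, the PhaseGateError class, and the paid_enabled parameter are hypothetical names chosen for illustration; the actual activation mechanism is not described here.

```python
FREE_SOURCES = frozenset({"yfinance", "sec_edgar", "naver_finance", "ccxt"})
PAID_SOURCES = frozenset({"bigdata_mcp"})


class PhaseGateError(PermissionError):
    """Raised when a paid source is requested without explicit activation."""


def resolve_source(name: str, paid_enabled: bool = False) -> str:
    """Return the source name if it may be used, otherwise raise."""
    if name in FREE_SOURCES:
        return name
    if name in PAID_SOURCES:
        if not paid_enabled:
            raise PhaseGateError(f"{name!r} requires explicit activation")
        return name
    raise KeyError(f"unknown data source: {name!r}")


print(resolve_source("yfinance"))
try:
    resolve_source("bigdata_mcp")
except PhaseGateError as exc:
    print("blocked:", exc)
print(resolve_source("bigdata_mcp", paid_enabled=True))
```

Defaulting paid_enabled to False means a paid source can only be reached by a deliberate opt-in at the call site.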

XNAS / XNYS

US Large-cap

Active (v1.0)

S&P 500 Top 50 · yfinance + SEC EDGAR

XKRX

Korea KOSPI

Active (v1.1)

KOSPI 200 · yfinance .KS + Naver Finance

XKOS

Korea KOSDAQ

Partial

yfinance .KQ · Naver pending

BINANCE_PERP

Crypto Perp

Research

BTC / ETH · CCXT

NYSE / CBOE

FX & Commodity ETFs

Partial

yfinance + CFTC COT

Calibration Table

8 theses measured, not failed

v0.6 called these "8 thesis FAIL" against a rigid Sharpe gate and shut down. v1.0 reframes the same data as the calibration baseline — weak signals carry near-zero Brier weight, strong signals carry proportional weight.

E_PEAD              US 50      0.587  +0.63  0.18
E_FOREIGN_REVERSAL  KR 20      0.467  +0.58  0.14
E_INSIDER_CLUSTER   US 19      0.339  +0.78  0.05
E_COMMODITY_TS      Commodity  0.489  +0.14  0.06
E_SECTOR_ROTATION   US 11      0.470  −0.48  0.00
E_FOMC_DRIFT        US 12      0.357  −1.34  0.00
E_FX_CARRY          FX 8       0.400  −1.53  0.00
E_FUNDING_CARRY     Crypto 2   0.505  −0.23  0.02

Brier-derived weights are illustrative; actual values computed at run time from cache/calibration_table.parquet. Full table and interpretation: docs/CALIBRATION.md
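One plausible way to turn a Brier score into a weight is skill relative to a coin-flip baseline, floored at zero. This is an assumption made for illustration: the formula GLOSTAT actually uses is not stated on this page, and the brier_weight function and the Brier values below are hypothetical.

```python
BASELINE_BRIER = 0.25  # Brier score of always predicting p = 0.5


def brier_weight(brier: float) -> float:
    """Skill relative to the coin-flip baseline, floored at zero.

    A thesis no better than a coin flip gets weight 0; a perfect one gets 1.
    """
    return max(0.0, 1.0 - brier / BASELINE_BRIER)


# Made-up Brier scores, for illustration only.
for name, brier in [("strong", 0.20), ("coin_flip", 0.25), ("worse", 0.31)]:
    print(f"{name:10} brier={brier:.2f}  weight={brier_weight(brier):.2f}")
```

Under this mapping a thesis that underperforms the baseline is not penalised below zero; it simply stops contributing.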

Quickstart

Up and running in minutes

Requires Python ≥ 3.11. No paid API keys needed in default MVP mode.

shell

# Install with pip (uv users: uv pip install glostat)
pip install glostat

# Run a prediction (mock mode, no network)
glostat predict AAPL --horizon 5d --mock

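# Live KR prediction (free, no Bigdata MCP needed)
# Replace the placeholder contact below with your own; export applies it to both commands.
export GLOSTAT_SEC_USER_AGENT="Your Name you@example.com"
glostat predict 005930   # Samsung Electronics
glostat predict 096770   # SK Innovation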

# Quarterly recalibration
glostat calibrate --all-thesis --window 365d
glostat calibrate --update-table

python

from glostat.predictor.composite import composite_p_up
from glostat.core.types import Prediction

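# Get a calibrated probability prediction.
# `pipeline` is assumed to be an already-configured GLOSTAT pipeline object;
# its construction is not shown in this snippet.
prediction: Prediction = pipeline.predict("AAPL", horizon="5d")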

print(f"p_up = {prediction.p_up:.3f}  "
      f"90%CI=[{prediction.p_up_lower:.3f}, {prediction.p_up_upper:.3f}]")

for c in prediction.contributing:
    print(f"  {c.thesis_name:24} "
          f"dir={c.direction:4}  "
          f"weight={c.brier_weight:.3f}  "
          f"AUC={c.auc:.3f}  "
          f"n={c.n_calibration_samples}")

print(prediction.disclaimer)  # Always non-empty (INV-GS-104)

Extend it

Add your own thesis

The infrastructure is independent of which thesis you screen. New thesis modules just need calibration data attached to the PR.

1️⃣

Write a thesis module

Implement the Thesis protocol in src/glostat/experts/. Each module returns a typed (direction, raw_score, sources) result. See docs/EXAMPLES.md.

2️⃣

Register a data source

Add a routing entry in data_router.py. The DataRouter enforces phase gating so paid sources stay blocked until you explicitly opt in.

3️⃣

Run the hindcast

Configure Hindcast, point at a universe, get an IS/OOS report with AUC, Sharpe, and Brier score. Minimum n=50 required (INV-GS-026).

4️⃣

Submit a calibration row

Append the result to calibration_table.parquet. The Brier-weighted ensemble picks the weight automatically at the next recalibration run.
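For step 1, a thesis module might be shaped roughly like the sketch below. Everything here is an assumption for illustration: the real Thesis protocol, its method name and arguments, and the result type live in the repository and may differ. ThesisOutput, evaluate, and MomentumThesis are hypothetical names.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class ThesisOutput:
    """Hypothetical typed result: (direction, raw_score, sources)."""
    direction: str            # "up", "down", or "flat"
    raw_score: float          # uncalibrated signal strength
    sources: tuple[str, ...]  # provenance identifiers


class Thesis(Protocol):
    """Stand-in for the protocol a thesis module would satisfy."""
    name: str

    def evaluate(self, symbol: str, closes: list[float]) -> ThesisOutput: ...


class MomentumThesis:
    """Toy thesis: direction follows the sign of the window return."""
    name = "E_TOY_MOMENTUM"

    def evaluate(self, symbol: str, closes: list[float]) -> ThesisOutput:
        change = closes[-1] / closes[0] - 1.0
        direction = "up" if change > 0 else "down" if change < 0 else "flat"
        return ThesisOutput(direction, change, (f"toy:{symbol}",))


thesis: Thesis = MomentumThesis()  # structural typing: no inheritance needed
out = thesis.evaluate("AAPL", [100.0, 102.0, 105.0])
print(out.direction, round(out.raw_score, 4), out.sources)
```

The raw score is left uncalibrated on purpose; turning it into a probability and a weight is the job of the hindcast and calibration steps.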
