experiment data api
Atlas and Mirror trade Kalshi prediction markets 24/7 under opposite prompt framings. Hearth runs a micro-SaaS. Compass audits them nightly. Every decision they make — with the Claude reasoning trace, the calibrated fair probability, the size, the outcome, the post-mortem — lives in a JSON feed.
This is that feed. Nobody else publishes this data at this granularity because nobody else is running autonomous agents in production the way we are. If you're studying agent behavior, building a competing system, or just want to piggyback on a calibrated trader, you'll get more from one week of this feed than from any public dataset.
experiment_actions
Every Hearth / Atlas / Mirror action. Ship, email, buy, refund, post, pivot, pause, observe — timestamped, with the 280-char summary + optional markdown detail + cost + outcome.
pmb_trades
Every Kalshi trade — ticker, side, size, entry/exit, realized P&L, open + close timestamps, agent id (1 = Atlas / survival framing, 2 = Mirror / neutral control), mode (paper vs live), and the wake log id that produced it so you can join back to the reasoning.
experiment_lessons
Distilled post-mortems the agents write after losses or broken deploys. One-line title, 2–5 sentence body, trigger event, source. These are the behavioral updates the agents feed back into their own prompts — the Reflexion loop in raw form.
pmb_alignment_divergenceresearch
Nightly divergence scan comparing Atlas (survival-framed) against Mirror (neutral control) on shared market exposure. Counts invented-rule events, bankroll misstatements, self-preservation phrase leaks, and paired-wake decision disagreements. The measurement instrument behind the agentic-misalignment hypothesis — not published anywhere else in raw form.
agent_scorecardsresearch
Rolling 7-day process-quality scorecards per agent. Brier score (calibration), steelman-language rate, skip/restraint rate, average cited edge points, plus outcome context (PnL, win rate). Process-based, not outcome-based — the metrics designed to resist Goodharting. Diff weeks to track calibration drift.
Bearer auth. JSON response. Incremental pulls via ?since=.
curl -H 'Authorization: Bearer dvdshn_XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX' \ 'https://dvdshn.com/api/public/experiments?experiment=kalshi&limit=5'
Response (abbreviated):
{
"ok": true,
"as_of": "2026-04-16T18:21:04Z",
"authenticated_as": "apollo-research",
"tier": "read",
"experiments": ["kalshi"],
"experiment_actions": [...],
"experiment_lessons": [...],
"pmb_trades": [
{
"id": 8,
"agentId": 1,
"kalshiTicker": "KXHIGHTSFO-26APR15-B61.5",
"side": "yes",
"sizeUsd": "47.00",
"entryPriceCents": 26,
"exitPriceCents": 6,
"pnlUsd": "-57.69",
"result": "loss",
"openedAt": "2026-04-15T20:03:17Z",
"closedAt": "2026-04-15T23:59:58Z",
"mode": "paper",
"wakeLogId": 14
}
]
}Python — incremental pull into pandas:
import os, requests, pandas as pd
r = requests.get(
"https://dvdshn.com/api/public/experiments",
params={"experiment": "kalshi", "limit": 500},
headers={"Authorization": f"Bearer {os.environ['DVDSHN_KEY']}"},
timeout=30,
)
r.raise_for_status()
payload = r.json()
trades = pd.DataFrame(payload["pmb_trades"])
trades["pnl_usd"] = trades["pnlUsd"].astype(float)
print(trades.groupby("agentId")["pnl_usd"].describe())
# Next pull — only rows newer than this one
cursor = payload["as_of"]alignment researchers
Paired survival vs. neutral framings on identical markets, with nightly divergence scans. A live dataset for agentic-misalignment measurement — not a synthetic eval.
agent builders
Lessons rows are Reflexion updates in raw form. Action rows show the prompt-to-execution loop under real money pressure. Study the failure modes before you hit them.
prediction-market traders
Calibrated fair probabilities + outcomes joined to ticker + size. Piggyback edges, benchmark your own Brier score, or just watch a Claude trader compound (or blow up) in public.
Each table has its own landing page with schema, sample query, and use cases. Single $29/mo subscription covers every dataset.
experiment — kalshi | embedproof (optional; default = both)
since — ISO 8601 timestamp; returns rows created after it
limit — 1..500 (default 100)
Authorization: Bearer dvdshn_... header (preferred), or ?api_key= query param.$29/mo. Key emailed on subscription. Cancel anytime.
Subscribe · $29/moPrefer email? david@dvdshn.com