---
name: manyworlds-mcp
description: Invoke when the user asks about Many Worlds Research, wants to find simulation research, inspect predictions, compare mesh output to market prices, evaluate betting recommendations, or analyze predicted-vs-actual outcomes for sports, earnings, or event forecasts.
---

# Many Worlds Research MCP

Use the Many Worlds Research (MWR) MCP server to browse simulation research, inspect predictions, and analyze outcomes. Each research instance represents a single market being analyzed by the mesh -- a multi-agent analytical network that produces structured probability forecasts across "worlds" of possible outcomes.

## MCP endpoint

- URL: `https://manyworldsresearch.com/mcp`
- Auth: bearer token

Never print or expose token values.
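
A minimal sketch of keeping the token out of any output, assuming it is supplied through an environment variable. The variable name `MWR_MCP_TOKEN` and both helper functions are hypothetical; they are not part of the server's documentation.

```python
import os

MCP_URL = "https://manyworldsresearch.com/mcp"


def auth_headers(env_var: str = "MWR_MCP_TOKEN") -> dict[str, str]:
    """Build request headers from a token held in the environment.

    The variable name is an assumption; use whatever your deployment provides.
    """
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"{env_var} is not set")
    return {"Authorization": f"Bearer {token}"}


def redacted(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy that is safe to log: the token value is masked."""
    return {
        key: ("Bearer ***" if key.lower() == "authorization" else value)
        for key, value in headers.items()
    }
```

Log or display only the output of `redacted(...)`, never the raw headers.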

## When to use this skill

Activate when the user asks about:
- Specific MWR predictions ("what did the mesh predict for Lakers/Rockets")
- Mesh performance and calibration ("how accurate has the mesh been on NBA")
- Betting recommendations on Polymarket events the mesh covers
- Comparisons between mesh probabilities and market prices
- Analytical reasoning behind a prediction ("why does the mesh think that")
- Resolved outcomes and prediction accuracy

Do not activate for general sports analysis, general betting advice, or questions about Polymarket that don't involve MWR research.

## Core semantic model

Before using any tool, internalize these distinctions. Confusing them is the source of most incorrect answers.

**pos_label and neg_label define the probability reference frame.** Every research instance has two sides. `simulated_win_prob` always refers to `pos_label`. The other side's probability is `1 - simulated_win_prob`. Never assume the title's first-named entity is `pos_label`.

**kelly_bet_side refers to pos or neg, not a team name.** A recommendation of "bet pos" means bet the side labeled `pos_label`. Translate to the team or outcome name only when presenting to the user.

**kelly_raw_edge_pp is measured for the positive label unless the payload explicitly says otherwise.** A positive edge generally indicates value on `pos_label`; a negative edge generally indicates value on `neg_label`. Still, present the actionable value side from `kelly_bet_side` when available, because it is the explicit recommendation field.

**kelly_full and kelly_live are bankroll fractions, not probabilities.** `kelly_full` is the theoretical full-Kelly stake fraction implied by the calibrated edge. `kelly_live` is the risk-reduced deployable stake fraction used for practical recommendations. If both are zero, treat the recommendation as pass/no bet even when raw mesh-vs-market edge is nonzero. When reporting a suggested bet, show `kelly_live` first and optionally include `kelly_full` as context.

**The mesh's pick and the recommended bet can differ.** When the mesh slightly favors one side but the market overprices that side more than the mesh's lean, the positive-expected-value bet is the *other* side. This is a value bet, not a contradiction. When presenting recommendations, distinguish "who the mesh thinks will win" from "which side has betting value."

**A resolved event has an actual outcome; an active event does not.** Check `status` before making correctness claims. For resolved events, compare `resolved_outcome` against the mesh's pick and state whether the realized side was favored.
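
The distinctions above can be sketched as a single helper that maps the pos/neg frame onto labels. This is an illustrative sketch, not the server's logic: the function name and the returned keys are hypothetical, and it assumes `resolved_outcome` holds the winning side's label.

```python
from typing import Any, Optional


def summarize_instance(meta: dict[str, Any]) -> dict[str, Any]:
    """Translate the pos/neg fields of a metadata record into labeled facts."""
    p_pos = meta["simulated_win_prob"]  # always refers to pos_label
    labels = {"pos": meta["pos_label"], "neg": meta["neg_label"]}
    probs = {"pos": p_pos, "neg": 1.0 - p_pos}

    # Who the mesh thinks will win (a 50/50 forecast maps to pos, arbitrarily).
    pick_side = "pos" if p_pos >= 0.5 else "neg"

    # Which side has betting value; both Kelly fractions at zero means pass.
    bet_side = meta.get("kelly_bet_side")
    live = meta.get("kelly_live") or 0.0
    full = meta.get("kelly_full") or 0.0
    is_pass = bet_side not in labels or (live == 0 and full == 0)

    # Only resolved events with a recorded outcome support correctness claims.
    outcome = meta.get("resolved_outcome")
    pick_correct: Optional[bool] = None
    if meta.get("status") == "resolved" and outcome is not None:
        pick_correct = outcome == labels[pick_side]

    return {
        "pick": labels[pick_side],
        "pick_prob": probs[pick_side],
        "bet": None if is_pass else labels[bet_side],
        "stake_live": 0.0 if is_pass else live,
        "is_value_bet": (not is_pass) and bet_side != pick_side,
        "pick_correct": pick_correct,
    }
```

When `is_value_bet` is true, report the pick and the bet as two separate statements rather than merging them.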

## Answering common user intents

### "What did the mesh predict for X?"

1. Search by name or phrase using `search_research`.
2. From results, identify the right instance by subject, date, and status.
3. Call `list_research_documents` to confirm available documents.
4. Fetch `metadata_json` for the core prediction summary.
5. If the user wants the reasoning, also fetch `report_html` and summarize the narrative sections.

Lead with the probability and the labeled side. State the market comparison. If resolved, state the outcome and whether the mesh was right.

### "How is the mesh performing on X?"

1. Use `list_research` with supported filters such as `sport`, `status=resolved`, `sort=-game_time`, `page`, and `size`.
2. If the user asks for a date range, page through enough results and filter locally by `game_time` or `report_run_at`.
3. For each result, read `metadata_json` when the list payload is insufficient; otherwise use the list payload directly for compact aggregate checks.
4. Compute accuracy, Brier score, calibration buckets, log loss, and any requested aggregates.
5. Present as a table with per-prediction detail plus aggregate summary.
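
Steps 1 and 2 can be sketched as a paging loop with a local date filter. Several things here are assumptions: `fetch_page` is a hypothetical wrapper around `list_research` (called with `status=resolved` and `sort=-game_time`), it is assumed to return a plain list of rows, paging is assumed to start at 1, and `game_time` is assumed to be an ISO 8601 timestamp with a UTC offset.

```python
from datetime import datetime
from typing import Any, Callable

Row = dict[str, Any]


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def rows_in_range(
    fetch_page: Callable[[int, int], list[Row]],
    start: datetime,
    end: datetime,
    size: int = 50,
    max_pages: int = 20,
) -> list[Row]:
    """Collect rows whose game_time lies in [start, end], newest first."""
    kept: list[Row] = []
    for page in range(1, max_pages + 1):  # 1-based paging is an assumption
        rows = fetch_page(page, size)
        if not rows:
            break  # ran out of results
        for row in rows:
            if not row.get("game_time"):
                continue  # cannot place this row in time; skip it
            ts = parse_ts(row["game_time"])
            if ts < start:
                return kept  # newest-first order: every later row is older
            if ts <= end:
                kept.append(row)
    return kept
```

The early return relies on the descending sort; without it, the loop would need to scan every page.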

When the user asks about a specific subset (contrarian calls, high-edge calls, value bets), filter by the relevant metadata fields before aggregating.
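
A minimal sketch of the aggregate metrics, using the standard definitions of accuracy, Brier score, and log loss for binary forecasts. The function name and the input shape, `(p_pos, pos_won)` pairs in the `pos_label` frame, are illustrative choices rather than anything the server returns directly.

```python
import math
from collections import defaultdict


def score_forecasts(records: list[tuple[float, bool]]) -> dict:
    """Score binary forecasts given as (p_pos, pos_won) pairs."""
    if not records:
        raise ValueError("no resolved predictions to score")
    eps = 1e-12  # keeps log() finite for forecasts of exactly 0 or 1
    hits = 0
    brier = 0.0
    log_loss = 0.0
    buckets: dict[int, list[tuple[float, float]]] = defaultdict(list)
    for p, won in records:
        y = 1.0 if won else 0.0
        hits += int((p >= 0.5) == won)  # a 50/50 forecast counts as a pos pick
        brier += (p - y) ** 2
        q = min(max(p, eps), 1 - eps)
        log_loss -= y * math.log(q) + (1 - y) * math.log(1 - q)
        buckets[min(int(p * 10), 9)].append((p, y))  # ten equal-width bins
    n = len(records)
    calibration = {
        f"{b / 10:.1f}-{(b + 1) / 10:.1f}": {
            "n": len(v),
            "mean_forecast": sum(p for p, _ in v) / len(v),
            "observed_rate": sum(y for _, y in v) / len(v),
        }
        for b, v in sorted(buckets.items())
    }
    return {"n": n, "accuracy": hits / n, "brier": brier / n,
            "log_loss": log_loss / n, "calibration": calibration}
```

With small samples the calibration buckets are noisy, so report the per-bucket `n` alongside each rate.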

### "Should I bet this market?"

1. Fetch `metadata_json` for the research instance.
2. Fetch `odds_json` for current market prices and Kelly recommendations.
3. Present the three standard framings:
   - The mesh's directional call and confidence
   - The market's current pricing
   - The recommended bet side and size, explicitly noting when this differs from the mesh's pick (value bet scenario)
4. Note whether the event is still active and how long until close.

Never present Kelly fractions as "the right bet" without context. Kelly fractions are stake sizes, not win probabilities; the user decides based on their own risk tolerance.

### "Why does the mesh think X?"

1. Fetch `report_html` for the narrative explanation.
2. If the user wants deeper structural insight, also fetch `mesh_json` to describe dimensions, worlds, and calibration.
3. Summarize the dominant worlds and their probabilities.
4. Identify the key dimensions driving the forecast (highest sensitivity).
5. Note any uncertainty or limitation the report flags.

Do not invent reasoning that isn't in the report. If the mesh's logic is unclear, say so.

## Response style

**Lead with the answer.** Don't build up to it.

**Attribute probabilities to their labeled side.** Write "the mesh gives Lakers a 63% chance" not "the mesh thinks Lakers."

**Distinguish mesh opinion from outcome.** "The mesh predicted Lakers at 63% and Lakers lost" is clear. "The mesh got it wrong" without context is not.

**When the mesh was wrong, say so directly.** Don't soften. Users value honesty about calibration more than defensive framing.

**Handle missing data explicitly.** If `resolved_outcome` is null on a closed event, say the outcome is unavailable rather than guessing. If `market_prob` and `market_prob_at_run` disagree significantly, mention it -- the gap itself is information.

**Distinguish the mesh's view of what will happen from the recommended bet.** Especially in value bet scenarios, this distinction is the most common source of user confusion.

## Worked examples

The examples below are synthetic patterns, not factual performance records. Always retrieve current documents before giving real probabilities, outcomes, returns, or hit rates.

### Example 1: Specific prediction lookup

User: *"What did Many Worlds predict for the Nuggets vs Spurs game last week?"*

1. `search_research` with query "Nuggets Spurs"
2. Identify the matching resolved instance
3. `list_research_documents` to confirm `metadata_json` exists
4. `get_research_document` for `metadata_json`
5. Present:

> Example only: Many Worlds predicted Nuggets at 61% to win against the Spurs on April 12. The market had Spurs at 79% (Nuggets at 21%), making this a strong contrarian call. The Nuggets won. The mesh's calibrated confidence was 84.5%, which made this a high-conviction recommendation.

Do not include ROI or bankroll returns unless they are computed from retrieved odds and settlement data.

### Example 2: Performance aggregation

User: *"How has the mesh done on NBA games where it disagreed with the market?"*

1. `list_research` with filter `sport=nba`, `status=resolved`
2. For each result, fetch `metadata_json`
3. Filter to instances where `simulated_win_prob` and `market_prob_at_run` favor different sides
4. Compute hit rate and average edge; compute Kelly returns only when price and settlement data support that calculation
5. Present as table plus summary

> Example only: Across the selected window, the mesh made 9 contrarian NBA calls. It was correct on 6 of 9 (66.7%). The average edge was 18.4pp. The three misses are listed with their predicted probabilities.

Only report Kelly returns when you have computed them from actual retrieved prices and outcomes.
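
The contrarian filter in step 3 can be sketched as follows. The helper names are hypothetical, and the sketch assumes `resolved_outcome` holds the winning side's label; rows without a recorded outcome or a run-time market price are skipped rather than guessed.

```python
def favored_side(prob_pos: float) -> str:
    """Return 'pos' or 'neg' for a positive-side probability (0.5 maps to 'pos')."""
    return "pos" if prob_pos >= 0.5 else "neg"


def is_contrarian(meta: dict) -> bool:
    """True when the mesh and the run-time market favor different sides."""
    return (favored_side(meta["simulated_win_prob"])
            != favored_side(meta["market_prob_at_run"]))


def contrarian_hit_rate(rows: list[dict]) -> tuple[int, int]:
    """Return (correct, total) over resolved contrarian calls."""
    correct = total = 0
    for m in rows:
        if m.get("status") != "resolved" or m.get("resolved_outcome") is None:
            continue  # no recorded outcome, so no correctness claim
        if m.get("market_prob_at_run") is None or not is_contrarian(m):
            continue
        side = favored_side(m["simulated_win_prob"])
        pick = m["pos_label"] if side == "pos" else m["neg_label"]
        total += 1
        correct += int(m["resolved_outcome"] == pick)
    return correct, total
```

Report the result as "correct of total" so that a small denominator stays visible.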

### Example 3: Value bet distinction

User: *"What's the current bet recommendation for the Tesla earnings market?"*

1. `search_research` for "Tesla"
2. Identify the active earnings instance
3. Fetch `metadata_json` and `odds_json`
4. Identify the value bet scenario from the relationship between `simulated_win_prob` and `kelly_bet_side`
5. Present:

> Example only: The mesh estimates a 52% chance Tesla misses earnings (EPS at or below $0.39) -- essentially a coin flip. The market prices the miss at 61%, meaning traders are meaningfully more confident in a miss than the mesh is.
>
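> The recommended bet is on the beat, not the miss. This is a value bet: the mesh slightly favors the miss, but the market overprices the miss so much that the expected-value play is the other side. The beat is priced at 39% against a mesh estimate of 48%; if the calibrated `kelly_p_hat` for the beat were also 48%, full Kelly would be (0.48 - 0.39) / (1 - 0.39) ≈ 14.8% of bankroll, and the quarter-Kelly `kelly_live` stake about 3.7%.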
>
> Note: the mesh does not predict a beat. It identifies a pricing error on the beat side.

## Tool reference

### `search_research`
Full-text search over research instance titles.
- Required: `query` (string)
- Optional: `sport`, `status`, `size`

### `list_research`
List research instances with filtering.
- Optional: `sport`, `status`, `sort`, `page`, `size` (max 50)

### `list_research_documents`
List available documents for an instance.
- Required: `instance_id` (event ID or slug)

### `get_research_document`
Fetch a single document.
- Required: `instance_id`, `document_id`
- Optional: `max_chars` (default 120000, max 500000)
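
As a sketch of what a call looks like on the wire, MCP tools are invoked with a JSON-RPC 2.0 `tools/call` request. The helpers below only build the message; sending it requires an initialized MCP client session, which is not shown. The helper names are hypothetical, and passing a document type such as `metadata_json` as `document_id` is an assumption based on the examples in this document.

```python
import json
from typing import Any, Optional


def tool_call(name: str, arguments: dict[str, Any], request_id: int = 1) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


def get_document_args(instance_id: str, document_id: str,
                      max_chars: Optional[int] = None) -> dict[str, Any]:
    """Arguments for `get_research_document`, keeping max_chars within 500000."""
    args: dict[str, Any] = {"instance_id": instance_id, "document_id": document_id}
    if max_chars is not None:
        args["max_chars"] = max(1, min(max_chars, 500_000))
    return args


# Example: build (but do not send) a search request.
request = tool_call("search_research", {"query": "Nuggets Spurs"})
```

Omitting `max_chars` leaves the server default in effect.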

### Available document types

**`metadata_json`** -- Compact core record. Use for prediction summaries and outcome comparisons.

Key fields: `id`, `slug`, `title`, `status`, `resolved_outcome`, `game_time`, `simulated_win_prob`, `market_prob`, `market_prob_at_run`, `kelly_raw_edge_pp`, `kelly_bet_side`, `kelly_p_hat`, `kelly_full`, `kelly_live`, `kelly_sensitivity_hhi`, `kelly_win_prob_stddev`, `pos_label`, `neg_label`.

**`odds_json`** -- Self-contained pricing and Kelly snapshot for one research instance. Use this when the user asks about current pricing, value bets, line details, or side-by-side model vs market comparison.

Field meanings:

- `title`: Human-readable event title.
- `status`: Event lifecycle state (`active`, `closed`, or `resolved`).
- `resolved_outcome`: Final settled winning side label when available; null if unresolved or missing.
- `sim_prob`: Model probability for the **positive side** (`pos_label`) on a 0-1 scale.
- `market_prob_at_run`: Market-implied probability for the positive side at research-run time (0-1).
- `market_prob`: Most recently fetched market-implied probability for the positive side (0-1).
- `market_prob_fetched_at`: Timestamp for `market_prob`.
- `kelly_raw_edge_pp`: Raw model edge in percentage points, computed from model probability vs market probability baseline.
- `kelly_bet_side`: Recommended side from Kelly logic (`pos` or `neg`). This is the **bet direction**, not necessarily the side with higher model win probability in value-bet scenarios.
- `kelly_p_hat_uncapped`: Reliability-model estimate of win probability for the **bet side** before any guardrails. This comes from a logistic calibration model trained on historical settled bets.
- `kelly_p_hat`: Final calibrated win probability for the bet side after guardrails (log-odds shift cap). Use this as the probability input to Kelly sizing.
- `kelly_full`: Full Kelly fraction computed from calibrated probability and side price.
   - Intuition: fraction of bankroll maximizing long-run log growth if inputs are perfectly calibrated.
   - Formula used: `max(0, (kelly_p_hat - side_price) / (1 - side_price))`.
- `kelly_live`: Deployable/safer Kelly fraction used in practice.
   - Current implementation is **quarter Kelly**: `kelly_live = 0.25 * kelly_full`.
   - So `kelly_live` differs from `kelly_full` only by risk scaling (same direction, smaller size).
- `kelly_sensitivity_hhi`: Concentration metric for scenario sensitivity (higher generally means the sensitivity is concentrated in fewer scenarios).
- `kelly_win_prob_stddev`: Dispersion of win probability across scenarios.
- `pos_label`: Human-readable label for the positive side (the side `sim_prob` refers to).
- `neg_label`: Human-readable label for the opposite side.
- `markets`: Raw array of market objects used for line/price context.

`markets[]` objects can vary by plugin/market type, but commonly include:

- `question`: Market prompt text.
- `outcomes`: Side labels in exchange order.
- `market_type`: Type such as moneyline/spread/total/prop.
- `line`: Numeric spread/total threshold when applicable.
- `yes_price`, `no_price`: Side prices/probability-like quotes.
- `volume`, `liquidity`: Market activity/depth metrics.
- `market_id`: Provider market identifier.

Interpretation rules:

- Always map probabilities through labels: `sim_prob` applies to `pos_label`; the opposite side is `1 - sim_prob` in binary markets.
- Do not equate `kelly_bet_side` with "the model predicts this side wins"; it can be a value bet against the more likely side.
- Treat Kelly fractions as position sizing guidance only.
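
The `kelly_full` and `kelly_live` formulas listed under the field meanings can be sketched as below. The function names are hypothetical, and deriving the side price from `market_prob` is a simplifying assumption that ignores bid/ask spread and fees; prefer the actual quoted price when the payload provides one.

```python
def kelly_fractions(p_hat: float, side_price: float,
                    scale: float = 0.25) -> tuple[float, float]:
    """Return (kelly_full, kelly_live) for a binary contract paying 1 on a win.

    p_hat is the calibrated win probability of the bet side; side_price is
    the cost of one share of that side, strictly between 0 and 1.
    """
    if not 0.0 < side_price < 1.0:
        raise ValueError("side_price must be strictly between 0 and 1")
    full = max(0.0, (p_hat - side_price) / (1.0 - side_price))
    return full, scale * full  # quarter Kelly by default


def side_price_from_market(market_prob: float, bet_side: str) -> float:
    """Approximate the bet side's price from the positive-side probability.

    Assumption: no spread or fees, so the neg price is 1 - market_prob.
    """
    return market_prob if bet_side == "pos" else 1.0 - market_prob


# Illustrative numbers only: p_hat of 0.55 on a side priced at 0.45.
full, live = kelly_fractions(0.55, side_price_from_market(0.45, "pos"))
```

When `p_hat` does not exceed the side price, both fractions are zero, which corresponds to a pass.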

What "calibrated" means in this system:

- Raw mesh probabilities are adjusted by a fitted reliability model before staking.
- Calibration uses three features: absolute edge size (`|kelly_raw_edge_pp|`), sensitivity concentration (`kelly_sensitivity_hhi`), and scenario dispersion (`kelly_win_prob_stddev`).
- Result: `kelly_p_hat` is a probability intended to be better aligned with realized outcomes than raw mesh probabilities.

**`report_html`** -- Human-readable narrative report. Use for reasoning and analytical synthesis.

Typical sections: The Call, How This Resolves, What Decides This, What to Watch, Mesh vs. Market, How This Works, Uncertainty and Limitations. Section names vary; treat them as examples, not guarantees. Strip CSS/SVG when summarizing.
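
One way to strip CSS and SVG before summarizing is a small parser built on the standard library's `html.parser`. This is a sketch under the assumption that the report is ordinary HTML; the class and function names are illustrative.

```python
from html.parser import HTMLParser


class _ReportText(HTMLParser):
    """Collect visible text, skipping <style>, <script>, and <svg> subtrees."""

    SKIP = {"style", "script", "svg"}

    def __init__(self) -> None:
        super().__init__()
        self.depth = 0  # > 0 while inside a skipped element
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.depth:
            self.chunks.append(text)


def report_text(html: str) -> str:
    """Return the report's visible text as one space-joined string."""
    parser = _ReportText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

The output loses heading structure, so use it for summarizing content rather than for locating sections by name.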

**`mesh_json`** -- Simulation configuration. Use for deep structural analysis.

Key structures: `tiltAxis` (meaning of positive/negative tilt), `calibrationAnchors` (calibration inputs), `dimensions` (scenario variables with values and priors), `worlds` (named scenario outcomes), `tests` (live-update triggers), `dimensionCorrelations`, `worldPriority`, `_extractionStats`, and `_marketAlignment`. Shape is not strictly guaranteed; check field presence before summarizing.
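
Because the shape is not guaranteed, a presence check before summarizing can be sketched as below. The key names come from the list above; the function name is hypothetical, and the sketch deliberately makes no assumption about what is inside each structure.

```python
MESH_KEYS = [
    "tiltAxis", "calibrationAnchors", "dimensions", "worlds", "tests",
    "dimensionCorrelations", "worldPriority", "_extractionStats",
    "_marketAlignment",
]


def mesh_inventory(mesh: dict) -> dict[str, list[str]]:
    """Report which known structures are present and non-empty."""
    present = [k for k in MESH_KEYS if mesh.get(k) not in (None, [], {}, "")]
    missing = [k for k in MESH_KEYS if k not in present]
    return {"present": present, "missing": missing}
```

Summarize only what appears under `present`, and mention the gaps if the user asked about a missing structure.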

### Document selection guidance

- For quick lookups and outcome comparison -> `metadata_json`
- For betting analysis and current market state -> `odds_json`
- For reasoning and narrative -> `report_html`
- For structural mesh internals -> `mesh_json`

Always `list_research_documents` first. Do not assume all four documents exist for every instance.
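
The selection guidance can be sketched as a lookup that respects what `list_research_documents` actually returned. The intent names are illustrative labels, and the sketch assumes the listing has already been reduced to a list of document id strings.

```python
from typing import Optional

# Intent names are illustrative labels, not server-defined values.
DOC_FOR_INTENT = {
    "lookup": "metadata_json",
    "betting": "odds_json",
    "reasoning": "report_html",
    "structure": "mesh_json",
}


def choose_document(intent: str, available: list[str]) -> Optional[str]:
    """Return the preferred document id for an intent, or None if not listed."""
    wanted = DOC_FOR_INTENT.get(intent)
    return wanted if wanted in available else None
```

A `None` result means the document does not exist for this instance; say so instead of substituting another source silently.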

## Handling edge cases

**Missing `resolved_outcome` on closed events.** State that the outcome is not yet recorded and do not infer it. The event may be disputed, unresolved, or pending data sync.

**Significant drift between `market_prob_at_run` and `market_prob`.** Note the drift. Large changes between research time and now indicate the market has moved -- relevant for users considering a live bet.

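**`kelly_p_hat` differing substantially from `simulated_win_prob`.** This is the calibration layer adjusting raw mesh confidence. Compare like with like: `kelly_p_hat` refers to the bet side, so when `kelly_bet_side` is `neg`, compare it against `1 - simulated_win_prob`. Explain both the raw mesh estimate and the calibrated estimate; they represent different things.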
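
The first three checks can be sketched as one function that returns notes to surface. The thresholds (`drift_pp`, `calib_pp`) are arbitrary illustrative defaults, since "significant" is not given a number here, and the function name is hypothetical.

```python
def edge_case_notes(meta: dict, drift_pp: float = 5.0,
                    calib_pp: float = 10.0) -> list[str]:
    """Return human-readable caveats for one metadata record."""
    notes: list[str] = []

    # Closed or resolved without a recorded outcome: report it, never infer it.
    if meta.get("status") in ("closed", "resolved") and meta.get("resolved_outcome") is None:
        notes.append("Outcome not yet recorded; do not infer it.")

    # Market drift between research time and the latest fetch (pos frame).
    at_run, now = meta.get("market_prob_at_run"), meta.get("market_prob")
    if at_run is not None and now is not None:
        move = (now - at_run) * 100
        if abs(move) >= drift_pp:
            notes.append(f"Market moved {move:+.1f}pp on the positive side since the run.")

    # Calibration gap, compared in the bet side's frame.
    p_hat, p_sim = meta.get("kelly_p_hat"), meta.get("simulated_win_prob")
    side = meta.get("kelly_bet_side")
    if p_hat is not None and p_sim is not None and side in ("pos", "neg"):
        raw_bet_side = p_sim if side == "pos" else 1.0 - p_sim
        gap = (p_hat - raw_bet_side) * 100
        if abs(gap) >= calib_pp:
            notes.append(f"Calibrated estimate differs from raw mesh by {gap:+.1f}pp on the bet side.")

    return notes
```

An empty list means none of these three conditions applies; it does not mean the record is free of other issues.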

**No documents returned from `list_research_documents`.** The instance exists but hasn't produced outputs. Say so; don't fabricate.

**Conflicting information between documents.** Trust `metadata_json` as the source of truth for core fields. Note the discrepancy rather than silently reconciling.
