MarketSync Docs
This documentation captures the exact formulas powering the arbitrage surfaces you see in MarketSync. Every calculation is deterministic and derived from venue data; as features evolve the derivations below will expand as well.
Quick Reference
MarketSync expresses every venue feed in the same vocabulary before any trading logic runs. The symbols below carry through to the calculations and the UI so that probabilities, odds, and averages always refer to the same constructs.
- p — decimal probability (0-1) derived directly from the stored percentage quote. This is the canonical form for math utilities and simulations.
- π — percentage representation (p × 100) that we persist in the database and surface in dashboards for readability.
- decimal_odds — European-style payout multiple (1 ÷ p) used when comparing against sportsbooks and for translating expected value into currency terms.
- avg — volume weighted average execution price after consuming order-book levels. The same number powers depth readouts and post-trade reconciliation.
- threshold — break-even price for the Polymarket leg given the opposing venue's probability. Any average fill above this number destroys the arbitrage margin.
- edge — surplus probability mass (= 100 − (awayπ + homeπ)) that converts directly into the quoted margin and expected value.
By constraining every calculation to this vocabulary we can detect regressions when new data sources deviate from the agreed semantics.
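As a quick illustration of this vocabulary, the conversions can be expressed directly in Python. This is a minimal sketch; the helper names are ours, not MarketSync utilities:

```python
def to_probability(pi: float) -> float:
    """Convert a stored percentage quote (pi) into a decimal probability p."""
    return pi / 100.0

def to_decimal_odds(p: float) -> float:
    """European-style payout multiple (1 / p) for a probability p in (0, 1]."""
    return 1.0 / p

# A 40% quote corresponds to p = 0.40 and decimal odds of 2.5.
p = to_probability(40.0)
odds = to_decimal_odds(p)
```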
1. Arbitrage Opportunity Calculation
Detecting arbitrage requires three precise steps: normalising venue percentages, summing the complementary probabilities, and translating the leftover probability mass into a user-facing margin. Each step is mirrored across the backend utilities and the front-end readout to guard against drift.
We start by treating the stored percentages as probabilities. Dividing by 100 removes presentation bias and makes the numbers compatible with deterministic math.
p_away = away_odds / 100
p_home = home_odds / 100
With both legs expressed as probabilities, MarketSync adds the values to measure how close the pair gets to a fully hedged position. When the sum exceeds one, the opportunity disappears; when it falls short, an edge emerges.
combined = p_away + p_home
Finally, we expose the remaining probability mass as a percentage margin. This is the number displayed in the alerts feed and dashboard badges.
profit_margin = (1 - combined) * 100
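Put together, the three steps fit in a single helper. The sketch below is illustrative only, not the production routine:

```python
def profit_margin(away_odds: float, home_odds: float) -> float:
    """Arbitrage margin (in %) for a pair of percentage quotes.

    Positive values mean leftover probability mass (an edge);
    negative values mean the pair is overpriced and no arbitrage exists.
    """
    p_away = away_odds / 100.0  # normalise to probabilities
    p_home = home_odds / 100.0
    combined = p_away + p_home  # distance from a fully hedged position
    return (1.0 - combined) * 100.0

# 47% + 49% leaves roughly 4 points of probability mass on the table.
margin = profit_margin(47.0, 49.0)  # ≈ 4.0
```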
The backend implementation in back-end/markets/model_utils.py repeats these exact operations after mapping venue-specific naming quirks. The duplication lets us diff any discrepancies between calculation domains and flag stale data before traders rely on it.
2. Order Book Depth & Maximum Safe Buy Size
Once an arbitrage exists we need to know how much liquidity we can safely consume. MarketSync simulates walking the Polymarket ask book, adding size level by level until the resulting average price reaches the break-even threshold implied by the opposing venue.
The pseudocode below mirrors the routine that powers the depth visualisation on the market detail screen. It accumulates size, cost, and recalculates the running average after every fill.
threshold = 1 - other_leg_probability
cumulative_size = 0
cumulative_cost = 0
for each level in order_book_asks:
    price = level.price
    size = level.size
    new_size = cumulative_size + size
    new_cost = cumulative_cost + price * size
    new_avg = new_cost / new_size
    if new_avg <= threshold:
        cumulative_size = new_size
        cumulative_cost = new_cost
        continue
    fill_size = (threshold * cumulative_size - cumulative_cost) / (price - threshold)
    cumulative_size += clamp(fill_size, 0, size)
    cumulative_cost += price * clamp(fill_size, 0, size)
    break
The simulation stops as soon as the average fill price would cross the threshold. At that point any additional liquidity would erase the margin generated by the opposing venue. We surface the resulting safe size, notional cost, and residual edge so traders can size their orders accordingly.
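The routine above translates into a runnable sketch as follows. Levels are simplified to (price, size) tuples, the function name is ours, and a defensive guard on the divisor is added:

```python
def max_safe_buy(order_book_asks, other_leg_probability):
    """Walk the ask book; return (size, cost) such that the average
    fill price stays at or below the break-even threshold."""
    threshold = 1.0 - other_leg_probability
    cumulative_size = 0.0
    cumulative_cost = 0.0
    for price, size in order_book_asks:
        new_size = cumulative_size + size
        new_cost = cumulative_cost + price * size
        if new_cost / new_size <= threshold:
            # The whole level still fits under break-even; keep walking.
            cumulative_size, cumulative_cost = new_size, new_cost
            continue
        if price > threshold:
            # Partial fill: solve (cost + price*f) / (size + f) = threshold for f.
            fill = (threshold * cumulative_size - cumulative_cost) / (price - threshold)
            fill = max(0.0, min(fill, size))
            cumulative_size += fill
            cumulative_cost += price * fill
        break
    return cumulative_size, cumulative_cost

# Opposing venue at 53% implies a 0.47 break-even on this leg.
# The first level fills fully at 0.45; the second fills partially
# until the running average reaches 0.47.
size, cost = max_safe_buy([(0.45, 1000), (0.50, 2000)], 0.53)
```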
Safe Notional Snapshot
Max spend: 770.80 units (derived from the same simulation)
Max size: 1640.0 shares/contracts
Post-fill margin: 0.00%
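The snapshot numbers are internally consistent, which makes for a useful sanity check: spend divided by size gives the volume-weighted average fill, and a 0.00% post-fill margin means that average sits exactly on the break-even threshold.

```python
max_spend = 770.80
max_size = 1640.0

avg_fill = max_spend / max_size  # volume-weighted average fill price
assert round(avg_fill, 4) == 0.47

# Zero residual margin means avg_fill == threshold, which implies the
# opposing leg was priced at 1 - 0.47 = 53% in this snapshot.
```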
3. Payout Scaling & Payout Bands
After the depth walk completes we translate the executed fills into payout bands. Internally we refer to each executed price as p_in: the first fill is p_in_best, the last fill is p_in_worst, and the size-weighted mean is p_in_avg. These values drive the min/avg/max payout display in the arbitrage calculator.
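Before the pseudocode, a small worked example with made-up levels shows how the three p_in values separate:

```python
# Hypothetical book: 1000 shares offered at 0.40, then 1000 at 0.50.
# A 600-unit bet fills the first level entirely (cost 400) and then
# spends the remaining 200 units at 0.50 (another 400 shares).
total_cost = 0.40 * 1000 + 200        # 600 units spent
total_shares = 1000 + 200 / 0.50      # 1400 shares held

p_in_best = 0.40                      # first executed price
p_in_worst = 0.50                     # last executed price
p_in_avg = total_cost / total_shares  # ≈ 0.4286, size-weighted mean

max_payout = total_cost / p_in_best   # ≈ 1500: whole stake at the best price
avg_payout = total_shares             # 1400: shares actually held
min_payout = total_cost / p_in_worst  # ≈ 1200: whole stake at the worst price
```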
levels = normalise(order_book.asks)
remaining = bet_amount
total_cost = 0
total_shares = 0
p_in_best = None
p_in_worst = None
for price, size in levels:
    if remaining <= 0:
        break
    level_cost = price * size
    fill_cost = min(remaining, level_cost)
    fill_shares = fill_cost / price
    p_in_best = p_in_best or price
    p_in_worst = price
    total_cost += fill_cost
    total_shares += fill_shares
    remaining -= fill_cost
p_in_avg = total_cost / total_shares
max_payout = total_cost / p_in_best
avg_payout = total_shares  # equal to total_cost / p_in_avg
min_payout = total_cost / p_in_worst
4. Practical Considerations
Real trades introduce wrinkles beyond the clean math. The points below capture the operational guardrails we maintain while interpreting simulation output.
- Venue-specific fees. Exchange fees, funding spreads, and borrow costs are stored per venue and deducted when we reconcile real fills. The theoretical margin shown above intentionally omits fees so we can layer them in contextually.
- Data freshness. Order book snapshots update with the same cadence as market data. If the Polymarket book is empty or stale we render an explicit warning instead of a misleading depth estimate.
- Missing depth on peer venues. Kalshi depth is not yet streamed. Once available, we will mirror the same averaging routine to size the opposing leg and compare liquidity symmetrically.
Executing in production always pairs these calculations with venue risk checks, portfolio limits, and live settlement monitoring.
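As an example of layering fees back onto the theoretical margin, a minimal sketch (the function and fee figures are hypothetical, not MarketSync's actual fee schedule):

```python
def net_margin(gross_margin_pct: float, fees_pct_by_venue: dict) -> float:
    """Deduct per-venue fees, expressed as % of notional, from the
    theoretical margin shown in the dashboard."""
    return gross_margin_pct - sum(fees_pct_by_venue.values())

# A 4% theoretical edge shrinks to 2.5% after hypothetical venue fees.
result = net_margin(4.0, {"polymarket": 1.0, "kalshi": 0.5})  # 2.5
```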
5. Model Guardrails & Utilities
Odds ingestion flows through several utilities before arbitrage math fires. These guardrails keep inputs sane and ensure that cross-venue comparisons reference the same underlying event.
- Cross-source validation. validate_odds_with_other_source compares each venue's quote against its counterpart. If the sums land near 100% but appear inverted, we automatically flip the legs and stamp the event as inverted, preventing false positives.
- Identity reconciliation. Helpers like match_outcome_name and ticker extractors normalise team and contract names. This guarantees that YES/NO legs map to the correct outcome objects even when data providers rename events mid-stream.
- Dual-view parity. We recompute both probability and decimal-odds views of every market. Any drift between the representations raises an alert before payloads reach downstream services.
These utilities run on ingestion and again when alerts trigger, giving us two chances to catch inconsistent data before surfacing opportunities.
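One plausible reading of the cross-source inversion check, sketched in Python. The signature, tolerance, and return value here are our assumptions; the actual validate_odds_with_other_source may differ:

```python
def looks_inverted(away_pi, home_pi, peer_away_pi, peer_home_pi, tol=5.0):
    """Return True when our legs line up better with the peer venue's
    opposite legs than with its matching ones."""
    matching = abs(away_pi - peer_away_pi) + abs(home_pi - peer_home_pi)
    crossed = abs(away_pi - peer_home_pi) + abs(home_pi - peer_away_pi)
    return crossed < matching and crossed <= tol

# Our 60/40 pair against a peer quoting 41/59 looks inverted, so the
# legs would be flipped and the event stamped as inverted.
inverted = looks_inverted(60.0, 40.0, 41.0, 59.0)  # True
```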