
Predictions

Predictions are how Apex measures your team's judgment — not just what happened, but whether you anticipated it. Before running an experiment, you log what you think will happen. After results come in, Apex scores your accuracy and builds a long-term calibration profile.

Why Predictions Matter

Most teams run A/B tests, look at results, and move on. They never ask: "Did we expect that?" This is a missed learning opportunity.

Predictions force your team to commit to an expected outcome before seeing data. Over time, this reveals whether your team's intuitions are reliable — or systematically off in ways you can correct.

Tip

Predictions aren't about being right every time. They're about getting calibrated — learning to match your confidence level to actual outcome rates.

Prediction Structure

Each prediction includes:

Field            Description
metric           What you're measuring (e.g. "conversion_rate", "signup_rate")
expectedChange   The magnitude you expect (e.g. 15 for a 15% improvement)
direction        increase or decrease
confidence       How sure you are this will happen (0–1)
timeHorizon      How long you expect it to take (e.g. "2 weeks", "1000 visitors")

A concrete prediction looks like: "I'm 70% confident that adding social proof to the pricing page will increase the conversion rate by 15% within 2 weeks."
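
Expressed as data, that example might look like the sketch below. The field names mirror the table above; the exact schema is illustrative, not a published Apex type definition.

```typescript
// Illustrative shape of a prediction record, based on the fields above.
// Names and types are assumptions for this example, not Apex's published schema.
interface Prediction {
  metric: string;               // e.g. "conversion_rate"
  expectedChange: number;       // e.g. 15 for an expected 15% improvement
  direction: "increase" | "decrease";
  confidence: number;           // 0–1, how sure you are
  timeHorizon: string;          // e.g. "2 weeks" or "1000 visitors"
}

// The concrete prediction from above, expressed as data.
const socialProofPrediction: Prediction = {
  metric: "conversion_rate",
  expectedChange: 15,
  direction: "increase",
  confidence: 0.7,
  timeHorizon: "2 weeks",
};
```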

Creating Predictions

Start from an experiment

Navigate to an experiment in draft status. Click Add Prediction to attach a prediction before the experiment goes live.

Define the expected outcome

Choose the metric you're predicting, the expected direction and magnitude of change, and your confidence level.

Lock it in

Once the experiment starts running, the prediction is locked. You can't edit it after the fact — that would defeat the purpose.
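
If you drive Apex from code rather than the UI, the flow might look like the sketch below. The client interface and method names (ApexClient, getExperiment, addPrediction) are invented for illustration, not a documented SDK, and the Prediction shape is the one sketched earlier; the point is the ordering: attach the prediction while the experiment is still a draft.

```typescript
// Hypothetical sketch of attaching a prediction to a draft experiment.
// The client interface is invented for illustration; only the ordering rule
// (predict before the experiment runs) comes from the docs above.
interface ApexClient {
  getExperiment(id: string): Promise<{ status: "draft" | "running" | "complete" }>;
  addPrediction(experimentId: string, prediction: Prediction): Promise<void>;
}

async function attachPrediction(client: ApexClient, experimentId: string): Promise<void> {
  const experiment = await client.getExperiment(experimentId);

  // Predictions can only be attached while the experiment is still a draft;
  // once it starts running they are locked (see the warning below).
  if (experiment.status !== "draft") {
    throw new Error("Predictions are locked once the experiment starts running.");
  }

  await client.addPrediction(experimentId, {
    metric: "conversion_rate",
    expectedChange: 15,
    direction: "increase",
    confidence: 0.7,
    timeHorizon: "2 weeks",
  });
}
```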

Warning

You can only add predictions to experiments in draft status. Once an experiment is running, predictions are locked to prevent hindsight bias.

The Calibration Loop

After an experiment completes, Apex compares your prediction against the actual results:

  1. Direction match — Did the metric move in the direction you predicted?
  2. Magnitude match — How close was your expected change to the actual change?
  3. Confidence calibration — When you say 70% confident, are you right about 70% of the time?

These three factors combine into an accuracy score for each prediction. But the real value comes from aggregation.

Accuracy Scoring

Individual prediction accuracy is calculated as:

  • Perfect: Direction correct, magnitude within 20% of actual → score 1.0
  • Good: Direction correct, magnitude within 50% of actual → score 0.7
  • Partial: Direction correct, magnitude off by more than 50% → score 0.4
  • Wrong: Direction incorrect → score 0.0

The accuracy score is weighted by your stated confidence. If you said you were 90% confident and the prediction was wrong, that's a bigger calibration miss than being wrong at 50% confidence.
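
A sketch of that logic is shown below. The tier thresholds come from the list above; the confidence-weighted miss at the end is one plausible weighting, since the exact formula isn't spelled out here.

```typescript
// Per-prediction accuracy tiers as described above. Magnitude error is
// measured relative to the actual change.
function accuracyScore(
  predictedDirection: "increase" | "decrease",
  expectedChange: number, // predicted magnitude, e.g. 15 for +15%
  actualChange: number    // observed change, signed
): number {
  const actualDirection = actualChange >= 0 ? "increase" : "decrease";
  if (predictedDirection !== actualDirection) return 0.0; // Wrong

  const magnitudeError =
    Math.abs(Math.abs(actualChange) - Math.abs(expectedChange)) /
    Math.max(Math.abs(actualChange), 1e-9);

  if (magnitudeError <= 0.2) return 1.0; // Perfect: within 20% of actual
  if (magnitudeError <= 0.5) return 0.7; // Good: within 50% of actual
  return 0.4;                            // Partial: direction right, magnitude off
}

// One plausible way to weight a miss by stated confidence: being wrong at 0.9
// confidence counts more heavily against calibration than being wrong at 0.5.
function calibrationMiss(confidence: number, score: number): number {
  return confidence * (1 - score);
}
```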

Calibration Score

Your team's calibration score is the aggregate of all prediction accuracy over time. It answers: "When this team says they're X% confident, are they actually right X% of the time?"

A perfectly calibrated team:

  • Is right 50% of the time when they say 50% confidence
  • Is right 80% of the time when they say 80% confidence
  • Is right 95% of the time when they say 95% confidence

Most teams start overconfident — saying 80% when their actual hit rate is closer to 55%. That's normal. The calibration score makes this visible so you can adjust.
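
One simple way to picture the aggregation (not necessarily Apex's exact method) is to bucket resolved predictions by stated confidence and compare each bucket's hit rate to that confidence:

```typescript
// Illustrative calibration check: group resolved predictions by stated
// confidence and compare the observed hit rate in each group. A well-calibrated
// team sees hitRate ≈ stated in every row of the report.
interface ResolvedPrediction {
  confidence: number; // stated confidence, 0–1
  correct: boolean;   // did the prediction come true?
}

function calibrationReport(predictions: ResolvedPrediction[]) {
  // Group predictions by stated confidence, rounded to the nearest 10%.
  const groups = new Map<number, { hits: number; total: number }>();
  for (const p of predictions) {
    const key = Math.round(p.confidence * 10) / 10;
    const g = groups.get(key) ?? { hits: 0, total: 0 };
    g.total += 1;
    if (p.correct) g.hits += 1;
    groups.set(key, g);
  }

  return Array.from(groups.entries())
    .map(([stated, { hits, total }]) => ({ stated, hitRate: hits / total, total }))
    .sort((a, b) => a.stated - b.stated);
}
```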

Info

Your calibration score feeds directly into the Intelligence Score. Better calibration means your organization is learning to predict outcomes more accurately — one of the strongest signals of growth maturity.

Connecting to Beliefs

Predictions are tightly linked to beliefs. When you predict an experiment outcome, you're implicitly testing a belief. If your belief says "urgency copy increases conversions" at 0.7 confidence, your prediction for an urgency copy experiment should reflect that confidence level.

When prediction accuracy is consistently high for a belief, that belief's confidence deserves to be high. When predictions keep missing, the belief needs revisiting.
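
Apex doesn't prescribe a single update rule here, but the feedback loop can be pictured as a gentle nudge of belief confidence toward observed prediction accuracy. The types, names, and learning rate in the sketch below are illustrative assumptions.

```typescript
// Illustrative only: nudge a belief's confidence toward the average accuracy
// of the predictions that tested it. `Belief`, `accuracyScores`, and the
// learning rate are invented for this example.
interface Belief {
  statement: string;  // e.g. "urgency copy increases conversions"
  confidence: number; // 0–1
}

function reviseBelief(belief: Belief, accuracyScores: number[], learningRate = 0.2): Belief {
  if (accuracyScores.length === 0) return belief;
  const avgAccuracy = accuracyScores.reduce((a, b) => a + b, 0) / accuracyScores.length;
  // Consistently accurate predictions pull confidence up; repeated misses pull it down.
  const confidence = belief.confidence + learningRate * (avgAccuracy - belief.confidence);
  return { ...belief, confidence };
}
```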

Best Practices

  • Predict before every experiment. Even a rough prediction is better than none.
  • Be honest about confidence. Saying 0.9 when you mean 0.5 undermines the entire system.
  • Review calibration monthly. Look for patterns — are you consistently overconfident about certain types of experiments?
  • Celebrate accurate predictions and well-calibrated misses. Saying "I'm only 30% confident" and being right 30% of the time is perfect calibration.