
MCP Workflows

These workflows show how the MCP tools chain together for common growth engineering tasks. Each workflow is a conversation between you and your AI agent.

Interactive Experiment Creation

The most common workflow: you have an idea, and you want to test it properly.

Plan the experiment

Tell your agent what you want to improve. It calls plan_experiment, which returns related beliefs, past experiments, suggested metrics, and traffic allocation options.

"I want to improve signup conversion on the pricing page"

The agent presents options as a numbered list. Pick your belief, confidence level, and traffic split.
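As a sketch of what flows through this step (the field names below are illustrative assumptions, not the real plan_experiment schema), the response can be modeled as a typed object plus a check that a traffic split actually covers all traffic:

```typescript
// Hypothetical shape of a plan_experiment response.
// Field names are assumptions, not the actual Apex schema.
interface PlanExperimentResponse {
  relatedBeliefs: string[];
  pastExperiments: string[];
  suggestedMetrics: string[];
  trafficOptions: number[][]; // e.g. [[50, 50], [90, 10]] as control/variant percentages
}

// A traffic split is only usable if it has at least two arms
// and its percentages sum to exactly 100.
function isValidSplit(split: number[]): boolean {
  return split.length >= 2 && split.reduce((a, b) => a + b, 0) === 100;
}
```

For example, `[50, 50]` and `[90, 10]` are valid splits, while `[60, 50]` over-allocates traffic and would be rejected.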

Record the reasoning

The agent calls start_reasoning with your chosen belief and confidence. This creates a belief and hypothesis in the Apex graph, establishing why you're running this test.

Log a prediction

Before seeing any results, the agent calls log_prediction so you commit to an expected outcome. This feeds the calibration loop — over time, Apex tracks how accurately your team forecasts results.
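One standard way to score forecast accuracy of the kind the calibration loop needs is a Brier score: the mean squared error between the probability you assigned and what actually happened. This is a generic sketch of the idea, not Apex's internal scoring:

```typescript
// Generic Brier-score sketch of calibration tracking; not Apex's actual scoring.
interface Prediction {
  confidence: number;  // forecast probability that the variant wins (0..1)
  variantWon: boolean; // recorded outcome once results are in
}

// Mean squared error between forecasts and outcomes.
// 0 is perfect; always guessing 50% scores 0.25.
function brierScore(predictions: Prediction[]): number {
  const total = predictions.reduce((sum, p) => {
    const outcome = p.variantWon ? 1 : 0;
    return sum + (p.confidence - outcome) ** 2;
  }, 0);
  return total / predictions.length;
}
```

A team whose 80%-confidence predictions win about 80% of the time will trend toward a low score; systematic overconfidence pushes it up.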

Create the experiment

The agent calls create_experiment with preview: true first. You review the summary — name, control vs. variant, traffic split, linked belief. Confirm to create.

Implement the variant

For SDK-mode experiments, the agent implements the code changes using useApexVariant. For snippet-mode, no code changes are needed — the variant is applied at runtime via the DOM.
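In SDK mode the code change usually reduces to branching on the variant name. A hypothetical sketch, with `useApexVariant` stubbed as a plain function so the example is self-contained (the real hook comes from the Apex SDK and would run inside a React component; the experiment id and copy here are made up):

```typescript
type Variant = "control" | "variant";

// Stand-in for the SDK hook so this compiles on its own.
// The real useApexVariant comes from the Apex SDK and resolves
// which arm the current visitor is in.
function useApexVariant(experimentId: string): Variant {
  return "variant"; // hard-coded for illustration only
}

// Hypothetical call site: pick CTA copy based on the assigned arm.
function pricingCtaLabel(): string {
  const arm = useApexVariant("pricing-cta-copy"); // hypothetical experiment id
  return arm === "variant" ? "Start your free trial" : "Sign up";
}
```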

Deploy and activate

After pushing the code, the agent calls track_deployment with the commit SHA, then check_deployment to verify it's live. Once confirmed, activate_experiment starts splitting traffic.
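Traffic splitting is typically done by deterministically hashing the visitor ID, so the same visitor always lands in the same arm across page loads. A generic sketch of that technique, not Apex's actual bucketing algorithm:

```typescript
// Generic deterministic bucketing, not Apex's real algorithm:
// hash the visitor ID into [0, 100) and compare against the variant's share.
function bucket(visitorId: string): number {
  let hash = 0;
  for (const ch of visitorId) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // simple 32-bit rolling hash
  }
  return hash % 100;
}

function assignArm(visitorId: string, variantPercent: number): "control" | "variant" {
  return bucket(visitorId) < variantPercent ? "variant" : "control";
}
```

Because the assignment depends only on the visitor ID and the split, activating the experiment needs no per-visitor state.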

Decision Guardrails

Use this workflow before committing to any significant product change — even if you're not planning an experiment.

Evaluate the feature

Describe what you're considering building. The agent calls evaluate_feature, which checks the belief graph for related assumptions, scans past experiment results, and identifies confidence gaps.

"Should we add a chatbot to the homepage?"

Predict the impact

If you decide to proceed, the agent calls predict_impact to search for historical experiments with similar changes. It returns average lift, sample sizes, and a recommendation.
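The aggregation behind this step can be pictured as averaging relative lift across similar past experiments. A sketch of the idea only, with assumed field names, not the tool's implementation:

```typescript
// Sketch of the aggregation idea behind predict_impact.
// Field names are assumptions, not the real schema.
interface HistoricalResult {
  controlRate: number; // control conversion rate, e.g. 0.10
  variantRate: number; // variant conversion rate, e.g. 0.12
  sampleSize: number;
}

// Relative lift as a percentage: +20 means the variant converted 20% better.
function relativeLift(r: HistoricalResult): number {
  return ((r.variantRate - r.controlRate) / r.controlRate) * 100;
}

function averageLift(results: HistoricalResult[]): number {
  const total = results.reduce((sum, r) => sum + relativeLift(r), 0);
  return total / results.length;
}
```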

Decide

Based on the evidence, you can:

  • Build with confidence — historical data supports the change
  • Run an experiment first — signals are mixed or data is missing
  • Record a belief and revisit later — you're not ready to commit engineering time

Tip

Even if you skip the experiment, recording a belief with create_belief ensures the assumption is tracked. When you revisit it later, you'll have the context for why you deferred.

Pre-Build Evaluation

Before starting a sprint or picking up a feature card, ask the agent to evaluate it:

"Evaluate this feature: adding social proof badges to the pricing page"

The agent calls evaluate_feature and returns:

  • Related beliefs — what your team already assumes about this area
  • Historical evidence — results from past experiments with similar changes
  • Confidence gaps — untested or low-confidence assumptions that could derail the feature
  • Recommendation — build, experiment first, or gather more data

This prevents the common failure mode of building features based on untested assumptions.
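The four parts of that return value can be sketched as a typed shape. The field and enum names below mirror the bullets above but are assumptions, not the real Apex schema:

```typescript
// Hypothetical shape of an evaluate_feature response; names are assumptions.
interface EvaluateFeatureResponse {
  relatedBeliefs: string[];      // what the team already assumes about this area
  historicalEvidence: string[];  // results from similar past experiments
  confidenceGaps: string[];      // untested or low-confidence assumptions
  recommendation: "build" | "experiment_first" | "gather_more_data";
}

// Illustrative example, not real output.
const example: EvaluateFeatureResponse = {
  relatedBeliefs: ["Social proof increases trust on pricing pages"],
  historicalEvidence: ["Testimonial section test: +4% signup conversion"],
  confidenceGaps: ["No evidence badges specifically move conversion"],
  recommendation: "experiment_first",
};
```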

Reviewing Experiment Results

When an experiment has been running long enough to accumulate meaningful data, review it:

Check results

The agent calls get_results and presents visitor counts, conversion rates, confidence level, and days running.
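A confidence level for A/B results is conventionally derived from a two-proportion z-test. A generic version of that statistic, not necessarily how Apex computes the figure it reports:

```typescript
// Generic two-proportion z-score for A/B results.
// Not necessarily Apex's exact statistic, but the standard way
// a "confidence level" is derived from visitor and conversion counts.
function zScore(
  controlVisitors: number, controlConversions: number,
  variantVisitors: number, variantConversions: number,
): number {
  const p1 = controlConversions / controlVisitors;
  const p2 = variantConversions / variantVisitors;
  // Pooled rate under the null hypothesis that both arms convert equally.
  const pooled =
    (controlConversions + variantConversions) / (controlVisitors + variantVisitors);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / controlVisitors + 1 / variantVisitors));
  return (p2 - p1) / se;
}
```

As a rule of thumb, |z| above roughly 1.96 corresponds to 95% confidence in a two-sided test.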

Decide next steps

The agent offers options:

  1. Keep running — not enough data for a decision
  2. Promote the winner — clear result, ready to graduate
  3. Pause — something looks wrong, investigate
  4. Suggest next experiment — learn from this and iterate

Promote and clean up

If you promote a winner, the agent calls promote_winner and record_outcome. For SDK experiments, it returns code cleanup instructions — remove the useApexVariant conditional and keep only the winning variant.
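The cleanup itself is mechanical. A hypothetical before/after sketch (experiment id and copy are made up, and the "before" state is shown in the comment):

```typescript
// Hypothetical cleanup after promote_winner. Before, the call site branched:
//
//   const arm = useApexVariant("pricing-cta-copy");
//   return arm === "variant" ? "Start your free trial" : "Sign up";
//
// After, the conditional is removed and only the winning copy remains.
function pricingCtaLabel(): string {
  return "Start your free trial"; // winning variant, now the only path
}
```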

Getting Smart Suggestions

When you're not sure what to test next, ask:

"What should I test next on the onboarding flow?"

The agent calls suggest_experiment with your context. It analyzes untested beliefs, low-confidence assumptions, completed experiments with follow-up potential, and returns prioritized suggestions.
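One plausible prioritization heuristic of that kind: untested beliefs first, then lowest confidence first. A sketch of the idea only, not the tool's real ranking logic:

```typescript
// Sketch of one plausible heuristic behind suggest_experiment;
// not the tool's actual ranking logic. Field names are assumptions.
interface Belief {
  statement: string;
  confidence: number; // 0..1
  tested: boolean;
}

// Untested beliefs rank highest; ties break toward lower confidence,
// since those assumptions carry the most risk if wrong.
function prioritize(beliefs: Belief[]): Belief[] {
  return [...beliefs].sort((a, b) => {
    if (a.tested !== b.tested) return a.tested ? 1 : -1;
    return a.confidence - b.confidence;
  });
}
```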

Info

Suggestions improve over time. The more beliefs you record and experiments you run, the smarter the recommendations become.

Next Steps