Phase 3 Complete: The Recommendation Engine Is Live
The core value is here. ModelOptix now delivers personalized AI model recommendations that save you money.
TL;DR
- ✅ Recommendation Engine live - FitScore algorithm finds cheaper alternatives
- ✅ Opportunities dashboard - See all savings opportunities ranked by ROI
- ✅ Sanity Check - Test models side-by-side before switching
- ✅ Guest access - Try it free before signup (no account needed)
- ✅ Trust Dashboard - Provider transparency scores
- ✅ One-click switches - Accept opportunity, track savings
- 🎯 The aha moment delivered - Time to first savings: < 10 minutes
- 🚀 Try it now - 3 free model comparisons, no signup
What I Shipped: The Value Delivery System
Phase 3 was the big one. This is where ModelOptix stops being a catalog and starts saving you money.
What is ModelOptix? An independent AI model advisor that finds cheaper alternatives to your current models, with zero bias. No affiliate commissions. No vendor partnerships. Just you vs. the market.
The Recommendation Engine (FitScore)
Every night at 5am UTC, ModelOptix scans your AI portfolio and compares your current models to 400+ alternatives using our FitScore algorithm.
The FitScore calculates:
- Cost efficiency (35% weight) - Price per task vs. your budget
- Speed match (25% weight) - Response time vs. your latency needs
- Quality fit (20% weight) - Benchmark scores vs. your requirements
- Trust score (15% weight) - Provider reliability + editorial warnings
- Context compatibility (5% weight) - Token window vs. your prompts
When an alternative scores >10% better than your current model → Opportunity created.
Example opportunity:
- Use Case: "Sentiment Analysis for Trading Signals"
- Current Model: GPT-4 Turbo ($10/1M input, $30/1M output)
- Recommended: Llama 3.1 70B ($0.35/1M input, $0.40/1M output)
- FitScore Improvement: +42%
- Estimated Monthly Savings: $280 (89% cost reduction)
- Quality Score: Equivalent (93.2 vs 93.5)
- Speed: 15% faster average latency
- Recommendation: Strong switch candidate
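For intuition, a monthly-savings figure like the one above falls out of per-million-token pricing and a use case's traffic. A minimal sketch, assuming a purely hypothetical token volume (the real estimate uses your actual usage, so the numbers below won't match the $280 figure exactly):

```typescript
// Hedged sketch: estimate monthly model cost from per-1M-token pricing.
// The volume below is a hypothetical placeholder, not real usage data.
interface Pricing {
  inputPerMillion: number;   // USD per 1M input tokens
  outputPerMillion: number;  // USD per 1M output tokens
}

interface MonthlyVolume {
  inputTokens: number;
  outputTokens: number;
}

function estimateMonthlyCost(pricing: Pricing, volume: MonthlyVolume): number {
  return (
    (volume.inputTokens / 1_000_000) * pricing.inputPerMillion +
    (volume.outputTokens / 1_000_000) * pricing.outputPerMillion
  );
}

// Hypothetical traffic: 20M input + 5M output tokens per month
const volume: MonthlyVolume = { inputTokens: 20_000_000, outputTokens: 5_000_000 };

const gpt4Turbo = estimateMonthlyCost({ inputPerMillion: 10, outputPerMillion: 30 }, volume);    // $350
const llama70b = estimateMonthlyCost({ inputPerMillion: 0.35, outputPerMillion: 0.4 }, volume); // $9
console.log(`Estimated monthly savings: $${(gpt4Turbo - llama70b).toFixed(0)}`); // ~$341
```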
Why this matters: Most teams pick a model in March and never re-evaluate. Six months later, they're overpaying because cheaper models have gotten better. ModelOptix watches the market for you 24/7.
The Opportunities Dashboard
Your mission control for savings.
What you see:
- All opportunities ranked by estimated monthly savings
- Filter by: Status (active/accepted/dismissed), use case, opportunity type
- Sort by: Improvement %, monthly savings, date created
- Each card shows:
  - Model comparison (current → recommended)
  - FitScore improvement percentage
  - Estimated monthly savings in dollars
  - Recommendation reason (cost, speed, quality, trust)
  - One-click "Accept" or "Dismiss" actions
The workflow:
- Log in → See opportunities waiting for you
- Click opportunity → View detailed comparison
- Run Sanity Check → Test with your real prompts
- Accept → ModelOptix switches your model, tracks savings
- Repeat → Keep finding savings as new models launch
Time to first opportunity: < 10 minutes from signup. Add your product, define one use case, wait for the nightly job. Wake up to savings.
Sanity Check: Test Before You Switch
This is the killer feature. Never blindly switch models again.
How it works:
- Click "Run Sanity Check" from any opportunity
- Enter 1-3 sample prompts representative of your use case
- ModelOptix runs both models via OpenRouter API
- See side-by-side results (rough data shape sketched after this list) with:
  - Full model responses
  - Response time comparison
  - Token usage
  - Cost per request
  - Quality assessment
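Roughly speaking, each comparison boils down to the same handful of fields per model. A minimal sketch of that shape; the field names are illustrative assumptions, not the actual schema:

```typescript
// Illustrative shape for one side of a Sanity Check comparison.
// Field names are assumptions, not the actual schema.
interface ModelRunResult {
  modelId: string;     // provider model identifier
  response: string;    // full model response
  latencyMs: number;   // wall-clock response time
  totalTokens: number; // prompt + completion tokens
  costUsd: number;     // cost for this single request
}

interface SanityCheckResult {
  prompt: string;
  current: ModelRunResult;     // your current model
  recommended: ModelRunResult; // the recommended alternative
  qualityNote?: string;        // free-text quality assessment
}
```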
You decide:
- ✅ Recommended model is better → Accept opportunity, save money
- ✅ Current model is better → Dismiss opportunity, keep current
- ✅ It's a tie → Your call based on cost vs. minor quality differences
- ✅ Neither is good → Flag for investigation
Example Sanity Check result:
| Metric | GPT-4 Turbo (Current) | Claude 3.5 Sonnet (Recommended) |
|---|---|---|
| Response Time | 2.4s | 1.8s ⚡ 25% faster |
| Tokens Used | 450 | 420 |
| Cost | $0.0135 | $0.00126 💰 91% cheaper |
| Quality | Excellent summary | Excellent summary, slightly more concise |
Your evaluation: "Recommended model is 25% faster, 91% cheaper, and equally good. Easy accept."
No risk switching. Test first, decide with data.
Guest Sanity Check (No Account Needed)
Want to try ModelOptix before signing up? We got you.
Visit modeloptix.com/try:
- Select any two models from our catalog
- Enter your test prompt
- See side-by-side results
- Submit your preference
Free tier: 3 comparisons before signup. Enough to see the value.
Why we offer this: Traditional SaaS hides the product behind signups. We want you to experience the "aha moment" immediately. Try it, see the difference, then decide if you want ongoing monitoring.
Example guest use case:
- Testing GPT-4o vs Claude 3.5 Sonnet for customer support responses
- Comparing Llama 3.3 70B vs Mistral Large for code generation
- Evaluating Gemini 1.5 Flash vs GPT-4o mini for classification
No credit card. No email. Just compare.
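For the curious, a guest comparison is just an API call carrying the guest token. A hedged sketch; the endpoint path and request body are assumptions, and only the `x-guest-token` header mirrors the session-handling code shown in the deep dive below:

```typescript
// Hedged sketch of a guest comparison call. The endpoint path and body
// shape are assumptions; only the x-guest-token header comes from the
// session-handling code described later.
async function runGuestComparison(guestToken: string) {
  const res = await fetch('https://modeloptix.com/api/guest/sanity-check', { // hypothetical path
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-guest-token': guestToken, // unhashed token kept in localStorage
    },
    body: JSON.stringify({
      modelA: 'openai/gpt-4o',
      modelB: 'anthropic/claude-3.5-sonnet',
      prompt: 'Summarize this support ticket in two sentences: ...',
    }),
  });
  if (!res.ok) throw new Error(`Comparison failed: ${res.status}`);
  return res.json(); // side-by-side responses, latency, and cost
}
```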
Trust Dashboard
We show our work. Every provider gets a trust score based on 6 dimensions:
Trust Tiers:
- Tier A (Score 1.0) - OpenAI, Anthropic, Google - Top trust
- Tier B (Score 0.7) - Mistral, Cohere - Good trust, minor gaps
- Tier C (Score 0.4) - Smaller providers - Use with caution
- Unknown (Score 0.2) - Insufficient data
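In FitScore terms (see the technical deep dive below), these tiers map directly onto the numeric trust factor. A minimal sketch using the scores listed above:

```typescript
// Tier → numeric trust factor, using the scores listed above.
type TrustTier = 'A' | 'B' | 'C' | 'unknown';

const TRUST_SCORES: Record<TrustTier, number> = {
  A: 1.0,       // OpenAI, Anthropic, Google
  B: 0.7,       // Mistral, Cohere
  C: 0.4,       // smaller providers
  unknown: 0.2, // insufficient data
};

const normalizeTrust = (tier: TrustTier): number => TRUST_SCORES[tier];
```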
6 Trust Dimensions:
- Data Handling - Privacy policy, data retention, GDPR compliance
- Transparency - Model cards, benchmark disclosure, pricing clarity
- Security - SOC 2, ISO 27001, penetration testing
- Compliance - Industry certifications, audit trails
- Reliability - Uptime SLA, incident response, status page
- Ethics - Bias testing, content policy, responsible AI practices
Why this matters: Cheapest isn't always best. If you're processing PII or financial data, Tier A providers might be worth the premium. If you're generating blog summaries, Tier C is fine.
Editorial overrides: When we discover issues (e.g., provider security incident), we flag or downrank models until resolved. You see warnings before accepting opportunities.
One-Click Model Switches
Found a good opportunity? Accepting is effortless.
The "Accept" flow:
- Click "Accept Opportunity"
- ModelOptix updates your use case to the recommended model
- Tracks estimated monthly savings to your profile
- Expires other opportunities for that use case (no duplicates)
- Shows success toast: "Switched to Claude 3.5 Sonnet. Estimated savings: $156/month"
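Under the hood, accepting an opportunity is a short transaction along the lines of the sketch below; the helper functions are hypothetical stand-ins for the real data layer, declared as stubs so the sketch type-checks:

```typescript
// Sketch of the accept flow. updateUseCase, recordSavings, and
// expireOpportunities are hypothetical stand-ins for the real data layer.
declare function updateUseCase(useCaseId: string, patch: { currentModelId: string }): Promise<void>;
declare function recordSavings(useCaseId: string, monthlySavings: number): Promise<void>;
declare function expireOpportunities(useCaseId: string, opts: { except: string }): Promise<void>;

interface Opportunity {
  id: string;
  useCaseId: string;
  recommendedModelId: string;
  recommendedModelName: string;
  estimatedMonthlySavings: number;
}

async function acceptOpportunity(opp: Opportunity): Promise<string> {
  // 1. Point the use case at the recommended model
  await updateUseCase(opp.useCaseId, { currentModelId: opp.recommendedModelId });
  // 2. Track the estimated savings against the user's profile
  await recordSavings(opp.useCaseId, opp.estimatedMonthlySavings);
  // 3. Expire other open opportunities for this use case (no duplicates)
  await expireOpportunities(opp.useCaseId, { except: opp.id });
  // 4. Success toast text
  return `Switched to ${opp.recommendedModelName}. Estimated savings: $${opp.estimatedMonthlySavings}/month`;
}
```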
Savings tracking:
- Per use case: "This switch saves $156/month"
- Portfolio-wide: "Total monthly savings: $487 across 6 switches"
- Cumulative: "You've saved $2,920 since joining ModelOptix 6 months ago"
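Those three figures roll up from accepted switches. A rough sketch, assuming a simple `AcceptedSwitch` record (illustrative shape, not the real schema):

```typescript
// Sketch: rolling up savings figures for the dashboard.
// The AcceptedSwitch shape is an assumption for illustration.
interface AcceptedSwitch {
  useCaseId: string;
  monthlySavings: number; // estimated USD per month
  acceptedAt: Date;
}

function portfolioMonthlySavings(switches: AcceptedSwitch[]): number {
  return switches.reduce((sum, s) => sum + s.monthlySavings, 0);
}

function cumulativeSavings(switches: AcceptedSwitch[], asOf: Date): number {
  // Approximate: each switch contributes its monthly savings for the
  // number of full months since it was accepted.
  return switches.reduce((sum, s) => {
    const months = Math.max(
      0,
      (asOf.getFullYear() - s.acceptedAt.getFullYear()) * 12 +
        (asOf.getMonth() - s.acceptedAt.getMonth())
    );
    return sum + s.monthlySavings * months;
  }, 0);
}
```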
Your dashboard shows:
- Products monitored
- Active opportunities
- Accepted switches
- Total savings (the big number)
ROI calculation: If ModelOptix saves you $200/month and costs $9.95/month (Solo tier), you're netting $190/month profit. That's 19x return on investment.
Real-World Example: Trading Platform Case Study
Let's walk through a concrete example.
Scenario: You run TradingAI, a platform with 5 AI use cases:
| Use Case | Current Model | Monthly Cost |
|---|---|---|
| Market sentiment analysis | GPT-4 Turbo | $340 |
| Trading signal generation | Claude 3 Opus | $420 |
| Risk assessment | GPT-4o | $185 |
| News summarization | GPT-4 Turbo | $290 |
| Customer support chat | Claude 3.5 Sonnet | $95 |
| Total | | $1,330/month |
Day 1: You add TradingAI to ModelOptix, define all 5 use cases.
Day 2 (after nightly scan): You wake up to 4 opportunities:
Opportunity 1: Market Sentiment Analysis
- Current: GPT-4 Turbo ($340/month)
- Recommended: Llama 3.1 70B ($45/month)
- Improvement: +38% FitScore
- Savings: $295/month (87% reduction)
- Reason: Classification tasks don't need GPT-4's reasoning depth
Opportunity 2: Trading Signal Generation
- Current: Claude 3 Opus ($420/month)
- Recommended: Claude 3.5 Sonnet ($165/month)
- Improvement: +29% FitScore
- Savings: $255/month (61% reduction)
- Reason: Newer Sonnet model matches Opus quality at lower price
Opportunity 3: Risk Assessment
- Current: GPT-4o ($185/month)
- Recommended: Mistral Large ($78/month)
- Improvement: +31% FitScore
- Savings: $107/month (58% reduction)
- Reason: Mistral Large matches GPT-4o for structured analysis
Opportunity 4: News Summarization
- Current: GPT-4 Turbo ($290/month)
- Recommended: GPT-4o mini ($32/month)
- Improvement: +45% FitScore
- Savings: $258/month (89% reduction)
- Reason: Summarization is a simple task, mini model excels
Customer Support Chat: No opportunity (already using optimal model)
Total Potential Savings: $915/month (69% cost reduction)
Your next steps:
- Run Sanity Check on Opportunity 1 (sentiment analysis)
- Test with 3 real trading scenarios
- Results show Llama 3.1 70B is 92% as accurate, 4x faster
- Accept → Save $295/month
- Repeat for other opportunities
The month-one math:
- Savings after all switches: $915/month
- ModelOptix cost (Solo tier): $9.95/month
- Net savings: ~$905/month
- Annual net savings: ~$10,860
That's what ModelOptix delivers.
Technical Deep Dive (For The Builders)
FitScore Algorithm Architecture
The core recommendation engine uses a weighted scoring system:
// Five factor scores, each normalized 0-1
const factors = {
  cost: normalizeCost(model.pricing, userBudget),
  speed: normalizeSpeed(model.latency, userRequirement),
  quality: normalizeBenchmark(model.score, userMinimum),
  trust: normalizeTrust(provider.tier), // A=1.0, B=0.7, C=0.4
  context: normalizeContext(model.maxTokens, userNeeds)
};

// User priorities → weights (must sum to 100%)
const weights = calculateWeights(userPriorities);
// Examples:
// "cost" priority → {cost: 55%, speed: 22%, quality: 12%, trust: 7%, context: 4%}
// "quality" priority → {quality: 55%, cost: 22%, speed: 12%, trust: 7%, context: 4%}

// Weighted sum (let, not const: editorial overrides may adjust it below)
let fitScore =
  (factors.cost * weights.cost) +
  (factors.speed * weights.speed) +
  (factors.quality * weights.quality) +
  (factors.trust * weights.trust) +
  (factors.context * weights.context);

// Editorial overrides
if (editorialExclude) fitScore = 0; // Never recommend
if (editorialDownrank) fitScore *= 0.5; // 50% penalty
if (editorialFlag) showWarning(); // UI flag, no score penalty
Normalization ranges are cached per factor (5-min TTL) to avoid recalculating min/max across 400+ models on every comparison.
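For illustration, cost normalization might look like the min-max scaling below, with the catalog-wide range cached behind a 5-minute TTL; the exact normalization formula is an assumption, only the caching behavior is described above:

```typescript
// Illustrative min-max normalization with a cached catalog-wide price range.
// Simplified: the real normalizeCost also factors in the user's budget.
interface CachedRange { min: number; max: number; fetchedAt: number; }

const TTL_MS = 5 * 60 * 1000; // 5-minute TTL, as noted above
let cachedCostRange: CachedRange | null = null;

// Hypothetical catalog query returning the cheapest/most expensive model prices
declare function loadCostRangeFromCatalog(): Promise<{ min: number; max: number }>;

async function getCostRange(): Promise<CachedRange> {
  const now = Date.now();
  if (!cachedCostRange || now - cachedCostRange.fetchedAt > TTL_MS) {
    const { min, max } = await loadCostRangeFromCatalog(); // one scan of 400+ models
    cachedCostRange = { min, max, fetchedAt: now };
  }
  return cachedCostRange;
}

// Lower cost → higher score, clamped into [0, 1]
async function normalizeCostSimple(pricePerMillion: number): Promise<number> {
  const { min, max } = await getCostRange();
  if (max === min) return 1;
  const clamped = Math.min(Math.max(pricePerMillion, min), max);
  return 1 - (clamped - min) / (max - min);
}
```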
Why weighted scores? Different use cases have different priorities. Real-time chat needs speed > cost. Batch processing needs cost > speed. Quality-critical tasks need quality > everything. Users can set priorities per use case.
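The priority-to-weights mapping can be as simple as a lookup over presets like the two shown in the code comments above (presets for any other priorities would be assumptions):

```typescript
// Priority → weights lookup, using the two presets from the comments above.
// Presets for other priorities (e.g. "speed") would be assumptions.
type Factor = 'cost' | 'speed' | 'quality' | 'trust' | 'context';
type Weights = Record<Factor, number>;

const WEIGHT_PRESETS: Record<'cost' | 'quality', Weights> = {
  cost:    { cost: 0.55, speed: 0.22, quality: 0.12, trust: 0.07, context: 0.04 },
  quality: { cost: 0.22, speed: 0.12, quality: 0.55, trust: 0.07, context: 0.04 },
};

function calculateWeights(priority: keyof typeof WEIGHT_PRESETS): Weights {
  return WEIGHT_PRESETS[priority]; // each preset sums to 100%
}
```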
Opportunity Generation Job
Runs daily at 5am UTC (after model/pricing sync at 2-4am):
// Pseudocode
for (const useCase of activeUseCases) {
  const currentModel = useCase.current_model;
  const currentScore = calculateFitScore(currentModel, useCase);
  const alternatives = getCompatibleModels(useCase.requirements);

  for (const altModel of alternatives) {
    const altScore = calculateFitScore(altModel, useCase);
    const improvement = (altScore - currentScore) / currentScore;

    if (improvement > 0.10) { // 10% threshold
      createOpportunity({
        use_case_id: useCase.id,
        current_model_id: currentModel.id,
        recommended_model_id: altModel.id,
        improvement_percentage: improvement,
        estimated_monthly_savings: calculateSavings(useCase.volume),
        evidence: { currentScore, altScore, factors, weights }
      });
    }
  }
}
Prevents duplicates: Expires old opportunities for a use case when new ones are created.
Performance: Processes 1,000 use cases in ~45 seconds (PostgreSQL + pgvector).
Sanity Check OpenRouter Integration
Side-by-side testing via OpenRouter's unified API:
// Run both models in parallel for fairness
const [currentResult, recommendedResult] = await Promise.all([
  openrouter.chatCompletion({
    model: currentModel.provider_model_id,
    messages: [{ role: 'user', content: userPrompt }],
    max_tokens: userSettings.maxTokens || 500,
    temperature: userSettings.temperature || 0.7
  }),
  openrouter.chatCompletion({
    model: recommendedModel.provider_model_id,
    messages: [{ role: 'user', content: userPrompt }],
    max_tokens: userSettings.maxTokens || 500,
    temperature: userSettings.temperature || 0.7
  })
]);

// Store results with latency tracking
await saveSanityCheck({
  current_response: currentResult.content,
  current_latency_ms: currentResult.latency,
  current_tokens: currentResult.usage.total_tokens,
  current_cost: calculateCost(currentResult, currentModel.pricing),
  recommended_response: recommendedResult.content,
  recommended_latency_ms: recommendedResult.latency,
  recommended_tokens: recommendedResult.usage.total_tokens,
  recommended_cost: calculateCost(recommendedResult, recommendedModel.pricing)
});
Cost tracking: OpenRouter returns exact token counts. We calculate costs using our pricing database (synced weekly).
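A cost calculation in that spirit is a one-liner over token counts and per-million prices. A minimal sketch, assuming pricing is stored as USD per 1M input/output tokens (the real `calculateCost` signature may differ):

```typescript
// Sketch: per-request cost from token usage and per-1M-token pricing.
// Assumes pricing is stored as USD per 1M input/output tokens; the real
// calculateCost may take different arguments.
interface ModelPricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

interface TokenUsage {
  prompt_tokens: number;
  completion_tokens: number;
}

function requestCost(usage: TokenUsage, pricing: ModelPricing): number {
  return (
    (usage.prompt_tokens / 1_000_000) * pricing.inputPerMillion +
    (usage.completion_tokens / 1_000_000) * pricing.outputPerMillion
  );
}

// Example: 450 prompt + 420 completion tokens on a hypothetical $3 / $15 per-1M model
// → (450 / 1e6) * 3 + (420 / 1e6) * 15 ≈ $0.0077
```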
Guest session handling:
// Guest token: SHA-256 hash stored in DB
const guestToken = generateGuestToken(); // UUID
const hashedToken = sha256(guestToken);
// Store unhashed in localStorage (client)
localStorage.setItem('guest_session_id', guestToken);
// Store hashed in database (server)
await createGuestSession({ guest_session_id: hashedToken });
// Validate on API requests
const submittedHash = sha256(request.headers['x-guest-token']);
const session = await getGuestSession(submittedHash);
if (!session || session.expires_at < new Date()) throw new Error('Invalid session');
7-day TTL, auto-cleanup via cron job.
Trust Score System
Provider trust tiers are manually assigned based on due diligence:
-- providers table
-- tier: 'A' | 'B' | 'C' | 'unknown'
-- Example assignments
UPDATE providers SET tier = 'A' WHERE name IN ('OpenAI', 'Anthropic', 'Google');
UPDATE providers SET tier = 'B' WHERE name IN ('Mistral AI', 'Cohere');
UPDATE providers SET tier = 'C' WHERE name IN ('Together AI', 'Fireworks AI');
Trust score in FitScore:
- Tier A: 1.0 (no penalty)
- Tier B: 0.7 (30% trust penalty)
- Tier C: 0.4 (60% trust penalty)
- Unknown: 0.2 (80% trust penalty)
With a trust weight of 7% (as in the priority presets above), a Tier C model needs to score roughly 4-5 percentage points higher on the other factors to overcome the trust penalty.
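The arithmetic behind that claim, as a quick worked example:

```typescript
// Worked example: weighted trust gap between Tier A (1.0) and Tier C (0.4)
const trustWeight = 0.07;                  // trust weight in the presets above
const penalty = trustWeight * (1.0 - 0.4); // ≈ 0.042
// A Tier C model must make up ~4.2 points (on a 0-100 FitScore scale) on
// cost, speed, quality, or context to break even with a Tier A provider.
```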
Editorial overrides:
CREATE TABLE editorial_overrides (
  model_id UUID REFERENCES models(id),
  override_type TEXT CHECK (override_type IN ('exclude', 'downrank', 'flag')),
  reason TEXT,
  created_by UUID -- admin user
);
- Exclude: Never recommend (e.g., model deprecated, security issue)
- Downrank: 50% FitScore penalty (e.g., known quality regression)
- Flag: UI warning, no score penalty (e.g., beta model, use with caution)
Why Phase 3 Matters: The Independent Advisor You Need
The Problem We're Solving
You're building a product. You added AI features. You picked models based on what was best 6 months ago.
What changed since then:
- Llama 3.3 70B launched (matches GPT-4 Turbo for $0.20/1M vs $10/1M)
- Claude 3.5 Sonnet got a price drop (now cheaper than Claude 3 Opus)
- Gemini 1.5 Flash improved benchmarks (now viable for complex tasks)
- GPT-4o mini was released (1/50th the cost of GPT-4 for simple tasks)
You didn't notice because:
- You're busy shipping features, not monitoring model releases
- 400+ models across 15+ providers - impossible to track manually
- Benchmarking alternatives takes days
- Fear of switching (what if quality drops?)
Result: You're overpaying. Not because you made a bad choice. Because the market moved and you didn't.
The ModelOptix Solution
We monitor the market for you:
- Daily catalog sync (new models, price drops)
- Weekly benchmark updates (quality improvements)
- Nightly portfolio scans (find opportunities)
- Proactive alerts (better model launched for your use case)
We eliminate risk:
- Sanity Check lets you test before switching
- FitScore algorithm considers quality, not just cost
- Trust scores flag risky providers
- Editorial overrides warn about issues
We track your savings:
- Per-switch savings estimates
- Portfolio-wide monthly savings
- Cumulative savings since joining
- ROI dashboard (savings vs subscription cost)
We're independent:
- No affiliate commissions from providers
- No vendor partnerships or sponsorships
- Revenue from subscriptions only
- Our incentive = your savings
You're not our product. Your savings are our product.
What's Next: Phase 4 (Monetization)
The foundation is complete. The value is proven. Now we need to get paid.
Phase 4 deliverables (Target: 5 days):
- Stripe Integration - Full subscription flow
- Pricing Tiers - Free, Solo ($9.95/mo), Growth ($19.95/mo), Pro ($29.95/mo)
- Trial Flow - 7-day free trial with card upfront
- Customer Portal - Manage subscription, billing, invoices
- Tier Limits - Enforce product/use case limits per tier
- Webhook Handlers - Payment failures, subscription changes
- Onboarding Funnel - Landing → Trial → Tier selection → Checkout → Dashboard
The business model:
| Tier | Price | Products | Use Cases | Sanity Checks | Target User |
|---|---|---|---|---|---|
| Free | $0 | 1 | 3 | 0/month | Post-trial downgrade |
| Solo | $9.95/mo | 3 | - | 10/month | Solo founder, side project |
| Growth | $19.95/mo | 10 | - | 30/month | Multi-product solopreneur |
| Pro | $29.95/mo | 25 | - | 100/month | Power user, agency |
Value prop: If Solo tier saves you $200/month, you're paying $9.95 to save $200. That's 20x ROI every month.
Beta launch: 7 days after Phase 4 complete. Beta users get lifetime Solo tier free (150+ on waitlist).
Try ModelOptix Today
The product is live. The value is real. Try it now.
Guest Access (No Signup)
- Compare any 2 models side-by-side
- Test with your real prompts
- See speed and cost differences
- 3 free comparisons
Perfect for:
- "I'm curious how Claude 3.5 compares to GPT-4o"
- "Should I switch from GPT-4 Turbo to Llama 3.3 70B?"
- "Is Gemini 1.5 Flash good enough for my use case?"
Waitlist (Beta Access)
- Join builders waiting for beta
- Get lifetime free Solo tier ($9.95/mo value)
- Early access to all features
- Direct line to me for feedback
Beta launching: ~7 days (after Phase 4 monetization)
Try The Savings Calculator
- Enter your monthly API volume
- Select your primary use case
- Choose your current model
- See estimated savings with ModelOptix
Trader7 case study: $747/month saved (40% cost reduction) while maintaining 96.2% quality.
For Other Builders
Lessons from Phase 3:
1. The Value Moment Must Be Instant
We spent 2 weeks building infrastructure. Users don't care.
Phase 3 delivered the "aha moment" - see a savings opportunity in < 10 minutes. That's what converts.
Build the value delivery first, then the infrastructure to scale it.
2. Remove Friction Ruthlessly
Guest Sanity Check has zero friction:
- No signup
- No email
- No credit card
- Just "pick two models, enter prompt, see results"
Conversion to paid will be higher because users experience value before committing.
Let users try before they buy. It's 2026, signup walls are dead.
3. Show Your Work (Trust Through Transparency)
We expose:
- FitScore calculation methodology
- Provider trust tier assignments
- Editorial override reasons
- Savings calculations (not rounded estimates)
Transparency builds trust. Trust converts.
4. Async Jobs Are Your Friend
The opportunity generation job runs once daily. Users don't see it, but it's the workhorse of the product.
Don't make users wait. Pre-compute expensive operations, serve cached results.
5. Weighted Algorithms > Binary Rules
FitScore doesn't say "this model is better." It says "this model scores 87/100 for YOUR use case with YOUR priorities."
Personalization converts better than generic recommendations.
The Build-in-Public Numbers
Phase 3 Stats:
- Duration: 2 days (Jan 21-22)
- Tasks completed: 12/15 (2 P1 deferred, 1 P1 dropped)
- Code added: ~4,500 lines across 35 files
- API endpoints: 8 (opportunities, sanity checks, trust, public models)
- UI pages: 6 (opportunities list/detail, sanity check, trust dashboard, guest flow)
- Cron jobs: 1 (opportunity generation, daily 5am UTC)
Cumulative Progress:
- Phases complete: 4/6 (Phase 0, 0.5, 1, 2, 3)
- Tasks complete: 45/72 (63%)
- Lines of code: ~13,000
- Waitlist signups: 150+
- Beta launch: 7 days (Phase 4 complete)
Phase 4 Target:
- Start: Jan 22
- Complete: Jan 27 (5 days)
- Tasks: 11 (Stripe subscriptions, billing, trial flow)
- Goal: Paying customers by end of month
Join the Beta
Beta perks:
- ✅ Lifetime free Solo tier ($9.95/mo value)
- ✅ Early access to all features
- ✅ Shape the product with your feedback
- ✅ Direct line to the founder
How to join:
- Try guest flow → modeloptix.com/try (3 free comparisons)
- Like what you see? → Join waitlist (150+ builders ahead of you)
- Get beta invite → Email when Phase 4 launches (7 days)
- Sign up → Lifetime free Solo tier activated
- Start saving → Add your products, get recommendations
Or just try the calculator: modeloptix.com
Let's Connect
Try the product:
- Guest comparison: modeloptix.com/try
- Savings calculator: modeloptix.com
- Join waitlist: modeloptix.com
Connect with me and follow the build:
- X/Twitter: @Jamie_within - Daily updates
- LinkedIn: Jamie Watters - Weekly deep dives
- Blog: jamiewatters.work
Building something similar? Working with AI models? Have feedback?
Drop a comment or DM - I'd love to hear what you're working on.
Next post: Phase 4 progress (Stripe subscriptions + beta launch)
Previously:
- Phase 2 Complete: Model Catalog + Portfolio
- Phase 1 Complete: Foundation Infrastructure
- Introducing ModelOptix
Building in public, one phase at a time. Stop overpaying for AI.
Next milestone: Phase 4 complete (Monetization) - Beta launch - Target: Jan 27