The Real Price of Building on Foundation Models

Photo by Gustavo Fring via Pexels

The average AI startup burning $50,000 a month on inference in 2024 is spending closer to $180,000 today, and that number is not falling fast enough to matter. The promise sold to founders was simple: rent the intelligence, ship the product, capture the margin. The reality is a cost structure that scales with usage in ways that make traditional SaaS unit economics look quaint. When OpenAI, Anthropic and Google control the underlying model, they also control the single largest variable in your cost of goods sold. That is not a technical problem. It is a structural one, and most cap tables have not priced it in.

The shift from experimentation to production is where the trap closes. During the prototype phase, token costs are invisible because volume is low and investor capital absorbs the friction. The moment a product hits scale, inference costs surface as the dominant line item, often exceeding hosting, headcount and tooling combined. OpenAI's pricing on GPT-4o has come down meaningfully since launch, but the cadence of those reductions has slowed, and the models founders actually need for production-grade reasoning remain expensive. Meanwhile, Anthropic's Claude 3.5 Sonnet and Google's Gemini 1.5 Pro have introduced genuine capability competition, which sounds like good news for buyers until you realise that switching costs between foundation models are far higher than the API pricing tables suggest.
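The way inference moves from an invisible cost to the dominant one can be made concrete with some back-of-envelope arithmetic. The sketch below is illustrative only: the per-token prices, request volumes and revenue figure are hypothetical placeholders, not any provider's actual rates, and the function names are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPrice:
    """Per-million-token prices in USD. All values used here are hypothetical."""
    input_per_million: float
    output_per_million: float


def monthly_inference_cost(requests: int, input_tokens: int,
                           output_tokens: int, price: TokenPrice) -> float:
    """Monthly inference spend: cost of one request times the number of requests."""
    per_request = (
        input_tokens * price.input_per_million
        + output_tokens * price.output_per_million
    ) / 1_000_000
    return requests * per_request


def gross_margin(revenue: float, cogs: float) -> float:
    """Gross margin as a fraction of revenue."""
    return (revenue - cogs) / revenue


if __name__ == "__main__":
    # Hypothetical frontier-model pricing and workload shape (assumptions).
    price = TokenPrice(input_per_million=5.0, output_per_million=15.0)
    for requests in (20_000, 2_000_000):  # prototype volume vs. production volume
        cost = monthly_inference_cost(requests, 3_000, 500, price)
        print(f"{requests:>9,} requests/month -> ${cost:,.0f} inference spend")
```

Under these assumed numbers a prototype serving 20,000 requests a month spends a few hundred dollars, while the same product at two million requests spends roughly a hundred times that. Cost per request does not fall with volume the way hosting costs do, which is why the line item only becomes visible at scale.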

The switching cost problem is underappreciated and underreported. Prompts engineered for GPT-4o do not port cleanly to Claude or Gemini. Evaluation frameworks, fine-tuning workflows and retrieval-augmented generation pipelines are all model-specific in ways that take weeks of engineering time to rebuild. A well-funded startup like Harvey, which has raised over $300 million and built deep integrations on top of OpenAI's stack, faces a meaningful re-platforming cost if it ever needs to move. Smaller companies with less runway face the same problem with none of the buffer. The venture community funded dozens of companies in 2023 and 2024 on the assumption that model commoditisation would compress input costs toward zero. That assumption is increasingly hard to defend in 2026.

Model dependency is not a technical risk. It is a business model risk, and most AI startups have not priced it into their cap tables.

There is a second hidden cost that almost nobody talks about in fundraising decks: model deprecation risk. OpenAI has already sunset GPT-3.5 Turbo in its original form, forcing dependent applications to migrate on a timeline set by the provider, not the customer. Google has deprecated multiple Gemini model versions inside twelve months. Every deprecation event triggers an unplanned engineering sprint, regression testing across the full product surface and, in many cases, a prompt re-engineering cycle that can consume two to four weeks of senior engineering time. For a twenty-person startup, that is not a maintenance cost. It is a strategic disruption. Founders who baked zero deprecation overhead into their roadmaps are now quietly repricing it as a recurring line item.
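One way to turn deprecation from a surprise into a scheduled line item is to keep an explicit register of which product features depend on which pinned model, and to flag any retirement date that falls inside a planning window. The sketch below is a minimal illustration of that idea. The feature names, model identifiers, retirement dates and the 90-day lead time are all hypothetical, and real dates would have to come from each provider's own deprecation notices.

```python
from datetime import date, timedelta
from typing import Optional

# Hypothetical register: feature -> (pinned model id, announced retirement date).
# None means no retirement has been announced for that model.
PINNED_MODELS: dict[str, tuple[str, Optional[date]]] = {
    "summariser": ("example-model-v1", date(2026, 9, 30)),
    "classifier": ("example-model-v2", None),
    "contract-review": ("example-model-v1", date(2026, 9, 30)),
}


def features_needing_migration(
    pinned: dict[str, tuple[str, Optional[date]]],
    today: date,
    lead_time_days: int = 90,
) -> list[str]:
    """Return features whose pinned model retires within the lead-time window."""
    cutoff = today + timedelta(days=lead_time_days)
    return sorted(
        feature
        for feature, (_model, retires_on) in pinned.items()
        if retires_on is not None and retires_on <= cutoff
    )


if __name__ == "__main__":
    print(features_needing_migration(PINNED_MODELS, today=date(2026, 7, 15)))
```

Running a check like this in a scheduled job gives the roadmap a fixed amount of warning before each migration, rather than leaving the timeline entirely to the provider's announcement.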

The investors who spotted this earliest are now pushing portfolio companies toward one of three defensive strategies. The first is model abstraction: building an internal routing layer that can shift workloads between providers based on cost, latency and capability, a pattern that companies like LangChain and LlamaIndex have tried to productise at the infrastructure level. The second is fine-tuning smaller open-source models, using Mistral, Meta's Llama 3 family or Cohere's Command R series to handle high-volume, low-complexity tasks at a fraction of the per-token cost and reserving frontier models for tasks that genuinely require them. The third, and most capital-intensive, is building proprietary model capability in-house, a path that makes sense for companies with large, unique datasets and the engineering depth to exploit them, but is a distraction for everyone else. None of these strategies is free, and all of them require a deliberate architectural decision that most early-stage teams are not making early enough.
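The first strategy, a routing layer that picks a model by cost, latency and capability, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not how LangChain, LlamaIndex or any other product implements routing: the model names, prices, latency figures and the three-level capability tier are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelOption:
    """One routable model. All field values used below are hypothetical."""
    name: str
    cost_per_million_tokens: float  # blended USD price, assumed
    p95_latency_ms: int             # measured latency, assumed
    capability_tier: int            # 1 = small model, 3 = frontier model


def route(options: Sequence[ModelOption], min_tier: int,
          max_latency_ms: int) -> Optional[ModelOption]:
    """Pick the cheapest model that meets the capability and latency requirements."""
    eligible = [
        o for o in options
        if o.capability_tier >= min_tier and o.p95_latency_ms <= max_latency_ms
    ]
    if not eligible:
        return None  # caller decides whether to relax constraints or fail
    return min(eligible, key=lambda o: o.cost_per_million_tokens)


# Hypothetical catalogue spanning a self-hosted small model and two hosted APIs.
CATALOGUE = [
    ModelOption("small-open-model", 0.30, 400, 1),
    ModelOption("mid-hosted-model", 3.00, 900, 2),
    ModelOption("frontier-model", 15.00, 2500, 3),
]

if __name__ == "__main__":
    # A high-volume classification task tolerates the smallest model.
    print(route(CATALOGUE, min_tier=1, max_latency_ms=1000))
    # A complex reasoning task requires the frontier tier.
    print(route(CATALOGUE, min_tier=3, max_latency_ms=5000))
```

The selection logic is the easy part. The routing layer only pays off if prompts and evaluations are maintained for every model in the catalogue, which is where the switching costs described earlier reappear.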

The next twelve months will force a reckoning that the bull market in AI applications has so far allowed founders to defer. With interest rates on venture debt still elevated and pressure mounting to show a path to profitability, investors will start demanding gross margin visibility that most foundation-model-dependent businesses cannot yet provide. The companies that survive and scale will be the ones that treated model dependency as a first-order architecture decision from day one, not an optimisation problem to be solved at Series B. The hidden costs were never really hidden. They were just easy to ignore when capital was cheap and conviction was high.
