The $700 Billion Bet: Why AI Infrastructure Lock-In Is Now a Board Fiduciary Issue
AI chips last one to three years. Hyperscalers depreciate them over five to six. That gap is a subsidy — and it establishes enterprise lock-in before boards realise the window has closed.
In This Article
- 01 The $700 Billion Calculated Bet
- 02 The Three Coalitions and Circular Financing
- 03 Three Scenarios — A Is History, B Is Now, C Is the Target
- 04 Why Scenario A Is Already Over — The Performance Plateau
- 05 The Accounting Fiction — When CapEx Is Really an Expense
- 06 Unit Economics — Why Today’s Prices Are Artificially Low
- 07 Hardware Replacing Labour — The Layoff Connection
- 08 Why the Competitive Window Is Three Years
- 09 The Google Search Lesson
- 10 Scenario C — How to Build Model-Agnostic
- 11 Sovereignty, Sustainability, and Agentic AI
- 12 Board Fiduciary Duties Under Caremark
- 13 The Board Action Plan
- 14 Red Flags vs. Green Flags
- 15 Fiduciary Checklist for Directors
- 16 References & Sources
The Core Argument in Plain English
The issue: Hyperscalers are spending approximately $700 billion on AI infrastructure in 2026 — Amazon $200B, Alphabet $175–185B, Meta $115–135B, Microsoft $120B+, Oracle $50B — while AI-specific services generate only approximately $35 billion in annual revenue. The ~$665 billion gap is funded by record debt issuance ($121 billion in bonds in 2025 alone — four times the five-year average) and artificially underpriced APIs designed to establish enterprise lock-in before true costs must be recovered. Three incumbent coalitions — Microsoft-OpenAI, Amazon-Anthropic, Google-DeepMind — are the primary beneficiaries. The leverage is stacked three levels deep: hyperscalers borrow to build, AI labs spend on circular terms, and enterprises sign during the subsidised phase.
The context: The era of exponential model scaling is over. Large language models are experiencing a performance plateau driven by data exhaustion — models have largely consumed the public internet — and diminishing returns on compute. Making models larger no longer yields proportional capability gains. We are now in the world where models are broadly capable, lock-in is being established today through API dependencies, compliance architecture, and multi-year contracts, and the integration depth that determines pricing leverage for the next decade is being set right now.
What to do: Architect for model-agnostic, portable, provider-neutral infrastructure before integration depth makes migration prohibitive. This is not about building your own infrastructure. It is about retaining the ability to switch providers when inference becomes a commodity and pricing races to the bottom. The board must formalise this posture; as of March 2026, failure to do so with knowledge of these risks is a potential breach of the Caremark standard of fiduciary oversight.
1. The $700 Billion Calculated Bet
In 2026, the five largest technology companies have collectively committed approximately $700 billion to AI infrastructure capital expenditure. Amazon: $200 billion. Alphabet: $175–185 billion. Meta: $115–135 billion. Microsoft: $120 billion or more. Oracle: $50 billion. Together, this represents over 60% of the group’s total operating cash flow — these are no longer asset-light software businesses. They are capital-intensive infrastructure operators making civilisational-scale bets on a technology that has not yet demonstrated the revenue to justify the spending.
AI-specific services currently generate approximately $35 billion in annual revenue across the group. The gap — roughly $665 billion — is not funded by AI revenue. It is funded by record debt issuance. In 2025 alone, the Big Five collectively issued $121 billion in bonds: four times the five-year average. Meta issued $30 billion in bonds — the largest non-merger-and-acquisition bond issuance in corporate history. Alphabet issued century bonds: debt with 100-year maturities. Amazon’s $54 billion bond offering was four times oversubscribed, signalling institutional appetite for the narrative even at levels that precede the revenue to support them.
The free cash flow consequences are already visible. Morgan Stanley projected Amazon’s free cash flow at negative $17 billion in 2026; BofA at negative $28 billion. Barclays projected Meta’s free cash flow declining by 90%. Alphabet’s free cash flow was projected to fall from $73.3 billion in 2025 to approximately $8.2 billion — also a decline of approximately 90%. These are not signs of companies investing from strength. They are signs of companies front-loading expenditure in advance of revenue that they believe, but cannot guarantee, will arrive.
The Bain September 2025 report on AI infrastructure capital requirements estimated that AI firms face an $800 billion annual revenue hole by 2030, growing to over $1.5 trillion annually once true total cost of ownership is calculated. That gap will ultimately be funded through enterprise customer pricing. The customers who are locked in when that pricing reprices will have no leverage.
The bubble indicator that most closely tracks to historical precedent: Nvidia’s disclosed investments and financing commitments total approximately $110 billion against $165 billion in revenue — 67% exposure, nearly three times Lucent’s peak ratio before the 2001 telecom bust. The important distinction from the telecom analogy: when telecom carriers failed, the fibre they had laid into the ground remained usable for decades. A buyer could acquire stranded fibre assets at fire-sale prices and build a competitive business on them. AI chips at fire-sale prices three years after purchase provide no competitive foundation. The hardware generation will have moved on. Unlike fibre, AI compute depreciates to near-zero competitive value within one hardware cycle.
This is not a reason to avoid cloud AI. Hyperscalers bear the capital replacement cycle risk — that is why enterprises pay for API access rather than owning GPUs. It is a reason to understand who will absorb the cost when the subsidy ends. The answer is: the enterprise customers who are locked in with no leverage to negotiate or migrate. The $700 billion bet is not yours to make. But its consequences may be yours to absorb.
2. The Three Coalitions and Circular Financing
The AI market is not a collection of independent companies competing on merit. It is organised around three deeply integrated coalitions where the hyperscaler’s infrastructure economics directly enable their AI lab partner’s competitive positioning at the application layer. Understanding this structure is essential to understanding why today’s pricing cannot be taken at face value.
| Coalition | Infrastructure Relationship | Enterprise Lock-In Vector |
|---|---|---|
| Microsoft – OpenAI | Microsoft invested ~$13B in OpenAI; provides exclusive Azure infrastructure; OpenAI accounts for 45% of Microsoft’s $625B commercial revenue backlog | Azure OpenAI Service, Azure Active Directory, Microsoft 365 Copilot, Entra identity stack — every enterprise Microsoft product deepens the lock |
| Amazon – Anthropic | Amazon invested ~$8B in Anthropic; Anthropic spent 104% of all revenue on AWS compute through September 2025 — pricing below true economic cost | AWS Bedrock, IAM, S3 data storage, Bedrock Guardrails, custom VPC integration — Anthropic’s pricing reflects AWS subsidy, not market rate |
| Google – Gemini / DeepMind | Google Cloud invested ~$2B in Anthropic plus owns DeepMind outright; Gemini integrated across Workspace and GCP; proprietary TPU cost advantage | Vertex AI, Google Workspace, BigQuery ML, GCP identity — the enterprise data already lives in Google’s infrastructure before the AI decision is made |
The loop works as follows: a hyperscaler invests in an AI lab. The AI lab spends those funds on compute credits from the same hyperscaler. The hyperscaler books that spend as revenue. It reports growth. It borrows more on the strength of that growth narrative. It invests more in the AI lab. No net new capital enters the ecosystem — money changes labels.
OpenAI’s February 2026 $110 billion funding round illustrates the mechanics concretely. Amazon contributed $50 billion in equity plus a separate $100 billion AWS compute commitment. Nvidia contributed $30 billion via letter of intent — though Nvidia’s CEO described this publicly as “never a commitment.” SoftBank contributed $30 billion, representing the most genuinely liquid capital in the round. The result: OpenAI appears to have raised $110 billion. In practice, a substantial portion is committed compute spend flowing directly back to Amazon’s revenue line.
OpenAI accounts for 45% of Microsoft’s $625 billion commercial revenue backlog. Microsoft’s growth narrative — its justification for continued infrastructure investment — depends on OpenAI continuing to operate and spend on Azure. Anthropic spent 104% of all revenue on AWS through September 2025. Nvidia invested approximately $1.6 billion in CoreWeave; CoreWeave signed $22.4 billion in contracts with OpenAI. The money flows in a complete loop, with each transaction recorded as legitimate revenue or investment depending on which company’s accounts are examined.
Nvidia’s disclosed investments and financing commitments total roughly $110 billion against $165 billion in revenue: 67% exposure, nearly three times Lucent’s peak before the 2001 telecom bust. Unlike the fibre assets left by the telecom collapse — which remained usable for decades — AI chips rendered obsolete by the next hardware generation cannot be repurposed competitively. The circular financing amplifies the Scenario B risk, not the Scenario C opportunity.
When your enterprise builds on Azure OpenAI Service, AWS Bedrock, or GCP Vertex AI, it is not simply choosing a model provider. It is entering a bundled relationship where AI capability is integrated with the hyperscaler’s identity, billing, compliance, and network infrastructure. The switching cost is not “change the API key.” It is: re-audit all compliance documentation, re-train staff, re-certify security frameworks, migrate data pipelines, and renegotiate multi-year contracts — at the moment when the hyperscaler’s pricing power is highest.
3. Three Scenarios — A Is History, B Is Now, C Is the Target
These three scenarios are not equal futures to analyse. Scenario A is the world we just left. Scenario B is the world we are in. Scenario C is the architecture directors must build toward. Understanding which scenario governs your enterprise’s current AI commitments determines the urgency and character of the governance response required.
Scenario A — Frontier Models Keep Scaling (The Era That Just Ended)
If frontier models keep improving meaningfully with more compute, training capacity remains the decisive resource. Model quality differentiates providers and customers pay premiums for frontier capabilities. Incumbent coalitions that built massive training capacity retain lasting advantages. New entrants cannot compete on capability regardless of unit economics.
Board implication: This era is largely over. Evidence of performance plateauing across frontier models, combined with the data exhaustion problem documented in Section 4, means training capacity is no longer the decisive differentiator. This world explains the CapEx boom — every major player was racing to lock in compute before the plateau arrived. That race is largely complete. Scenario A reasoning no longer provides a valid justification for current hyperscaler spending levels.
Scenario B — Models Plateau, Enterprise Lock-In Holds (Where We Are Now)
Model capabilities hit diminishing returns. Most models become “good enough” for enterprise tasks. Competition shifts to inference cost, integration depth, and application quality. In this world, incumbent coalitions who built scale in the early years retain advantages through deep enterprise integrations built during the buildout period; multi-year contracts signed when they had pricing power; compliance certifications and security frameworks built around specific coalition stacks; organisational inertia and switching costs; and brand trust established during the subsidised phase.
Board implication: The current evidence — capability plateauing, inference costs falling, open-source models approaching proprietary quality — confirms we are in this scenario. The accounting subsidy enables incumbent coalitions to foreclose competition at the application layer during the critical window. Switching costs prevent correction even when better alternatives emerge. Lock-in is being established right now. This is the Google search outcome: the temporary subsidy enables permanent competitive advantage — not through superior technology but through integration depth.
Scenario C — Models Plateau, Low Switching Costs (The Target Architecture)
Models plateau and switching costs prove lower than expected. APIs standardise across providers, models are equally capable, and price becomes the primary differentiator. New entrants compete effectively on economics using newer, more efficient infrastructure. Enterprises can route workloads to whatever provider offers the best price-performance ratio at any point.
Board implication: In this world, the accounting mismatch is an investor problem, not an enterprise strategy problem. The market self-corrects. Enterprises that architected for Scenario C — and survived Scenario B — are structurally advantaged. The critical insight: enterprises can engineer themselves into Scenario C regardless of which world the market produces, by building LLM-agnostic architecture from the start.
The decisive question for every board is not which scenario the market will produce. It is whether your enterprise’s AI architecture preserves the flexibility to benefit from Scenario C regardless of which world emerges. An LLM-agnostic architecture is the answer to that question.
4. Why Scenario A Is Already Over — The Performance Plateau
The Scenario A world — where more compute reliably produced better models — ran from GPT-3 in 2020 through approximately GPT-4-class models in 2023. We are past that point. Claiming that continued training investment will yield proportional capability gains is now a claim that requires evidence, not an assumption that can be taken as given. Five mechanisms explain why.
1. Data Exhaustion
Models have largely trained on the entire public internet. Common Crawl, Books3, GitHub, Wikipedia — these corpora are not growing proportionally to model appetite. The high-quality, human-generated text that produces capable language models is a finite resource. Models trained in 2024–2026 are encountering the edge of what is available. High-quality new data is scarce, and the quality of marginal training data declines as the corpus approaches saturation.
2. Model Collapse from Synthetic Data
As AI-generated content floods the web, models that train on it risk “model collapse” — a phenomenon where training on AI-generated text causes the model to lose diversity and approach a performance ceiling. Cambridge and Oxford research groups have documented this risk formally. The more AI-generated content enters the training corpora, the more severe the ceiling effect becomes. This creates a structural constraint on capability improvement that is independent of compute budget.
3. Diminishing Returns on Scale
Making models larger no longer yields proportional gains. Capability gaps between top-tier models dropped from 20–30% to under 5% across most enterprise benchmarks between 2024 and 2026. OpenAI’s o1, Google’s Gemini 2.x, and Anthropic’s Claude 3.x family all occupy a capability band that enterprise tasks rarely require differentiating within. For the vast majority of enterprise workloads — document analysis, summarisation, code generation, structured data extraction — the models are interchangeable on quality. Competition has already shifted to price, integration, and reliability.
4. The Shift to Inference-Time Compute
To compensate for training scale limits, leading labs are investing in reasoning at inference time — chains of thought, reflection loops, extended “thinking.” OpenAI’s o1 and o3 series are the primary commercial expression of this shift. This changes the economics fundamentally: inference compute becomes the constraint, not training compute. And inference compute is a commodity. Any sufficiently capitalised provider with access to modern GPUs can offer inference. The moat of “we trained the biggest model” disappears when the capability differentiator shifts to how the model reasons at inference time, not how large its training corpus was.
5. Specialisation over Generalisation
The frontier of AI capability is shifting to smaller, highly optimised specialist models — coding, legal analysis, financial modelling, medical diagnosis — rather than larger general models. Specialist models built on open-weight foundations can outperform frontier general models on domain-specific benchmarks at a fraction of the inference cost. This further erodes the “best model” moat that Scenario A reasoning depends upon.
When the best model is no longer clearly differentiated, competition shifts to integration depth, pricing, compliance frameworks, and switching costs. This structural shift is what makes Scenario B the primary risk for enterprise directors — and what makes the next 12–18 months the decisive window. Organisations that allow lock-in to deepen during this period, under the assumption that the model quality gap justifies the dependency, are making a bet that the evidence does not support.
5. The Accounting Fiction — When CapEx Is Really an Expense
Princeton CITP research and independent analyst work on AI chip economics makes a straightforward argument: AI GPUs are being treated as long-lived assets when their economic and competitive useful life is one to three years. A Google architect, speaking anonymously, assessed that GPUs running at the 60–70% utilisation standard for AI workloads survive one to two years under thermal and electrical stress — three years at the maximum. Yet companies depreciate these chips over five to six years, spreading replacement costs across a period roughly twice the assets’ competitive life. The Economist labelled the result “the $4 trillion accounting puzzle at the heart of the AI cloud.”
The framing matters for enterprise strategy: this is not primarily a financial reporting problem. It is a competition problem. The accounting mismatch creates what Princeton CITP calls a “competitive subsidy” — artificially low reported costs that let incumbent coalitions price AI APIs below their true economic cost during the critical years when enterprise customer relationships are being formed.
Consider Microsoft’s approximately $80 billion annual AI infrastructure spend. If half goes to computing hardware with a true three-year lifespan, Microsoft faces actual replacement costs of roughly $13 billion per year. By depreciating over six years, its reported annual depreciation is only $6.5 billion — creating an apparent $6.5 billion annual cushion to subsidise OpenAI’s API pricing. The same logic applies to Amazon subsidising Anthropic through AWS, and to Google subsidising Gemini through GCP. The subsidy is real; its accounting treatment obscures it.
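The arithmetic is simple enough to reproduce directly. A minimal sketch, treating the article's round figures as illustrative inputs rather than reported financials:

```python
# Reproduces the rough arithmetic above; all inputs are illustrative approximations.
annual_ai_capex = 80e9                  # ~$80B annual AI infrastructure spend
hardware_spend = annual_ai_capex * 0.5  # assume half goes to compute hardware (~$40B)

true_life_years = 3                     # economic / competitive useful life
book_life_years = 6                     # depreciation schedule actually applied

true_annual_cost = hardware_spend / true_life_years       # ~ $13.3B per year
reported_depreciation = hardware_spend / book_life_years  # ~ $6.7B per year
apparent_cushion = true_annual_cost - reported_depreciation

print(f"True replacement cost:    ${true_annual_cost / 1e9:.1f}B / year")
print(f"Reported depreciation:    ${reported_depreciation / 1e9:.1f}B / year")
print(f"Apparent pricing cushion: ${apparent_cushion / 1e9:.1f}B / year")
```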
| Company | Depreciation Schedule | Change (2024–2025) | Financial Impact |
|---|---|---|---|
| Amazon (AWS) | 6 yrs → 5 yrs | Shortened ↓ Feb 2025 | $700M operating income hit; $920M early-retirement charge Q4 2024 |
| Meta | 5 yrs → 5.5 yrs | Extended ↑ Jan 2025 | $2.9B depreciation reduction — inflates reported earnings |
| Microsoft / Google | 4–6 yrs | No public change | Defending extended schedules against analyst pressure |
Amazon’s move signals that even the hyperscalers know the six-year assumption is unsustainable. Meta’s move in the opposite direction signals that short-term earnings management is overriding economic reality. The divergence is itself evidence that no empirically grounded consensus on AI chip useful life exists.
One analyst estimate puts the aggregate earnings overstatement — across hyperscalers continuing to apply six-year schedules — at approximately $176 billion from 2026 to 2028. For enterprise buyers, this means the API pricing being offered today is structurally subsidised by accounting assumptions that are already contested at the CFO level.
The “value cascade” defence is worth examining directly. Hyperscalers argue that chips move through productive phases: training workloads in years one and two, inference workloads in years three and four, batch processing and archival tasks in years five and six. The problem with this defence is a single data point: Blackwell delivers approximately 25 times the inference efficiency of the H100. In a power-constrained data centre, running H100s for inference once Blackwell scales at volume is economically irrational. The cascade argument breaks at the generation boundary — and hardware generations now arrive every 18 to 24 months.
FASB ASU 2025-06, finalised on 30 July 2025, establishes new AI software cost recognition standards under ASC 350-40. Boards must ensure their company’s AI capitalisation practices comply — and that auditors are applying the standards consistently rather than following hyperscaler precedent on depreciation schedules. The audit committee’s AI governance mandate now includes reviewing whether the company’s own AI infrastructure accounting reflects economic reality.
6. Unit Economics — Why Today’s Prices Are Artificially Low
AI API prices have fallen approximately 280 times since November 2022 — from roughly $20 to $0.07 per million tokens, per Epoch AI data. This is frequently cited as evidence of a healthy, competitive market. It is more accurately evidence of a land-grab subsidised by circular financing and loss-leader pricing. The prices reflect strategic intent, not economic sustainability.
OpenAI projected approximately $25 billion in annual recurring revenue against a projected net loss of $14 billion in 2026. The business loses roughly 56 cents for every dollar of revenue it earns. The gap is funded by the $110 billion in circular financing described in Section 2 — Amazon equity and compute commitments, Nvidia’s letter of intent, and SoftBank’s genuinely liquid capital. OpenAI is not a technology business operating at market rates. It is a technology business operating on strategic subsidy to establish market position before the subsidy must end.
Anthropic reported approximately $19 billion in annual recurring revenue as of March 2026. Through September 2025, it spent 104% of all revenue on AWS compute. Every dollar of Anthropic API revenue is cross-subsidised by Amazon’s $8 billion equity position and its compute commitments. The pricing an enterprise sees when it signs an Anthropic API agreement is not a market rate. It is a strategic acquisition price — set to establish and deepen enterprise relationships before true economic cost recovery is required.
| Lab | ARR (Mar 2026) | 2026 Net Loss (est.) | Cloud Dependency | Revenue Mix |
|---|---|---|---|---|
| OpenAI | ~$25B | ~$14B | Primarily Azure | 75% consumer / 25% enterprise |
| Anthropic | ~$19B | ~$6–9B | 104% of revenue on AWS | 80% enterprise / 20% consumer |
The consumer versus enterprise distinction matters for boards making procurement decisions. OpenAI’s consumer dominance — ChatGPT with approximately 5.6 billion monthly visits — builds brand recognition but generates lower-quality revenue: subscription-based, with heavy users consuming far more than they pay. Anthropic’s enterprise focus, with Claude Code at approximately $2.5 billion in annual recurring revenue by February 2026, generates API-based revenue with computable per-token economics. Neither is pricing at market rate. Both are pricing to establish position. Enterprise procurement decisions made on the assumption that current pricing reflects sustainable economics are decisions made on a false premise.
The agentic AI cost trap compounds the concern. Standard conversational AI interactions cost approximately $0.01 to $0.05 per session. CLI sessions using tool-calling and model context protocol frameworks cost $20 to $50 per hour of use. Simple agentic tasks cost $2 to $40 per completion. Fully autonomous agents working on complex, multi-step projects can exceed $100 per completion. These cost figures are expressed at today’s subsidised API rates. Enterprises currently building agentic workflows are creating future operating expenditure obligations that do not appear on current balance sheets — because the true cost of the inference that powers those workflows is masked by the subsidy.
At current per-token prices, what is your projected annual AI inference spend in 2028, 2029, and 2030? Now what is that figure if prices rise three to five times as the depreciation subsidy unwinds? Can your organisation sustain the agentic workflows you are currently building if that happens? Which provider will have the leverage to reprice that inference, and what will your switching options be at that point?
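A rough sensitivity projection makes those questions concrete. The sketch below is illustrative only; the token volume, growth rate, and blended price are placeholders for figures your finance and engineering teams would supply:

```python
# Every input below is a placeholder to be replaced with metered consumption data.
monthly_tokens = 2_000_000_000        # tokens consumed per month today
annual_volume_growth = 0.60           # assumed growth as agentic workflows scale
price_per_million_tokens = 3.00       # blended $ per 1M tokens at today's subsidised rates

def annual_spend(years_from_now: int, price_multiplier: float) -> float:
    """Projected annual inference spend for a future year.

    price_multiplier: 1.0 = today's rates; 3.0 to 5.0 = repriced after the subsidy unwinds.
    """
    yearly_tokens = monthly_tokens * 12 * (1 + annual_volume_growth) ** years_from_now
    return (yearly_tokens / 1_000_000) * price_per_million_tokens * price_multiplier

for year, offset in [(2028, 2), (2029, 3), (2030, 4)]:
    base = annual_spend(offset, 1.0)
    repriced = annual_spend(offset, 4.0)   # mid-point of a 3-5x repricing scenario
    print(f"{year}: ${base:,.0f} at today's rates, ${repriced:,.0f} if prices normalise 4x")
```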
7. Hardware Replacing Labour — The Layoff Connection
The wave of technology sector workforce reductions since 2023 is not random cost-cutting. It follows a deliberate capital reallocation from labour to hardware. The arithmetic is visible at the company level: $700 billion in hardware capex competes for the same cash from operations as payroll. When a hyperscaler commissions a new data centre full of GPUs, it is acquiring capital that performs tasks previously done by people. Amazon, Meta, and Microsoft each announced simultaneous capacity expansion and workforce reduction programmes in 2025 and 2026. These are not coincidental — they are expressions of the same strategic bet.
The scale of the Labour-as-a-Service bet is important context. The total addressable market for software tools is approximately $5 trillion — the global enterprise IT budget. The total addressable market for autonomous AI agents that replace human knowledge workers is approximately $40 trillion — the global knowledge labour budget. Hyperscalers are not building $700 billion of infrastructure for software tooling. They are building infrastructure for digital workers. The pricing strategy follows: digital workers at 20–30% of the cost of equivalent human labour — while running on subsidised inference — creates compelling return on investment that justifies enterprise adoption before the subsidy ends.
The capex is front-loaded; the returns are back-loaded. Whether the bet pays off depends on three conditions: whether agentic AI reaches the productivity claims being made for it; whether inference costs remain low enough to make digital workers economically rational at scale; and whether the enterprises that build on these platforms retain any pricing leverage when the bill comes due. None of these conditions is certain. All three must hold simultaneously for the hyperscalers’ investment thesis to validate.
When your organisation uses AI to reduce headcount and increase productivity, what is the true long-term cost of the inference that replaces that labour? If inference prices normalise to economic cost — as they must when the accounting subsidy unwinds — do the workforce economics still hold? And which provider will have the leverage to reprice that inference when the workforce reduction has already occurred and the organisational capability to perform that work without AI has been lost?
8. Why the Competitive Window Is Three Years
The claim that infrastructure benefits wear out in three years is not a pessimistic forecast. It is supported by the hardware generation cadence, by market pricing data, and by hardware vendors’ own statements.
Nvidia’s generational cycle: the H100 (“Hopper”) arrived in late 2022. Blackwell (GB200) launched in March 2025, delivering 4–5 times faster inference than the H100. Jensen Huang at the Blackwell launch: “Nobody wanted the previous generation of chip anymore. There are circumstances where Hopper is fine. Not many.” Nvidia then announced Rubin (arriving 2026) less than twelve months after Blackwell, with 7.5 times greater performance than Blackwell. Each generation renders the previous generation economically obsolete for frontier workloads within approximately 18–24 months.
Investment analyst Gil Luria at D.A. Davidson puts the market value decline at 85–90% within three to four years. Meta’s own Llama infrastructure study recorded a 9% annual GPU hardware failure rate — meaning roughly one third of GPUs physically fail before the four-year mark, independent of obsolescence. Barclays cut earnings forecasts for AI firms by up to 10% for 2025 to account for more realistic depreciation assumptions. The convergence of hardware lifecycle data, market value data, and analyst assessments all point to the same conclusion: three years is the economic useful life of an AI chip in an active workload.
The practical consequence for boards is precise: AI infrastructure investments made in 2025–2026 have a competitive economic useful life of one to three years. Any capital expenditure proposal for on-premises GPU infrastructure that uses a standard 10% WACC and five-to-seven-year depreciation schedule is economically incorrect. The Bain September 2025 report quantifies the aggregate problem: AI firms will face an $800 billion annual revenue hole by 2030 to fund capital replacement, growing to over $1.5 trillion annually once true total cost of ownership is calculated. That gap will be funded through customer pricing — and locked-in customers have no leverage when it arrives.
Enterprises do not need to own AI infrastructure. Hyperscalers bear the capital replacement cycle risk — that is precisely why enterprises buy API access rather than GPUs. The strategic implication is not “avoid cloud AI.” It is “avoid letting any single hyperscaler’s account team make your infrastructure decisions for you.” The GPU replacement burden is the hyperscaler’s problem as long as the enterprise has not surrendered its ability to switch.
9. The Google Search Lesson: Lock-In Beats Better Technology
The DOJ v. Google search case provides the analytical framework for AI application-layer lock-in. The parallel is not decorative — it is structural.
Google maintained search dominance through default placements and integration lock-in. It paid billions annually to be the default search engine on Safari, Android, and other platforms. These defaults created inertia: users rarely switched even when alternatives were one click away. The court found this lock-in to be anticompetitive. The insight: in a market where switching took literally a few clicks, default positions and integration created durable advantages that competitors could not overcome with better economics or comparable quality.
Now consider the AI application layer. Switching costs are substantially higher than search:
- Enterprise API integrations require months of engineering work to rebuild
- Fine-tuned models must be retrained from scratch or migrated on new infrastructure
- Security certifications, compliance frameworks, and audit trails are built around specific coalition stacks
- Multi-year contracts lock in pricing and service commitments
- Organisational workflows, training, and tooling are built around specific providers
- In regulated industries, the AI governance documentation references specific provider controls — changing provider means re-certifying the entire compliance architecture
If Google could maintain dominance in search with switching costs measured in seconds, how durable might AI application-layer lock-in prove when enterprises are integrated into the hyperscaler infrastructure stack across identity, data, security, and billing?
Technology deflation makes this worse, not better, for newcomers. Capability improvements help everyone, but they help incumbents more when lock-in already exists: a coalition with locked-in customers upgrades its installed capability through routine replacement spending without losing a single account. New entrants with better technology and lower unit costs must still win customers who sit behind 18 months of re-engineering switching costs. A new entrant with a 2026 Rubin-based inference advantage cannot displace a 2025 Blackwell-based incumbent if the enterprise customer cannot credibly threaten to migrate.
10. Scenario C — How to Build Model-Agnostic
The goal is not to build your own infrastructure from scratch. Hyperscalers bear the capital replacement cycle risk — that is why enterprises pay for API access rather than owning GPUs. The goal is to retain the ability to switch between providers when inference becomes a commodity and prices race to the bottom. As performance plateaus and competition intensifies, inference pricing will trend toward marginal cost. You want to be positioned to route to the lowest bidder. That requires four things.
An LLM-agnostic architecture is the deliberate engineering of low switching costs. Think of it like a multi-currency treasury. A company that invoices only in USD is exposed when the dollar strengthens. A company with a multi-currency structure retains the option to invoice in whichever currency is most favourable. The LLM-agnostic enterprise retains the ability to route workloads to whichever model provider offers the best price-performance ratio at any point. The locked-in enterprise has surrendered that option.
1. The Interchangeable Model Strategy
Instead of coding to a specific model’s proprietary quirks — GPT-4’s function calling syntax, Claude’s XML formatting conventions, Gemini’s grounding API — use standardised prompt structures and universal orchestration frameworks such as LangChain, Haystack, or LlamaIndex. When DeepSeek or a Llama-based host offers 50% lower cost for equivalent performance on your workload, you update the model identifier in a configuration file and realise the savings immediately — without a refactor of application code, without re-testing integrations, without re-training staff.
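A minimal sketch of that pattern, assuming LiteLLM as the abstraction layer (the model identifier shown is a placeholder; the point is that it lives in configuration, not application code):

```python
import json
from litellm import completion  # one call signature across 100+ providers

# The only provider-specific detail lives in a config file, not in application code.
# config.json might contain: {"model": "anthropic/claude-3-5-sonnet", "max_tokens": 1024}
with open("config.json") as f:
    cfg = json.load(f)

def summarise(document: str) -> str:
    """Application code never references a specific vendor SDK."""
    response = completion(
        model=cfg["model"],                     # swap providers by editing config.json only
        max_tokens=cfg.get("max_tokens", 1024),
        messages=[
            {"role": "system", "content": "Summarise the document in three bullet points."},
            {"role": "user", "content": document},
        ],
    )
    return response.choices[0].message.content
```

When a cheaper, equally capable host becomes available, the change is one line in config.json; the function body and its tests do not change.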
2. Provider-Agnostic Hosting
Avoid using a model exclusively through its parent cloud — Claude only through AWS Bedrock, GPT only through Azure OpenAI Service. Use model aggregators: Together AI, Anyscale, and Groq run the same open-weight models (Llama, Mixtral) at lower cost than the Big Three hyperscalers. If Azure becomes too expensive, you point API calls to Fireworks.ai or Lambda Labs the next day. LiteLLM — which provides a single OpenAI-compatible interface to more than 100 providers — is the standard implementation tool for this approach.
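Because these aggregators expose OpenAI-compatible endpoints, the standard OpenAI client can simply be repointed at them. A sketch under that assumption; the base URLs and model names are illustrative and should be confirmed against each provider's documentation:

```python
import os
from openai import OpenAI

# The same OpenAI-compatible client, pointed at different hosts.
PROVIDERS = {
    "fireworks": {"base_url": "https://api.fireworks.ai/inference/v1",
                  "api_key": os.environ.get("FIREWORKS_API_KEY", ""),
                  "model": "accounts/fireworks/models/llama-v3p1-70b-instruct"},
    "together":  {"base_url": "https://api.together.xyz/v1",
                  "api_key": os.environ.get("TOGETHER_API_KEY", ""),
                  "model": "meta-llama/Llama-3.3-70B-Instruct-Turbo"},
}

def ask(provider: str, prompt: str) -> str:
    p = PROVIDERS[provider]
    client = OpenAI(base_url=p["base_url"], api_key=p["api_key"])  # only the endpoint changes
    resp = client.chat.completions.create(
        model=p["model"],
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```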
3. Small-Model Optimisation — The Semantic Router
The biggest cost saving is not switching providers — it is switching model classes. For approximately 80% of enterprise tasks — summarisation, classification, structured data extraction, document question-and-answer — a frontier model is substantial overkill. Implement a semantic router: it evaluates each incoming request and routes simple tasks to a cheap small model (Llama 3 8B, Mistral Nemo) and complex reasoning to a frontier model. This automatically shifts volume to the cheapest option based on actual task difficulty, maintaining quality where it matters while reducing inference spend on the tasks where quality differentiation does not exist.
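A deliberately simplified sketch of the routing idea. A production semantic router would score request complexity with embeddings or a lightweight classifier model rather than a hard-coded task list, and the model identifiers below are placeholders:

```python
from litellm import completion

CHEAP_MODEL = "ollama/llama3"                    # small open-weight model, locally hosted
FRONTIER_MODEL = "anthropic/claude-3-5-sonnet"   # reserved for reasoning-heavy work

SIMPLE_TASKS = {"summarise", "classify", "extract", "translate"}

def route(task_type: str, prompt: str) -> str:
    """Send routine work to the cheap model, reasoning-heavy work to the frontier model."""
    model = CHEAP_MODEL if task_type in SIMPLE_TASKS else FRONTIER_MODEL
    response = completion(model=model, messages=[{"role": "user", "content": prompt}])
    return response.choices[0].message.content

# ~80% of volume flows through the cheap route; quality-sensitive work still reaches
# the frontier model.
summary = route("summarise", "Summarise this contract clause: ...")
analysis = route("multi_step_reasoning", "Assess the counterparty risk in ...")
```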
4. Data Portability — The Real Lock-In
The data layer, not the model layer, is often where lock-in is most durable. Enterprises that use a provider’s proprietary AI-powered database — Azure AI Search, AWS OpenSearch Serverless, Vertex AI Vector Search — cannot migrate the model without also migrating the entire data infrastructure. Use standalone vector databases (Qdrant, Milvus, Weaviate, or pgvector on self-managed PostgreSQL) that connect to any language model. Then swapping the model does not require re-indexing or migrating the data. The architectural principle: keep the data layer and the model layer independently swappable.
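A minimal sketch of that principle, assuming a locally run open embedding model and a self-hosted Qdrant instance (the collection name, documents, and URL are placeholders):

```python
from sentence_transformers import SentenceTransformer     # open embedding model
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

# Open embeddings + standalone vector DB: neither layer knows which LLM sits on top.
embedder = SentenceTransformer("all-MiniLM-L6-v2")    # 384-dimensional, runs locally
client = QdrantClient(url="http://localhost:6333")     # self-hosted Qdrant instance

client.create_collection(
    collection_name="policy_docs",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)

docs = ["Data retention policy v3 ...", "Vendor risk framework ..."]
client.upsert(
    collection_name="policy_docs",
    points=[
        PointStruct(id=i, vector=embedder.encode(text).tolist(), payload={"text": text})
        for i, text in enumerate(docs)
    ],
)

# Retrieval is independent of whichever model answers the question downstream.
hits = client.search(
    collection_name="policy_docs",
    query_vector=embedder.encode("What is our data retention period?").tolist(),
    limit=3,
)
context = "\n".join(h.payload["text"] for h in hits)
```

Because the vectors and payloads live in infrastructure the enterprise controls, swapping the downstream language model requires no re-indexing and no data migration.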
The Abstraction Layer
The core implementation tool is a provider-neutral API gateway that sits between the enterprise’s applications and the model providers:
- LiteLLM: Open-source proxy providing a unified OpenAI-compatible API interface over 100+ providers — OpenAI, Anthropic, Google Vertex, Cohere, Mistral, and local models. Switching model providers requires changing a configuration value, not re-writing application code.
- Portkey: AI gateway with load balancing, fallback routing, and cost management across providers. Also serves as the AI control plane for budget caps, human-in-the-loop routing, and semantic logging.
- LangChain / LangGraph: Orchestration framework with provider-agnostic model interfaces; a ChatAnthropic call and a ChatOpenAI call share the same interface contract.
- OpenRouter: Unified API routing to multiple frontier providers with automatic failover.
- MCP (Model Context Protocol): Anthropic’s open standard for tool and data connectivity that any compliant model can consume — preventing tool-layer lock-in even when the model layer changes.
The Data Architecture
The most durable form of lock-in is data lock-in — when fine-tuning datasets, embedding indices, and custom model weights live exclusively inside a proprietary hyperscaler’s managed service. Mitigation requires three commitments:
- Maintain fine-tuning datasets in provider-agnostic formats (Parquet, JSON) in enterprise-controlled storage — not inside the hyperscaler’s managed ML platform
- Use open embedding models (nomic-embed-text, sentence-transformers) with vectors stored in provider-agnostic vector databases (Qdrant, Weaviate, pgvector on self-managed PostgreSQL) rather than proprietary managed vector stores
- Export and independently store all fine-tuned model weights — not only in the hyperscaler’s model registry
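A short sketch of the first commitment above, writing a hypothetical fine-tuning dataset to enterprise-controlled S3-compatible storage in Parquet (the bucket URI and columns are placeholders):

```python
import pandas as pd

# Hypothetical fine-tuning examples kept in an open, provider-agnostic format.
examples = pd.DataFrame({
    "prompt":     ["Classify this invoice ...", "Summarise this claim ..."],
    "completion": ["category: utilities",       "summary: water damage, ..."],
    "source":     ["erp_export_2026_01",        "claims_queue_2026_02"],
})

# Write to enterprise-controlled, S3-compatible storage rather than a managed ML platform.
# The bucket URI is a placeholder; pandas handles s3:// paths when s3fs is installed.
examples.to_parquet("s3://enterprise-ai-assets/fine-tuning/claims_v1.parquet", index=False)

# The same file re-loads anywhere: a different cloud, a local GPU cluster, or an
# aggregator's fine-tuning pipeline, with no export request to the original provider.
reloaded = pd.read_parquet("s3://enterprise-ai-assets/fine-tuning/claims_v1.parquet")
```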
Operational Portability
- Infrastructure as Code: Use Terraform or Pulumi to define environments in vendor-neutral ways, making it straightforward to replicate your stack on a different cloud provider without rebuilding from scratch.
- Containerised workloads: Deploy AI services on Kubernetes so they run identically across any major cloud provider. The workload becomes portable independently of the underlying infrastructure.
- Open standards: Adopt interoperable protocols — Model Context Protocol, Apache Arrow, Parquet — for data and tool connectivity that any compliant model can consume without proprietary adaptation.
Contractual Safeguards
- Negotiate exit clauses: require data portability (export in open formats) and service continuity terms as standard in any hyperscaler AI agreement
- Avoid proprietary managed AI service dependencies that are unique to one platform and have no open-standard equivalent
- Retain model ownership: clarify contractual rights to fine-tuned weights and proprietary training data used on the provider’s infrastructure
- All hyperscaler AI contracts must disclose the provider’s current hardware depreciation schedule; planned changes to that schedule; and pricing mechanisms for years three through seven of the relationship
The Cost Calculation
An LLM-agnostic architecture costs approximately 15–25% more in engineering time at initial build compared to a hyperscaler-native integration. Against that upfront premium: migration at year four or beyond, when pricing pressure creates the incentive to switch, is estimated at 18–36 months of re-engineering work. The upfront premium is the insurance cost. The alternative is paying the migration cost at the moment when the counterparty has maximum pricing leverage.
11. Sovereignty, Sustainability, and Agentic AI
Sovereign AI
EU enterprises are disproportionately adopting Mistral, Llama, and DeepSeek to avoid US cloud lock-in — driven by GDPR compliance requirements, EU AI Act certification architecture, and data sovereignty obligations under national law. The pattern is not ideological preference for European technology; it is rational risk management, given that the CLOUD Act allows US authorities to compel disclosure of data held by US-domiciled cloud providers regardless of where the data centre is physically located.
In Asia-Pacific, Singapore MAS requirements, Hong Kong SFC rules, and the Australian Privacy Act are driving similar decisions toward localised or open-weight deployment models. GCC sovereign wealth funds — UAE’s Mubadala and Saudi Arabia’s PIF — are requiring “Hard Cash for Hardware” investment structures that prioritise physical infrastructure ownership rather than cloud API dependency. The sovereign AI movement is reinforcing the same model-agnostic architecture that protects against commercial lock-in. The governance imperative and the strategic imperative are aligned: the architecture that satisfies data sovereignty requirements is the same architecture that preserves commercial flexibility.
Power, Water, and Sustainability
Microsoft, Google, and Amazon have all been required to revise net-zero commitments after AI data centre energy demand spiked beyond projections. A single complex AI query can use 10 to 30 times the energy of a standard web search. EU CSRD, UK ISSB-aligned reporting, and SEC climate disclosure rules increasingly require Scope 3 emissions disclosure — including cloud compute used in enterprise operations. This creates a regulatory reporting obligation that is directly tied to the volume and character of AI API consumption.
Power grid access is now the binding constraint on hyperscaler data centre expansion in most major markets. This creates structural upward pressure on inference pricing that is independent of the depreciation accounting cycle: as energy costs rise, and as power grid bottlenecks constrain capacity additions, the cost floor for AI inference rises accordingly. Enterprises locked into a single provider have no leverage when that provider’s energy costs drive inference price increases. Enterprises with multi-provider architecture can route away from the highest-cost provider as energy economics shift.
Agentic AI and Deferred Costs
The most expensive AI workload is the agentic workflow. Agentic tasks consume four to fifteen times more tokens per completion than simple conversational AI. A 30-minute Claude Code session can consume API value equivalent to one to three hours of human knowledge-worker time — at today’s subsidised rates. When pricing normalises to true economic cost, enterprise agentic deployment costs could increase five to ten times against current rates. NACD data shows only 6% of boards have AI-related management reporting metrics. Most boards approving agentic AI rollouts have no visibility into actual per-task compute costs — only blended “AI budget” line items that obscure the unit economics entirely.
What is the actual compute cost per completed agentic task today, measured at the per-token level? What does that cost become if inference prices increase three times as the depreciation subsidy unwinds? Have the agentic AI business cases approved by this board been stress-tested at that price level? If not, the board may have approved commitments whose underlying economics have not been evaluated under plausible future pricing conditions.
12. Board Fiduciary Duties Under Caremark: When Is This a Board Issue?
The Caremark standard — derived from the 1996 Delaware Chancery case In re Caremark International Inc. Derivative Litigation — establishes that directors have a duty to maintain oversight systems for known, material, foreseeable risks. That duty does not require directors to prevent every bad outcome; it requires that they have a system in place to identify and manage risks that a reasonable director would recognise as material.
AI infrastructure lock-in meets all three Caremark criteria:
- Known: The Princeton CITP analysis, Bain capital requirements report, analyst commentary from Barclays and D.A. Davidson, and The Economist’s “$4 trillion accounting puzzle” coverage collectively ensure that AI infrastructure lock-in risk is publicly identified. As of March 2026, ignorance is not a viable defence.
- Material: A company that is locked into a single hyperscaler coalition faces concentrated pricing power from a counterparty at the moment when that counterparty’s true infrastructure costs must be recovered. The Bain $800 billion annual revenue hole is the aggregate demand that will be presented to enterprise customers through pricing — and the locked-in customer has no leverage. This is a balance sheet and income statement risk, not a reputational preference.
- Foreseeable: The timing mechanism is disclosed. The three-to-six-year window is identified. The Scenario B pathway is documented. Directors who approve AI infrastructure commitments with knowledge of this framework cannot later claim the risk was unforeseeable.
NACD’s 2025 Board Practices and Oversight Survey found that only 36% of boards have implemented a formal AI governance framework — and only 6% have established AI-related management reporting metrics. These figures mean the majority of boards currently lack both the policy infrastructure and the information flow to apply Caremark-standard oversight to AI infrastructure decisions. Approving a multi-year hyperscaler AI commitment without that infrastructure in place is precisely the scenario Caremark addresses.
- Delaware SB21 (27 February 2026): New liability shields for interested-party transactions — but only when independent directors or disinterested stockholders approve them through a documented process. This raises the process bar rather than lowering it.
- FASB ASU 2025-06 (30 July 2025): Finalised new AI software cost recognition standards. Directors must understand how their company’s AI capitalisation policies align with these standards and whether hyperscaler contracts exploit depreciation schedule ambiguities.
- EU AI Act (enforceable 2 August 2026): Compliance architecture for Annex III high-risk AI systems is tied to specific provider controls. Changing AI infrastructure provider post-certification requires re-certifying the compliance architecture. Boards that approve deep hyperscaler integration without considering EU AI Act certification implications are compounding commercial lock-in with regulatory lock-in.
13. The Board Action Plan: Seven Steps to Mitigate Lock-In Risk
These seven measures should be formalised in an AI Infrastructure Policy approved at board level and reviewed annually.
Mandate LLM-Agnostic Architecture
All new AI integrations must use provider-agnostic abstraction layers (LiteLLM, Portkey, or equivalent). No application may call a proprietary hyperscaler SDK directly where an OpenAI-compatible open alternative exists. This policy applies to internal development and to third-party vendors building AI solutions for the company.
Enforce Model Diversity
No single provider may exceed 60% of total AI inference spend. This is not a performance constraint — it is a pricing leverage policy. The moment a single provider holds 80%+ of your AI workload, their contract team knows you cannot credibly threaten migration. At 60%, the threat is real.
Require Provider-Agnostic Data Storage
All fine-tuning datasets, embedding indices, and model weights must be stored in enterprise-controlled infrastructure using open formats. No AI training asset may exist only inside a hyperscaler's proprietary managed service. This is the single most important protection against data lock-in.
Commission an Annual Switching Cost Audit
Once per year, the CTO must report to the board: "What is our current cost, in engineer-months and service disruption days, to migrate 100% of AI workloads to an alternative provider within 90 days?" If the answer exceeds 12 months of re-engineering work, lock-in has occurred and a remediation plan is required.
Require Pricing Transparency in AI Contracts
All hyperscaler AI contracts must include clauses disclosing: the provider's current hardware depreciation schedule; any planned changes to that schedule; the pricing mechanism for API access in years three through seven; and the exit rights and data portability obligations. This is standard practice in enterprise software procurement and should be standard in AI procurement.
Apply a 25–30% Obsolescence Discount to On-Premises GPU Capex
Any capital expenditure proposal for owned AI hardware infrastructure must be evaluated using a 25–30% annual discount rate — not a standard 10% WACC. This reflects the one-to-three-year economic useful life of AI chips. A project that returns capital in year five at 10% WACC does not return capital at all at 30% WACC if the infrastructure is economically obsolete by year three.
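The effect of the higher discount rate can be checked with a ten-line net present value calculation. The outlay and benefit figures below are placeholders chosen only to illustrate the mechanism:

```python
# Illustrative only: cash flows and rates are placeholders, not a recommendation.
def npv(rate: float, cash_flows: list[float]) -> float:
    """Net present value; cash_flows[0] is the year-0 outlay (negative)."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

capex = -10_000_000                                        # year-0 GPU cluster purchase
five_year_returns = [capex] + [2_800_000] * 5              # business case: 5 years of benefit
three_year_returns = [capex] + [2_800_000] * 3 + [0, 0]    # benefits stop at obsolescence

print(f"10% WACC, 5-yr benefits:  {npv(0.10, five_year_returns):>12,.0f}")   # positive: approved
print(f"30% rate, 5-yr benefits:  {npv(0.30, five_year_returns):>12,.0f}")   # negative
print(f"30% rate, 3-yr benefits:  {npv(0.30, three_year_returns):>12,.0f}")  # clearly negative
```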
Demand Agentic AI Unit Economic Disclosure
Any board-approved agentic AI deployment must include a unit economics report showing: actual compute cost per completed task at current pricing; projected cost at 3× and 5× current pricing; the headcount or productivity assumption the deployment is premised on; and the break-even pricing at which the deployment becomes uneconomic. This report must be refreshed annually. Boards that approve agentic workflows without this disclosure are potentially approving negative-margin commitments that are not visible on current balance sheets.
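A sketch of what such a unit economics report might compute, with every figure a placeholder to be replaced by metered per-token data from the provider's billing exports:

```python
# Hypothetical unit economics for one agentic workflow; all inputs are placeholders.
cost_per_task_today = 12.00    # blended compute cost per completed task at current pricing
human_cost_per_task = 45.00    # loaded cost of the human work the task replaces
tasks_per_year = 50_000

for multiple in (1, 3, 5):
    cost = cost_per_task_today * multiple
    annual = cost * tasks_per_year
    margin_vs_human = human_cost_per_task - cost
    print(f"{multiple}x pricing: ${cost:6.2f}/task, ${annual:>12,.0f}/yr, "
          f"margin vs human ${margin_vs_human:+.2f}/task")

# Break-even multiple: the point at which the digital worker stops being cheaper.
break_even_multiple = human_cost_per_task / cost_per_task_today
print(f"Deployment becomes uneconomic above ~{break_even_multiple:.1f}x current pricing")
```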
14. Red Flags vs. Green Flags
| Red Flags — High Lock-In Risk | Green Flags — Defensible Architecture |
|---|---|
| × All AI workloads run on a single hyperscaler (Azure, AWS, or GCP) | ✓ AI inference spend distributed across at least two independent providers |
| × Applications call provider-specific SDKs directly (azure-openai, @anthropic-ai/sdk) | ✓ All model calls routed through a provider-agnostic gateway (LiteLLM, Portkey) |
| × Fine-tuning datasets stored only in the hyperscaler's managed ML platform | ✓ Fine-tuning data in enterprise-controlled S3-compatible storage in open formats |
| × Embedding indices in Azure AI Search, AWS OpenSearch Serverless, or Vertex AI Vector Search only | ✓ Embedding indices in Qdrant, Weaviate, or pgvector on independently managed infrastructure |
| × Multi-year hyperscaler AI contracts with no data portability or pricing transparency clauses | ✓ Contracts include exit rights, data portability obligations, and annual price review mechanisms |
| × Board has never received a switching cost analysis | ✓ Annual switching cost audit reported to audit committee; remediation plan if >12 months |
| × On-premises GPU capex evaluated at standard 10% WACC over 5 years | ✓ On-premises GPU capex evaluated at 25–30% obsolescence discount over 3-year useful life |
| × EU AI Act compliance certification built entirely around one provider's control architecture | ✓ EU AI Act compliance architecture documented as provider-agnostic; portable to alternative providers |
| × Agentic AI deployments approved without per-task unit economic disclosure | ✓ All agentic AI business cases include unit economics at 1×, 3×, and 5× current pricing |
15. Fiduciary Checklist for Directors
The following questions are appropriate for directors to ask of management at the point of approving any material AI infrastructure commitment or AI technology strategy.
“What is the total cost, in engineering months, to migrate our entire AI workload to a different model provider within 90 days? When did we last measure this?”
“What depreciation schedule are our AI infrastructure providers using for their GPU hardware? How does that compare to the one-to-three-year economic lifespan identified in the Princeton CITP and D.A. Davidson analyses? Have we modelled the pricing implications of depreciation schedule normalisation on our AI cost base in years three through seven?”
“Do our hyperscaler AI contracts include pricing transparency clauses and data portability obligations? What are our exit rights and at what cost?”
“Are our fine-tuning datasets, embedding indices, and trained model weights stored in provider-agnostic infrastructure that we control? Or do they exist only inside a hyperscaler's proprietary managed service?”
“Is our EU AI Act compliance certification architecture portable to an alternative AI provider? If we needed to switch providers for competitive reasons, what would the re-certification cost and timeline be?”
“For each agentic AI deployment we have approved: what is the actual compute cost per completed task today? What is that cost at three times current pricing? Have our business cases been tested at that level, and do they still hold?”
“Has management presented us with a formal AI Infrastructure Policy that includes a model diversity requirement, an annual switching cost audit, a documented LLM-agnostic architecture standard, and agentic AI unit economic disclosure requirements? If not, what is the plan and timeline to produce one?”
References & Sources
[1] Princeton CITP (2025). “Lifespan of AI Chips: The $300 Billion Question.” Centre for Information Technology Policy, Princeton University, 15 October 2025. blog.citp.princeton.edu
[2] Princeton CITP (2025). “AI Chip Lifespans: A Note on the Secondary Market.” Centre for Information Technology Policy, Princeton University, 18 December 2025. blog.citp.princeton.edu
[3] Princeton CITP (2025). “Why the GenAI Infrastructure Boom May Break Historical Patterns.” Centre for Information Technology Policy, Princeton University, 25 November 2025. blog.citp.princeton.edu
[4] Bain & Company (2025). AI Infrastructure Capital Requirements Report, September 2025. [Cited in [1]]
[5] Amazon Web Services (2025). Q4 2024 Earnings — server depreciation schedule change to 5 years. February 2025.
[6] Meta Platforms (2025). Q4 2024 Earnings — server depreciation extended to 5.5 years. January 2025.
[7] Luria, G. / D.A. Davidson (2025). GPU market value decline analysis: 85–90% within 3–4 years.
[8] Menlo Ventures (2025). “2025 Mid-Year Enterprise LLM Market Update.” menlovc.com
[9] Anthropic (2026). Series G Funding Announcement, $30 billion at $380 billion post-money valuation, 12 February 2026. anthropic.com
[10] LiteLLM (2025). Open-source LLM proxy documentation. docs.litellm.ai
[11] Portkey AI (2025). AI Gateway documentation. portkey.ai
[12] FASB ASU 2025-06 (30 July 2025). Accounting for Costs of Software — Intangibles — Goodwill and Other: Internal-Use Software. Financial Accounting Standards Board.
[13] The Economist (2025). “The $4trn accounting puzzle at the heart of the AI cloud.” [Cited in [1]]
[14] Gartner (2026). “Global AI Regulations Fuel Billion-Dollar Market for AI Governance Platforms.” 17 February 2026. gartner.com
[15] In re Caremark International Inc. Derivative Litigation, 698 A.2d 959 (Del. Ch. 1996). Foundation for director oversight duty of care standard.
[16] Delaware SB21. Signed into law 27 February 2026. Amendments to Delaware General Corporation Law §144.
[17] NACD (2025). Board Practices and Oversight Survey — AI Governance Data (2025). National Association of Corporate Directors.
[18] WilmerHale (2026). “Board Oversight and Artificial Intelligence: Key Governance Priorities for 2026.” January 2026. wilmerhale.com
[19] Epoch AI (2024). “LLM inference prices have fallen rapidly but unequally.” epoch.ai
[20] Menlo Ventures (2025). “2025 Mid-Year Enterprise LLM Market Update.” menlovc.com
[21] Bain & Company (2025). AI Infrastructure Capital Requirements Report. September 2025. [See also [4]]
[22] Morgan Stanley / BofA Research (2026). Amazon free cash flow projections, 2026. Cited in analyst coverage, Q1 2026.
[23] Barclays (2025). Meta free cash flow projection and AI earnings adjustments, 2025–2026. Analyst research note.
© 2026 AI Board Course. This article is for educational and governance training purposes only. It does not constitute legal, financial, or investment advice. Directors should consult qualified legal counsel for jurisdiction-specific guidance on fiduciary obligations. CFA Institute, CPA Ontario, and Edexcel/Pearson are referenced for credential verification purposes only and do not endorse the content of this article.