Most AI products get their first price before there is real usage data to price against. This is not a mistake. At launch, it’s often the only practical option. Usage volumes are low. Provider invoices are manageable. Differences between accounts are hard to see. So teams price with what they have: competitor benchmarks, provider rate cards, early usage estimates, and a target margin.

The price goes live. Customers sign. Sales decks are updated. Contracts are built around that number.

Then production usage data starts to arrive. One customer runs simple tasks. Another runs long workflows with more model calls, retrieval, tool use, and support overhead. Both pay the same rate.

That’s the point when the first price stops being a launch decision and becomes a margin question: did this customer make us money last month? The contract says one thing while the usage logs say another. The provider invoices arrive in tokens, minutes, characters, API calls, and platform fees. Finance sees the blended margin, but not the margin by account.

This article looks at why the first price is always going to be provisional, why the obvious fix – raising prices – creates its own problems, and what fair repricing requires from the underlying data.

How the first price gets set

AI pricing strategies at launch are usually built on some combination of competitor rates, an internal cost estimate, and a target margin. This is standard practice and it works well enough to get a product into the market.

The cost estimate at this stage is based on published provider rates – inference pricing from the LLM vendor, per-minute or per-character rates from speech and transcription services, carrier charges for telephony. These numbers are real, but they describe the rate card, not the production workload. A rate card does not show how long each workflow will run, how much context the model will carry, how many tools the agent will call, or which accounts will take the expensive path most often.
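The gap between a rate card and a production workload can be sketched in a few lines. This is a minimal illustration, not a costing method: the rates, the token counts, and the names RateCard, Workload, and cost_per_task are all assumptions made up for the example.

```python
from dataclasses import dataclass


# Illustrative rates only; these are NOT real provider prices.
@dataclass(frozen=True)
class RateCard:
    usd_per_1k_input_tokens: float
    usd_per_1k_output_tokens: float
    usd_per_tool_call: float


# What the rate card cannot tell you: the shape of the work itself.
@dataclass(frozen=True)
class Workload:
    model_calls: int
    input_tokens_per_call: int
    output_tokens_per_call: int
    tool_calls: int


def cost_per_task(rates: RateCard, w: Workload) -> float:
    """Provider cost of one task: token charges for every model call plus tool calls."""
    token_cost_per_call = (
        w.input_tokens_per_call / 1000 * rates.usd_per_1k_input_tokens
        + w.output_tokens_per_call / 1000 * rates.usd_per_1k_output_tokens
    )
    return w.model_calls * token_cost_per_call + w.tool_calls * rates.usd_per_tool_call


rates = RateCard(
    usd_per_1k_input_tokens=0.003,
    usd_per_1k_output_tokens=0.015,
    usd_per_tool_call=0.002,
)

# Hypothetical workloads: a single-call query and a multi-step agentic workflow.
simple = Workload(model_calls=1, input_tokens_per_call=1000,
                  output_tokens_per_call=200, tool_calls=0)
agentic = Workload(model_calls=8, input_tokens_per_call=6000,
                   output_tokens_per_call=500, tool_calls=5)

for name, w in [("simple", simple), ("agentic", agentic)]:
    print(f"{name}: ${cost_per_task(rates, w):.3f} per task")
```

Under these assumed numbers the same rate card produces roughly $0.006 per simple task and $0.214 per agentic task. The rates did not change; only the workload did.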

At low volumes, this gap rarely matters. The total vendor bill is small, the customer base is concentrated, and any margin error is absorbed into overhead. The price gets locked into contracts, built into packaging, and carried forward as the baseline for every new deal. By the time real usage data starts arriving in volume, the number is no longer a hypothesis. It is the rate.

Why the first price stops working

Market data confirms that early AI pricing models are being revisited as real usage arrives.

ICONIQ’s 2026 State of AI report found that 37% of companies plan to change their AI pricing model in the next 12 months, with customer demand, competition, and margin concerns named as primary drivers. Bessemer Venture Partners’ 2026 pricing playbook describes the same pressure from the cost side: unlike classic SaaS, every AI query carries a real cost of goods sold (COGS), including compute, inference, and in some cases human-in-the-loop review. Bessemer also flags a “renewal cliff” as the AI pilots signed in 2024 and 2025 hit their first renewal cycles – pricing that was based on projected value now has to justify itself against measured results.

The structural reason early estimates drift is that AI SaaS pricing has to account for variable cost-to-serve in a way traditional SaaS pricing never did. Traditional SaaS usually had low marginal cost – the cost of one more active user rarely changed the account margin by much. AI services carry real variable cost per interaction. Inference, retrieval, tool calls, model routing – each customer session generates provider charges that vary by complexity, duration, and configuration. Gartner estimates that agentic AI workflows can require 5 to 30 times more tokens per task than a standard chatbot interaction. A customer who looked profitable during simple-query usage may look very different once they start running multi-step workflows.

ICONIQ’s 2026 data shows AI product builders expecting average gross margins of around 52%, up from 41% in 2024. The improvement is real, but an average only tells you the direction. It does not tell you which accounts are above it and which are pulling it down.

Why repricing on averages creates a new problem

When blended margin declines, the obvious response is to raise prices. Most companies that reach this point consider exactly that – a pricing adjustment across the board to recover margin.

The problem is that a uniform price increase treats every account as if it contributes equally to the decline. In practice, accounts differ widely in what they cost to serve.

Some accounts run lightweight interactions – short tasks, simple queries, minimal tool use. Their cost-to-serve is low. They’re already profitable at the current rate, sometimes highly so. A price increase applied to these accounts does not fix a margin problem. It penalises the customers who were never causing one.

Other accounts drive complex, multi-step workflows with heavier model calls, retrieval, tool use, and longer context windows. Their cost-to-serve may exceed the current rate. These are the accounts where the margin gap sits – but without per-customer cost attribution, the company cannot identify them.

Raising prices based on averages does not solve the problem. It redistributes it. The lightweight accounts absorb a cost increase they did not create, while the pricing may not cover the cost-to-serve for heavy accounts. And the company still cannot tell which is which because the available margin data is blended, not broken down by account.
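A small worked example makes the redistribution visible. The two accounts, their revenue and cost figures, and the 10% increase are all assumptions chosen for illustration. The helper names are hypothetical.

```python
def margin(revenue: float, cost: float) -> float:
    """Gross margin as a fraction of revenue."""
    return (revenue - cost) / revenue


def blended_margin(accounts: dict) -> float:
    """The single number finance sees: total revenue against total provider cost."""
    total_revenue = sum(a["revenue"] for a in accounts.values())
    total_cost = sum(a["cost"] for a in accounts.values())
    return margin(total_revenue, total_cost)


def apply_uniform_increase(accounts: dict, pct: float) -> dict:
    """Raise every account's revenue by the same percentage; costs are unchanged."""
    return {
        name: {"revenue": a["revenue"] * (1 + pct), "cost": a["cost"]}
        for name, a in accounts.items()
    }


# Hypothetical monthly figures: both accounts pay the same rate today.
accounts = {
    "light": {"revenue": 1000.0, "cost": 300.0},
    "heavy": {"revenue": 1000.0, "cost": 1200.0},
}

after = apply_uniform_increase(accounts, 0.10)

print(f"blended before: {blended_margin(accounts):.1%}")
print(f"blended after:  {blended_margin(after):.1%}")
for name, a in after.items():
    print(f"{name} after: {margin(a['revenue'], a['cost']):.1%}")
```

With these numbers the blended margin improves after the increase, yet the heavy account is still served at a loss and the light account, already profitable, pays more. The blended figure alone would report the increase as a success.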

The question is not whether to reprice. For many AI companies, that question is already settled. The question is which accounts to reprice, by how much, and based on what data.

Why usage-based pricing alone is not the answer

A common instinct at this point is to move to usage-based or consumption-based pricing – charge customers based on how much they use, and the margin problem solves itself.

It helps, but it does not close the gap on its own.

Usage-based pricing charges heavy users more. But more usage does not necessarily mean more provider cost. A customer running 10,000 lightweight classification tasks may generate less provider expense than one running 500 complex agentic workflows – but a per-query billing model would charge the first customer twenty times more. Thus, the billing unit does not map to the cost unit.
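The mismatch between billing unit and cost unit can be shown with the two customers from the paragraph above. The query counts come from the text. The per-query price and the per-query provider costs are assumptions picked only to make the arithmetic concrete.

```python
from decimal import Decimal

# Assumed flat per-query price (illustrative, not a recommendation).
PRICE_PER_QUERY = Decimal("0.05")

# Query counts are from the article; provider cost per query is assumed.
customers = {
    "classification_heavy": {
        "queries": 10_000,
        "provider_cost_per_query": Decimal("0.002"),
    },
    "agentic": {
        "queries": 500,
        "provider_cost_per_query": Decimal("0.25"),
    },
}


def summarize(customer: dict) -> dict:
    """Compare what per-query billing charges with what the provider charges."""
    billed = customer["queries"] * PRICE_PER_QUERY
    provider_cost = customer["queries"] * customer["provider_cost_per_query"]
    return {
        "billed": billed,
        "provider_cost": provider_cost,
        "margin": billed - provider_cost,
    }


for name, customer in customers.items():
    s = summarize(customer)
    print(f"{name}: billed ${s['billed']}, cost ${s['provider_cost']}, "
          f"margin ${s['margin']}")
```

Under these assumptions the classification customer is billed twenty times more than the agentic customer while generating a fraction of the provider cost, and the agentic account runs at a loss. Metering by query measures activity, not cost-to-serve.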

And more usage does not necessarily mean more value to the customer. A customer whose agent calls a tool API repeatedly because of a retry loop is generating usage – and provider charges – without receiving additional value. Billing them for that usage is technically accurate but commercially unfair.

The question that matters for fair pricing is not “how much did this customer use?” but “what did it cost to serve this customer, and does the revenue from that account reflect the resources behind it?”

That is a different question from “how many tokens did they consume.” It requires connecting usage patterns to provider cost at the account level – not metering usage in isolation.

For voice AI operators, this problem is especially visible at the per-minute level. We covered the full cost structure in our voice AI cost-per-minute breakdown, our first blog post.

Fair pricing – pricing that reflects the real cost of serving each customer – does not have a single definitive answer. It varies by product, market, and customer mix. But what is clear is that you cannot get there without knowing two things: what each customer’s usage looks like, and what that usage costs you to deliver.

What fair pricing requires from the data

The companies handling this well do not start by choosing a new AI pricing model. They start with per-customer cost attribution – measuring what each customer uses, what each provider charges for that usage, and what the customer was billed.

AI unit economics first. Pricing second.

In practice, this means connecting the records that already exist across vendor dashboards and API logs. A customer interaction generates many usage events, including model calls, tool calls, minutes, characters, and API requests. Each event triggers a provider charge. Those charges are attributed to a customer account and compared against the invoice line on the revenue side. The result is a per-account margin – measured from the records the business already produces.
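The attribution chain described above (usage event, provider charge, customer account, invoice) can be sketched as a simple join. This is a minimal illustration under stated assumptions: the provider names, units, rates, event records, and invoice totals are all hypothetical, and real provider invoices rarely map to events this cleanly.

```python
from collections import defaultdict
from decimal import Decimal

# Assumed provider rates keyed by (provider, billing unit). Illustrative only.
PROVIDER_RATES = {
    ("llm_vendor", "1k_tokens"): Decimal("0.01"),
    ("speech_vendor", "minute"): Decimal("0.02"),
    ("tool_api", "request"): Decimal("0.001"),
}

# Hypothetical usage events, each already tagged with a customer account.
usage_events = [
    {"customer": "acme", "provider": "llm_vendor", "unit": "1k_tokens", "quantity": Decimal("1200")},
    {"customer": "acme", "provider": "speech_vendor", "unit": "minute", "quantity": Decimal("300")},
    {"customer": "globex", "provider": "llm_vendor", "unit": "1k_tokens", "quantity": Decimal("5000")},
    {"customer": "globex", "provider": "tool_api", "unit": "request", "quantity": Decimal("4000")},
    {"customer": "globex", "provider": "speech_vendor", "unit": "minute", "quantity": Decimal("400")},
]

# Hypothetical invoice totals: both customers pay the same amount.
invoices = {"acme": Decimal("50.00"), "globex": Decimal("50.00")}


def attribute_costs(events, rates):
    """Rate each usage event and sum the provider charges per customer."""
    costs = defaultdict(Decimal)  # Decimal() is 0
    for event in events:
        rate = rates[(event["provider"], event["unit"])]
        costs[event["customer"]] += event["quantity"] * rate
    return dict(costs)


def per_account_margin(events, rates, invoices):
    """Compare attributed provider cost with invoiced revenue for each account."""
    costs = attribute_costs(events, rates)
    report = {}
    for customer, revenue in invoices.items():
        cost = costs.get(customer, Decimal("0"))
        report[customer] = {"revenue": revenue, "cost": cost, "margin": revenue - cost}
    return report


report = per_account_margin(usage_events, PROVIDER_RATES, invoices)
for customer, row in report.items():
    print(customer, row)
```

In this made-up data the blended margin is 20% (100 in revenue against 80 in provider cost), which hides the fact that one account earns 32 and the other loses 12. The per-account view only appears once events, rates, and invoices are joined on the customer.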

With that data, repricing becomes targeted rather than uniform. The company can see which accounts need a rate adjustment because their cost-to-serve exceeds their revenue. It can see which accounts are healthy and should be protected from unnecessary increases. It can model the margin impact of a pricing change before committing to it – per segment, per workflow type, per customer.
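Modelling a targeted change before committing to it might look like the sketch below. It assumes per-account revenue and cost are already known, and it uses one simple rule chosen for illustration: raise revenue only on accounts that fall below a target margin, and leave healthy accounts untouched. The account figures, the 20% target, and the function names are hypothetical.

```python
from decimal import Decimal

# Hypothetical per-account monthly figures (revenue billed, provider cost attributed).
accounts = {
    "acme": {"revenue": Decimal("50"), "cost": Decimal("18")},
    "globex": {"revenue": Decimal("50"), "cost": Decimal("62")},
}


def targeted_reprice(accounts, target_margin):
    """Propose revenue per account: the minimum needed to reach the target margin,
    never lower than what the account already pays."""
    proposal = {}
    for name, a in accounts.items():
        # Revenue R satisfies (R - cost) / R = target  =>  R = cost / (1 - target)
        floor_revenue = a["cost"] / (Decimal("1") - target_margin)
        proposal[name] = max(a["revenue"], floor_revenue)
    return proposal


def blended_margin(revenues, accounts):
    """Blended margin for a given set of per-account revenues."""
    total_revenue = sum(revenues.values())
    total_cost = sum(a["cost"] for a in accounts.values())
    return (total_revenue - total_cost) / total_revenue


current = {name: a["revenue"] for name, a in accounts.items()}
proposal = targeted_reprice(accounts, target_margin=Decimal("0.20"))

for name in accounts:
    print(f"{name}: {current[name]} -> {proposal[name]}")
print(f"blended margin: {blended_margin(current, accounts):.1%} "
      f"-> {blended_margin(proposal, accounts):.1%}")
```

Here only the account below the target is adjusted, and the healthy account keeps its current rate. The same comparison could be grouped by segment or workflow type instead of by customer. Whether a given customer would accept the proposed rate is a commercial question the arithmetic does not answer.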

In telecommunications, interconnect billing solved a similar problem: one customer interaction involving multiple suppliers, different rates, different settlement records, and a customer charge on the other side. The companies that scaled profitably connected vendor cost to customer revenue at the transaction level.

The billing patterns already exist. AI companies can apply them to usage events, provider charges, and customer invoices. Twenty-five years of high-volume billing engineering built the systems for multi-vendor cost attribution, real-time rating, and per-account margin tracking. AI companies reaching the point where their first price needs a second pass have access to practices that were not available when they set the original number.

Where this leads

The 37% of companies that ICONIQ’s 2026 State of AI report found planning to change their pricing model are not discovering a better model. They’re discovering what their customers cost – information they needed before the first price was set, not after.

If the second price is still based on blended margin, it is another estimate. It may be better informed, but it will not show which customers are profitable, which workflows are expensive, or where the gap between fair pricing and current pricing sits.

The fair pricing question does not have a universal answer, and it cannot be answered at all without knowing each customer’s usage and what that usage costs to deliver.

The data to answer it already exists inside your business. The question is whether your company connects it before or after the margin problem reaches contracts, renewals, and customer relationships.
