What are tokens?

Tokens are the unit of measurement for AI model usage. Every conversation your agent has consumes tokens:
  • Input tokens — generated from the caller’s speech (converted to text), plus your agent’s instructions, knowledge base context, and tool responses
  • Output tokens — the agent’s spoken responses
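Exact counts depend on the model's tokenizer, but for English text a common rule of thumb is roughly four characters per token. A minimal, illustrative estimator (a heuristic, not this product's tokenizer):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate for English text (~4 characters per token).

    Real tokenizers vary by model and language; this is only a ballpark.
    """
    return max(1, round(len(text) / 4))

# A 400-character agent instruction is roughly 100 tokens.
print(estimate_tokens("x" * 400))  # → 100
```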

How tokens are counted

A typical call’s token usage depends on:
  • Call duration — longer calls = more tokens
  • Instruction length — longer instructions = more input tokens per turn
  • Knowledge base context — more retrieved chunks = more input tokens
  • Tool usage — tool responses add input tokens
  • Response verbosity — longer agent responses = more output tokens
Keep your instructions concise and focused. Verbose instructions consume extra input tokens on every single turn of every call.
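To see why instruction length matters so much, note that instructions and retrieved context are sent with every turn, so their cost scales with call length. A simplified sketch (the variable names and the model itself are illustrative assumptions, not the product's exact accounting):

```python
def estimate_call_input_tokens(
    turns: int,
    instruction_tokens: int,      # re-sent on every turn
    kb_tokens_per_turn: int,      # retrieved knowledge base chunks
    speech_tokens_per_turn: int,  # transcribed caller speech
    tool_tokens_total: int,       # tool responses across the call
) -> int:
    """Illustrative model: per-turn costs multiply with call length."""
    per_turn = instruction_tokens + kb_tokens_per_turn + speech_tokens_per_turn
    return turns * per_turn + tool_tokens_total

# Same 20-turn call with 500-token vs 150-token instructions:
print(estimate_call_input_tokens(20, 500, 300, 40, 200))  # 17000
print(estimate_call_input_tokens(20, 150, 300, 40, 200))  # 10000
```

Trimming 350 tokens of instructions saves 7,000 input tokens on this one call alone, which is why concise instructions pay off at scale.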

Viewing usage

Check your token usage in Settings → Tokens:
  • Current usage — tokens consumed this billing period
  • Allowance — your plan’s monthly token limit
  • Usage percentage — visual indicator of how much you’ve used
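The usage percentage shown in settings is simply current usage over the allowance; a trivial sketch of that arithmetic:

```python
def usage_percentage(used: int, allowance: int) -> float:
    """Percentage of the monthly allowance consumed, capped at 100%."""
    if allowance <= 0:
        return 100.0
    return min(100.0, used / allowance * 100)

# 750k tokens used against a 1M-token allowance:
print(usage_percentage(750_000, 1_000_000))  # 75.0
```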

Token top-ups

If you exceed your monthly token allowance, you can purchase top-ups:
1. Go to token settings — navigate to Settings → Tokens.
2. Select a top-up amount — choose from the available token packages.
3. Complete payment — pay via the secure payment page. Tokens are added immediately.
Top-up tokens don’t expire at the end of the month. They’re consumed after your monthly allowance is used up.
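The consumption order described above — monthly allowance first, then non-expiring top-up tokens — could be sketched as (a hypothetical helper, not the product's billing code):

```python
def deduct(tokens: int, allowance_left: int, topup_left: int) -> tuple[int, int]:
    """Deduct usage from the monthly allowance first, then from top-ups.

    The allowance resets each billing period; top-up tokens carry over.
    Returns (allowance_left, topup_left) after the deduction.
    """
    from_allowance = min(tokens, allowance_left)
    from_topup = tokens - from_allowance
    return allowance_left - from_allowance, max(0, topup_left - from_topup)

# 1,200 tokens used with 1,000 allowance remaining and a 500-token top-up:
print(deduct(1200, 1000, 500))  # (0, 300)
```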

Reducing token consumption

  • Trim your instructions — remove redundant or overly detailed instructions. The AI model is good at inferring behavior from concise guidelines.
  • Prune your knowledge base — remove outdated or irrelevant documents. Fewer, higher-quality chunks mean less context per query.
  • Use a smaller model — if your use case is straightforward, consider switching to gpt-4o-mini-realtime, which uses fewer tokens per response.
  • Cap response length — set the max output tokens in your agent’s model settings to limit output tokens per response.