Available models
AssistPulse uses OpenAI’s Realtime API for voice conversations. The model determines the quality and capability of your agent’s responses.

| Model | Best for |
|---|---|
| gpt-4o-realtime | Most conversations — excellent balance of speed and intelligence |
| gpt-4o-mini-realtime | High-volume, simpler use cases — faster and cheaper |
Model availability may change as new models are released. The default model is recommended for most use cases.
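If your integration configures the session programmatically, the model is typically a single field in the session payload. A minimal sketch in Python, assuming a payload shaped like the Realtime API's `session.update` event (the helper name and field layout are illustrative, not AssistPulse's actual API):

```python
# Illustrative session payload for selecting a Realtime model.
# The exact field names depend on how the underlying OpenAI
# Realtime API is exposed; treat this shape as an assumption.

def make_session_config(high_volume: bool) -> dict:
    """Pick the cheaper model for high-volume, simpler traffic."""
    model = "gpt-4o-mini-realtime" if high_volume else "gpt-4o-realtime"
    return {"type": "session.update", "session": {"model": model}}

config = make_session_config(high_volume=False)
print(config["session"]["model"])
```

Keeping the choice behind one helper makes it easy to switch models later without touching call sites.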
Model settings
Temperature
Controls how creative or deterministic the agent’s responses are:
- 0.0 — highly deterministic, always picks the most likely response
- 0.5 — balanced (default)
- 1.0 — more creative and varied responses
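Since an out-of-range value would be rejected at request time, it can help to validate the setting before sending it. A small sketch, assuming the 0.0–1.0 range documented above (the function name is hypothetical):

```python
def validate_temperature(value: float) -> float:
    """Reject values outside the documented 0.0-1.0 range.
    The range is taken from the settings list above; the guard
    itself is an illustrative helper, not part of AssistPulse."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"temperature must be between 0.0 and 1.0, got {value}")
    return value

validate_temperature(0.5)  # the balanced default passes
```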
Max output tokens
Limits the length of each response. Options:
- Default — no explicit limit (model decides)
- Custom value — set a specific token limit
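One way to express the two options in code is to treat `None` as the default and any positive integer as a custom cap. A sketch (the None-means-default convention is an assumption for illustration):

```python
def max_output_tokens(custom=None):
    """Return None for the default (model decides) or a positive
    integer cap. The helper and its convention are illustrative,
    not AssistPulse's actual API surface."""
    if custom is None:
        return None  # Default: no explicit limit
    if not isinstance(custom, int) or custom <= 0:
        raise ValueError(f"token limit must be a positive integer, got {custom!r}")
    return custom
```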
Tool choice
Controls how the agent decides when to use connected tools:
- Auto — the model decides when a tool is relevant (recommended)
- Required — forces the model to use a tool on every turn
- None — disables tool usage entirely
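The three modes form a small closed set, so a guard that rejects anything else catches typos early. A sketch, assuming lowercase string values on the wire (a common API convention, not confirmed by this document):

```python
# Illustrative validation of the three documented tool-choice modes.
# The lowercase wire values are an assumption.
VALID_TOOL_CHOICES = {"auto", "required", "none"}

def tool_choice(choice="auto"):
    """Validate a tool-choice mode, defaulting to the recommended 'auto'."""
    normalized = choice.lower()
    if normalized not in VALID_TOOL_CHOICES:
        raise ValueError(f"tool choice must be one of {sorted(VALID_TOOL_CHOICES)}")
    return normalized

print(tool_choice())  # prints "auto"
```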
Token consumption
Every conversation consumes tokens:
- Input tokens — the caller’s speech converted to text, plus instructions and knowledge context
- Output tokens — the agent’s spoken responses
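Because both directions are billed, a back-of-envelope cost estimate needs both counts. A hedged sketch that takes per-1K-token rates as parameters (the rates in the example are placeholders, not real pricing; check your provider's current rates):

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_1k, output_price_per_1k):
    """Rough per-conversation cost: input and output tokens are
    billed separately. Prices are caller-supplied; the values used
    below are placeholders, not actual OpenAI pricing."""
    return (input_tokens / 1000) * input_price_per_1k \
         + (output_tokens / 1000) * output_price_per_1k

# e.g. 2,000 input + 500 output tokens at hypothetical rates
cost = estimate_cost(2000, 500, 0.01, 0.04)
print(f"${cost:.2f}")
```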
