Pricing, Tokens & Usage Model

This is the product-facing reference for the paid, token-based API experience. It is designed to feel familiar to developers who have used commercial AI voice APIs.

Product Positioning

The standalone Voice API is the recommended commercial surface. Frontend developers should think of it as:

  • Metered voice generation API
  • Preview and validation API
  • Streaming preview API for interactive UI

Token-Based Billing Model

The recommended billing abstraction is a voice-token meter with these suggested metering units:

| Unit | Meaning | Good For |
|---|---|---|
| Input text tokens | Tokens derived from request text | Preview pricing and estimate UI |
| Generated audio seconds | Actual output duration | Settlement and final cost |
| Premium feature tokens | Extra charge for higher-cost modes | Future premium tiers |

Expose These Concepts in Product UI

  • API key
  • Project or workspace
  • Current token balance
  • Estimated token cost per request
  • Consumed tokens after request completes
  • Monthly usage summary
  • Rate-limit status

Public Pricing Contract

POST /v1/usage/estimate

Suggested response fields for preflight estimates:

  • estimated_input_tokens
  • estimated_audio_seconds
  • estimated_total_tokens
  • estimated_cost_usd
  • balance_after_request
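
A minimal sketch of how a backend might decode the estimate response into a typed object. The field names come from the list above; the field types, the sample values, and the rule that a negative balance_after_request should block the request are assumptions made for illustration, not part of a confirmed contract.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class UsageEstimate:
    """Preflight estimate. Field names follow the suggested response fields."""

    estimated_input_tokens: int
    estimated_audio_seconds: float
    estimated_total_tokens: int
    estimated_cost_usd: float
    balance_after_request: int

    @property
    def affordable(self) -> bool:
        # Assumption: a negative post-request balance means the request
        # should be blocked before it is sent.
        return self.balance_after_request >= 0


def parse_estimate(payload: Mapping[str, Any]) -> UsageEstimate:
    """Build a UsageEstimate from a decoded JSON body; raises KeyError on a missing field."""
    return UsageEstimate(
        estimated_input_tokens=int(payload["estimated_input_tokens"]),
        estimated_audio_seconds=float(payload["estimated_audio_seconds"]),
        estimated_total_tokens=int(payload["estimated_total_tokens"]),
        estimated_cost_usd=float(payload["estimated_cost_usd"]),
        balance_after_request=int(payload["balance_after_request"]),
    )


# Illustrative payload; every value here is made up for the sketch.
sample_payload = {
    "estimated_input_tokens": 120,
    "estimated_audio_seconds": 5.8,
    "estimated_total_tokens": 340,
    "estimated_cost_usd": 0.01,
    "balance_after_request": 12391,
}
estimate = parse_estimate(sample_payload)
```

Parsing into a frozen object keeps the estimate read-only, so UI code can display it without accidentally treating it as the settled cost.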

Post-Request Usage Summary

Recommended response metadata after generation:

  • usage.input_tokens
  • usage.output_audio_seconds
  • usage.total_tokens
  • usage.billable_tokens
  • usage.remaining_balance
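
The following sketch reads that metadata from a decoded response body and turns it into user-facing copy. It assumes the dotted names map to a nested "usage" object in the JSON body; the helper names extract_usage and completion_message are hypothetical.

```python
from typing import Any, Dict, Mapping

USAGE_FIELDS = (
    "input_tokens",
    "output_audio_seconds",
    "total_tokens",
    "billable_tokens",
    "remaining_balance",
)


def extract_usage(body: Mapping[str, Any]) -> Dict[str, Any]:
    """Return the usage fields, assuming they live under a nested "usage" key."""
    usage = body.get("usage")
    if not isinstance(usage, Mapping):
        raise ValueError("response has no usage object")
    missing = [name for name in USAGE_FIELDS if name not in usage]
    if missing:
        raise ValueError("usage is missing fields: " + ", ".join(missing))
    return {name: usage[name] for name in USAGE_FIELDS}


def completion_message(usage: Mapping[str, Any]) -> str:
    """Format post-request copy with thousands separators."""
    return (
        f"Preview completed. {usage['billable_tokens']:,} billable tokens used. "
        f"{usage['remaining_balance']:,} tokens remaining."
    )


# Illustrative response body; the values are made up for the sketch.
response_body = {
    "usage": {
        "input_tokens": 118,
        "output_audio_seconds": 5.6,
        "total_tokens": 327,
        "billable_tokens": 327,
        "remaining_balance": 12404,
    }
}
message = completion_message(extract_usage(response_body))
```

Failing loudly on a missing field is a deliberate choice here: silently defaulting a billing value to zero would hide contract drift.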

Voice Ops & Engine Lifecycle

Admin tooling should expose voice lifecycle controls alongside pricing. Recommended options:

  • Provision platform voices from reference audio
  • Manage voice catalog (enable, retire, delete)
  • Upgrade or downgrade voice quality tiers
  • Rebuild conditioning embeddings or reference stacks
  • Train or fine-tune the core XTTS engine
  • Promote new engine versions across the voice catalog

The admin portal exposes voice provisioning at /admin/voice-provisioning. Core engine training and tuning is managed through the voice pipeline CLI and deployment workflow.

Response Headers

The cleanest pattern is to expose usage through both headers and JSON body fields. Suggested response headers:

| Header | Description |
|---|---|
| X-Usage-Input-Tokens | Number of input tokens consumed |
| X-Usage-Output-Seconds | Duration of generated audio |
| X-Usage-Total-Tokens | Total tokens consumed |
| X-RateLimit-Limit | Rate limit ceiling for the window |
| X-RateLimit-Remaining | Remaining requests in current window |
| X-RateLimit-Reset | Timestamp when the rate limit resets |

These headers are part of the recommended commercial contract. They are not fully implemented across the current route surface today.
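
Because some routes may not send these headers yet, a client should treat each one as optional. The sketch below reads them case-insensitively and returns None for anything absent or unparseable. The function name and the returned key names are hypothetical, and X-RateLimit-Reset is passed through as a raw string because its timestamp format is not specified here.

```python
from typing import Callable, Dict, Mapping, Optional, Union

Number = Union[int, float]


def _parse(raw: Optional[str], cast: Callable[[str], Number]) -> Optional[Number]:
    """Convert a header value, returning None when it is absent or malformed."""
    if raw is None:
        return None
    try:
        return cast(raw.strip())
    except ValueError:
        return None


def read_usage_headers(headers: Mapping[str, str]) -> Dict[str, object]:
    """Collect the suggested usage and rate-limit headers into one dict."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "input_tokens": _parse(lowered.get("x-usage-input-tokens"), int),
        "output_seconds": _parse(lowered.get("x-usage-output-seconds"), float),
        "total_tokens": _parse(lowered.get("x-usage-total-tokens"), int),
        "rate_limit": _parse(lowered.get("x-ratelimit-limit"), int),
        "rate_remaining": _parse(lowered.get("x-ratelimit-remaining"), int),
        # Format unspecified (epoch seconds or HTTP date), so keep the raw value.
        "rate_reset": lowered.get("x-ratelimit-reset"),
    }


# Illustrative headers from a route that only sends part of the set.
partial = read_usage_headers(
    {"x-usage-input-tokens": "340", "X-Usage-Output-Seconds": "5.8"}
)
```

When a header is missing, falling back to the JSON body fields keeps the UI working on routes that have not adopted the header contract.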

Frontend Billing States

Healthy

  • Active plan
  • Remaining tokens
  • Average request cost

Low Balance

  • Warning banner
  • Estimated remaining preview count
  • Upgrade or top-up CTA

Hard Stop

  • Insufficient balance
  • Blocked generate button
  • Recharge explanation
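
One way to derive these three states from balance data is sketched below. The threshold of 10 remaining previews for the low-balance warning is a hypothetical value, as are the function names; the rule that a balance below the next request's cost triggers the hard stop is an assumption about how "insufficient balance" is decided.

```python
from enum import Enum


class BillingState(Enum):
    HEALTHY = "healthy"
    LOW_BALANCE = "low_balance"
    HARD_STOP = "hard_stop"


# Hypothetical threshold: warn when fewer than this many previews remain.
LOW_BALANCE_PREVIEWS = 10


def remaining_previews(balance: int, average_request_cost: int) -> int:
    """Estimate how many average-sized previews the balance still covers."""
    if average_request_cost <= 0:
        raise ValueError("average_request_cost must be positive")
    return max(balance, 0) // average_request_cost


def billing_state(
    balance: int, next_request_cost: int, average_request_cost: int
) -> BillingState:
    """Pick the UI state from the token balance and request costs."""
    if balance < next_request_cost:
        # Not enough tokens for the pending request: block the generate button.
        return BillingState.HARD_STOP
    if remaining_previews(balance, average_request_cost) < LOW_BALANCE_PREVIEWS:
        return BillingState.LOW_BALANCE
    return BillingState.HEALTHY
```

For example, with an average cost of 330 tokens, a balance of 1,000 covers three more previews and would show the low-balance banner, while a balance of 200 against a 340-token request would block generation.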

Example Product Copy

Preview Estimate

This preview will use about 340 tokens and 5.8 seconds of generated audio.

After Completion

Preview completed. 327 billable tokens used. 12,404 tokens remaining.

Low Balance

Your workspace is running low on voice tokens. Add more credits to keep generating previews.

Pricing Tiers

| Tier | Price | Monthly Allowance | Per Request | Qualities | Intended User |
|---|---|---|---|---|---|
| Free | $0 / month | 1,000 characters / month | 500 characters per request | Standard | Prototyping and low-volume testing |
| Pro | $29 / month | 500,000 characters / month | 5,000 characters per request | Standard, Premium, Ultra | Production integration and commercial use |

Note: Metering is based on input characters, not tokens. Character counts include spaces and punctuation. Monthly allowances reset on the billing cycle date. Unused characters do not roll over.
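
A client-side pre-check against the tier limits could look like the sketch below. The limits are taken from the table above; the function name, the status strings, and the use of Python's len (which counts Unicode code points) as the character count are assumptions, since the exact counting rule is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    monthly_characters: int
    per_request_characters: int


# Limits copied from the pricing table.
TIERS = {
    "free": Tier("Free", 1_000, 500),
    "pro": Tier("Pro", 500_000, 5_000),
}


def check_request(tier_key: str, text: str, used_this_month: int) -> str:
    """Return "ok" or the first limit the request would exceed."""
    tier = TIERS[tier_key]
    # Spaces and punctuation count toward the total, so no stripping here.
    count = len(text)
    if count > tier.per_request_characters:
        return "over_request_limit"
    if used_this_month + count > tier.monthly_characters:
        return "over_monthly_allowance"
    return "ok"
```

A check like this only improves the UI; the backend should still enforce the same limits, because the browser's count is not authoritative.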

Integration Guidance

Best Practice

  • Frontend asks your backend for current balance and estimate
  • Backend owns billing truth and API key
  • Backend proxies request to Voice API
  • Backend records final usage after request completes

Avoid

  • Storing balance truth only in frontend state
  • Letting the browser call the metered API with a permanent secret
  • Charging on estimate without reconciling final usage
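
The reconcile step can be sketched as a reserve-then-settle ledger on the backend: hold the estimated tokens before proxying the request, then charge the final billable amount once the request completes. The Ledger class, its method names, and the in-memory storage are hypothetical; a real backend would persist this state.

```python
from typing import Dict


class InsufficientBalance(Exception):
    """Raised when an estimate exceeds the available balance."""


class Ledger:
    def __init__(self, balance: int) -> None:
        self.balance = balance
        self._holds: Dict[str, int] = {}

    @property
    def available(self) -> int:
        # Tokens not already promised to in-flight requests.
        return self.balance - sum(self._holds.values())

    def reserve(self, request_id: str, estimated_tokens: int) -> None:
        """Hold the estimated cost before calling the Voice API."""
        if estimated_tokens > self.available:
            raise InsufficientBalance(request_id)
        self._holds[request_id] = estimated_tokens

    def settle(self, request_id: str, billable_tokens: int) -> int:
        """Drop the hold and charge the final usage, not the estimate."""
        self._holds.pop(request_id)  # KeyError if the request was never reserved
        self.balance -= billable_tokens
        return self.balance

    def release(self, request_id: str) -> None:
        """Drop the hold without charging, e.g. when the request failed."""
        self._holds.pop(request_id, None)


# Illustrative flow: estimate of 340 tokens, final usage of 327.
ledger = Ledger(balance=12731)
ledger.reserve("req-1", 340)
held_available = ledger.available
final_balance = ledger.settle("req-1", 327)
```

Holding the estimate prevents concurrent requests from overspending the balance, while settling on billable tokens ensures the user is never charged the estimate itself.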