Endpoint Configuration
Automatic Configuration with Mastra
When using the Mastra provider, you can automatically configure the voice endpoint through the provider configuration. This removes the need to call setVoiceEndpoint() manually and ensures consistency between your chat and voice endpoints.
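As a rough illustration of what automatic configuration amounts to, the sketch below joins a base URL and a voice route into a single endpoint URL. The helper name resolveVoiceEndpoint and the baseURL field are assumptions made for this example; only voiceRoute is named in this document, so check the exact fields against your Cedar-OS version.

```typescript
// Hypothetical helper: joins a base URL and a voice route into one endpoint URL.
// baseURL is an assumed configuration field; voiceRoute is the route named in the text.
function resolveVoiceEndpoint(baseURL: string, voiceRoute: string): string {
  const base = baseURL.replace(/\/+$/, "");   // drop trailing slashes
  const route = voiceRoute.replace(/^\/+/, ""); // drop leading slashes
  return `${base}/${route}`;
}

const endpoint = resolveVoiceEndpoint("http://localhost:4111/", "/voice");
console.log(endpoint); // http://localhost:4111/voice
```

Normalising the slashes on both sides keeps the result stable whether or not the configured values include them.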
Manual Configuration
For other providers or custom setups, configure the endpoint manually by calling setVoiceEndpoint().
Request Format
The voice system sends a multipart/form-data POST request to your configured endpoint with the following fields:
Audio Data
- Format: WebM container with Opus codec
- Type: Binary blob from MediaRecorder API
- Quality: Optimized for speech recognition
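A backend may want to confirm that an upload matches the format listed above before passing it to a transcription service. The following is a minimal sketch; the function name isWebmOpus is made up for this example, and it assumes the MIME type string of the uploaded part is available to your handler.

```typescript
// Returns true when a MIME type string describes WebM audio with the Opus codec,
// for example the "audio/webm;codecs=opus" value that MediaRecorder commonly reports.
function isWebmOpus(mimeType: string): boolean {
  const [type, ...params] = mimeType
    .toLowerCase()
    .split(";")
    .map((part) => part.trim());
  if (type !== "audio/webm") return false;
  // A bare "audio/webm" carries no codec parameter; accept it and let the decoder decide.
  const codecs = params.find((p) => p.startsWith("codecs="));
  return codecs === undefined || codecs.replace(/["']/g, "").includes("opus");
}
```

Accepting a bare audio/webm is a design choice: some clients omit the codec parameter, and rejecting them would be stricter than the format description requires.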
Voice Settings
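The individual settings fields are not listed here, so the following is only a sketch of how a backend might read them defensively. It assumes the settings arrive as a JSON-encoded form field; the field names language, rate, and volume and their defaults are hypothetical placeholders, not Cedar's schema.

```typescript
// Illustrative shape only: these field names are assumptions, not Cedar's schema.
interface VoiceSettings {
  language: string;
  rate: number;
  volume: number;
}

const DEFAULT_SETTINGS: VoiceSettings = { language: "en-US", rate: 1, volume: 1 };

// Parses the JSON text of a settings field, falling back to defaults for
// missing, malformed, or wrongly typed values.
function parseVoiceSettings(raw: string | undefined): VoiceSettings {
  if (!raw) return { ...DEFAULT_SETTINGS };
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { ...DEFAULT_SETTINGS };
  }
  if (typeof parsed !== "object" || parsed === null) return { ...DEFAULT_SETTINGS };
  const p = parsed as Record<string, unknown>;
  return {
    language: typeof p.language === "string" ? p.language : DEFAULT_SETTINGS.language,
    rate: typeof p.rate === "number" ? p.rate : DEFAULT_SETTINGS.rate,
    volume: typeof p.volume === "number" ? p.volume : DEFAULT_SETTINGS.volume,
  };
}
```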
Additional Context
The context includes Cedar’s additional context data (file contents, state information, etc.) that can be used to provide better responses.
Response Formats
Your backend can respond in several ways:
1. Direct Audio Response
Return audio data directly with the appropriate content type (for example, audio/mpeg for MP3).
2. JSON Response with Audio URL
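In this variant the backend stores the generated audio somewhere reachable and returns a link to it. A minimal sketch, assuming a response with an audioUrl field and an optional text field; both names are assumptions to check against your Cedar-OS version.

```typescript
// Assumed response shape; the field names are illustrative.
interface AudioUrlResponse {
  audioUrl: string;
  text?: string;
}

// Builds the JSON body, rejecting anything that is not an absolute http(s) URL.
function audioUrlResponse(audioUrl: string, text?: string): AudioUrlResponse {
  if (!/^https?:\/\/\S+$/i.test(audioUrl)) {
    throw new Error(`Not an http(s) URL: ${audioUrl}`);
  }
  return text === undefined ? { audioUrl } : { audioUrl, text };
}
```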
3. JSON Response with Base64 Audio
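Here the audio bytes travel inside the JSON body itself. The sketch below assumes fields named audioData and audioFormat, which are illustrative rather than confirmed, and uses Node's Buffer for the encoding.

```typescript
import { Buffer } from "node:buffer";

// Assumed response shape; audioData and audioFormat are illustrative names.
interface Base64AudioResponse {
  audioData: string;   // base64-encoded audio bytes
  audioFormat: string; // MIME type of the decoded audio
  text?: string;
}

function base64AudioResponse(
  audio: Uint8Array,
  audioFormat: string,
  text?: string,
): Base64AudioResponse {
  const audioData = Buffer.from(audio).toString("base64");
  return text === undefined ? { audioData, audioFormat } : { audioData, audioFormat, text };
}
```

Base64 encoding grows the payload by roughly one third, so this format tends to suit short clips better than long responses.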
4. Structured Response with Actions
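A structured response bundles the reply with actions for the frontend to perform. The exact schema is not given here, so the types below are hypothetical: transcription, text, actions, and the action's type and payload fields are placeholders for whatever your frontend expects. The sketch also includes a runtime check, which can be useful when the response is assembled from untyped model output.

```typescript
// Hypothetical action shape; replace with the format your frontend expects.
interface VoiceAction {
  type: string;
  payload?: Record<string, unknown>;
}

interface StructuredVoiceResponse {
  transcription: string;
  text: string;
  actions: VoiceAction[];
  audioUrl?: string;
}

// Runtime check for an untyped value before it is sent to the client.
function isStructuredVoiceResponse(value: unknown): value is StructuredVoiceResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.transcription === "string" &&
    typeof v.text === "string" &&
    Array.isArray(v.actions) &&
    v.actions.every(
      (a) => typeof a === "object" && a !== null && typeof (a as { type?: unknown }).type === "string",
    )
  );
}
```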
Implementation Examples
Node.js with Express
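The following sketch shows the core of a voice handler, kept independent of the web framework so it can be tested on its own. Everything in it is an assumption for illustration: the request shape, the injected transcribe and respond functions, and the JSON body it returns are not taken from Cedar or from any speech vendor.

```typescript
// Parsed form of the multipart request; field names are assumptions.
interface VoiceRequest {
  audio?: Uint8Array;
  context?: string; // JSON text, if the client sent any
}

// Injected services, so the sketch stays independent of any speech or LLM vendor.
interface VoiceDeps {
  transcribe(audio: Uint8Array): Promise<string>;
  respond(transcription: string, context: unknown): Promise<string>;
}

interface HandlerResult {
  status: number;
  body: Record<string, unknown>;
}

async function handleVoiceRequest(req: VoiceRequest, deps: VoiceDeps): Promise<HandlerResult> {
  if (!req.audio || req.audio.length === 0) {
    return { status: 400, body: { error: "Missing audio field" } };
  }
  let context: unknown = {};
  if (req.context) {
    try {
      context = JSON.parse(req.context);
    } catch {
      return { status: 400, body: { error: "Context is not valid JSON" } };
    }
  }
  try {
    const transcription = await deps.transcribe(req.audio);
    const text = await deps.respond(transcription, context);
    return { status: 200, body: { transcription, text } };
  } catch (err) {
    // Failures in the injected services surface as a 500 with the error message.
    const message = err instanceof Error ? err.message : "Unknown error";
    return { status: 500, body: { error: message } };
  }
}
```

In an Express route, a multipart middleware such as multer would supply the audio bytes, and the result maps directly onto res.status(result.status).json(result.body).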
Python with FastAPI
Mastra Agent Integration
When using Cedar-OS with the Mastra provider and voiceRoute configuration, your Mastra backend should handle requests at the specified route. Here’s how to implement the voice handler:
Error Handling
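One way to keep error handling consistent is to describe each failure as a tagged value and map it to an HTTP status in a single place. The sketch below assumes three illustrative failure kinds; the names, messages, and status codes are assumptions for this example, not part of Cedar's API.

```typescript
// Illustrative failure kinds a voice endpoint might encounter.
type VoiceFailure =
  | { kind: "missing_audio" }
  | { kind: "unsupported_format"; mimeType: string }
  | { kind: "transcription_failed"; reason: string };

interface ErrorResponse {
  status: number;
  body: { error: string };
}

// Maps each failure kind to an HTTP status and a client-facing message.
function toErrorResponse(failure: VoiceFailure): ErrorResponse {
  switch (failure.kind) {
    case "missing_audio":
      return { status: 400, body: { error: "No audio was provided" } };
    case "unsupported_format":
      return { status: 415, body: { error: `Unsupported audio type: ${failure.mimeType}` } };
    case "transcription_failed":
      return { status: 502, body: { error: `Transcription failed: ${failure.reason}` } };
  }
}
```

Because the switch covers every member of the union, adding a new failure kind without handling it becomes a compile-time error.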
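Your backend should handle error cases such as missing or malformed audio, unsupported formats, and failed transcription, and return a clear status code for each.
CORS Configuration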
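A common way to allow the frontend domain without opening the endpoint to every site is to echo the request's Origin header only when it appears on an allow-list. A minimal sketch under that assumption; the function name and the choice of allowed methods and headers are illustrative and not taken from Cedar.

```typescript
// Builds CORS response headers for an allow-listed origin. Any other origin
// gets no CORS headers, so the browser blocks the cross-origin response.
function corsHeaders(
  origin: string | undefined,
  allowedOrigins: string[],
): Record<string, string> {
  if (!origin || !allowedOrigins.includes(origin)) return {};
  return {
    "Access-Control-Allow-Origin": origin,
    "Access-Control-Allow-Methods": "POST, OPTIONS",
    "Access-Control-Allow-Headers": "Content-Type, Authorization",
    // Tells caches that the response varies with the requesting origin.
    Vary: "Origin",
  };
}
```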
Ensure your backend allows CORS for the frontend domain.
Performance Considerations
Audio Processing
- Use streaming transcription for real-time responses
- Implement audio compression to reduce bandwidth
- Cache TTS responses for common phrases
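Caching TTS output for common phrases can be sketched as a small least-recently-used cache keyed by voice and normalised phrase. The class below is a hypothetical in-memory example; a multi-instance deployment would more likely use a shared store.

```typescript
// Tiny least-recently-used cache for synthesized audio, keyed by voice and phrase.
class TtsCache {
  private entries = new Map<string, Uint8Array>();

  constructor(private readonly maxEntries: number) {}

  // Normalises case and surrounding whitespace so trivially different phrases share an entry.
  private key(voice: string, phrase: string): string {
    return `${voice}\u0000${phrase.trim().toLowerCase()}`;
  }

  get(voice: string, phrase: string): Uint8Array | undefined {
    const k = this.key(voice, phrase);
    const hit = this.entries.get(k);
    if (hit !== undefined) {
      // Re-insert to mark as most recently used (Map preserves insertion order).
      this.entries.delete(k);
      this.entries.set(k, hit);
    }
    return hit;
  }

  set(voice: string, phrase: string, audio: Uint8Array): void {
    const k = this.key(voice, phrase);
    this.entries.delete(k);
    this.entries.set(k, audio);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}
```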
Response Optimization
- Stream audio responses when possible
- Use CDN for serving generated audio files
- Implement request queuing for high-traffic scenarios
Security Best Practices
- Rate Limiting: Prevent abuse of voice endpoints
- Authentication: Verify user permissions
- Input Validation: Sanitize audio data and settings
- Content Filtering: Screen transcriptions for inappropriate content
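As an illustration of the rate-limiting item, here is a minimal fixed-window limiter kept in memory. It is a sketch under simplifying assumptions: a single server process, a limit of at least 1, and a client identifier (such as a user ID or IP address) that your authentication layer already provides. The class and parameter names are invented for this example.

```typescript
// Fixed-window rate limiter: at most `limit` requests per client per window.
class RateLimiter {
  private windows = new Map<string, { start: number; count: number }>();

  constructor(
    private readonly limit: number,
    private readonly windowMs: number,
  ) {}

  // Returns true if the request is allowed. `now` is injectable for testing.
  allow(clientId: string, now: number = Date.now()): boolean {
    const w = this.windows.get(clientId);
    if (w === undefined || now - w.start >= this.windowMs) {
      // First request from this client, or the previous window has expired.
      this.windows.set(clientId, { start: now, count: 1 });
      return true;
    }
    if (w.count >= this.limit) return false;
    w.count += 1;
    return true;
  }
}
```

Entries are never removed in this sketch, so a long-running service would also need periodic cleanup or a store with expiry.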