Overview
This setup creates a custom AI voice agent using:
- Hume EVI - Voice interface (speech-to-text, text-to-speech, custom voice clone)
- Claude (Anthropic API) - Brain/LLM that generates responses
- Railway - Hosts the proxy server that connects Hume to Claude
- Vercel - Hosts the frontend (portfolio website)
The key innovation is using a proxy server between Hume and Claude, which allows:
- Custom system prompts (personality, security rules)
- Tool use (weather, stocks, web search, memory)
- UI control via voice commands
- Context injection (time, weather, user memories)
Architecture Diagram
┌─────────────────────────────────────────────────────────────┐
│ USER'S BROWSER │
│ ┌───────────────────────────────────────────────────────┐ │
│ │ Frontend (Vercel) │ │
│ │ - WebSocket to Hume EVI │ │
│ │ - Sends audio from microphone │ │
│ │ - Receives audio responses │ │
│ │ - Polls /pending-actions for UI commands │ │