Engineering
Senior Voice AI / Speech Synthesis Engineer
New York, NY (Hybrid) • Senior • $180k - $230k
Meaningful early-stage options.
Build and optimize real-time neural TTS and voice cloning pipelines for our healthcare revenue cycle management (RCM) voice agent platform. Own audio quality, latency, and reliability in production.
Responsibilities
- Develop, fine-tune, and deploy neural TTS, voice cloning, and vocoder models.
- Optimize audio pipelines for sub-250ms latency in streaming scenarios.
- Integrate speech components with agent orchestration, ASR, and telephony.
- Define evaluation metrics (MOS, intelligibility) and build automated regression tests.
- Collaborate with Conversational Design and Backend on end-to-end UX.
- Mentor engineers and contribute to the technical roadmap.
Qualifications
- 5+ years in speech/audio ML (PyTorch/TensorFlow).
- Hands-on with TTS/vocoders (e.g., HiFi-GAN, WaveRNN, TorToiSe, FastPitch).
- Audio DSP fundamentals; real-time constraints and profiling experience.
- BS/MS in CS/EE or equivalent; PhD a plus.
Nice to Have
- Healthcare or regulated environment experience.
- Experience benchmarking multi-voice libraries and on-device acceleration.
Benefits
- Medical, dental, vision
- 401(k)
- Flexible PTO; hybrid NYC office
- Parental leave
Interested? Email us at careers@voiceadmin.ai