Engineering

Senior Voice AI / Speech Synthesis Engineer

New York, NY (Hybrid) • Senior • $180k - $230k

Plus meaningful early-stage stock options.

Build and optimize real-time neural TTS and voice cloning pipelines for our healthcare revenue cycle management (RCM) voice agent platform. Own audio quality, latency, and reliability in production.

Responsibilities

→Develop, fine-tune, and deploy neural TTS, voice cloning, and vocoder models.
→Optimize audio pipelines for sub-250ms latency in streaming scenarios.
→Integrate speech components with agent orchestration, ASR, and telephony.
→Define evaluation metrics (MOS, intelligibility) and build automated regression tests.
→Collaborate with the Conversational Design and Backend teams on end-to-end UX.
→Mentor engineers and contribute to the technical roadmap.

Qualifications

→5+ years in speech/audio ML (PyTorch/TensorFlow).
→Hands-on with TTS/vocoders (e.g., HiFi-GAN, WaveRNN, TorToiSe, FastPitch).
→Audio DSP fundamentals; real-time constraints and profiling experience.
→BS/MS in CS/EE or equivalent; PhD a plus.

Nice to Have

→Healthcare or regulated environment experience.
→Experience benchmarking multi-voice libraries and on-device acceleration.

Benefits

→Medical, dental, vision
→401(k)
→Flexible PTO; hybrid NYC office
→Parental leave

Interested? Email us at careers@voiceadmin.ai