Engineering

Senior Voice AI / Speech Synthesis Engineer

New York, NY (Hybrid) | Senior | $180k - $230k
Plus meaningful early-stage stock options.

Build and optimize real-time neural TTS and voice cloning pipelines for our healthcare revenue cycle management (RCM) voice agent platform. Own audio quality, latency, and reliability in production.

Responsibilities

  • Develop, fine-tune, and deploy neural TTS, voice cloning, and vocoder models.
  • Optimize audio pipelines for sub-250ms latency in streaming scenarios.
  • Integrate speech components with agent orchestration, ASR, and telephony.
  • Define evaluation metrics (MOS, intelligibility) and build automated regression tests.
  • Collaborate with Conversational Design and Backend on end-to-end UX.
  • Mentor engineers; contribute to technical roadmap.

Qualifications

  • 5+ years in speech/audio ML (PyTorch/TensorFlow).
  • Hands-on experience with TTS models and vocoders (e.g., HiFi-GAN, WaveRNN, TorToiSe, FastPitch).
  • Audio DSP fundamentals; real-time constraints and profiling experience.
  • BS/MS in CS/EE or equivalent; PhD a plus.

Nice to Have

  • Healthcare or regulated environment experience.
  • Experience benchmarking multi-voice libraries and on-device acceleration.

Benefits

  • Medical, dental, vision
  • 401(k)
  • Flexible PTO; hybrid NYC office
  • Parental leave

Interested? Email us at careers@voiceadmin.ai

Voice Admin