Voice & Realtime Engineer

About the role

The core interaction in Gray is the interrupt loop: talk, watch it work, redirect mid-task. That loop lives or dies on latency. You will own the realtime pipeline: on-device transcription, streaming TTS, barge-in detection, and the WebSocket layer that carries it all.

What you will own

Own end-to-end voice latency, from mic open to first spoken syllable
Run transcription on-device and keep it accurate for operator vocabulary — hostnames, flags, paths
Build barge-in that works: detect intent to interrupt without false triggers
Design the streaming protocol between app, self-hosted box, and model providers
Instrument everything; latency regressions should page you before users notice

What you bring

Deep experience with realtime audio — WebRTC, audio codecs, VAD, or speech pipelines
Strong systems instincts; you think in milliseconds and buffers
Production experience with streaming LLM or speech APIs
Comfort working across mobile, server, and protocol layers

Nice to have

On-device ML experience (whisper.cpp, Core ML)
You've built a voice assistant, even a toy one
Contributions to realtime open source

What we offer

Meaningful equity. Every role carries a real stake in Layer Gray, with a 10-year exercise window.
Remote, worldwide. Work from anywhere. We hire for the role, not the time zone.
Hardware budget. $4,000 for your machines, plus a home server allowance — you should run Gray on your own box.
Flexible time off. 25 days minimum, and we mean minimum. We track outcomes, not hours.
Health covered. Full medical, dental, and vision for you and your dependents, wherever you are.
Two offsites a year. The whole crew in one room. The rest of the time, async.
Model subscriptions. Claude, GPT, and friends — every frontier model subscription, paid.
Learning budget. $2,000 a year for books, courses, and conferences. No approval theatre.

About Gray

Gray is voice-first AI you operate like a terminal. You speak; it runs real work across your machines — SSH sessions, multi-agent jobs, files, scheduled tasks — then speaks back. It is self-hosted, private by architecture, and built for the people who run the internet’s plumbing. Gray is made by Layer Gray, Inc.