LiveKit is revolutionizing the AI landscape by providing the essential network infrastructure that powers multimodal AI interfaces, enabling seamless audio and visual interactions. Founded in 2021, LiveKit has rapidly grown to support over 3 billion calls annually, 100,000+ developers globally, and industry giants like OpenAI, Character AI, Spotify, and Meta.
YOU'LL THRIVE AT LIVEKIT IF YOU:
obsess over crafting code that is fast, reliable, and practical for the problem
are known as the go-to person for tackling tough technical problems
work hard and can build and ship fast
can clearly explain complex technical concepts to others
are a fast learner, frequently picking up new languages and tools
The best way to impress us is with thoughtful Issues and/or PRs on our GitHub repos 😊
ABOUT THIS ROLE:
We're looking for a Senior/Staff Engineer to work across some of the most technically demanding parts of LiveKit's platform — core services, telephony, and observability. At LiveKit, the infrastructure is the product — you're not building the layer underneath, you are the layer.
You'll work on problems where latency, availability, and operational simplicity are critical, and where the right answer often requires careful tradeoffs and outside-the-box thinking. While distributed systems experience is valuable, we care just as much about strong programming fundamentals, sound judgment, and the ability to learn fast. The team is small. Your decisions ship directly into production.
WHAT YOU'LL DO:
Design and evolve the core control, data, and observability systems that power LiveKit Cloud
Implement resilient, region-spanning architectures that degrade gracefully under partial failure
Build libraries, protocols, and tooling that raise reliability and developer velocity across the org
Diagnose and harden critical paths using metrics, tracing, testing, and real-world traffic insights
Shape new platform capabilities across identity, scheduling, observability, and distributed state management
Technologies include: Go, psrpc, gRPC, Raft, NATS, Kubernetes, Prometheus, OpenTelemetry, ClickHouse
WHO YOU ARE:
You have experience designing and delivering distributed systems in production
You take ownership end to end: prototype, test, ship, monitor, and iterate
You're comfortable with consensus, coordination, and the realities of distributed failure modes
You think in terms of data flow, state, performance, and correctness, and you can reduce complex systems into understandable components
You value clear communication, practical engineering, and building systems that others enjoy working with
NICE TO HAVE:
Go fluency — if you haven't written Go yet, you've been meaning to
Hands-on experience with pub/sub, RPC, or coordination systems (NATS, etcd, Raft, Paxos)
Exposure to real-time or low-latency infrastructure — you know what microseconds feel like
You've shipped observability tooling you'd actually want to use (tracing, metrics, at-scale logging)
In those school group projects, you did most of the work (:sigh:)
WHAT WE OFFER:
The opportunity to shape the brand of a fast-growing developer platform
Collaboration with a small, senior team that deeply values craft and creativity
Competitive salary and equity package
Health, dental, and vision benefits
Flexible vacation policy