Engineering Team Lead - Site Reliability
We're on a mission to build the best platform in the world for engineers to understand and scale their systems, applications, and teams. We operate at high scale—trillions of data points per day—providing always-on alerting, metrics visualization, logs, and application tracing for tens of thousands of companies. Our engineering culture values pragmatism, honesty, and simplicity to solve hard problems the right way.
The Site Reliability teams at Datadog are responsible for keeping our high-volume, low-latency environments performing around the clock. These teams collaborate closely with our product engineers so that Datadog can monitor millions of servers and containers and our customers always have dependable, actionable data at their fingertips. You’ll be responsible for shaping the infrastructure of our data-intensive, real-time services as we continue to grow at petabyte scale.
As an Engineering Team Lead for the SRE team, you will manage a team of engineers, own significant chunks of our architecture, design and build systems at scale, and shape product decisions. You'll work on challenging projects, make an impact, and grow as an engineer and a lead.
You will:
- Solve a scaling bottleneck in a critical service
- Mentor other engineers on your team
- Design a new service and write an architecture RFC
- Deploy a new feature to production, progressively rolling it out with feature flags
- Investigate and fix a production issue from a service your team owns
- Plan the most important projects to work on next
Who you are:
- You have been building applications for 4+ years and know the systems you’ve worked on from top to bottom
- You have significant backend programming experience
- You have managed a team of software engineers
- You have architected, built, and operated distributed systems to solve problems at high scale
- You have a BS/MS/PhD in a scientific field or equivalent experience
- You want to work in a fast-paced, high-growth startup environment that respects its engineers and customers
- You've shipped complex projects with teams of engineers
- You've worked at high scale with systems like Redis, Cassandra, Kafka
- You have significant experience with Go, C, or Python
Is this you? Let's chat!
Equal Opportunity at Datadog:
Datadog is an Affirmative Action and Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements.
For more information on how we maintain the privacy of the information you submit as part of your application, please refer to our Applicant and Candidate Privacy Notice.