Software Engineer - APM Backend

The company:

We're on a mission to build the best platform in the world for engineers to understand and scale their systems, applications, and teams. We build highly scalable systems that process trillions of data points every day to provide real-time alerts, visualizations, log aggregations, and application traces for tens of thousands of companies. Our engineering culture values pragmatism, honesty, and simplicity to solve hard problems the right way.

The team:

The APM Team at Datadog provides mission-critical application health and performance monitoring for customers of all sizes and industries. We’re taking a new approach to APM with distributed tracing and seamless integration with Datadog’s Infrastructure and Logs products.

The opportunity:

As a Distributed Systems Engineer for the APM team, you will help architect, build, and maintain the systems and services that power our APM product. This work can range from evolving our high-throughput, high-availability intake pipeline to keep up with ever-growing volumes of customer data, to building new analytics systems that give us even more visibility into our customers’ services and applications.

You will:

  • Build distributed, high-throughput, real-time data pipelines
  • Do it in Go and Python, with bits of C or other languages
  • Solve a scaling bottleneck in a critical service
  • Use Kafka, Redis, Cassandra, Elasticsearch and other open-source components
  • Own meaningful parts of our service, have an impact, and grow with the company
  • With your team, plan the most important projects to work on next

Requirements:

  • You have significant experience in one or more languages
  • You value code simplicity and performance
  • You can design architecture to solve problems at high scale
  • You have a BS/MS/PhD in a scientific field or equivalent experience
  • You want to work in a fast, high-growth startup environment that respects its engineers and customers

Bonus points:

  • You wrote your own data pipelines once or twice before (and know what you'd like to change)
  • You've built high-scale systems with Cassandra, Redis, Kafka or NumPy
  • You have significant experience with Go, C, or Python
  • You have a strong background in stats
