Data Scientist - CloudOps
- New York, NY
We're on a mission to build the best platform in the world for engineers to understand and scale their systems, applications, and teams. We operate at high scale—trillions of data points per day—providing always-on alerting, metrics visualization, logs, and application tracing for tens of thousands of companies. Our engineering culture values pragmatism, honesty, and simplicity to solve hard problems the right way.
The Cloud Cost Optimization team works directly with the CTO and is responsible for identifying and executing on cost-saving opportunities, providing engineering teams actionable visibility into spend, and empowering finance to understand and incorporate the complexities of cloud costs.
As a Data Scientist on the CloudOps team, you’ll work with and analyze usage and billing data generated from cloud applications. Datadog operates in sophisticated multi-cloud, multi-region, classical, containerized, and serverless environments, and you will be responsible for unifying these disparate sources and disentangling their primary cost drivers. You will share your work directly with the highest levels of leadership at Datadog, and it will have an impact on the direction of the engineering organization.
- Build marginal cost models using cloud usage and cost data, and use these models to discover trends and build forecasts
- Own and refine cloud provider billing data ingestion pipelines
- Develop new ways to understand the relationship between cloud resource consumption in complex containerized environments and business drivers
- Think deeply about what data and actionable views to surface from CxO down to individual engineers deploying new projects
- Develop metrics that bring clarity to complicated environments, including containers, shared resources, and bespoke cost variables
- You’ve worked with leadership to define a problem space, select questions to be answered, and create repeatable, compelling reporting to “solve” stakeholder understanding
- You have experience creating and maintaining data ETL pipelines with Spark, Luigi, Airflow, and other open-source technologies, in programming languages like Python, Scala, and SQL
- You’re comfortable spending the day in notebooks (Zeppelin, Jupyter, Observable, etc.) and have frequently used notebooks to share findings and create insight for yourself and others
- You enjoy getting to the bottom of arcane datasets and creating logic to make those same datasets understandable to the masses
- You’re comfortable with ambiguity in the early stages of a project and appreciate the work needed to determine the how in reaching an objective
- You’re familiar with cloud infrastructure and what engineers may consider when deciding on a serverless solution or a particular cloud server
- You have a BS/MS/PhD in a scientific/quantitative field or equivalent experience
- You have previously worked with cloud provider billing data
- You’re interested in business outcomes and high-impact projects that have a tangible effect on business metrics. You’ve worked on these projects in the past and are eager for more.
- You’ve faced a tradeoff between increasing infrastructure cost and increasing the cost in team time. You’ve been woken up by infrastructure alarms.
- You hold some strong opinions on the features in AWS Cost Explorer, the GCP billing console, or Azure Cost Management + Billing