Principal DevOps Engineer - Data Platform Team

Twilio's Data Platform captures, stores, and processes global event data reliably at scale, and makes this data available to Twilio and the market for a large set of products and services.

About the job:

As Principal DevOps Engineer, you will be a core contributor in the Data Platform team and help us deliver the world-class Data Platform Twilio needs in order to succeed. You will face some of the most complex challenges in distributed data systems at scale.


  • Create a resilient and highly operable production environment for Twilio’s Data Platform in AWS, with 24x7 availability, high performance, scalability, and zero-downtime releases.
  • Manage large MySQL database clusters and NoSQL systems such as DynamoDB and Cassandra.
  • Manage regional deployments and set up disaster recovery for Kafka data pipelines, systems, and stores in AWS.
  • Collaborate with Engineers to create a continuous delivery environment and processes.
  • Instrument and monitor the health and availability of services, with fault detection, alerting, triage and recovery (automated and manual).
  • Work closely with Twilio’s cloud infrastructure, orchestration, and security teams to help implement company-wide security and operability initiatives and to provide tooling requirements.
  • Manage performance (benchmarking and monitoring vital metrics), plan capacity, and resolve performance problems affecting service levels.
  • Write scripts and runbooks to automate procedures.
  • Lead and mentor a DevOps Engineer.
  • Enable auto-scaling.


  • Your background will be that of a hands-on Director or Senior DevOps Engineer who has had considerable experience in a highly complex technical operations environment with cloud-based services.
  • At least 5 years of experience building complex distributed systems, with a focus on reliability, high availability, performance, scalability, capacity planning, backup and recovery, business continuity planning, and automation of everything.
  • Strong Amazon AWS experience in a production environment.
  • Experience with managing and automating configuration of MySQL database clusters.
  • Hands-on experience with cloud infrastructure technologies, including continuous integration tools, configuration management, systems monitoring and alerting tools.
  • Experience with managing systems in distributed regions in the cloud or on-site.
  • Adept at troubleshooting and administering Linux systems, dealing with networking issues, and fine tuning instrumentation and alerting systems.
  • Demonstrated experience with agile processes, continuous integration, test automation, and release management.
  • Significant development experience in at least one modern scripting language, preferably Python.
  • Exceptional communication and troubleshooting skills.
  • Preferably, experience operating a high-load data pipeline and exposure to technologies such as Kafka, Kinesis, Spark, S3, and Redshift.
  • Preferably, experience managing NoSQL systems such as DynamoDB and Cassandra.
  • Experience with securing distributed systems. You understand the purpose of reasonable security techniques and the tradeoff with operational efficiency.

About us:

Twilio makes communications easy and powerful. With Twilio's platform, businesses can make communications relevant and contextual by embedding real-time communication and authentication capabilities directly into their software applications. Twilio gives businesses the ability to innovate, prototype, create, and connect with their customers at the right time and in the right way. Founded in 2008, Twilio is a public company based in San Francisco, California with other offices around the world.

Twilio is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal opportunity regardless of race, color, ancestry, religion, gender, gender identity, parental or pregnancy status, national origin, sexual orientation, age, citizenship, marital status, disability, or Veteran status, and we operate in compliance with the San Francisco Fair Chance Ordinance.
