Site Reliability Engineer

    • Bellevue, WA

Job Description

Are you passionate about availability and performance? Do you strive for maximum efficiency in everything you do, no matter how trivial it seems to others? Do you like solving hard problems, and then automating them away? If you answered yes to all of these questions, then you may be the person we are looking for.

The Digital Platform Site Reliability Engineering team brings a software engineering perspective to delivering quality operations at scale, driving automation in every aspect of the job. We collaborate with Engineering and Operations teams, as equals, to provide optimal production services to our customers. We believe that Engineering teams should be able to deploy whatever they need, whenever they need, with confidence, via lightweight processes with minimal barriers.

As a member of our team, you will not only delight customers by providing reliable, performant services, but also help shape the growing role of Site Reliability across the company.


  • Ensure that production SLAs are defined, measured, monitored and maintained
  • Measure product quality through automated end-to-end testing, performance testing and code coverage
  • Establish and maintain best practices for CI/CD
  • Provide leadership in reducing and resolving production incidents
  • Look for opportunities to improve all engineering processes
  • Evaluate, build and modify automation for deploying and operating production services

  • 5+ years of professional experience in Site Reliability or similar discipline
  • Familiarity with distributed systems concepts, such as microservices and the Twelve-Factor App
  • Can fix reliability and performance issues at the network, hardware, and software levels
  • Direct responsibility for operating services in IaaS/PaaS/Cloud environments
  • Comfortable programming in modern scripting languages
  • Passionate about measurement, observability and early detection of issues
  • Participation in a global 24x7 on-call rotation
  • Composed urgency in stressful situations
  • Strong interpersonal skills

  • Bachelor's degree in Computer Science or related technical discipline
  • Reputation as the "go-to person" for operating production infrastructure
  • Familiarity with code coverage tools and methodologies
  • Familiarity with performance testing tools and methodologies
  • Technological curiosity, open source contribution, or an interesting personal GitHub page
  • Experience mentoring junior team members

VMware is an Equal Opportunity Employer and Prohibits Discrimination and Harassment of Any Kind: VMware is committed to the principle of equal employment opportunity for all employees and to providing employees with a work environment free of discrimination and harassment. All employment decisions at VMware are based on business needs, job requirements and individual qualifications, without regard to race, color, religion or belief, national, social or ethnic origin, sex (including pregnancy), age, physical, mental or sensory disability, HIV Status, sexual orientation, gender identity and/or expression, marital, civil union or domestic partnership status, past or present military service, family medical history or genetic information, family or parental status, or any other status protected by the laws or regulations in the locations where we operate. VMware will not tolerate discrimination or harassment based on any of these characteristics. VMware encourages applicants of all ages. VMware will provide reasonable accommodation to employees who have protected disabilities consistent with local law.
