Senior Site Reliability Engineer

Why We Need You

We are fanatics about uptime and building a great product at PagerDuty. Startups, Fortune 100 companies, and everything in between rely on us to alert them when there are problems, and then help them figure out how to make the problem go away. We are looking for a stellar Senior Site Reliability Engineer to help us build faster.

As a Senior Site Reliability Engineer, not only will you be responsible for keeping our systems up and running, you will also have the ability to shape the product you have come to view as vital to properly run systems. Help us build the perfect incident management system that you dream of.

How You Contribute to Our Vision

  • Building and automating the multi-cloud production infrastructure that PagerDuty runs on
  • Setting up and implementing sane security policies that protect us and our customers
  • Leveraging new persistence technologies that improve availability at the data layer

About You

  • You have written vast amounts of code and have solved multiple problems by automating your way out of them. You have replaced yourself time and time again with your code.
  • You have been responsible for running operations for a service used by multiple customers. You understand the importance and impact that good operations can have on the rest of a product and the positive ripple effects that it can have across an entire engineering organization.
  • You have pulled back the covers and know how this Internet thing works end to end. Networks, servers, protocols, operating systems, services, databases, query optimization, disks: to you nothing can be a 'black box.' Whenever there was a technology you did not understand, you grabbed the book and figured it out.
  • You have NOT connected to a box and updated a set of configs manually. You have used at least one configuration management system, such as Puppet, Chef, Ansible or CFEngine. You have good knowledge of a scripting language (e.g. Ruby, Python, Perl) and, ideally, have contributed code to an open source project.
  • You believe CI servers, push-button deploys, time series data stores, metrics dashboards, and centralized logging are not just "nice to haves"; they are critical pieces of infrastructure that rapidly pay for themselves. You are familiar with the tool space and can suggest products in each of these areas.
  • You are willing to participate in a weekly 24/7 on-call schedule. And yes, we use PagerDuty to manage our on-call schedules.

Our Environment

We do not hire based on experience with a handful of tools. Instead, we want smart, capable, and experienced people who can learn our tools (and suggest new ones!) as needed.

Here's what we use:

  • Linux (Ubuntu)
  • AWS (EC2, S3, RDS)
  • MySQL and XtraDB Cluster
  • Cassandra
  • Chef
  • Nginx
  • Unicorn
  • Postfix
  • Ruby
  • Scala
  • Rails
  • Finagle

Benefits to Get Excited About

  • Competitive salaries and company equity
  • Comprehensive benefits package including medical, dental, and vision plans for you, your spouse and family; 401(k), pre-tax commuter benefits, corporate discounts, cell phone allowance and more!
  • Generous paid vacation (3 weeks of vacation your first year, 4 weeks afterwards) in addition to 12 paid holidays and ample sick leave
  • Monthly company-wide hack days
  • Catered lunch daily, breakfast on Wednesdays, and plenty of snacks and drinks
  • Convenient office location in the SoMa tech hub, accessible by BART, Muni and Caltrain

How We Work

PagerDuty Engineering teams are set up to be mini innovation pods. We practice what we preach, and believe that every engineer can build great products to delight our thousands of customers.

Teams are set up to achieve success autonomously while remaining accountable for results. Every team has full vertical ownership of its own services and is able to release as frequently as it wants to. We practice the mantra of 'Code It. Ship It. Own It.' and believe that teams are most successful when they are able to own every decision in order to run their software. Every team gets to be a part of our growth by building highly resilient and durable software that scales from our startup customers to Fortune 100 companies.

We deploy over 1000 times a month, and every engineer is able to ship high-quality software to production on their own. Teams own their own tests and, yes, we use PagerDuty to manage incidents. Teams own their own way of working and can use the agile practices of their choice to work collaboratively via incremental delivery.

We support engineers in exploring ideas via monthly Hack Days, actively attack our own infrastructure weekly to learn and get better, host an annual internal technical conference called PagerCon, ask our engineers to represent PagerDuty at industry events, and contribute to the open source community.

Each team has a dedicated Engineering Manager, Product Manager, and agile coach to help our people and teams succeed. We believe that management is a separate skill set and have different career paths for our engineers and managers, including a full 'stay technical' career track.

About Us

PagerDuty is the leading digital operations management platform for businesses. Our SaaS-based solution empowers over 8,000 small, mid-size and enterprise global customers such as Comcast, eHarmony, Slack and Lululemon with the insight to intelligently respond to critical disruptions for exceptional customer experience. PagerDuty was founded to deliver a new and innovative approach to increase business response and efficiency. We follow agile methodology to build software that enables full-stack event intelligence, response orchestration, continuous learning and delivery, and facilitates the journey towards improved application, system, and service performance and availability. When brand reputation depends on customer satisfaction, PagerDuty arms businesses with the insight to proactively manage incidents and events that may impact customers across their IT environment. We were recently included in the 2016 Deloitte Technology Fast 500, Inc. 500 and Forbes Cloud 100 lists.

Our dedication to our customers, collaborative spirit, 'risk, fail, learn' attitude, and Get Stuff Done ethos drive innovation through our product and our culture. We are incredibly proud of our people programs that enable our employees to deliver their best every day. From our performance achievement philosophy to our great benefits, our robust culture of recognition to our commitment to inclusion and diversity, solving complex technical problems to delivering an amazing customer experience, PagerDuty is where you can do some of the best work of your career.

Learn more at www.pagerduty.com. Follow our blog and connect with us on Twitter, LinkedIn, YouTube and Facebook.

PagerDuty is committed to creating a diverse environment and is an equal opportunity employer. PagerDuty does not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, parental status, veteran status, or disability status.

PagerDuty uses the E-Verify employment verification program.

Our stewardship of the data of many thousands of customers means that a background check is required to join PagerDuty. We will, nonetheless, consider qualified applicants with arrest and conviction records in accord with applicable law, including the San Francisco Fair Chance Ordinance.


Meet Some of PagerDuty's Employees

Sweta A.

Engineering Manager

Managing three teams of engineers—the Front-End UI Team, the Mobile Team, and the Platform Team—Sweta balances her focus between people management, career growth for her engineers, and advancing business goals.

Mark M.

Strategic Account Manager

Mark works with PagerDuty’s biggest clients around the world—mostly in the U.S. and Canada—and helps them make the most out of PagerDuty’s offerings while also ensuring company-wide standardization.
