Head of Reliability Process, SRE
At Atlassian, we are serious about reliability. As the Head of the Reliability Process Group, you will lead a global group of Technical Program Managers to mature our incident management and response capabilities in the context of a modern tech stack as part of the Site Reliability Engineering team. You will contribute by enhancing our software tooling, processes, training, and liaison efforts, as well as by running incidents. The ideal candidate has a strong track record of managing global teams and a strong understanding of computer science and/or information system fundamentals, is intensely curious about the technology, and continues to grow in technical expertise over time.
- Build and lead a global team of data engineers through hiring, coaching, mentoring, and hands-on career development.
- Define, own, and deliver process and tooling that support reliability, including incident management and post-incident review.
- Identify and/or analyze challenges relating to mission-critical services and manage the building of automation tools/processes focused on chaos engineering, anomaly detection, auto-remediation, and recurrence prevention.
- Collaborate and coordinate projects with Software Engineering, Cloud/Technical Infrastructure, System Engineering and cross-functional teams on behalf of Site Reliability Engineering across Cloud/Technical Infrastructure.
- Develop a technical program to execute cross-functional tabletops and game days.
- Measure the business-relevant outcomes of the processes you own, including incident rate and time-to-recovery.
- Identify and close capability gaps by driving engineering efforts with cross-functional teams.
- Serve on call for technical support on incidents and Post-Incident Reviews (PIRs).
On your first day, we'll expect you to have:
- Bachelor’s degree in Information Systems, Computer Science (CS) or related field
- 8+ years of industry experience working with technical subjects and emerging technologies in cloud or related technical fields.
- Experience in a Software Engineering and Technical Program/Project Management role.
- 5+ years of experience in Unix/Linux systems programming with C, Java, Python, and/or Shell scripting.
- Experience as a release lead and in troubleshooting CI/CD pipelines and production and networking incidents.
- Experience with specific software such as Jira, Confluence, Smartsheet, and other project tracking tools.
- Ability to construct and use SQL queries.
It's great, but not required, if you have:
- In addition to a bachelor’s degree, 2+ years of experience designing and managing large-scale, complex software architecture to improve availability, scalability, latency, and efficiency, as well as experience with infrastructure and test automation.
- Familiarity with running web services at scale; understanding of Unix systems internals, storage, and networking.
- 2+ years of cloud development or operations (Azure, AWS or GCP).
- A track record of successfully delivering complex software products and in-depth knowledge of project management.
- Prior work experience with large sites or systems that require high availability.
- Excellent interpersonal, presentation, and communication skills.
- Effective problem-solving skills.
- An understanding of Datadog dashboards and Splunk.
More about our team
- Atlassian Site Reliability Engineering is a rapidly growing group within the organisation. We are building our teams, tools, and systems as part of Atlassian's mission to build the best SaaS services in the world. This is a truly exciting team to join - we are currently planning to be involved with every technical team across Atlassian. We work side by side with the product family and platform developers to maintain and improve services and performance. We live the company values with a strong customer focus and possess a healthy sense of urgency. We are a heavily data driven team, utilising a variety of data collection, enrichment, analytics and visualisations to learn about our complex systems.
- Atlassian is growing fast. Our teams, products and services are evolving at an astonishing rate, and so our challenge is to grow at the right speed in the right way. Our vision includes moving to ever more automated systems, using our love of analytics and focus on metrics both to tell us what is happening in the production and delivery pipelines and to drive decisions about where our pain points are and how we fix them. We also live the 'Play, as a team' value by having a strong focus on sharing learning experiences from the front line with the development teams.
More about our benefits
Whether you work in an office or a distributed team, Atlassian is highly collaborative and yes, fun! To support you at work (and play) we offer some fantastic perks: ample time off to relax and recharge, flexible working options, five paid volunteer days a year for your favourite cause, an annual allowance to support your learning & growth, unique ShipIt days, a company paid trip after five years and lots more.
More about Atlassian
Creating software that empowers everyone from small startups to the who’s who of tech is why we’re here. We build tools like Jira, Confluence, Bitbucket, and Trello to help teams across the world become more nimble, creative, and aligned—collaboration is the heart of every product we dream of at Atlassian. From Amsterdam and Austin, to Sydney and San Francisco, we’re looking for people who want to write the future and who believe that we can accomplish so much more together than apart. At Atlassian, we’re committed to an environment where everyone has the autonomy and freedom to thrive, as well as the support of like-minded colleagues who are motivated by a common goal to: Unleash the potential of every team.
We believe that the unique contributions of all Atlassians are the driver of our success. To make sure that our products and culture continue to incorporate everyone's perspectives and experience, we never discriminate on the basis of race, religion, national origin, gender identity or expression, sexual orientation, age, or marital, veteran, or disability status.
All your information will be kept confidential according to EEO guidelines.