Atlassian

Principal Site Reliability Engineer - Advisory

Job Description

As a Principal SRE at Atlassian, you will join an engineering-led company and the award-winning leader in software development and collaboration tools. With your deep understanding of modern engineering practices, your programming expertise and your operational experience, you will join a tactical taskforce that will engage with many teams across Atlassian to help reliably scale our Cloud products and platform. This is an amazing opportunity for you to impact a broad range of Atlassian teams and services from both a technical perspective, by assessing and recommending reliability-related technical changes, and a non-technical perspective, by enabling and empowering teams to adopt reliability best practices. If you crave variety, love building relationships with people and have a burning desire to make a difference, then this is the role for you.

More about you

On your first day, we'll expect you to have:

  • Strong organizational and communication skills, with experience developing and instilling a culture of operational maturity.
  • Ability to analytically select the best of a range of solutions, factoring in input from colleagues, documenting decisions along the way.
  • Experience driving large, complex, cross-organisational initiatives. 
  • Strong stakeholder management and communication skills.
  • An ability and desire to mentor and coach engineers.
  • Expertise with software development in languages like Java, Go and Python.
  • Hands-on experience with public cloud offerings (AWS components like EC2, CloudFormation, IAM, RDS, S3, DynamoDB, Kinesis, or equivalents, e.g. in GCP).
  • Experience managing complex systems in AWS that consume many types of AWS resources.
  • Experience with configuration management tools (Ansible, Chef, Puppet, Salt, etc.).
  • Working knowledge of datastores (RDBMS, time-series databases, NoSQL, search, analytics).
  • Experience operating software in production: building monitoring into your code, tweaking dashboards, defining alerts, writing runbooks, etc.
  • Understanding of high-availability, fault-tolerant, scalable, distributed systems.
  • Experience diagnosing and resolving problems in high-throughput web applications and network services.
  • A "non-hero attitude": rather than celebrating heroic effort pulled off to resolve an incident, prefer engaging in engineering practices that avoid the incidents in the first place.

 

The following are not required, but definite bonuses:

  • Experience with containerisation technologies like Docker, Kubernetes or Mesosphere.
  • Experience with agile software development methodologies and software development best practices, such as unit testing, pair programming, and continuous integration.
  • Experience engaging with and building trust amongst internal customers and/or developer communities.
  • Experience working with remote teams.
  • Experience with incident management processes and ITIL terminology for incident and problem management.
  • Experience participating in 24/7 on-call rosters.
  • Ability and willingness to learn new programming languages, frameworks and paradigms. Polyglots welcome!

 

More about our team

Atlassian Site Reliability Engineering is a rapidly growing group within the organization. We are in the process of building our teams, tools and systems as part of Atlassian's mission to build the best SaaS services in the world. This is a truly exciting team to join: we are already involved, or plan to be involved, with every technical team across Atlassian.

We enable Atlassian to go fast by providing real-time feedback on production systems. We work side by side with the product family and platform developers to maintain and improve services and performance. We live the company values with a strong customer focus and possess a healthy sense of urgency. We are a heavily data-driven team, utilising a variety of data collection, enrichment, analytics and visualisations to learn about our complex systems.

We also live the 'Play, as a team' value by having a strong focus on sharing learning experiences from the front line with the development teams. So, the options for people in the team are vast. If you like mastering a domain and going deep, we need you. If you can juggle three tasks and coordinate multiple people in the heat of an incident, we need you. If you love the benefits of process and methodical improvement, you will love it here. If you want to keep your head down, headphones on and bash out code to support the team, we have a spot for you too.

Additional Information

We believe that the unique contributions of all Atlassians are the driver of our success. To make sure that our products and culture continue to incorporate everyone's perspectives and experience, we never discriminate on the basis of race, religion, national origin, gender identity or expression, sexual orientation, age, or marital, veteran, or disability status.

All your information will be kept confidential according to EEO guidelines.

Job ID: 743999668671836
Employment Type: Other

Perks and Benefits

  • Health and Wellness

    • Health Insurance
    • Dental Insurance
    • Vision Insurance
    • Life Insurance
    • Short-Term Disability
    • Long-Term Disability
    • FSA
    • HSA With Employer Contribution
    • Fitness Subsidies
    • Mental Health Benefits
    • On-Site Gym
    • HSA
  • Parental Benefits

    • Adoption Leave
    • Birth Parent or Maternity Leave
    • Non-Birth Parent or Paternity Leave
    • Fertility Benefits
    • Adoption Assistance Program
    • Family Support Resources
  • Work Flexibility

    • Flexible Work Hours
    • Remote Work Opportunities
    • Hybrid Work Opportunities
    • Work-From-Home Stipend
  • Office Life and Perks

    • Holiday Events
    • Casual Dress
    • Pet-friendly Office
    • Happy Hours
    • Snacks
    • Some Meals Provided
    • On-Site Cafeteria
  • Vacation and Time Off

    • Paid Vacation
    • Unlimited Paid Time Off
    • Paid Holidays
    • Personal/Sick Days
    • Volunteer Time Off
    • Sabbatical
    • Leave of Absence
  • Financial and Retirement

    • 401(K) With Company Matching
    • Company Equity
    • Performance Bonus
    • Relocation Assistance
    • Financial Counseling
  • Professional Development

    • Access to Online Courses
    • Internship Program
    • Leadership Training Program
    • Tuition Reimbursement
    • Learning and Development Stipend
    • Promote From Within
  • Diversity and Inclusion

    • Founder led
    • Employee Resource Groups (ERG)
    • Diversity, Equity, and Inclusion Program

This job is no longer available.
