Senior Site Reliability Engineer
- San Francisco, CA
Changing the world through digital experiences is what Adobe's all about. We give everyone - from emerging artists to global brands - everything they need to design and deliver exceptional digital experiences. We're passionate about empowering people to create beautiful and powerful images, videos, and apps, and transform how companies interact with customers across every screen.
We're on a mission to hire the very best and are committed to creating exceptional employee experiences where everyone is respected and has access to equal opportunity. We realize that new ideas can come from everywhere in the organization, and we know the next big idea could be yours.
Do you thrive on solving hard problems at scale? Do you want to build cloud-native systems that work seamlessly across devices? Are you interested in defining the future of creativity by making it fast, fun, and frictionless to create sophisticated-looking content? Most of all, are you passionate about the next generation of users and building modern solutions for them that connect to leading social and digital platforms?
We are a team of passionate storytellers, technology innovators, and change agents. Building off Adobe Spark's initial success, we are now re-imagining (from the ground up!) the way people discover, create, and publish the full range of media types - from graphics to imaging to video - right in the browser and on their mobile devices. Our aim is to build fast and easy product experiences that empower students, social influencers, marketers, small businesses - really anyone with something to say - to make something that will stand out and impress their audience.
We are seeking an experienced Site Reliability Engineer (SRE) to join our team of talented developers and quality engineers. You should have deep knowledge of patterns for infrastructure services, highly available systems, and scalable web architectures.
Our Site Reliability Engineers work hand in hand with a growing team of hardworking engineers, building out a new platform critical to Adobe's effort to enable communicators to be more productive. SREs provide operational tooling and engineering support to create reliable, quick, and transparent services and supporting infrastructure.
A capable Site Reliability Engineer should have one main high-level objective: identify and solve complex problems through software. This is not a traditional sysadmin/operations role (i.e. deployments, ticket work, dashboarding, monitoring, incident response). A significant portion of time (~50%) will be some form of programming/development work, preferably to solve self-identified problems.
What you will do:
- Ensure the highest level of uptime and Quality of Service (QoS) to Adobe's customers through operational excellence
- Embed with product teams to foster strong collaboration/partnership
- Identify areas to improve service resiliency through techniques such as chaos engineering and performance/load testing
- Support and maintain globally distributed, multi-cloud (public and/or private) environments
- Automate common, repeatable tasks at large scale to streamline operational procedures
- Design and maintain production monitoring systems
- Troubleshoot performance and stability issues using a wide variety of tools
- Evaluate and manage application and environment security
- Follow organizational change processes during implementations
- Use and maintain version control for application infrastructure
- Cross-train with other global team members
- Participate in an on-call rotation as required
- Determine the root cause of all production-level incidents and write corresponding high-quality RCA reports
- Promote the DevOps/SRE mindset
What you need to succeed:
- 3-5 years of production-level experience with distributed applications at scale in public cloud (AWS and/or Azure)
- Production-level expertise with container orchestration engines (e.g. Kubernetes)
- Strong working knowledge of modern, continuous development techniques and pipelines (Agile, CI/CD, Jenkins, Git, Artifactory)
- Experience working within software development or Internet-related industries, particularly in the context of a SaaS offering
- Curiosity and drive to ask difficult questions, and the ability to lead change in a diverse organizational landscape
- Strong written and oral communication skills with a high degree of comfort speaking with engineering management, developers, and leadership
- Demonstrated ability to adapt to new technologies and learn quickly
- B.S. degree in Computer Science or a related technical field, or equivalent practical experience
- Standout colleague
- Self-starter requiring minimal direction
- Ability to learn quickly and adapt to changing priorities and requirements