Engineering Manager, Platform Services

1 month ago • Austin, TX

At Apple, new insights often become revolutionary products, services, and customer experiences very quickly. Bring passion and dedication to your job, and there's no telling what you could accomplish.

Enterprise Technology Services (ETS) is part of IS&T and delivers global-scale platforms and services that keep Apple's operations secure and running. The team manages identity, device security, and anti-abuse platforms - covering everything from manufacturing and repairs to software updates and activations. ETS also oversees supply chain, manufacturing, and partner integration platforms, protecting data on more than 2.5 billion devices worldwide. And when Apple prepares for a global product launch, ETS owns the systems that ramp factory production - managing serial numbers, network credentials, and verified software.

The Emerging Technologies team specializes in building forward-looking, extremely scalable platforms. The team is passionate about solving challenging problems, exploring new domains, and engineering transformational solutions. The diversity of our team and thinking inspires innovation that runs through everything we do.

In this role, you will lead a team designing and developing world-class operations solutions for platforms that power many of the services that bring Apple's products to our customers. You will work with a passionate team that lives and breathes scale and security. If you're excited about being challenged and making a significant impact in the tech world, we want to hear from you.

Description

As a Manager for Site Reliability Engineering (SRE) and Operations, you will lead a team of Site Reliability Engineers to ensure the reliability, scalability, and performance of production systems. This role combines technical expertise with leadership skills to drive operational excellence and foster a culture of collaboration and continuous improvement. You will work with the team to automate operations, optimize infrastructure, and troubleshoot issues in a fast-paced environment.

This role is designed for driven individuals who:

- Love learning new technologies and thrive in solving complex challenges.

- Are comfortable in a fast-paced, changing environment and able to manage competing priorities.

- Work effectively across teams and influence without authority.

- Are independent, motivated, and excited to take on ambitious projects.

- Excel at collaborating with engineering teams and can stay calm under pressure.

- Have a passion for delivering quality, reliable solutions in a dynamic, high-energy workplace.

Responsibilities:

Lead, mentor, and develop a team of SREs.

Foster a culture of reliability and excellence within the team.

Promote continuous learning and knowledge sharing.

Help the team build and maintain robust, highly available systems.

Automate CI/CD processes.

Ensure the availability and performance of production systems.

Oversee incident response, post-mortem analysis, and root cause investigations.

Implement and maintain service-level objectives (SLOs) and service-level indicators (SLIs).

Work closely with development, quality, product, and other engineering teams to ensure reliability is prioritized in the development lifecycle.

Communicate effectively with stakeholders regarding reliability metrics, incident reports, and team progress.

Develop and execute a strategic roadmap for the SRE team.

Identify areas for improvement and propose solutions that align with business goals.

Optimize resource allocation and usage for operational efficiency.

Identify and assess risks to production systems and work to mitigate them.

Preferred Qualifications

Knowledge of container-based technologies such as Docker, Kubernetes, or EKS.

Knowledge of modern web service architectures and cloud platforms such as AWS and GCP.

Exceptional analytical and troubleshooting skills in complex Unix/Linux system environments and application implementations.

Ability to build tools from scratch.

Ability to work in a collaborative environment.

Minimum Qualifications

BS degree or higher in Computer Science or a related field.

5+ years in a site reliability engineering, DevOps, or related role, with at least 2 years in a lead capacity.

Strong understanding of systems architecture, cloud infrastructure, and monitoring tools.

Proficiency in one or more programming languages, particularly Java.

Proven experience in leading and mentoring engineering teams.

Strong analytical skills and the ability to troubleshoot complex systems.

Knowledge of the fundamentals of networking, databases, system administration, version control, and CI/CD automation.

Machine learning experience is a plus.

Strong problem-solving and communication skills.

Client-provided location(s): Austin, TX

Job ID: apple-200663131-0157

Employment Type: OTHER

Posted: 2026-05-31T19:12:47

Perks and Benefits

Health and Wellness
Parental Benefits
Work Flexibility
Office Life and Perks
Vacation and Time Off
Financial and Retirement
Professional Development
Diversity and Inclusion

