Senior Engineering Program Manager, Enterprise Technology Services
We are seeking an Engineering Program Manager (EPM) to lead large-scale Site Reliability Engineering (SRE) initiatives that underpin the resilience, scalability, and performance of our cloud-native services. This senior role requires strategic thinking, program leadership, and deep collaboration across engineering, operations, and product to drive reliability outcomes at scale. You will be a key partner to senior engineering leaders, ensuring alignment of priorities, disciplined execution, and operational excellence across the SRE portfolio.
Description
Our organization works with many cross-functional teams across the company. We're looking for an intellectually curious and creative individual who is comfortable operating in ambiguity: a strategic and operational thinker with strong analytical and creative problem-solving skills. The ideal candidate has a passion for process improvement, operational efficiency, and delivering on some of Apple's most important product goals through operational execution. You will work directly with our cross-functional team across Global Operations to execute global projects from inception to launch.
Responsibilities
- Program Leadership & Strategy
  - Define and drive multi-year SRE program roadmaps that enhance service availability, performance, and scalability.
  - Translate strategic reliability objectives into actionable execution plans with clear milestones, KPIs, and accountability.
  - Partner with engineering, product, and operations leaders to prioritize investments, balancing short-term delivery with long-term platform evolution.
- Execution & Delivery
  - Lead cross-functional engineering programs spanning capacity planning, incident reduction, observability, automation, and infrastructure modernization.
  - Establish governance models, reporting cadences, and decision frameworks to improve delivery velocity and predictability.
  - Manage complex dependencies across SRE, platform engineering, security, and product teams.
- Operational Excellence
  - Drive adoption of reliability best practices (SLAs, error budgets, incident retrospectives) across services.
  - Ensure consistent application of security and compliance standards across heterogeneous hardware and software environments.
  - Champion automation and tooling that reduces toil and accelerates recovery.
- Stakeholder Management & Communication
  - Communicate program status, risks, and impact to executives, engineering leaders, and partner orgs with clarity and transparency.
  - Build trust and alignment across globally distributed teams, fostering a culture of accountability and collaboration.
  - Serve as a bridge between business needs and technical execution, ensuring customer impact remains central.
Minimum Qualifications
- 5+ years of technical program or project management for large-scale infrastructure projects.
- Proven track record and technical knowledge in infrastructure delivery (data stores, storage, compute, network), DevOps, SRE, and/or software engineering.
- Ability to build strong customer relationships with operations, data centers, suppliers, vendors, and manufacturing teams, identify opportunities that benefit the customer, and deliver solutions that meet customer expectations.
- Experience with Tableau or similar dashboard applications.
- Expert in project scoping, risk identification, mitigation strategy development, stakeholder management, data-driven decision analysis, and facilitating resolutions, along with application readiness and change adoption.
- Willing and able to travel to international manufacturing partners (up to 2 weeks at a time).
Preferred Qualifications
- Experience in technical program management, service delivery, or engineering leadership, driving reliability, infrastructure, or platform engineering programs.
- Proven track record of leading large, multi-team programs in highly available, large-scale distributed systems or cloud environments.
- Strong understanding of SRE practices, DevOps principles, and modern infrastructure (Kubernetes, containers, cloud platforms like AWS/Azure/GCP).
- Readily learns and adopts new technologies. Knowledge of DevOps, continuous delivery, Splunk, AWS services such as S3, and hosting components such as NetScaler, OS, servers, storage, databases, backup, load balancers, DMZ, WAF, networking, Citrix, VMware, and Linux. Deep understanding of incident management processes and best practices. Ability to drive root cause analysis, identify corrective actions, and follow up to closure.
- Deeply understands the architecture and integration points of application sets that support processes, products, configuration, and policies, taking into consideration the needs of supply chains, data centers, and global contract manufacturing sites.
- Demonstrated success in executive stakeholder management and influencing without direct authority.
- Excellent communication, negotiation, and presentation skills with globally dispersed project teams and leadership.
- Clear, measurable improvements in service availability, latency, and operational efficiency across key platforms.
- Cross-org alignment and execution confidence in reliability programs.
- A repeatable framework for SRE program delivery that scales across services and geographies.

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.