Engineering Manager - System and Panic Triage
The Core OS team is seeking an exceptional engineering manager to lead the team responsible for enabling Apple's operating systems to achieve world-class reliability. This team develops and owns mission-critical tools and services that detect, analyze, and classify kernel panics and low-level crashes across all Apple platforms. You will partner with engineering teams across the Software, Hardware, and Silicon groups to deliver rock-solid OS reliability for over 2 billion currently active Apple devices and shape the future of system reliability across Apple's entire product ecosystem.
Description
Lead a team of engineers triaging kernel panics and critical system-level issues across all Apple platforms (macOS, iOS, watchOS, tvOS). Build intelligent automation pipelines that analyze, group, and prioritize failure signatures based on their reliability impact. Mentor engineers to design and develop advanced systems diagnostic and at-scale debug services to realize the vision of zero-iteration debugging and fully automated triage and root cause analysis. Develop telemetry-based dashboards to monitor at-scale panic/crash triage and analysis services to ensure they are working as expected and efficiently. Collaborate with Core OS, Hardware, Silicon, and other engineering teams to champion and advance improvements in debuggability, panic data quality, symbolication, and automation of triage and debug workflows.
Responsibilities
Build and manage a world-class Panic Triage & Tools team, developing senior systems engineers into technical leaders
Define and execute the multi-year technical roadmap for platform triage and reliability, partnering with senior cross-functional leaders to align with Apple's quality standards
Attract, develop, and retain top-tier talent while fostering a culture of technical innovation, collaborative problem-solving, and engineering excellence
Drive engineering quality, scalability, and reliability for debug and triage services handling a large volume of daily events across Apple's ecosystem
Ensure the team's tools and processes directly contribute to the stability and reliability that defines the Apple user experience
Preferred Qualifications
Experience applying ML/AI for automated triage and reliability services is preferred
Experience with large-scale telemetry systems processing millions of events daily is preferred
Minimum Qualifications
Demonstrated track record of building and scaling high-performing engineering teams
Passion for solving challenging technical problems that directly impact millions of users
Strong communication skills with ability to influence technical direction across organizational boundaries
Experience managing complex, multi-platform technical initiatives with measurable reliability improvements
Strong technical depth in operating system internals will be helpful
BS/MS in Computer Science, Computer Engineering, Electrical Engineering, or equivalent experience
Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.