
Machine Learning Infrastructure Engineer - SIML, ISE

1 week ago Cupertino, CA

Are you passionate about Generative AI? Are you interested in working on groundbreaking generative modeling technologies to enrich billions of people? We are the Intelligence System Experience (ISE) team within Apple's software organization. The team operates at the intersection of multimodal machine learning and system experiences. Our multidisciplinary ML teams focus on a broad spectrum of areas, including Visual Generative Foundation Models, Multimodal Understanding, Visual Understanding of People, Text, Handwriting, and Scenes, Personalization, Knowledge Extraction, Conversation Analysis, Behavioral Modeling for Proactive Suggestions, and Privacy-Preserving Learning. These innovations form the foundation of the seamless, intelligent experiences our users enjoy every day.

We are seeking an ML Infrastructure Engineer to design, optimize, and scale the systems that power large-scale model training across the organization. This role sits at the intersection of high-performance computing, machine learning, and infrastructure engineering, delivering the core capabilities teams rely on to iterate quickly and reliably.

Description

The ideal candidate brings strong software engineering fundamentals, deep familiarity with distributed training, and a passion for building infrastructure that is efficient, observable, and easy for ML practitioners to use. You'll work closely with model developers and platform teams to ensure training workflows are fast, reliable, and cost-effective, while also supporting users operationally to keep them unblocked and productive.

As an ML Training Infrastructure Engineer, you will:

Build and maintain distributed training infrastructure.

Optimize training performance through profiling, parallelization strategies, and hardware-aware tuning.

Develop reliable pipelines for data loading, checkpointing, logging, and monitoring to support high-throughput training jobs.

Collaborate directly with ML engineers to understand scaling bottlenecks and design solutions that improve both training speed and resource efficiency.

Create and maintain tooling that simplifies how users configure, launch, and debug distributed training jobs.

Implement strong observability across training workflows: telemetry, dashboards, alerts, and diagnostics.

Support training workloads, investigate failures, triage performance regressions, and gather real feedback from users.

Preferred Qualifications

Strong communication skills and the ability to collaborate with ML practitioners and infra teams.

Minimum Qualifications

Bachelor's or Master's degree in Computer Science or a related technical field, or equivalent practical experience.

3+ years of experience in software development, with strong Python skills and familiarity with systems programming concepts.

Hands-on experience with ML training frameworks (e.g., PyTorch Distributed, DeepSpeed, JAX, TensorFlow).

Knowledge of distributed systems, parallel computing, and accelerator architectures (GPU/TPU).

Experience debugging performance and reliability issues in complex, large-scale systems.

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.

Client-provided location(s): Cupertino, CA
Job ID: apple-200633545-3337_rxr-658
Employment Type: OTHER
Posted: 2025-11-25T19:23:11

Perks and Benefits

  • Health and Wellness
  • Parental Benefits
  • Work Flexibility
  • Office Life and Perks
  • Vacation and Time Off
  • Financial and Retirement
  • Professional Development
  • Diversity and Inclusion

Company Videos

Hear directly from employees about what it is like to work at Apple.