AIML - Machine Learning Engineer, Machine Learning Platform Technologies
Imagine what you could do here. At Apple, great ideas have a way of becoming great products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no telling what you could accomplish.
Do you want to make Siri and Apple products more intelligent for our users? The Information Intelligence Infrastructure team is building groundbreaking technology for search, natural language processing, artificial intelligence, and machine learning. Our infrastructure is the backbone of Apple Intelligence. It powers the largest Apple foundation models on servers and a wide range of services at Apple, including Apple Search, Apple Music, AppleTV, AppStore, iMessages, Photos & Camera, Spotlight, Safari, Siri, and upcoming Apple products, serving millions of queries every day with incredibly low latencies while drawing every ounce of compute from our hardware.
As part of this group, you will work in one of the most exciting high-performance computing environments, with petabytes of data and millions of queries per second, and have an opportunity to imagine and build products that delight our customers every single day. You will have a chance to optimize multi-billion-parameter language, vision, and speech models using state-of-the-art technologies and make them run at Apple scale.
Description
We design, build, and operate the core infrastructure that powers some of Apple's most visible intelligence and search experiences. Our systems serve billions of requests every day, operate on trillions of records and petabytes of data, and form the backbone of Apple's search and foundation model platforms.
As part of this team, you will own services end-to-end, from initial design and architecture through implementation, deployment, and long-term operation. You'll collaborate closely with Foundation Model research teams to optimize for cutting-edge model quality improvements, and partner with product and search teams to ship production-grade, low-latency, high-reliability features used by hundreds of millions of users in real time.
If you enjoy solving hard distributed systems problems, care deeply about reliability and performance at scale, and want to shape how intelligence is delivered across Apple's ecosystem, this role is for you.
Responsibilities
Design, build, and operate large-scale distributed services that power Apple's search and foundation model platforms.
Own systems end-to-end: from architecture and design reviews to implementation, testing, deployment, monitoring, and continuous improvement.
Work with Foundation Model researchers to optimize for data quality, scale, and cost.
Build and evolve petabyte-scale data and feature pipelines, and serve models in real time.
Define and uphold SLIs/SLOs for critical services, driving reliability, performance, and capacity planning for billions of daily requests.
Collaborate with cross-functional partners (research, engineering, program management, privacy, security) to design solutions that meet user, business, and regulatory requirements.
Improve the overall engineering quality bar through technical reviews, mentoring, documentation, and best practices in testing and observability.
Preferred Qualifications
Proven experience building or operating high-throughput, low-latency services at web scale.
Familiarity with deep learning model architectures such as Transformers and Encoder/Decoder models, and how they are served in production.
Hands-on experience with one or more inference and serving frameworks (e.g., NVIDIA TensorRT-LLM, vLLM, NVIDIA Triton Inference Server) or similar systems.
Experience with data-intensive systems (e.g., large-scale data pipelines, feature generation, Spark, Airflow, Dataflow) and storage technologies at petabyte scale.
Background in performance optimization (profiling, capacity planning, cost optimization) for ML systems and/or large-scale backend systems.
Prior experience working on search, recommendations, or foundation model platforms is a plus.
Minimum Qualifications
Strong computer science fundamentals: algorithms, data structures and distributed systems.
12+ years of experience designing, building, and operating large-scale distributed systems in production.
Proficiency in at least one modern programming language, such as Python, Go, or Scala, with a focus on writing robust, maintainable code.
Experience with production services: performance tuning, telemetry, on-call, incident management.
Familiarity with at least one popular ML framework, such as PyTorch, TensorFlow, or JAX, and an understanding of how ML/AI systems are deployed in production.
Familiarity with Kubernetes, Google Cloud, and Dataflow.
Excellent collaboration and communication skills; able to work independently and cross-functionally with research, product, and operations teams.
BS/MS in Computer Science or equivalent industry experience.
Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.
Perks and Benefits
Health and Wellness
Parental Benefits
Work Flexibility
Office Life and Perks
Vacation and Time Off
Financial and Retirement
Professional Development
Diversity and Inclusion