Senior Software Engineer, Machine Learning Platform Technologies - Traffic Infrastructure
Are you an expert in large-scale networking and traffic infrastructure with a passion for building next-generation platforms for machine learning systems? We're seeking a hands-on technical leader with deep expertise in Envoy, Istio service mesh, L4/L7 load balancing, and modern internet protocols (HTTP/2, gRPC, HTTP/3) to design and scale traffic platforms that power Apple's Search and ML ecosystems.
If you've contributed to CNCF or networking projects such as Envoy, Istio, Kubernetes networking, or related data-plane technologies, and you're excited about building capacity-aware, metrics-driven traffic systems for ML inference and training, this role offers the opportunity to architect at Apple scale, delivering highly performant, resilient, and intelligent traffic infrastructure supporting billions of requests.
Description
The MLPT Traffic Infrastructure Team within Apple's Services organization builds the foundational networking and traffic management platforms that power Search and large-scale ML workloads. Our focus is on designing modern L4/L7 traffic systems that intelligently route, balance, and optimize requests across heterogeneous compute environments, including GPU-backed inference services and multi-cloud deployments.
We are reimagining traffic infrastructure as a programmable, metrics-driven, and capacity-aware platform, leveraging Envoy-based data planes, Istio service mesh, and dynamic control planes to support low-latency, high-throughput ML workloads. You'll work closely with ML engineers, SREs, and platform teams to enable secure, observable, and adaptive request routing for both server-to-server and client-to-server use cases.
Responsibilities
Architect and build L4/L7 traffic platforms for ML training and inference using Envoy, Istio, and modern load-balancing techniques.
Design and implement dynamic, capacity-aware, and metrics-driven load balancing strategies for HTTP, gRPC, and streaming ML inference workloads.
Develop and optimize service mesh architectures for high-throughput, low-latency ML systems, including multi-cluster and multi-region topologies.
Lead the evolution of client-to-server and server-to-server traffic patterns, including adoption of HTTP/3 where appropriate.
Collaborate with ML and platform teams to support scalable inference, A/B traffic shifting, canarying, and adaptive routing strategies.
Contribute to and upstream improvements in Envoy, Istio, Kubernetes networking, or related CNCF projects, representing Apple in the open-source community.
Implement observability, telemetry, and debugging frameworks for traffic flows (latency, tail behavior, retries, backpressure, saturation).
Ensure traffic platforms are secure, resilient, and cost-efficient, supporting hybrid and multi-cloud environments at global scale.
Mentor engineers and drive architectural decisions across networking and traffic-infra domains.
Preferred Qualifications
9+ years in networking, traffic infrastructure, or large-scale distributed systems roles.
Contributions to CNCF or networking open-source projects (Envoy, Istio, Kubernetes networking, eBPF, etc.).
Experience with HTTP/3, QUIC, or next-generation transport protocols.
Strong understanding of capacity-based routing, adaptive load balancing, and feedback-driven traffic systems.
Experience supporting ML inference platforms, GPU-backed services, or latency-sensitive ML workloads.
Familiarity with observability stacks (OpenTelemetry, Prometheus, Grafana) for traffic and networking telemetry.
Experience operating traffic systems across multi-region, multi-cloud, or hybrid environments.
Excellent communication, technical writing, and cross-functional leadership skills.
B.S., M.S., or Ph.D. in Computer Science, Computer Engineering, or equivalent practical experience.
Minimum Qualifications
BS/MS in Computer Science or equivalent practical experience.
5+ years of experience in distributed systems, networking, or traffic infrastructure engineering.
Strong programming experience in Golang and Python, especially for control-plane or data-plane systems.
Deep expertise in L4/L7 networking concepts, including load balancing, connection management, retries, timeouts, and congestion control.
Hands-on experience with Envoy, Istio, or similar service mesh / proxy technologies.
Strong understanding of HTTP/1.1, HTTP/2, gRPC, and modern transport protocols.
Experience designing and operating high-throughput, low-latency systems in production.
Proven ability to lead complex technical initiatives and mentor engineers.
Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.
Perks and Benefits
Health and Wellness
Parental Benefits
Work Flexibility
Office Life and Perks
Vacation and Time Off
Financial and Retirement
Professional Development
Diversity and Inclusion