Imagine what you could do here. At Apple, new ideas have a way of becoming products, services, and customer experiences very quickly. Every single day, people do amazing things at Apple! Do you want to impact billions of users by developing an extraordinary product with a prime focus on accuracy, understandability, and performance? Dynamic, inspiring people and innovative technologies are the norm here. The Systems Performance Architecture team is seeking an outstanding Data Center Architect to contribute to the design and optimization of computer architectures specifically for Machine Learning and Artificial Intelligence applications. This position is a multi-disciplinary and multi-functional lead engineering role encompassing all aspects of computer system design with a focus on performance. The candidate will need the skills and experience to create complex system architectures, surprise and delight our customers, and advance our products' performance, size, power, thermal, and cost goals. Our team is collaborative, creative, and passionate about what we do and the value we add in future product designs. Come join us!
Description
As a Technical Specialist, the individual will negotiate and model Data Center infrastructure solution details from a computer architecture and performance perspective. Collaborate and leverage domain expertise to provide guidance and leadership to multi-functional engineering teams, integrating cluster network architectures into the overall system architecture to ensure efficient data flow, impact product definitions, and meet scalability requirements. Contribute to the definition of rack and cluster capabilities, configurations, and scale-out requirements to support the deployment of dense compute and specialty compute workloads and applications, including but not limited to the following:
- Pathfinding novel Data Center cluster and node architecture choices with a broad group of architects and system engineers, networking, technical leads, and HW/SW partners.
- Providing guidance on optimized network designs for large-scale AI/ML clusters, considering factors like bandwidth, latency, and scalability.
- Influencing the selection of networking hardware and software components for the cluster, including switches, adapters, and protocols.
- Analyzing network traffic patterns and implementing strategies to improve data transfer speeds within the cluster for target topologies and choice configurations.
- Collaborating with mechanical, physical, electrical, thermal, power, networking, OS, SW, and datacenter infrastructure stakeholders for performant, scalable deployments.
- Exploring and championing new product-level features and workflows.
- Mentoring junior engineers in best practices and data-driven processes.
Minimum Qualifications
- BS/MS in Computer Engineering or equivalent with 10+ years of relevant industry experience.
- Possesses functional experience in defining and deploying datacenter cluster networking architectures over highly dense mesh networks and interconnected nodes for AI/ML based workloads.
- Proven track record of deploying AI/ML experiences at scale in large-scale data centers; has strong experience with deployment of modern ML architectures.
- Possesses strong technical breadth across several computer subsystem technologies, e.g., CPU, GPU, TPU, storage, memory, power delivery, power management, high speed networking, I/O, thermal management.
- Has core competence and subject matter expertise in architecting complex systems for general purpose compute or specialty compute (GPU, TPU) systems running datacenter workloads for AI/ML applications.
Preferred Qualifications
- Detailed knowledge of network protocols, with expertise in Ethernet, InfiniBand, RoCE, UE, UAL, or other relevant networking protocols.
- Has strong analytical, verbal, and written communication skills. Ability to summarize and effectively communicate technical issues and actions to key partners and leadership teams.
- Ability to comprehend the roles of HW/FW/SW layers and how they interact in system design.
- Ability to create, review and approve engineering requirement specification documents.
- Prior experience with data center and large-scale cluster systems is desired.
- Machine Learning experience is desired.
- Prior experience in performance modeling is desired.
Pay & Benefits
At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $181,100 and $318,400, and your base pay will depend on your skills, qualifications, experience, and location.
Apple employees also have the opportunity to become an Apple shareholder through participation in Apple's discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple's Employee Stock Purchase Plan. You'll also receive benefits including comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and, for formal education related to advancing your career at Apple, reimbursement for certain educational expenses, including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation. Learn more about Apple Benefits.
Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.
Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.