Meta is seeking an AI Research Engineer/Software Engineer to join our team. The ideal candidate will have experience maximizing the performance of AI models. This role involves applying these skills to solve some of the most crucial and exciting problems that exist on the web.
The AI Applications Engineering team is dedicated to maximizing training and inference performance of Generative AI (GenAI) and recommendation models on Meta's Training and Inference Accelerator (MTIA). We employ innovative optimization and data parallelization strategies to maximize training throughput for the next generations of GenAI and recommendation models. Additionally, we work cross-functionally with many partner teams to ensure end-to-end performance of large-scale pre-training and inference, enabling us to deliver the next generation of AI experiences more quickly to our users.
Software/Research Engineer Responsibilities:
- Apply state-of-the-art optimization techniques to our latest large-scale AI workloads running on Meta's fleet of accelerators
- Profile, analyze, debug, and optimize large-scale workloads on our next-generation training superclusters
- Work closely with our customers to co-design models to maximize pre-training and inference efficiency
- Set direction and goals for the team related to project impact, capacity, and developer efficiency
- Collaborate cross-functionally with the compiler, framework, communication, and firmware teams to capture performance bottlenecks
- Implement custom kernels to maximize model performance
- Lead large and complex technical efforts across many engineers and teams
Minimum Qualifications:
- Bachelor's degree in computer science or a related STEM field
- Experience programming AI accelerators (e.g., GPUs or custom silicon) using AI frameworks such as PyTorch or similar
- Experience developing custom kernels and compiler infrastructure to improve performance using low-level programming models such as CUDA, OpenCL or similar
- Minimum 5 years of experience developing and optimizing performance in modern C/C++
- Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment
Preferred Qualifications:
- Master's/PhD in computer science or related STEM field
- A proven track record of impactful contributions to pre-training of AI models at scale using GPUs, custom ASICs, or similar (e.g., publications, relevant work experience, shipped products, or patents)
- Experience with neural network training using ML frameworks such as PyTorch
- Experience with distributed AI systems and communication protocols such as MPI, or collective libraries such as NCCL
- Experience with or knowledge of LLMs, recommender systems, or both
Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens toward immersive experiences like augmented and virtual reality to help build the next evolution in social technology. People who choose to build their careers by building with us at Meta help shape a future that will take us beyond what digital connection makes possible today: beyond the constraints of screens, the limits of distance, and even the rules of physics.
Individual compensation is determined by skills, qualifications, experience, and location. Compensation details listed in this posting reflect the base hourly rate, monthly rate, or annual salary only, and do not include bonus, equity or sales incentives, if applicable. In addition to base compensation, Meta offers benefits. Learn more about benefits at Meta.