Applied Research Engineer - Multimodal LLMs for Human Interaction

Yesterday Sunnyvale, CA

Are you excited about the amazing potential of foundation models, LLMs, and multimodal LLMs? We are looking for individuals who thrive on collaboration and have a desire to push the boundaries of what is possible today! The Video Computer Vision org is a centralized applied research and engineering organization responsible for developing real-time on-device Computer Vision and Machine Perception technologies across Apple products. We balance research and product to deliver Apple quality, pioneering experiences, innovating through the full stack, and partnering with HW, SW, and ML teams to influence the sensor and silicon roadmap that brings our vision to life.

Description

We are seeking a highly motivated and skilled senior Applied Research Engineer to join our team. The ideal candidate will have a strong background in developing and exploring the capabilities of foundation models and multimodal large language models that integrate various types of data such as text, image, video, and audio. The ideal candidate should also be familiar with agentic AI, reasoning, and large-scale evaluations of agentic systems. In this role, you will work on groundbreaking research projects to advance our AI and computer vision capabilities, contributing to both foundational research and practical applications. You will:

- Conduct research and development on multimodal large language models, focusing on exploring and utilizing diverse data modalities.

- Design, implement, and evaluate algorithms and models to enhance the performance and capabilities of our AI systems.

- Collaborate with multi-functional teams, including researchers, data scientists, and software engineers, to translate research into practical applications.

- Stay up to date with the latest advancements in AI, machine learning, and computer vision, and apply this knowledge to drive innovation within the company.

Preferred Qualifications

PhD in Computer Science, Electrical Engineering, or a related field with a focus on AI, machine learning, or computer vision.

Expertise in one or more of: computer vision, NLP, multimodal fusion, Generative AI, Self-Supervised Learning, Reinforcement Learning, Agentic AI.

Experience with at least one deep learning framework such as PyTorch, JAX, or similar.

Publication record in relevant venues such as NeurIPS, ICML, ICLR, CVPR, ICCV, and ECCV.

Experience in leading ML initiatives and a proven record of shipping products.

Minimum Qualifications

Experience in developing, training/tuning foundation models and multimodal LLMs.

Programming skills in Python.

Master of Science and a minimum of 2 years of relevant industry experience.

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.

Client-provided location(s): Sunnyvale, CA
Job ID: apple-200630255-3956_rxr-658
Employment Type: OTHER
Posted: 2025-11-10T19:06:41

Perks and Benefits

  • Health and Wellness

  • Parental Benefits

  • Work Flexibility

  • Office Life and Perks

  • Vacation and Time Off

  • Financial and Retirement

  • Professional Development

  • Diversity and Inclusion
