
Software Engineer: Agentic Evaluation

Cupertino, CA

At Apple, great ideas have a way of becoming great products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no telling what you could accomplish.

Do you want to help measure and improve the quality of Siri across the devices, features, and experiences people rely on every day? Apple's Agentic Evaluation Engineering organization builds the infrastructure that determines how Siri's quality is measured, trusted, and improved. You'll join a team focused on expanding what that platform can reach: the devices and environments we evaluate on, the features and interaction modalities we exercise, and the realistic, repeatable conditions we stage to ground each evaluation. The surface area is large and growing. You'll have real autonomy in how you tackle it, and you'll build infrastructure the team can rely on as priorities shift.

Description

In this role you'll contribute to the infrastructure, tooling, and pipelines that let us evaluate Siri reliably and at scale. You'll have meaningful autonomy in how you get there, and the work will move across several areas of expansion as priorities evolve. The specific platforms, frameworks, and components will change over time, so we're looking for someone who can transition smoothly across them and bring strong evaluation and systems engineering fundamentals to whatever the team needs next.

Responsibilities:

Extending evaluation capabilities to new devices, platforms, and runtime environments, with designs that favor portability over any single target

Supporting the evaluation of new Siri features and interaction modalities, working from ambiguous early requirements toward concrete, automated coverage

Diagnosing failures across the stack, from environment provisioning through pipeline execution to scoring, enabling auto-diagnostics and driving durable fixes

Contributing to architecture decisions for the team's evaluation systems

Partnering across engineering, infrastructure, and program teams to align on interfaces, priorities, and shared standards

Preferred Qualifications

Experience staging, provisioning, or controlling test or evaluation environments to produce repeatable, deterministic conditions

Experience evaluating ML, LLM, or agent-based systems, including familiarity with metrics, scoring methodology, or trajectory and outcome analysis

Experience designing or operating test infrastructure at scale, such as device provisioning, environment restore, warm pools, or continuous integration systems

Proficiency with Python and Swift in a production setting

A track record of approaching problems flexibly and cutting through ambiguity, adapting your approach to reach the right outcome and setting a clear path when requirements are not yet defined

A talent for focusing and simplifying, stripping away what is not essential and distilling complex decisions down to the factors that matter

A history of collaborating across teams and communicating effectively with both technical and program audiences

Minimum Qualifications

Strong programming skills in one or more compiled languages (Swift, C++, or Objective-C)

Python scripting skills for tooling and automation

Solid understanding of computer science fundamentals

Ability to quickly learn new technologies and adapt to evolving requirements

Excellent communication skills and ability to collaborate across teams

M.S. or B.S. in Computer Science, Machine Learning, or a related field (or equivalent experience)

Pay & Benefits

At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $181,100 and $318,400, and your base pay will depend on your skills, qualifications, experience, and location.

Apple employees also have the opportunity to become an Apple shareholder through participation in Apple's discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple's Employee Stock Purchase Plan. You'll also receive benefits including comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and, for formal education related to advancing your career at Apple, reimbursement for certain educational expenses, including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation.

Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.

Client-provided location(s): Cupertino, CA
Job ID: apple-200666976-0836
Employment Type: OTHER
Posted: 2026-06-12T19:56:59

Perks and Benefits

  • Health and Wellness
  • Parental Benefits
  • Work Flexibility
  • Office Life and Perks
  • Vacation and Time Off
  • Financial and Retirement
  • Professional Development
  • Diversity and Inclusion
