ISE, SIML - Data Collections Lead

    • Cupertino, CA


Posted: Jul 27, 2020

Weekly Hours: 40

Role Number: 200141506

Do you think Computer Vision and Machine Learning can change the world? Do you think they can transform the way millions of people capture, discover and share the most special moments of their lives? We truly believe they can! The System Intelligence and Machine Learning (SIML) group is responsible for crafting machine learning solutions, shipping on all Apple platforms (macOS, iOS, tvOS, watchOS), that extract high-level structural information from images, videos and text. Examples include face recognition, scene classification, OCR and handwriting recognition, as well as support for internal tools. The group combines research and development in a dynamic and engaging environment. Our data team is responsible for designing and building high-quality datasets at scale. At the heart of machine learning, data defines how Apple features and products operate and shapes the final user experience for millions of our customers. This is an exciting time to join us: grow fast, and have an impact on multiple key features on your first day at Apple!

Key Qualifications

  • Passion to create great products and to understand the challenges associated with building datasets for machine learning features, targeting a global and diverse user base, while addressing the challenges of inclusion, bias removal, fairness, etc.
  • Ability to define/design/develop data collection and QA workflows, focusing on the end-to-end user experience: anticipate potential failure modes and edge cases, create examples, detect anomalies
  • Capacity to multitask and manage several projects in parallel, meeting deadlines and providing visibility to clients and partners
  • Problem-solving and critical-thinking skills, with an eye for innovation and for continuous optimization (improve the diversity and quality of assets, reduce time to delivery and cost)
  • Excellent communication and interpersonal skills
  • Coding experience in Python to help with asset/data manipulation is a strong plus


Our team focuses on data collection/generation, smart filtering/selection, annotation, as well as failure analysis. Each year, we power dozens of features and work closely with ML teams across the entire company. Apple's commitment to deliver incredible experiences to a global and diverse set of users in full respect of their privacy leads our team to explore innovative data collection processes. This position focuses on setting up and managing data collection projects end to end and ensuring the quality of the data delivered to R&D. This requires the candidate to:

  • Communicate with the R&D team to understand expectations and define specs of data collection efforts, with a constant focus on fairness and on potential biases
  • Coordinate the efforts of internal teams (privacy, legal, procurement, security) and lead the administrative setup
  • Co-define a data collection workflow with the partners, in accordance with Apple values
  • Establish guidelines and training material, and estimate the time per task
  • Calibrate tasks with partners
  • Define and dynamically adapt the QA methodology (e.g. types of errors, penalties, QA workflow, desired labels)
  • Lead the QA effort
  • Track quantities and quality delivered

Education & Experience

Bachelor's degree in Graphic Design, Computer Science, Mathematics, Physics, or equivalent experience

Additional Requirements

  • Experience with statistical models, machine learning, or data analysis is a plus
