Lead Machine Learning Scientist, Search

Lawrenceville, NJ

SiriusXM and Pandora have joined together to create the leading audio entertainment company in the U.S. Together, we are uniquely positioned to lead a new era of audio entertainment by delivering the most compelling subscription and ad-supported audio experiences to millions of listeners -- in the car, at home and on the go. Our talent, content, technology and innovation continue to be at the forefront, and we want you to be a part of it! Check out our current openings below and at

Position Summary:

As part of the Foundation, Search, and Voice Science team, you will design and build the next generation of Pandora, SiriusXM, and Stitcher’s search experiences. Search is our most common listener interaction after play/pause, and is core to content discovery. You will lead the scientific direction for Search Science and design algorithms, train models/embeddings, launch experiments, and analyze performance to drive ranking quality across our search products (mobile apps, automotive products, and third-party devices). You will join a team of scientists with diverse expertise in machine learning, statistics, natural language understanding, and recommendation systems.

As the lead search scientist, you will be an expert in areas spanning information retrieval, learning to rank, query understanding, document embedding, semantic search, and personalization. You will research and invent state-of-the-art solutions in these fields, drive the scientific roadmap, mentor other scientists, and build experiences that impact over 100 million listeners. In this Staff-level role, you will represent the Science & Machine Learning organization when proactively communicating and coordinating with cross-functional teams.

Duties and Responsibilities:

  • Drive the scientific direction of Search Science.
  • Research, design, experiment with, and build machine learning systems to drive innovation in search products.
  • Propose, design, and analyze new experiments to validate scientific hypotheses.
  • Build production data pipelines and models at scale, and review methods and code of other scientists.
  • Generate ideas for high-leverage, long-term projects with broad reach.
  • Promote and role-model best practices of data science, engineering, and communication throughout the organization.
  • Mentor and guide research and development of other scientists.

Supervisory Responsibilities:


Minimum Qualifications:

  • Ph.D. in a quantitative field (CS, EE, Statistics, Physics, Math, Computational Linguistics, Neuroscience, etc.) or equivalent practical experience.
  • 5+ years of research and development experience in real-world information retrieval, search, NLU/NLP, or voice/dialog systems.
  • Expertise in machine-learning-based information retrieval.

Requirements and General Skills:

  • Demonstrated research and development in information retrieval, learning to rank, or natural language processing.
  • Excellent written and verbal communication skills, with the ability to effectively advocate technical solutions to scientists, engineers, and product audiences.
  • Demonstrated ability to lead technical decisions and teams.
  • Demonstrated ability to invent novel data science, machine learning, and engineering solutions and deploy them at scale.
  • Passion for data-driven research and development, reliability, and disciplined experimentation.
  • Self-motivated, growth-oriented, and driven to pursue solutions to challenging problems.
  • Must have legal right to work in the U.S.

Technical Skills:

  • Research and development expertise with embedding-based retrieval and learning-to-rank techniques.
  • Experience with Elasticsearch, Solr, Lucene, or equivalent.
  • Experience with ML frameworks (e.g., TensorFlow, TensorFlow Serving, PyTorch, Vowpal Wabbit, scikit-learn).
  • Production experience implementing machine learning pipelines and models at scale in Python, Java, Scala, or similar languages.
  • Proficiency with distributed processing, warehousing, and orchestration frameworks (e.g., Spark, Hive, Airflow).
  • Experience with the research and development workflow/life-cycle for large-scale batch and streaming machine learning systems.
  • Ability to gather stakeholder requirements and evaluate technical trade-offs.

Bonus Skills/Qualifications:

  • Ph.D. in a quantitative field, with a focus on information retrieval, search ranking, or natural language processing.
  • Demonstrated scientific excellence through publishing in first-tier conferences (e.g., SIGIR, ACL, EMNLP, NeurIPS, ICLR, AAAI).
  • Experience with contemporary techniques/tools for NLP (e.g., word2vec, RNNs, transformers, spaCy, Hugging Face, BERT).
  • Experience with any of the following:
    • Cloud computing: Google Cloud Platform, Amazon Web Services, Azure.
    • Additional ML concepts: NLU, language generation, reinforcement learning.
    • MLOps tools and practices: feature stores (e.g., Feast), ML pipelines (e.g., TFX/MLflow/Kubeflow), continuous retraining, serving (e.g., TF Serving/KFServing).