Principal MLOps Engineer

3+ months ago • Santa Clara, CA

Our Mission

At Palo Alto Networks®, we're united by a shared mission: to protect our digital way of life. We thrive at the intersection of innovation and impact, solving real-world problems with cutting-edge technology and bold thinking. Here, everyone has a voice, and every idea counts. If you're ready to do the most meaningful work of your career alongside people who are just as passionate as you are, you're in the right place.

Who We Are

In order to be the cybersecurity partner of choice, we must trailblaze the path and shape the future of our industry. This is something our employees work at each day and is defined by our values: Disruption, Collaboration, Execution, Integrity, and Inclusion. We weave AI into the fabric of everything we do and use it to augment the impact every individual can have. If you are passionate about solving real-world problems and ideating beside the best and the brightest, we invite you to join us!

We believe collaboration thrives in person. That's why most of our teams work from the office full time, with flexibility when it's needed. This model supports real-time problem-solving, stronger relationships, and the kind of precision that drives great outcomes.

Job Description

Your Career

We are looking for a Principal MLOps Engineer to lead the design, development, and operation of production-grade machine learning infrastructure at scale. In this role, you will architect robust pipelines, deploy and monitor ML models, and ensure reliability, reproducibility, and governance across our AI/ML ecosystem. You will work at the intersection of ML, DevOps, and cloud systems, enabling our teams to accelerate experimentation while ensuring secure, efficient, and compliant deployments.

This role is based at our Santa Clara, California headquarters campus and is in office 3 days a week. It is not a remote role.

Your Impact

- End-to-End ML Architecture and Delivery Ownership: Architect, design, and lead the implementation of the entire ML lifecycle. This includes ML model development and deployment workflows that seamlessly transition models from initial experimentation/development to complex cloud and hybrid production environments.
- Operationalize Models at Scale: Develop and maintain highly automated, resilient systems that enable the continuous training, rigorous testing, deployment, real-time monitoring, and robust rollback of machine learning models in production, ensuring performance meets massive scale demands.
- Ensure Reliability and Governance: Establish and enforce state-of-the-art practices for model versioning, reproducibility, auditing, lineage tracking, and compliance across the entire model inventory.
- Drive Advanced Observability & Monitoring: Develop comprehensive, real-time monitoring, alerting, and logging solutions focused on deep operational health, model performance analysis (e.g., drift detection), and business metric impact.
- Champion Automation & Efficiency: Act as the primary driver for efficiency, pioneering best practices in Infrastructure-as-Code (IaC), sophisticated container orchestration, and continuous delivery (CD) to reduce operational toil.
- Collaborate and Lead Cross-Functionally: Partner closely with Security Teams and Product Engineering to define requirements and deliver robust, secure, and production-ready AI systems.
- Lead MLOps Innovation: Continuously evaluate, prototype, and introduce cutting-edge tools, frameworks, and practices that fundamentally elevate the scalability, reliability, and security posture of our production ML operations.
- Optimize Infrastructure & Cost: Strategically manage and optimize ML infrastructure resources to drive down operational costs, improve efficiency, and reduce model bootstrapping times.

Qualifications (Additional Job Description)

Your Experience

- 8+ years of software/DevOps/ML engineering experience, including 3+ years focused specifically on advanced MLOps, ML Platform, or production ML infrastructure and 5+ years of experience building ML models.
- Deep expertise in building scalable, production-grade systems using strong programming skills (Python, Go, or Java).
- Expertise in leveraging cloud platforms (AWS, GCP, Azure) and container orchestration (Kubernetes, Docker) for ML workloads.
- Proven hands-on experience in the ML Infrastructure lifecycle, including:
  - Model Serving: TensorFlow Serving, TorchServe, Triton Inference Server (TIS).
  - Workflow Orchestration: Airflow, Kubeflow, MLflow, Ray, Vertex AI, SageMaker.
- Mandatory Experience with Advanced Inferencing Techniques: Demonstrable ability to utilize advanced hardware/software acceleration and optimization techniques, such as TensorRT (TRT), Triton Inference Server (TIS), ONNX Runtime, model distillation, quantization, and pruning.
- Strong, hands-on experience with comprehensive CI/CD pipelines, infrastructure-as-code (Terraform, Helm), and robust monitoring/observability solutions (Prometheus, Grafana, ELK/EFK stack).
- Comprehensive knowledge of data pipelines, feature stores, and high-throughput streaming systems (Kafka, Spark, Flink).
- Expertise in operationalizing ML models, including model monitoring, drift detection, automated retraining pipelines, and maintaining strong governance and security frameworks.
- A strong track record of influencing cross-functional stakeholders, defining organizational best practices, and actively mentoring engineers at all levels.
- Unwavering passion for operational excellence and for building and securing highly scalable, mission-critical ML systems.
- MS/PhD in Computer Science, Data Science, or Engineering.

Compensation Disclosure

The compensation offered for this position will depend on qualifications, experience, and work location. For candidates who receive an offer at the posted level, the starting base salary (for non-sales roles) or base salary + commission target (for sales/commissioned roles) is expected to be the annual range listed below. The offered compensation may also include restricted stock units and a bonus. A description of our employee benefits may be found here.

$157,200.00 - $254,100.00/yr

Our Commitment

We're trailblazers that dream big, take risks, and challenge cybersecurity's status quo. It's simple: we can't accomplish our mission without diverse teams innovating, together.

We are committed to providing reasonable accommodations for all qualified individuals with a disability. If you require assistance or accommodation due to a disability or special need, please contact us at accommodations@paloaltonetworks.com.

Palo Alto Networks is an equal opportunity employer. We celebrate diversity in our workplace, and all qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or other legally protected characteristics.

All your information will be kept confidential according to EEO guidelines.

Is role eligible for Immigration Sponsorship?: Yes

Client-provided location(s): Santa Clara, CA

Job ID: Palo_Alto_Networks-JR-011103

Employment Type: OTHER

Posted: 2025-10-02T19:56:29

Perks and Benefits

Health and Wellness
- Health Insurance
- Dental Insurance
- Vision Insurance
- FSA
- HSA
- HSA With Employer Contribution
- Life Insurance
- Short-Term Disability
- Long-Term Disability
- Fitness Subsidies
- On-Site Gym
- Pet Insurance
- Mental Health Benefits
- Virtual Fitness Classes
Parental Benefits
- Fertility Benefits
- Adoption Assistance Program
- Family Support Resources
- Birth Parent or Maternity Leave
- Non-Birth Parent or Paternity Leave
- Adoption Leave
Work Flexibility
- Flexible Work Hours
- Remote Work Opportunities
- Hybrid Work Opportunities
- Work-From-Home Stipend
Office Life and Perks
- Commuter Benefits Program
- Casual Dress
- Happy Hours
- Snacks
- On-Site Cafeteria
- Holiday Events
Vacation and Time Off
- Paid Vacation
- Unlimited Paid Time Off
- Paid Holidays
- Personal/Sick Days
- Leave of Absence
- Volunteer Time Off
Financial and Retirement
- 401(K)
- 401(K) With Company Matching
- Company Equity
- Stock Purchase Program
- Performance Bonus
- Relocation Assistance
Professional Development
- Promote From Within
- Mentor Program
- Access to Online Courses
- Leadership Training Program
- Tuition Reimbursement
- Lunch and Learns
- Internship Program
- Professional Coaching
- Work Visa Sponsorship
Diversity and Inclusion
- Diversity, Equity, and Inclusion Program
- Employee Resource Groups (ERG)
- Founder led
- Veteran founded/led
- Asian founded/led