Senior Lead Site Reliability Engineer

Plano, TX

Elevate your engineering prowess to unprecedented levels by joining a team of exceptionally gifted professionals and position yourself among the top echelon in site reliability.

As a Senior Lead Site Reliability Engineer at JPMorgan Chase within the Consumer & Community Banking Data and Analytics team, you will help build a meaningful engineering discipline, combining software and systems to develop creative engineering solutions to operations problems. Much of our support and software development focuses on optimizing existing systems, building infrastructure, and reducing work through automation. You'll join a team of curious problem solvers with a diverse set of perspectives who are thinking big and taking risks. In this environment you'll take the lead on relevant projects, backed by an organization that provides the support and mentorship you need to learn and grow.

Job responsibilities

  • Creates high quality designs, roadmaps, and program charters that are delivered by you or the engineers under your guidance
  • Provides advice and mentoring to other engineers and acts as a key resource for technologists seeking advice on technical and business-related issues
  • Demonstrates site reliability principles and practices every day and champions the adoption of site reliability throughout your team
  • Collaborates with others to create and implement observability and reliability designs for complex systems that are robust, stable, and do not incur additional toil or technical debt
  • Identifies application patterns and analytics in support of better service level objectives
  • Designs self-healing and resiliency patterns
  • Designs automated software and product upgrades, change management, and release management solutions
  • Works toward becoming an expert on the applications and platforms in your remit while understanding their interdependencies and limitations
  • Evolves and debugs critical components of applications and platforms
  • Provides comprehensive and ongoing guidance, tools, and solutions to support the firm's growth
  • Makes significant contributions to JPMorgan Chase's site reliability community via internal forums, communities of practice, guilds, and conferences

Required qualifications, capabilities, and skills

  • 16+ years of software engineering experience with 5+ years of Site Reliability Engineering experience.
  • Advanced knowledge in site reliability culture and principles with demonstrated ability to implement site reliability within an application or platform.
  • At least 2 years of hands-on experience in architecting, scaling, and providing SRE support for AI/ML platforms and products, including infrastructure tech stacks such as Databricks, GPU clusters, Model Serving frameworks, Feature Stores, Vector Databases, and LLM inference pipelines.
  • Demonstrated ability to apply core SRE fundamentals - including reliability patterns, capacity planning, incident management, performance tuning, and toil reduction - specifically to AI/ML and data-intensive, compute-heavy workloads.
  • Experience in defining and enforcing SLOs/SLIs tailored to AI/ML workloads (e.g., model latency, throughput, data freshness, inference availability) to drive reliability at scale.
  • Proven hands-on experience in designing and implementing Agentic AI-based solutions to deliver SRE capabilities at scale, including practical expertise with AI Agents, Skills, Context Management, Retrieval-Augmented Generation (RAG), and tool-use patterns.
  • Ability to apply Agentic AI frameworks to automate and augment core SRE functions such as intelligent incident detection and remediation, automated root cause analysis, predictive alerting, self-healing infrastructure, runbook automation, and observability enrichment to reduce toil and accelerate MTTR.
  • Ability to contribute to governance and controls of AI usage, applying a site reliability mindset and principles to CCB systems and platforms.
  • Advanced knowledge and experience in observability such as white and black box monitoring, service level objectives, alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, Splunk, etc.

Preferred Qualifications

  • Experience with cloud-based data and analytics architecture, including AWS storage, Snowflake, Kubernetes (EKS), event-driven architectures, streaming services, batch jobs, and ETL pipelines.
  • Proficiency with modern data processing frameworks such as Apache Kafka, Apache Spark, and similar tools, with a focus on ensuring scalability, reliability, and performance of data and analytics platforms.
  • Strong communication skills with ability to mentor and educate others on site reliability principles and practices.
  • Recognized as an active contributor to the engineering community.


ABOUT US

Chase is a leading financial services firm, helping nearly half of America's households and small businesses achieve their financial goals through a broad range of financial products. Our mission is to create engaged, lifelong relationships and put our customers at the heart of everything we do. We also help small businesses, nonprofits and cities grow, delivering solutions to solve all their financial needs.

We offer a competitive total rewards package including base salary determined based on the role, experience, skill set and location. Those in eligible roles may receive commission-based pay and/or discretionary incentive compensation, paid in the form of cash and/or forfeitable equity, awarded in recognition of individual achievements and contributions. We also offer a range of benefits and programs to meet employee needs, based on eligibility. These benefits include comprehensive health care coverage, on-site health and wellness centers, a retirement savings plan, backup childcare, tuition reimbursement, mental health support, financial coaching and more. Additional details about total compensation and benefits will be provided during the hiring process.

We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. We also make reasonable accommodations for applicants' and employees' religious practices and beliefs, as well as mental health or physical disability needs. Visit our FAQs for more information about requesting an accommodation.

Equal Opportunity Employer/Disability/Veterans

ABOUT THE TEAM

Our Consumer & Community Banking division serves our Chase customers through a range of financial services, including personal banking, credit cards, mortgages, auto financing, investment advice, small business loans and payment processing. We're proud to lead the U.S. in credit card sales and deposit growth and have the most-used digital solutions - all while ranking first in customer satisfaction.

Client-provided location(s): Plano, TX; Jersey City, NJ
Job ID: JPMorgan-210738827
Employment Type: Full-time
Posted: 2026-05-08T20:01:21

Perks and Benefits

  • Health and Wellness
  • Parental Benefits
  • Work Flexibility
  • Office Life and Perks
  • Vacation and Time Off
  • Financial and Retirement
  • Professional Development
  • Diversity and Inclusion