Infrastructure Production Management & Reliability Engineering III - AVP / Director P3 - ETS

Yesterday Hong Kong

About Morgan Stanley

Morgan Stanley is a leading global financial services firm providing a wide range of investment banking, securities, investment management and wealth management services. The Firm's employees serve clients worldwide, including corporations, governments, and individuals, from more than 1,200 offices in 43 countries.

As a market leader, the talent and passion of our people are critical to our success. Together, we share a common set of values rooted in integrity, excellence, and a strong team ethic. Morgan Stanley can provide a superior foundation for building a professional career: a place for people to learn, achieve, and grow. A philosophy that balances personal lifestyles, perspectives and needs is an important part of our culture.

Overview
Join Morgan Stanley's Application Services Infrastructure team to keep a set of business-critical infrastructure applications reliable for technologists across the firm. Our platforms help teams schedule, coordinate, and monitor their production workloads.

You'll combine deep Linux troubleshooting with automation and reliability engineering: improving monitoring, reducing toil, leading upgrades, and driving root-cause fixes that prevent repeat incidents.

What you'll do
- Own production reliability for multiple infrastructure applications: incident response, triage, and sustained follow-through to resolution.
- Drive stability work: improve alerting quality, monitoring coverage, and operational tooling to reduce noise and speed recovery.
- Lead or execute production changes (upgrades, hygiene fixes, reconfiguration) with strong change-management and rollback planning.
- Perform in-depth RCAs and prevent recurrence of incidents and escalations through long-term fixes, automation, and better runbooks.
- Build self-service workflows and high-quality documentation to improve user experience and reduce time-to-production.
- Partner with product engineers and infrastructure teams to identify systemic issues and deliver cross-team solutions.

On-call & schedule
- After onboarding, you'll join a rotating on-call roster with periodic weekend coverage (~1 weekend/month).
- L3 support focuses on high-impact incidents where documentation is incomplete; success requires calm, structured troubleshooting in distributed systems.
- Occasional off-hours work may be needed for planned changes and incident follow-up (we aim to minimize this through automation and process).

Required experience
- At least 7 years of production support / reliability experience for applications on Linux/UNIX.
- Strong command-line troubleshooting skills: logs, processes, networking, and dependency health in distributed systems.
- Ability to write production-ready automation in Bash/shell plus one other language (Python preferred; Go/Ruby/Perl/C/others welcome).
- Strong written communication for technical documentation and incident/RCA write-ups.
- Working understanding of distributed architecture (load balancers, app servers, databases, messaging).
- AI-assisted development and operational automation.

Preferred experience
- Cloud-native deployment/support and/or containers (Docker/Podman).
- Observability tooling (Grafana, Splunk, or similar), log forwarding/agents, and alert tuning.
- Linux administration and performance troubleshooting.
- Any database experience (SQL/NoSQL).
- Experience with workflow/scheduling platforms (AutoSys, Apache Airflow) or coordination systems (Apache ZooKeeper).

Work model
Hybrid: 3 days/week in-office

What you can expect from Morgan Stanley

At Morgan Stanley, we raise, manage and allocate capital for our clients - helping them reach their goals. We do it in a way that's differentiated - and we've done that for 90 years. Our values - putting clients first, doing the right thing, leading with exceptional ideas, committing to diversity and inclusion, and giving back - aren't just beliefs, they guide the decisions we make every day to do what's best for our clients, communities and more than 80,000 employees in 1,200 offices across 42 countries. At Morgan Stanley, you'll find an opportunity to work alongside the best and the brightest, in an environment where you are supported and empowered. Our teams are relentless collaborators and creative thinkers, fueled by their diverse backgrounds and experiences. We are proud to support our employees and their families at every point along their work-life journey, offering some of the most attractive and comprehensive employee benefits and perks in the industry. There's also ample opportunity to move about the business for those who show passion and grit in their work.

To learn more about our offices across the globe, please copy and paste https://www.morganstanley.com/about-us/global-offices into your browser.

Morgan Stanley is an equal opportunities employer. We work to provide a supportive and inclusive environment where all individuals can maximize their full potential. Our skilled and creative workforce is comprised of individuals drawn from a broad cross section of the global communities in which we operate and who reflect a variety of backgrounds, talents, perspectives, and experiences. Our strong commitment to a culture of inclusion is evident through our constant focus on recruiting, developing, and advancing individuals based on their skills and talents.

Client-provided location(s): Hong Kong
Job ID: Morgan-PT-JR030966
Employment Type: Full-time
Posted: 2026-03-12T18:39:52

Perks and Benefits

  • Health and Wellness
    • Health Insurance
    • Dental Insurance
    • Vision Insurance
    • Life Insurance
    • Short-Term Disability
    • Long-Term Disability
    • Fitness Subsidies
    • On-Site Gym
    • Pet Insurance
    • Mental Health Benefits
    • FSA
    • Virtual Fitness Classes
    • HSA
  • Parental Benefits
    • Fertility Benefits
    • Adoption Assistance Program
    • Family Support Resources
    • Return-to-Work Program
    • Birth Parent or Maternity Leave
    • Non-Birth Parent or Paternity Leave
    • Adoption Leave
  • Work Flexibility
    • Hybrid Work Opportunities
  • Office Life and Perks
    • Commuter Benefits Program
    • Company Outings
    • On-Site Cafeteria
    • Holiday Events
  • Vacation and Time Off
    • Paid Vacation
    • Paid Holidays
    • Leave of Absence
    • Volunteer Time Off
    • Personal/Sick Days
  • Financial and Retirement
    • 401(K) With Company Matching
    • Stock Purchase Program
    • Performance Bonus
    • Relocation Assistance
    • Financial Counseling
  • Professional Development
    • Tuition Reimbursement
    • Promote From Within
    • Mentor Program
    • Access to Online Courses
    • Lunch and Learns
    • Work Visa Sponsorship
    • Leadership Training Program
    • Associate or Rotational Training Program
    • Internship Program
  • Diversity and Inclusion
    • Diversity, Equity, and Inclusion Program
    • Employee Resource Groups (ERG)