Cloud Site Reliability Engineer

Yesterday Mumbai, India

We are seeking a highly skilled Site Reliability Engineer (SRE) with strong experience in Kubernetes troubleshooting and incident response, deep knowledge of monitoring and alerting systems, and solid experience in CI/CD pipeline design and maintenance. You will play a key role in building and maintaining reliable infrastructure, enhancing observability, and ensuring uptime for mission-critical systems.

In this role, you will...

  • Diagnose and resolve issues in Kubernetes clusters, including deployments, pod failures, networking issues, and autoscaling.
  • Lead incident management efforts including on-call response, root cause analysis, and continuous improvement of incident playbooks.
  • Design and maintain monitoring, logging, and alerting systems using tools such as Prometheus, Grafana, and ELK (Elasticsearch, Logstash, Kibana).
  • Set up and manage Kibana dashboards and maintain the ELK stack to ensure high availability and performance of logging infrastructure.
  • Integrate metrics, logs, and traces into a unified observability platform.
  • Build and maintain alerting pipelines that reduce noise and improve the signal-to-noise ratio for production incidents.
  • Contribute to infrastructure automation using tools like Terraform and Helm.
  • Set up and support CI/CD pipelines for automated testing, deployment, and rollback across multiple environments.
  • Participate in shift rotations and continuously improve observability and response systems.

You've Got What It Takes If You Have...

  • 2+ years in an SRE, DevOps, or Infrastructure Engineer role.
  • Bachelor's degree in computer science, IT, or related technical field.
  • Hands-on experience with AWS and GCP.
  • Deep hands-on experience with Kubernetes (EKS, AKS, GKE).
  • Strong understanding of Linux internals, container orchestration, and microservice architecture.
  • Hands-on experience with monitoring/logging tools:
    • Prometheus, Grafana, InfluxDB
    • ELK stack (Elasticsearch, Logstash, Kibana)
  • Proficient in incident response and alerting tools (PagerDuty etc.).
  • Basic knowledge of:
    • Kafka - topic monitoring, consumer health
    • ElastiCache / Redis - caching patterns and troubleshooting
    • InfluxDB - time-series metrics storage
  • Experience writing and maintaining automation scripts in Bash, Python, or Go.

#LI-Onsite

Client-provided location(s): Mumbai, India
Job ID: CornerstoneOnDemand-req10866
Employment Type: OTHER
Posted: 2026-01-30T19:18:20

Perks and Benefits

  • Health and Wellness

    • Health Insurance
    • Health Reimbursement Account
    • Dental Insurance
    • Vision Insurance
    • Life Insurance
    • Short-Term Disability
    • Long-Term Disability
    • FSA
    • HSA
    • HSA With Employer Contribution
    • Pet Insurance
    • Mental Health Benefits
  • Parental Benefits

    • Birth Parent or Maternity Leave
    • Non-Birth Parent or Paternity Leave
    • Fertility Benefits
    • Family Support Resources
    • Adoption Leave
  • Work Flexibility

    • Flexible Work Hours
    • Remote Work Opportunities
    • Hybrid Work Opportunities
  • Office Life and Perks

    • Casual Dress
    • Snacks
    • Company Outings
    • On-Site Cafeteria
    • Holiday Events
  • Vacation and Time Off

    • Paid Vacation
    • Unlimited Paid Time Off
    • Paid Holidays
    • Personal/Sick Days
    • Leave of Absence
    • Summer Fridays
  • Financial and Retirement

    • 401(K) With Company Matching
    • Stock Purchase Program
    • Performance Bonus
    • Relocation Assistance
    • Financial Counseling
    • Profit Sharing
  • Professional Development

    • Tuition Reimbursement
    • Promote From Within
    • Work Visa Sponsorship
    • Leadership Training Program
    • Internship Program
    • Shadowing Opportunities
    • Access to Online Courses
  • Diversity and Inclusion

    • Employee Resource Groups (ERG)
    • Unconscious Bias Training
    • Diversity, Equity, and Inclusion Program