EPAM Systems

Site Reliability Engineer (Azure)

Río Grande, Mexico

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.
About the Project
As a Site Reliability Engineer, you will be responsible for the availability, performance, monitoring, and incident response of Microsoft solutions employed by our clients. You will also proactively build and implement solutions and services that allow our IT and support teams to perform at the highest level.

Responsibilities
  • As a Lead Azure SRE, you will be responsible for driving the reliability, performance, and scalability of cloud-based applications and services. Your expertise in Kubernetes, scripting, troubleshooting, and observability will be instrumental in ensuring a seamless and efficient cloud operations environment
  • Take ownership of managing Kubernetes clusters, ensuring their reliability, scalability, and performance. Implement best practices for deploying, monitoring, and optimizing containerized applications in a cloud environment
  • Utilize scripting skills in Python, Bash, and PowerShell to develop automation tools and streamline repetitive tasks. Automate infrastructure provisioning, deployment, and maintenance to achieve operational efficiency
  • Demonstrate expertise in troubleshooting cloud environments, diagnosing and resolving issues to maintain high availability and performance. Implement proactive monitoring and alerting solutions to identify and address potential problems before they escalate
  • Integrate with Azure DevOps to optimize the CI/CD pipeline, enabling continuous delivery and deployment of applications. Collaborate with development teams to streamline the release process and ensure smooth deployments
  • Implement and maintain the modern observability stack, including tools like Grafana, Prometheus, Loki, etc. Leverage these tools to monitor the health and performance of systems and applications, enabling quick identification and resolution of incidents
Requirements
  • Kubernetes
  • Scripting (Python, Bash, PowerShell... in that order of preference)
  • Troubleshooting in cloud environments
  • Azure DevOps
  • Good understanding of the modern observability stack (tools such as Grafana, Prometheus, and Loki)
Nice to have
  • Experience working with Windows
  • Knowledge of CI/CD (especially Azure DevOps)
  • Knowledge of Istio
  • Knowledge of GitOps tools (like ArgoCD)
We Offer
  • Career plan and real growth opportunities
  • Unlimited access to LinkedIn learning solutions
  • International Mobility Plan within 25 countries
  • Constant training, mentoring, online corporate courses, eLearning and more
  • English classes with a certified teacher
  • Support for employee initiatives (algorithms club, Toastmasters, agile club, and more)
  • Enjoyable working environment (gaming room, napping area, amenities, events, sports teams, and more)
  • Flexible work schedule and dress code
  • Collaborate in a multicultural environment and share best practices from around the globe
  • Hired directly by EPAM & 100% under payroll
  • Benefits required by law (IMSS, INFONAVIT, 25% vacation bonus)
  • Insurance: life and major medical expenses, with dental and vision coverage (for the employee and direct family members)
  • 13% employee savings fund, capped at the legal limit
  • Grocery coupons
  • 30-day December bonus
  • Employee Stock Purchase Plan
  • 12 vacation days plus 3 floating days
  • Official Mexican holidays, plus 5 extra holidays (Maundy Thursday and Friday, November 2nd, December 24th & 31st)
  • Relocation bonus: transportation, 2 weeks of accommodation for you and your family and more
  • Monthly non-taxable amount for the electricity and internet bills
Conditions
  • By applying to our role, you are agreeing that your personal data may be used as set out in EPAM's Privacy Notice and Policy

Client-provided location(s): Mexico
Job ID: EPAM-87805
Employment Type: Other