SRE Production DevOps Engineer
Job Description Summary
The Production DevOps Engineer serves as a critical link in the "Middle-Mile" of software delivery for GE Vernova's Grid Software SaaS products. This role is responsible for ensuring that software moves from development to production environments through a standardized, secure, and highly observable path. You will own the Change Management Process, serving as a primary authority for production deployments to ensure that new SaaS product versions do not compromise the stability of global energy grid operations. This position requires a strong technical background in automation and a disciplined approach to release safety in a 24/7 operational environment.
The engineer works independently and is seen as a technical leader. The role requires a deep understanding of concurrent software development and its effect on build management and on releasing builds across versions and environments.
Job Description
Roles and Responsibilities
Day 0: Pipeline Implementation & Standardization
- Golden Path Execution: Maintain and improve standardized CI/CD pipelines using GitHub Actions and ArgoCD, ensuring all product teams follow the established "Golden Path" to avoid bespoke, non-standard deployment utilities.
- Policy Enforcement: Implement and manage automated "quality gates" within the delivery pipeline to verify that every release meets security and architectural standards before reaching production.
- Provisioning Support: Assist the SaaS Cloud Engineers in automating highly secure, resilient cloud infrastructure for customers.
Day 1: Release Authority & Deployment Management
- Change Control Authority: Review and provide final approval for production deployment requests, ensuring all pre-release criteria (such as performance testing and security scanning) are satisfied.
- Progressive Delivery: Execute advanced rollout strategies, including Canary and Blue/Green deployments on Kubernetes, to minimize the "blast radius" of changes.
- Validation: Perform automated verification and acceptance testing post-deployment to confirm service health and trigger automated rollbacks if necessary.
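As an illustration of the kind of post-deployment check described above, here is a minimal Python sketch of a canary gate. It is not GE Vernova's tooling: the `CanaryMetrics` shape, the `canary_decision` function, and the threshold values are all hypothetical and chosen only to show how a promote-or-rollback decision can be automated.

```python
from dataclasses import dataclass


@dataclass
class CanaryMetrics:
    """Request and error counts for one deployment track (hypothetical shape)."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        # A track with no traffic is treated as having no observed errors.
        return self.errors / self.requests if self.requests else 0.0


def canary_decision(
    baseline: CanaryMetrics,
    canary: CanaryMetrics,
    max_delta: float = 0.01,   # assumed tolerance: 1 percentage point
    min_requests: int = 100,   # assumed minimum sample size
) -> str:
    """Return "wait", "rollback", or "promote" for a canary release."""
    # Not enough canary traffic yet to judge its health.
    if canary.requests < min_requests:
        return "wait"
    # Canary is measurably worse than the stable version: roll back.
    if canary.error_rate - baseline.error_rate > max_delta:
        return "rollback"
    return "promote"


if __name__ == "__main__":
    stable = CanaryMetrics(requests=10_000, errors=20)
    print(canary_decision(stable, CanaryMetrics(requests=500, errors=30)))
```

In practice a decision like this would usually be driven by metrics from an observability system and wired into the rollout controller; the sketch only shows the comparison logic.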
Day 2: Operational Support & Optimization
- 24/7 Follow-the-Sun Support: Participate in global on-call rotations, ensuring a seamless transition of operational responsibility between time zones through standardized handover protocols.
- Incident & Root Cause Analysis: Support high-severity incident response and participate in blameless Root Cause Analysis (RCA) to identify and fix systemic deployment risks.
- FinOps & Capacity: Track and report on cloud resource consumption for CI/CD infrastructure, assisting in cost-optimization efforts and right-sizing production workloads.
- Manage key deliverables and mentor junior team members.
- Contribute to initiatives such as defining standards and processes to ensure quality.
- Develop and enhance the test infrastructure and continuous integration framework used across teams.
- Learn new build and release techniques and methodologies, and train the team in them.
- Partner with and provide direction to fellow team members to diagnose bugs and formulate solutions.
Technical Requirements
- CI/CD & GitOps: Hands-on experience with Jenkins, Artifactory, GitHub Actions and ArgoCD for automated software delivery.
- Container Orchestration: Proficiency in managing workloads on Kubernetes, specifically with EKS clusters.
- Automation Tools: Strong skills in Ansible and Terraform for configuration management and infrastructure-as-code.
- Cloud Platform: Solid understanding of AWS cloud services (VPC, IAM, EKS, RDS, S3, MSK, etc.) in a production setting.
- Observability: Experience using Prometheus, Grafana, Splunk, Datadog, or Dynatrace to monitor deployment health and system performance.
Experience & Qualifications
- Professional Background: 5+ years of experience in DevOps, SRE, or Release Engineering roles for cloud-native SaaS applications.
- Overall Experience: 8+ years.
- Operational Discipline: Proven ability to manage production changes and troubleshoot under pressure in a high-stakes environment.
- Compliance Awareness: Familiarity with regulated industries and security frameworks such as NERC CIP, SOC 2, ISO 27001, and IEC 62443 is highly preferred.
- Communication: Strong ability to document technical procedures and communicate clearly with stakeholders during global shift handovers.
Key Performance Indicators (KPIs)
- System Availability: Help maintain 99.99% availability of mission-critical grid SaaS products.
- Customer Onboarding Speed: Contribute to the 4-hour customer onboarding SLA target.
- Change Failure Rate: Maintain a low rate of failed production deployments through improved quality gates.
- Mean Time to Recover (MTTR): Ensure fast restoration of service through automated rollbacks and diligent runbook execution.
- Toil Reduction: Automate repetitive manual tasks so that at least 50% of time is spent on engineering improvements.
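To make the availability, change failure rate, and MTTR targets above concrete, the following Python sketch shows the underlying arithmetic. The function names and sample figures are illustrative assumptions, not numbers or tooling from this posting.

```python
def downtime_budget_minutes(availability_pct: float, period_days: float) -> float:
    """Minutes of allowed downtime for an availability target over a period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


def change_failure_rate(failed_deployments: int, total_deployments: int) -> float:
    """Fraction of production deployments that failed."""
    if total_deployments == 0:
        return 0.0
    return failed_deployments / total_deployments


def mean_time_to_recover(recovery_minutes: list[float]) -> float:
    """Average time to restore service across incidents, in minutes."""
    if not recovery_minutes:
        return 0.0
    return sum(recovery_minutes) / len(recovery_minutes)


if __name__ == "__main__":
    # Sample inputs below are made up for illustration.
    print(f"{downtime_budget_minutes(99.99, 30):.2f} min per 30 days")
    print(f"{downtime_budget_minutes(99.99, 365):.2f} min per 365 days")
    print(f"{change_failure_rate(3, 120):.1%} change failure rate")
    print(f"{mean_time_to_recover([12, 30, 18]):.1f} min MTTR")
```

A 99.99% target leaves roughly 4.3 minutes of downtime per 30 days, or about 52.6 minutes per 365-day year, which is why automated rollbacks matter for this role.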
Education Qualification
Bachelor's degree in Computer Science or a STEM major (Science, Technology, Engineering, and Math) with advanced experience.
Business Acumen:
• Strong problem-solving abilities and the capacity to articulate specific technical topics or assignments
• Experience in building scalable and highly available distributed systems
• Skilled in breaking down problems and estimating time for development tasks
• Evangelizes how our technology solves customer problems from a technology and business perspective
Leadership:
• Demonstrates clarity of thinking to work through limited information and vague problem definitions
• Influences through others; builds direct and "behind the scenes" support for ideas
• Proactively identifies and removes project obstacles or barriers on behalf of the team
• Shares knowledge, power, and credit, establishing trust, credibility, and goodwill
Personal Attributes:
• Able to work under minimal supervision
• Excellent communication skills and the ability to interface with senior leadership with confidence and clarity
• Skilled in providing oversight and mentoring team members. Shows ability to effectively delegate work.
• Applies values, business strategy, policies, precedent, and experience to make complex decisions in ambiguity and with uncertain consequences.
Additional Information
Relocation Assistance Provided: Yes
Perks and Benefits
Health and Wellness
- Health Insurance
- Health Reimbursement Account
- Dental Insurance
- Vision Insurance
- Life Insurance
- Short-Term Disability
- Long-Term Disability
- FSA
- FSA With Employer Contribution
- HSA
- HSA With Employer Contribution
- Fitness Subsidies
- On-Site Gym
- Mental Health Benefits
Parental Benefits
- Adoption Assistance Program
- Family Support Resources
- Birth Parent or Maternity Leave
- Adoption Leave
Work Flexibility
- Flexible Work Hours
- Remote Work Opportunities
- Hybrid Work Opportunities
Office Life and Perks
- Commuter Benefits Program
- Casual Dress
- On-Site Cafeteria
- Holiday Events
Vacation and Time Off
- Unlimited Paid Time Off
- Paid Holidays
- Personal/Sick Days
- Summer Fridays
Financial and Retirement
- 401(K)
- Stock Purchase Program
- Performance Bonus
- Relocation Assistance
- Financial Counseling
- Profit Sharing
- 401(K) With Company Matching
Professional Development
- Tuition Reimbursement
- Access to Online Courses
- Lunch and Learns
- Leadership Training Program
- Internship Program
- Associate or Rotational Training Program
Diversity and Inclusion
- Diversity, Equity, and Inclusion Program
- Employee Resource Groups (ERG)
- Unconscious Bias Training