Job Description:
Senior Delivery Manager (Production Support and DevOps)
The person is responsible for ensuring the smooth and reliable operation of production systems and applications, acting as a point of contact for incidents and ensuring the efficient resolution of issues. They also play a key role in incident management, root cause analysis, and continuous improvement efforts. The person should have excellent leadership and people management skills and should be able to lead a large team of Production Support and DevOps Engineers.
Key roles and responsibilities:
Incident Management and Support:
Monitoring and Troubleshooting: Continuously monitor systems and applications for performance issues, incidents, and alerts, and proactively respond to incidents.
Issue Resolution: Diagnose and resolve production issues using advanced troubleshooting techniques.
Root Cause Analysis: Perform in-depth analysis to identify the root causes of incidents and prevent recurrence.
Documentation and Communication: Create and maintain documentation related to production issues and resolutions, and effectively communicate with stakeholders, including development and operations teams.
Incident Management: Oversee the incident management process, including prioritization, escalation, and resolution, so that incidents are resolved promptly and effectively.
System Performance and Optimization:
Performance Monitoring: Monitor system performance metrics, identify bottlenecks, and recommend solutions for performance optimization.
Process Improvement: Implement and maintain processes and procedures to improve production support efficiency and reduce downtime.
Automation: Identify and implement automation opportunities to streamline repetitive tasks and reduce manual effort.
Data Analysis: Analyze data related to production performance, incident trends, and support requests to identify areas for improvement and optimization.
Cross-Functional Collaboration:
Collaboration with Development and Operations: Work closely with development, operations, and other relevant teams to ensure seamless software deployment and integration.
Communication and Reporting: Provide regular reports on system performance, incident status, and support metrics to senior management and stakeholders.
On-Call Support: Participate in on-call rotations and respond to production issues after business hours.
Other Responsibilities:
Training and Documentation: Develop and deliver training materials and documentation to support production support teams.
Process Improvement: Identify and implement improvements to production support processes and procedures.
Knowledge Management: Maintain and update knowledge databases and documentation to support troubleshooting and incident resolution.
Continuous Improvement: Drive continuous improvement initiatives to enhance the overall efficiency and reliability of production support.
Technical Skills:
Excellent knowledge of ServiceNow, New Relic, and AWS Cloud, along with application, system, network, cloud, and DevOps technologies.
Experience:
12+ years
Certification:
ITIL and AWS certifications are desired.
We offer you a competitive total rewards package, continuing education & training, and tremendous potential with a growing worldwide organization.
DISCLAIMER:
Nothing in this job description restricts management's right to assign or reassign duties and responsibilities of this job to other entities, including but not limited to subsidiaries, partners, or purchasers of Alight business units.