Join us as we pursue our disruptive new vision to make machine data accessible, usable and valuable to everyone. We are a company filled with people who are passionate about our product and seek to deliver the best experience for our customers. At Splunk, we're committed to our work, customers, having fun and most importantly to each other's success. Learn more about Splunk careers and how you can become a part of our journey!
The Continual Service Improvement Program Manager is a member of the Global Support Incident & Escalations Management Team at Splunk, responsible for leading proactive programs and initiatives to drive continuous improvement with our products, services, and processes. This role will partner closely with our global team of incident and escalation managers on initiatives covering incident & escalation avoidance, lessons-learned analysis, closed-loop insights to actions, systemic incident remediation, process governance, and reporting / trend analysis / measures of success, and will act as a liaison for building out our next generation of tooling and automation.
The ideal Splunker for this role will have a good blend of technical and soft skills, program / project management experience, and strong deductive reasoning capabilities, and will be able to communicate with a high degree of efficiency and "awesome" up / down / across our Global Support, Customer Success, Product, Engineering, and Operations organizations. You will have the opportunity to shape this position from the very beginning. This role is highly visible across the organization and will help shape Splunk's continuous improvement culture for years to come.
The role responsibilities may include all or a subset of the items listed below.
- Incident / Escalation Avoidance and Lessons-Learned
  - Develop procedures and facilitate lessons-learned actions from Major Incidents and Red / Orange critical accounts. Expand scope over time.
  - Build and track an issues register, identify owners, and drive resolution (bring large systemic issues to weekly Production and ELT Reviews, create reports showing top contributing causes, and build scorecards for product areas)
  - Partner with the Cloud Problem Management (CPM) team to drive non-cloud product related improvements. Participate in CPM-run PIRs (Post Incident Reviews) for major incidents and Red / Orange critical accounts
  - Drive end-to-end / comprehensive continuous improvement across the customer journey
- Customer Quality Risk Management
  - Devise a methodology to proactively identify high-impact bugs (high customer pain score) and trends from Incidents and Escalations
  - Insights to Actions - identify, monitor, assess impact, and drive expedient solutions / fixes / patches / releases for the fleet to minimize customer risk and increase product availability
- Systemic Incident Remediation
  - Program manage the remediation plan and cross-functional efforts for widespread and long-running systemic / P0 incidents. Collaborate with Engineering, Release Management, Tech Ops, Support, Professional Services, etc., as needed
  - Own internal and external comms and work with the core team (including Legal as needed for ACP incidents)
  - Track remediation progress - how many customers are impacted / potentially impacted, have they been communicated to, have they been remediated and when, status, etc.
- Reporting, Analytics, and Process Governance
  - Track and own measures of success and provide relevant reports - weekly, monthly, quarterly as appropriate
  - Assist with weekly report generation such as Production Review, Global Account Technical Health Review (GATHR), weekly ELT escalation / incident review, etc., as guided by leadership
  - Ensure processes are designed using a closed-loop feedback mechanism
  - Build a unified team newsletter, manage content, and send periodically to relevant audiences
- Unified Response Tooling, Automation, and Process Management
  - Be the liaison for the Incident and Escalation Management teams with the SPURvot development team
  - Assist with the tooling roadmap and prioritization to simplify and create an effective and efficient toolset for the organization
  - Ensure consistent, consolidated, and updated documentation (Confluence pages, process docs, training material, onboarding info, etc.)
The qualifications for this role are listed below.
- 5+ years of proven experience in a related or similar position. Prior experience in incident and escalation management is a plus.
- Have a clear understanding of the ITIL framework.
- You can think outside the box and work on multiple tasks simultaneously while dynamically prioritizing based on changing conditions.
- Ability to work multi-functionally and to influence and execute across geographically dispersed groups.
- You enjoy problem solving and analyzing global-scale distributed systems.
- You have outstanding interpersonal and communication skills.
- You have program / project management experience.
- Experience working in a highly technical environment, with the ability to drive investigations with technical and non-technical teams.
- Strong ability to communicate efficiently and effectively with different teams, from Engineers to Support and Management, including the ability to explain technical issues to non-technical teams.
- Solid understanding of cloud platforms, software deployments, and monitoring tools. Prior experience with Cloud or SaaS companies puts you at the front of the line.
- Exudes Customer Success. Passionate about doing what's right for the customer. Willing to take on the tough projects and challenges to support the growth of the business.
- Ability to work independently with a "make it happen" attitude; can operate and execute in areas of uncertainty and ambiguity; problem solver and quick learner.
- Strives to be seen as a trusted advisor and technical leader who is highly requested by management and peers.
We value diversity at our company. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, or any other applicable legally protected characteristics in the location in which the candidate is applying. For job positions in San Francisco, CA, and other locations where required, we will consider for employment qualified applicants with arrest and conviction records.
Note: Splunk provides flexibility and choice in the working arrangement for most roles, including remote and/or in-office roles. We have a market-based pay structure which varies by location. Please note that the base pay range is a guideline and for candidates who receive an offer, the base pay will vary based on factors such as work location as set out below, as well as the knowledge, skills and experience of the candidate. In addition to base pay, this role is eligible for incentive compensation and benefits, and may be eligible for equity or long-term cash awards.
Benefits are an important part of Splunk's Total Rewards package. This role is eligible for a competitive benefits package which includes medical, dental, vision, a 401(k) plan and match, paid time off, an ESPP and much more! Learn more about our comprehensive benefits and wellbeing offering here.
Base Pay Range
SF Bay Area, Seattle Metro, and New York City Metro Area
Base Pay Range: $104,000 - 143,000 per year
California (excludes SF Bay Area), Washington (excludes Seattle Metro), Washington DC Metro, and Massachusetts
Base Pay Range: $94,400 - 129,800 per year
All other cities and states excluding California, Washington, Massachusetts, New York City Metro Area and Washington DC Metro Area.
Base Pay Range: $84,800 - 116,600 per year