Director of Incident Response

The Incident Response Team is a newly formed team that reports directly to the VP of Engineering, Security and Infrastructure and will be at the heart of Global Operations at MuleSoft. The team will be responsible for the initial response to and triage of all operational incidents and will champion the lifecycle of these incidents, working directly with Engineering Managers to groom work backlogs and prioritize high-impact fixes.

The Director of Incident Response will build and lead the Incident Response Team responsible for making sure our services maintain the highest availability. You will lead Incident Retrospectives across the engineering teams to identify the failures in people, process, and technology that lead to incidents, develop corrective actions, and track them through to completion. This will involve communicating incident status to the business and supporting outbound communication to customers. You will lead, own, develop, and refine the Change Management Process, the overall Cost Management Initiatives, and the Change Control Review Board (CCRB), and develop statistical measures of success for the CCRB. You will own the end-to-end Incident Management and Problem Management Processes, build the policies and procedures to respond to incidents and match the business needs, and partner with various groups.

Goals for your first three months:

30 days:

  • Collaborate with the Engineering and DevOps teams to start to understand the environments and staffing requirements for operating a 24/7 team to respond to incidents
  • Build the overall Incident and Event Management Policies and Process
  • Work with various stakeholders in the organization to build requirements and identify gaps in documents and runbooks
  • Start to hire a team in SF, BA, or ORD (the team doesn’t need to be 24/7 to start)

60 days:

  • Establish and exercise the incident response plan for operational issues
  • Build metrics around SLAs, MTTx, and other core KPIs for the team, and start to own the statistical reporting and data management functions for Incident Management (SLAs, Mean Time to X calculations), Change Management (Change Induced Incident Minutes, etc.), and Problem Management (Actions, Completion %)
  • Work with engineering teams to make sure that we have full coverage of operational issues across all services
  • Start to build end-to-end knowledge and instrumentation of the system to identify when we have issues

90 days:

  • Establish the cadence of the team and have the foundational set of policies and procedures in place
  • Have buy-in from all engineering management and leadership for the direction of the team
  • Have the team off the ground and working incidents, the RCA process, and change management

The ideal candidate will have:

  • Senior leadership experience with incident, change, and problem management in a software engineering organization with dozens of stakeholders and conflicting priorities, and the ability to build a team from the ground up
  • PMO, PGM, Jira, and Agile experience
  • Experience building and presenting SLA and other technical data to executive management
  • Certifications involving disaster, security, incident, and problem management (GIAC, SANS, ITIL, CERT, FEMA, etc.) - these are helpful but not required

About Our Benefits:

  • Equity and generous Employee Stock Purchase Program
  • Unlimited vacation
  • Gym discounts and weekly onsite yoga classes
  • Catered lunches three times a week and a fully stocked kitchen
  • Annual MeetUp, our company-wide offsite to learn, grow, and connect
  • Frequent office activities and offsites, like Muleys at the Ballpark, Waffle Wednesdays, and family nights
  • Regular opportunities to give back to the community together
  • Comprehensive medical, dental, and vision insurance for you and your family
  • 401(k) and pre-tax health insurance, dependent care, and commuter benefits (FSA)

Meet Some of MuleSoft's Employees

Dana R.

Account Development Representative

Dana’s role is to partner with MuleSoft’s Sales Team to drive new business. She collaborates with account executives to determine opportunities in the market to sell MuleSoft and help transform those organizations' businesses.

Ashley J.

Product Manager

Ashley oversees the core services of MuleSoft’s Anypoint Platform. He works with a team of engineers to build the product in ways that allow other teams to do their jobs most efficiently.