Lead Incident Manager
Job Description
Let's start with the day-to-day of this job:
The Lead Incident Manager will lead the response to major incidents as well as proactive and reactive problems. The goal is to lead responses using ITIL processes, expediting the resolution of incidents and problems to minimize business, financial, reputational, and legal/regulatory risks. This role is responsible for engaging stakeholders, business owners, and key technical resources. Key tasks include driving bridge calls, providing clear communication to various stakeholders, delivering process training, identifying trends, producing monthly management/operational reports, and various other activities related to IT service management.
The Incident/Problem Manager is expected to use their extensive knowledge of both incident and problem management processes to mitigate impact and reduce the time to restore business services. After restoration, this individual will need to track RCAs and any associated remediation efforts to completion to avoid recurrence. This individual will be responsible for understanding multiple lines of business, including customer segments and critical services, and for developing a deep understanding of the applications and infrastructure components supporting these services. Collaborating with team members to deliver an enterprise-wide service will be a key success factor.
In this position, you will be responsible for this:
Develop the Major Incident Management process roadmap and align the team around it.
Engage in outreach to the business units to socialize the Major Incident Management Services.
Establish and drive the means to resolve incidents and restore business services as rapidly as possible.
Lead the major incident bridge, driving resolver teams to incident resolution within the SLAs.
Continually assess business impact and communicate it across multiple groups, including executive management.
Prepare a monthly report of Major Incidents that meet the Enterprise Risk Criteria.
Lead post-incident activities (reactive problem management) to identify and assign action items that prevent future occurrences, leveraging ITIL problem management best practices.
Develop and execute strategies to identify and address potential problems before they impact customers (proactive problem management).
Create and maintain training materials and team procedures and ensure team members are aligned with the procedures.
Ensure Incident Managers properly manage incidents: regularly requesting updates from the technical team, sending communications per SLAs, ensuring clarity on business impact, and monitoring activities and cutoff times for critical processes.
Conduct training and knowledge-sharing sessions for various teams and new hires to help them adapt to standardized processes.
Continuously improve and evolve service management processes and procedures to maintain high-quality, "industry-best" customer service.
Collaborate closely with other service management functions (change management, service desk, etc.) and assist as necessary in the spirit of one-team.
Engage with the Business Continuity team as necessary.
Provide on-call support during non-business hours (team provides 24x7 support for incidents).
Ownership and execution of key activities of incident and problem management processes, including:
Event analysis and documentation, leveraging established processes to assign priority to incidents.
Engage key resources including vendors as necessary (technical, product and executive management personnel).
Prompt communication to all affected parties, including executive management.
Accurately track incident and problem records in ticketing systems.
Documentation of incident timeline and remediation actions.
Lead post incident review meetings (Root Cause Analysis phase).
Record, assign and track corrective actions through closure.
Prepare and present monthly metrics, status, and service health reports.
You've got to have some or all of this too:
Bachelor's degree in information technology, engineering, or a related field.
7+ years of experience in Major Incident/Problem management.
ITIL Foundations Version 4 Certification preferred.
Experience with IT Service Management platforms (ServiceNow, Freshservice, Atlassian Jira Service Management, etc.).
Knowledge of or experience with a wide range of enterprise technologies is a plus, including but not limited to cloud technologies (Azure/AWS), distributed services (server and database), network, storage, and web architecture.
These things are required for the position:
Excellent written and verbal communication skills, with the ability to explain technical and business concepts clearly and effectively.
Strong analytical and problem-solving skills.
Ability to facilitate conversations confidently and clearly on conference calls, in meetings, via collaboration tools, and via email, at all levels of the organization.
Strong people skills with ability to foster a positive working environment.
Crisis management skills: able to set priorities, pursue multiple threads at the same time, accurately reflect current state and drive towards desired state.
Ability to maintain calmness during stressful situations.
Strong organizational skills with the ability to manage multiple tasks simultaneously.
Client focus and ownership, use of own initiative and a proactive approach to work.
Ability to provide on-call support (team provides 24x7 support for incidents).
Experience working with engineering, product, and business teams.
Work Environment
The work environment characteristics described here may be encountered while performing the essential functions of this job. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.
Moderate noise (i.e., a business office with computers, phones, and printers, and light traffic).
Ability to work in a confined area.
Ability to sit at a computer terminal for an extended period of time. Occasional stooping or kneeling may be necessary.
While performing the duties of this job, the employee is regularly required to stand, sit, talk, hear and use hands and fingers to operate a computer keyboard and telephone.
Specific vision abilities are required by this job due to computer work.
Light to moderate lifting is required.
Regular, predictable attendance is required.