Is Your IT Workforce Ready For AIOps?

This is a copy of an original post on the Forbes blog here.



The Strategic Brief:

AIOps will change the way organizations operate.

In the AIOps-enabled enterprise, where artificial intelligence and machine learning automate tasks to augment technology operations teams, businesses undergo a monumental shift that enables them to be more proactive, predictive and ultimately preemptive.




Along the journey to the AIOps-enabled enterprise, the skills needed in your ITOps, DevOps, and site reliability engineering (SRE) teams will also evolve to include customization, integration, automation, and governance. Most organizations aren’t ready for this seismic shift, however. A recent survey of 6,000 IT professionals shows the vast majority of global enterprises have yet to start an AIOps strategy.

The Current State of Operations

Let’s examine how we got to today’s IT operations organization. Specifically, I mean the people monitoring and managing the production environment, whether or not they have “operations” in their title.

In the last decade, the drive to agile and DevOps solutions moved operations towards development, creating the new skill set requirement of release engineering (RelEng), which is responsible for automating application deployment and providing structure for the software development lifecycle (SDLC). This required connecting the dots across domains (server, network, database, frameworks, code dependencies, and so on), and began changing development and operations from I-shaped professionals (deeply skilled in one area) to T-shaped professionals (skilled in one area but also knowledgeable in other domains).

You may notice that RelEng focuses on SDLC tasks such as automating builds, tests and QA, essentially automating all the work for deploying an application into production. In the past, DevOps failed to pay equal attention to the operations effort needed during the production lifespan of an application. AIOps addresses this DevOps weakness by applying AI/ML to anomaly detection, root cause analysis, resolution and verification, and by driving automation of anomaly resolutions. This means the AIOps enterprise will require a different skill set from ITOps, DevOps and SRE professionals.

The New Skill Profile for AIOps

AIOps is reducing the shelf life of two operational skills: the Sherlock Holmes-esque investigative skill that is the heart of root cause analysis, and the experience-based knowledge that lives within an individual. Instead, AIOps will identify or short-list the root cause, and resolvable actions will be captured and automated where warranted. When a clear root cause is found and a matching automated resolution is in place, AIOps will be able to address the issue without human interaction.
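As a rough illustration of that decision, the sketch below auto-resolves an anomaly only when exactly one root cause has been short-listed and a matching automated resolution is registered; everything else goes to a person. The `Anomaly` class, the `RESOLUTIONS` registry, the cause labels and the `handle` function are all hypothetical names invented for this example, not part of any AIOps product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Anomaly:
    service: str
    # Root causes short-listed by a (hypothetical) AIOps engine.
    candidate_causes: List[str]


# Hypothetical registry: root-cause label -> captured, automated resolution.
RESOLUTIONS: Dict[str, Callable[[str], str]] = {
    "disk_full": lambda service: f"rotated logs on {service}",
    "stale_connection_pool": lambda service: f"restarted pool on {service}",
}


def handle(anomaly: Anomaly) -> str:
    """Auto-resolve only for a single, clear cause with a registered resolution."""
    if len(anomaly.candidate_causes) == 1:
        cause = anomaly.candidate_causes[0]
        action = RESOLUTIONS.get(cause)
        if action is not None:
            return "auto-resolved: " + action(anomaly.service)
    # Ambiguous or unknown cause: hand the short-list to a human.
    return "escalated to on-call: " + ", ".join(anomaly.candidate_causes)


if __name__ == "__main__":
    print(handle(Anomaly("billing", ["disk_full"])))
    print(handle(Anomaly("billing", ["disk_full", "stale_connection_pool"])))
```

The point of the sketch is that the registry, not an individual's memory, holds the resolution knowledge, which is why capturing and sharing that knowledge matters for the skill profile described here.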

Similar to cloud services, AIOps will require skills in customization, integration, automation, and governance. While team members with specialist skills will still have value, AIOps will encourage learning and collaboration with other disciplines, and allow you to measure how IT capability and growth are helping to achieve a goal. This represents a shift from the I-shaped and T-shaped specialist to a full-fledged versatilist.

The AIOps professional is a cross-domain expert who uses domain-specific skills to control a progressively widening scope of coverage, and who is equally at ease communicating the technical and business impacts of an issue.

Capability Levels Track Transition to AIOps

To align your team with the AIOps profile, define an alternate career path for them. IT professionals may see their careers tied to a siloed technology certification, and consider time spent learning other domains as coming at the expense of their specialization. You can delineate an alternate path by assessing their current skills, setting goals for the level your enterprise requires, and then building training and incentive programs to transition them into the new skill set.

A simple, six-level scale (based loosely on Bloom’s taxonomy used in education to assess learning effectiveness) can be used for assessment and goal-setting. Each domain’s skills can be measured against the individual’s capability.

The Six Levels Of IT Capability

  1. Awareness: The most basic level; professionals are aware that the technology or practice is in use somewhere in your enterprise.
  2. Understanding: The ability to understand where the technology or practice is used in the enterprise, and which team to contact if anything needs to be done with it.
  3. Applying: Performing basic tasks to manage the technology or practice, with a standard operating procedure (SOP) providing guidance.
  4. Analyzing: Knowing how to view related measures in an application performance monitoring (APM) solution and describe the cross-domain integration present for the technology or practice.
  5. Automating: Defining, creating and deploying automated processes for the technology or practice, allowing automatic resolution of anomalies by AIOps.
  6. Architecture: Designing and enacting an architecture for new implementations of the technology or practice. There may be vendor or institutional certifications available at this level.

The above capability scale can be applied across specialized technologies and more general practice and soft-skill areas. The technologies you assess, which will depend on what is used in your enterprise, may include: AWS, Azure, containers, microservices, Kubernetes, databases, network, infrastructure hardware, embedded frameworks, cloud service providers, APM tools, management tools, and more.

In addition, you will need to add categories for non-technical areas including:

  • Sharing: to incent the capture of knowledge for automation
  • Security: while this may appear as a technology, security is also a process and a behaviour that overlaps with governance
  • Programming: assessing the ability to create automation scripts and actions, including knowledge of language and usage of APIs
  • Governance: understanding where the technology sits within industry regulation and best practices
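One way to put the scale to work is a simple gap assessment per person: record the current level for each technology or category, set the level your enterprise requires, and list the shortfalls. The sketch below is a minimal illustration under stated assumptions: the `Capability` enum mirrors the six levels, the extra `NONE` level (for a domain the person has not yet encountered) is an addition for convenience, and the `capability_gaps` function and sample domains are hypothetical.

```python
from enum import IntEnum
from typing import Dict


class Capability(IntEnum):
    NONE = 0  # assumption: not one of the six levels; means "not yet aware"
    AWARENESS = 1
    UNDERSTANDING = 2
    APPLYING = 3
    ANALYZING = 4
    AUTOMATING = 5
    ARCHITECTURE = 6


def capability_gaps(
    current: Dict[str, Capability], target: Dict[str, Capability]
) -> Dict[str, int]:
    """Return how many levels short of the target a person is, per domain."""
    gaps: Dict[str, int] = {}
    for domain, goal in target.items():
        have = current.get(domain, Capability.NONE)
        if goal > have:
            gaps[domain] = int(goal) - int(have)
    return gaps


if __name__ == "__main__":
    current = {
        "Kubernetes": Capability.APPLYING,
        "Governance": Capability.AWARENESS,
    }
    target = {
        "Kubernetes": Capability.AUTOMATING,
        "Governance": Capability.UNDERSTANDING,
        "Sharing": Capability.APPLYING,
    }
    print(capability_gaps(current, target))
```

The resulting gaps can feed the training and incentive programs: the larger the gap in a domain the enterprise depends on, the higher its priority.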

You can deploy AIOps without waiting for your skills transition to complete, as the technology provides significant benefits immediately. To realize the full value of AIOps, it’s essential to move your existing teams to a new skills profile. This transition can occur during your AIOps deployment. By using capability levels, goals and incentives, you’ll gain a clear path for growth, allowing teams to help your AIOps deployment succeed.

Augmenting operations teams via AIOps frees up time for team members. This time can be used to extend capabilities across domains and into the business, transforming professionals’ skills to fit the new AIOps profile. Just as the business organization evolved to support citizen technologists and citizen data scientists, IT must evolve to support citizen business evangelists and automation strategists.