Adapting traditional service management processes for our DevOps environment

Photograph of the IT service management team stood on stage at an awards ceremony celebrating their award

In June, our Digital Service team won the Special Innovation DevOps award at the IT Service Management Forum (ITSMF) Professional Service Management awards.

Each year, ITSMF present innovation awards to organisations that are exploring new territory, often around the edges of traditional IT service management, or that have found innovative solutions to well-known problems.

We’re proud our work has been recognised as being innovative and thought this would be a good time to share our story.

IT service management at the Co-op

When we talk about ‘IT service management’ we mean making sure we can operationally support our products and services.

Over the years, Co-op has put in place IT service management policies and processes based on the IT Infrastructure Library (ITIL) – the industry-standard framework for developing and running IT services. It includes processes to help manage incidents, requests and changes.

The principles of the framework aim to manage business change whilst maintaining stable services. And because the Co-op has been going through digital transformation and business change over the past few years, maintaining stable services whilst being able to make frequent changes in an agile model has been hugely important.

Adapting traditional processes for an agile environment

ITIL processes were created before working in an agile way was commonplace, and the Co-op service management policies and processes were originally written for traditional, on-premises, waterfall applications. So recently, the Co-op Digital IT service management team have been adapting them so they’re better suited to our fast-paced, cloud-hosted, agile world.

Here are some of the ways we’ve been working innovatively.

Working collaboratively (especially when things go wrong)

Typically, development teams are separate from the IT service management teams who operate live services. But we’ve been involving development teams in operating live services. For example, our monitoring systems continually check the health of our services, and we’ve set up alerts so that when something breaks, the problem is automatically posted into an incident chat room. We’ve made these chat rooms visible to the whole Digital team. This way, the wider team can swarm on fixing the problem.
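To illustrate the idea of posting alerts into a chat room, here is a minimal sketch in Python. It is not our actual setup: the service name, the message wording and the webhook URL are all hypothetical, and it assumes a chat tool that accepts a JSON body with a `text` field on an incoming webhook, which you would need to check for your own tool.

```python
import json
import urllib.request


def build_alert_message(service: str, check: str, status: str, detail: str) -> dict:
    """Build a chat message payload for a failed health check.

    The {"text": ...} shape is an assumption about the chat tool's webhook format.
    """
    return {
        "text": f"[{status.upper()}] {service}: {check} check failed. {detail}"
    }


def post_to_chat(webhook_url: str, message: dict, timeout: float = 5.0) -> int:
    """POST the message as JSON to an incoming-webhook URL and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Hypothetical example values, for illustration only.
message = build_alert_message(
    service="example-service",
    check="homepage",
    status="critical",
    detail="HTTP 503 from the load balancer",
)
print(message["text"])
```

A monitoring system would call `post_to_chat` with the incident room's webhook URL whenever a check fails, so the alert lands where the whole team can see it.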

We also review incidents together for 2 reasons:

  1. To make sure we’re continually improving by preventing recurring issues.
  2. To give new colleagues training guides so they can learn from past mistakes.

Creating patterns to make things more efficient

We created patterns for how we build and support infrastructure, how we deploy, and how we manage availability and change. Every service follows the same patterns and is scaled appropriately for its size.

Patterns make getting a service live for the first time simpler and quicker. When a service needs something different, we can fully concentrate on those areas rather than trying to reinvent the more basic, standard things. Before we put patterns in place, teams would often hit a wall just as they were planning to launch because they hadn’t sufficiently considered all the security and operational needs they had to satisfy. Now, our digital teams can take learnings from an alpha, and create the application and infrastructure for a production-ready service within months.

So far, so good

We’re now consistently doing 5-10 releases a day without service outages. We display our alerting and monitoring in the open so we’re transparent about our weak points, and we share our post-incident reviews widely so everyone can learn from our mistakes.

As a result, we’ve seen improved uptime, which typically stays at or above 99.95%, and a change failure rate of less than 1%. We’re also catching more issues proactively, all while supporting an increased number of services with the same size team.
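To put those figures in context, here is a rough worked example in Python. The function names and the release counts are made up for illustration, and the 30-day period is an assumption about how uptime is measured. At 99.95% uptime, a service can be unavailable for about 21.6 minutes in a 30-day period.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: int = 30) -> float:
    """Minutes of downtime that still meet the uptime target over the period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100.0 - uptime_percent) / 100.0


def change_failure_rate(failed_changes: int, total_changes: int) -> float:
    """Percentage of changes that failed; 0.0 when no changes were made."""
    if total_changes == 0:
        return 0.0
    return 100.0 * failed_changes / total_changes


# 99.95% uptime over an assumed 30-day period.
print(round(allowed_downtime_minutes(99.95), 1))

# Hypothetical month: 300 releases, 2 of which failed.
print(round(change_failure_rate(2, 300), 2))
```

With these made-up numbers, 2 failed releases out of 300 gives a change failure rate of about 0.67%, which is under the 1% mark.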

A reasonable amount of governance

As product teams take on more responsibility for managing their own services, our role as a service team is shifting from being the gatekeepers of production, to making sure we have great processes and governance in place.

We’re giving teams the tools they need to manage changes and incidents themselves, which saves time. Our aim is to create processes that are supported by tools, as well as automation that makes sure the appropriate governance is being done, rather than relying on people to do repetitive admin tasks. And as we try new tools and techniques, we’re sharing these with the rest of the Co-op IT teams, as well as here on our Digital blog, so that they can build on what we’ve learnt.
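As a sketch of what automated governance could look like, the Python below checks a change record before release instead of asking a person to tick boxes. Everything here is hypothetical: the `ChangeRecord` fields and the rules are illustrative assumptions, not our real change process.

```python
from dataclasses import dataclass


@dataclass
class ChangeRecord:
    """A hypothetical change record; the fields are assumptions for illustration."""
    description: str
    tests_passed: bool
    peer_reviewed: bool
    rollback_plan: str


def governance_issues(change: ChangeRecord) -> list[str]:
    """Return the governance checks this change fails; an empty list means it can proceed."""
    issues = []
    if not change.description.strip():
        issues.append("missing description")
    if not change.tests_passed:
        issues.append("automated tests have not passed")
    if not change.peer_reviewed:
        issues.append("no peer review recorded")
    if not change.rollback_plan.strip():
        issues.append("missing rollback plan")
    return issues


# Hypothetical change that has skipped peer review and has no rollback plan.
change = ChangeRecord(
    description="Update homepage banner",
    tests_passed=True,
    peer_reviewed=False,
    rollback_plan="",
)
for issue in governance_issues(change):
    print(issue)
```

A check like this could run in a deployment pipeline, so the governance happens on every change without anyone doing the admin by hand.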

Michaela Kurkiewicz
Principal Service Manager

Go big or go home

Hello. I’m Dave Johnson, Director of Digital Engineering at the Co-op and one of the people Mike mentioned here. I’m @davej_leeds on Twitter.

30 days ago, Mike, Mat, Tom, Russell and Ben joined the Co-op to lead the creation of fantastic digital services with Co-op values. Our team of colleagues across the Co-op have been busy…the first 30 days looked like this…

  • Launch Co-op Digital Blog on day 1…done
  • Launch Slack across Digital Strategy teams…done
  • Lots more Slack channels in the pipeline…
  • Purchase team boards and post-its
  • Purchase more team boards and post-its…done
  • Bring the internet inside the Co-op — ambitious for the first 30 days but progress…social media now accessible…more to do
  • Bring Co-op values to the internet — already there, build on that…

  • Tool up……Jira…Confluence…GitHub…Jenkins…Ansible…Terraform…done!
  • Developer builds for MacBooks…done. You can read about Lee Murray’s experience of joining the Co-op digital team here
  • Buy some computers…
    • AWS – done
    • Azure – researching
    • Heroku – done
  • Digital platform sprints * 2
  • Show & tells * 2

  • Product roadmaps * 1
  • Platform and infrastructure backlogs * 2
  • Interviews with world class digital talent * 2
    • Job offers made 2, accepted 2
  • Team drinks after work * 2

Day 31 –

dave blog 3

Our job is to lead and build a world-class digital engineering team at the Co-op, and then build fantastic new digital services with Co-op values. We’re starting at the heart with Membership, and we’ll grow quickly. We’ll need the best digital talent to do that.

Here at the Co-op, things are changing. We’ve learnt. We’ve grown. We’ve invested in the things that matter to our customers. We’ve invested in Digital and we are working to make the Co-op a world-leading digital player. Our ambition is bold: to re-create the Co-op for a digital era, and demonstrate a different way of doing business for an increasingly connected community. With this transformation comes a lot of opportunity. As a result, we are looking to recruit a number of talented people in the Digital space. Developers, Engineers, Architects, Business Analysts, User Researchers and Product Managers are just a few of the positions available within our growing and innovative team.

If you want to get involved by playing a key role within our digital transformation, please get in touch with Polly Haslam today.