The art of roadmap

I enjoy planning because, when planning, anything seems possible. On the other hand, I don’t like plans because they represent the clash between our expectations and reality. To be precise, what I really don’t like is plan adherence, especially when it is used as a measure of success.


Plans are nothing; planning is everything.

Dwight D. Eisenhower

When I joined Deeper Insights, the first step was to learn the way of working and the way of managing work; we can call it company culture. Once I understood what my mission was, I was ready to start planning, so I created the first draft of the roadmap, in this context a “DevOps roadmap”.

The roadmap serves me as a conversation starter with other people in the company. I like to think that by providing a roadmap, I at least give other people something to disagree with, and that ignites a conversation.

Roadmap chapters

Next I will explore the chapters that I think should be part of the roadmap. Don’t forget that this was done in a specific context at a given moment.

Chapter 0: General ideas

In this chapter the goal is to assess the basic ideas around the company vision, mission, and values.

  • What kind of business are we building? What makes us special?
  • How do we define goals? How do we achieve those goals?
  • What kind of work do we do? What kind of work do we not do?

Chapter 1: Mapping the business

What does our business model look like? The goal is to understand how we build and sell our knowledge and artifacts.

  • Are we driven by projects?
  • Are we driven by product?
  • Are we driven by projects and product?

Chapter 2: Making work visible

This chapter is dedicated to the work that we as a team need to execute and manage (at least priority and scope). And since we work in a collaborative environment, the team needs to agree on the communication contract that will be used to make the work (and the state of that work) visible to everyone on the team (and outside of the team).

  • Select the tool that will support your dashboard
    • This selection is normally influenced by the agile methodology in use, if any, such as Scrum or Kanban;
    • Define the columns/states: To Do, Doing, Done;
      • One of my favorite configurations: To Do, Doing, Ready To Review, Reviewing, Ready To Test, Testing, Ready To Deliver, Done.
    • Select one of the approaches: pull or push
      • In the push approach you push work to the next state (for example, when you have finished the development stage of a work item, you push that work item from Doing to Testing). In this case the work is being pushed from the left side of the value stream.
      • In the pull approach (my personal preference since I’m a TPS enthusiast) you pull work from the previous state (for example, when you have a work item in Ready To Test you can pull it to Testing). In this case the work is being pulled from the right side of the value stream. This approach implies the existence of columns like “Ready To Test” or “Ready To Deliver”. I call them cache columns because they make visible the work that is waiting (aka in stock). So, if we have a lot of work items in these columns, it will be clear that we have a problem to fix.
    • Define the team events that will keep the work visualization up to date
      • Daily meeting, planning meeting and retrospective meeting.
  • Identify the nature of demand. The team should know the different types of work that it needs to handle.
    • Business requirements (aka features);
    • Maintenance (dependency upgrades, database index rebuilds, …)
    • Unplanned work (bugs, production chaos, …)
    • Improvements (architecture, automation, technical debt)
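
The pull approach and the cache columns described above can be sketched in a few lines of Python. This is only an illustration, not a description of any particular board tool: the column names come from the configuration listed earlier, while the Board class, its methods, and the choice of which columns count as cache columns are assumptions made for this example. For simplicity, every move is modeled as a pull from the column immediately to the left.

```python
# Column names follow the configuration mentioned in the text.
COLUMNS = [
    "To Do", "Doing", "Ready To Review", "Reviewing",
    "Ready To Test", "Testing", "Ready To Deliver", "Done",
]
# Assumption of this sketch: the "Ready To ..." columns are the cache columns.
CACHE_COLUMNS = {"Ready To Review", "Ready To Test", "Ready To Deliver"}


class Board:
    """Hypothetical in-memory board where work is always pulled, never pushed."""

    def __init__(self):
        self.columns = {name: [] for name in COLUMNS}

    def add(self, item):
        # New work always enters on the left side of the value stream.
        self.columns["To Do"].append(item)

    def pull(self, into):
        """Pull the oldest item from the previous column into `into`."""
        index = COLUMNS.index(into)
        if index == 0:
            raise ValueError("nothing comes before the first column")
        previous = COLUMNS[index - 1]
        if not self.columns[previous]:
            return None  # nothing is waiting, so there is nothing to pull
        item = self.columns[previous].pop(0)
        self.columns[into].append(item)
        return item

    def waiting(self):
        """Number of items sitting in each cache column (work in stock)."""
        return {name: len(self.columns[name])
                for name in COLUMNS if name in CACHE_COLUMNS}

    def bottlenecks(self, limit):
        """Cache columns holding more than `limit` items."""
        return [name for name, count in self.waiting().items() if count > limit]


board = Board()
for item in ["A", "B", "C"]:
    board.add(item)
for _ in range(3):
    for column in COLUMNS[1:5]:  # Doing .. Ready To Test
        board.pull(column)
print(board.bottlenecks(limit=2))  # ['Ready To Test']
```

With three items piled up in “Ready To Test” and nobody pulling them into Testing, the cache column itself shows where the problem is.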

Chapter 3: Architecture

  • Architecture diagrams
    • These must be created for all relevant systems and should include: applications, infrastructure and databases.
  • Architecture path
    • Definition of an exploration path with the goal of researching new approaches and identifying potential improvements. Some examples:
      • Microservices
      • Event-driven
      • Kubernetes
      • Service Mesh

Chapter 4: Quality Assurance

The purpose of this chapter is to assess and align the organization’s understanding around Quality Assurance.

  • Define a QA manifest: definition of quality created by the organization
    • What’s the minimum acceptable quality level?
    • What’s included in the quality definition? Is the documentation included?
    • What kind of tests should be considered? Unit tests, Integration Tests, End-to-end Tests?
    • What should the test coverage be? Should this number be the same in every context?
  • Select tools to support the test stage based on the following capabilities:
    • Automation – deployment pipeline integration;
    • Static / Dynamic code analysis.

Chapter 5: Security

The previous chapter is definitely a good contribution to security (through code analysis).

  • Password management: it’s necessary to define a process and/or select a tool to manage passwords that will be used by:
    • Humans
    • Systems
  • Data protection and auditing
    • This is a big topic since data is one of the biggest assets of an organization, and because of that we need to be able to answer some questions like:
      • What kind of data is stored? (data classification)
      • What kind of data is processed?
      • Where is the data stored?
      • Where is the data processed?
      • Who has access to the data and what kind of access?
      • Are backups made of the data?
      • Where are the backups stored and for how long?
    • An important reminder is that the answers to all these questions need to be auditable. In other words, we need to be able to demonstrate our answers.
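
One way to keep the answers to these questions reviewable is to record them as structured data, so that missing answers become visible. The sketch below is a minimal illustration under that assumption: the DataAsset record, its field names, and the example values are all hypothetical and only mirror the questions listed above.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class DataAsset:
    """Hypothetical inventory record; each field mirrors one question above."""
    name: str
    classification: str = ""        # what kind of data is stored
    storage_location: str = ""      # where the data is stored
    processing_location: str = ""   # where the data is processed
    access: dict = field(default_factory=dict)   # who -> kind of access
    backup_location: str = ""       # where the backups are stored
    backup_retention_days: Optional[int] = None  # for how long


def unanswered(asset):
    """Return the names of the fields that still have no answer."""
    missing = []
    for f in fields(asset):
        value = getattr(asset, f.name)
        if value is None or value == "" or value == {}:
            missing.append(f.name)
    return missing


# Example values are invented for illustration only.
asset = DataAsset(
    name="customer_emails",
    classification="personal",
    storage_location="primary database",
    access={"support": "read"},
)
print(unanswered(asset))
# ['processing_location', 'backup_location', 'backup_retention_days']
```

A record like this does not make the answers true, but it gives an auditor a single place to check and shows which questions are still open.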

Chapter 6: Monitoring

Monitoring is a very important capability that allows us to observe the behavior of our systems. Without it, we would be blind regarding our applications.

Things that can be monitored:

  • Applications
  • Infrastructure
  • Databases
  • Deployment pipeline
  • Business KPIs

In most cases organizations have a monitoring platform made of a set of tools and not just a single tool. This happens because, so far, I have never seen a single monitoring tool capable of covering all the different monitoring contexts and needs of an organization.

A monitoring platform normally has the following components:

  • Data collection
    • Uptime
    • Log collection
    • Metrics
    • Tracing
  • Data storage
  • Visualization (analytics and reporting)
  • Alerting
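
To show how these components fit together, here is a deliberately tiny sketch in which each component is reduced to a few lines. It is not how any real monitoring tool works: the Sample and Store classes, the function names, the latency_ms metric, and the threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Sample:
    """Data collection: one measurement taken from one source."""
    source: str
    metric: str
    value: float


class Store:
    """Data storage: an in-memory list stands in for a real database."""

    def __init__(self):
        self._samples = []

    def write(self, sample):
        self._samples.append(sample)

    def query(self, metric):
        return [s.value for s in self._samples if s.metric == metric]


def summarize(store, metric):
    """Visualization: the numbers a dashboard panel would display."""
    values = store.query(metric)
    if not values:
        return {"count": 0}
    return {"count": len(values), "mean": mean(values), "max": max(values)}


def check_alert(store, metric, threshold):
    """Alerting: fire when the latest value is above the threshold."""
    values = store.query(metric)
    return bool(values) and values[-1] > threshold


store = Store()
for value in (120.0, 180.0, 300.0):
    store.write(Sample(source="api", metric="latency_ms", value=value))
print(summarize(store, "latency_ms"))
print(check_alert(store, "latency_ms", threshold=250.0))  # True
```

In practice each of these functions is usually a separate tool, which is one reason a monitoring platform ends up being a set of tools.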

Chapter 7: Infrastructure provisioning and configuration

The infrastructure provisioning chapter contains two main concerns:

  • Infrastructure provisioning
    • How can the infrastructure be provided in a standardized and reliable way?
    • How to deal with vendor lock-in?
    • How to provide a development environment as close as possible to the production environment?
  • Infrastructure cost control
    • We need to monitor the cost of the infrastructure in order to avoid unpleasant surprises at the end of the month.

Chapter 8: Package Management

  • Identify the different artifacts produced by the organization. For example:
    • Docker container
    • Python package
    • Node package
  • Identify the usage of each artifact:
    • Release (an application or a service)
    • Dependency (a library used to build other libraries or an application)
  • Select the package manager for each artifact.

Chapter 9: Change Log and Release Notes

This chapter covers how software changes are communicated internally and to the customer.

  • Versioning
    • The team (probably the entire organization) needs to agree on the versioning scheme to be used.
    • Regardless of the selected versioning scheme, a specific version (for example 1.0.1) must have the same meaning in all the environments;
  • Changelog
    • A changelog is a file that contains a chronologically ordered list of relevant changes for each version of an application. The team/organization needs to define how this file should be built.
    • The information contained in the changelog tends to have an internal perspective on what changed in the application. However, done right, the changelog can be used to build the release notes.
  • Release Notes
    • The release notes are the list of changes in a certain version of the application, produced for the customer. In other words, they are the list of changes from the business perspective.

In the end, the combination of the three points above should make it possible to follow and track information from the requirements definition, through code development and the build process with its resulting artifact, to the deployment of that artifact.
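
As an illustration of how a changelog “done right” can feed the release notes, here is a minimal sketch. It assumes that each changelog entry is tagged with its version and with a flag saying whether the customer should hear about it. The Change record, its fields, and the example entries are hypothetical and do not follow any standard changelog format.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """Hypothetical changelog entry, written from the internal perspective."""
    version: str
    description: str              # what changed, in the team's words
    customer_facing: bool         # should this appear in the release notes?
    customer_summary: str = ""    # the same change, in business terms


def release_notes(changelog, version):
    """Build the customer-facing notes for one version from the changelog."""
    lines = [f"Release {version}"]
    for change in changelog:
        if change.version == version and change.customer_facing:
            lines.append(f"- {change.customer_summary or change.description}")
    return "\n".join(lines)


# Example entries are invented for illustration only.
changelog = [
    Change("1.0.0", "Initial release", True),
    Change("1.0.1", "Add index on search table", True, "Faster search results"),
    Change("1.0.1", "Upgrade build image", False),
]
print(release_notes(changelog, "1.0.1"))
```

The internal entry about the build image stays in the changelog, while the customer only sees the change described in business terms.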

Chapter 10: Deployment Pipeline

  • Deployment pipeline as a product covering all stages: code, test, build, deploy
    • Pipeline as code (not the same as GitOps)
    • It tries to answer the question: if the deployment pipeline (or a part of it) is destroyed, what can be done to recover?
  • Deployment pipeline as a platform
    • Automation of the interaction/usage with the deployment pipeline
      • CLIs, APIs, Bots, …

Chapter 11: MLOps

This chapter is dedicated to the operational capabilities to deal with the Machine Learning stages:

  • Train
  • Experiment and evaluate
  • Productize
  • Test
  • Deploy
  • Monitor

The implementation of this set of capabilities is an effort that is 50% technical and 50% cultural, since it involves two different disciplines: Machine Learning and Operations.
