Brownfield DevOps


Old stuff, new way of building

Folks put a lot of energy into defining DevOps, so rather than hope we all happen to prefer the same definition, here is the one used throughout this work:

DevOps is a set of patterns for merging traditionally separated concerns: specifically, the developers who write/build the software and the ops who deploy/run it.

The DevOps Topologies page is my favorite for describing the organizational approaches that folks attempt. In these works, the focus will be on incrementing toward Type 1, “Dev and Ops Collaboration,” with the ultimate goal of Type 2, “Fully Shared Ops Responsibilities.”

The DevOps Topologies page goes on to describe more patterns with specific folks filling external roles, adding in DBAs, SREs, or what have you. Those database issues and rigid policies will come up time and time again, but all of those patterns are suboptimal. Still, they are better than nothing, and this whole project is about incremental improvements.

Why Coder?

Coder Enterprise, Code-server, and other software are an easy on-ramp to a containerized developer experience. This new way of working provides a more production-like environment for creating software. It allows all construction activity to be offloaded from the corporate network and devices. Sharing resources and dependencies and creating cloud native applications becomes much easier for organizations that use Coder. Other similar software packages (containerized IDEs) provide some of these facets but lack the compliance and assurance aspects that make Coder unique in this space.

Developer Experience

There’s growing momentum behind making the Developer Experience good for the sake of talent retention. With everybody going remote, a lot more people can work for a lot more companies than in 2019. Companies that focus on making it enjoyable to write their software will have a much better time bringing in that newly available talent.

Operations Experience

There’s a flip side to DevEx: DevOps creates a lot more bleed-over from the operations side to the developers. This includes things like being on call, digging through (and fixing) production issues, etc. This is a slowly changing dynamic, but the better Observability gets, the more product and engineering teams will demand this freedom and take on the accompanying obligations.

Security and Compliance Experience

There’s an undercurrent of trying to keep quality high, specifically focused on security controls and policies that meet various requirements. When people think DevSecOps, they usually focus on adding security scans into pipelines. Security and compliance signals should also feed into the Developer and Operations experiences of the people responsible for the systems. Using observability tools to keep an eye on whether security controls are met is a billion times better than a pipeline scan or linter.
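
As a hedged illustration of what that could look like, here is a minimal sketch that checks a certificate-expiry control against a Prometheus-compatible metrics endpoint instead of relying on a pipeline scan. The endpoint URL is a placeholder, and the metric assumes something like blackbox_exporter is already scraping the service.

```python
"""A minimal sketch of verifying a security control through observability data
rather than a pipeline scan. Assumes a Prometheus-compatible metrics endpoint
and a blackbox_exporter-style metric; adapt to whatever your stack exposes."""
import sys

import requests

PROMETHEUS_URL = "http://prometheus.example.internal:9090"  # assumed endpoint

# Hypothetical control: no TLS certificate may expire within 14 days.
QUERY = "min(probe_ssl_earliest_cert_expiry - time())"


def check_cert_expiry(min_days: int = 14) -> bool:
    resp = requests.get(
        f"{PROMETHEUS_URL}/api/v1/query", params={"query": QUERY}, timeout=10
    )
    resp.raise_for_status()
    results = resp.json()["data"]["result"]
    if not results:
        return False  # no data is itself a finding worth alerting on
    seconds_left = float(results[0]["value"][1])
    return seconds_left > min_days * 86400


if __name__ == "__main__":
    ok = check_cert_expiry()
    print("cert-expiry control:", "PASS" if ok else "FAIL")
    sys.exit(0 if ok else 1)
```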

Why GitLab?

Note: Despite leaving GitLab, the below is all still entirely true.

GitLab’s tight integration of source code repository, CI, and environments/metrics makes it a great tool to enable the DevOps transformation. One of the biggest changes from a traditional project with a “transition to operations” phase is that the permissions model applied to the code and branches directly impacts the running system (at least as far as it can without CD). The policies around how code makes its way to production are a core component of the DevOps dynamics, and GitLab forces addressing that while providing some lightweight but effective constraints.
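
As one concrete (and hedged) example of encoding that policy, here is a minimal sketch using the python-gitlab library to protect the main branch so nothing reaches the production-deploying branch without a merge. The URL, token, and project path are placeholders; the numeric access levels are GitLab’s documented values (0 = no access, 40 = Maintainer).

```python
"""A minimal sketch of codifying the "how code reaches production" policy as a
protected-branch rule via the python-gitlab library. URL, token, and project
path are placeholders; access levels follow GitLab's API numbering."""
import gitlab

gl = gitlab.Gitlab("https://gitlab.example.com", private_token="REDACTED")
project = gl.projects.get("group/legacy-monolith")  # hypothetical project path

# Nobody pushes straight to main; Maintainers merge after review and pipeline.
project.protectedbranches.create(
    {
        "name": "main",
        "push_access_level": 0,    # 0 = no direct pushes
        "merge_access_level": 40,  # 40 = Maintainers only
    }
)
```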

DevOps Practices

Certain things may work better or worse for different projects, but generally the transformation will follow the patterns below:

1. Bad Containers Are Better Than No Containers

Containers are meant to be these magical units that can be moved around and recreated at will. There are a lot of dogmatic best practices for containers that people follow.

For this project, if there’s any way to containerize a system or a component of the system such that it still works, it should be done. The benefits of perfect containers are plentiful, yet the benefits of bad containers are still impressive.

They may not run in Kubernetes, but spinning up a 4 gigabyte container to run tests against is a much better pattern than overwriting files and managing a database.
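
A minimal sketch of that pattern, assuming a bulky legacy image already exists in a registry: a pytest fixture shells out to the Docker CLI, waits for the app to answer, and tears the container down afterward. The image name, port, and health endpoint are placeholders.

```python
"""A minimal sketch of the "bad container is still useful" idea: spin up a
bulky, imperfect image of the legacy system just long enough to run tests
against it. Image name, port, and health URL are placeholders."""
import subprocess
import time

import pytest
import requests

IMAGE = "registry.example.com/legacy-app:latest"  # hypothetical 4 GB image


@pytest.fixture(scope="session")
def legacy_app():
    # Run detached, publish the app port, and capture the container id.
    container_id = subprocess.run(
        ["docker", "run", "-d", "-p", "8080:8080", IMAGE],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    try:
        # Crude readiness loop; a bad container rarely has a proper health check.
        for _ in range(60):
            try:
                if requests.get("http://localhost:8080/health", timeout=2).ok:
                    break
            except requests.ConnectionError:
                pass
            time.sleep(2)
        yield "http://localhost:8080"
    finally:
        subprocess.run(["docker", "rm", "-f", container_id], check=False)


def test_homepage_renders(legacy_app):
    assert requests.get(f"{legacy_app}/", timeout=10).status_code == 200
```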

2. Deal with Data – ORM Workarounds

DevOps hand-waves away database interactions thanks to the popular trend of adding an Object-Relational Mapper (ORM) to software. Ruby on Rails uses ActiveRecord. ASP.NET has Entity Framework. Others exist for basically every stack.

Legacy applications don’t use ORMs because ORMs weren’t a thing when those applications were written. Entity Framework alleges that it can take over an existing schema and use it, but migrations get weird and there are a lot more unexpected bugs and performance issues. Don’t retrofit ORMs.

The data maturity scale looks like this (a sketch of step 3 follows the list):

  1. Database for staging system is kept clean by a team of people
  2. Database backup that gets restored into an environment-specific database instance
  3. Script that does the database restore when a fresh stage build is launched
  4. Database schema and test data seeding scripts
  5. Full ORM built into the app
  6. Data is farmed out to message queues or microservices (which may use ORMs)
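
As a sketch of step 3, assuming PostgreSQL and a pg_dump custom-format backup, a fresh stage build could run something like the following; the backup path and database name are placeholders.

```python
"""A minimal sketch of step 3 on the scale: restore a known backup into an
environment-specific database whenever a fresh stage environment is launched.
Assumes PostgreSQL client tools on PATH; names are placeholders."""
import os
import subprocess

BACKUP_PATH = "/backups/stage-seed.dump"           # assumed backup artifact
DB_NAME = os.environ.get("STAGE_DB", "app_stage")  # environment-specific DB


def restore_stage_database() -> None:
    # Drop and recreate so every stage build starts from the same data.
    subprocess.run(["dropdb", "--if-exists", DB_NAME], check=True)
    subprocess.run(["createdb", DB_NAME], check=True)
    subprocess.run(
        ["pg_restore", "--no-owner", "--dbname", DB_NAME, BACKUP_PATH],
        check=True,
    )


if __name__ == "__main__":
    restore_stage_database()
```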

3. Measure Whatever ASAP

DevOps heavily relies on instrumentation and centralized logging to support microservices. One of the good things about a monolith is that the logs are all inside the monolith, so there’s less to deal with. The bad news is that logging was probably not consistent across the teams that built it. Enabling “debug” logs on a monolith can have severe consequences for performance as well…

The recommendation here is to take an inventory of what is logged and see the delta between what is logged now and what needs to be logged to make reasonable assertions about the performance and security of the running system.
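
A minimal sketch of that inventory, assuming a Java monolith with log4j/SLF4J-style logger calls: walk the source tree, tally log statements by level, and compare the result against what the running system actually needs. The file glob and regex are assumptions to adjust for your stack.

```python
"""A minimal sketch of a logging inventory: tally log statements by level so
the delta between "what is logged" and "what needs to be logged" is visible.
Patterns assume log4j/SLF4J-style calls in a Java monolith."""
import re
from collections import Counter
from pathlib import Path

LOG_CALL = re.compile(r"\blog(?:ger)?\.(trace|debug|info|warn|error|fatal)\(", re.I)


def inventory(src_root: str) -> Counter:
    counts: Counter = Counter()
    for path in Path(src_root).rglob("*.java"):
        for match in LOG_CALL.finditer(path.read_text(errors="ignore")):
            counts[match.group(1).lower()] += 1
    return counts


if __name__ == "__main__":
    for level, count in inventory("src/").most_common():
        print(f"{level:>6}: {count}")
```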

Some technologies like Tomcat or Nginx may help with logging via configuration rather than requiring source code changes.

4. Decompose the Monolith

This one is never easy, and there are volumes of information written about various ways to do it. It’s tough to get the budget. It’s tough to justify the new bugs and business logic failures that would not have happened if the monolith had been left intact. It’s tough to try to fix things in a system that was supposed to be turned off 12 years ago.

There will be deeper dives into a lot of these topics. The overall theme is to take functions from the monolith and put them in something else, even if it’s the same code in the same language in a smaller application without a bunch of other chunks. Having it isolated makes it easier to instrument and, most importantly, means a new team member joining will actually be able to understand it.
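
A hedged sketch of the smallest possible version of that move: lift one function out of the monolith (a made-up shipping-cost calculation here) and wrap it in a tiny Flask service so it can be instrumented and understood on its own.

```python
"""A minimal sketch of extracting one function from the monolith into a small,
separately deployable service. The shipping-cost calculation is a hypothetical
stand-in for "the same code in the same language, just isolated"."""
from flask import Flask, jsonify, request

app = Flask(__name__)


def calculate_shipping(weight_kg: float, express: bool) -> float:
    # Imagine this body copied verbatim out of the monolith.
    base = 4.99 + 1.25 * weight_kg
    return round(base * (1.8 if express else 1.0), 2)


@app.route("/shipping", methods=["POST"])
def shipping():
    payload = request.get_json(force=True)
    cost = calculate_shipping(
        float(payload["weight_kg"]), bool(payload.get("express"))
    )
    return jsonify({"cost": cost})


if __name__ == "__main__":
    app.run(port=5001)
```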

5. Dependencies

One workflow that comes from old architecture is to have dependency classes that are consumed by a project but are included in the same code repo or cloned into the IDE. This creates tight coupling between the application and the dependency, and prevents reuse and sharing.

Managing dependencies in a shared group allows multiple stakeholders to approve changes to the code. This creates an environment where the dependencies are improved for all users and changes that would negatively impact a group are discussed and mitigated.

Along with moving the dependencies out, semantic versioning or some other versioning approach should be used. The way semantic versioning allows consumers to make assumptions about the scope of a change from one version to the next is very helpful. It also allows vulnerability management systems to track down usage of an out-of-date dependency for updates if security issues are found and fixed.
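
A minimal sketch of what publishing such a shared dependency could look like in Python, with a semantic version and compatible-release pins; the package name and versions are placeholders. A consumer would then depend on something like shared-billing~=1.4 to pick up patch and minor releases without silently crossing a breaking 2.x boundary.

```python
"""A minimal sketch of a shared dependency packaged with a semantic version so
consumers can reason about the scope of upgrades. Package name, version, and
pins are placeholders for illustration only."""
from setuptools import find_packages, setup

setup(
    name="shared-billing",          # hypothetical shared dependency
    version="1.4.2",                # MAJOR.MINOR.PATCH per semantic versioning
    packages=find_packages(),
    python_requires=">=3.8",
    install_requires=[
        "requests~=2.31",           # compatible-release pin for transitive deps
    ],
)
```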