Category: Technology

How to Manage Technical Debt in 2023: A Guide for Leadership

How to Manage Technical Debt in 2023: A Guide for Leadership

In this article, I will summarise effective strategies and best practices to tackle tech debt head-on.

Technical debt is an inevitable reality in software development. But, it can be leveraged just like a financial loan/debt can help you achieve your goals, if managed properly.
It can be used to drive competitive advantage by allowing companies to launch new products and features faster, experiment with new technologies, and improve the scalability and performance of their systems. However, like all loans, it need to be “Repaid” properly and at the right time, failing on it will create a downward spiral.

If you’re not careful, technical debt can quickly become a major burden that slows down development and makes it difficult to add new features or even fix bugs in a timely manner.

We will discuss how to identify technical debt and the signs of poorly managed debt, and then provide a strategy for reducing it. We will also discuss what a healthy level of technical debt looks like and how leaders can use it to their advantage.

Good Tech Debt Vs Bad Tech Debt

Robert Kiyosaki, the author of Rich Dad Poor Dad, famously said:

Bad debt takes money out of your pocket, while good debt puts money in your pocket.

– Robert Kiyosaki

The same is true of tech debt.

Technical debt is the cost of not doing things the right way the first time. Good technical debt is accrued when you make trade-offs to meet deadlines or deliver new features quickly. Bad technical debt is accrued when you make poor decisions or cut corners.

Bad tech debt will probably make your PMs, Sales and CEO happy for a quarter or two. But after that, they will be asking why everything is behind schedule and dealing with customer complaints because things aren’t working properly.

Now that I have presented the obvious in a familiar “Quadrant”, you can actually skip the terminologies and definitions part of this article! 😀

For my verbal brethren, Which is the Tech Debt you’d need to ruthlessly hunt down to extinction? Obviously, it is the untracked, undocumented ones. And the ones which are dragging your team on a downward spiral (immaterial of whether it is tracked or not)

Why does your Tech Debt keep accumulating?

Before we can think about building a strategy to solve tech debt, we need to understand how it gets out of control in the first place.

It’s called “impact visibility”.

Fixing code debt issues is impossible if:

1, You’ve no record of what technical debt issues you have

2, You’ve got a backlog, but you can’t see which issues are related to what code

In both cases, you can’t prioritise tech debt over shipping new features.

We need to get more granular about what impacts these two tech debt cases above.

  • Issue invisibility — There’s no source of shared knowledge. Codebase health info is locked in (few) engineers’ heads.
  • No code quality culture — Shipping fast, whatever the cost, like it’s going out of fashion.
  • Poor process — Tech debt work sucks. Nobody likes creating Jira tickets. “Jira” has become a dirty word.
  • Low-time investment — Justifying the time to fix tech debt or to refactor is a constant uphill battle. After a point, engineers become silent!

Lack of context — Issues in Jira are a world away from the hard reality of the codebase. They’re not related in any way.

So what’s the source of this? Let’s talk strategy.

Spoiler… It’s about changing organisational culture and developer behaviour to track issues properly.

Creating a strategy to reduce technical debt

Track. Issues. Properly.

Good tech debt management starts with team-wide excellence at tracking issues.

You can’t have a tech debt strategy without tracking.

The engineering leader’s job is to make that “issue tracking” easy for your team. There is supposed to be a software for that – Jira, Asana, Rally or something of that sort.

The problem is, I’ve never believed they really get to the bottom of the problem, and after speaking with scores of engineers and leaders about it, they usually don’t either. My personal belief is most companies suffer on the velocity after their Jira rollout! It is a bit like,

No two countries that both have a McDonald’s have ever fought a war against each other.

Thomas L. Friedman – in The Lexus and the Olive Tree!

As a leader, You need to find a way to…

  • Show engineers when they’re working on code with tech debt, without them having to jump thru 3 hoops.
  • Make it really easy for team members to report tech debt.
  • Create a natural way to discuss codebase issues.
  • Integrate tech debt work into your workflows and involve PMs if required.

There are multiple ways to achieve this, the easiest is to not address it. Ie: not address it intentionally, just tweak your existing pipeline. This can be done by,

  • A very robust linting & integration to the IDE
  • Tighter Git rules for commits
  • SAST which runs on the pipeline
  • and can feed into the IDE

Prioritising impactful tech debt

At this point, it should be obvious, but prioritising the right issues is only possible if you’re tracking the impact of these “issues” and it could be direct or indirect (Dependency, Sequencing, Rework avoidance etc) .

Once you’ve got them, you should regularly and consistently use them to decide what to address. This usually happens during the backlog grooming or sprint planning sessions. But, this decision-making process needs to be strategic. Not at all tactical, ie: DO NOT delegate it to the whims and whimsicals of your TL/PM or even EM.

You or someone with a context of the organisation and position on sales, clients, revenue etc., should be doing this.

A good way to start is by choosing a theme each time you prioritise issues. For example, you could prioritise issues that…

  • Are impacting a specific feature you need to work on in the next quarter
  • Are impacting the customer’s UX
  • Are affecting efficiency/morale on the team
  • Are impacting the security posture

This is often straightforward if you’ve got high-quality issues that traceable to code and tagged as such.

Most people wonder how to get the time for these “Tasks”. I have two recommendations.

  • Take an entire sprint every quarter to repay the tech debt (Will need high-level buy-in, It is slightly harder to align your CXOs)
  • Allocate 15-20% of bandwidth in every sprint. (Easier to achieve buy-in from CXOs, harder to drive with engineers)

Engineers generally won’t prioritise tech debt work by themselves because of the conflict of interest/pressure of shipping fast. This was evident from multiple high velocity/impact software engineering teams including ones at AirBnb, Netflix and Spotify. A commitment to code refactoring and maintenance work should be endorsed and supported from the top and reinforced regularly.

How much Tech Debt can you take on?

Managing technical debt is like managing financial debt. You can use it to your advantage, but you need to be careful not to let it get out of control.

Your technical debt budget is the amount of technical debt that you are willing to take on in order to achieve your business goals. You should not try to solve all of your technical debt at once, but instead focus on the most important items.

Prudent technical debt is debt that you take on deliberately and knowingly, in order to achieve a specific goal. For example, you might take on technical debt to launch a new product quickly, or to add a new feature that is in high demand by your customers.

If you manage your technical debt properly, it can be a powerful tool for gaining a competitive advantage. However, if you let your technical debt get out of control, it can lead to serious problems, such as increased costs, delays, and security vulnerabilities.

Concluding remarks:

Technical debt is one of the most neglected areas of software development. It is often only given priority when it is too late and has already caused serious problems.

However, when leaders work together and develop a consistent and process-driven strategy, technical debt can be effectively managed.

The best engineering teams are constantly thinking about how to use their technical debt budget to their advantage.

References and Further Reading:

Can ChatGPT accelerate No-Ops to Replace DevOps in EarlyStage Startups?

Can ChatGPT accelerate No-Ops to Replace DevOps in EarlyStage Startups?

Background: I work with multiple CTOs and Heads of Engineering of early-stage startups to help them set up their engineering orgs, review their product architecture, help them prioritise their hiring etc. I also help multiple engineering leaders via Plato. Recently, there have been too many questions on whether can I use ChatGPT for this or that, but the most interesting one among these is “Can ChatGPT accelerate No-Ops to Replace DevOps?”. I have answered that multiple times in verbatim. But thought that writing it down will help me with two things, I can point them to the URL and I can also clearly structure my thoughts on this. So this is an attempt at that.  

Glossary First:

DevOps:

For the uninitiated, DevOps is a software development methodology that emphasizes collaboration and communication between developers and operations teams. The goal of DevOps is to increase the speed and quality of software delivery while reducing errors and downtime. DevOps engineers are responsible for the design, development, and delivery of software applications, as well as the management and maintenance of the underlying infrastructure including the pipelines, quality assurance automation etc.

No-ops:

No-ops, on the other hand, is a philosophy that aims to automate and simplify the operations side of software development. The goal of no-ops is to eliminate the need for manual intervention in the deployment and maintenance of applications, freeing up time for developers to focus on creating new features and fixing bugs.

ChatGPT:

ChatGPT is a language model developed by OpenAI that has the potential to revolutionize the way we interact with technology. It can perform a wide range of tasks, from answering questions to generating text and even a rudimentary bit of coding. In the context of DevOps, ChatGPT could be used to automate many of the manual tasks that DevOps engineers currently perform, such as infrastructure management, deployment, and monitoring.

Now, The question: Can ChatGPT accelerate No-Ops to Replace DevOps?

ChatGPT (or any of the other generative AIs)  has the potential to automate many of the manual tasks performed by DevOps teams, it could potentially replace the most mundane tasks that DevOps perform pretty soon. So, before writing this piece, I wanted to actually put my understanding to test.

I had to use Bard for this experiment, but I do not think there is going to be much of a difference in the outcomes.

Experiment 1: Beginner-Level Task

Writing a simple Autoscaling script to scale as per CPU and memory utilisation.

Simple AutoScaling Script, written by Bard

Experiment 2: Intermediate-Level Task

Creation of a new VPC with 3 autoscaling groups of EC2 and 1 NAT gateway and VPC peering.

Results were mixed.

Though Bard did give information that this can be tweaked, there was one major gap between the ask and the outcome. The NAT gateway was supposed to be the single point of ingress/egress, whereas the script is entirely different.

Assume an early-stage startup that gets used to the early success of Generative AI to preempt the DevOps culture, at some point the AI won’t have the context of what all is in your infrastructure and you could end up misfiring things.

My submission is, ChatGPT or Bard or any Generative can be super helpful for a good DevOps engineer and cannot replace her/him anytime soon. In military parlance, certain materiels are termed Force Multipliers. But, they themselves cannot be the force! (Aircraft carriers or Tanker aircraft are the prime example)

Why do I believe so?

There are several reasons for this:

  1. Human creativity: Despite its advanced capabilities, ChatGPT is just another AI model and lacks the creativity and innovation that a human DevOps engineer brings to the table. DevOps engineers can think outside the box and find new and innovative solutions to “Business” problems, whereas ChatGPT operates within the constraints of its programming and is simply solving the “Constraints” for that technical problem. 
  2. Human oversight: While ChatGPT can automate many tasks, it still requires human oversight to ensure that everything is running smoothly. DevOps play a crucial role in monitoring and troubleshooting any issues that may arise during the deployment and maintenance of applications.
  3. Complexity: Many DevOps tasks are complex (or at the very least, the DevOps teams would want us to believe that) and require a deep understanding of the underlying infrastructure and applications. ChatGPT does not yet have the capability to perform these tasks at the same level of expertise as a human.
  4. Customization: Every organization has unique requirements for its development and deployment process. ChatGPT may not be able to accommodate these specific needs, whereas DevOps engineers can tailor the process to meet the non-stated requirements and organisational and platform context
  5. Responsibility: DevOps engineers are ultimately responsible for ensuring the success of the development and deployment process. While ChatGPT can assist in automating tasks, it is not capable of assuming full responsibility for the outcome.

In conclusion, while ChatGPT has the potential to automate many manual tasks performed by DevOps engineers, it is unlikely to replace DevOps entirely in the near future. The role of DevOps will continue to be important in ensuring the smooth deployment and maintenance of software applications, while ChatGPT can assist in automating certain tasks and increasing efficiency. Ultimately, the goal of both DevOps and no-ops is to increase the speed and quality of software delivery, and the use of ChatGPT in DevOps can play a significant role in achieving this goal.

References:

https://humanitec.com/whitepapers/devops-benchmarking-study-2023

Is NoOps the End of DevOps?

Is NoOps the End of DevOps?

Some say that NoOps is the end of DevOps. Is that really true? If you need to answer this question, you must first understand NoOps better.

Things are moving at warp speed in the field of software development. You can subscribe to almost anything “as a service” be it storage, network, computing, or security. Cloud providers are also increasingly investing in their automation ecosystem. This leads us to NoOps, where you wouldn’t require an operations team to manage the lifecycle of your apps, because everything would be automated.

Picture Courtesy: GitHub Blog

You can use automation templates to provision your app components and automate component management, including provisioning, orchestration, deployments, maintenance, upgradation, patching and anything in between meaning significantly less overhead for you and minimal to no human interference. Does this sound wonderful? 

But is this a wise choice, and what are some advantages and challenges to implementing it?

Find out the answers to these questions, including whether NoOps is DevOps’s end in this article.

NoOps — Is It a Wise Choice?

You already know that DevOps aims to make app deployments faster and smoother, focusing on continuous improvement. NoOps — no operations — a term coined by Mike Gualtieri at Forrester, has the same goal at its core but without operations professionals!

In an ideal NoOps scenario, a developer never has to collaborate with a member of the operations team. Instead, NoOps uses serverless and PaaS to get the resources they need when they need them. This means that you can use a set of services and tools to securely deploy the required cloud components (including the infrastructure and code). Additionally, NoOps leverages a CI/CD pipeline for deployment. What is more, Ops teams are incredibly effective with data-related tasks, seeing data collection, analysis, and storage as a crucial part of their functions. However, keep in mind that you can automate most of your data collection tasks, but you can’t always get the same level of insights from automating this analysis.

Essentially, NoOps can act as a self-service model where a cloud provider becomes your ops department, automating the underlying infrastructure layer and removing the need for a team to manage it.

Many argue that a completely automated IT environment requiring zero human involvement — true NoOps — is unwise, or even impossible.

Maybe people are afraid of Skynet becoming self-aware!

NoOps vs. DevOps — Pros and Cons

DevOps emphasizes the collaboration between developers and the operations team, while NoOps emphasizes complete automation. Yet, they both try to achieve the same thing — accelerated GTM and a better software deployment process. However, there are both advantages and challenges when considering a DevOps vs. a true NoOps approach.

Pros

More automation, less maintenance

By automating everything using code, NoOps aims to eliminate the additional effort required to support your code’s ecosystem. This means that there will be no need for manual intervention, and every component will be more maintainable in the long run because it’ll be deployed as part of the code. But does this affect DevOps jobs?

Uses the full power of the cloud

There are a lot of new technologies that support extreme automation, including Container as a Service (CaaS) or Function as a Service (FaaS) as opposed to just Serverless, so most big cloud service providers can help you kickstart NoOps adoption. This is excellent news because Ops can ramp up cloud resources as much as necessary, leading to higher capacity, performance & availability planning compared to DevOps (where Dev and Ops work together to decide where the app can run).

Rapid Deployment Cycles

NoOps focuses on business outcomes by shifting focus to priority tasks that deliver value to customers and eliminating the dependency on the operations team, further reducing time-to-market.

Cons

You still need Ops!

In theory, not relying on an operations team to take care of your underlying infrastructure can sound like a dream. Practically, you may need them to monitor outcomes or take care of exceptions. Expecting developers to handle these responsibilities exclusively would take their focus away from delivering business outcomes and wouldn’t be advantageous considering NoOps benefits.

It also wouldn’t be in your best interest to rely solely on developers, as their skill sets don’t necessarily include addressing operational issues. Plus, you don’t want to further overwhelm devs with even more tasks.

Security, Compliance, Privacy

You could abide by security best practices and align them with automatic deployments all you want, but that won’t completely eliminate the need for you to take delicate care of security. Attack methods evolve and change each day, therefore, so should your cloud security controls.

For example, you could introduce the wrong rules for your AI or automate flawed processes, inviting errors in your automation or creating flawed scripts for hundreds or thousands of infrastructure components or servers. If you completely remove your Ops team, you may want to consider investing additional funds into a security team to ensure you’re instilling the best security and compliance methods for your environments.

Consider your environment

Considering NoOps uses serverless and PaaS to get resources, this could become a limiting factor for you, especially during a refactor or transformation. Automation is still possible with legacy infrastructures and hybrid deployments, but you can’t entirely eliminate human intervention in these cases. So remember that not all environments can transition to NoOps, therefore, you must carefully evaluate the pros and cons of switching.

So Is NoOps Really the End of DevOps?

TL:DR: NO!

Detail: NoOps is not a Panacea. It is limited to apps that fit into existing #serverless and #PaaS solutions. As someone who builds B2B SaaS applications for a living, I know that most enterprises still run on monolithic legacy apps and even some of the new-gen Unicorns are in the middle of Refactoring/Migration which will require total rewrites or massive updates to work in a PaaS environment, you’d still need someone to take care of operations even if there’s a single legacy system left behind.

In this sense, NoOps is still a way away from handling long-running apps that run specialized processes or production environments with demanding applications. Conversely, operations occur before production, so, with DevOps, operations work happens before code goes to production. Releases include monitoring, testing, bug fixes, security and policy checks on every commit, etc.

You must have everyone on the team (including key stakeholders) involved from the beginning to enable fast feedback and ensure automated controls and tasks are effective and correct. Continuous learning and improvement (a pillar of DevOps teams) shouldn’t only happen when things go wrong; instead, members must work together and collaboratively to problem-solve and improve systems and processes.

The Upside

Thankfully, NoOps fits within some DevOps ways. It’s focused on learning and improvement, uses new tools, ideas, and techniques developed through continuous and open collaboration, and NoOps solutions remove friction to increase the flow of valuable features through the pipeline. This means that NoOps is a successful extension of DevOps.

In other words, DevOps is forever, and NoOps is just the beginning of the innovations that can take place together with DevOps, so to say that NoOps is the end of DevOps would mean that there isn’t anything new to learn or improve.

Destination: NoOps

There’s quite a lot of groundwork involved for true NoOps — you need to choose between serverless or PaaS, and take configuration, component management, and security controls into consideration to get started. Even then, you may still have some loose ends — like legacy systems — that would take more time to transition (or that you can’t transition at all).

One thing is certain, though, DevOps isn’t going anywhere and automation won’t make Ops obsolete. However, as serverless automation evolves, you may have to consider a new approach for development and operations at some point. Thankfully, you have a lot of help, like automation tools and EaaS, to make your transition easier should you choose to switch.

Bitnami