What's wrong with software debugging? [Q&A]

We've seen a tidal wave of developer-enabling technologies over the last ten years. From DevOps, to CI/CD, to containers and microservices -- all of these best practices and technology patterns aim to speed up the process of shipping code from the developer into production.

But while software has become increasingly easy to package and deploy, the process of diagnosing and fixing bugs in production has become much more difficult. When services crash in the middle of the night, developers still find themselves in the world of logs, hotfixes and desperation -- but now with much greater surface area to investigate as applications span distributed systems.

We spoke with Ilan Peleg, co-founder and CEO at Tel Aviv-based Lightrun, to learn more about the growing complexity of software debugging, and what his company is doing to try to give developers better tooling.

BN: Why is software debugging today so much harder?

IP: Debuggers were built for single-instance applications: the piece of software running in production was a singular, complete entity, with all the pieces required for its proper operation neatly situated next to each other, in the same binary, on the user's machine.

But today applications are abstracted from the underlying hardware. This is great for scalability but transforms the world we work in into one of vastly distributed systems, which makes debugging significantly more difficult. It's now not just hard to understand what is going on, but where it is going on too. It's hard to locate that single piece of binary code that’s misbehaving.

A microservices-based application is spread over many virtualized hosts, each possessing a replica (or multiple replicas) of the services. While containers make it easy to package and distribute software, it becomes very hard to understand not only which service is the problem when there is a bug, but also which replica of the service is the problem.

BN: What's fundamentally different about how Lightrun is approaching the software debugging problem?

IP: All observability solutions today rely on an old ops-world paradigm. The idea is basically to log everything we can, then analyze this insane amount of logs later.

Lightrun's view is that instead of sorting through all those logs, you should only ask for the relevant information when you need it, in real-time and on-demand. We're the first vendor in the world that allows you to connect to an application in real-time and define all sorts of temporal data -- including a wide array of custom metrics -- at the code level of the running application and with high granularity. Rather than adding more logs into your production application and re-deploying it, we let developers add real-time, read-only logs to the application when an issue occurs. No hotfixes, no rollbacks and no infinite log buckets to sort through.
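To make the idea concrete, here is a toy sketch of a dynamic "logpoint": a read-only log statement attached to a running function at a chosen line, without editing or redeploying its code. This is not Lightrun's actual API -- the names (`add_logpoint`, the trace-based mechanism) are hypothetical, and a real agent works at the bytecode level rather than via `sys.settrace`.

```python
import sys

# (filename, lineno) -> message template; filled in at runtime, on demand.
_logpoints = {}
captured = []  # where the emitted log lines land

def add_logpoint(filename, lineno, template):
    """Register a log message to emit when execution reaches a line."""
    _logpoints[(filename, lineno)] = template

def _tracer(frame, event, arg):
    if event == "line":
        template = _logpoints.get((frame.f_code.co_filename, frame.f_lineno))
        if template is not None:
            # Read-only: we format local variables but never mutate state.
            captured.append(template.format(**frame.f_locals))
    return _tracer

def checkout(user, total):          # pretend this is deployed production code
    discounted = total * 0.5
    return discounted

# "Attach" a logpoint to the `return` line -- while the process is live,
# without touching the source of checkout().
code = checkout.__code__
add_logpoint(code.co_filename, code.co_firstlineno + 2,
             "user={user} total={total} discounted={discounted}")

sys.settrace(_tracer)
result = checkout("ada", 100)
sys.settrace(None)

print(captured)
```

The point of the sketch is the workflow: the log statement is defined after deployment, scoped to one line, and removed by simply deregistering it -- no rebuild, no restart.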

BN: How is this different from the majority of application performance monitoring and management tools?

IP: For starters, what the APM tools do is display and visualize information relating to the state of the hosts running your application, and a set of information (like logs) that was defined during development. These help answer questions that are in the realm of 'known unknowns' -- it's hard to understand what's going on in a specific part of the application, and the APM is there to shed light on the situation.

But what about questions that belong to the realm of 'unknown unknowns'? Things that are obviously wrong with the system, but you can't account for them during development? Lightrun helps to -- almost surgically -- break apart the black boxes that sit inside your production system with real-time, contextual information that is defined in the present, while the application is running, as opposed to information that was defined in the past, when the application was developed.

The other aspect we felt needed to be rethought about debugging is why -- as a developer who lives inside their IDE and CLI -- do I need to go into a separate ops tool to investigate bugs on my running production applications? Developers don't often open application performance monitoring or logging tools (where the logs reside) during their workday -- actually, they often don't even have access to them on a daily basis. And so access to this production information is not natural to them. They do, however, open their IDE every single day.

Our approach at Lightrun is to bring the knowledge closer to them, inside the IDE, instead of pushing it further away. We believe that issues should be investigated within the developer tooling itself -- a concept that can be referred to as 'Shift Left Observability,' meaning that within the software development lifecycle, the tooling on the left of the SDLC (IDEs, CLIs) that developers use to create the software is the ideal location of the debugging solution.

BN: When a developer gets that notification that something's broken in production, what advantage does Lightrun give over other debugging tools that they might have reached for?

IP: Instead of diving head-first into the logs to figure out the code path your application took and the path that resulted in the current issue, developers can gradually add Lightrun logs, metrics and traces to get real-time, code-level information from the running application. This more closely resembles the experience of debugging a local application -- something developers literally do every single day -- and less that of assembling a puzzle with missing pieces, trying to make sense of what the full picture might look like.
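The metrics side of that workflow can be sketched the same way as a logpoint: a counter attached at runtime to a single line, answering "how often does this branch actually run?" without redeploying. Again, this is a hypothetical illustration of the idea, not Lightrun's API, and a real agent would instrument bytecode rather than use `sys.settrace`.

```python
import sys
from collections import Counter

hits = Counter()   # line-hit counts, keyed by (filename, lineno)
_watched = set()   # lines we are counting

def add_counter(code, offset):
    """Count executions of the line `offset` lines below the def line."""
    _watched.add((code.co_filename, code.co_firstlineno + offset))

def _tracer(frame, event, arg):
    if event == "line":
        key = (frame.f_code.co_filename, frame.f_lineno)
        if key in _watched:
            hits[key] += 1
    return _tracer

def handle_request(n):              # pretend this is deployed production code
    if n % 2 == 0:
        return "even"               # how often do we take this branch?
    return "odd"

# Attach the counter to the `return "even"` line at runtime.
add_counter(handle_request.__code__, 2)

sys.settrace(_tracer)
results = [handle_request(i) for i in range(10)]
sys.settrace(None)

print(sum(hits.values()))
```

Running this counts five hits (the even inputs 0, 2, 4, 6, 8), a branch-frequency question that would otherwise require adding a metric in code and redeploying.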

Photo credit: McIek/Shutterstock


© 1998-2024 BetaNews, Inc. All Rights Reserved.