- Systemic tech debt is now addressable at scale with LLMs. Future models will be good enough to sustain this; if people don't believe that, I would challenge them to explain why. First, consider whether you understand what scaling laws like Chinchilla are, and how RL with verification fundamentally works.
- I completely agree with you that the fundamental limitation is the business being able to coherently articulate itself and its strategy.
- BUT the benefit now is you can basically prototype for free. Before, we had to be extremely careful with engineering headcount investment. Now we can try many more things under the same time constraints.
> BUT the benefit now is you can basically prototype for free.
But... so can your competitors. And that changes the value proposition.
>- Systemic tech debt is now addressable at scale with LLMs. Future models will be good enough to sustain this; if people don't believe that, I would challenge them to explain why.
Is this some sort of troll attempt? Like, are you fundamentally misunderstanding the problem with tech debt? This is the equivalent of throwing garbage on the floor and expecting professional cleaners to keep your house clean.
You can produce tech debt faster than you can pay it back; that's the core aspect of tech debt. If tech debt were more expensive in the short term than avoiding it, nobody would take it on.
A labor-saving device doesn't reduce or deal with tech debt, since tech debt is a decision made independently of the competence of the developers. If you have a company with a tech-debt culture, the labor-saving device will just let you accumulate more tech debt until you reach the same level of burden per person.
> First, consider whether you understand what scaling laws like Chinchilla are, and how RL with verification fundamentally works.
Honestly, this tells me that you basically understand nothing, not even Chinchilla scaling laws and how RL works. Not only are you trying to brute-force the problem, you're listing factors that are completely irrelevant to the problem at hand.
Chinchilla scaling laws are "ancient" by LLM standards. Everyone who designs a model architecture that is supposed to beat their competitors is pulling out every trick in the book and then coming up with their own on top of that, and Chinchilla scaling laws have been done to death in that regard.
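For anyone unfamiliar, the Chinchilla result (Hoffmann et al., 2022) boils down to a parametric loss fit plus a compute-optimal allocation, roughly:

```latex
% Chinchilla parametric loss fit, N = parameters, D = training tokens:
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
% Minimizing L under a compute budget C \approx 6ND gives
N_{\mathrm{opt}} \propto C^{a}, \qquad D_{\mathrm{opt}} \propto C^{b}, \qquad a \approx b \approx 0.5
% i.e. scale parameters and tokens together, roughly 20 tokens per parameter.
```

Note that this says nothing about what a model does to code quality; it's purely about how pretraining loss falls with parameters and data.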
Reinforcement Learning is also a pretty bad example here, because there is no obvious way to encode a reward function for something as ill-defined as tech debt. You didn't even say "avoid tech debt", which would be actionable to some extent, just "systemic tech debt is now addressable at scale with LLMs". I.e., you're implying that if LLMs were to generate tech debt, you could just keep scaling and produce more of it, solving the problem once and for all Futurama-style with ever bigger ice cubes.
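To make the contrast concrete, here is a minimal sketch of why "RL with verification" works for test suites but not for tech debt. It assumes a pytest-based project; the function names and directory argument are invented for illustration:

```python
import subprocess

def run_unit_tests(project_dir: str) -> bool:
    """Objective verifier: run the test suite on the candidate change."""
    result = subprocess.run(["pytest", project_dir], capture_output=True)
    return result.returncode == 0

def reward_verifiable(project_dir: str) -> float:
    # RL with verification works because this check is cheap, objective,
    # and hard to argue with: the tests either pass or they don't.
    return 1.0 if run_unit_tests(project_dir) else 0.0

def reward_tech_debt(project_dir: str) -> float:
    # There is no equivalent oracle for tech debt. Anything placed here
    # (lint warnings, complexity metrics, file counts) is a proxy, and a
    # policy trained against a proxy learns to game the proxy.
    raise NotImplementedError("no agreed-upon verifier for tech debt exists")
```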
> Systemic tech debt is now addressable at scale with LLMs.
Is there any reason to believe this? I've only seen evidence to the contrary so far.
My experience with AI coding assistants is that, generally, they:
1. Don't have an opinion.
2. Are trained on code written using practices that increase technical debt.
3. Lack the bigger picture, focusing instead on the concrete, superficial, and immediate.
I think I need to elaborate on the first point and explain how it's relevant to the question. I'll start with an example. We have an AI reviewer, and we recently migrated a bunch of the company's repositories from Bitbucket to GitLab. This also prompted a bunch of CI changes. Some Python projects I'm involved with, but don't have much authority over, switched to complicated builds involving pyproject.toml (often including dynamic generation of this cursed file), as well as integration with a bunch of novel (but poor-quality) Python infrastructure tools used for building Python distributable artifacts.
In the projects where I do have authority, I removed most of the third-party integration. None of them use pyproject.toml, setup.cfg, or any similar configuration for a third-party build tool; the project code contains bespoke code to build the artifacts, along the lines of the sketch below.
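For a sense of what "bespoke" means here, a minimal stdlib-only sketch; the package name, paths, and zip output are hypothetical stand-ins, not taken from the actual repositories:

```python
#!/usr/bin/env python3
"""Bespoke artifact build: collect the package sources into a zip.

A sketch only: 'myproject', the src/ layout, and the zip format are
invented placeholders for whatever artifact a real project ships.
"""
import pathlib
import zipfile

SRC = pathlib.Path("src/myproject")
OUT = pathlib.Path("dist/myproject.zip")

def build() -> None:
    OUT.parent.mkdir(exist_ok=True)
    with zipfile.ZipFile(OUT, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(SRC.rglob("*.py")):
            # Store paths relative to src/ so the archive unpacks cleanly.
            zf.write(path, path.relative_to(SRC.parent))

if __name__ == "__main__":
    build()
    print(f"wrote {OUT}")
```

The point isn't that this is better engineering than pyproject.toml; it's that the two philosophies (declarative third-party tooling vs. explicit in-repo build code) pull a codebase in mutually exclusive directions.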
These two approaches are clearly at odds; a living, breathing person would believe one or the other to be the right one. The AI reviewer had no problem with this situation. It made some pedantic comments about style and about some fantastical, impossible error cases, but completely ignored the fact that, moving forward, these two approaches are bound to collide. While it appears to have an opinion about quotation-mark style, it doesn't care at all about strategic decisions.
My guess as to why is that such situations are genuinely rarely addressed in code review. Most productive PRs, from which an AI could learn, are designed around small, well-defined features in a pre-agreed-upon context. The context itself is never discussed in PRs because doing so is impractical (it would usually require too large a change, so developers don't even bring the issue up).
And this is where the real, glacier-sized deposits of tech debt live: the issues developers are afraid to mention because they understand they will never be given the authority and resources to deal with them.
The problem with tech debt is not that it is some poorly designed code in a few repositories that can simply be changed. True tech debt is the kind that requires significant architectural changes across many systems and is almost always coupled with major data migrations. You need the rest of the business to agree to invest all that time and energy to fix a problem someone else created 10 years ago. You will likely also need other teams to set aside time on their own roadmaps to address it. You might also need customers to change what they are doing, because if software lets you do something, you can guarantee that someone has learned to do it, even if that 'something' was actually a bug.
LLMs don't solve any of those problems by themselves.