I guess I am mostly enjoying learning the fundamentals of AI stuff, even though I disagree with the direction it is going.
But I am struggling to put into words how alarming I find the comments on threads like this — all sorts of good-natured anecdotes about how XYZ works for them that are more like the suggestions in pet care or cookery threads on Facebook.
(Or worse still, like any Facebook 3D printing group: anyone who prints but wants to understand what is actually going on will know what I mean, I think)
Any shared sense of rigour is just completely torpedoed by the LLM world, particularly the cloud LLM world it seems, and we are reduced to cargo culting. Nobody is any more right or wrong than anyone else.
Have you tried cleaning your context with dawn dish soap, letting it dry and then adding a layer of glue stick?
--
ETA: I don't want to sound so mean about people who try to help, here or in Facebook groups. I guess I just find these threads so different to threads on more or less any other topic, where someone's suggestion can be debated or refined by other commenters and then someone will explain a thing about how bash history selections work that will change your entire life. These threads just devolve into "isn't it weird that threatening it works?"
The arbitrary and non-deterministic nature of LLM workflows gives me the full-on ick. As an old embedded/systems guy I have always prioritized determinism and repeatability in my workflows.
But damn, agents are amazing and I'm enjoying being a "thought process designer". I'm not going back. Even if AI development stops today my career will never be the same.
This has always been a thing with IT advice, though - the more complex a system and the outcome, the harder it is to clearly define "better" or "worse". Add in the fact that LLMs are intensely and emphatically non-deterministic and LLM guidance basically becomes gardening advice.
Heck, even the 'benchmarks' are mostly somebody's attempt to crystallize their vibes with varying amounts of success.
> Any shared sense of rigour is just completely torpedoed by the LLM world
Consider that this shared sense of rigour you have in mind is illusory, and LLMs and their context struggles are simply revealing this. I see precious little rigour in any of the 'tech' world I've lived in for decades. The tools proliferate, paradigms emerge and die and reemerge, and whatever stick you consider using to measure any of it has competitors with different units. Past the physics of power and signaling, and the prevailing cost of a silicon wafer, we are almost all, relative to a small number of much older disciplines, muddlers of various degrees of skill.
I've found dealing with context limits relatively easy: specify and confine. LLMs need clear specifications and strong guidance to produce good work.
But that's just my current muddling take on the practice. Perhaps, 90 days from now, even this burden will be gone, and a simple prompt will generate world class operating systems, programming languages and a formal basis in mathematics for both.
If you want my best guess: I think large context windows cannot be trained properly. There's not enough material, nor computing power, to train such large networks (to the same degree as small windows).
It's not just you! Here's a lovely quote from an influential paper: "We offer no explanation as to why these architectures seem to work; we attribute their success, as all else, to divine benevolence." I think people went through a similar phase with steam engines. Lots of practical engineering and heuristics to explain what works, before the emergence of a solid theoretical foundation (thermodynamics) to explain why.
I feel your frustration for sure and agree to a large extent. Every attempt I’ve made to formalize an LLM-based workflow has left me dismayed again that no one seems to have any real idea of how or why certain things work or don’t work. So I just go back to /plan and “write this down in a markdown document for posterity before we iterate on the implementation”, hoping that maybe next month there might be something a little more rigorous with some kind of rational backing.
> Have you tried cleaning your context with dawn dish soap
I don’t do the glue stick thing at all because I don’t need to, but Dawn really seems to do a good job at getting my Bambu build plate working again. I didn’t seek it out specifically, I already had some for doing dishes. IPA hadn’t worked so I tried Dawn and it has gotten me back having prints stick multiple times now. Not quite up to N=30 yet.
What sense of rigour is there going to be in a field (LLM usage as a user) where models, context sizes, tooling and, broadly, the "rules" (scare quotes) change every few weeks? There is literally no chance to take a scientific approach to anything; the churn is too high. There are papers about model XYZ v 12345 from a few months ago that are already old, because model ABC version 54321 addresses half of the issues shown in the paper and adds 3 new problems.
This lack of rigour feels a lot like “did you try restarting the computer? Most of the time, others tried restarting the computer and it works”
first of all, LLM-assisted coding is less than 3 years old. 3 years ago all we had was GPT-4 with 8192 token context, which wasn't enough for most things.
and second of all...
> Any shared sense of rigour is just completely torpedoed by the LLM world, particularly the cloud LLM world it seems, and we are reduced to cargo culting. Nobody is any more right or wrong than anyone else.
what "sense of rigour"? it's way too soon to put those rose-tinted glasses on.
Programming has already become this way. Opinions about different languages and architectures are taste, or sometimes even just vibes. Few try to actually ask “can I quantify whether microservices or monoliths are better in terms of either maintainability or scaling?”
A lot of this is a result of systems having long ago exceeded the complexity threshold of things people can hold in their heads. There are too many layers, subsystems, languages, APIs, all glued together. Attempts at radical simplification fail because each of those layers and subsystems has features or behaviors someone needs, and a lot of it isn’t even documented.
AI takes this to the extreme. I’ve already learned that certain models have “personalities.” Some are more likely to go with you on magical journeys into hallucination while others are more critical. Some are better at detail while others seem better at abstraction but fall over on detail. Some are better instruction followers. All their quirks are complex and the systems themselves are impossible to understand.
Computer systems are becoming organic, biological.
It's in the hype train's interest to keep the actual value unknowable. If you quantify what you're paying for then the FOMO is greatly reduced.
> But I am struggling to put into words how alarming I find the comments on threads like this — all sorts of good-natured anecdotes about how XYZ works for them that are more like the suggestions in pet care or cookery threads on Facebook.
It will always be this way going forward. Everyone thinks differently about problems. In the past we had experts and only they could do the work at a high level. But now we have many people who are cranking out expert-level solutions without much knowledge. Worrying about the minutiae is a dying trend.
Edit: I see I touched a nerve. But that is how it is now. You can't fight reality.
> Any shared sense of rigour is just completely torpedoed by the LLM world, particularly the cloud LLM world it seems, and we are reduced to cargo culting. Nobody is any more right or wrong than anyone else.
There was always some of this in the tech world, long before LLMs came along.
I've sat in so many meetings where decisions were made based on "that's what _slightly more prestigious company_ does" rather than objective, measurable criteria. (And evidence that the thing in question wasn't universally followed by _slightly more prestigious company_ carried surprisingly little weight.)