Most of us were amused when DALL-E and its peers went mainstream, and we were quick to point out the obvious flaws.
Then ChatGPT hit the scene and again, many of us dismissed it as a parlor trick that would never amount to much.
Using LLMs for coding was initially only a small step up from basic code completion, and a welcome farewell to Stack Overflow.
I am curious: what was the specific moment that you went from those quaint, dismissive observations to a slightly panicked, "Uh Oh" realization of what these models can do?
I gave it an image of a complex maze and asked it to solve the maze. It returned the image with the shortest path drawn in, one that not even I had found.
I haven't had one. It still sucks and doesn't provide value, due to the inherent inaccuracy that requires me to carefully check every little thing it does.
When it started being forced on me in tools I was already using begrudgingly.
For me the "oh shit" moment is when I realized that otherwise sane professionals, frequently in positions of authority, insist on taking these tools seriously. Zero thought put into any of the implications around unchecked anthropomorphism, security issues, employee knowledge retention, liability and other legal concerns, etc.
Agentic development. From "chat bot" to bona fide, capable developer. "Oh, shit!"
My oh shit moment was probably deep Q learning in 2013 (I guess that's not gen AI), but GPT-3 was pretty remarkable too.
Haven't had one yet. Apparently all I have is "crap, here we go again" whenever Claude gives me a solution to the problem I am presenting to it. Because I understand where it is going and it's full of errors, but those are errors I can avoid. Together we cobble something together in the end, and I do learn something new as well, but it was never "here is my prompt, then Claude delivered the final solution", like so many commenters here say they have had.
Frankly, to an outsider whatever it presents looks legit, but as an expert I recognize its failures, which makes me even more entrenched in the idea to never use it outside my area of expertise.
I have a question for all the believers: in a hypothetical scenario where you, having no medical experience, find yourself and your child on a mountain, 12 hours away from the nearest road, and your offspring has appendicitis (let's assume you recognize this 100%), with a sharp knife and Claude at your disposal, would you risk operating on your child? Or hurry the fuck down to get him to a hospital? I know I would choose to get him to a hospital, because that would give my kid a better chance to live than me operating on him with Claude's assistance. I am pretty sure I would kill my kid on that mountain. So yeah, outside my area of expertise I don't trust Claude one bit.
I have yet to have such a moment. To me it is still just a compressed database.
Though I am surprised at how these databases turn professionals into amateurs, like when Meta publishes some chatbot that can trivially be queried into sending account resets to any email address or when large corporations just dump their entire secret sauce into some remote SaaS led by obviously kooky people.
It's like established pros and big corps want to experience what it was like to be a self-taught PHP coder in 2007, like some kind of false nostalgia.
When it translated a paragraph of one language into another flawlessly.
My oh shit moment lately has been realizing Gen AI is a distraction. Language models are manipulating non-Gen AI media agentically:
moving images around layers in Photoshop, changing languages, exporting 1000s of variations for teams. Same with video compositing and editing:
the human work that creatives thought they were insulated from as long as there was some backlash towards generative AI, and yet
Gen AI 2022 - 2025
Asked AI to generate some code.
It looked absolutely unmaintainable and horrible.
"oh shit" there are serious developers using this crap? As an industry, we are so fsck'd
The biggest "oh shit" one was that people are willing to believe an LLM over humans, even humans who are experts in the domain being asked about.
The gullibility is terrifying
I still haven’t had it.
I’ve been working with ML for most of my career, and “gen ai” since the days of matrix crunching for NLP to a 10-element response array on my 1080Ti.
The current generation of AI is, frankly, only marginally more impressive to me than that era. The only thing I’m saying “oh shit” to is the deranged amount of capital debt being leveraged to make it usable.
Watching companies spend billions of tokens per minute on dev teams that barely know how to write a prompt beyond some tips and tricks, all for a productivity change that fluctuates between slightly negative and slightly positive and that no one can quantify, is making me feel like one of the only sane people left in the world.
Quantization is the only interesting change I’ve seen in years.
I feel like with the hype cycle and constant publishing of sketchy claims that I pretty much daily have an "oh shit" moment followed by a "nope, everything is about the same" moment. It's frankly exhausting. It's hard for me to recall a subject that has irritated me as much over a period of years, and it's barely even about AI itself but instead just feeling harassed with the constant anxiety and rage baiting.
My moment was when absolutely everything I put into Gemini, ChatGPT et al. came back with a super convincing sounding lie followed by 'Oh you are absolutely right for calling me out on this'.
It's a fucking joke and most people are blinded by it sounding very sophisticated and convincing
"Translate this poem. Maintain meter and rhyme."
AI Dungeon, a GPT-2 product on iOS. It had almost no context and no memory, but could generate an endless slop story. It was the first time I’d seen something like that, and the wild implications felt clear. I wasn’t aware at the time of how immense the computational needs would be to run the tech as it grew, or of the social implications, but I just couldn’t believe that something like the MUDs I’d played in the late 80s and early 90s could now be autogenerated in a way. It had no guardrails like the ones we have now to prevent it from adopting a personality and so on, so it was in some ways more interesting than what the general public has now.
My "oh shit" moment with AI was when an industry where licensing was the cornerstone of projects and employment contracts decided to just adopt pirated code without any source attribution.
The other one was when a CTO boss sent me an AI proposal to review and the experience was like being gaslit by a con artist.
Many professional developers have started acting like the kind of employee that previously would've been fired after 3 months.
I thought coding agents were probably BS and then I asked Cline to build me a test app to do something (I forgot what, something not that simple) and it built an entire working app. This was before Claude Code which was another step function improvement.
My oh shit moment was when I thought it was going to be the future but it ended up leaving me disappointed, frustrated and annoyed. It's closed down tech, stealing work, ruining our climate and it doesn't work remotely as well as advertised.
My original "oh shit" moment is lost, but recently I was looking to support some hardware on Mac that originally had Linux support. So codex-5.5 downloaded the Linux OS firmware that supported the device (it's a fixed-feature device that runs a full Linux OS, which also includes drivers for said device, buried inside that firmware). Codex then ran binwalk to extract the OS from the firmware, found the shell scripts that actuated the device, used those to "reason" about how the device worked, and used that to start writing a Mac driver for it. It took very few prompts to get that far. I did still have to guide it with advanced directives after that in order to get to a working Mac driver, so I'm not totally replaceable just yet, but going from the product name to finding the Linux OS firmware, to finding the actual firmware inside that OS download via binwalk, to getting to a place where the Mac driver started to take shape, took very little advanced knowledge of how computers work.
I haven’t had that yet.
I tried again this week, and Copilot Plan Mode read the same 5-line markdown file 18 times over the course of 5 minutes of churning on a simple request, then provided zero value over what I posed in the request itself, and hallucinated things about my Terraform repo that were just flat-out wrong.
As an Infrastructure/Cloud engineer, I’m far from worried about AI coming for my job.
I won’t deny they are useful tools, but the hyperbole from the tech CEOs about them replacing all white collar workers in 12-18 months set the expectation so high that I’m still in the “fancy auto-complete” camp. It still feels nowhere close to replacing anyone, at least where I work. While useful, they haven’t been anywhere close to as useful as promised. Hallucinations and poor guidance are still a regular day-to-day issue that makes it impossible for me to trust agents with anything.
Had they been more realistic with the promises and not framed it as replacing all of us within 2 years, I would have been more excited about the tech. Now that their claims are proving to be false and they’re trying to walk it back, it’s too late. The time for excitement has passed and it’s just something that exists.
The data center battles have also thrown a wet blanket on the tech, as they file lawsuits against towns near me to force construction to begin, despite the towns voting against it. The town can’t afford the fight, so the will of the people and the town gets bulldozed. It’s pretty gross to watch.
"We're traveling to Tokyo on our way home from China. We'd like to plan a trip accessible by train that hits some beaches, some hot springs, and allows me to get the 4th dose of a rabies vaccine sequence (the first three shots were rabvac)"
There are lots of small "oh shit" moments for me. My first interaction with an LLM was already magical.
"This shit can emulate understanding language, find a solution, and put the answer into words."
Then came the realisation that it's not limited to a single human language: you can ask in one language and it can answer in another. It's also capable of understanding and generating code. Not only that, it's better than most humans at it. It can hear, it can see, it can paint, it can make music, it can sing. It can combine: give it a picture, ask for music from that picture. Give it a video, get software. It can mix and match.
After that came improvements, no, revolutions. It started as a 4-year-old with encyclopedic knowledge. It knew but could not convey, and sometimes could not make sense. It was incorrect most of the time. Blubber. In a few years it matured to impeccable levels. It can now relate information with a lot of clarity, and it's less and less wrong. Nearly no hallucinations. It can do maths! Correct maths! Maths that I could not do even if my life depended on it. It's getting to a stage where it can prove things where humans failed.
I am getting "oh shit moments" day by day.
I am using codex and claude on a Linux host, connecting from a Windows machine using ssh.
No matter what I tried I couldn't get "Shift+Enter" to work. I said fuck it, cloned kitty and alacritty and asked Claude to implement a terminal emulator for Windows that would render everything using DX12 and support modifyOtherKeys plus DA responses, and within a few days it was ready!
My "oh shit" moments come every time I see people glazing AI
"Oh shit. My skills I spent my life building are going to go to zero value. I'm going to have to dramatically change careers in my forties or I'm just going to wind up being a schmuck prompting these stupid fucking machines for the rest of my life"
Oh shit indeed
My oh shit moment was Opus 4.6 before it got nerfed.
It helped me refactor my old app. Something I always wanted to do, but didn't have time/mental capacity to do in a short space of time.
I wrote a short prompt, explaining how I wanted it to look and which files it should go through. It asked me a few clarifying questions and then basically one-shotted it.
Everything compiled and worked. Now my internal app is much much easier to extend and test.
I tried a few more things like that and spent like £5k on tokens in those two weeks.
Then it got nerfed and never worked like that again.
Now I don't use AI, because it is shite again. Even Opus 4.8.
I don’t know about “Oh shit”. I’ve had many “It’s shit” moments.
I use Claude Code on a daily basis, but honestly it becomes more annoying the more I use it. Why? I think because I ask it to do something and unless I'm extremely specific, either the code is verbose or the feature I'm designing is done in a poor way. For me, the productivity gains aren't that great and I'm even considering whether to go back to doing things by hand to save myself the frustration. Sure, if you don't care about code quality or scalability, it's a great way to generate code. And yes, there are times when I don't, but for real projects I actually do, because I know as an engineer that those things matter in the long run. So, to be honest, I still haven't had that moment.
>Then ChatGPT hit the scene and again, many of us dismissed it as a parlor trick that would never amount to much.
No, ChatGPT was the "oh shit" moment for me.
Anyone who had touched a computer before that knows how big of a leap that was.