
Ask HN: What was your "oh shit" moment with GenAI?

466 points by andrehacker last Thursday at 11:42 PM | 835 comments

Most of us were amused when DALL-E and its peers went mainstream, and we were quick to point out the obvious flaws.

Then ChatGPT hit the scene and again, many of us dismissed it as a parlor trick that would never amount to much.

Using LLMs for coding was initially only a small step up from basic code completion, and a welcome farewell to Stack Overflow.

I am curious: what was the specific moment that you went from those quaint, dismissive observations to a slightly panicked, "Uh Oh" realization of what these models can do?


Comments

jzemeocala yesterday at 8:55 PM

I bought an Alesis QS8.1 super cheap in perfect condition (was a top grade digital piano/synth in the 90s).

and then i realized that ALL of the software related to it (which i collected from defunct websites and archived on github) was ancient, and after a while of getting tired of using WINE every single time i decided i wanted a cross platform modern equivalent that did everything that several of these different programs did (plus break out some stuff that was now potentially possible with a modern computer)

i thought it would be extremely hard because the computer to synth communication is pretty much only via sysex commands (of which the actual wave file encoding protocol was undocumented)

Claude walked me through examining some of the original software in GHIDRA, and I had a working demo that night.....now im just playing with adding new features to it.
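For readers who have not met SysEx before, here is a minimal sketch of how a MIDI System Exclusive message is framed. It is not the Alesis protocol, which the comment describes as undocumented: the manufacturer ID `0x7D` is the value the MIDI spec sets aside for non-commercial use, and the payload bytes and helper names are made up for illustration.

```python
from typing import Tuple

SYSEX_START = 0xF0           # MIDI "start of system exclusive" status byte
SYSEX_END = 0xF7             # MIDI "end of exclusive" status byte
EDU_MANUFACTURER_ID = 0x7D   # assumption: the non-commercial ID, not a vendor's real ID


def build_sysex(payload: bytes, manufacturer_id: int = EDU_MANUFACTURER_ID) -> bytes:
    """Wrap a payload in a SysEx frame; every data byte must fit in 7 bits."""
    if any(b > 0x7F for b in payload):
        raise ValueError("SysEx data bytes must be in the range 0x00-0x7F")
    return bytes([SYSEX_START, manufacturer_id]) + payload + bytes([SYSEX_END])


def parse_sysex(message: bytes) -> Tuple[int, bytes]:
    """Return (manufacturer_id, payload) from a frame with a one-byte ID."""
    if len(message) < 3 or message[0] != SYSEX_START or message[-1] != SYSEX_END:
        raise ValueError("not a complete SysEx frame")
    return message[1], message[2:-1]


frame = build_sysex(bytes([0x01, 0x02, 0x7F]))  # hypothetical payload
print(frame.hex(" "))  # f0 7d 01 02 7f f7
```

The 7-bit limit on data bytes is why devices that transfer 8-bit data such as samples need some packing scheme on top of this framing, and that layer is the kind of thing that tends to go undocumented.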

awbvious today at 9:47 AM

Not sure, but I can tell you what my "oh s***, astroturfing is so bad, it's even in Hacker News" moment is. And if I learned GenAI was used to make some of the astroturf, that's more an "ah s***" than an "oh s***" thing. I mean, the prominence, ubiquity, and breathlessness. One out of three, sure. Two out of three, maybe. And some corpo shilling definitely happens here. But this is like, well, covering an entire area with artificial grass, to the point where nothing lives. Crazy.

andrewthornton yesterday at 8:34 PM

My furnace went out during the 2025 holiday and I couldn't get an appointment with a repair person for 2 days. It was getting very cold in my house so I went into my attic, made several videos of the furnace attempting to start, and gave them to Gemini. It diagnosed the issue immediately and had me spin one of the components (a small exhaust fan) while the furnace tried to fire. It came on immediately. I had to do that several times, but it worked until the HVAC service showed up.

shreddude yesterday at 7:56 PM

I could go on and on, but Claude recently decompiled the firmware of my camper van, documented all the CAN interfaces, then programmed an ESP32 module to talk to the van’s integrated systems (power, HVAC, lighting, tanks). That sort of embedded systems integration is completely out of my wheelhouse.

I honestly don’t understand AI naysayers. I use Claude every day both professionally as a Solution Architect and personally in a variety of projects I simply could not have ever approached alone.

jackdoe today at 8:49 AM

I have had many, but the last one was quite funny:

It fixed my printer after dist-upgrade and separate chrome upgrade, the printer worked everywhere but not in chrome.

After 30 years of using linux I didn't even want to know what was wrong. Is it colord again? A dbus + cups issue? I completely accepted that I won't be able to print from chrome for a couple of months until the next update.

I just ran it in dangerously-skip-permissions mode and said 'my printer doesnt work in chrome'. A few minutes later I heard the printer printing "This is test" and it said 'I think its fixed, do you see a page coming out of the printer now?'

angusturner today at 12:28 PM

In 2017 I worked tirelessly with my colleagues to implement and replicate the first transformer paper.

Yesterday I left Opus 4.8 to go do some architecture research, with GPU access.

It replicated and trained a credible baseline. It implemented some ideas I'd been thinking about, and wrote custom CUDA kernels for them. It read and summarised dozens of related papers.

It has since run dozens of experiments, with minimal supervision. When a model is unstable it kills it, documents why, fires off a new configuration.

The realisation that frontier labs are doing this at scale with unlimited GPU and token budgets.

It actually scares me a bit. The realisation that the next big breakthroughs will only have light human involvement.

The prospect of recursive self improvement feels more real to me all of a sudden.

nativeit today at 5:46 PM

When I saw that on the second day of token-based pricing I’d already consumed my usual monthly spend on GitHub Copilot. That’s when I fully realized that it would never be economical, nor useful, to solo shops like mine.

plagasul today at 12:20 PM

Several. Yesterday a friend with no prior coding experience or knowledge showed me an app he initially built to help him study for public administration job positions. The exams for these positions are public (Spain), but the tools are scarce, expensive, or not to his liking. So he used lovable, then switched to web gemini and claude, then paid claude. He now has 130+ very active users on an initial free tier while he figures it out. The app is on github, runs on vercel with supabase, react, tailwind, bun... he has no idea what he is doing. I even installed claude code for him, got him an ssh key so he can do it locally, etc.

Another: claude code, via headless ghidra, cracked for me some software that was calling home to a server that did not exist anymore.

Another: I am a teacher, and grading and feedback are very, very time consuming, especially in loose workflows with several sources and tools that are not connected. During class presentations I take loose notes. Now I have a local folder where I drop (1) my student list, with names and emails, (2) my loose notes, and (3) a grading & feedback sheet template; then claude creates a sheet per student, formats and copies the feedback to the right sheet cell, waits for my corrections, then sends everything to their school emails. Much easier, much less time consuming.

jp57 yesterday at 9:32 PM

Actually seems absurdly simple now, but sometime last year I was trying to figure out what I'd need to tow my daughter's car cross country with my truck: what are the trailer/dolly options, what do they cost, can my truck actually tow the combined weight, etc.

I started out prompting ChatGPT kinda how I would with Google, one small prompt at a time, asking about various details. But after one or two of those I just tried "I want to tow a car of make A with my truck model B, from point C to point D, what are my options?" And it wrote me a report with comparison tables and computed towing weights and other details for different options.

At that point, I was like "Oh. This is different. And it's just the beginning."

loudmax today at 2:41 AM

For me it was torrenting a 7G ball of weights leaked from Meta and running alpaca.cpp (an early variant of llama.cpp) on my desktop computer in early 2023. I started asking it questions about the Roman empire and it answered me in English! The responses were generally incorrect, but no worse than what your average American college student might guess at, though delivered with much more confidence.

This was my desktop computer responding to questions in English, not some fancy server in a massive Google data center. Who cares if what it says isn't reliable? Being able to converse with my CPU in English is like having a conversation with a dog!

monuszero today at 4:10 AM

We had a monthlong sprint adding robot motion planning features to our codebase years ago, and I was never satisfied with the result. As a small team wanting to leverage oss we vendored in OMPL, did the usual thing around caching and roadmap management. I knew there was a way to parallelize some of the algorithms we were using with simd or a gpu kernel, plenty of that in the literature, but it was never worth fighting CUDA or metal/accelerate or whatever for uncertain gains.

So when cooking dinner one night, I set opus 4.6 on a from-scratch native and accelerated roadmap planner implementation (after previously porting IK, FK, collision checking with some success). I had primed it by having a research agent drop a literature review in its docs folder covering the type of planner we needed. By the time the pasta water was boiling it was done, getting plans in a few hundred ms compared to several seconds on our good old fashioned OMPL code.

For me it was the revelation that the economic value of cooking dinner could be compared to tackling an honest two weeks of coding work. The calculus has shifted - work that was once a risky or extravagant use of time is now worth considering.

For a small team who wants to focus on substance rather than implementation, knows what they want, and how to set up the agent for success, it’s a complete game changer in terms of what we can take on. Incumbents beware

SubiculumCode today at 5:58 AM

For me it was right at the beginning. They said it was a dungeon game. It would describe a room, etc, and I would take some action. But I thought that this dungeon was built in some intricate database. But then I told it that I wanted to leave, got to an inn, where I flirted with the bar waitress, and soon we were watching the sunset in some meadow. As cheesy as that was, it was then that I went "oh shit" this is a machine that can respond to language with language in a way that simulated actual understanding and intelligence, concepts and schema, and everything else, and I knew then that the world would never be the same again. People here talk about the crazy things they solved with AI, and I get that...but the first time I actually talked to a machine and didn't feel like it was either random gibberish or scripted, but dynamic and responsive. The first alien I ever met, and he knew my language.

AussieWog93 yesterday at 11:07 PM

Literally just last night I gave Claude Code the following prompt, verbatim:

"Whenever I launch Kodi on my Chromecast 4k, it crashes. I think this is related to a plugin or skin. It goes away for a bit if I clear cache but will eventually come back. Can you connect to the device via adb (I've run adb connect already), and debug exactly where it's crashing? Once you've done that, propose a solution. If this requires downloading, fixing, rebuilding and then uploading the broken extension via adb, don't be shy. I should have Android dev tools (Gradle etc.) on this Mac."

Lo and behold, without human intervention, it pinpointed the crash, downloaded the Kodi source, patched out a bug that had existed since 2016, recompiled it, signed it, then pushed it to my Chromecast all while carefully making sure to keep all my settings intact.

Got it to make a PR too (which is as of this moment unpublished; going to test more over the coming weeks).

raesene9 today at 2:39 PM

The one I remember most is, when experimenting with Opus 3.5 for the first time, I asked it to generate a Firecracker backed local VM creation and management tool, something I'd wanted for a while but not found.

My expectation was that it might get something barely functional but would probably fail, and instead it generated a working piece of software which achieved a lot of what I wanted.

That definitely made me realise that, for at least some classes of software task this was a major change in how things could be done.

More recently, I can give the model a Local Privilege Escalation PoC for Linux, ask it to test whether it can be used for container breakout, and have it generate a working container breakout, all in one prompt... that definitely changes things.

evdubs yesterday at 8:01 PM

I tried to see if an LLM service provider could rewrite some legal docs into a consistent format, with nothing hallucinated, so I could see what may be missing in each document. It could do that.

Next, I wanted to see if this could be done with a local LLM. Gemma-4 handles this fine with an 8GB video card and a large context (128k).

Next, I wanted to see if the model could also OCR these docs and translate them. The same model can handle that quite well.

This was when I realized LLMs should be great for handling work where:

- I already know what I want to do

- I already know how to do it

- I don't think this task will help develop skills I find to be valuable

- If I have to do it manually myself, I will probably cut corners

So now I view LLMs through the lens of, "what work can I send to an LLM that I otherwise would not really care about doing."

ozgung today at 4:04 PM

For me it's not about the capabilities but what they can be used for. Think of all the recent drama between Anthropic and the Department of War. A real wake up call (especially if you are not a US citizen). Proves that AI is essentially a Surveillance and Warfare technology (which justifies the big valuations).

Or see this simple and fun site: https://hn-wrapped.kadoa.com

AI automatically analyzes all your social media posts in your life and can generate a pretty accurate profile about you in a second. We have no privacy anymore. Social media sites like Reddit already do that for moderation. Others do for more sinister reasons.

Note that Profiling is illegal in many countries. But laws can't protect us anymore.

Yes, it was always possible to do that manually. But with AI it's so easy, fast and accurate to do at large scale. A hacker having access to your computer, reading your mails and messages is one thing. An AI reading and analyzing all your mails, messages and data is something different. Doing this for whole demographics (Cambridge Analytica style) is at another level.

tempoponet today at 11:24 AM

I can actually use and enjoy Linux. The "year of the desktop" never came for me, but instead I got the "year of the cli".

For 20 years I've used Linux in one form or another, but I've felt like I was kneecapped for the most basic things. Just trying to plug in an external drive or a second display meant hours of stack overflow and pasting commands I didn't understand.

Now I'm using several Linux machines for Steam, NAS, local LLM, development, and what used to derail a weekend project now amounts to a coffee break while Claude figures it out.

kstrauser yesterday at 9:11 PM

I have a large token budget as part of my work. A coworker was scanning some repos for vulnerabilities as a test. He found a scary looking remote exploit in a popular project and shared it with me for a second opinion. I spun up a local instance of the project and ran the POC against it: nothing. Turns out it needed some configuration knobs tweaked to lower some security protections.

So I told the AI what happened, and asked it to fix the POC so that it would work with the default configuration. It chewed away at that for a few minutes until it cheerfully patched the POC into a weaponized version. I ran it. The local instance, which I had just downloaded, compiled myself, and launched with the default config file, immediately crashed.

I got the cold sweats. I've read this novel. I've seen this movie. Wow. I have a blinking cursor on the console of a nuclear information bomb. I tossed and turned all night, got about half an hour of actual sleep, and probably looked like I'd seen a ghost at work the next day.

On the plus side, it gave our team some very clear ethical and moral guidance: we're going to do this, and we're going to share our findings with the relevant authors, because we can. Because I want to live in a world where the good guys are trying to fix problems before the bad guys can find them, I decided to help build that world. It was like, well, I guess this is what I'm doing now.

UncleOxidant today at 1:41 AM

I guess I've had several of those moments over the last year and a half. But a recent one was that I was working with Claude to create a spiking neural net MNIST classifier in an FPGA for a demo. Claude took it from concept to PyTorch, to training (training a Spiking neural net isn't necessarily straightforward - that's a whole post in itself, but Claude came up with a working solution), and then to implementation in Verilog and through synthesis into the FPGA. I asked Claude to create a drawing app to run on the PC side that would allow the user to draw a digit with a mouse and then click a classify button. The data from the digit drawing app was to be transferred via USB to SPI to the FPGA. I didn't have a SPI adapter yet (it was on order from Adafruit) so I asked claude to let me communicate with the simulated verilog code running in the Verilator simulator, through a virtual SPI interface. Then I went to lunch. I came back to see the digit drawing app displayed on the monitor. I drew a '2' and it classified it as a 2. In another window I could see the Verilator simulator running and the data being passed. Chills.

alexfoo yesterday at 11:03 PM

Someone in the house pressed the button to update the printer (Brother DCP-L3550CDW) firmware and the CSV page that was the basis for an existing Prometheus exporter (drum/toner lifespan, page counts, etc) stopped being a thing. Instead there was an HTML page with all of the information buried in various divs/etc.

I'd planned on writing something myself to parse the HTML and write a suitable exporter but I thought I'd give Claude a chance.

In a sandboxed VM I gave Claude a single static HTML file of the status page from the printer. Also in the directory was the equivalent of "hello world" in Go, literally just the minimum needed to do `fmt.Printf("OK\n")`. The directory was called `brother-exporter`. That was it. No other instructions or information. I hadn't told it what it needed to write. I hadn't said what it should do. I hadn't told it what language it was supposed to use.

Just by doing a `/init` in that directory Claude decided that it needed to write a Prometheus exporter in Go that would fetch and parse the HTML file from a printer (defaulting to 192.168.1.1) and then present the associated metrics in a way that they could be scraped by Prometheus.

It did this flawlessly in about 10 minutes.

I could have done it in several hours but this was definitely an "oh shit" moment for me. I think the biggest thing was the fact that it guessed/assumed so much (correctly) from so little information in the beginning.
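To make the shape of such an exporter concrete, here is a rough sketch of the parse-then-expose step. It is in Python rather than the Go that Claude actually wrote, and everything specific in it is an assumption: the printer's real markup is not shown, so the `<dt>`/`<dd>` sample page, the metric names, and the helper names are invented for illustration.

```python
from html.parser import HTMLParser

# Hypothetical status-page markup; the real printer page is not shown in the comment.
SAMPLE_HTML = """
<dl>
  <dt>Page Counter</dt><dd>1234</dd>
  <dt>Drum Life Remaining</dt><dd>87%</dd>
</dl>
"""


class StatusParser(HTMLParser):
    """Collect <dt>label</dt><dd>value</dd> pairs into a dict."""

    def __init__(self):
        super().__init__()
        self.pairs = {}
        self._tag = None     # "dt" or "dd" while inside one of those elements
        self._label = None   # most recent <dt> text waiting for its <dd>

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self._tag = tag

    def handle_endtag(self, tag):
        if tag in ("dt", "dd"):
            self._tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._tag == "dt":
            self._label = text
        elif self._tag == "dd" and self._label is not None:
            self.pairs[self._label] = text
            self._label = None


def to_prometheus(pairs):
    """Render label/value pairs as gauges in the Prometheus text exposition format."""
    lines = []
    for label, raw in pairs.items():
        name = "printer_" + label.lower().replace(" ", "_")
        value = float(raw.rstrip("%"))
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


parser = StatusParser()
parser.feed(SAMPLE_HTML)
metrics = to_prometheus(parser.pairs)
print(metrics)
```

A real exporter would fetch the page from the printer on each scrape and serve this text over HTTP on a `/metrics` path for Prometheus to collect.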

mindcrime today at 12:08 AM

I don't remember one specific moment, but I was fairly impressed with ChatGPT from the first time I started interacting with it. Was I ready to call it "AGI"? No, absolutely not. But it was clear that it was something new, and it was also intuitively obvious to me that "this AI is as bad today as it will ever be" and that predicting the rate of change would be difficult.

The more I use these things, the more I'm 100% convinced that it makes sense to say they are "intelligent" (for some meaning of "intelligent"). AGI or "human level intelligence"? Still no[1]. But some kind of intelligence. And I'm quite happy to allow that there can be "intelligence" that doesn't work anything at all like human intelligence, so arguments of the form "this isn't real intelligence", etc, etc. carry very (very) little weight with me. I've actually been sitting on a half written blog post on this very topic for a while, titled "The Marquee Sign Says 'Artificial' Intelligence"[2]. Finding time to finish it has been the challenge.

And before somebody says "Use AI to write it for you". Nah. I am generally what you might call "pro AI" and / or an "AI enthusiast" but I still draw lines. I'll use AI for research, for outlining, for brainstorming, etc. sure. But I have a hard-line stance against letting AI fundamentally write for me. I want anything that goes out with my name associated with it to have my genuine voice.

[1]: I like the term "jagged intelligence" that Demis Hassabis has been using. That is to say, the bounds of the intelligence are jagged or spiky: very intelligent in certain areas, much less so in others.

[2]: for any old-skool pro-wrestling fans, yes, that is an intentional nod to "Double A" Arn Anderson and his "The marquee sign says 'wrestling'" catchphrase. :-)

chaoxu today at 9:54 AM

I'm a researcher working in theoretical computer science. ChatGPT found a counterexample to a conjecture I'd been working on for 2 years. It also one-shot many problems I've worked on, and it improved some of my work greatly.

I feel quite useless next to the sheer brute proof-writing, counterexample-generating skill ChatGPT is demonstrating, and wonder what the future of my profession will be.

aswegs8 today at 4:44 PM

Kind of peculiar and memorable story for me.

I was on the couch on my Nintendo Switch, playing around with ChatGPT 3 and asked it where to find a specific item in Zelda Breath of the Wild. When it provided a coherent answer I was just dumbfounded. To be fair, the answer was semi-hallucinated but partly true. But it made me realize what kind of breakthrough it must be for some program to provide an answer to this without searching external sources (which it couldn't do yet). Such a small data point, like a drop in the vast sea of human knowledge space.

Prompted me to do some back-of-the-envelope calculation. The weights of this model were a few hundred GB. I just realized what kind of quantum leap it was to compress this seemingly infinite knowledge space into a few hundred GB of weights.
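That kind of estimate is just parameter count times bytes per parameter. A sketch, assuming the 175 billion parameters published for GPT-3 (the size of the model behind ChatGPT itself was never stated in the comment, so this is a stand-in) and two common storage precisions:

```python
def weights_size_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate size of a model's weights in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


GPT3_PARAMS = 175e9  # assumption: published GPT-3 parameter count, used as a stand-in

fp16_gb = weights_size_gb(GPT3_PARAMS, 2)    # 16-bit floats: 2 bytes per parameter
int4_gb = weights_size_gb(GPT3_PARAMS, 0.5)  # 4-bit quantization: half a byte each

print(f"fp16: {fp16_gb:.1f} GB, 4-bit: {int4_gb:.1f} GB")  # fp16: 350.0 GB, 4-bit: 87.5 GB
```

Under those assumptions the figure does land in the "few hundred GB" range the comment mentions.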

mlmonkey yesterday at 8:56 PM

I have a buddy who's a consultant. His niche area is Netsuite and Oracle (I think). He's an accountant by training and as a consultant his gig was setting up these instances for clients, charging them an arm and two legs. He'd spend a lot of time golfing, and doing these setups was more than enough money for him. In other words, he had cornered that little slice of the market and was making bank.

Shortly after ChatGPT 2.2(?) came out and hit mainstream, I was chatting with him (I was excited af about the possibilities of AI). He tried to pop my bubble by saying "I bet it can't do what I do for my job!".

So I decided to test it out. We went home and I pulled out my laptop. Went to chatgpt.com and then I asked him to enter the specifications of what Netsuite configuration he wanted. So he proceeded to type in the description of what he wanted, the various settings, configurations, etc. i.e., the specs that he typically gets from his clients. And asked it to give him the commands to set it up.

Lo and behold. ChatGPT came back with a series of commands that he needed to run; the options he needed to configure, etc.

He was crestfallen. "Those are the exact commands I run!"

Luckily for him he recovered. He has since settled on a small stable of clients, all privately held companies whose owners he knows and between them he makes enough to keep his golfing hobby fed.

vitorbaptistaa yesterday at 11:22 PM

I am the CTO of a small NGO (10 people total, only 1 other junior Dev at the time). We supported two apps that were built by consultants. They were a mess. NextJS, React, about 4 micro services for a site that had 50 users per WEEK.

I configured a devcontainer with the old codebase and an empty repository and asked Claude to rewrite it as an old school server side rendered Django app.

Went to sleep. When I woke up it was 80% done. Spent another couple days prompting and reviewing and reached feature parity.

A bit later did the same with the other app.

Now both are deployed, with lower server costs and less complexity, and are orders of magnitude faster.

Without AI agents we wouldn't be able to do so (as usually is the case with tech debt).

AI is amazing for small organisations!

adamkf today at 4:25 PM

I'll give you two:

The first was when I realized that I could tell codex to use gdb to debug a core dump. This was about a year ago, so it came up with a bunch of incorrect theories, but it enabled me to go much further than I would have been able to go by myself. I eventually solved the problem.

The second was when I decided to ask it about my Linux Wi-Fi issue that I had been having for several years. The computer would infrequently have multi second pings and dropped packets, then go back to normal. I thought it was due to the weak signal, but after describing the problem to codex, it immediately disabled power management on the Wi-Fi interface (this is a desktop computer, so I don't care much for that anyway) and the problem has never come back. I had been dealing with this for years, and I had tried searching for a solution before, but codex just solved it directly.
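The comment does not say which command codex ran. One common way to turn off Wi-Fi power saving on Linux is `iw dev <interface> set power_save off`, sketched here as a small Python wrapper. The interface name `wlan0` and the helper names are assumptions, and actually running the command needs root and does not persist across reboots.

```python
import subprocess
from typing import List


def power_save_off_cmd(interface: str) -> List[str]:
    """Build the iw command that disables power saving on one interface."""
    return ["iw", "dev", interface, "set", "power_save", "off"]


def disable_power_save(interface: str) -> None:
    """Run the command; raises CalledProcessError if iw exits non-zero."""
    subprocess.run(power_save_off_cmd(interface), check=True)


# Only print the command here; call disable_power_save("wlan0") as root to apply it.
print(" ".join(power_save_off_cmd("wlan0")))  # iw dev wlan0 set power_save off
```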

simonw yesterday at 8:19 PM

ChatGPT Code Interpreter back in ~March 2023. I uploaded a CSV file (of police incidents in San Francisco) and watched it load that into Pandas, show me some charts, then export the data to a SQLite database file for me to download.

I write software for data journalists and this new thing appeared to be able to do everything I wanted my software to do just as an unplanned side effect of having the ability to run Python against a folder with some uploaded files in it.

With hindsight it was my first exposure to a coding agent, but we hadn't named the category at that point.
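For a sense of how little code the CSV-to-SQLite step involves, here is a minimal sketch. It swaps Pandas for the standard-library `csv` and `sqlite3` modules so it runs with no dependencies, and the three sample rows, column names, and table name are invented for illustration rather than taken from the San Francisco dataset.

```python
import csv
import io
import sqlite3

# Made-up sample rows standing in for the uploaded CSV file.
SAMPLE_CSV = """incident_id,category,district
1,Larceny Theft,Mission
2,Assault,Tenderloin
3,Larceny Theft,Central
"""


def csv_to_sqlite(csv_text, conn, table="incidents"):
    """Create a table from the CSV header and insert every row as text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys())
    col_defs = ", ".join(f'"{c}" TEXT' for c in columns)
    conn.execute(f'CREATE TABLE "{table}" ({col_defs})')
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f'INSERT INTO "{table}" VALUES ({placeholders})',
        [tuple(row[c] for c in columns) for row in rows],
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # use a file path instead to get a downloadable .db
inserted = csv_to_sqlite(SAMPLE_CSV, conn)
counts = conn.execute(
    "SELECT category, COUNT(*) FROM incidents GROUP BY category ORDER BY category"
).fetchall()
print(inserted, counts)  # 3 [('Assault', 1), ('Larceny Theft', 2)]
```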

robkam today at 3:57 PM

My skepticism turned into a realization when I first asked an LLM to write anything nontrivial, and it just breezed through it. I am curious why many projects mentioned here seem to take people only a few hours or a weekend at most. I have been using LLMs to help rewrite the Ytree file manager originally written in nineties C. While the AI enables creating code of this complexity, the project still demands months of persistent effort.

dang yesterday at 7:25 PM

(1) Watching it do log file analysis in seconds that would have taken me hours (edit: days really), and which I would therefore never have done in the first place.

(2) Helping me with optimizations that I had been putting off for years because they involved learning curves that I never had time to take on.

(3) Tracking down bugs in code, especially race conditions and other concurrency issues, that were otherwise baffling.

(4) Finding information that I had been unable to find using Google searches (e.g. https://news.ycombinator.com/item?id=42653136).

There have been others, but those are what come to mind - perhaps because, in each of these cases, it made something happen that would otherwise never have happened - not because it was impossible, but because the time and effort required was prohibitive.

binarysolo today at 7:46 AM

I run a remote-first ecom business with a dozen or so team members.

About a year ago, one of our account managers had a life issue, ghosted us, and she held a fairly critical role in the business and gate-kept a bunch of knowledge to some high value vendor accounts.

Because we ran our ops in Google Workspace, we essentially had off-the-shelf RAG and were able to get answers to a lot of things by asking Gemini to go through all her emails/docs/calendar/meetings, reverse engineer what she did, and create an onboarding doc for her successor.

This happened once more a few months later when one of our analysts broke his wrist on vacay, and we were again able to replicate what they did to cover for their absence, this time dabbling in AI agents ("gems") to do a bunch of the regular simple tasks and again it covered things without too many issues.

I def expect Amazon/shopify to at some point replace all of us brand owners with AI bots if they can, but we'll see how long the gravy train goes on.

PopePompus yesterday at 10:36 PM

I had an old astronomy app I wrote for pre-iPhone app store era Nokia phones (N900 etc.). I decided to get Claude Code to recreate it as an Android app. The old app produced several display pages for things like the positions of the planets. I was having Claude Code recreate the app display page by display page, describing the display that should be produced, with no reference at all to the original app's code (or even its existence). After having it reproduce several pages, it added another one unprompted. The page it added was in the original app, but I had not gotten around to adding it to the Android app. The Nokia app's code is still on github, and somehow Claude must have made a connection between what I was asking it to code (without ever mentioning the Nokia app) and my github repository's Nokia code. It correctly implemented the page without me even mentioning the missing page. My jaw hit the floor.

fulafel today at 10:22 AM

When I realized they're going to be largely powered by increased natural gas use in the USA, neatly combining with our biggest problem so far (the climate catastrophe).

tern today at 3:59 AM

Opus 3.x building me a productivity system with Obsidian MCP originally.

Next was discovering "create a mathematical model of the problem and derive the solution as a result" type prompts.

But, the real "oh s***" was a longer process of spec'ing a compiler/runtime for real-time DSP (with a lot of novel ideas) and it actually working.

My sequence was: (1) it helps me understand myself, (2) it helps me put together good ideas, (3) it can generate novel ideas given the right inputs, (4) it can build useful tools on my machine, (5) it can compound good ideas into better and better ideas with repeated passes, (6) it can build significant, ambitious machinery that's way beyond my ordinary capacity.

Current frontier: it can compound large codebases into better and better machinery with repeated passes.

The key thing I track is whether I'm running a process that converges and compounds or whether I'm spinning in place / diverging.

tliltocatl today at 4:28 PM

Still haven't had one. It is impressive, it is sometimes useful, it will be insightful (once the smoke settles), but it is nowhere close to becoming the self-improving, world-as-we-know-it-ending ultimate solution to every problem that it is being sold as. And much of the progress we have seen so far relied on tons of natural data being available thru the Web. After LLMs killed SO, where would we get the answers to train LLMs on?

djfergus today at 2:46 AM

I had an old 1st gen Amazon Firestick in a drawer for years, it had updated to the latest software and there were no public root exploits.

I spent a day bouncing between Claude and Codex and they researched, downloaded kernel sources, tried exploits and eventually got root via "FBUF/VCHIQ kernel zero-write primitive to patch live kernel memory". I was able to make the root permanent, debloat the amazon apps, downgrade the firmware etc.

It was amazing to watch and made me excited for the future where more hardware (old and new) will be available for repurposing.

CompleteSkeptic today at 1:25 AM

I helped train some of the first "magic" models at OpenAI[1] and it was a wild ride. We were a pretty sane + skeptical team and we weren't totally convinced the models were as general as they seemed, but the query that convinced me (and later got included in the paper[2]) was "Why is it important to eat socks after meditating?" (something that almost certainly did not appear on the internet before).

An interesting follow-up would be: when did you realize GenAI wasn't as good as you thought in that "oh shit" moment?

[1] co-author of InstructGPT/RLHF/ChatGPT

[2] https://arxiv.org/pdf/2203.02155

bonoboTP yesterday at 8:46 PM

The big one was definitely ChatGPT upon release in 2022 and specifically when people showed how it can role play as a Linux terminal and you can narrate events like "the data center is now on fire" and "run" nvidia-smi, and it would show high temps on the gpus etc. Or you could "explore" the homedir of some famous person. It convinced me that if it can understand so well how terminals work, tool use and agents are around the corner.

Then Opus 4.5 convinced me that this has finally arrived. In 2022 I expected things to arrive faster actually, in 2023-2024. I expected we'd have much more realtime collaborative integrations with AI including GUI computer use. Maybe in 1-2 years.

For images, it was nano banana where I realized AI images can truly work, and that all these ad hoc issues like hands and limbs, or "it will never do a horse riding an astronaut", were temporary. It's now clear that making feature-length films is within reach. Not in one go, but with an agent orchestrating, designing a screenplay, characters, shots, etc. and generating those. Whether the result will be worth watching or a flat story on the high level is another question. But it will be a "film" for sure.

jmkniyesterday at 7:55 PM

Not coding, but reading logs.

I was trying to figure out a nightmare bug that only happened in production, and Claude Code was able to connect to Google Cloud and read the logs in real time.

I recreated the bug in the UI and it was instantly able to see in the logs what the problem was. Then, because it had the context of my whole codebase, it was able to point me to the exact line of code causing the problem.

That was certainly an "oh shit" moment

conartist6today at 11:50 AM

When LinkedIn filled up with 1000 copies of what seemed like the same exact post: 20 lines long, breathless, declaring humanity over.

I thought, "I will never let myself become a zombie like that. I am me. I am worthy of my own respect"

hgoelyesterday at 9:41 PM

I've had many, but a recent one was when I figured I'd try asking Claude for help with my attempts at learning to draw, specifically anatomy.

I uploaded one of my sketches and asked for feedback, expecting it to not be too useful, but it actually pointed out many issues that no one had ever pointed out to me, and those perfectly explained some of the things that felt off to me. Out of curiosity I then also asked it to label the issues in the sketch. It wrote a Python script with the coordinates for each label and annotated the sketch that way.

I'm still used to vision LLMs not being that great at vision, so it was pretty surprising to get genuinely useful advice.

tmalytoday at 4:34 PM

It was last summer. I was at an Airbnb and the fire alarm system had a fault and kept beeping.

I took a picture of the panel and the AI was able to diagnose the issue and tell me how to temporarily disable the beeping sound.

I knew nothing about fire systems. I had the owner call a repair person the next day to resolve the issue.

Recently I was trying to find a matching stain for wood flooring in a house built in 1999. I uploaded a clear picture in bright sunlight and ChatGPT was able to search online and find a matching stain color. It presented me with ordering options and I got a quart delivered yesterday.

I have been working on my own variant of OpenClaw written in Go. I got the voice mode wired up a few weeks ago and it just started having a conversation with me. My wife freaked out and was asking who was talking to me.

mboyesterday at 8:20 PM

Look, not to brag but DALL-E's "armchair in the shape of an avocado" was mine (https://openai.com/index/dall-e/). I remember trying to convey the gravity of this capability to my friends at the time, who I guess were not as impressed as me.

dannyobrienyesterday at 8:47 PM

I got early access to the pre-ChatGPT OpenAI API (actually by pinging someone from OpenAI who posted about it on HN). At work, we were setting up to play a livestreamed JackBox game for a charity event. This would have been in 2019.

In a previous life, I'd been a writer for the original You Don't Know Jack game (the UK variant), where the job was to crank out as many funny quips about a topic as you could, and then use a handful of them in the recording of the game itself. Some of the later JackBox games are like that, but for the players -- you're given a set piece and have to come up with little funny improvisations within a time limit.

As an experiment, I tried the set-up lines with the OpenAI API to see whether it could come up with some responses. Of course, 90% of them were unfunny or incoherent, but 1/10 were not bad, or even pretty good.

I'm not sure that would have been impressive to anyone else -- but remember, I'd had this as a job, and sat in a writers' room, where everyone did this, for hours. In that environment, you expect a large proportion to be duds: the discipline is to keep pumping them out, and not flag creatively, until you find a rich vein. I realised that this was a tool that would have been the perfect complement to that work -- and it was a pretty good JackBox player too.

ddxvtoday at 4:45 AM

Most of the time when using LLM-generated code, the feeling is "Oh Awesome!"

My "Uh Oh" feelings come weeks later, when I realize there is a subtle bug in what the model presented as test-passing "awesome" code that I didn't read closely.

The biggest uh-oh is when I get lazy and let it modify multiple files and make many changes at once, and YOLO because I didn't fully understand what it did. I can usually get away with that for frontend, but for data manipulation tasks if I don't understand it, it's likely not what I wanted and I'll be back again in weeks or more trying to figure out what changed.

That's more or less what life was before LLMs and copy pasting from StackOverflow. Most of the time if I didn't fully understand something, I knew I had to eventually get back to it to grok what changed before committing.

Now with LLMs the 'copy pasting' is much faster and handles boilerplate super well, letting me focus on edge cases.

vishvanandayesterday at 10:44 PM

For me it was earlier this year when I started dusting off some old stalled projects and had an agent work on them. In a few days I:

* Built a clone of the Alpha Zero implementation[1] my team built at Oracle

* Ported my hobby NES emulator from JavaScript to Rust[2] (this actually took less than 30 minutes and worked on the first try)

* Implemented all of the lessons from the C++ Grandmasters Challenge (which eventually led to a complete C++ compiler[3])

The thing that flipped the switch was using it to build things that I had actually put sweat equity into previously. I knew how hard these things were to build, so it landed in a way that other projects had not.

[1]: https://medium.com/oracledevs/lessons-from-implementing-alph...

[2]: https://github.com/vishvananda/popeye

[3]: https://medium.com/@vishvananda/i-spent-2-billion-tokens-wri...

rerdaviesyesterday at 8:45 PM

Working on a Spice compiler to convert schematics for classic guitar pedals into real-time executable code.

I provided a reference to The Spice Manual, 2nd ed., a page number, and an equation number, and asked Claude to implement it (not really expecting it to succeed).

It proceeded to implement not only the equation, but also the calculation of the Lagrangian of the function, another 30 lines below, which required taking symbolic partial derivatives of a not-at-all trivial function and successfully figuring out which variable was which in the resulting matrix. The source material just said "Lagrangian of", and did not provide the partial differential equations. It then added a comment that identified the page number and equation number in the source text for the "Lagrangian of" equation.

amaranttoday at 7:10 AM

I had Claude build a private podcast station for me. It integrates with Gemini to create a script for the show, based on a topic of my choosing. Each talking segment ends with a presentation of the next song, which is played via Spotify and is selected to have some sort of tie-in with the previous discussion. A TTS model generates audio files based on the script, and a playlist is generated to play a local audio segment, then a Spotify track, then the next segment, etc.

An AI made a program integrating with 2 other AIs; it's AI all the way down! And the result is great! I'm learning so much by having my own private radio host speaking about topics that interest me.
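
For anyone curious about the playlist part, the interleaving described above (local segment, then Spotify track, then the next segment) could look roughly like this. It is only a sketch of the ordering logic, not the actual program: `PlaylistItem`, `build_playlist`, the file names, and the track URIs are all made up for illustration, and it assumes each segment introduces at most one song.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PlaylistItem:
    # Hypothetical record type, not taken from the real project.
    kind: str    # "segment" (local TTS audio file) or "track" (Spotify track)
    source: str  # file path for a segment, track URI for a track


def build_playlist(segment_files: List[str], track_uris: List[str]) -> List[PlaylistItem]:
    """Alternate talk segments with tracks: segment, track, segment, ..."""
    playlist: List[PlaylistItem] = []
    for i, segment in enumerate(segment_files):
        playlist.append(PlaylistItem("segment", segment))
        # Each segment introduces the song that follows it, if there is one.
        if i < len(track_uris):
            playlist.append(PlaylistItem("track", track_uris[i]))
    return playlist
```

With three segments and two tracks, the last segment simply ends the show without a song after it.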

takeeyesterday at 10:51 PM

I was working on a science experiment (electromagnetics) with my 10-year-old kid that was going to be demonstrated at a science fair in his school. We ran into a hiccup with the experiment that we couldn't debug ourselves. I turned on Gemini live video call to help us root cause the problem. It was able to clearly articulate all the possible issues and eventually was successful in making our apparatus work as expected. Turned out the wire that I was wrapping around the screw had some insulation that was not scraped off well on the side it was connecting to the battery. Gemini was able to capture this detail even though my bare eyes could not. My kid and 2 of his friends were impressed not just by the experiment, but because the live audio/video back and forth we had with the AI was almost magical!

t_seatoday at 5:22 PM

Was the early ChatGPT. Someone on the team showed off a poem about postgres in the style of the King James Bible. Totally blew my mind.

nrjamesyesterday at 8:41 PM

We were experiencing abnormally high electrical bills and I could not figure out what was happening, so I downloaded the granular usage data (15 min increments) from Duke Energy, explained what we had in our house and when we typically used those items (washer/dryer, EVs, etc), provided a rundown of our energy usage plan, then asked Claude to build me a Streamlit dashboard that would help us understand what was going on and predict what was going to happen over the next months. The dashboard had a few simple toggles and levers. Claude was basically able to one-shot this, knew how to manage the XML from Duke Energy, etc... In about 20 minutes of prompting, I had a very comprehensive dashboard that was extremely helpful not only in diagnosing that specific issue but also in helping us understand how to further lower our electrical bills.
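
To give a flavor of the kind of analysis behind a dashboard like that, here is a minimal sketch of rolling 15-minute readings up into an average per hour of day, which is one way to spot when usage spikes. This is not the dashboard code: it assumes the readings have already been parsed out of the utility export into (ISO timestamp, kWh) pairs, and `average_kwh_by_hour` is a hypothetical helper name.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, Tuple


def average_kwh_by_hour(readings: Iterable[Tuple[str, float]]) -> Dict[int, float]:
    """Average kWh consumed during each hour of day, across all days seen.

    `readings` is assumed to be (ISO 8601 timestamp, kWh) pairs, one per
    15-minute interval; the real export format may differ.
    """
    # Sum per (date, hour) first, so the 15-minute readings become hourly totals.
    hourly_totals = defaultdict(float)
    for timestamp, kwh in readings:
        t = datetime.fromisoformat(timestamp)
        hourly_totals[(t.date(), t.hour)] += kwh

    # Then average those hourly totals over the days observed for each hour.
    by_hour = defaultdict(list)
    for (_, hour), total in hourly_totals.items():
        by_hour[hour].append(total)
    return {hour: sum(totals) / len(totals) for hour, totals in sorted(by_hour.items())}
```

Sorting the result by value would then show which hours of the day carry the most load on average, for example overnight EV charging.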
