Hacker News

psanchez today at 6:23 AM

This reminds me of a story from 15 years ago, where I was developing a technology to download games on demand by hooking into the OS calls.

There was a particular game that was superslow when this tech was applied. Original game loading took around 15-20 seconds, whereas once the tech was applied it took easily 3-5 min, even with all data already downloaded.

When I started digging into it, I realized the reason was that the game was using something like

    fread(data, 1, 65536, fptr);

instead of

    fread(data, 65536, 1, fptr);

Back in the day, this basically expanded to 65k one-byte reads per call, for a file of several MB. Each fread translated into 65k calls to the Windows ReadFile API. Since my code hooked the ReadFile system call, and my hook was heavier than ReadFile itself, game loading felt really slow. Unusable. It would not have been fun for players.

The easy fix was to swap the arguments for certain calls. The long fix required an internal cache to account for these cases, so that the hooked ReadFile was faster when the data was already on disk.
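A rough sketch of the cache idea, not the original implementation: `slow_read` is a hypothetical stand-in for the hooked ReadFile, and `cached_read` serves small sequential requests from a 64 KiB block so the expensive path runs once per block instead of once per byte. Every name, type, and size here is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 65536u

/* Hypothetical backing store standing in for the file on disk. */
typedef struct {
    const unsigned char *data;
    size_t size;
    size_t slow_calls;   /* how many times the expensive path ran */
} Source;

/* Hypothetical stand-in for the hooked ReadFile: expensive per call. */
static size_t slow_read(Source *src, size_t offset, void *out, size_t len) {
    src->slow_calls++;
    if (offset >= src->size) return 0;
    size_t avail = src->size - offset;
    if (len > avail) len = avail;
    memcpy(out, src->data + offset, len);
    return len;
}

typedef struct {
    Source *src;
    unsigned char block[BLOCK_SIZE];
    size_t block_start;  /* file offset of block[0] */
    size_t block_len;    /* valid bytes in block; 0 means empty */
    size_t pos;          /* current file offset seen by the caller */
} CachedReader;

static void cached_init(CachedReader *r, Source *src) {
    r->src = src;
    r->block_start = 0;
    r->block_len = 0;
    r->pos = 0;
}

/* Sequential read; refills the block only when pos falls outside it. */
static size_t cached_read(CachedReader *r, void *out, size_t len) {
    assert(r != NULL && out != NULL);
    unsigned char *dst = out;
    size_t done = 0;
    while (done < len) {
        int hit = r->block_len > 0 && r->pos >= r->block_start &&
                  r->pos < r->block_start + r->block_len;
        if (!hit) {
            r->block_start = r->pos;
            r->block_len = slow_read(r->src, r->pos, r->block, BLOCK_SIZE);
            if (r->block_len == 0) break;   /* end of data */
        }
        size_t off = r->pos - r->block_start;
        size_t n = r->block_len - off;
        if (n > len - done) n = len - done;
        memcpy(dst + done, r->block + off, n);
        done += n;
        r->pos += n;
    }
    return done;
}
```

With this shape, a caller that reads one byte at a time only reaches `slow_read` once per 64 KiB block (plus one final call that reports end of data), which is the effect the cache fix was after.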

Funny thing is that as we started rolling out the tech and applying it to more and more games, we realized lots of games did this. We went for the cache fix and games ended up loading faster than before. Honestly, games could have loaded all the data in a couple of seconds by just swapping the args. I'm guessing developers did this on purpose so that games seemed like they were loading a lot of stuff, although you never know.


Replies

Taniwha today at 6:57 AM

I used to be a graphics card/chip architect for Macs in the early/mid 90s. Our chips were the fastest, but some programs were resistant because they did stupid stuff: PageMaker invalidated the font cache every time it went through its main loop, Quark with ATM did an n*2 thing every time it wrote text, etc. We had special hardware to accelerate text drawing and it did nothing because the software pissed it away. We considered creating a plugin that fixed all these things, but it would have been hard to maintain; in the end we travelled around to the people who made these apps and talked them through their problems.

To be fair, Excel would erase places white that it wanted to write up to 9 times before it drew any black pixels. We made that very fast! We didn't tell them :-)

At the time 24-bit framebuffers were so slow that before we built graphics acceleration hardware people would switch back to 8-bit to get stuff done, making 24-bit/true colour your daily driver was a big step forward.

Xirdus today at 2:34 PM

Reminds me of the "community patch" to GTA Online from a few years ago. The game was plagued by 10+ minute loading times. The situation remained for years and only got worse with time. Some hacker figured out that the game spent 80% of loading time reading the in-game store listing file. The file was tens of megabytes IIRC, and it literally used the Schlemiel the Painter's Algorithm - for each entry, start reading from the beginning byte after byte. The hacker made a tiny patch that made it remember where it found the last entry. This cut the total loading time by 80%, from over 10 minutes to less than 3.
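A toy illustration of the pattern described here, not the game's code or the actual patch. It assumes a hypothetical newline-separated listing: `entry_from_start` rescans from byte 0 for every entry, while `entry_after` resumes from a remembered cursor. The `bytes_scanned` counter exists only to compare the two.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how many bytes each strategy examines. */
static size_t bytes_scanned;

/* Quadratic: locate entry `index` by rescanning from the first byte. */
static const char *entry_from_start(const char *text, size_t index) {
    assert(text != NULL);
    const char *p = text;
    while (index > 0 && *p != '\0') {
        bytes_scanned++;
        if (*p == '\n') index--;
        p++;
    }
    return (index == 0 && *p != '\0') ? p : NULL;
}

/* Linear: skip the current entry, starting where the last one was found. */
static const char *entry_after(const char *cursor) {
    assert(cursor != NULL);
    const char *p = cursor;
    while (*p != '\0' && *p != '\n') { bytes_scanned++; p++; }
    if (*p == '\n') { bytes_scanned++; p++; }
    return (*p != '\0') ? p : NULL;
}

/* Visits every entry, restarting from the beginning each time. */
static size_t count_rescanning(const char *text) {
    size_t n = 0;
    bytes_scanned = 0;
    while (entry_from_start(text, n) != NULL) n++;
    return n;
}

/* Visits every entry while remembering the previous position. */
static size_t count_with_cursor(const char *text) {
    size_t n = 0;
    bytes_scanned = 0;
    const char *p = (*text != '\0') ? text : NULL;
    while (p != NULL) {
        n++;
        p = entry_after(p);
    }
    return n;
}
```

Both functions visit the same entries; the rescanning version examines a number of bytes that grows roughly with the square of the listing size, while the cursor version touches each byte once.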

Edit: removed incorrect information.

Someone today at 8:11 AM

> Which basically expanded back in the day to 65k reads of 1 byte for several MB file. Each fread translated to 65k reads of ReadFile Windows API

What software did that that badly? If the code asks for (up to) 65,536 single byte items, why would you split that into 65,536 calls?

Also, that change changes behavior. The old call could read anything from zero to 65,536 bytes, the new one only can read zero or 65,536 bytes.

(Reading the source of a few implementations, I think most implementations will fill the output buffer with partial objects if the input doesn’t supply an integral number of them, but the return value of fread cannot signal that to the caller)
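The behavioral difference can be checked with a small experiment. This is a minimal sketch using only standard C; `fread_result` is a hypothetical helper that writes a short temporary file and reports what `fread` returns for a given size/count split.

```c
#include <assert.h>
#include <stdio.h>

/* Writes `file_len` bytes to a temporary file, then reads it back with the
   given size/count split. Returns fread's result, or (size_t)-1 if the
   temporary file could not be created. */
static size_t fread_result(size_t file_len, size_t size, size_t count) {
    static unsigned char buf[65536];
    assert(file_len <= sizeof buf && size * count <= sizeof buf);

    FILE *f = tmpfile();
    if (f == NULL) return (size_t)-1;

    for (size_t i = 0; i < file_len; i++) fputc('x', f);
    rewind(f);

    /* fread returns the number of complete items read, not bytes. */
    size_t got = fread(buf, size, count, f);
    fclose(f);
    return got;
}
```

For a 10-byte file, `fread_result(10, 1, 16)` reports 10 complete one-byte items, while `fread_result(10, 16, 1)` reports 0 because no complete 16-byte item was available. Code that parses "the number of bytes returned" therefore behaves differently after the arguments are swapped.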

mort96 today at 9:58 AM

Wait, is that wrong? I always call fread as:

    fread(data, 1, sizeof(buffer), f);

with the rationale that I'm interested in reading sizeof(buffer) individual bytes. The buffer size is incidental, not the size of the items I'm trying to read from the file; "read one item whose size is sizeof(buffer)" seems semantically wrong.

Is this just the case of Windows having a bad stdlib fread implementation 15 years ago or is my thinking here actually wrong?

fsfod today at 10:42 AM

Part of Windows Explorer actually does tons of tiny 4-byte ReadFile calls into its tracking-database-like file when you delete a file. If you are deleting lots of files, this quickly adds up.

Dwedit today at 8:35 PM

Is this actually real? I thought fread just multiplied the two numbers together to compute a total size. Meanwhile, the Win32 API call ReadFile actually does do a separate system call if you call it multiple times.

somenameforme today at 7:41 AM

Doesn't that break anything relying on the return value? fread gives you the number of objects read as a return. So I think a pretty typical thing would be to fread and then parse that number of characters, and that'd just break?

gwbas1c today at 6:13 PM

> The long fix required to use an internal cache to account for these cases

That's because the OS does the same thing too. It's the right fix; when I implemented something similar, we implemented caching right away.

lukan today at 8:09 AM

"I'm guessing developers did this on purpose so that games seemed like they were loading a lot of stuff"

I really hope that was not the case, and I would rather put it down to incompetence or to dealing with obscure legacy problems, but the gamer in me gets enraged at the thought that someone would artificially increase loading times.

dfox today at 1:00 PM

The most important fix in SP1 for Office 2007 was fixing exactly that in Excel. Doing a ridiculous number of 4-byte reads made it basically unusable on network filesystems.

chadgpt3 today at 10:11 AM

Why does your fread do anything other than multiplying the two arguments?
