A month ago, I went on a performance quest trying to optimize a PHP script that took 5 days to run. Together with the help of many talented developers, I eventually got it to run in under 30 seconds. This optimization process with so much fun, and so many people pitched in with their ideas; so I eventually decided I wanted to do something more.
That's why I built a performance challenge for the PHP community
The goal of this challenge is to parse 100 million rows of data with PHP, as efficiently as possible. The challenge will run for about two weeks, and at the end there are some prizes for the best entries (amongst the prize is the very sought-after PhpStorm Elephpant, of which we only have a handful left).
I hope people will have fun with it :)
Pitch this to whoever is in charge of performance at Wordpress.
A Wordpress instance will happily take over 20 seconds to fully load if you disable cache.
5 days to 30 seconds? What kind of factor/order of magnitude is that damn
What takes 5 days to run
> A month ago, I went on a performance quest trying to optimize a PHP script that took 5 days to run. Together with the help of many talented developers, I eventually got it to run in under 30 seconds
That's a huge improvement! How much was low hanging fruit unrelated to the PHP interpreter itself, out of curiosity? (E.g. parallelism, faster SQL queries etc)
Hehe. Optimization ... it's a good way to learn. Earlier in my career I did a lot of PHP. Usually close to bare.
Other than the obvious point that writing an enormous JSON file is a dubious goal in the first place (really), while PHP can be very fast this is probably faster to implement in shell with sed/grep, or ... almost certainly better ... by loading to sqlite then dumping out from there. Your optimization path then likely becomes index specification and processing, and after the initial load potentially query or instance parallelization.
The page confirms sqlite is available.
If the judges whinge and shell_exec() is unavailable as a path, as a more acceptable path that's whinge-tolerant, use PHP's sqlite feature then dump to JSON.
If I wanted to achieve this for some reason in reality, I'd have the file on a memory-backed blockstore before processing, which would yield further gains.
Frankly, this is not much of a programming problem, it's more a system problem, but it's not being specced as such. This shows, in my view, immaturity of conception of the real problem domain (likely IO bound). Right tool for the job.
Using a language that is 100x slower than naive native programs to do a "speed challenge" is like spending your entire day speed walking to run errands when you can just learn how to drive a car.
exec(‘c program that does the parsing’);
Where do I get my prize? ;)
A month ago, I went on a performance quest trying to optimize a PHP script that took 5 days to run. Together with the help of many talented developers, I eventually got it to run in under 30 seconds.
When people say leetcode interviews are pointless I might share a link to this post. If that sort of optimization is possible there is a structures and algorithms problem in the background somewhere.
Do not update the leaderboard.... at all.