This reads like Claude wrote it (more so than ChatGPT). Interesting data, but I am unsure how actionable it is. Are they suggesting, for example, that commits with specific kinds of messages get scanned more closely? Why is CAN more severe than Intel? (It does worry me. I feel like bugs of any sort in car systems are terrifying.)
I'm not happy with the lack of statistical testing; some of the smaller percentage differences could easily be coincidence.
I'd also like to see this broken down for C vs Rust.
These smell like the kind of metrics that cause someone to feel informed and then to miss the forest for the trees. The kind of data for a "data driven" decision maker who will just invent a narrative to explain the numbers, and then do what they wanted to do all along.
The map is not the territory.
I'm not sure why this isn't included in the blog, but I was curious about the ratio between bugs and commits. Presented here are my calculations in order of total number of bugs:
Intel : 11.86%
Independent [1] : 2.27%
Red Hat : 9.74%
Linaro : 12.73%
Google : 12.78%
AMD : 9.70%
The above is based on the bug count table in the article.
[1] I combined the total bug count for independent and kernel.org because they are combined for the total contributions here, https://github.com/quguanni/kernel-archaeology/blob/main/scr...
This suggests that corporations introduce bugs at a significantly higher rate per commit than independent developers. However, I have not done statistical testing on this, nor have I reproduced the underlying numbers. If I had to speculate, I would guess that the author's analysis was partly vibe-coded, or that they purposely left this comparison out for fear of retaliation. Speculating further, corporations could be introducing bugs on purpose, out of malice, so that backdoors are available to them. The author says there is no "corporate takeover", but perhaps there are more interesting conclusions to be found.
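For anyone who wants to check whether a gap between two of these rates is more than noise, here is a minimal sketch of a two-proportion z-test using only the standard library. The function name and the counts in the usage lines are placeholders I made up for illustration; they are not taken from the article, so plug in the real bug and commit counts before reading anything into the result.

```python
import math


def two_proportion_z_test(bugs_a, commits_a, bugs_b, commits_b):
    """Two-sided z-test for a difference between two bug-per-commit rates."""
    rate_a = bugs_a / commits_a
    rate_b = bugs_b / commits_b
    # Pooled rate under the null hypothesis that both groups share one rate.
    pooled = (bugs_a + bugs_b) / (commits_a + commits_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / commits_a + 1 / commits_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the normal distribution: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# PLACEHOLDER counts, not real data from the article.
z, p = two_proportion_z_test(bugs_a=1186, commits_a=10000,
                             bugs_b=227, commits_b=10000)
print(f"z = {z:.2f}, p = {p:.3g}")
```

This treats every commit as an independent trial, which is a strong assumption: commits by the same author or to the same subsystem are probably correlated, so a small p-value here would be a hint rather than proof.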
Bugs Georg, who is an outlier and should be excluded from the analysis.
> Half the kernel is still built by individuals: people using gmail.com, personal domains, or university emails. The "corporate takeover" narrative is overstated. Companies contribute heavily, but the kernel remains a genuinely collaborative project.
Isn't the assumption here flawed? Someone may be employed by a corporation but still commit from a Gmail, personal, or university address. This needs to be cross-referenced against some secondary source of employment data to give a more accurate picture.
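A rough sketch of what that cross-referencing could look like, assuming you had some mapping from author email to employer. The `employer_by_email` mapping, the `classify` helper, and the addresses are all hypothetical; I don't know how the article's scripts actually classify authors.

```python
# Illustrative only, far from an exhaustive list of personal mail providers.
PERSONAL_DOMAINS = {"gmail.com"}


def classify(email, employer_by_email):
    """Prefer a known employer from a secondary source, else fall back to the domain."""
    email = email.strip().lower()
    if email in employer_by_email:
        return employer_by_email[email]
    domain = email.rsplit("@", 1)[-1]
    if domain in PERSONAL_DOMAINS or domain.endswith(".edu"):
        return "individual"
    return domain


# Hypothetical secondary source: one Gmail user who is actually employed.
employer_by_email = {"dev@gmail.com": "ExampleCorp"}

print(classify("dev@gmail.com", {}))                 # domain-only view
print(classify("dev@gmail.com", employer_by_email))  # with employment data
```

The same address lands in a different bucket depending on whether employment data is available, which is exactly why the "half the kernel is individuals" figure could be inflated.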