
puppion · yesterday at 9:27 PM

Really nice introduction. Two things stood out to me that I think set this apart from the dozens of "intro to PyTorch" posts out there:

1. The histogram visualization of the different tensor initialization functions is a great idea. I've seen so many beginners confused about rand vs randn vs empty, and seeing the distributions side by side makes the differences immediately obvious (see the first sketch below this list). More tutorials should lead with "the best way to understand is to see it."

2. I appreciate that the article is honest about its own results. A lot of intro tutorials quietly pick a dataset where their simple model gets impressive numbers. Here the model gets 18.6% MAPE with only 37% of predictions within 10% of the actual price, and instead of hand-waving, the author correctly diagnoses the issue: the features don't capture location at a fine enough granularity, and no amount of architecture tuning will recover information that isn't in the inputs. That's arguably the most important ML lesson in the whole piece, and it's buried at the end almost as an afterthought. "Great models can't compensate for missing information" is something I wish more practitioners internalized early. (Both metrics are sketched in the second snippet below.)
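
To illustrate point 1, here's a minimal sketch of the kind of side-by-side comparison I mean (my own code, not the article's):

    import torch
    import matplotlib.pyplot as plt

    n = 100_000
    samples = {
        "torch.rand: uniform on [0, 1)": torch.rand(n),
        "torch.randn: standard normal": torch.randn(n),
        "torch.empty: uninitialized memory": torch.empty(n),
    }

    fig, axes = plt.subplots(1, 3, figsize=(12, 3))
    for ax, (title, x) in zip(axes, samples.items()):
        # torch.empty just exposes whatever bytes were already in memory,
        # so drop non-finite garbage and expect a different plot every run.
        x = x[torch.isfinite(x)]
        ax.hist(x.numpy(), bins=100)
        ax.set_title(title, fontsize=9)
    plt.tight_layout()
    plt.show()

rand and randn come out as clean uniform and Gaussian shapes; empty is the one that surprises beginners, because it isn't a distribution at all.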
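And for concreteness on the numbers in point 2, both metrics are a couple of lines each. A sketch with made-up pred/target tensors, since I obviously don't have the article's data:

    import torch

    def mape(pred, target):
        # Mean absolute percentage error, in percent.
        return (torch.abs(pred - target) / target).mean().item() * 100

    def share_within(pred, target, pct=10.0):
        # Percentage of predictions within pct% of the true value.
        err = torch.abs(pred - target) / target
        return (err <= pct / 100).float().mean().item() * 100

    # Made-up house prices just to exercise the functions.
    target = torch.tensor([200_000.0, 350_000.0, 500_000.0])
    pred = torch.tensor([230_000.0, 360_000.0, 410_000.0])
    print(mape(pred, target), share_within(pred, target))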

The suggestion to reach for XGBoost/LightGBM for tabular data is also good advice that too many deep learning tutorials omit. Would love to see a follow-up comparing the two approaches on this same dataset.
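
For what it's worth, the gradient-boosting half of that comparison is only a few lines. A rough sketch, using a stand-in sklearn dataset since I don't have the article's features:

    import xgboost as xgb
    from sklearn.datasets import fetch_california_housing
    from sklearn.metrics import mean_absolute_percentage_error
    from sklearn.model_selection import train_test_split

    # Stand-in tabular regression data; swap in the article's features/target.
    X, y = fetch_california_housing(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = xgb.XGBRegressor(n_estimators=500, learning_rate=0.05, max_depth=6)
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    print(f"MAPE: {100 * mean_absolute_percentage_error(y_test, preds):.1f}%")

No training loop to write, almost nothing to tune, and on most tabular problems it's a strong baseline to beat.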


Replies

0bytematt · yesterday at 11:53 PM

Thank you so much. Really appreciate the thoughtful feedback!

I've watched many intros myself. Somehow they always end with 90%+ accuracy, which was just not my experience when learning on datasets I picked myself. I remember spending hours tuning parameters and not quite understanding why my results were so much worse. I left those numbers in intentionally, so I'm glad you commented on exactly that!

The XGBoost comparison is a great idea.