A fun thing I do with Qwen 3.5 0.8b is take a screenshot of the Hacker News homepage and ask it for a JSON representation of the data, and it does surprisingly well. With a well-structured prompt, I think it could be made into a pretty reliable tool for that type of task out of the box.
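To make "well structured" concrete, here is a rough sketch: a prompt that pins down the output schema, plus a validator that rejects replies that don't match it. The field names (`rank`, `title`, `points`, `comments`), the prompt wording, and the `parse_stories` helper are all assumptions for illustration, not something I've tuned against the model, and the actual model call is left out.

```python
import json

# Hypothetical schema for illustration: field name -> expected Python type.
FIELDS = {"rank": int, "title": str, "points": int, "comments": int}

# Hypothetical prompt; the wording is an assumption, not a tested recipe.
PROMPT = (
    "Extract every story from this screenshot of the Hacker News homepage. "
    "Reply with only a JSON array. Each element must be an object with exactly "
    "these keys: rank (integer), title (string), points (integer), "
    "comments (integer)."
)

FENCE = "`" * 3  # Markdown code-fence marker


def parse_stories(reply: str) -> list[dict]:
    """Parse the model's reply; raise ValueError if it breaks the schema."""
    text = reply.strip()
    # Models sometimes wrap JSON in a Markdown fence; strip it if present.
    if text.startswith(FENCE):
        text = text.strip("`").removeprefix("json").strip()
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, list):
        raise ValueError("expected a JSON array")
    for i, story in enumerate(data):
        if not isinstance(story, dict) or set(story) != set(FIELDS):
            raise ValueError(f"story {i}: wrong keys")
        for key, typ in FIELDS.items():
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(story[key], bool) or not isinstance(story[key], typ):
                raise ValueError(f"story {i}: {key} is not {typ.__name__}")
    return data
```

Retrying whenever `parse_stories` raises is probably the cheapest way to turn "surprisingly well" into "reliable enough".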
While it's a fun PoC, surely it would be better to just use the API (see the footer)? Or just `curl | x2j | jq` and map the HTML directly to JSON?
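For comparison, the API route is only a few lines. This is a minimal sketch against the public Hacker News Firebase API; the output field names and the `front_page` and `fetch_json` helpers are my own choices, and I'm assuming `topstories` is close enough to the homepage ranking for this purpose.

```python
import json
from urllib.request import urlopen

# Public Hacker News API (the one linked from the site footer).
BASE = "https://hacker-news.firebaseio.com/v0"


def fetch_json(url: str):
    """Fetch a URL and decode its JSON body."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def front_page(limit: int = 30, fetch=fetch_json) -> list[dict]:
    """Return the top `limit` stories as plain dicts, in ranked order.

    `fetch` is injectable so the function can be exercised without network access.
    """
    ids = fetch(f"{BASE}/topstories.json")[:limit]
    stories = []
    for rank, story_id in enumerate(ids, start=1):
        # A deleted item can come back as null, hence the `or {}`.
        item = fetch(f"{BASE}/item/{story_id}.json") or {}
        stories.append({
            "rank": rank,
            "title": item.get("title"),
            "url": item.get("url"),  # absent for text posts such as Ask HN
            "points": item.get("score"),
            "comments": item.get("descendants", 0),
        })
    return stories
```

`print(json.dumps(front_page(limit=5), indent=2))` dumps the top five stories. It makes one request per story, so it is slower than scraping a single page, but there is nothing to hallucinate.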