Is it possible to host your website in such a way that it couldn't be found via search engines (and thus, I hope, wouldn't be crawlable)?
I know this has repercussions on findability, but if that wasn't a concern, I'm curious how one might circumvent getting crawled.
Possible, yes; likely, no. The moment you're issued a certificate, your domain will show up in the Certificate Transparency logs, which are constantly monitored by anyone who wants to find new sites.
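To illustrate how easy that monitoring is, here is a rough sketch that pulls hostnames out of a CT search result. It assumes crt.sh as the search front end and its JSON output with a "name_value" field; the function names are made up for this example, and the network call is kept in its own function so the parsing can be tried offline.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def ct_search_url(domain: str) -> str:
    """Build a crt.sh query URL; '%.' is a wildcard that matches subdomains."""
    return "https://crt.sh/?" + urlencode({"q": f"%.{domain}", "output": "json"})


def hostnames_from_ct(entries: list[dict]) -> set[str]:
    """Collect the distinct hostnames named in a list of CT search entries."""
    names = set()
    for entry in entries:
        # Assumption: "name_value" may hold several newline-separated names.
        for name in entry.get("name_value", "").splitlines():
            names.add(name.strip().lower())
    names.discard("")
    return names


def fetch_hostnames(domain: str) -> set[str]:
    """Query crt.sh over the network and return every hostname it lists."""
    with urlopen(ct_search_url(domain), timeout=30) as resp:
        return hostnames_from_ct(json.load(resp))
```

Calling fetch_hostnames("example.com") would list whatever names have had certificates logged for that domain, which is exactly the list a crawler would start from.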
robots.txt is a way of leaving the door unlocked but kindly asking bots to stay outside.
You could just put your website content behind its own chat interface. The crawler would just see a form input for a prompt.
If you really want to, and you're happy with plain text and basic styling, I'd recommend trying other protocols, like setting up a Gemini or Gopher site. I don't think scraping happens there on anywhere near the same scale as on the conventional web.
That being said, your users would have to download a Gemini/Gopher-compatible browser.
Sure, it depends on how accessible to people you want it to be.
Most legit search engines are going to honor robots.txt and you can disallow access.
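For reference, a disallow-everything robots.txt is only two lines. The sketch below checks it with Python's standard urllib.robotparser, which is roughly what a well-behaved crawler does before fetching a page; the is_allowed helper and the example.com URLs are just for illustration.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks every crawler to stay away from every path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/page"))
```

Keep in mind this only affects crawlers that bother to check; nothing here is enforced by the server.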
Next level would be using something like rate limiting controls and/or Cloudflare's bot fight mode to start blocking the bad bots. At this point you start to annoy some real people too.
Next would be putting the content behind some form of auth.