These guardrails are solely a reason for using your data for training purposes. Every flagged message can be used for training.
> We will require 30-day retention for all traffic on Mythos-class models, on both first- and third-party surfaces. We won’t use this data to train new Claude models, or for any non-safety-related purpose
Whatever problem we might have with them, they explicitly say that they do not do this in the launch post.
If they can train the classifier to have fewer false positives that would be great.
I'd expect that everything they see gets used for for training purposes (and data mining in general) regardless of if it's flagged or not. It'd take a whistleblower for you to ever find out either way.
this reasoning is inverted lol they would get a lot more information by letting you use it. so much weird drama around reasonable guardrails for an experimental model
If we're doing conspiracy theories what if fable is really dumb and not better than opus and the guardrails hide that nicely. Meanwhile the hype train keeps chugging.
This sounds backwards, any interrupted conversation becomes less useful for training.