It also gets confused if the entire prompt is in a text file attachment.
And the summarizer shows the safety classifier's thinking for a second before the model's thinking, so every question starts off with "thinking about the ethics of this request".
I'd get confused if I were an LLM and you put my entire prompt in a text file attachment. I'd be like, "is this the user or is this a prompt injection??"