We run a RAG system over 11M characters of classical Buddhist texts,
and one natural defense against poisoning is that canonical texts have
centuries of scholarly cross-referencing. Multiple independent
editions (Chinese, Sanskrit, Pali, Tibetan) of the same sutra serve as
built-in verification. The real challenge for us is not poisoning but
hallucination: the LLM confidently "quoting" passages that don't
exist in any edition.
The multi-edition cross-referencing is a natural implementation of what the embedding anomaly detection layer does artificially: a poisoned document that contradicts centuries of independently verified canonical text would cluster anomalously against the existing corpus almost immediately. Your attack surface is genuinely different from that of enterprise RAG.
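A minimal sketch of that anomaly check, assuming cosine distance to a corpus centroid as the anomaly score. The function names, toy embeddings, and threshold are all illustrative, not any particular library's API:

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def centroid(vectors):
    # Mean embedding of the trusted corpus.
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def is_anomalous(doc_vec, corpus_vecs, threshold=0.5):
    # Flag a new document whose embedding sits far from the corpus centroid.
    return cosine(doc_vec, centroid(corpus_vecs)) < threshold

# Toy 3-d embeddings: the corpus clusters in one direction,
# a hypothetical poisoned document points elsewhere.
corpus = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [0.95, 0.05, 0.0]]
legit = [0.85, 0.15, 0.05]
poisoned = [0.0, 0.1, 0.95]
print(is_anomalous(legit, corpus))     # False
print(is_anomalous(poisoned, corpus))  # True
```

In practice the score would come from a real embedding model and the threshold from the observed distribution of intra-corpus similarities, but the shape of the check is the same.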
The hallucination problem you're describing is in some ways the inverse of poisoning. Poisoning is external content overriding legitimate content; hallucination is the model generating content that was never in the knowledge base at all. The defenses diverge at that point: retrieval grounding and citation verification help with hallucination, while ingestion controls don't.
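Citation verification for quoted passages can be sketched as a post-generation check: normalize the model's quote and confirm it actually appears (exactly or near-exactly) in the retrieved source text. This is an illustrative sketch using stdlib fuzzy matching, not a production verifier; the sample sutra line and thresholds are made up:

```python
import difflib
import re

def normalize(text):
    # Collapse whitespace, strip punctuation, lowercase, so minor
    # formatting differences between editions don't fail the check.
    return re.sub(r"[^\w\s]", "", " ".join(text.split())).lower()

def quote_is_grounded(quote, sources, min_ratio=0.9):
    # True if the quoted passage closely matches a span in any retrieved source.
    q = normalize(quote)
    for src in sources:
        s = normalize(src)
        if q in s:
            return True
        # Fall back to longest-common-substring matching for small
        # OCR or edition-level variants.
        m = difflib.SequenceMatcher(None, q, s)
        match = m.find_longest_match(0, len(q), 0, len(s))
        if match.size / max(len(q), 1) >= min_ratio:
            return True
    return False

sources = ["All conditioned things are impermanent; "
           "work out your salvation with diligence."]
print(quote_is_grounded("All conditioned things are impermanent", sources))  # True
print(quote_is_grounded("a passage the model invented out of thin air", sources))  # False
```

A quote that fails this check never existed in the retrieved context, which is exactly the hallucination case; a stricter version could also verify the cited edition and passage identifier.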