The document standardization problem you're describing maps closely to what we see in DeFi infrastructure, different chains, different data formats, no consistent standards, and existing tools breaking on real-world inputs. The "model agnostic base + market-specific fine-tuning" architecture is smart. Curious how you handle cases where the same lender operates across multiple markets with conflicting document conventions, does the model layer stay separate per market or do you find cross-market signal bleeding actually helps fraud detection?