From the TFA: Document Intelligence (RAG): Ingest docs, ask questions by voice — ~4ms hybrid retrieval.
Seems pretty clear. You can supply documents to the model as input and then verbally ask questions about them.