Grab frames, lower res, classify, combine metadata. Write to SQL.

m3kw9 • today at 6:14 PM

Replies

Not really. Grab frames, lower res, classify, combine metadata, transcribe the audio, convert that data (text, visual and audio) to embeddings, and save them to a vector DB and a SQL DB. That let me do semantic search and RAG, search with a screenshot of the video to find the exact moment in the video, and search with an audio file as well. The vector DB unlocks other features too.
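For anyone curious what the store-and-search part of this looks like, here is a rough sketch, not my actual code. Everything in it is a stand-in: `toy_embed` is a plain bag-of-words vector where a real setup would call a text, image or audio embedding model, the `vectors` dict plays the role of the vector DB, and the `segments` table and `VideoIndex` class are made-up names. Metadata goes to SQL, embeddings go to the vector store, and a query is answered by nearest-neighbor lookup followed by a SQL fetch of the timestamp.

```python
import math
import sqlite3
from collections import Counter


def toy_embed(text):
    """Stand-in for a real embedding model: a sparse bag-of-words vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors (token -> count)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class VideoIndex:
    """Hypothetical index: SQLite for metadata, a dict as the 'vector DB'."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE segments ("
            "id INTEGER PRIMARY KEY, video TEXT, start_sec REAL, "
            "label TEXT, transcript TEXT)"
        )
        self.vectors = {}  # segment id -> embedding

    def add_segment(self, video, start_sec, label, transcript):
        # Metadata (classifier label, transcript, timestamp) goes to SQL.
        cur = self.db.execute(
            "INSERT INTO segments (video, start_sec, label, transcript) "
            "VALUES (?, ?, ?, ?)",
            (video, start_sec, label, transcript),
        )
        # The embedding goes to the vector store, keyed by the SQL row id.
        self.vectors[cur.lastrowid] = toy_embed(f"{label} {transcript}")
        return cur.lastrowid

    def search(self, query, k=1):
        # Rank segments by similarity, then look up their metadata in SQL.
        q = toy_embed(query)
        ranked = sorted(
            self.vectors,
            key=lambda sid: cosine(q, self.vectors[sid]),
            reverse=True,
        )[:k]
        return [
            self.db.execute(
                "SELECT video, start_sec, label FROM segments WHERE id = ?",
                (sid,),
            ).fetchone()
            for sid in ranked
        ]


index = VideoIndex()
index.add_segment("demo.mp4", 0.0, "title card", "welcome to the channel")
index.add_segment("demo.mp4", 42.5, "whiteboard",
                  "here is how gradient descent updates the weights")
index.add_segment("demo.mp4", 90.0, "outro",
                  "thanks for watching see you next time")

print(index.search("gradient descent weights"))
# [('demo.mp4', 42.5, 'whiteboard')]
```

Searching with a screenshot or an audio clip would follow the same path, assuming an embedding model that maps the query into the same vector space as the stored frames or audio; only the `toy_embed` step changes.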

alt Hacker News
