In theory, locally you'd use these where lossiness is acceptable for audio transcription and im...

mhitza • today at 5:42 PM • 0 replies • view on HN

In theory, locally you'd use these where lossiness is acceptable for audio transcription and image labeling (as simple examples).

In practice I haven't got around to building something around multimodality since I'm primarily using their text generation capabilities.

alt Hacker News