The model has 3B active parameters. We put the code, homepage, paper and model links here:
- Code: https://github.com/bytedance/Lance
- Homepage: https://lance-project.github.io/
- Paper: https://arxiv.org/abs/2605.18678
- Model: https://huggingface.co/bytedance-research/Lance
p.s. Lance is a research project, not a polished product. The model was trained using fewer than 128 GPUs.
Any plans to port to SGLang or vLLM?
Great quality. Forked, and going to try it.
Seems like the video output is crippled. Resolution is low (720 or so), as is the frame rate. The samples are shown up-scaled and frame-interpolated.
Why do that? Seems strange to be building sub-HD resolution video models in 2026.
Nice work. Wish they had picked another name given how popular lance/lancedb is.
Imagine having virtually unlimited compute and programming resources, and silly little slop videos are the result.
Fabulous.
Video understanding is kind of new, especially when done well, and it would be great if it works well with UI and UX. Current agents already struggle a bit with 2D space in normal screenshots of unconventional UIs. I wonder if this model would do better with actual recordings of navigating and using applications; it feels like it could help a lot with understanding UX, at least. Will be fun to play around with :)