
lucataco/kosmos-2
Grounding Multimodal Large Language Models to the World