multimodal vlm projects
drex 062225 experimental
drex 062225 exp (document retrieval and extraction expert) is a specialized fine-tuned version of docscopeocr, optimized for document retrieval and content extraction. built on top of qwen2.5-vl.
model
•
june 2025
total downloads: 1.91k+
qwen2 vl ocr 2b instruct 191224
fine-tuned version of qwen2-vl-2b, tailored for tasks that involve messy handwriting recognition, optical character recognition (ocr), and latex formatting.
model
•
december 2024
total downloads: 233.13k+
virex 062225 experimental
virex 062225 exp (video information retrieval and extraction expert) is a fine-tuned version of qwen2.5-vl, optimized for advanced video understanding and reasoning.
model
•
june 2025
total downloads: 2.1k+
qwen2 vl ocr2 2b instruct 030325
fine-tuned version of qwen2-vl-2b, tailored for optical character recognition (ocr) and latex formatting, with improvements over the previous version.
model
•
march 2025
total downloads: 5.29k+
docscope ocr 7b 050425 exp
post-trained version of qwen2.5-vl-7b-instruct, optimized for document-level optical character recognition (ocr) and accurate image-to-text conversion with mathematical latex formatting.
model
•
april 2025
total downloads: 2.89k+
multimodal ocr demo
multimodal ocr space for experimenting with qwen2 vl ocr 2b, a model tailored for optical character recognition (ocr) and latex formatting, with improvements over the previous version. runs on nvidia h200.
space
•
december 2024
total visits: 63.3k+
callisto ocr3 2b 010425
fine-tuned version (v3) of qwen2-vl-2b-instruct, specifically optimized for messy handwriting recognition, optical character recognition (ocr), and math problem solving with latex formatting.
model
•
may 2025
total downloads: 1.4k+
core ocr 7b 050325 preview
post-trained version of qwen2-vl-7b, optimized for document-level optical character recognition (ocr) and accurate image-to-text conversion with mathematical latex formatting.
model
•
april 2025
total downloads: 2.29k+