Vision Language Models: MoE-LLaVA, MOBILE-AGENT, and more
Like 👍. Comment 💬. Subscribe 🟥. 🏘 Discord: https://discord.gg/pPAFwndTJd
YouTube: https://youtube.com/live/uYb38g-weEY
X: https://twitter.com/i/broadcasts/1zqKVqkYELnxB
Twitch: https://www.twitch.tv/hu_po
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models https://arxiv.org/pdf/2401.15947.pdf
Routers in Vision Mixture of Experts: An Empirical Study https://arxiv.org/pdf/2401.15969.pdf
InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Models https://arxiv.org/pdf/2401.16420.pdf
LLaVA-1.6: Improved reasoning, OCR, and world knowledge https://llava-vl.github.io/blog/2024-01-30-llava-1-6/
MouSi: Poly-Visual-Expert Vision-Language Models https://arxiv.org/pdf/2401.17221.pdf
moondream: https://github.com/vikhyat/moondream
MoE-LLaVA-Phi2-2.7B-4e-384 weights: https://huggingface.co/LanguageBind/MoE-LLaVA-Phi2-2.7B-4e-384
LLaVA-v1.6-Mistral-7B demo: https://replicate.com/yorickvp/llava-v1.6-mistral-7b
Qwen-VL blog: https://qwenlm.github.io/blog/qwen-vl/
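The MoE papers above (MoE-LLaVA and the routers study) both hinge on a learned router that sends each token to its top-k experts and renormalizes the kept gate weights. A minimal numpy sketch of that routing step, with all names hypothetical and not the exact implementation from either paper:

```python
import numpy as np

def top_k_route(token_embeddings, router_weights, k=2):
    """Route each token to its top-k experts (generic MoE routing sketch,
    not the exact MoE-LLaVA code)."""
    # Router produces one logit per expert for each token.
    logits = token_embeddings @ router_weights           # (tokens, experts)
    # Keep only the k highest-scoring experts per token.
    top_idx = np.argsort(logits, axis=-1)[:, -k:]        # (tokens, k)
    top_logits = np.take_along_axis(logits, top_idx, axis=-1)
    # Softmax over just the kept logits so gate weights sum to 1.
    gates = np.exp(top_logits - top_logits.max(axis=-1, keepdims=True))
    gates /= gates.sum(axis=-1, keepdims=True)
    return top_idx, gates

# Toy example: 4 tokens, hidden size 8, 4 experts, top-2 routing.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
router = rng.normal(size=(8, 4))
idx, gates = top_k_route(tokens, router, k=2)
print(idx.shape, gates.shape)  # (4, 2) (4, 2)
```

Each token's output would then be the gate-weighted sum of its k selected expert FFNs; only those k experts run, which is how MoE keeps activated parameters low while total parameters grow.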