I am currently an engineer at OpenDataLab, working on data for Large Vision-Language Models (LVLMs), particularly document understanding and parsing.
Recent Projects
Here is my Google Scholar
Reach out to me: [email protected]
This is the repository for the ACM MM 2022 paper "Exploring Effective Knowledge Transfer for Few-shot Object Detection".
Forked from Vision-CAIR/MiniGPT-4
MiniGPT-4: Enhancing Vision-language Understanding with Advanced Large Language Models
Python 1
Forked from haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Python 1
Forked from ultralytics/ultralytics
NEW - YOLOv8 in PyTorch > ONNX > OpenVINO > CoreML > TFLite
Python