- Large language models
- Multi-modal large language models
- Efficient Large Language Models: A Survey [paper] [repo]
- Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security [paper] [repo]
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (Arxiv 2024) [paper]
- A Performance Evaluation of a Quantized Large Language Model on Various Smartphones (Arxiv 2023) [paper]
- EdgeMoE: Fast On-Device Inference of MoE-based Large Language Models (Arxiv 2023) [paper]
- AutoDroid: LLM-powered Task Automation in Android (Arxiv 2023) [paper] [code]
- Towards an On-device Agent for Text Rewriting (Arxiv 2023) [paper]
- LLM as a System Service on Mobile Devices (Arxiv 2024) [paper]
- Explore, Select, Derive, and Recall: Augmenting LLM with Human-like Memory for Mobile Task Automation (Arxiv 2023) [paper]
- LLMCad: Fast and Scalable On-device Large Language Model Inference (Arxiv 2023) [paper]
- LLM in a flash: Efficient Large Language Model Inference with Limited Memory (Arxiv 2023) [paper]
- PrivateLoRA For Efficient Privacy Preserving LLM (Arxiv 2023) [paper] [code]
- Efficient Streaming Language Models with Attention Sinks (ICLR 2024) [paper] [code]
- Efficient LLM inference solution on Intel GPU (Arxiv 2023) [paper]
- Accelerating LLM Inference with Staged Speculative Decoding (Arxiv 2023) [paper]
- Revolutionizing Mobile Interaction: Enabling a 3 Billion Parameter GPT LLM on Mobile [paper]
- MoE-LLaVA: Mixture of Experts for Large Vision-Language Models (Arxiv 2024) [paper] [code]
- MobileVLM: A Fast, Strong and Open Vision Language Assistant for Mobile Devices (Arxiv 2023) [paper] [code]
- MobileVLM V2: Faster and Stronger Baseline for Vision Language Model (Arxiv 2024) [paper] [code]
Open
- llama.cpp [code]
- alpaca.cpp [code]
- Qwen-1.8B [code]
- deepseek [code]
- Phi2-mini-Chinese [code]
- Yuan [code]
- Baby-llama2-Chinese [code]
- TinyLlama-1.1B [code]
- MINI-LLM [code]
- ChatLM-mini-Chinese [code]
- alpaca-electron [code]
- FreedomGPT [code]
- Lumos [code]
- code-llama-for-vscode [code]
- AutoDroid [code]
- gradio [web]