diff --git a/index.md b/index.md index 1eb04ad..98861f9 100644 --- a/index.md +++ b/index.md @@ -72,6 +72,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM, | [HuggingChat](https://huggingface.co/chat/) | Making the community's best AI chat models available to everyone. | | | Tool | | [Hugging Face API Unity Integration](https://github.com/huggingface/unity-api) | This Unity package provides an easy-to-use integration for the Hugging Face Inference API, allowing developers to access and use Hugging Face AI models within their Unity projects. | | Unity | Tool | | [ImageBind](https://github.com/facebookresearch/ImageBind) | ImageBind One Embedding Space to Bind Them All. |[arXiv](https://arxiv.org/abs/2305.05665) | | Tool | +| [Index-1.9B](https://github.com/bilibili/Index-1.9B) | A SOTA lightweight multilingual LLM. | | | Tool | | [InteractML-Unity](https://github.com/Interactml/iml-unity) | InteractML, an Interactive Machine Learning Visual Scripting framework for Unity3D. | | Unity | Tool | | [InteractML-Unreal Engine](https://github.com/Interactml/iml-ue4) | Bringing Machine Learning to Unreal Engine. | | Unreal Engine | Tool | | [InternLM](https://github.com/InternLM/InternLM) | InternLM has open-sourced a 7 billion parameter base model, a chat model tailored for practical scenarios and the training system. |[arXiv](https://arxiv.org/abs/2403.17297) | | Tool | @@ -90,6 +91,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM, | [LLaSM](https://github.com/LinkSoul-AI/LLaSM) | Large Language and Speech Model. | | | Tool | | [LLM Answer Engine](https://github.com/developersdigest/llm-answer-engine) | Build a Perplexity-Inspired Answer Engine Using Next.js, Groq, Mixtral, Langchain, OpenAI, Brave & Serper. | | | Tool | | [llm.c](https://github.com/karpathy/llm.c) | LLM training in simple, raw C/CUDA. 
| | | Tool |
+| [LLMUnity](https://github.com/undreamai/LLMUnity) | Create characters in Unity with LLMs! | | Unity | Tool |
| [LLocalSearch](https://github.com/nilsherzig/LLocalSearch) | LLocalSearch is a completely locally running search engine using LLM Agents. | | | Tool |
| [LogicGamesSolver](https://github.com/fabridigua/LogicGamesSolver) | A Python tool to solve logic games with AI, Deep Learning and Computer Vision. | | | Tool |
| [Large World Model (LWM)](https://github.com/LargeWorldModel/LWM) | Large World Model (LWM) is a general-purpose large-context multimodal autoregressive model. |[arXiv](https://arxiv.org/abs/2402.08268) | | Tool |
@@ -104,6 +106,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
| [MLC LLM](https://github.com/mlc-ai/mlc-llm) | Enable everyone to develop, optimize and deploy AI models natively on everyone's devices. | | | Tool |
| [MobiLlama](https://github.com/mbzuai-oryx/MobiLlama) | Towards Accurate and Lightweight Fully Transparent GPT. |[arXiv](https://arxiv.org/abs/2402.16840) | | Tool |
| [MoE-LLaVA](https://github.com/PKU-YuanGroup/MoE-LLaVA) | Mixture of Experts for Large Vision-Language Models. |[arXiv](https://arxiv.org/abs/2401.15947) | | Tool |
+| [Moshi](https://www.moshi.chat/) | Moshi is an experimental conversational AI. | | | Tool |
| [MOSS](https://github.com/OpenLMLab/MOSS) | An open-source tool-augmented conversational language model from Fudan University. | | | Tool |
| [mPLUG-Owl🦉](https://github.com/X-PLUG/mPLUG-Owl) | Modularization Empowers Large Language Models with Multimodality. |[arXiv](https://arxiv.org/abs/2304.14178) | | Tool |
| [Nemotron-4](https://arxiv.org/abs/2402.16819) | A 15-billion-parameter large multilingual language model trained on 8 trillion text tokens. 
|[arXiv](https://arxiv.org/abs/2402.16819) | | Tool |
@@ -118,6 +121,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
| [Perplexica](https://github.com/ItzCrazyKns/Perplexica) | An AI-powered search engine. | | | Tool |
| [Pi](https://heypi.com/talk) | AI chatbot designed for personal assistance and emotional support. | | | Tool |
| [Qwen1.5](https://github.com/QwenLM/Qwen1.5) | Qwen1.5 is the improved version of Qwen. | | | Tool |
+| [Qwen2](https://github.com/QwenLM/Qwen2) | Qwen2 is the large language model series developed by the Qwen team at Alibaba Cloud. | | | Tool |
| [Qwen-7B](https://github.com/QwenLM/Qwen-7B) | The official repo of Qwen-7B (通义千问-7B) chat & pretrained large language model proposed by Alibaba Cloud. | | | Tool |
| [RepoAgent](https://github.com/OpenBMB/RepoAgent) | RepoAgent is an Open-Source project driven by Large Language Models (LLMs) that aims to provide an intelligent way to document projects. |[arXiv](https://arxiv.org/abs/2402.16667) | | Tool |
| [Sanity AI Engine](https://github.com/tosos/SanityEngine) | Sanity AI Engine for the Unity Game Development Tool. | | Unity | Tool |
@@ -154,6 +158,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
| [AgentSims](https://github.com/py499372727/AgentSims/) | An Open-Source Sandbox for Large Language Model Evaluation. | | | Agent |
| [AI Town](https://github.com/a16z-infra/ai-town) | AI Town is a virtual town where AI characters live, chat and socialize. | | | Agent |
| [anime.gf](https://github.com/cyanff/anime.gf) | Local & Open Source Alternative to CharacterAI. | | | Game |
+| [Astrocade](https://www.astrocade.com/) | Create games with AI. | | | Game |
| [Atomic Agents](https://github.com/KennyVaneetvelde/atomic_agents) | The Atomic Agents framework is designed to be modular, extensible, and easy to use. | | | Agent |
| [AutoAgents](https://github.com/Link-AGI/AutoAgents) | A Framework for Automatic Agent Generation. 
| | | Agent | | [AutoGen](https://github.com/microsoft/autogen) | Enable Next-Gen Large Language Model Applications. |[arXiv](https://arxiv.org/abs/2308.08155) | | Agent | @@ -185,8 +190,10 @@ Here we will keep track of the latest AI Game Development Tools, including LLM, | [Langflow](https://github.com/logspace-ai/langflow) | Langflow is a UI for LangChain, designed with react-flow to provide an effortless way to experiment and prototype flows. | | | Agent | | [LARP](https://github.com/MiAO-AI-Lab/LARP) | Language-Agent Role Play for open-world games. |[arXiv](https://arxiv.org/abs/2312.17653) | | Agent | | [LlamaIndex](https://github.com/run-llama/llama_index) | LlamaIndex is a data framework for your LLM application. | | | Agent | +| [Mixture of Agents (MoA)](https://github.com/togethercomputer/MoA) | Mixture-of-Agents Enhances Large Language Model Capabilities. |[arXiv](https://arxiv.org/abs/2406.04692) | | Agent | | [Moonlander.ai](https://www.moonlander.ai/) | Start building 3D games without any coding using generative AI. | | | Framework | | [MuG Diffusion](https://github.com/Keytoyze/Mug-Diffusion) | MuG Diffusion is a charting AI for rhythm games based on Stable Diffusion (one of the most powerful AIGC models) with a large modification to incorporate audio waves. | | | Game | +| [OmAgent](https://github.com/om-ai-lab/OmAgent) | A multimodal agent framework for solving complex tasks. | | | Agent | | [OpenAgents](https://github.com/xlang-ai/OpenAgents) | An Open Platform for Language Agents in the Wild. | | | Agent | | [Opus](https://opus.ai/) | An AI app that turns text into a video game. | | | Game | | [Pipecat](https://github.com/pipecat-ai/pipecat) | Open Source framework for voice and multimodal conversational AI. | | | Agent | @@ -198,6 +205,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM, | [Translation Agent](https://github.com/andrewyng/translation-agent) | Agentic translation using reflection workflow. 
| | | Agent |
| [Video2Game](https://github.com/video2game/video2game) | Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video. |[arXiv](https://arxiv.org/abs/2404.09833) | | Game |
| [V-IRL](https://virl-platform.github.io/) | Grounding Virtual Intelligence in Real Life. |[arXiv](https://arxiv.org/abs/2402.03310) | | Agent |
+| [WebDesignAgent](https://github.com/DAMO-NLP-SG/WebDesignAgent) | An agent for web design. | | | Agent |
| [XAgent](https://github.com/OpenBMB/XAgent) | An Autonomous LLM Agent for Complex Task Solving. | | | Agent |
@@ -213,6 +221,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM, | [Chapyter](https://github.com/chapyter/chapyter) | ChatGPT Code Interpreter in Jupyter Notebooks. | | | Code | | [CodeGeeX](https://github.com/THUDM/CodeGeeX) | An Open Multilingual Code Generation Model. |[arXiv](https://arxiv.org/abs/2303.17568) | | Code | | [CodeGeeX2](https://github.com/THUDM/CodeGeeX2) | A More Powerful Multilingual Code Generation Model. | | | Code | +| [CodeGeeX4](https://github.com/THUDM/CodeGeeX4) | CodeGeeX4: Open Multilingual Code Generation Model. | | | Code | | [CodeGen](https://github.com/salesforce/CodeGen) | CodeGen is an open-source model for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex. |[arXiv](https://arxiv.org/abs/2203.13474) | | Code | | [CodeGen2](https://github.com/salesforce/CodeGen2) | CodeGen2 models for program synthesis. |[arXiv](https://arxiv.org/abs/2305.02309) | | Code | | [Code Llama](https://github.com/facebookresearch/codellama) | Code Llama is a large language models for code based on Llama 2. | | | Code | @@ -250,6 +259,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM, | :------------------------------------------------------------------------------------------ | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :-----------: | :-----------: | :-------: | | [AnyDoor](https://ali-vilab.github.io/AnyDoor-Page/) | Zero-shot Object-level Image Customization. |[arXiv](https://arxiv.org/abs/2307.09481) | | Image | | [AnyText](https://github.com/tyxsspa/AnyText) | Multilingual Visual Text Generation And Editing. |[arXiv](https://arxiv.org/abs/2311.03054) | | Image | +| [AutoStudio](https://github.com/donahowe/AutoStudio) | Crafting Consistent Subjects in Multi-turn Interactive Image Generation. 
|[arXiv](https://arxiv.org/abs/2406.01388) | | Image |
| [Blender-ControlNet](https://github.com/coolzilj/Blender-ControlNet) | Using ControlNet right in Blender. | | Blender | Image |
| [BriVL](https://github.com/BAAI-WuDao/BriVL) | Bridging Vision and Language Model. |[arXiv](https://arxiv.org/abs/2103.06561) | | Image |
| [CLIPasso](https://github.com/yael-vinker/CLIPasso) | A method for converting an image of an object to a sketch, allowing for varying levels of abstraction. |[arXiv](https://arxiv.org/abs/2202.05822) | | Image |
@@ -261,6 +271,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
| [Dashtoon Studio](https://www.dashtoon.ai/) | Dashtoon Studio is an AI-powered comic creation platform. | | | Comic |
| [DeepAI](https://deepai.org/) | DeepAI offers a suite of tools that use AI to enhance your creativity. | | | Image |
| [DeepFloyd IF](https://github.com/deep-floyd/IF) | IF by DeepFloyd Lab at StabilityAI. | | | Image |
+| [Depth Anything V2](https://github.com/DepthAnything/Depth-Anything-V2) | A More Capable Foundation Model for Monocular Depth Estimation. |[arXiv](https://arxiv.org/abs/2406.09414) | | Image |
| [Depth map library and poser](https://github.com/jexom/sd-webui-depth-lib) | Depth map library for use with the Control Net extension for Automatic1111/stable-diffusion-webui. | | | Image |
| [Diffuse to Choose](https://diffuse2choose.github.io/) | Enriching Image Conditioned Inpainting in Latent Diffusion Models for Virtual Try-All. |[arXiv](https://arxiv.org/abs/2401.13795) | | Image |
| [Disco Diffusion](https://github.com/alembics/disco-diffusion) | A frankensteinian amalgamation of notebooks, models and techniques for the generation of AI Art and Animations. | | | Image |
@@ -282,6 +293,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
| [InstantID](https://github.com/InstantID/InstantID) | Zero-shot Identity-Preserving Generation in Seconds. 
|[arXiv](https://arxiv.org/abs/2401.07519) | | Image | | [InternLM-XComposer2](https://github.com/InternLM/InternLM-XComposer) | InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension. |[arXiv](https://arxiv.org/abs/2401.16420) | | Image | | [KOALA](https://youngwanlee.github.io/KOALA/) | Self-Attention Matters in Knowledge Distillation of Latent Diffusion Models for Memory-Efficient and Fast Image Synthesis. | | | Image | +| [Kolors](https://github.com/Kwai-Kolors/Kolors) | Kolors: Effective Training of Diffusion Model for Photorealistic Text-to-Image Synthesis. | | | Image | | [KREA](https://www.krea.ai/) | Generate images and videos with a delightful AI-powered design tool. | | | Image | | [LaVi-Bridge](https://github.com/ShihaoZhaoZSH/LaVi-Bridge) | Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation. |[arXiv](https://arxiv.org/abs/2403.07860) | | Image | | [LayerDiffusion](https://github.com/layerdiffusion/LayerDiffusion) | Transparent Image Layer Diffusion using Latent Transparency. |[arXiv](https://arxiv.org/abs/2305.18676) | | Image | @@ -289,10 +301,12 @@ Here we will keep track of the latest AI Game Development Tools, including LLM, | [LlamaGen](https://github.com/FoundationVision/LlamaGen) | Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation. |[arXiv](https://arxiv.org/abs/2406.06525) | | Image | | [MetaShoot](https://metashoot.vinzi.xyz/) | MetaShoot is a digital twin of a photo studio, developed as a plugin for Unreal Engine that gives any creator the ability to produce highly realistic renders in the easiest and quickest way. | | Unreal Engine | Image | | [Midjourney](https://www.midjourney.com/) | Midjourney is an independent research lab exploring new mediums of thought and expanding the imaginative powers of the human species. 
| | | Image | +| [MIGC](https://github.com/limuloo/MIGC) | MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis. |[arXiv](https://arxiv.org/abs/2402.05408) | | Image | | [MimicBrush](https://github.com/ali-vilab/MimicBrush) | Zero-shot Image Editing with Reference Imitation. |[arXiv](https://arxiv.org/abs/2406.07547) | | Image | | [Omost](https://github.com/lllyasviel/Omost) | Omost is a project to convert LLM's coding capability to image generation (or more accurately, image composing) capability. | | | Image | | [Openpose Editor](https://github.com/fkunn1326/openpose-editor) | Openpose Editor for AUTOMATIC1111's stable-diffusion-webui. | | | Image | | [Outfit Anyone](https://humanaigc.github.io/outfit-anyone/) | Ultra-high quality virtual try-on for Any Clothing and Any Person. | | | Image | +| [PaintsUndo](https://github.com/lllyasviel/Paints-UNDO) | PaintsUndo: A Base Model of Drawing Behaviors in Digital Paintings. | | | Image | | [PhotoMaker](https://photo-maker.github.io/) | Customizing Realistic Human Photos via Stacked ID Embedding. |[arXiv](https://arxiv.org/abs/2312.04461) | | Image | | [Photoroom](https://www.photoroom.com/backgrounds) | AI Background Generator. | | | Image | | [Plask](https://plask.ai/) | AI image generation in the cloud. | | | Image | @@ -335,6 +349,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM, | [InstructHumans](https://github.com/viridityzhu/InstructHumans) | Editing Animated 3D Human Textures with Instructions. |[arXiv](https://arxiv.org/abs/2404.04037) | | Texture | | [InteX](https://github.com/ashawkey/InTeX) | Interactive Text-to-Texture Synthesis via Unified Depth-aware Inpainting. |[arXiv](https://arxiv.org/abs/2403.11878) | | Texture | | [MaterialSeg3D](https://github.com/PROPHETE-pro/MaterialSeg3D_) | MaterialSeg3D: Segmenting Dense Materials from 2D Priors for 3D Assets. 
|[arXiv](https://arxiv.org/abs/2404.13923) | | Texture |
+| [MeshAnything](https://github.com/buaacyw/MeshAnything) | MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers. |[arXiv](https://arxiv.org/abs/2406.10163) | | Mesh |
| [Neuralangelo](https://github.com/NVlabs/neuralangelo) | High-Fidelity Neural Surface Reconstruction. |[arXiv](https://arxiv.org/abs/2306.03092) | | Texture |
| [Paint-it](https://github.com/postech-ami/paint-it) | Text-to-Texture Synthesis via Deep Convolutional Texture Map Optimization and Physically-Based Rendering. | | | Texture |
| [Polycam](https://poly.cam/material-generator) | Create your own 3D textures just by typing. | | | Texture |
@@ -342,6 +357,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
| [Text2Tex](https://daveredrum.github.io/Text2Tex/) | Text-driven Texture Synthesis via Diffusion Models. |[arXiv](https://arxiv.org/abs/2303.11396) | | Texture |
| [Texture Lab](https://www.texturelab.xyz/) | AI-generated textures. You can generate your own with a text prompt. | | | Texture |
| [With Poly](https://withpoly.com/browse/textures) | Create Textures With Poly. Generate 3D materials with AI in a free online editor, or search our growing community library. | | | Texture |
+| [X-Mesh](https://github.com/xmu-xiaoma666/X-Mesh) | X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance. |[arXiv](https://arxiv.org/abs/2303.15764) | | Texture |
@@ -363,12 +379,15 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
| [BlenderGPT](https://github.com/gd3kr/BlenderGPT) | Use commands in English to control Blender with OpenAI's GPT-4. | | Blender | Model |
| [Blender-GPT](https://github.com/TREE-Ind/Blender-GPT) | An all-in-one Blender assistant powered by GPT3/4 + Whisper integration. 
| | Blender | Model | | [Blockade Labs](https://www.blockadelabs.com/) | Digital alchemy is real with Skybox Lab - the ultimate AI-powered solution for generating incredible 360° skybox experiences from text prompts. | | | Model | +| [CharacterGen](https://github.com/zjp-shadow/CharacterGen) | CharacterGen: Efficient 3D Character Generation from Single Images with Multi-View Pose Canonicalization. |[arXiv](https://arxiv.org/abs/2402.17214) | | 3D | | [chatGPT-maya](https://github.com/LouisRossouw/chatGPT-maya) | Simple Maya tool that utilizes open AI to perform basic tasks based on descriptive instructions. | | Maya | Model | | [CityDreamer](https://github.com/hzxie/city-dreamer) | Compositional Generative Model of Unbounded 3D Cities. |[arXiv](https://arxiv.org/abs/2309.00610) | | 3D | | [CSM](https://www.csm.ai/) | Generate 3D worlds from images and videos. | | | 3D | | [Dash](https://www.polygonflow.io/) | Your Copilot for World Building in Unreal Engine. | | Unreal Engine | 3D | | [DreamGaussian4D](https://github.com/jiawei-ren/dreamgaussian4d) | Generative 4D Gaussian Splatting. |[arXiv](https://arxiv.org/abs/2312.17142) | | 4D | | [DUSt3R](https://github.com/naver/dust3r) | Geometric 3D Vision Made Easy. |[arXiv](https://arxiv.org/abs/2312.14132) | | 3D | +| [GALA3D](https://github.com/VDIGPKU/GALA3D) | GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting. |[arXiv](https://arxiv.org/abs/2402.07207) | | 3D | +| [GaussianCube](https://github.com/GaussianCube/GaussianCube) | A Structured and Explicit Radiance Representation for 3D Generative Modeling. |[arXiv](https://arxiv.org/abs/2403.19655) | | 3D | | [GaussianDreamer](https://github.com/hustvl/GaussianDreamer) | Fast Generation from Text to 3D Gaussian Splatting with Point Cloud Priors. |[arXiv](https://arxiv.org/abs/2310.08529) | | 3D | | [GenieLabs](https://www.genielabs.tech/) | Empower your game with AI-UGC. 
| | | 3D | | [HiFA](https://hifa-team.github.io/HiFA-site/) | High-fidelity Text-to-3D with advance Diffusion guidance. | | | Model | @@ -402,6 +421,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM, | [3DTopia](https://github.com/3DTopia/3DTopia) | Text-to-3D Generation within 5 Minutes. |[arXiv](https://arxiv.org/abs/2403.02234) | | 3D | | [threestudio](https://github.com/threestudio-project/threestudio) | A unified framework for 3D content generation. | | | Model | | [TripoSR](https://github.com/VAST-AI-Research/TripoSR) | A state-of-the-art open-source model for fast feedforward 3D reconstruction from a single image. |[arXiv](https://arxiv.org/abs/2403.02151) | | Model | +| [Unique3D](https://github.com/AiuniAI/Unique3D) | High-Quality and Efficient 3D Mesh Generation from a Single Image. |[arXiv](https://arxiv.org/abs/2405.20343) | | 3D | | [UnityGaussianSplatting](https://github.com/aras-p/UnityGaussianSplatting) | Toy Gaussian Splatting visualization in Unity. | | Unity | 3D | | [ViVid-1-to-3](https://github.com/ubc-vision/vivid123) | Novel View Synthesis with Video Diffusion Models. |[arXiv](https://arxiv.org/abs/2312.01305) | | 3D | | [Voxcraft](https://voxcraft.ai/) | Crafting Ready-to-Use 3D Models with AI. | | | 3D | @@ -420,19 +440,25 @@ Here we will keep track of the latest AI Game Development Tools, including LLM, | [ChatAvatar](https://hyperhuman.deemos.com/chatavatar) | Progressive generation Of Animatable 3D Faces Under Text guidance. | | | Avatar | | [ChatdollKit](https://github.com/uezo/ChatdollKit) | ChatdollKit enables you to make your 3D model into a chatbot. | | Unity | Avatar | | [DreamTalk](https://github.com/ali-vilab/dreamtalk) | When Expressive Talking Head Generation Meets Diffusion Probabilistic Models. 
|[arXiv](https://arxiv.org/abs/2312.09767) | | Avatar |
+| [Duix](https://github.com/GuijiAI/duix.ai) | Silicon-Based Digital Human SDK. | | | Avatar |
+| [EchoMimic](https://github.com/BadToBest/EchoMimic) | EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions. |[arXiv](https://arxiv.org/abs/2407.08136) | | Avatar |
| [EMOPortraits](https://github.com/neeek2303/EMOPortraits) | Emotion-enhanced Multimodal One-shot Head Avatars. | | | Avatar |
+| [E3 Gen](https://github.com/olivia23333/E3Gen) | Efficient, Expressive and Editable Avatars Generation. |[arXiv](https://arxiv.org/abs/2405.19203) | | Avatar |
| [GeneAvatar](https://github.com/zju3dv/GeneAvatar) | Generic Expression-Aware Volumetric Head Avatar Editing from a Single Image. |[arXiv](https://arxiv.org/abs/2404.02152) | | Avatar |
| [GeneFace++](https://github.com/yerfor/GeneFacePlusPlus) | Generalized and Stable Real-Time 3D Talking Face Generation. | | | Avatar |
| [Hallo](https://github.com/fudan-generative-vision/hallo) | Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation. |[arXiv](https://arxiv.org/abs/2406.08801) | | Avatar |
| [HeadSculpt](https://brandonhan.uk/HeadSculpt/) | Crafting 3D Head Avatars with Text. |[arXiv](https://arxiv.org/abs/2306.03038) | | Avatar |
| [Linly-Talker](https://github.com/Kedreamix/Linly-Talker) | Digital Avatar Conversational System. | | | Avatar |
+| [LivePortrait](https://github.com/KwaiVGI/LivePortrait) | LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control. |[arXiv](https://arxiv.org/abs/2407.03168) | | Avatar |
| [MotionGPT](https://github.com/OpenMotionLab/MotionGPT) | Human Motion as a Foreign Language, a unified motion-language generation model using LLMs. |[arXiv](https://arxiv.org/abs/2306.14795) | | Avatar |
| [MusePose](https://github.com/TMElyralab/MusePose) | MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation. 
| | | Avatar | | [MuseTalk](https://github.com/TMElyralab/MuseTalk) | Real-Time High Quality Lip Synchorization with Latent Space Inpainting. | | | Avatar | | [MuseV](https://github.com/TMElyralab/MuseV) | Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising. | | | Avatar | +| [Portrait4D](https://github.com/YuDeng/Portrait-4D) | Learning One-Shot 4D Head Avatar Synthesis using Synthetic Data. |[arXiv](https://arxiv.org/abs/2311.18729) | | Avatar | | [Ready Player Me](https://readyplayer.me/) | Integrate customizable avatars into your game or app in days. | | | Avatar | | [StyleAvatar3D](https://github.com/icoz69/StyleAvatar3D) | Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation. |[arXiv](https://arxiv.org/abs/2305.19012) | | Avatar | | [Text2Control3D](https://text2control3d.github.io/) | Controllable 3D Avatar Generation in Neural Radiance Fields using Geometry-Guided Text-to-Image Diffusion Model. |[arXiv](https://arxiv.org/abs/2309.03550) | | Avatar | +| [Topo4D](https://github.com/XuanchenLi/Topo4D) | Topology-Preserving Gaussian Splatting for High-Fidelity 4D Head Capture. |[arXiv](https://arxiv.org/abs/2406.00440) | | Avatar | | [UnityAIWithChatGPT](https://github.com/haili1234/UnityAIWithChatGPT) | Based on Unity, ChatGPT+UnityChan voice interactive display is realized. | | Unity | Avatar | | [Vid2Avatar](https://moygcc.github.io/vid2avatar/) | 3D Avatar Reconstruction from Videos in the Wild via Self-supervised Scene Decomposition. |[arXiv](https://arxiv.org/abs/2302.11566) | | Avatar | | [VLOGGER](https://enriccorona.github.io/vlogger/) | Multimodal Diffusion for Embodied Avatar Synthesis. 
| | | Avatar | @@ -478,16 +504,21 @@ Here we will keep track of the latest AI Game Development Tools, including LLM, | Source | Description | Paper | Game Engine | Type | | :------------------------------------------------------------------------------------------ | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :-----------: | :-----------: | :-------: | +| [Cambrian-1](https://github.com/cambrian-mllm/cambrian) | Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs. |[arXiv](https://arxiv.org/abs/2406.16860) | | Multimodal LLMs | | [CogVLM2](https://github.com/THUDM/CogVLM2) | GPT4V-level open-source multi-modal model based on Llama3-8B. | | | Visual | | [CoTracker](https://co-tracker.github.io/) | It is Better to Track Together. |[arXiv](https://arxiv.org/abs/2307.07635) | | Visual | | [FaceHi](https://m.facehi.ai/) | It is Better to Track Together. | | | Visual | +| [InternLM-XComposer2](https://github.com/InternLM/InternLM-XComposer) | InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension. |[arXiv](https://arxiv.org/abs/2404.06512) | | Visual | | [LGVI](https://jianzongwu.github.io/projects/rovi/) | Towards Language-Driven Video Inpainting via Multimodal Large Language Models. | | | Visual | | [LLaVA++](https://github.com/mbzuai-oryx/LLaVA-pp) | Extending Visual Capabilities with LLaMA-3 and Phi-3. | | | Visual | | [MaskViT](https://maskedvit.github.io/) | Masked Visual Pre-Training for Video Prediction. |[arXiv](https://arxiv.org/abs/2206.11894) | | Visual | | [MiniCPM-Llama3-V 2.5](https://github.com/OpenBMB/MiniCPM-V) | A GPT-4V Level MLLM on Your Phone. | | | Visual | +| [MoE-LLaVA](https://github.com/PKU-YuanGroup/MoE-LLaVA) | Mixture of Experts for Large Vision-Language Models. 
|[arXiv](https://arxiv.org/abs/2401.15947) | | Visual | | [MotionLLM](https://github.com/IDEA-Research/MotionLLM) | Understanding Human Behaviors from Human Motions and Videos. |[arXiv](https://arxiv.org/abs/2405.20340) | | Visual | | [PLLaVA](https://github.com/magic-research/PLLaVA) | Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning. |[arXiv](https://arxiv.org/abs/2404.16994) | | Visual | | [Qwen-VL](https://github.com/QwenLM/Qwen-VL) | A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond. |[arXiv](https://arxiv.org/abs/2308.12966) | | Visual | +| [ShareGPT4V](https://github.com/ShareGPT4Omni/ShareGPT4V) | Improving Large Multi-modal Models with Better Captions. |[arXiv](https://arxiv.org/abs/2311.12793) | | Visual | +| [Video-LLaVA](https://github.com/PKU-YuanGroup/Video-LLaVA) | Learning United Visual Representation by Alignment Before Projection. |[arXiv](https://arxiv.org/abs/2311.10122) | | Visual | | [VideoLLaMA 2](https://github.com/DAMO-NLP-SG/VideoLLaMA2) | Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs. |[arXiv](https://arxiv.org/abs/2406.07476) | | Visual | | [Video-MME](https://github.com/BradyFU/Video-MME) | The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis. |[arXiv](https://arxiv.org/abs/2405.21075) | | Visual | | [Vitron](https://github.com/SkyworkAI/Vitron) | A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing. | | | Visual | @@ -513,6 +544,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM, | [CoNR](https://github.com/megvii-research/CoNR) | Genarate vivid dancing videos from hand-drawn anime character sheets(ACS). |[arXiv](https://arxiv.org/abs/2207.05378) | | Video | | [Decohere](https://www.decohere.ai/) | Create what can't be filmed. | | | Video | | [Descript](https://www.descript.com/) | Descript is the simple, powerful , and fun way to edit. 
| | | Video |
+| [Diffutoon](https://github.com/modelscope/DiffSynth-Studio) | High-Resolution Editable Toon Shading via Diffusion Models. |[arXiv](https://arxiv.org/abs/2401.16224) | | Video |
| [dolphin](https://github.com/kaleido-lab/dolphin) | General video interaction platform based on LLMs. | | | Video |
| [DomoAI](https://domoai.app/) | Amplify Your Creativity with DomoAI. | | | Video |
| [DynamiCrafter](https://doubiiu.github.io/projects/DynamiCrafter/) | Animating Open-domain Images with Video Diffusion Priors. |[arXiv](https://arxiv.org/abs/2310.12190) | | Video |
@@ -548,6 +580,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
| [MicroCinema](https://wangyanhui666.github.io/MicroCinema.github.io/) | A Divide-and-Conquer Approach for Text-to-Video Generation. |[arXiv](https://arxiv.org/abs/2311.18829) | | Video |
| [Mini-Gemini](https://github.com/dvlab-research/MiniGemini) | Mining the Potential of Multi-modality Vision Language Models. | | | Vision |
| [MobileVidFactory](https://arxiv.org/abs/2307.16371) | Automatic Diffusion-Based Social Media Video Generation for Mobile Devices from Text. | | | Video |
+| [MOFA-Video](https://github.com/MyNiuuu/MOFA-Video) | Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model. |[arXiv](https://arxiv.org/abs/2405.20222) | | Video |
| [MoneyPrinterTurbo](https://github.com/harry0703/MoneyPrinterTurbo) | Use large models to generate short videos with one click. | | | Video |
| [Moonvalley](https://moonvalley.ai/) | Moonvalley is a groundbreaking new text-to-video generative AI model. | | | Video |
| [Mora](https://github.com/lichao-sun/Mora) | More like Sora for Generalist Video Generation. |[arXiv](https://arxiv.org/abs/2403.13248) | | Video |
@@ -597,6 +630,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
| [Video LDMs](https://research.nvidia.com/labs/toronto-ai/VideoLDM/) | Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models. |[arXiv](https://arxiv.org/abs/2304.08818) | | Video |
| [Video-LLaVA](https://github.com/PKU-YuanGroup/Video-LLaVA) | Learning United Visual Representation by Alignment Before Projection. |[arXiv](https://arxiv.org/abs/2311.10122) | | Video |
| [VideoMamba](https://github.com/OpenGVLab/VideoMamba) | State Space Model for Efficient Video Understanding. |[arXiv](https://arxiv.org/abs/2403.06977) | | Video |
+| [Video-of-Thought](https://github.com/scofield7419/Video-of-Thought) | Step-by-Step Video Reasoning from Perception to Cognition. | | | Video |
| [VideoPoet](https://sites.research.google/videopoet/) | A large language model for zero-shot video generation. |[arXiv](https://arxiv.org/abs/2312.14125) | | Video |
| [Vispunk Motion](https://vispunk.com/video) | Create realistic videos using just text. | | | Video |
| [VisualRWKV](https://github.com/howard-hou/VisualRWKV) | VisualRWKV is the visual-enhanced version of the RWKV language model, enabling RWKV to handle various visual tasks. | | | Visual |
@@ -623,14 +657,17 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
| [AudioLDM 2](https://github.com/haoheliu/audioldm2) | Learning Holistic Audio Generation with Self-supervised Pretraining. |[arXiv](https://arxiv.org/abs/2308.05734) | | Audio |
| [Auffusion](https://github.com/happylittlecat2333/Auffusion) | Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation. |[arXiv](https://arxiv.org/abs/2401.01044) | | Audio |
| [CTAG](https://github.com/PapayaResearch/ctag) | Creative Text-to-Audio Generation via Synthesizer Programming. | | | Audio |
+| [FoleyCrafter](https://github.com/open-mmlab/FoleyCrafter) | Bring Silent Videos to Life with Lifelike and Synchronized Sounds. |[arXiv](https://arxiv.org/abs/2407.01494) | | Audio |
| [MAGNeT](https://pages.cs.huji.ac.il/adiyoss-lab/MAGNeT/) | Masked Audio Generation using a Single Non-Autoregressive Transformer. | | | Audio |
| [Make-An-Audio](https://text-to-audio.github.io/) | Text-To-Audio Generation with Prompt-Enhanced Diffusion Models. |[arXiv](https://arxiv.org/abs/2301.12661) | | Audio |
+| [Make-An-Audio 3](https://github.com/Text-to-Audio/Make-An-Audio-3) | Transforming Text into Audio via Flow-based Large Diffusion Transformers. |[arXiv](https://arxiv.org/abs/2305.18474) | | Audio |
| [NeuralSound](https://github.com/hellojxt/NeuralSound) | Learning-based Modal Sound Synthesis with Acoustic Transfer. |[arXiv](https://arxiv.org/abs/2108.07425) | | Audio |
| [OptimizerAI](https://www.optimizerai.xyz/) | Sounds for Creators, Game makers, Artists, Video makers. | | | Audio |
| [SEE-2-SOUND](https://github.com/see2sound/see2sound) | Zero-Shot Spatial Environment-to-Spatial Sound. |[arXiv](https://arxiv.org/abs/2406.06612) | | Audio |
| [SoundStorm](https://google-research.github.io/seanet/soundstorm/examples/) | Efficient Parallel Audio Generation. |[arXiv](https://arxiv.org/abs/2305.09636) | | Audio |
| [Stable Audio](https://www.stableaudio.com/) | Fast Timing-Conditioned Latent Audio Diffusion. | | | Audio |
| [Stable Audio Open](https://huggingface.co/stabilityai/stable-audio-open-1.0) | Stable Audio Open 1.0 generates variable-length (up to 47s) stereo audio at 44.1kHz from text prompts. | | | Audio |
+| [SyncFusion](https://github.com/mcomunita/syncfusion) | Multimodal Onset-synchronized Video-to-Audio Foley Synthesis. |[arXiv](https://arxiv.org/abs/2310.15247) | | Audio |
| [TANGO](https://github.com/declare-lab/tango) | Text-to-Audio Generation using Instruction Tuned LLM and Latent Diffusion Model. | | | Audio |
| [WavJourney](https://github.com/Audio-AGI/WavJourney) | Compositional Audio Creation with Large Language Models. |[arXiv](https://arxiv.org/abs/2307.14335) | | Audio |
@@ -646,6 +683,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
| [Boomy](https://boomy.com/) | Create generative music. Share it with the world. | | | Music |
| [ChatMusician](https://shanghaicannon.github.io/ChatMusician/) | Fostering Intrinsic Musical Abilities Into LLM. | | | Music |
| [Chord2Melody](https://github.com/tanreinama/chord2melody) | Automatic Music Generation AI. | | | Music |
+| [Diff-BGM](https://github.com/sizhelee/Diff-BGM) | A Diffusion Model for Video Background Music Generation. | [arXiv](https://arxiv.org/abs/2405.11913) | | Music |
| [GPTAbleton](https://github.com/BurnedGuitarist/GPTAbleton) | Draft script for processing GPT responses and sending MIDI notes into Ableton clips with AbletonOSC and python-osc. | | | Music |
| [HeyMusic.AI](https://heymusic.ai/zh) | AI Music Generator. | | | Music |
| [Image to Music](https://imagetomusic.top/) | AI Image to Music Generator is a tool that uses artificial intelligence to convert images into music. | | | Music |
@@ -688,6 +726,8 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
| [Bert-VITS2](https://github.com/fishaudio/Bert-VITS2) | VITS2 backbone with multilingual BERT. | | | Speech |
| [ChatTTS](https://github.com/2noise/ChatTTS) | ChatTTS is a generative speech model for daily dialogue. | | | Speech |
| [CLAPSpeech](https://clapspeech.github.io/) | Learning Prosody from Text Context with Contrastive Language-Audio Pre-Training. | [arXiv](https://arxiv.org/abs/2305.10763) | | Speech |
+| [CosyVoice](https://github.com/FunAudioLLM/CosyVoice) | Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities. | | | Speech |
+| [DEX-TTS](https://github.com/winddori2002/DEX-TTS) | Diffusion-based EXpressive Text-to-Speech with Style Modeling on Time Variability. | [arXiv](https://arxiv.org/abs/2406.19135) | | Speech |
| [EmotiVoice](https://github.com/netease-youdao/EmotiVoice) | A Multi-Voice and Prompt-Controlled TTS Engine. | | | Speech |
| [Fliki](https://fliki.ai/) | Turn text into videos with AI voices. | | | Speech |
| [Glow-TTS](https://github.com/jaywalnut310/glow-tts) | A Generative Flow for Text-to-Speech via Monotonic Alignment Search. | [arXiv](https://arxiv.org/abs/2005.11129) | | Speech |
@@ -702,6 +742,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
| [OpenVoice](https://github.com/myshell-ai/OpenVoice) | Instant voice cloning by MyShell. | | | Speech |
| [OverFlow](https://github.com/shivammehta25/OverFlow) | Putting flows on top of neural transducers for better TTS. | | | Speech |
| [RealtimeTTS](https://github.com/KoljaB/RealtimeTTS) | RealtimeTTS is a state-of-the-art text-to-speech (TTS) library designed for real-time applications. | | | Speech |
+| [SenseVoice](https://github.com/FunAudioLLM/SenseVoice) | SenseVoice is a speech foundation model with multiple speech understanding capabilities, including automatic speech recognition (ASR), spoken language identification (LID), speech emotion recognition (SER), and audio event detection (AED). | | | Speech |
| [SpeechGPT](https://github.com/0nutation/SpeechGPT) | Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities. | [arXiv](https://arxiv.org/abs/2305.11000) | | Speech |
| [speech-to-text-gpt3-unity](https://github.com/dr-iskandar/speech-to-text-gpt3-unity) | This is the repo where I use the Whisper and ChatGPT APIs from OpenAI in Unity. | | Unity | Speech |
| [Stable Speech](https://github.com/sanchit-gandhi/stable-speech) | Stability AI's Text-to-Speech model. | | | Speech |