- AGI: Artificial General Intelligence
- MMLU: Massive Multitask Language Understanding
- RLHF: Reinforcement Learning from Human Feedback
- RAG: Retrieval-Augmented Generation
- Hallucinations
- Catastrophic interference (forgetting)
- Stochastic Parrot
- Modalities: Text, Image, Video, Audio
- 5-shot: Few-shot prompting with five in-context examples
- CoT@32: Chain-of-thought prompting with 32 samples
- MoE: Mixture of Experts
- Data contamination vs Task contamination
- AI Mirror Test
- AI Agentic Workflows
- Foundation models
- Downstream tasks
- Pre-training vs intermediate training (Domain Adaptation)
- Transfer learning
- Fine-tuning
- BLEU score: Metric for translation
- ROUGE score: Metric for text summarization
- Perplexity: Metric for language modeling (e.g., MLM)
- Embedding Quantization
- DenseFormer
- ChatGPT (A chatbot developed by OpenAI)
- Gemini (Successor to Bard)
- AlphaCode 2 (Programming tool powered by Gemini)
- SciSpace (AI chat for scientific PDFs)
- JSTOR (AI chat for scientific PDFs)
- Cody (Custom-trainable AI assistant for businesses)
- Rawdog (CLI assistant that responds by generating and auto-executing a Python script)
- Devin (AI Software Engineer by Cognition)
- LaVague (Browser interaction and task automation)
- Chat with RTX (Locally personalized LLM by NVIDIA - 35 GB)
- EagleX (Attention-free transformer LLM based on the RWKV-v5 architecture)
- Sora (Creating video from text by OpenAI)
- Open-Sora (Open-source version of Sora)
- LoRA (Lightweight training technique that reduces the number of trainable parameters)
- DSPy (Solves the fragility in LLM apps by replacing prompting with programming and compiling)
- LangChain (Build, observe, and deploy LLM‑powered apps easily)
- LlamaIndex (Turn your enterprise data into production-ready LLM applications)
- Ollama (Get up and running with large language models, locally)
- Phoenix (For AI observability and evaluation)
- LM Studio (Discover, download, and run local LLMs)
- NeoSync (To create synthetic data or anonymize sensitive data for fine-tuning or model training)
- Langfuse (Open Source LLM Engineering Platform)
- FastText (Library for efficient text classification and representation learning)
- PhiData (A toolkit for building AI Assistants with function calling and connecting LLMs to external tools)
- ScreenAI (A visual language model for UI and visually-situated language understanding)
- MLX (An array framework for machine learning research on Apple silicon)
- Midjourney (Image generator)
- Magnific.ai (Image upscaler)
- Lisa AI (Artistic image generator)
- Stable Diffusion (Text-to-image model from StabilityAI)
- DreamBooth (A fine-tuning technique for Stable Diffusion models)
- VideoPoet (A large language model for zero-shot video generation)
- Supervision by Roboflow (A Python package of reusable computer vision tools)
- YOLO (Object detection, segmentation, pose estimation, tracking, and classification)
- LLaVA-NeXT (An open-source multimodal language model that accepts image input)
- Localpilot (Local GitHub Copilot on MacBook)
- Nextflow (Reproducible scientific workflows using software containers)
- nf-core (Bioinformatics pipelines)
- labml (Repository of annotated research paper implementations)
- Awesome Open-source Machine Learning for Developers
- LLMs from scratch (Build a Large Language Model From Scratch)
- ML Engineering (Machine Learning Engineering Open Book)
- Vercel
- FastAPI
- Chroma
- Open-WebUI
- Raycast
- Streamlit
- NiceGUI
- Transformers.js (Run 🤗 Transformers directly in your browser, with no need for a server)
- WebGPU (A JS API that enables webpage scripts to efficiently utilize a device's GPU)
- WASM (A binary instruction format for compiling and executing code in a client-side web browser)
- Burn (Dynamic Deep Learning Framework built using Rust)
- Pico MLX Server (A GUI to download and start AI models locally on Mac)
A regularly updated list of overused words and phrases is available in Google Sheets.
Further reading: How cheap, outsourced labour in Africa is shaping AI English
- Understanding Deep Learning (Simon J.D. Prince)
- Understanding Encoder And Decoder LLMs
- Encoder-Only vs Decoder-Only vs Encoder-Decoder Transformer
- BART Text Summarization vs. GPT-3 vs. BERT: An In-Depth Comparison
- BERT Fine-Tuning Tutorial with PyTorch
- Domain Adaptation with HuggingFace MLM
- Training BERT from Scratch on Your Custom Domain Data
- A Survey on Evaluation of Large Language Models