Awesome Stars

A curated list of my GitHub stars! Generated by starred.

Contents

3d

  • UniversalViewer/universalviewer - A community-developed open source project on a mission to help you share your 📚📜📰📽️📻🗿 with the 🌎

ai

api

artificial-intelligence

  • yamadashy/repomix - 📦 Repomix (formerly Repopack) is a powerful tool that packs your entire repository into a single, AI-friendly file. Perfect for when you need to feed your codebase to Large Language Models (LLMs) or o
  • ai-ng/2txt - Image to text, fast.
  • lucidrains/vit-pytorch - Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
  • fchollet/ARC-AGI - The Abstraction and Reasoning Corpus
  • AI4LAM/awesome-ai4lam - A list of awesome AI in libraries, archives, and museum collections from around the world 🕶️
  • marimo-team/marimo - A reactive notebook for Python — run reproducible experiments, execute as a script, deploy as an app, and version with git.
  • explosion/spaCy - 💫 Industrial-strength Natural Language Processing (NLP) in Python
  • Kong/kong - 🦍 The Cloud-Native API Gateway and AI Gateway.

automation

  • n8n-io/n8n - Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
  • kestra-io/kestra - ⚡ Workflow Automation Platform. Orchestrate & Schedule code in any language, run anywhere, 500+ plugins. Alternative to Zapier, Rundeck, Camunda, Airflow...
  • Shuffle/Shuffle - Shuffle: A general purpose security automation platform. Our focus is on collaboration and resource sharing.
  • Avaiga/taipy - Turns Data and AI algorithms into production-ready web applications in no time.
  • vincentneo/LosslessSwitcher - Automated Apple Music Lossless Sample Rate Switching for Audio Devices on Macs.
  • bfidatadigipres/dpx_encoding - BFI National Archive automated dpx preservation scripts written in BASH and Python for use with Media Area RAWcooked and other open source programmes.

awesome

awesome-list

aws

azure

bash

  • mikefarah/yq - yq is a portable command-line YAML, JSON, XML, CSV, TOML and properties processor
  • alebcay/awesome-shell - A curated list of awesome command-line frameworks, toolkits, guides and gizmos. Inspired by awesome-php.
  • adi1090x/dynamic-wallpaper - A simple bash script to set wallpapers according to current time, using cron job scheduler.

c

chatbot

  • Cinnamon/kotaemon - An open-source RAG-based tool for chatting with your documents.
  • yamadashy/repomix - 📦 Repomix (formerly Repopack) is a powerful tool that packs your entire repository into a single, AI-friendly file. Perfect for when you need to feed your codebase to Large Language Models (LLMs) or o
  • X-PLUG/mPLUG-Owl - mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
  • InternLM/xtuner - An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
  • haotian-liu/LLaVA - [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
  • bigscience-workshop/petals - 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
  • snexus/llm-search - Querying local documents, powered by LLM
  • SALT-NLP/LLaVAR - Code/Data for the paper: "LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding"
  • stanford-oval/WikiChat - WikiChat is an improved RAG. It stops the hallucination of large language models by retrieving data from a corpus.
  • run-llama/rags - Build ChatGPT over your data, all with natural language
  • microsoft/autogen - A programming framework for agentic AI 🤖 PyPi: autogen-agentchat Discord: https://aka.ms/autogen-discord Office Hour: https://aka.ms/autogen-officehour

chatgpt

chrome

  • gotenberg/gotenberg - A developer-friendly API for converting numerous document formats into PDF files, and more!

chrome-extension

cli

  • n8n-io/n8n - Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
  • freedmand/semantra - Multi-tool for semantic search
  • BurntSushi/ripgrep - ripgrep recursively searches directories for a regex pattern while respecting your gitignore
  • mikefarah/yq - yq is a portable command-line YAML, JSON, XML, CSV, TOML and properties processor
  • agarrharr/awesome-cli-apps - 🖥 📊 🕹 🛠 A curated list of command line apps
  • alebcay/awesome-shell - A curated list of awesome command-line frameworks, toolkits, guides and gizmos. Inspired by awesome-php.
  • mifi/ezshare - Easily share files, folders and clipboard over LAN - Like Google Drive but without internet
  • mifi/editly - Slick, declarative command line video editing & API
  • IIIF-Commons/biiif-cli - A CLI to Build Static IIIF Collections
  • athityakumar/colorls - A Ruby gem that beautifies the terminal's ls command, with color and font-awesome icons. 🎉
  • ohmyzsh/ohmyzsh - 🙃 A delightful community-driven (with 2,400+ contributors) framework for managing your zsh configuration. Includes 300+ optional plugins (rails, git, macOS, hub, docker, homebrew, node, php, python,

computer-science

computer-vision

  • katanaml/sparrow - Data processing with ML, LLM and Vision LLM
  • merveenoyan/siglip - Projects based on SigLIP (Zhai et al., 2023) and Hugging Face transformers integration 🤗
  • deepfates/memery - Search over large image datasets with natural language and computer vision!
  • lucidrains/vit-pytorch - Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
  • HumanSignal/label-studio - Label Studio is a multi-type data labeling and annotation tool with standardized output format
  • rsommerfeld/trocr - Powerful handwritten text recognition. A simple-to-use, unofficial implementation of the paper "TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models".
  • distant-viewing/dvt - Distant Viewing Toolkit for the Analysis of Visual Culture
  • fcakyon/craft-text-detector - Packaged, Pytorch-based, easy to use, cross-platform version of the CRAFT text detector
  • clovaai/donut - Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
  • imdeepmind/hocrox - Hocrox: An image preprocessing and augmentation library with a Keras-like interface.

cybersecurity

data

  • ucbepic/docetl - A system for agentic LLM-powered data processing and ETL

data-analysis

  • EdyVision/pii-codex - A research Python package for detecting, categorizing, and assessing the severity of personally identifiable information (PII)
  • gchq/CyberChef - The Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis

data-science

  • explosion/spacy-stanza - 💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy
  • marimo-team/marimo - A reactive notebook for Python — run reproducible experiments, execute as a script, deploy as an app, and version with git.
  • explosion/spaCy - 💫 Industrial-strength Natural Language Processing (NLP) in Python

data-visualization

  • Avaiga/taipy - Turns Data and AI algorithms into production-ready web applications in no time.
  • marimo-team/marimo - A reactive notebook for Python — run reproducible experiments, execute as a script, deploy as an app, and version with git.
  • directus/directus - The flexible backend for all your projects 🐰 Turn your DB into a headless CMS, admin panels, or apps with a custom UI, instant APIs, auth & more.

database

  • gristlabs/grist-core - Grist is the evolution of spreadsheets.
  • directus/directus - The flexible backend for all your projects 🐰 Turn your DB into a headless CMS, admin panels, or apps with a custom UI, instant APIs, auth & more.

deep-learning

  • DAI-Lab/RivaGAN - Robust video watermarking with non-differentiable adversaries.
  • JosefAlbers/whisper-turbo-mlx - Blazing fast whisper turbo for ASR (speech-to-text) tasks
  • jina-ai/serve - ☁️ Build multimodal AI applications with cloud-native stack
  • izuna385/Wikia-and-Wikipedia-EL-Dataset-Creator - You can create datasets from Wikia/Wikipedia that can be used for entity recognition and Entity Linking. Dumps for ja-wiki and VTuber-wiki are available!
  • mindee/doctr - docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
  • IBM/zshot - Zero and Few shot named entity & relationships recognition
  • HumanSignal/label-studio - Label Studio is a multi-type data labeling and annotation tool with standardized output format
  • SYSTRAN/faster-whisper - Faster Whisper transcription with CTranslate2
  • Unstructured-IO/unstructured - Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
  • BlinkDL/RWKV-LM - RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, sa
  • RWKV/rwkv.cpp - INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
  • bigscience-workshop/petals - 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
  • sanchit-gandhi/whisper-jax - JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
  • fcakyon/craft-text-detector - Packaged, Pytorch-based, easy to use, cross-platform version of the CRAFT text detector
  • salesforce/LAVIS - LAVIS - A One-stop Library for Language-Vision Intelligence
  • divya21raj/Actor-Recognition-In-Movies - Recognizing actors in a movie clip or image, using OpenCV, DeepLearning and Python.
  • kermitt2/grobid - A machine learning software for extracting information from scholarly documents
  • kermitt2/delft - a Deep Learning Framework for Text https://delft.readthedocs.io/
  • imdeepmind/hocrox - Hocrox: An image preprocessing and augmentation library with a Keras-like interface.
  • explosion/spaCy - 💫 Industrial-strength Natural Language Processing (NLP) in Python

deployment

  • MightyMoud/sidekick - Bare metal to production ready in mins; your own fly server on your VPS.

devops

  • healthyhost/audit-vps-script - Run a security scan on your server and identify common gaps. Get your VPS ready for production.
  • kestra-io/kestra - ⚡ Workflow Automation Platform. Orchestrate & Schedule code in any language, run anywhere, 500+ plugins. Alternative to Zapier, Rundeck, Camunda, Airflow...
  • Kong/kong - 🦍 The Cloud-Native API Gateway and AI Gateway.
  • johnkerl/miller - Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON

discord

  • Shuffle/Shuffle - Shuffle: A general purpose security automation platform. Our focus is on collaboration and resource sharing.

docker

  • Stirling-Tools/Stirling-PDF - #1 Locally hosted web application that allows you to perform various operations on PDF files
  • n8n-io/n8n - Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
  • revdotcom/reverb - Open source inference code for Rev's model
  • gotenberg/gotenberg - A developer-friendly API for converting numerous document formats into PDF files, and more!
  • jina-ai/serve - ☁️ Build multimodal AI applications with cloud-native stack
  • louislam/uptime-kuma - A fancy self-hosted monitoring tool
  • Kong/kong - 🦍 The Cloud-Native API Gateway and AI Gateway.
  • nytimes/nginx-vod-module-docker - Docker image for nginx with Kaltura's VoD module used by The New York Times
  • anchore/syft - CLI tool and library for generating a Software Bill of Materials from container images and filesystems
  • anchore/grype - A vulnerability scanner for container images and filesystems

documentation

education

electron

  • janhq/jan - Jan is an open source alternative to ChatGPT that runs 100% offline on your computer. Multiple engine support (llama.cpp, TensorRT-LLM)
  • SocialGouv/archifiltre-docs - Visualize and improve your file trees!

emulator

english

flask

flutter

  • immich-app/immich - High performance self-hosted photo and video management solution.

framework

  • jina-ai/serve - ☁️ Build multimodal AI applications with cloud-native stack

frontend

go

  • ollama/ollama - Get up and running with Llama 3.3, Mistral, Gemma 2, and other large language models.
  • anchore/syft - CLI tool and library for generating a Software Bill of Materials from container images and filesystems
  • anchore/grype - A vulnerability scanner for container images and filesystems

golang

  • ollama/ollama - Get up and running with Llama 3.3, Mistral, Gemma 2, and other large language models.
  • mikefarah/yq - yq is a portable command-line YAML, JSON, XML, CSV, TOML and properties processor
  • anchore/syft - CLI tool and library for generating a Software Bill of Materials from container images and filesystems
  • anchore/grype - A vulnerability scanner for container images and filesystems

good-first-issue

  • twentyhq/twenty - Building a modern alternative to Salesforce, powered by the community.

google

graphql

  • twentyhq/twenty - Building a modern alternative to Salesforce, powered by the community.
  • directus/directus - The flexible backend for all your projects 🐰 Turn your DB into a headless CMS, admin panels, or apps with a custom UI, instant APIs, auth & more.

hacktoberfest

  • nocodb/nocodb - 🔥 🔥 🔥 Open Source Airtable Alternative
  • twentyhq/twenty - Building a modern alternative to Salesforce, powered by the community.
  • immich-app/immich - High performance self-hosted photo and video management solution.
  • kestra-io/kestra - ⚡ Workflow Automation Platform. Orchestrate & Schedule code in any language, run anywhere, 500+ plugins. Alternative to Zapier, Rundeck, Camunda, Airflow...
  • Shuffle/Shuffle - Shuffle: A general purpose security automation platform. Our focus is on collaboration and resource sharing.
  • Avaiga/taipy - Turns Data and AI algorithms into production-ready web applications in no time.
  • fcakyon/craft-text-detector - Packaged, Pytorch-based, easy to use, cross-platform version of the CRAFT text detector
  • appbaseio/reactivesearch - Search UI components for React and Vue
  • davidberenstein1957/classy-classification - This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-shot classification with Huggingface.
  • davidberenstein1957/concise-concepts - This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with entity scoring.
  • elastic/kibana - Your window into the Elastic Stack
  • IIIF-Commons/parser - IIIF Presentation 2 + 3 parser
  • anchore/syft - CLI tool and library for generating a Software Bill of Materials from container images and filesystems
  • anchore/grype - A vulnerability scanner for container images and filesystems
  • ruffle-rs/ruffle - A Flash Player emulator written in Rust
  • ohmyzsh/ohmyzsh - 🙃 A delightful community-driven (with 2,400+ contributors) framework for managing your zsh configuration. Includes 300+ optional plugins (rails, git, macOS, hub, docker, homebrew, node, php, python,
  • internetarchive/bookreader - The Internet Archive BookReader
  • UniversalViewer/universalviewer - A community-developed open source project on a mission to help you share your 📚📜📰📽️📻🗿 with the 🌎
  • freeCodeCamp/freeCodeCamp - freeCodeCamp.org's open-source codebase and curriculum. Learn to code for free.

html

image-processing

ios

java

  • Stirling-Tools/Stirling-PDF - #1 Locally hosted web application that allows you to perform various operations on PDF files
  • kestra-io/kestra - ⚡ Workflow Automation Platform. Orchestrate & Schedule code in any language, run anywhere, 500+ plugins. Alternative to Zapier, Rundeck, Camunda, Airflow...
  • DSpace/DSpace - (Official) The DSpace digital asset management system that powers your Institutional Repository
  • apache/incubator-stormcrawler - A scalable, mature and versatile web crawler based on Apache Storm

javascript

js

  • yamadashy/repomix - 📦 Repomix (formerly Repopack) is a powerful tool that packs your entire repository into a single, AI-friendly file. Perfect for when you need to feed your codebase to Large Language Models (LLMs) or o

json

  • AykutSarac/jsoncrack.com - ✨ Innovative and open-source visualization application that transforms various data formats, such as JSON, YAML, XML, CSV and more, into interactive graphs.
  • mikefarah/yq - yq is a portable command-line YAML, JSON, XML, CSV, TOML and properties processor
  • johnkerl/miller - Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON

kubernetes

  • jina-ai/serve - ☁️ Build multimodal AI applications with cloud-native stack
  • Kong/kong - 🦍 The Cloud-Native API Gateway and AI Gateway.

learning

library

linux

llm

  • katanaml/sparrow - Data processing with ML, LLM and Vision LLM
  • JohnSnowLabs/spark-nlp - State of the Art Natural Language Processing
  • yamadashy/repomix - 📦 Repomix (formerly Repopack) is a powerful tool that packs your entire repository into a single, AI-friendly file. Perfect for when you need to feed your codebase to Large Language Models (LLMs) or o
  • meta-llama/llama-recipes - Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting a
  • ucbepic/docetl - A system for agentic LLM-powered data processing and ETL
  • RahulSChand/gpu_poor - Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization
  • fr0gger/Awesome-GPT-Agents - A curated list of GPT agents for cybersecurity
  • vllm-project/vllm - A high-throughput and memory-efficient inference and serving engine for LLMs
  • microsoft/unilm - Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
  • alexpinel/Dot - Text-To-Speech, RAG, and LLMs. All local!
  • ScrapeGraphAI/Scrapegraph-ai - Python scraper based on AI
  • Mintplex-Labs/anything-llm - The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.
  • ax-llm/ax - The unofficial DSPy framework. Build LLM powered Agents and "Agentic workflows" based on the Stanford DSP paper.
  • Maximilian-Winter/llama-cpp-agent - The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). Allowing users to chat with LLM models, execute structured function calls and get structured ou
  • InternLM/xtuner - An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
  • RManLuo/Awesome-LLM-KG - Awesome papers about unifying LLMs and KGs
  • mem0ai/mem0 - The Memory layer for your AI apps
  • Blaizzy/mlx-vlm - MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
  • mlabonne/llm-course - Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
  • armbues/SiLLM - SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework.
  • jina-ai/reader - Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/
  • AugustDev/enchanted - Enchanted is an iOS and macOS app for chatting with private self-hosted language models such as Llama2, Mistral or Vicuna using Ollama.
  • explosion/curated-transformers - 🤖 A PyTorch library of curated Transformer models and their composable components
  • explosion/spacy-llm - 🦙 Integrating LLMs into structured NLP pipelines
  • video-db/PromptClip - Instantly create video clips from LLM prompts
  • Unstructured-IO/unstructured - Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
  • da-z/llamazing - A simple Web / UI / App / Frontend to Ollama.
  • open-webui/open-webui - User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
  • RWKV/rwkv.cpp - INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
  • deployradiant/pajama - A UI for Ollama on Mac
  • leptonai/search_with_lepton - Building a quick conversation-based search demo with Lepton AI.
  • rohan-paul/LLM-FineTuning-Large-Language-Models - LLM (Large Language Model) FineTuning
  • video-db/StreamRAG - Video Search and Streaming Agent 🕵️‍♂️
  • letta-ai/letta - Letta (formerly MemGPT) is a framework for creating LLM services with memory.
  • da-z/mlx-ui - A simple UI / Web / Frontend for MLX mlx-lm using Streamlit.
  • alphasecio/llama-index - A collection of apps powered by the LlamaIndex LLM framework.
  • snexus/llm-search - Querying local documents, powered by LLM
  • h2oai/h2ogpt - Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
  • riccardomusmeci/mlx-llm - Large Language Models (LLMs) applications and tools running on Apple Silicon in real-time with Apple MLX.
  • mlc-ai/mlc-llm - Universal LLM Deployment Engine with ML Compilation
  • ollama/ollama - Get up and running with Llama 3.3, Mistral, Gemma 2, and other large language models.
  • ise-uiuc/magicoder - [ICML'24] Magicoder: Empowering Code Generation with OSS-Instruct
  • stanford-oval/WikiChat - WikiChat is an improved RAG. It stops the hallucination of large language models by retrieving data from a corpus.
  • run-llama/rags - Build ChatGPT over your data, all with natural language
  • adbar/trafilatura - Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML

low-code

  • nocodb/nocodb - 🔥 🔥 🔥 Open Source Airtable Alternative
  • n8n-io/n8n - Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
  • kestra-io/kestra - ⚡ Workflow Automation Platform. Orchestrate & Schedule code in any language, run anywhere, 500+ plugins. Alternative to Zapier, Rundeck, Camunda, Airflow...

machine-learning

  • NatLibFi/Annif - Annif is a multi-algorithm automated subject indexing tool for libraries, archives and museums.
  • meta-llama/llama-recipes - Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting a
  • merveenoyan/siglip - Projects based on SigLIP (Zhai et al., 2023) and Hugging Face transformers integration 🤗
  • freedmand/semantra - Multi-tool for semantic search
  • ScrapeGraphAI/Scrapegraph-ai - Python scraper based on AI
  • mlabonne/llm-course - Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
  • nateraw/openai-vision-api-for-videos - Extract information, summarize, ask questions, and search videos using OpenAI's Vision API 🚀🎦
  • jina-ai/serve - ☁️ Build multimodal AI applications with cloud-native stack
  • flairNLP/flair - A very simple framework for state-of-the-art Natural Language Processing (NLP)
  • explosion/spacy-stanza - 💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy
  • explosion/spacy-transformers - 🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
  • explosion/spacy-llm - 🦙 Integrating LLMs into structured NLP pipelines
  • IBM/zshot - Zero and Few shot named entity & relationships recognition
  • Unstructured-IO/unstructured - Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
  • RWKV/rwkv.cpp - INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
  • bigscience-workshop/petals - 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
  • AI4LAM/awesome-ai4lam - A list of awesome AI in libraries, archives, and museum collections from around the world 🕶️
  • marimo-team/marimo - A reactive notebook for Python — run reproducible experiments, execute as a script, deploy as an app, and version with git.
  • ageitgey/face_recognition - The world's simplest facial recognition api for Python and the command line
  • kermitt2/grobid - A machine learning software for extracting information from scholarly documents
  • kermitt2/entity-fishing - A machine learning tool for fishing entities
  • davidberenstein1957/classy-classification - This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-shot classification with Huggingface.
  • davidberenstein1957/concise-concepts - This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with entity scoring.
  • imdeepmind/hocrox - Hocrox: An image preprocessing and augmentation library with a Keras-like interface.
  • explosion/spaCy - 💫 Industrial-strength Natural Language Processing (NLP) in Python
  • FilmColors/VIAN -

macos

markdown

microsoft

  • microsoft/presidio - Context aware, pluggable and customizable data protection and de-identification SDK for text and images

monitoring

music

  • karolkozer/planby -
  • ina-foss/inaSpeechSegmenter - CNN-based audio segmentation toolkit. Detects speech, music, noise and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.

mysql

  • nocodb/nocodb - 🔥 🔥 🔥 Open Source Airtable Alternative
  • directus/directus - The flexible backend for all your projects 🐰 Turn your DB into a headless CMS, admin panels, or apps with a custom UI, instant APIs, auth & more.

natural-language-processing

  • chartbeat-labs/textacy - NLP, before and after spaCy
  • JohnSnowLabs/spark-nlp - State of the Art Natural Language Processing
  • boudinfl/pke - Python Keyphrase Extraction module
  • urchade/GLiNER - Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024
  • flairNLP/flair - A very simple framework for state-of-the-art Natural Language Processing (NLP)
  • izuna385/Wikia-and-Wikipedia-EL-Dataset-Creator - You can create datasets from Wikia/Wikipedia that can be used for entity recognition and Entity Linking. Dumps for ja-wiki and VTuber-wiki are available!
  • explosion/spacy-stanza - 💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy
  • explosion/spacy-transformers - 🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
  • explosion/spacy-llm - 🦙 Integrating LLMs into structured NLP pipelines
  • IBM/zshot - Zero and Few shot named entity & relationships recognition
  • Unstructured-IO/unstructured - Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
  • AI4LAM/awesome-ai4lam - A list of awesome AI in libraries, archives, and museum collections from around the world 🕶️
  • stanford-oval/WikiChat - WikiChat is an improved RAG. It stops the hallucination of large language models by retrieving data from a corpus.
  • Lucaterre/spacyfishing - A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata
  • davidberenstein1957/classy-classification - This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-shot classification with Huggingface.
  • davidberenstein1957/concise-concepts - This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with entity scoring.
  • SapienzaNLP/extend - Entity Disambiguation as text extraction (ACL 2022)
  • stanfordnlp/CoreNLP - CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.
  • explosion/spaCy - 💫 Industrial-strength Natural Language Processing (NLP) in Python

nestjs

  • twentyhq/twenty - Building a modern alternative to Salesforce, powered by the community.
  • immich-app/immich - High performance self-hosted photo and video management solution.

neural-network

nextjs

  • AykutSarac/jsoncrack.com - ✨ Innovative and open-source visualization application that transforms various data formats, such as JSON, YAML, XML, CSV and more, into interactive graphs.
  • ai-ng/2txt - Image to text, fast.
  • supermemoryai/supermemory - Build your own second brain with supermemory. It's a ChatGPT for your bookmarks. Import tweets or save websites and content using the chrome extension.

nlp

node

  • n8n-io/n8n - Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
  • directus/directus - The flexible backend for all your projects 🐰 Turn your DB into a headless CMS, admin panels, or apps with a custom UI, instant APIs, auth & more.

nodejs

  • immich-app/immich - High performance self-hosted photo and video management solution.
  • yamadashy/repomix - 📦 Repomix (formerly Repopack) is a powerful tool that packs your entire repository into a single, AI-friendly file. Perfect for when you need to feed your codebase to Large Language Models (LLMs) or o
  • Mintplex-Labs/anything-llm - The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.
  • ax-llm/ax - The unofficial DSPy framework. Build LLM powered Agents and "Agentic workflows" based on the Stanford DSP paper.
  • OpenInterpreter/open-interpreter - A natural language interface for computers
  • ProjectMirador/mirador-desktop - A desktop wrapper for Mirador and its environment, allowing use of local images.
  • IIIF-Commons/biiif-cli - A CLI to Build Static IIIF Collections
  • freeCodeCamp/freeCodeCamp - freeCodeCamp.org's open-source codebase and curriculum. Learn to code for free.

open-data

  • exponential-decay/the-format-registry - A mirror of the PRONOM file format registry in Linked Open Data format. The Format Registry is a linked (open) data file format repository. The work is the result of a four-day hack during November 20

open-source

  • twentyhq/twenty - Building a modern alternative to Salesforce, powered by the community.
  • piotrkulpinski/openalternative - A community driven list of open source alternatives to proprietary software and applications.
  • Cinnamon/kotaemon - An open-source RAG-based tool for chatting with your documents.
  • revdotcom/reverb - Open source inference code for Rev's model
  • n4ze3m/page-assist - Use your locally running AI models to assist you in your web browsing
  • DSpace/DSpace - (Official) The DSpace digital asset management system that powers your Institutional Repository

openai

opencv

opengl

others

parsing

php

  • passbolt/passbolt_api - Passbolt Community Edition (CE) API. The JSON API for the open source password manager for teams!

postgresql

  • nocodb/nocodb - 🔥 🔥 🔥 Open Source Airtable Alternative
  • twentyhq/twenty - Building a modern alternative to Salesforce, powered by the community.
  • directus/directus - The flexible backend for all your projects 🐰 Turn your DB into a headless CMS, admin panels, or apps with a custom UI, instant APIs, auth & more.

privacy

  • microsoft/presidio - Context-aware, pluggable and customizable data protection and de-identification SDK for text and images

programming

python

python3

pytorch

  • ServerlessLLM/ServerlessLLM - Serverless LLM Serving for Everyone.
  • meta-llama/llama-recipes - Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting a
  • RahulSChand/gpu_poor - Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization
  • vllm-project/vllm - A high-throughput and memory-efficient inference and serving engine for LLMs
  • X-PLUG/mPLUG-Owl - mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
  • WhisperSpeech/WhisperSpeech - An Open Source text-to-speech system built by inverting Whisper.
  • huggingface/pytorch-image-models - The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT)
  • flairNLP/flair - A very simple framework for state-of-the-art Natural Language Processing (NLP)
  • mindee/doctr - docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
  • explosion/curated-transformers - 🤖 A PyTorch library of curated Transformer models and their composable components
  • explosion/spacy-transformers - 🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
  • IBM/zshot - Zero and Few shot named entity & relationships recognition
  • BlinkDL/RWKV-LM - RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, sa
  • rohan-paul/LLM-FineTuning-Large-Language-Models - LLM (Large Language Model) FineTuning
  • bigscience-workshop/petals - 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
  • NielsRogge/Transformers-Tutorials - This repository contains demos I made with the Transformers library by HuggingFace.
  • fcakyon/craft-text-detector - Packaged, PyTorch-based, easy-to-use, cross-platform version of the CRAFT text detector
  • mapluisch/LLaVA-CLI-with-multiple-images - LLaVA inference with multiple images at once for cross-image analysis.
  • SapienzaNLP/extend - Entity Disambiguation as text extraction (ACL 2022)

qt

  • shundhammer/qdirstat - QDirStat - Qt-based directory statistics (KDirStat without any KDE - from the original KDirStat author)

react

reactjs

redux

rest-api

  • nocodb/nocodb - 🔥 🔥 🔥 Open Source Airtable Alternative
  • NatLibFi/Annif - Annif is a multi-algorithm automated subject indexing tool for libraries, archives and museums.
  • DSpace/DSpace - (Official) The DSpace digital asset management system that powers your Institutional Repository

ruby

  • athityakumar/colorls - A Ruby gem that beautifies the terminal's ls command, with color and font-awesome icons. 🎉

rust

  • BurntSushi/ripgrep - ripgrep recursively searches directories for a regex pattern while respecting your gitignore
  • bionic-gpt/bionic-gpt - BionicGPT is an on-premise replacement for ChatGPT, offering the advantages of Generative AI while maintaining strict data confidentiality
  • ruffle-rs/ruffle - A Flash Player emulator written in Rust

security

  • healthyhost/audit-vps-script - Run a security scan on your server and identify common gaps. Get your VPS ready for production.
  • Shuffle/Shuffle - Shuffle: A general purpose security automation platform. Our focus is on collaboration and resource sharing.
  • passbolt/passbolt_api - Passbolt Community Edition (CE) API. The JSON API for the open source password manager for teams!
  • hillu/local-log4j-vuln-scanner - Simple local scanner for vulnerable log4j instances
  • anchore/grype - A vulnerability scanner for container images and filesystems

security-tools

self-hosted

  • immich-app/immich - High performance self-hosted photo and video management solution.
  • MightyMoud/sidekick - Bare metal to production ready in mins; your own fly server on your VPS.
  • n8n-io/n8n - Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
  • alexpinel/Dot - Text-To-Speech, RAG, and LLMs. All local!
  • open-webui/open-webui - User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
  • janhq/jan - Jan is an open source alternative to ChatGPT that runs 100% offline on your computer. Multiple engine support (llama.cpp, TensorRT-LLM)
  • louislam/uptime-kuma - A fancy self-hosted monitoring tool

serverless

  • Kong/kong - 🦍 The Cloud-Native API Gateway and AI Gateway.

shell

  • alebcay/awesome-shell - A curated list of awesome command-line frameworks, toolkits, guides and gizmos. Inspired by awesome-php.
  • herrbischoff/awesome-command-line-apps - 🐚 Use your terminal shell to do awesome things.
  • ohmyzsh/ohmyzsh - 🙃 A delightful community-driven (with 2,400+ contributors) framework for managing your zsh configuration. Includes 300+ optional plugins (rails, git, macOS, hub, docker, homebrew, node, php, python,

software

sql

  • directus/directus - The flexible backend for all your projects 🐰 Turn your DB into a headless CMS, admin panels, or apps with a custom UI, instant APIs, auth & more.

sqlite

  • nocodb/nocodb - 🔥 🔥 🔥 Open Source Airtable Alternative
  • directus/directus - The flexible backend for all your projects 🐰 Turn your DB into a headless CMS, admin panels, or apps with a custom UI, instant APIs, auth & more.

stable-diffusion

  • swyxio/ai-notes - Notes for software engineers getting up to speed on new AI developments. Serves as datastore for https://latent.space writing, and product brainstorming, but has cleaned up canonical references under
  • enricoros/big-AGI - AI suite powered by state-of-the-art models and providing advanced AI/AGI functions. It features AI personas, AGI functions, multi-model chats, text-to-image, voice, response streaming, code highlight

statistics

  • johnkerl/miller - Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON

svelte

  • immich-app/immich - High performance self-hosted photo and video management solution.

swift

  • argmaxinc/WhisperKit - On-device Speech Recognition for Apple Silicon
  • AugustDev/enchanted - Enchanted is an iOS and macOS app for chatting with private self-hosted language models such as Llama2, Mistral or Vicuna using Ollama.

swiftui

tailwindcss

  • supermemoryai/supermemory - Build your own second brain with supermemory. It's a ChatGPT for your bookmarks. Import tweets or save websites and content using the chrome extension.

tensorflow

terminal

  • athityakumar/colorls - A Ruby gem that beautifies the terminal's ls command, with color and font-awesome icons. 🎉
  • bnb/awesome-hyper - 🖥 Delightful Hyper plugins, themes, and resources
  • ohmyzsh/ohmyzsh - 🙃 A delightful community-driven (with 2,400+ contributors) framework for managing your zsh configuration. Includes 300+ optional plugins (rails, git, macOS, hub, docker, homebrew, node, php, python,

testing

typescript

  • twentyhq/twenty - Building a modern alternative to Salesforce, powered by the community.
  • immich-app/immich - High performance self-hosted photo and video management solution.
  • yamadashy/repomix - 📦 Repomix (formerly Repopack) is a powerful tool that packs your entire repository into a single, AI-friendly file. Perfect for when you need to feed your codebase to Large Language Models (LLMs) or o
  • n8n-io/n8n - Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
  • ax-llm/ax - The unofficial DSPy framework. Build LLM powered Agents and "Agentic workflows" based on the Stanford DSP paper.
  • supermemoryai/supermemory - Build your own second brain with supermemory. It's a ChatGPT for your bookmarks. Import tweets or save websites and content using the chrome extension.
  • osmoscraft/osmosmemo - Turn GitHub into a bookmark manager
  • directus/directus - The flexible backend for all your projects 🐰 Turn your DB into a headless CMS, admin panels, or apps with a custom UI, instant APIs, auth & more.
  • SocialGouv/archifiltre-docs - Visualize and improve your file trees!

ubuntu

vue

  • appbaseio/reactivesearch - Search UI components for React and Vue
  • directus/directus - The flexible backend for all your projects 🐰 Turn your DB into a headless CMS, admin panels, or apps with a custom UI, instant APIs, auth & more.

web

web-components

  • muxinc/media-chrome - Custom elements (web components) for making audio and video player controls that look great in your website or app.

webapp

website

windows

xml

  • mikefarah/yq - yq is a portable command-line YAML, JSON, XML, CSV, TOML and properties processor

License

CC0

To the extent possible under law, stephenmcconnachie has waived all copyright and related or neighboring rights to this work.