๐Ÿ Hand-picked awesome Python libraries and frameworks, organised by category

dylanhogg/awesome-python

Awesome Python

License: MIT

Hand-picked awesome Python libraries and frameworks, organised by category 🐍

Interactive version: www.awesomepython.org

Updated 11 Feb 2026 with 1,553 repos

Categories

  • Newly Created Repositories - Awesome Python is regularly updated, and this category lists the most recently created GitHub repositories from all the other repositories here (10 repos)
  • Agentic AI - Agentic AI libraries, frameworks and tools: AI agents, workflows, autonomous decision-making, goal-oriented tasks, and API integrations (109 repos)
  • Code Quality - Code quality tooling: linters, formatters, pre-commit hooks, unused code removal (14 repos)
  • Crypto and Blockchain - Cryptocurrency and blockchain libraries: trading bots, API integration, Ethereum virtual machine, solidity (10 repos)
  • Data - General data libraries: data processing, serialisation, formats, databases, SQL, connectors, web crawlers, data generation/augmentation/checks (81 repos)
  • Debugging - Debugging and tracing tools (3 repos)
  • Diffusion Text to Image - Text-to-image diffusion model libraries, tools and apps for generating images from natural language (40 repos)
  • Finance - Financial and quantitative libraries: investment research tools, market data, algorithmic trading, backtesting, financial derivatives (27 repos)
  • Game Development - Game development tools, engines and libraries (6 repos)
  • GIS - Geospatial libraries: raster and vector data formats, interactive mapping and visualisation, computing frameworks for processing images, projections (17 repos)
  • Graph - Graphs and network libraries: network analysis, graph machine learning, visualisation (4 repos)
  • GUI - Graphical user interface libraries and toolkits (6 repos)
  • Jupyter - Jupyter and JupyterLab and Notebook tools, libraries and plugins (16 repos)
  • LLMs and ChatGPT - Large language model and GPT libraries and frameworks: auto-gpt, agents, QnA, chain-of-thought workflows, API integrations. Also see the Natural Language Processing category for crossover (348 repos)
  • Math and Science - Mathematical, numerical and scientific libraries (22 repos)
  • Machine Learning - General - General and classical machine learning libraries. See below for other sections covering specialised ML areas (143 repos)
  • Machine Learning - Deep Learning - Machine learning libraries that cross over with deep learning in some way (67 repos)
  • Machine Learning - Interpretability - Machine learning interpretability libraries. Covers explainability, prediction explanations, dashboards, understanding knowledge development in training (21 repos)
  • Machine Learning - Ops - MLOps tools, frameworks and libraries: intersection of machine learning, data engineering and DevOps; deployment, health, diagnostics and governance of ML models (48 repos)
  • Machine Learning - Reinforcement - Machine learning libraries and toolkits that cross over with reinforcement learning in some way: agent reinforcement learning, agent environments, RLHF (22 repos)
  • Machine Learning - Time Series - Machine learning and classical timeseries libraries: forecasting, seasonality, anomaly detection, econometrics (17 repos)
  • Natural Language Processing - Natural language processing libraries and toolkits: text processing, topic modelling, tokenisers, chatbots. Also see the LLMs and ChatGPT category for crossover (72 repos)
  • Packaging - Python packaging, dependency management and bundling (21 repos)
  • Pandas - Pandas and dataframe libraries: data analysis, statistical reporting, pandas GUIs, pandas performance optimisations (18 repos)
  • Performance - Performance, parallelisation and low level libraries (19 repos)
  • Profiling - Memory and CPU/GPU profiling tools and libraries (8 repos)
  • Security - Security related libraries: vulnerability discovery, SQL injection, environment auditing (12 repos)
  • Simulation - Simulation libraries: robotics, economic, agent-based, traffic, physics, astronomy, chemistry, quantum simulation. Also see the Math and Science category for crossover (35 repos)
  • Study - Miscellaneous study resources: algorithms, general resources, system design, code repos for textbooks, best practices, tutorials (63 repos)
  • Template - Template tools and libraries: cookiecutter repos, generators, quick-starts (9 repos)
  • Terminal - Terminal and console tools and libraries: CLI tools, terminal based formatters, progress bars (18 repos)
  • Testing - Testing libraries: unit testing, load testing, acceptance testing, code coverage, browser automation, plugins (15 repos)
  • Typing - Typing libraries: static and run-time type checking, annotations (14 repos)
  • Utility - General utility libraries: miscellaneous tools, linters, code formatters, version management, package tools, documentation tools (140 repos)
  • Vizualisation - Vizualisation tools and libraries. Application frameworks, 2D/3D plotting, dashboards, WebGL (30 repos)
  • Web - Web related frameworks and libraries: webapp servers, WSGI, ASGI, asyncio, HTTP, REST, user management (48 repos)

Newly Created Repositories

Awesome Python is regularly updated, and this category lists the most recently created GitHub repositories from all the other repositories here.

  1. affaan-m/everything-claude-code ⭐ 23,150
    Complete Claude Code configuration collection - agents, skills, hooks, commands, rules, MCPs. Battle-tested configs from an Anthropic hackathon winner.

  2. karpathy/llm-council ⭐ 13,742
    LLM Council works together to answer your hardest questions

  3. originalankur/maptoposter ⭐ 7,190
    Transform your favorite cities into beautiful, minimalist designs. MapToPoster lets you create and export visually striking map posters with code.

  4. anthropics/knowledge-work-plugins ⭐ 6,910
    Knowledge Work Plugins that turn Claude into a specialist for your role, team, and company

  5. deepseek-ai/Engram ⭐ 3,252
    Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models.

  6. aiming-lab/SimpleMem ⭐ 1,871
    SimpleMem addresses the fundamental challenge of efficient long-term memory for LLM agents through a three-stage pipeline grounded in Semantic Lossless Compression.

  7. 1rgs/nanocode ⭐ 1,669
    Minimal Claude Code alternative. Single Python file, zero dependencies, ~250 lines.

  8. alexzhang13/rlm ⭐ 1,668
    Recursive Language Models (RLMs) are a task-agnostic inference paradigm for language models (LMs) to handle near-infinite length contexts
    🔗 arxiv.org/abs/2512.24601v1

  9. agno-agi/dash ⭐ 1,598
    Self-learning data agent that grounds its answers in 6 layers of context. Inspired by OpenAI's in-house implementation.

  10. open-tinker/OpenTinker ⭐ 598
    OpenTinker is an RL-as-a-Service infrastructure for foundation models, providing a flexible environment design framework that supports diverse training scenarios over data and interaction modes.

Agentic AI

Agentic AI libraries, frameworks and tools: AI agents, workflows, autonomous decision-making, goal-oriented tasks, and API integrations.

  1. logspace-ai/langflow ⭐ 144,166
    Langflow is a powerful tool for building and deploying AI-powered agents and workflows.
    🔗 www.langflow.org

  2. langgenius/dify ⭐ 127,042
    Production-ready platform for agentic workflow development.
    🔗 dify.ai

  3. langchain-ai/langchain ⭐ 124,984
    🦜🔗 The platform for reliable agents.
    🔗 docs.langchain.com/oss/python/langchain

  4. browser-use/browser-use ⭐ 76,464
    Browser use is the easiest way to connect your AI agents with the browser.
    🔗 browser-use.com

  5. github/spec-kit ⭐ 64,735
    Toolkit to help you get started with Spec-Driven Development: specifications become executable, directly generating working implementations

  6. geekan/MetaGPT ⭐ 63,365
    🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
    🔗 mgx.dev

  7. microsoft/autogen ⭐ 53,832
    AutoGen is a framework for creating multi-agent AI applications that can act autonomously or work alongside humans.
    🔗 microsoft.github.io/autogen

  8. run-llama/llama_index ⭐ 46,554
    LlamaIndex is the leading framework for building LLM-powered agents over your data.
    🔗 developers.llamaindex.ai

  9. mem0ai/mem0 ⭐ 45,890
    Enhances AI assistants and agents with an intelligent memory layer, enabling personalized AI interactions
    🔗 mem0.ai

  10. crewaiinc/crewAI ⭐ 43,083
    Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
    🔗 crewai.com

  11. agno-agi/agno ⭐ 37,137
    Build, run, manage multi-agent systems.
    🔗 docs.agno.com

  12. openbmb/ChatDev ⭐ 28,989
    ChatDev stands as a virtual software company that operates through various intelligent agents holding different roles, including Chief Executive Officer, Chief Product Officer etc
    🔗 arxiv.org/abs/2307.07924

  13. stanford-oval/storm ⭐ 27,814
    An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
    🔗 storm.genie.stanford.edu

  14. composiohq/composio ⭐ 26,428
    Composio equips your AI agents & LLMs with 100+ high-quality integrations via function calling
    🔗 docs.composio.dev

  15. huggingface/smolagents ⭐ 25,075
    🤗 smolagents: a barebones library for agents that think in code.
    🔗 huggingface.co/docs/smolagents

  16. assafelovic/gpt-researcher ⭐ 24,993
    An LLM agent that conducts deep research (local and web) on any given topic and generates a long report with citations.
    🔗 gptr.dev

  17. fosowl/agenticSeek ⭐ 24,539
    A 100% local alternative to Manus AI, this voice-enabled AI assistant autonomously browses the web, writes code, and plans tasks while keeping all data on your device.
    🔗 agenticseek.tech

  18. microsoft/OmniParser ⭐ 24,265
    OmniParser is a comprehensive method for parsing user interface screenshots into structured and easy-to-understand elements

  19. langchain-ai/langgraph ⭐ 23,696
    LangGraph is a library for building stateful, multi-actor applications with LLMs, built on top of (and intended to be used with) LangChain.
    🔗 docs.langchain.com/oss/python/langgraph

  20. yoheinakajima/babyagi ⭐ 22,094
    GPT-4 powered task-driven autonomous agent
    🔗 babyagi.org

  21. a2aproject/A2A ⭐ 21,577
    An open protocol enabling communication and interoperability between opaque agentic applications.
    🔗 a2a-protocol.org

  22. openai/swarm ⭐ 20,819
    A framework exploring ergonomic, lightweight multi-agent orchestration.

  23. letta-ai/letta ⭐ 20,805
    Letta (formerly MemGPT) is a framework for creating LLM services with memory.
    🔗 docs.letta.com

  24. nirdiamant/GenAI_Agents ⭐ 19,499
    Tutorials and implementations for various Generative AI Agent techniques, from basic to advanced. It serves as a comprehensive guide for building intelligent, interactive AI systems.

  25. bytedance/deer-flow ⭐ 19,381
    DeerFlow is a community-driven Deep Research framework, combining language models with tools like web search, crawling, and Python execution, while contributing back to the open-source community.
    🔗 deerflow.tech

  26. unity-technologies/ml-agents ⭐ 19,065
    The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
    🔗 unity.com/products/machine-learning-agents

  27. camel-ai/owl ⭐ 18,929
    🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation

  28. openai/openai-agents-python ⭐ 18,502
    A lightweight yet powerful framework for building multi-agent workflows. It is provider-agnostic, supporting the OpenAI Responses and Chat Completions APIs, as well as 100+ other LLMs.
    🔗 openai.github.io/openai-agents-python

  29. dzhng/deep-research ⭐ 18,372
    An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and large language models.

  30. alibaba-nlp/DeepResearch ⭐ 18,035
    Tongyi Deep Research, the Leading Open-source Deep Research Agent
    🔗 tongyi-agent.github.io/blog/introducing-tongyi-deep-research

  31. google-gemini/gemini-fullstack-langgraph-quickstart ⭐ 17,758
    Demonstrates a fullstack application using a React and LangGraph-powered backend agent. The agent is designed to perform comprehensive research on a user's query.
    🔗 ai.google.dev/gemini-api/docs/google-search

  32. emcie-co/parlant ⭐ 17,579
    LLM agents built for control. Designed for real-world use. Deployed in minutes.
    🔗 www.parlant.io

  33. google/adk-python ⭐ 17,293
    An open-source, code-first Python toolkit for building, evaluating, and deploying sophisticated AI agents with flexibility and control.
    🔗 google.github.io/adk-docs

  34. agentscope-ai/agentscope ⭐ 15,850
    AgentScope: Agent-Oriented Programming for Building LLM Applications
    🔗 doc.agentscope.io

  35. camel-ai/camel ⭐ 15,751
    🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org
    🔗 docs.camel-ai.org

  36. pydantic/pydantic-ai ⭐ 14,427
    PydanticAI is a Python Agent Framework designed to make it less painful to build production grade applications with Generative AI.
    🔗 ai.pydantic.dev

  37. asyncfuncai/deepwiki-open ⭐ 13,812
    Custom implementation of DeepWiki, automatically creates beautiful, interactive wikis for any GitHub, GitLab, or BitBucket repository
    🔗 asyncfunc.mintlify.app

  38. smol-ai/developer ⭐ 12,205
    the first library to let you embed a developer agent in your own app!
    🔗 twitter.com/smolmodels

  39. sakanaai/AI-Scientist ⭐ 11,987
    The AI Scientist, the first comprehensive system for fully automatic scientific discovery, enabling Foundation Models such as Large Language Models (LLMs) to perform research independently.

  40. microsoft/agent-lightning ⭐ 11,622
    A structured way to train your agents with Automatic Prompt Optimization (APO). Just like you train a machine learning model on data, you can train an agent on a task dataset.
    🔗 microsoft.github.io/agent-lightning

  41. ag-ui-protocol/ag-ui ⭐ 11,565
    AG-UI: the Agent-User Interaction Protocol. Bring Agents into Frontend Applications.
    🔗 ag-ui.com

  42. langchain-ai/open_deep_research ⭐ 10,301
    Open Deep Research is an open source assistant that automates research and produces customizable reports on any topic

  43. microsoft/magentic-ui ⭐ 9,609
    A prototype of a human-centered interface powered by a multi-agent system that can browse and perform actions on the web, generate and execute code
    🔗 www.microsoft.com/en-us/research/blog/magentic-ui-an-experimental-human-centered-web-agent

  44. humanlayer/humanlayer ⭐ 8,911
    HumanLayer is an API and SDK that enables AI Agents to contact humans for help, feedback, and approvals.
    🔗 humanlayer.dev/code

  45. meta-llama/llama-stack ⭐ 8,246
    Llama Stack standardizes the building blocks needed to bring genai applications to market. These blocks cover model training and fine-tuning, evaluation, and running AI agents in production
    🔗 llamastack.github.io

  46. upsonic/Upsonic ⭐ 7,750
    Upsonic is a reliability-focused framework designed for real-world applications. It enables trusted agent workflows in your organization through advanced reliability features, including verification layers, triangular architecture, validator agents, and output evaluation systems.
    🔗 docs.upsonic.ai

  47. zilliztech/deep-searcher ⭐ 7,506
    DeepSearcher combines reasoning LLMs and VectorDBs to perform search, evaluation, and reasoning based on private data, providing highly accurate answers and comprehensive reports
    🔗 zilliztech.github.io/deep-searcher

  48. awslabs/agent-squad ⭐ 7,280
    Flexible, lightweight open-source framework for orchestrating multiple AI agents to handle complex conversations
    🔗 awslabs.github.io/agent-squad

  49. x-plug/MobileAgent ⭐ 7,035
    Mobile-Agent: The Powerful GUI Agent Family

  50. mnotgod96/AppAgent ⭐ 6,478
    AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
    🔗 appagent-official.github.io

  51. samsungsailmontreal/TinyRecursiveModels ⭐ 6,275
    A recursive reasoning model that achieves amazing scores on ARC-AGI-1 and ARC-AGI-2 with a tiny 7M-parameter neural network

  52. prefecthq/marvin ⭐ 6,055
    an ambient intelligence library
    🔗 marvin.mintlify.app

  53. openai/openai-cs-agents-demo ⭐ 5,903
    Demo of a Customer Service Agent interface built on top of the OpenAI Agents SDK

  54. pyspur-dev/pyspur ⭐ 5,660
    A visual playground for agentic workflows: Iterate over your agents 10x faster
    🔗 pyspur.dev

  55. kyegomez/swarms ⭐ 5,638
    The Enterprise-Grade Production-Ready Multi-Agent Orchestration Framework. Website: https://swarms.ai
    🔗 docs.swarms.world

  56. brainblend-ai/atomic-agents ⭐ 5,516
    Atomic Agents provides a set of tools and agents that can be combined to create powerful applications. It is built on top of Instructor and leverages the power of Pydantic for data and schema validation and serialization.

  57. crewaiinc/crewAI-examples ⭐ 5,436
    A collection of examples that show how to use CrewAI framework to automate workflows.

  58. landing-ai/vision-agent ⭐ 5,206
    VisionAgent is a library that helps you utilize agent frameworks to generate code to solve your vision task

  59. codelion/openevolve ⭐ 5,203
    Evolutionary coding agent (like AlphaEvolve) enabling automated scientific and algorithmic discovery

  60. strands-agents/sdk-python ⭐ 4,940
    A model-driven approach to building AI agents in just a few lines of code.
    🔗 strandsagents.com

  61. rowboatlabs/rowboat ⭐ 4,321
    Local-first AI coworker, with memory
    🔗 www.rowboatlabs.com

  62. meta-llama/llama-stack-apps ⭐ 4,289
    Agentic components of the Llama Stack APIs

  63. tencentcloudadp/youtu-agent ⭐ 4,279
    A flexible, high-performance framework for building, running, and evaluating autonomous agents
    🔗 tencentcloudadp.github.io/youtu-agent

  64. ag2ai/ag2 ⭐ 4,082
    AG2 (formerly AutoGen) is an open-source programming framework for building AI agents and facilitating cooperation among multiple agents to solve tasks.
    🔗 ag2.ai

  65. joshuac215/agent-service-toolkit ⭐ 4,034
    A full toolkit for running an AI agent service built with LangGraph, FastAPI and Streamlit.
    🔗 agent-service-toolkit.streamlit.app

  66. going-doer/Paper2Code ⭐ 4,000
    A multi-agent LLM system that transforms a paper into a code repository. It follows a three-stage pipeline: planning, analysis, and code generation, each handled by specialized agents.

  67. getzep/zep ⭐ 3,999
    Zep is a memory platform for AI agents that learns from user interactions and business data
    🔗 help.getzep.com

  68. langroid/langroid ⭐ 3,849
    Harness LLMs with Multi-Agent Programming
    🔗 langroid.github.io/langroid

  69. openmanus/OpenManus-RL ⭐ 3,836
    OpenManus-RL is an open-source initiative collaboratively led by Ulab-UIUC and MetaGPT. This project is an extended version of the original OpenManus initiative.

  70. i-am-bee/beeai-framework ⭐ 3,069
    Build production-ready AI agents in both Python and Typescript.
    🔗 framework.beeai.dev

  71. facebookresearch/Pearl ⭐ 2,971
    A Production-ready Reinforcement Learning AI Agent Library brought by the Applied Reinforcement Learning team at Meta.

  72. cheshire-cat-ai/core ⭐ 2,951
    AI agent microservice
    🔗 cheshirecat.ai

  73. vllm-project/semantic-router ⭐ 2,907
    A Mixture-of-Models router that directs OpenAI API requests to the most suitable models from a defined pool based on Semantic Understanding
    🔗 vllm-semantic-router.com

  74. om-ai-lab/OmAgent ⭐ 2,624
    OmAgent is a Python library for building multimodal language agents with ease. We try to keep the library simple without too much overhead like other agent frameworks.
    🔗 om-agent.com

  75. swe-agent/mini-swe-agent ⭐ 2,606
    The 100 line AI agent that solves GitHub issues or helps you in your command line
    🔗 mini-swe-agent.com

  76. griptape-ai/griptape ⭐ 2,458
    Modular Python framework for AI agents and workflows with chain-of-thought reasoning, tools, and memory.
    🔗 www.griptape.ai

  77. langchain-ai/executive-ai-assistant ⭐ 2,161
    Executive AI Assistant (EAIA) is an AI agent that attempts to do the job of an Executive Assistant (EA).

  78. btahir/open-deep-research ⭐ 2,119
    Open source alternative to Gemini Deep Research. Generate reports with AI based on search results.
    🔗 opendeepresearch.vercel.app

  79. agentops-ai/AgentStack ⭐ 2,086
    AgentStack scaffolds your agent stack - The tech stack that collectively is your agent

  80. run-llama/llama_deploy ⭐ 2,071
    Async-first framework for deploying, scaling, and productionizing agentic multi-service systems based on workflows from llama_index.
    🔗 docs.llamaindex.ai/en/stable/module_guides/llama_deploy

  81. sakanaai/AI-Scientist-v2 ⭐ 2,034
    The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search

  82. openautocoder/Agentless ⭐ 2,002
    Agentless🐱: an agentless approach to automatically solve software development problems

  83. weaviate/elysia ⭐ 1,867
    Elysia is an agentic platform designed to use tools in a decision tree. A decision agent decides which tools to use dynamically based on its environment and context.

  84. jd-opensource/OxyGent ⭐ 1,831
    OxyGent is a modular multi-agent framework that lets you build, deploy, and evolve AI teams
    🔗 oxygent.jd.com

  85. msoedov/agentic_security ⭐ 1,749
    An open-source vulnerability scanner for Agent Workflows and LLMs. Protecting AI systems from jailbreaks, fuzzing, and multimodal attacks.
    🔗 agentic-security.vercel.app

  86. agno-agi/dash ⭐ 1,598
    Self-learning data agent that grounds its answers in 6 layers of context. Inspired by OpenAI's in-house implementation.

  87. szczyglis-dev/py-gpt ⭐ 1,559
    Desktop AI Assistant powered by GPT-5, GPT-4, o1, o3, Gemini, Claude, Ollama, DeepSeek, Perplexity, Grok, Bielik, chat, vision, voice, RAG, image and video generation, agents, tools, MCP, plugins, speech synthesis and recognition, web search, memory, presets, assistants, and more. Linux, Windows, Mac
    🔗 pygpt.net

  88. agentera/Agently ⭐ 1,530
    Agently is a development framework that helps developers build AI agent native applications really fast.
    🔗 agently.tech

  89. shengranhu/ADAS ⭐ 1,498
    Automated Design of Agentic Systems using Meta Agent Search to show agents can invent novel and powerful agent designs
    🔗 www.shengranhu.com/adas

  90. link-agi/AutoAgents ⭐ 1,456
    [IJCAI 2024] Generate different roles for GPTs to form a collaborative entity for complex tasks.
    🔗 huggingface.co/spaces/linksoul/autoagents

  91. prefecthq/ControlFlow ⭐ 1,388
    ControlFlow provides a structured, developer-focused framework for defining workflows and delegating work to LLMs, without sacrificing control or transparency
    🔗 controlflow.ai

  92. langchain-ai/langgraph-swarm-py ⭐ 1,351
    A library for creating swarm-style multi-agent systems using LangGraph. A swarm is a type of multi-agent architecture where agents dynamically hand off control to one another based on their specializations
    🔗 langchain-ai.github.io/langgraph/concepts/multi_agent

  93. bytedance-seed/m3-agent ⭐ 1,214
    Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory

  94. k-dense-ai/karpathy ⭐ 1,186
    An agentic Machine Learning Engineer that trains state-of-the-art ML models using Claude Code SDK and Google ADK
    🔗 k-dense.ai

  95. plurai-ai/intellagent ⭐ 1,161
    Simulate interactions, analyze performance, and gain actionable insights for conversational agents. Test, evaluate, and optimize your agent to ensure reliable real-world deployment.
    🔗 intellagent-doc.plurai.ai

  96. google-deepmind/concordia ⭐ 1,155
    Concordia is a library to facilitate construction and use of generative agent-based models to simulate interactions of agents in grounded physical, social, or digital space.

  97. strnad/CrewAI-Studio ⭐ 1,144
    agentic, gui, automation

  98. thudm/CogAgent ⭐ 1,124
    An open-sourced end-to-end VLM-based GUI Agent

  99. victordibia/autogen-ui ⭐ 981
    Web UI for AutoGen (a framework for multi-agent LLM applications)

  100. thytu/Agentarium ⭐ 933
    Framework for managing and orchestrating AI agents with ease. Agentarium provides a flexible and intuitive way to create, manage, and coordinate interactions between multiple AI agents in various environments.

  101. alpha-innovator/InternAgent ⭐ 840
    When Agent Becomes the Scientist – Building Closed-Loop System from Hypothesis to Verification
    🔗 discovery.intern-ai.org.cn/home

  102. deedy/mac_computer_use ⭐ 831
    A fork of Anthropic Computer Use that you can run on Mac computers to give Claude and other AI models autonomous access to your computer.
    🔗 x.com/deedydas/status/1849481225041559910

  103. codingmoh/open-codex ⭐ 665
    Open Codex is a fully open-source command-line AI assistant inspired by OpenAI Codex, supporting local language models like phi-4-mini and full integration with Ollama.

  104. salesforceairesearch/AgentLite ⭐ 641
    AgentLite is a research-oriented library designed for building and advancing LLM-based task-oriented agent systems. It simplifies the implementation of new agent/multi-agent architectures, enabling easy orchestration of multiple agents through a manager agent.

  105. quantalogic/quantalogic ⭐ 461
    QuantaLogic is a ReAct (Reasoning & Action) framework for building advanced AI agents. The CLI version includes coding capabilities comparable to Aider.

  106. agentscope-ai/agentscope-runtime ⭐ 375
    AgentScope Runtime: secure sandboxed tool execution and scalable agent deployment
    🔗 runtime.agentscope.io

  107. mannaandpoem/OpenManus ⭐ 306
    Open source version of Manus, the general AI agent

  108. sakanaai/AI-Scientist-ICLR2025-Workshop-Experiment ⭐ 279
    A paper produced by The AI Scientist passed a peer-review process at a workshop in a top machine learning conference

  109. prithivirajdamodaran/Route0x ⭐ 119
    A production-grade query routing solution, leveraging LLMs while optimizing for cost per query

Code Quality

Code quality tooling: linters, formatters, pre-commit hooks, unused code removal.

  1. astral-sh/ruff ⭐ 45,341
    An extremely fast Python linter and code formatter, written in Rust.
    🔗 docs.astral.sh/ruff

  2. psf/black ⭐ 41,322
    The uncompromising Python code formatter
    🔗 black.readthedocs.io/en/stable

  3. pre-commit/pre-commit ⭐ 14,844
    A framework for managing and maintaining multi-language pre-commit hooks.
    🔗 pre-commit.com

  4. google/yapf ⭐ 13,977
    A formatter for Python files

  5. sqlfluff/sqlfluff ⭐ 9,443
    A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
    🔗 www.sqlfluff.com

  6. pycqa/isort ⭐ 6,893
    A Python utility / library to sort imports.
    🔗 pycqa.github.io/isort

  7. davidhalter/jedi ⭐ 6,100
    Awesome autocompletion, static analysis and refactoring library for python
    🔗 jedi.readthedocs.io

  8. pycqa/pylint ⭐ 5,638
    It's not just a linter that annoys you!
    🔗 pylint.readthedocs.io/en/latest

  9. jendrikseipp/vulture ⭐ 4,291
    Find dead Python code

  10. asottile/pyupgrade ⭐ 4,031
    A tool (and pre-commit hook) to automatically upgrade syntax for newer versions of the language.

  11. pycqa/flake8 ⭐ 3,752
    flake8 is a python tool that glues together pycodestyle, pyflakes, mccabe, and third-party plugins to check the style and quality of some python code.
    🔗 flake8.pycqa.org

  12. wemake-services/wemake-python-styleguide ⭐ 2,813
    The strictest and most opinionated python linter ever!
    🔗 wemake-python-styleguide.rtfd.io

  13. python-lsp/python-lsp-server ⭐ 2,467
    Fork of the python-language-server project, maintained by the Spyder IDE team and the community

  14. tconbeer/sqlfmt ⭐ 504
    sqlfmt formats your dbt SQL files so you don't have to
    🔗 sqlfmt.com

Crypto and Blockchain

Cryptocurrency and blockchain libraries: trading bots, API integration, Ethereum virtual machine, solidity.

  1. freqtrade/freqtrade ⭐ 46,244
    Free, open source crypto trading bot
    🔗 www.freqtrade.io

  2. ccxt/ccxt ⭐ 40,661
    A cryptocurrency trading API with more than 100 exchanges in JavaScript / TypeScript / Python / C# / PHP / Go
    🔗 docs.ccxt.com

  3. crytic/slither ⭐ 6,100
    Static Analyzer for Solidity and Vyper
    🔗 blog.trailofbits.com/2018/10/19/slither-a-solidity-static-analysis-framework

  4. ethereum/web3.py ⭐ 5,472
    A python interface for interacting with the Ethereum blockchain and ecosystem.
    🔗 web3py.readthedocs.io

  5. ethereum/consensus-specs ⭐ 3,876
    Ethereum Proof-of-Stake Consensus Specifications
    🔗 ethereum.github.io/consensus-specs

  6. cyberpunkmetalhead/Binance-volatility-trading-bot ⭐ 3,490
    This is a fully functioning Binance trading bot that measures the volatility of every coin on Binance and places trades with the highest gaining coins. If you like this project, consider donating through the Brave browser to allow me to continuously improve the script.

  7. bmoscon/cryptofeed ⭐ 2,674
    Cryptocurrency Exchange Websocket Data Feed Handler

  8. binance/binance-public-data ⭐ 2,193
    Details on how to get Binance public data

  9. coinbase/agentkit ⭐ 1,042
    AgentKit is Coinbase Developer Platform's framework for easily enabling AI agents to take actions onchain. It is designed to be framework-agnostic, so you can use it with any AI framework, and wallet-agnostic
    🔗 docs.cdp.coinbase.com/agentkit/docs/welcome

  10. dylanhogg/awesome-crypto ⭐ 83
    A list of awesome crypto and blockchain projects
    🔗 www.awesomecrypto.xyz

Data

General data libraries: data processing, serialisation, formats, databases, SQL, connectors, web crawlers, data generation/augmentation/checks.

  1. microsoft/markitdown ⭐ 85,645
    A utility for converting files to Markdown, supports: PDF, PPT, Word, Excel, Images etc
    
  2. scrapy/scrapy ⭐ 59,530
    Scrapy, a fast high-level web crawling & scraping framework for Python.
    🔗 scrapy.org

  3. pathwaycom/pathway ⭐ 57,882
    Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
    🔗 pathway.com

  4. ds4sd/docling ⭐ 50,964
    Docling parses documents and exports them to the desired format with ease and speed.
    🔗 docling-project.github.io/docling

  5. apache/spark ⭐ 42,688
    Apache Spark - A unified analytics engine for large-scale data processing
    🔗 spark.apache.org

  6. mindsdb/mindsdb ⭐ 38,308
    Federated Query Engine for AI - The only MCP Server you'll ever need
    🔗 mindsdb.com

  7. jaidedai/EasyOCR ⭐ 28,818
    Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
    🔗 www.jaided.ai

  8. qdrant/qdrant ⭐ 28,373
    Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
    🔗 qdrant.tech

  9. getredash/redash ⭐ 28,172
    Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
    🔗 redash.io

  10. humansignal/label-studio ⭐ 26,250
    Label Studio is an open source data labeling tool. It lets you label data types like audio, text, images, videos, and time series with a simple and straightforward UI and export to various model formats.
    🔗 labelstud.io

  11. chroma-core/chroma ⭐ 25,709
    Open-source search and retrieval database for AI applications.
    🔗 www.trychroma.com

  12. airbytehq/airbyte ⭐ 20,539
    The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
    🔗 airbyte.com

  13. joke2k/faker ⭐ 19,043
    Faker is a Python package that generates fake data for you.
    🔗 faker.readthedocs.io

  14. avaiga/taipy ⭐ 19,023
    Turns Data and AI algorithms into production-ready web applications in no time.
    🔗 www.taipy.io

  15. tiangolo/sqlmodel ⭐ 17,536
    SQL databases in Python, designed for simplicity, compatibility, and robustness.
    🔗 sqlmodel.tiangolo.com

  16. binux/pyspider ⭐ 17,032
    A Powerful Spider (Web Crawler) System in Python.
    🔗 docs.pyspider.org

  17. apache/arrow ⭐ 16,429
    Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
    🔗 arrow.apache.org

  18. twintproject/twint ⭐ 16,295
    An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.

  19. weaviate/weaviate ⭐ 15,466
    Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a cloud-native database.
    🔗 weaviate.io/developers/weaviate

  20. cyclotruc/gitingest ⭐ 13,754
    Turn any Git repository into a prompt-friendly text ingest for LLMs.
    🔗 gitingest.com

  21. redis/redis-py ⭐ 13,436
    Redis Python client

  22. s0md3v/Photon ⭐ 12,622
    Incredibly fast crawler designed for OSINT.

  23. googleapis/genai-toolbox ⭐ 12,598
    MCP Toolbox for Databases is an open source MCP server for databases. Develop tools easier, faster, and more securely by handling connection pooling, authentication.
    🔗 googleapis.github.io/genai-toolbox/getting-started/introduction

  24. coleifer/peewee ⭐ 11,909
    a small, expressive orm -- supports postgresql, mysql, sqlite and cockroachdb
    🔗 docs.peewee-orm.com

  25. sqlalchemy/sqlalchemy ⭐ 11,420
    The Database Toolkit for Python
    🔗 www.sqlalchemy.org

  26. simonw/datasette ⭐ 10,709
    An open source multi-tool for exploring and publishing data
    🔗 datasette.io

  27. gristlabs/grist-core ⭐ 10,465
    Grist is the evolution of spreadsheets.
    🔗 www.getgrist.com

  28. voxel51/fiftyone ⭐ 10,272
    Refine high-quality datasets and visual AI models
    🔗 fiftyone.ai

  29. bigscience-workshop/petals ⭐ 9,880
    🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
    🔗 petals.dev

  30. yzhao062/pyod ⭐ 9,685
    A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques
    🔗 pyod.readthedocs.io

  31. tobymao/sqlglot ⭐ 8,843
    Python SQL Parser and Transpiler
    🔗 sqlglot.com

  32. lancedb/lancedb ⭐ 8,604
    Developer-friendly OSS embedded retrieval library for multimodal AI. Search More; Manage Less.
    🔗 lancedb.com/docs

  33. kaggle/kaggle-api ⭐ 7,106
    Official Kaggle API

  34. alirezamika/autoscraper ⭐ 7,076
    A Smart, Automatic, Fast and Lightweight Web Scraper for Python

  35. ibis-project/ibis ⭐ 6,360
    Ibis is a Python library that provides a lightweight, universal interface for data wrangling. It helps Python users explore and transform data of any size, stored anywhere.
    🔗 ibis-project.org

  36. madmaze/pytesseract ⭐ 6,301
    A Python wrapper for Google Tesseract

  37. vi3k6i5/flashtext ⭐ 5,701
    Extract Keywords from sentence or Replace keywords in sentences.

  38. rapidai/RapidOCR ⭐ 5,700
    📄 Awesome OCR toolkits for multiple programming languages, based on ONNXRuntime, OpenVINO, PaddlePaddle and PyTorch.
    🔗 rapidai.github.io/rapidocrdocs

  39. airbnb/knowledge-repo ⭐ 5,540
    A next-generation curated knowledge sharing platform for data scientists and other technical professions.

  40. superduperdb/superduper ⭐ 5,251
    Superduper: End-to-end framework for building custom AI applications and agents.
    🔗 superduper.io

  41. adbar/trafilatura ⭐ 5,219
    Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
    🔗 trafilatura.readthedocs.io

  42. giskard-ai/giskard-oss ⭐ 5,083
    🐢 Open-Source Evaluation & Testing library for LLM Agents
    🔗 docs.giskard.ai

  43. facebookresearch/AugLy ⭐ 5,072
    A data augmentations library for audio, image, text, and video.
    🔗 ai.facebook.com/blog/augly-a-new-data-augmentation-library-to-help-build-more-robust-ai-models

  44. dlt-hub/dlt ⭐ 4,828
    data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
    🔗 dlthub.com/docs

  45. lk-geimfari/mimesis ⭐ 4,773
    Mimesis is a fast Python library for generating fake data in multiple languages.
    🔗 mimesis.name

  46. jazzband/tablib ⭐ 4,753
    Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c.
    🔗 tablib.readthedocs.io

  47. amundsen-io/amundsen ⭐ 4,719
    Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.
    🔗 www.amundsen.io/amundsen

  48. mangiucugna/json_repair ⭐ 4,417
    A python module to repair invalid JSON from LLMs
    🔗 pypi.org/project/json-repair

  49. rom1504/img2dataset ⭐ 4,345
    Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

  50. mlabonne/llm-datasets ⭐ 4,190
    Curated list of datasets and tools for post-training.
    🔗 mlabonne.github.io/blog

  51. deepchecks/deepchecks ⭐ 3,968
    Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling you to thoroughly test your data and models from research to production.
    🔗 docs.deepchecks.com/stable

  52. sqlalchemy/alembic ⭐ 3,909
    A database migrations tool for SQLAlchemy.

  53. run-llama/llama-hub ⭐ 3,483
    A library of data loaders for LLMs made by the community -- to be used with LlamaIndex and/or LangChain
    🔗 llamahub.ai

  54. sdv-dev/SDV ⭐ 3,394
    Synthetic data generation for tabular data
    🔗 docs.sdv.dev/sdv

  55. docarray/docarray ⭐ 3,110
    Represent, send, store and search multimodal data
    🔗 docs.docarray.org

  56. datafold/data-diff ⭐ 2,992
    Compare tables within or across databases
    🔗 docs.datafold.com

  57. huggingface/datatrove ⭐ 2,848
    DataTrove is a library to process, filter and deduplicate text data at a very large scale. It provides a set of prebuilt commonly used processing blocks with a framework to easily add custom functionality

  58. pynamodb/PynamoDB ⭐ 2,643
    A pythonic interface to Amazon's DynamoDB
    🔗 pynamodb.readthedocs.io

  59. aminalaee/sqladmin ⭐ 2,620
    SQLAlchemy Admin for FastAPI and Starlette
    🔗 aminalaee.github.io/sqladmin

  60. pikepdf/pikepdf ⭐ 2,616
    A Python library for reading and writing PDF, powered by QPDF
    🔗 pikepdf.readthedocs.io

  61. sfu-db/connector-x ⭐ 2,537
    Fastest library to load data from DB to DataFrames in Rust and Python
    🔗 sfu-db.github.io/connector-x

  62. uqfoundation/dill ⭐ 2,423
    serialize all of Python
    🔗 dill.rtfd.io

  63. milvus-io/bootcamp ⭐ 2,360
    Dealing with all unstructured data, such as reverse image search, audio search, molecular search, video analysis, question and answer systems, NLP, etc.
    🔗 milvus.io

  64. emirozer/fake2db ⭐ 2,352
    Generate fake but valid data filled databases for test purposes using most popular patterns (AFAIK). Current support is sqlite, mysql, postgresql, mongodb, redis, couchdb.

  65. accenture/AmpliGraph ⭐ 2,227
    Python library for Representation Learning on Knowledge Graphs
    🔗 docs.ampligraph.org

  66. collerek/ormar ⭐ 1,794
    python async orm with fastapi in mind and pydantic validation
    🔗 collerek.github.io/ormar

  67. huggingface/aisheets ⭐ 1,621
    Build, enrich, and transform datasets using AI models with no code. Deploy locally or on the Hub with access to thousands of open models.
    🔗 huggingface.co/spaces/aisheets/sheets

  68. d-star-ai/dsRAG ⭐ 1,551
    A retrieval engine for unstructured data. It is especially good at handling challenging queries over dense text, like financial reports, legal documents, and academic papers.

  69. quixio/quix-streams ⭐ 1,513
    Python Streaming DataFrames for Kafka
    🔗 docs.quix.io

  70. meta-llama/synthetic-data-kit ⭐ 1,474
    Tool for generating high-quality synthetic datasets to fine-tune LLMs. Generate Reasoning Traces, QA Pairs, save them to a fine-tuning format with a simple CLI.
    🔗 pypi.org/project/synthetic-data-kit

  71. igorbenav/fastcrud ⭐ 1,467
    FastCRUD is a Python package for FastAPI, offering robust async CRUD operations and flexible endpoint creation utilities.
    🔗 benavlabs.github.io/fastcrud

  72. mchong6/JoJoGAN ⭐ 1,439
    Official PyTorch repo for JoJoGAN: One Shot Face Stylization

  73. apache/iceberg-python ⭐ 984
    PyIceberg is a Python library for programmatic access to Iceberg table metadata as well as to table data in Iceberg format.
    🔗 py.iceberg.apache.org

  74. weaviate/recipes ⭐ 934
    This repository shares end-to-end notebooks on how to use various Weaviate features and integrations!

  75. ibm/data-prep-kit ⭐ 892
    Data Prep Kit is a community project to democratize and accelerate unstructured data preparation for LLM app developers
    🔗 data-prep-kit.github.io/data-prep-kit

  76. macbre/sql-metadata ⭐ 879
    Uses tokenized query returned by python-sqlparse and generates query metadata
    🔗 pypi.python.org/pypi/sql-metadata

  77. stackloklabs/deepfabric ⭐ 824
    Promptwright is a Python library designed for generating large synthetic datasets using LLMs
    🔗 docs.deepfabric.dev

  78. nvidia-nemo/DataDesigner ⭐ 653
    Create synthetic datasets that go beyond simple LLM prompting. Covers diverse statistical distributions, meaningful correlations between fields, or validated high-quality outputs.
    🔗 nvidia-nemo.github.io/datadesigner

  79. koaning/bulk ⭐ 598
    Bulk is a quick UI developer tool to apply some bulk labels.

  80. titan-systems/titan ⭐ 478
    Snowflake infrastructure-as-code. Provision environments, automate deploys, CI/CD. Manage RBAC, users, roles, and data access. Declarative Python Resource API.

  81. pmgraham/datagrunt ⭐ 10
    Datagrunt is a Python library designed to simplify the way you work with CSV files. It provides a streamlined approach to reading, processing, and transforming your data into various formats, making data manipulation efficient and intuitive.
    🔗 www.datagrunt.io

Debugging

Debugging and tracing tools.

  1. cool-rr/PySnooper ⭐ 16,596
    Never use print for debugging again

  2. gruns/icecream ⭐ 10,003
    🍦 Never use print() to debug again.

  3. shobrook/rebound ⭐ 4,135
    Instant Stack Overflow results whenever an exception is thrown

Diffusion Text to Image

Text-to-image diffusion model libraries, tools and apps for generating images from natural language.

  1. automatic1111/stable-diffusion-webui ⭐ 160,173
    Stable Diffusion web UI

  2. comfyanonymous/ComfyUI ⭐ 101,251
    The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
    🔗 www.comfy.org

  3. compvis/stable-diffusion ⭐ 72,246
    A latent text-to-image diffusion model
    🔗 ommer-lab.com/research/latent-diffusion-models

  4. lllyasviel/ControlNet ⭐ 33,589
    Let us control diffusion models!

  5. huggingface/diffusers ⭐ 32,569
    🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
    🔗 huggingface.co/docs/diffusers

  6. stability-ai/generative-models ⭐ 26,844
    Generative Models by Stability AI

  7. invoke-ai/InvokeAI ⭐ 26,601
    Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry leading WebUI, and serves as the foundation for multiple commercial products.
    🔗 invoke-ai.github.io/invokeai

  8. openbmb/MiniCPM-V ⭐ 22,680
    MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone

  9. apple/ml-stable-diffusion ⭐ 17,782
    Stable Diffusion with Core ML on Apple Silicon

  10. borisdayma/dalle-mini ⭐ 14,813
    DALL·E Mini - Generate images from a text prompt
    🔗 www.craiyon.com

  11. compvis/latent-diffusion ⭐ 13,801
    High-Resolution Image Synthesis with Latent Diffusion Models

  12. divamgupta/diffusionbee-stable-diffusion-ui ⭐ 13,497
    Diffusion Bee is the easiest way to run Stable Diffusion locally on your M1 Mac. Comes with a one-click installer. No dependencies or technical knowledge needed.
    🔗 diffusionbee.com

  13. facebookresearch/dinov2 ⭐ 12,286
    PyTorch code and models for the DINOv2 self-supervised learning method.

  14. instantid/InstantID ⭐ 11,900
    InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
    🔗 instantid.github.io

  15. lucidrains/DALLE2-pytorch ⭐ 11,337
    Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch

  16. opengvlab/InternVL ⭐ 9,736
    [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model with performance approaching GPT-4o
    🔗 internvl.readthedocs.io/en/latest

  17. idea-research/GroundingDINO ⭐ 9,626
    [ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
    🔗 arxiv.org/abs/2303.05499

  18. ashawkey/stable-dreamfusion ⭐ 8,795
    Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.

  19. carson-katri/dream-textures ⭐ 8,108
    Stable Diffusion built-in to Blender

  20. xavierxiao/Dreambooth-Stable-Diffusion ⭐ 7,754
    Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion

  21. timothybrooks/instruct-pix2pix ⭐ 6,869
    PyTorch implementation of InstructPix2Pix, an instruction-based image editing model, based on the original CompVis/stable_diffusion repo.

  22. openai/consistency_models ⭐ 6,469
    Official repo for consistency models.

  23. salesforce/BLIP ⭐ 5,643
    PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

  24. nateraw/stable-diffusion-videos ⭐ 4,654
    Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts

  25. lkwq007/stablediffusion-infinity ⭐ 3,888
    Outpainting with Stable Diffusion on an infinite canvas

  26. jina-ai/discoart ⭐ 3,834
    🪩 Create Disco Diffusion artworks in one line

  27. openai/improved-diffusion ⭐ 3,783
    Release for Improved Denoising Diffusion Probabilistic Models

  28. open-compass/VLMEvalKit ⭐ 3,741
    Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
    🔗 huggingface.co/spaces/opencompass/open_vlm_leaderboard

  29. mlc-ai/web-stable-diffusion ⭐ 3,712
    Bringing stable diffusion models to web browsers. Everything runs inside the browser with no server support.
    🔗 mlc.ai/web-stable-diffusion

  30. openai/glide-text2im ⭐ 3,682
    GLIDE: a diffusion-based text-conditional image synthesis model

  31. google-research/big_vision ⭐ 3,333
    Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.

  32. saharmor/dalle-playground ⭐ 2,749
    A playground to generate images from any text prompt using Stable Diffusion (past: using DALL-E Mini)

  33. stability-ai/stability-sdk ⭐ 2,434
    SDK for interacting with stability.ai APIs (e.g. stable diffusion inference)
    🔗 platform.stability.ai

  34. thudm/CogVLM2 ⭐ 2,427
    GPT4V-level open-source multi-modal model based on Llama3-8B

  35. coyote-a/ultimate-upscale-for-automatic1111 ⭐ 1,765
    Ultimate SD Upscale extension for AUTOMATIC1111 Stable Diffusion web UI

  36. divamgupta/stable-diffusion-tensorflow ⭐ 1,613
    Stable Diffusion in TensorFlow / Keras

  37. nvlabs/prismer ⭐ 1,309
    The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".
    🔗 shikun.io/projects/prismer

  38. chenyangqiqi/FateZero ⭐ 1,159
    [ICCV 2023 Oral] "FateZero: Fusing Attentions for Zero-shot Text-based Video Editing"
    🔗 fate-zero-edit.github.io

  39. tanelp/tiny-diffusion ⭐ 978
    A minimal PyTorch implementation of probabilistic diffusion models for 2D datasets.

  40. gojasper/flash-diffusion ⭐ 650
    ⚡ Flash Diffusion ⚡: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation (AAAI 2025 Oral)
    🔗 gojasper.github.io/flash-diffusion-project

Finance

Financial and quantitative libraries: investment research tools, market data, algorithmic trading, backtesting, financial derivatives.

  1. openbb-finance/OpenBB ⭐ 59,275
    Financial data platform for analysts, quants and AI agents.
    🔗 openbb.co

  2. virattt/ai-hedge-fund ⭐ 45,480
    AI-powered hedge fund. The goal of this project is to explore the use of AI to make trading decisions.

  3. microsoft/qlib ⭐ 35,948
    Qlib is an AI-oriented Quant investment platform that aims to use AI tech to empower Quant Research, from exploring ideas to implementing productions. Qlib supports diverse ML modeling paradigms, including supervised learning, market dynamics modeling, and RL, and is now equipped with https://github.com/microsoft/RD...
    🔗 qlib.readthedocs.io/en/latest

  4. ranaroussi/yfinance ⭐ 20,973
    Download market data from Yahoo! Finance's API
    🔗 ranaroussi.github.io/yfinance

  5. mementum/backtrader ⭐ 20,206
    Python Backtesting library for trading strategies
    🔗 www.backtrader.com

  6. quantopian/zipline ⭐ 19,357
    Zipline, a Pythonic Algorithmic Trading Library
    🔗 www.zipline.io

  7. ai4finance-foundation/FinGPT ⭐ 18,450
    FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.
    🔗 ai4finance.org

  8. quantconnect/Lean ⭐ 16,011
    Lean Algorithmic Trading Engine by QuantConnect (Python, C#)
    🔗 lean.io

  9. ai4finance-foundation/FinRL ⭐ 13,790
    FinRL®: Financial Reinforcement Learning. 🔥
    🔗 ai4finance.org

  10. ta-lib/ta-lib-python ⭐ 11,640
    Python wrapper for TA-Lib (http://ta-lib.org/).
    🔗 ta-lib.github.io/ta-lib-python

  11. shiyu-coder/Kronos ⭐ 10,153
    Open-source foundation model for financial candlesticks, trained on data from over 45 global exchanges

  12. goldmansachs/gs-quant ⭐ 9,845
    Python toolkit for quantitative finance
    🔗 developer.gs.com/discover/products/gs-quant

  13. kernc/backtesting.py ⭐ 7,825
    🔎 📈 🐍 💰 Backtest trading strategies in Python.
    🔗 kernc.github.io/backtesting.py

  14. ranaroussi/quantstats ⭐ 6,619
    Portfolio analytics for quants, written in Python

  15. polakowo/vectorbt ⭐ 6,524
    Find your trading edge, using the fastest engine for backtesting, algorithmic trading, and research.
    🔗 vectorbt.dev

  16. quantopian/pyfolio ⭐ 6,207
    Portfolio and risk analytics in Python
    🔗 quantopian.github.io/pyfolio

  17. borisbanushev/stockpredictionai ⭐ 5,400
    In this notebook I will create a complete process for predicting stock price movements. Follow along and we will achieve some pretty good results. For that purpose we will use a Generative Adversarial Network (GAN) with LSTM, a type of Recurrent Neural Network, as generator, and a Convolutional Neural Networ...

  18. google/tf-quant-finance ⭐ 5,194
    High-performance TensorFlow library for quantitative finance.

  19. gbeced/pyalgotrade ⭐ 4,635
    Python Algorithmic Trading Library
    🔗 gbeced.github.io/pyalgotrade

  20. matplotlib/mplfinance ⭐ 4,272
    Financial Markets Data Visualization using Matplotlib
    🔗 pypi.org/project/mplfinance

  21. quantopian/alphalens ⭐ 4,103
    Performance analysis of predictive (alpha) stock factors
    🔗 quantopian.github.io/alphalens

  22. zvtvz/zvt ⭐ 3,913
    modular quant framework.
    🔗 zvt.readthedocs.io/en/latest

  23. cuemacro/finmarketpy ⭐ 3,701
    Python library for backtesting trading strategies & analyzing financial markets (formerly pythalesians)
    🔗 www.cuemacro.com

  24. domokane/FinancePy ⭐ 2,755
    A Python Finance Library that focuses on the pricing and risk-management of Financial Derivatives, including fixed-income, equity, FX and credit derivatives.

  25. blankly-finance/blankly ⭐ 2,400
    🚀 💸 Easily build, backtest and deploy your algo in just a few lines of code. Trade stocks, cryptos, and forex across exchanges w/ one package.
    🔗 package.blankly.finance

  26. cuemacro/findatapy ⭐ 1,968
    Python library to download market data via Bloomberg, Eikon, Quandl, Yahoo etc.

  27. ivebotunac/PrimoAgent ⭐ 268
    PrimoAgent is a multi-agent AI stock analysis system built on LangGraph architecture that orchestrates four specialized agents to provide comprehensive daily trading insights and next-day price predictions
    🔗 primoinvesting.com

Game Development

Game development tools, engines and libraries.

  1. kitao/pyxel ⭐ 16,976
    A retro game engine for Python

  2. microsoft/TRELLIS ⭐ 11,630
    A large 3D asset generation model. It takes in text or image prompts and generates high-quality 3D assets in various formats, such as Radiance Fields, 3D Gaussians, and meshes.
    🔗 trellis3d.github.io

  3. pygame/pygame ⭐ 8,579
    🐍🎮 pygame (the library) is a Free and Open Source python programming language library for making multimedia applications like games built on top of the excellent SDL library. C, Python, Native, OpenGL.
    🔗 www.pygame.org

  4. panda3d/panda3d ⭐ 5,025
    Powerful, mature open-source cross-platform game engine for Python and C++, developed by Disney and CMU
    🔗 www.panda3d.org

  5. pyglet/pyglet ⭐ 2,148
    pyglet is a cross-platform windowing and multimedia library for Python, for developing games and other visually rich applications.
    🔗 pyglet.org

  6. pythonarcade/arcade ⭐ 1,959
    Easy to use Python library for creating 2D arcade games.
    🔗 arcade.academy

GIS

Geospatial libraries: raster and vector data formats, interactive mapping and visualisation, computing frameworks for processing images, projections.

  1. domlysz/BlenderGIS ⭐ 8,737
    Blender addons to make the bridge between Blender and geographic data

  2. python-visualization/folium ⭐ 7,308
    Python Data. Leaflet.js Maps.
    🔗 python-visualization.github.io/folium

  3. originalankur/maptoposter ⭐ 7,190
    Transform your favorite cities into beautiful, minimalist designs. MapToPoster lets you create and export visually striking map posters with code.

  4. osgeo/gdal ⭐ 5,729
    GDAL is an open source MIT licensed translator library for raster and vector geospatial data formats.
    🔗 gdal.org

  5. gboeing/osmnx ⭐ 5,549
    Download, model, analyze, and visualize street networks and other geospatial features from OpenStreetMap.
    🔗 osmnx.readthedocs.io

  6. geopandas/geopandas ⭐ 5,027
    Python tools for geographic data
    🔗 geopandas.org

  7. shapely/shapely ⭐ 4,358
    Manipulation and analysis of geometric objects
    🔗 shapely.readthedocs.io/en/stable

  8. opengeos/segment-geospatial ⭐ 3,863
    A Python package for segmenting geospatial data with the Segment Anything Model (SAM)
    🔗 samgeo.gishub.org

  9. giswqs/geemap ⭐ 3,856
    A Python package for interactive geospatial analysis and visualization with Google Earth Engine.
    🔗 geemap.org

  10. microsoft/torchgeo ⭐ 3,841
    TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
    🔗 www.osgeo.org/projects/torchgeo

  11. opengeos/leafmap ⭐ 3,656
    A Python package for interactive mapping and geospatial analysis with minimal coding in a Jupyter environment
    🔗 leafmap.org

  12. holoviz/datashader ⭐ 3,501
    Quickly and accurately render even the largest data.
    🔗 datashader.org

  13. rasterio/rasterio ⭐ 2,474
    Rasterio reads and writes geospatial raster datasets
    🔗 rasterio.readthedocs.io

  14. plant99/felicette ⭐ 1,831
    Satellite imagery for dummies.

  15. microsoft/GlobalMLBuildingFootprints ⭐ 1,772
    Worldwide building footprints derived from satellite imagery

  16. pysal/pysal ⭐ 1,458
    PySAL: Python Spatial Analysis Library Meta-Package
    🔗 pysal.org/pysal

  17. residentmario/geoplot ⭐ 1,190
    High-level geospatial data visualization library for Python.
    🔗 residentmario.github.io/geoplot/index.html

Graph

Graphs and network libraries: network analysis, graph machine learning, visualisation.

  1. networkx/networkx ⭐ 16,546
    Network Analysis in Python
    🔗 networkx.org

  2. stellargraph/stellargraph ⭐ 3,044
    StellarGraph - Machine Learning on Graphs
    🔗 stellargraph.readthedocs.io

  3. microsoft/graspologic ⭐ 961
    graspologic is a package for graph statistical algorithms
    🔗 graspologic-org.github.io/graspologic

  4. dylanhogg/llmgraph ⭐ 496
    Create knowledge graphs with LLMs

GUI

Graphical user interface libraries and toolkits.

  1. hoffstadt/DearPyGui ⭐ 15,133
    Dear PyGui: A fast and powerful Graphical User Interface Toolkit for Python with minimal dependencies
    🔗 dearpygui.readthedocs.io/en/latest

  2. pysimplegui/PySimpleGUI ⭐ 13,713
    Python GUIs for Humans! PySimpleGUI is the top-rated Python application development environment. Launched in 2018 and actively developed, maintained, and supported in 2024. Transforms tkinter, Qt, WxPython, and Remi into a simple, intuitive, and fun experience for both hobbyists and expert users.
    🔗 www.pysimplegui.com

  3. parthjadhav/Tkinter-Designer ⭐ 10,170
    An easy and fast way to create a Python GUI 🐍

  4. samuelcolvin/FastUI ⭐ 8,947
    FastUI is a new way to build web application user interfaces defined by declarative Python code.
    🔗 fastui-demo.onrender.com

  5. r0x0r/pywebview ⭐ 5,687
    Build GUI for your Python program with JavaScript, HTML, and CSS
    🔗 pywebview.flowrl.com

  6. beeware/toga ⭐ 5,291
    A Python native, OS native GUI toolkit.
    🔗 toga.readthedocs.io/en/latest

Jupyter

Jupyter, JupyterLab and Notebook tools, libraries and plugins.

  1. marimo-team/marimo ⭐ 18,645
    A reactive Python notebook: run a cell or interact with a UI element, and marimo automatically runs dependent cells, keeping code and outputs consistent. marimo notebooks are stored as pure Python, executable as scripts, and deployable as apps.
    🔗 marimo.io

  2. jupyterlab/jupyterlab ⭐ 14,996
    JupyterLab computational environment.
    🔗 jupyterlab.readthedocs.io

  3. jupyter/notebook ⭐ 12,908
    Jupyter Interactive Notebook
    🔗 jupyter-notebook.readthedocs.io

  4. garrettj403/SciencePlots ⭐ 8,518
    Matplotlib styles for scientific plotting

  5. mwouts/jupytext ⭐ 7,096
    Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts
    🔗 jupytext.readthedocs.io

  6. nteract/papermill ⭐ 6,358
    📚 Parameterize, execute, and analyze notebooks
    🔗 papermill.readthedocs.io/en/latest

  7. voila-dashboards/voila ⭐ 5,886
    Voilà turns Jupyter notebooks into standalone web applications
    🔗 voila.readthedocs.io

  8. connorferster/handcalcs ⭐ 5,801
    Python library for converting Python calculations into rendered LaTeX.

  9. jupyterlite/jupyterlite ⭐ 4,738
    Wasm powered Jupyter running in the browser 💡
    🔗 jupyterlite.rtfd.io/en/stable/try/lab

  10. executablebooks/jupyter-book ⭐ 4,207
    Create beautiful, publication-quality books and documents from computational content.
    🔗 jupyterbook.org

  11. jupyterlab/jupyterlab-desktop ⭐ 4,180
    JupyterLab desktop application, based on Electron.

  12. jupyterlab/jupyter-ai ⭐ 4,094
    A generative AI extension for JupyterLab
    🔗 jupyter-ai.readthedocs.io

  13. mito-ds/mito ⭐ 2,609
    Jupyter extensions that help you write code faster: Context aware AI Chat, Autocomplete, and Spreadsheet
    🔗 trymito.io

  14. deepnote/deepnote ⭐ 2,585
    Deepnote is a successor of Jupyter. It uses the Deepnote kernel which is more powerful but still backwards compatible so you can seamlessly move between both, but adds an AI agent, sleek UI, new block types, and native data integrations.
    🔗 deepnote.com/?utm_source=github&utm_medium=github&utm_campaign=github&utm_content=readme_main

  15. koaning/drawdata ⭐ 1,608
    Draw datasets from within Python notebooks.
    🔗 koaning.github.io/drawdata

  16. infuseai/colab-xterm ⭐ 477
    Open a terminal in colab, including the free tier.

LLMs and ChatGPT

Large language model and GPT libraries and frameworks: auto-gpt, agents, QnA, chain-of-thought workflows, API integrations. Also see the Natural Language Processing category for crossover.

  1. significant-gravitas/AutoGPT โญ 181,398
    AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
    ๐Ÿ”— agpt.co

  2. open-webui/open-webui โญ 121,671
    Open WebUI is an extensible, feature-rich, and user-friendly self-hosted AI platform designed to operate entirely offline. It supports various LLM runners like Ollama and OpenAI-compatible APIs, with built-in inference engine for RAG
    ๐Ÿ”— openwebui.com

  3. deepseek-ai/DeepSeek-V3 โญ 101,281
    A strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token.

  4. ggerganov/llama.cpp โญ 93,624
    LLM inference in C/C++

  5. nomic-ai/gpt4all โญ 77,052
    GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
    ๐Ÿ”— nomic.ai/gpt4all

  6. modelcontextprotocol/servers โญ 77,017
    A collection of reference implementations for the Model Context Protocol (MCP), as well as references to community built servers
    🔗 modelcontextprotocol.io

  7. infiniflow/ragflow ⭐ 72,047
    RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs
    🔗 ragflow.io

  8. vllm-project/vllm ⭐ 68,388
    A high-throughput and memory-efficient inference and serving engine for LLMs
    🔗 vllm.ai

  9. hiyouga/LlamaFactory ⭐ 66,357
    Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
    🔗 llamafactory.readthedocs.io

  10. xtekky/gpt4free ⭐ 65,698
    The official gpt4free repository | a collection of various powerful language models | o4, o3 and deepseek r1, gpt-4.1, gemini 2.5
    🔗 t.me/g4f_channel

  11. killianlucas/open-interpreter ⭐ 61,792
    A natural language interface for computers
    🔗 openinterpreter.com

  12. facebookresearch/llama ⭐ 59,082
    Inference code for Llama models

  13. unclecode/crawl4ai ⭐ 58,929
    AI-ready web crawling tailored for LLMs, AI agents, and data pipelines. Open source, flexible, and built for real-time performance, Crawl4AI empowers developers with unmatched speed, precision, and deployment ease.
    🔗 crawl4ai.com

  14. imartinez/private-gpt ⭐ 57,072
    Interact with your documents using the power of GPT, 100% privately, no data leaks
    🔗 privategpt.dev

  15. gpt-engineer-org/gpt-engineer ⭐ 55,205
    CLI platform to experiment with codegen. Precursor to: https://lovable.dev

  16. pathwaycom/llm-app ⭐ 54,566
    Ready-to-run cloud templates for RAG, AI pipelines, and enterprise search with live data. 🐳 Docker-friendly. ⚡ Always in sync with Sharepoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.
    🔗 pathway.com/developers/templates

  17. karpathy/nanoGPT ⭐ 52,283
    The simplest, fastest repository for training/finetuning medium-sized GPTs.

  18. xai-org/grok-1 ⭐ 51,370
    This repository contains JAX example code for loading and running the Grok-1 open-weights model.

  19. unslothai/unsloth ⭐ 51,098
    Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.
    🔗 unsloth.ai/docs

  20. oobabooga/text-generation-webui ⭐ 45,923
    The definitive Web UI for local AI, with powerful features and easy setup.
    🔗 oobabooga.gumroad.com/l/deep_reason

  21. hpcaitech/ColossalAI ⭐ 41,330
    Making large AI models cheaper, faster and more accessible
    🔗 www.colossalai.org

  22. thudm/ChatGLM-6B ⭐ 41,222
    ChatGLM-6B: An Open Bilingual Dialogue Language Model

  23. karpathy/nanochat ⭐ 40,732
    A full-stack implementation of an LLM like ChatGPT in a single, clean, minimal, hackable, dependency-lite codebase

  24. exo-explore/exo ⭐ 40,421
    Run your own AI cluster at home. Unify your existing devices into one powerful GPU: iPhone, iPad, Android, Mac, NVIDIA, Raspberry Pi etc

  25. lm-sys/FastChat ⭐ 39,381
    An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

  26. quivrhq/quivr ⭐ 38,879
    Opinionated RAG for integrating GenAI in your apps 🧠 Focus on your product rather than the RAG. Easy integration in existing products with customisation! Any LLM: GPT4, Groq, Llama. Any Vectorstore: PGVector, Faiss. Any Files. Any way you want.
    🔗 core.quivr.com

  27. danielmiessler/Fabric ⭐ 38,472
    Fabric is an open-source framework for augmenting humans using AI. It provides a modular system for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.
    🔗 danielmiessler.com/p/fabric-origin-story

  28. laion-ai/Open-Assistant ⭐ 37,468
    OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
    🔗 open-assistant.io

  29. berriai/litellm ⭐ 34,399
    Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, load balancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, VLLM, NVIDIA NIM]
    🔗 docs.litellm.ai/docs

  30. moymix/TaskMatrix ⭐ 34,284
    Connects ChatGPT and a series of Visual Foundation Models to enable sending and receiving images during chatting.

  31. pythagora-io/gpt-pilot ⭐ 33,749
    The first real AI developer

  32. khoj-ai/khoj ⭐ 32,272
    Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI
    🔗 khoj.dev

  33. stanfordnlp/dspy ⭐ 31,759
    DSPy: The framework for programming, not prompting, language models
    🔗 dspy.ai

  34. anthropics/claude-cookbooks ⭐ 31,709
    Provides code and guides designed to help developers build with Claude, offering copy-able code snippets that you can easily integrate into your own projects.

  35. microsoft/graphrag ⭐ 30,500
    A modular graph-based Retrieval-Augmented Generation (RAG) system
    🔗 microsoft.github.io/graphrag

  36. tatsu-lab/stanford_alpaca ⭐ 30,273
    Code and documentation to train Stanford's Alpaca models, and generate the data.
    🔗 crfm.stanford.edu/2023/03/13/alpaca.html

  37. meta-llama/llama3 ⭐ 29,193
    The official Meta Llama 3 GitHub site

  38. karpathy/llm.c ⭐ 28,693
    LLM training in simple, pure C/CUDA. There is no need for 245MB of PyTorch or 107MB of cPython

  39. microsoft/semantic-kernel ⭐ 27,079
    An SDK that integrates LLMs like OpenAI, Azure OpenAI, and Hugging Face with conventional programming languages like C#, Python, and Java
    🔗 aka.ms/semantic-kernel

  40. qwenlm/Qwen3 ⭐ 26,271
    Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

  41. huggingface/open-r1 ⭐ 25,840
    The goal of this repo is to build the missing pieces of the R1 pipeline such that everybody can reproduce and build on top of it

  42. microsoft/BitNet ⭐ 25,810
    Official inference framework for 1-bit LLMs (e.g., BitNet b1.58). It offers a suite of optimized kernels that support fast and lossless inference of 1.58-bit models

  43. vision-cair/MiniGPT-4 ⭐ 25,765
    Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
    🔗 minigpt-4.github.io

  44. cinnamon/kotaemon ⭐ 24,865
    An open-source RAG UI for chatting with your documents. Built with both end users and developers in mind
    🔗 cinnamon.github.io/kotaemon

  45. openai/gpt-2 ⭐ 24,567
    Code for the paper "Language Models are Unsupervised Multitask Learners"
    🔗 openai.com/blog/better-language-models

  46. microsoft/JARVIS ⭐ 24,513
    JARVIS, a system to connect LLMs with the ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf

  47. haotian-liu/LLaVA ⭐ 24,364
    [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
    🔗 llava.hliu.cc

  48. nirdiamant/RAG_Techniques ⭐ 24,339
    The most comprehensive and dynamic collection of Retrieval-Augmented Generation (RAG) tutorials available today. This repository serves as a hub for cutting-edge techniques aimed at enhancing the accuracy, efficiency, and contextual richness of RAG systems.

  49. deepset-ai/haystack ⭐ 23,959
    AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversatio...
    🔗 haystack.deepset.ai

  50. google/langextract ⭐ 23,629
    Library that uses LLMs to extract structured information from unstructured text documents based on user-defined instructions
    🔗 pypi.org/project/langextract

  51. karpathy/minGPT ⭐ 23,333
    A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training

  52. sgl-project/sglang ⭐ 22,668
    SGLang is a high-performance serving framework for large language models and multimodal models.
    🔗 sglang.io

  53. vanna-ai/vanna ⭐ 22,377
    RAG (Retrieval-Augmented Generation) framework for SQL generation and related functionality.
    🔗 vanna.ai/docs

  54. mlc-ai/mlc-llm ⭐ 21,931
    Universal LLM Deployment Engine with ML Compilation
    🔗 llm.mlc.ai

  55. dao-ailab/flash-attention ⭐ 21,806
    Fast and memory-efficient exact attention

  56. modelcontextprotocol/python-sdk ⭐ 21,299
    The Model Context Protocol allows applications to provide context for LLMs in a standardized way, separating the concerns of providing context from the actual LLM interaction.
    🔗 modelcontextprotocol.github.io/python-sdk

  57. openai/chatgpt-retrieval-plugin ⭐ 21,229
    The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.

  58. guidance-ai/guidance ⭐ 21,221
    A guidance language for controlling large language models.

  59. rasahq/rasa ⭐ 20,993
    💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
    🔗 rasa.com/docs/rasa

  60. huggingface/peft ⭐ 20,514
    🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
    🔗 huggingface.co/docs/peft

  61. qwenlm/Qwen ⭐ 20,214
    The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

  62. skyvern-ai/skyvern ⭐ 20,197
    Skyvern automates browser-based workflows using LLMs and computer vision. It provides a simple API endpoint to fully automate manual workflows, replacing brittle or unreliable automation solutions.
    🔗 www.skyvern.com

  63. stitionai/devika ⭐ 19,489
    Devika is an advanced AI software engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective.
    🔗 opcode.sh

  64. karpathy/llama2.c ⭐ 19,130
    Inference Llama 2 in one file of pure C

  65. tloen/alpaca-lora ⭐ 18,980
    Instruct-tune LLaMA on consumer hardware

  66. volcengine/verl ⭐ 18,644
    veRL is a flexible, efficient and production-ready RL training library for large language models (LLMs).
    🔗 verl.readthedocs.io/en/latest/index.html

  67. facebookresearch/llama-cookbook ⭐ 18,166
    Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using the Llama model family on various provider services
    🔗 www.llama.com

  68. openai/evals ⭐ 17,586
    Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

  69. idea-research/Grounded-Segment-Anything ⭐ 17,359
    Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
    🔗 arxiv.org/abs/2401.14159

  70. mlc-ai/web-llm ⭐ 17,171
    High-performance In-browser LLM Inference Engine
    🔗 webllm.mlc.ai

  71. transformeroptimus/SuperAGI ⭐ 17,105
    <⚡️> SuperAGI - A dev-first open source autonomous AI agent framework. Enabling developers to build, manage & run useful autonomous agents quickly and reliably.
    🔗 superagi.com

  72. kvcache-ai/ktransformers ⭐ 16,385
    A Flexible Framework for LLM Inference Optimizations - allows researchers to replace original torch modules with optimized variants
    🔗 kvcache-ai.github.io/ktransformers

  73. facebookresearch/codellama ⭐ 16,357
    Inference code for CodeLlama models

  74. mayooear/ai-pdf-chatbot-langchain ⭐ 16,314
    AI PDF chatbot agent built with LangChain & LangGraph
    🔗 www.youtube.com/watch?v=of6soldiewu

  75. thudm/ChatGLM2-6B ⭐ 15,680
    ChatGLM2-6B: An Open Bilingual Chat LLM

  76. nvidia/Megatron-LM ⭐ 14,998
    Ongoing research training transformer models at scale
    🔗 docs.nvidia.com/megatron-core/developer-guide/latest/get-started/quickstart.html

  77. qwenlm/Qwen3-Coder ⭐ 14,967
    Qwen3-Coder is the code version of Qwen3, the large language model series developed by Qwen team, Alibaba Cloud.

  78. fauxpilot/fauxpilot ⭐ 14,759
    FauxPilot - an open-source alternative to GitHub Copilot server

  79. llmware-ai/llmware ⭐ 14,460
    Unified framework for building enterprise RAG pipelines with small, specialized models
    🔗 llmware-ai.github.io/llmware

  80. blinkdl/RWKV-LM ⭐ 14,321
    RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RNN and transformer - great performance, linear time, constant space (no kv-cache), fast training, infinite ctx_len, and f...

  81. swivid/F5-TTS ⭐ 13,999
    Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
    🔗 arxiv.org/abs/2410.06885

  82. karpathy/llm-council ⭐ 13,742
    LLM Council works together to answer your hardest questions

  83. anthropics/claude-quickstarts ⭐ 13,639
    A collection of projects designed to help developers quickly get started with building applications using the Anthropic API. Each quickstart provides a foundation that you can easily build upon and customize for your specific needs.

  84. canner/WrenAI ⭐ 13,527
    Open-source GenBI AI Agent that empowers data-driven teams to chat with their data to generate Text-to-SQL, charts, spreadsheets, reports, and BI.
    🔗 getwren.ai/oss

  85. andrewyng/aisuite ⭐ 13,384
    Simple, unified interface to multiple Generative AI providers. aisuite makes it easy for developers to use multiple LLMs through a standardized interface.

  86. dottxt-ai/outlines ⭐ 13,289
    Structured Text Generation from LLMs
    🔗 dottxt-ai.github.io/outlines

  87. microsoft/LoRA ⭐ 13,196
    Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
    🔗 arxiv.org/abs/2106.09685

  88. lightning-ai/litgpt ⭐ 13,116
    20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
    🔗 lightning.ai

  90. qwenlm/Qwen-Agent ⭐ 13,021
    Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.
    🔗 pypi.org/project/qwen-agent

  91. paddlepaddle/PaddleNLP ⭐ 12,904
    Easy-to-use and powerful LLM and SLM library with awesome model zoo.
    🔗 paddlenlp.readthedocs.io

  92. shishirpatil/gorilla ⭐ 12,697
    Enables LLMs to use tools by invoking APIs. Given a query, Gorilla comes up with the semantically and syntactically correct API.
    🔗 gorilla.cs.berkeley.edu

  93. jiayi-pan/TinyZero ⭐ 12,624
    TinyZero is a reproduction of DeepSeek R1 Zero in countdown and multiplication tasks.

  94. explodinggradients/ragas ⭐ 12,352
    Supercharge Your LLM Application Evaluations 🚀
    🔗 docs.ragas.io

  95. modelscope/ms-swift ⭐ 12,316
    Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi4, ...) (AAAI 2025).
    🔗 swift.readthedocs.io/zh-cn/v3.12

  96. sapientinc/HRM ⭐ 12,274
    Hierarchical Reasoning Model (HRM), a novel recurrent architecture that attains significant computational depth while maintaining both training stability and efficiency
    🔗 sapient.inc

  97. google-research/vision_transformer ⭐ 12,247
    Vision Transformer and MLP-Mixer Architectures

  98. instructor-ai/instructor ⭐ 12,199
    Instructor is a Python library that makes it a breeze to work with structured outputs from large language models (LLMs). Built on top of Pydantic, it provides a simple, transparent, and user-friendly API to manage validation, retries, and streaming responses.
    🔗 python.useinstructor.com

  99. openlmlab/MOSS ⭐ 12,076
    An open-source tool-augmented conversational language model from Fudan University
    🔗 txsun1997.github.io/blogs/moss.html

  100. h2oai/h2ogpt ⭐ 12,004
    Private chat with local GPT with documents, images, video, etc. 100% private, Apache 2.0. Supports Ollama, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
    🔗 h2o.ai

  101. gibsonai/Memori ⭐ 11,861
    Memori enables any LLM to remember conversations, learn from interactions, and maintain context across sessions
    🔗 memorilabs.ai

  102. chainlit/chainlit ⭐ 11,430
    Build Conversational AI in minutes ⚡️
    🔗 docs.chainlit.io

  103. topoteretes/cognee ⭐ 11,265
    Memory for AI Agents in 6 lines of code
    🔗 www.cognee.ai

  104. eleutherai/lm-evaluation-harness ⭐ 11,265
    A framework for few-shot evaluation of language models.
    🔗 www.eleuther.ai

  105. axolotl-ai-cloud/axolotl ⭐ 11,143
    Go ahead and axolotl questions
    🔗 docs.axolotl.ai

  106. geeeekexplorer/nano-vllm ⭐ 11,017
    A lightweight vLLM implementation built from scratch.

  107. microsoft/promptflow ⭐ 11,001
    Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.
    🔗 microsoft.github.io/promptflow

  108. artidoro/qlora ⭐ 10,821
    QLoRA: Efficient Finetuning of Quantized LLMs
    🔗 arxiv.org/abs/2305.14314

  109. databrickslabs/dolly ⭐ 10,802
    Databricks' Dolly, a large language model trained on the Databricks Machine Learning Platform
    🔗 www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html

  110. mistralai/mistral-inference ⭐ 10,632
    Official inference library for Mistral models
    🔗 mistral.ai

  111. e2b-dev/E2B ⭐ 10,604
    E2B is an open-source infrastructure that allows you to run AI-generated code in secure isolated sandboxes in the cloud
    🔗 e2b.dev/docs

  112. karpathy/minbpe ⭐ 10,284
    Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

  113. promptfoo/promptfoo ⭐ 10,096
    Test your prompts, agents, and RAGs. AI Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
    🔗 promptfoo.dev

  114. pipecat-ai/pipecat ⭐ 9,974
    Open Source framework for voice and multimodal conversational AI
    🔗 pipecat.ai

  115. abetlen/llama-cpp-python ⭐ 9,922
    Simple Python bindings for @ggerganov's llama.cpp library.
    🔗 llama-cpp-python.readthedocs.io

  116. mshumer/gpt-prompt-engineer ⭐ 9,631
    Simply input a description of your task and some test cases, and the system will generate, test, and rank a multitude of prompts to find the ones that perform the best.

  117. blinkdl/ChatRWKV ⭐ 9,511
    ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.

  118. skypilot-org/skypilot ⭐ 9,367
    Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 20+ clouds, or on-prem).
    🔗 docs.skypilot.co

  119. vikhyat/moondream ⭐ 9,264
    A tiny open-source computer-vision language model designed to run efficiently on edge devices
    🔗 moondream.ai

  120. jzhang38/TinyLlama ⭐ 8,878
    The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

  121. vaibhavs10/insanely-fast-whisper ⭐ 8,787
    An opinionated CLI to transcribe Audio files w/ Whisper on-device! Powered by 🤗 Transformers, Optimum & flash-attn

  122. thudm/CodeGeeX ⭐ 8,743
    CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
    🔗 codegeex.cn

  123. bytedance/Dolphin ⭐ 8,727
    A novel multimodal document image parsing model following an analyze-then-parse paradigm

  124. lyogavin/airllm ⭐ 8,713
    AirLLM optimizes inference memory usage, allowing 70B large language models to run inference on a single 4GB GPU card without quantization, distillation or pruning. You can also now run 405B Llama3.1 on 8GB of VRAM.

  125. apple/ml-ferret ⭐ 8,676
    Ferret: Refer and Ground Anything Anywhere at Any Granularity

  126. sjtu-ipads/PowerInfer ⭐ 8,591
    High-speed Large Language Model Serving for Local Deployment

  127. optimalscale/LMFlow ⭐ 8,502
    An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
    🔗 optimalscale.github.io/lmflow

  128. eleutherai/gpt-neo ⭐ 8,288
    An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
    🔗 www.eleuther.ai

  129. lianjiatech/BELLE ⭐ 8,284
    BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)

  130. future-house/paper-qa ⭐ 8,025
    High-accuracy retrieval augmented generation (RAG) on PDFs or text files, with a focus on the scientific literature
    🔗 futurehouse.gitbook.io/futurehouse-cookbook

  131. plachtaa/VALL-E-X ⭐ 7,959
    An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

  132. zilliztech/GPTCache ⭐ 7,913
    Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
    🔗 gptcache.readthedocs.io

  133. 01-ai/Yi ⭐ 7,846
    The Yi series models are the next generation of open-source large language models trained from scratch by 01.AI.
    🔗 01.ai

  134. thudm/GLM-130B ⭐ 7,677
    GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)

  135. sweepai/sweep ⭐ 7,628
    Sweep: AI coding assistant for JetBrains
    🔗 sweep.dev

  136. bigcode-project/starcoder ⭐ 7,534
    Home of StarCoder: fine-tuning & inference!

  137. vectifyai/PageIndex ⭐ 7,532
    A document indexing system that builds search tree structures from long documents, making them ready for reasoning-based RAG
    🔗 pageindex.ai

  138. openlm-research/open_llama ⭐ 7,529
    OpenLLaMA: An Open Reproduction of LLaMA

  139. weaviate/Verba ⭐ 7,525
    Retrieval Augmented Generation (RAG) chatbot powered by Weaviate

  140. boundaryml/baml ⭐ 7,442
    The AI framework that adds the engineering to prompt engineering (Python/TS/Ruby/Java/C#/Rust/Go compatible)
    🔗 docs.boundaryml.com

  141. eleutherai/gpt-neox ⭐ 7,370
    An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
    🔗 www.eleuther.ai

  142. mit-han-lab/streaming-llm ⭐ 7,172
    [ICLR 2024] Efficient Streaming Language Models with Attention Sinks
    🔗 arxiv.org/abs/2309.17453

  143. bhaskatripathi/pdfGPT ⭐ 7,168
    PDF GPT allows you to chat with the contents of your PDF file by using GPT capabilities. The most effective open source solution to turn your PDF files into a chatbot!
    🔗 huggingface.co/spaces/bhaskartripathi/pdfchatter

  144. apple/ml-fastvlm ⭐ 7,167
    FastVLM: Efficient Vision Encoding for Vision Language Models

  145. internlm/InternLM ⭐ 7,142
    Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).
    🔗 internlm.readthedocs.io

  146. apple/corenet ⭐ 7,021
    CoreNet is a deep neural network toolkit that allows researchers and engineers to train standard and novel small and large-scale models for a variety of tasks, including foundation models (e.g., CLIP and LLM), object classification, object detection, and semantic segmentation.

  147. nirdiamant/Prompt_Engineering ⭐ 7,013
    A comprehensive collection of tutorials and implementations for Prompt Engineering techniques, ranging from fundamental concepts to advanced strategies.

  148. k-dense-ai/claude-scientific-skills ⭐ 6,957
    A set of ready to use scientific skills for Claude
    🔗 k-dense.ai

  149. anthropics/knowledge-work-plugins ⭐ 6,910
    Knowledge Work Plugins that turn Claude into a specialist for your role, team, and company

  150. lmcache/LMCache ⭐ 6,765
    LMCache is an LLM serving engine extension to reduce TTFT and increase throughput, especially under long-context scenarios
    🔗 lmcache.ai

  151. langchain-ai/opengpts ⭐ 6,757
    An open source effort to create a similar experience to OpenAI's GPTs and Assistants API.

  152. arcee-ai/mergekit ⭐ 6,705
    Tools for merging pretrained large language models.

  153. minedojo/Voyager ⭐ 6,616
    An Open-Ended Embodied Agent with Large Language Models
    🔗 voyager.minedojo.org

  154. open-compass/opencompass ⭐ 6,595
    OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
    🔗 opencompass.org.cn

  155. run-llama/rags ⭐ 6,532
    RAGs is a Streamlit app that lets you create a RAG pipeline from a data source using natural language.

  156. qwenlm/Qwen-VL ⭐ 6,499
    The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

  157. yuliang-liu/MonkeyOCR ⭐ 6,447
    A lightweight LMM-based Document Parsing Model with a Structure-Recognition-Relation Triplet Paradigm

  158. nat/openplayground ⭐ 6,370
    An LLM playground you can run on your laptop

  159. guardrails-ai/guardrails ⭐ 6,300
    Open-source Python package for specifying structure and type, validating and correcting the outputs of large language models (LLMs)
    🔗 www.guardrailsai.com/docs

  160. allenai/OLMo ⭐ 6,294
    OLMo is a repository for training and using AI2's state-of-the-art open language models. It is designed by scientists, for scientists.
    🔗 allenai.org/olmo

  161. langchain-ai/chat-langchain ⭐ 6,228
    Locally hosted chatbot specifically focused on question answering over the LangChain documentation
    🔗 chat.langchain.com

  162. pytorch-labs/gpt-fast ⭐ 6,181
    Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

  163. lightning-ai/lit-llama ⭐ 6,093
    Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

  164. linkedin/Liger-Kernel ⭐ 6,065
    Efficient Triton Kernels for LLM Training
    🔗 linkedin.github.io/liger-kernel

  165. microsoft/LLMLingua ⭐ 5,784
    [EMNLP'23, ACL'24] To speed up LLM inference and enhance LLMs' perception of key information, LLMLingua compresses the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
    🔗 llmlingua.com

  166. microsoft/promptbase ⭐ 5,726
    promptbase is an evolving collection of resources, best practices, and example scripts for eliciting the best performance from foundation models.

  167. meta-pytorch/torchtune ⭐ 5,650
    A PyTorch library for easily authoring, post-training, and experimenting with recipes for SFT, knowledge distillation, DPO, PPO, GRPO, and quantization-aware training
    🔗 pytorch.org/torchtune/main

  168. nvidia/Guardrails ⭐ 5,538
    NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.
    🔗 docs.nvidia.com/nemo/guardrails/latest/index.html

  169. openbmb/ToolBench ⭐ 5,492
    [ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.
    🔗 openbmb.github.io/toolbench

  170. dsdanielpark/Bard-API ⭐ 5,226
    The unofficial Python package that returns Google Bard responses using a cookie value.
    🔗 pypi.org/project/bardapi

  171. katanaml/sparrow ⭐ 5,097
    Sparrow is a solution for efficient data extraction and processing from various documents and images like invoices and receipts
    🔗 sparrow.katanaml.io

  172. agiresearch/AIOS ⭐ 4,967
    AIOS, a Large Language Model (LLM) Agent operating system, embeds a large language model into the operating system (OS) as the brain of the OS, enabling an operating system "with soul" -- an important step towards AGI.
    🔗 docs.aios.foundation

  173. togethercomputer/RedPajama-Data ⭐ 4,922
    The RedPajama-Data repository contains code for preparing large datasets for training large language models.

  174. 1rgs/jsonformer ⭐ 4,891
    A Bulletproof Way to Generate Structured JSON from Language Models

  175. h2oai/h2o-llmstudio ⭐ 4,781
    H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://docs.h2o.ai/h2o-llmstudio/
    🔗 h2o.ai

  176. flashinfer-ai/flashinfer ⭐ 4,744
    FlashInfer is a library and kernel generator for Large Language Models that provides high-performance implementation of LLM GPU kernels such as FlashAttention, SparseAttention, PageAttention, Sampling
    🔗 flashinfer.ai

  177. kiln-ai/Kiln ⭐ 4,593
    Build, Evaluate, and Optimize AI Systems. Includes evals, RAG, agents, fine-tuning, synthetic data generation, dataset management, MCP, and more.
    🔗 kiln.tech

  178. vllm-project/aibrix ⭐ 4,584
    AIBrix delivers a cloud-native solution optimized for deploying, managing, and scaling large language model (LLM) inference, tailored specifically to enterprise needs.

  179. kyegomez/tree-of-thoughts ⭐ 4,564
    Plug-and-Play Implementation of Tree of Thoughts: Deliberate Problem Solving with Large Language Models that Elevates Model Reasoning by at least 70%
    🔗 discord.gg/qutxnk2nmf

  180. yizhongw/self-instruct ⭐ 4,563
    Aligning pretrained language models with instruction data generated by themselves.

  181. lm-sys/RouteLLM ⭐ 4,553
    A framework for serving and evaluating LLM routers - save LLM costs without compromising quality

  182. marker-inc-korea/AutoRAG ⭐ 4,541
    AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation
    🔗 marker-inc-korea.github.io/autorag

  183. microsoft/BioGPT ⭐ 4,486
    Implementation of BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining

  184. hiyouga/EasyR1 ⭐ 4,475
    EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
    🔗 verl.readthedocs.io

  185. llm-attacks/llm-attacks ⭐ 4,460
    This is the official repository for "Universal and Transferable Adversarial Attacks on Aligned Language Models"
    🔗 llm-attacks.org

  186. turboderp/exllamav2 ⭐ 4,425
    A fast inference library for running LLMs locally on modern consumer-class GPUs

  187. huggingface/text-embeddings-inference ⭐ 4,417
    A blazing fast inference solution for text embeddings models
    🔗 huggingface.co/docs/text-embeddings-inference/quick_tour

  188. ragapp/ragapp ⭐ 4,393
    The easiest way to use Agentic RAG in any enterprise

  189. instruction-tuning-with-gpt-4/GPT-4-LLM ⭐ 4,342
    Instruction Tuning with GPT-4
    🔗 instruction-tuning-with-gpt-4.github.io

  190. openai/simple-evals ⭐ 4,314
    Lightweight library for evaluating language models

  191. truefoundry/cognita ⭐ 4,314
    RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry
    🔗 cognita.truefoundry.com

  192. neo4j-labs/llm-graph-builder ⭐ 4,294
    Transform unstructured data into a structured Knowledge Graph stored in Neo4j with LLMs
    ๐Ÿ”— llm-graph-builder.neo4jlabs.com

  193. p-e-w/heretic โญ 4,286
    Heretic is a tool that removes censorship from transformer-based LLMs without post-training

  194. microsoft/LMOps โญ 4,262
    General technology for enabling AI capabilities with LLMs and MLLMs
    ๐Ÿ”— aka.ms/generalai

  195. pytorch/executorch โญ 4,182
    An end-to-end solution for enabling on-device inference capabilities across mobile and edge devices including wearables, embedded devices and microcontrollers. It is part of the PyTorch Edge ecosystem and enables efficient deployment of PyTorch models to edge devices.
    ๐Ÿ”— executorch.ai

  196. mshumer/gpt-llm-trainer โญ 4,168
    Input a description of your task, and the system will generate a dataset, parse it, and fine-tune a LLaMA 2 model for you

  197. openai/harmony โญ 4,149
    Renderer for the harmony response format to be used with gpt-oss

  198. eth-sri/lmql โญ 4,136
    A language for constraint-guided and efficient LLM programming.
    ๐Ÿ”— lmql.ai

  199. deep-agent/R1-V โญ 4,025
    We are building a general framework for Reinforcement Learning with Verifiable Rewards (RLVR) in VLM. RLVR outperforms chain-of-thought supervised fine-tuning (CoT-SFT) in both effectiveness and out-of-distribution (OOD) robustness for vision language models.

  200. sylphai-inc/AdalFlow โญ 3,998
    Unified auto-differentiative framework for both zero-shot prompt optimization and few-shot optimization. It advances existing auto-optimization research, including TextGrad and DSPy
    ๐Ÿ”— adalflow.sylph.ai

  201. defog-ai/sqlcoder โญ 3,990
    SoTA LLM for converting natural language questions to SQL queries

  202. meta-llama/PurpleLlama โญ 3,988
    Set of tools to assess and improve LLM security. An umbrella project to bring together tools and evals to help the community build responsibly with open genai models.

  203. bclavie/RAGatouille โญ 3,828
    Bridging the gap between state-of-the-art research and alchemical RAG pipeline practices.

  204. ravenscroftj/turbopilot โญ 3,808
    Turbopilot is an open source large-language-model based code completion engine that runs locally on CPU

  205. lightning-ai/LitServe โญ 3,790
    A minimal Python framework for building custom AI inference servers with full control over logic, batching, and scaling.
    ๐Ÿ”— lightning.ai/litserve?utm_source=litserve_readme&utm_medium=referral&utm_campaign=litserve_readme

  206. agenta-ai/agenta โญ 3,787
    The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.
    ๐Ÿ”— www.agenta.ai

  207. microsoft/PromptWizard โญ 3,744
    PromptWizard is a discrete prompt optimization framework that employs a self-evolving mechanism where the LLM generates, critiques, and refines its own prompts and examples

  208. mmabrouk/llm-workflow-engine โญ 3,718
    Power CLI and Workflow manager for LLMs (core package)

  209. predibase/lorax โญ 3,682
    Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
    ๐Ÿ”— loraexchange.ai

  210. next-gpt/NExT-GPT โญ 3,614
    Code and models for ICML 2024 paper, NExT-GPT: Any-to-Any Multimodal Large Language Model
    ๐Ÿ”— next-gpt.github.io

  211. evolvinglmms-lab/lmms-eval โญ 3,588
    One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
    ๐Ÿ”— www.lmms-lab.com

  212. huggingface/smollm โญ 3,575
    Everything about the SmolLM and SmolVLM family of models
    ๐Ÿ”— huggingface.co/huggingfacetb

  213. verazuo/jailbreak_llms โญ 3,534
    Official repo for the ACM CCS 2024 paper "Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts
    ๐Ÿ”— jailbreak-llms.xinyueshen.me

  214. minimaxir/simpleaichat โญ 3,520
    Python package for easily interfacing with chat apps, with robust features and minimal code complexity.

  215. iryna-kondr/scikit-llm โญ 3,491
    Seamlessly integrate LLMs into scikit-learn.
    ๐Ÿ”— beastbyte.ai

  216. jaymody/picoGPT โญ 3,438
    An unnecessarily tiny implementation of GPT-2 in NumPy.

  217. mit-han-lab/llm-awq โญ 3,424
    AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

  218. minimaxir/gpt-2-simple โญ 3,406
    Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts

  219. novasky-ai/SkyThought โญ 3,369
    Sky-T1: Train your own O1 preview model within $450
    ๐Ÿ”— novasky-ai.github.io

  220. deep-diver/LLM-As-Chatbot โญ 3,332
    LLM as a Chatbot Service

  221. zou-group/textgrad โญ 3,319
    TextGrad is a framework for automatic differentiation via text: it implements backpropagation through text feedback provided by LLMs, building strongly on the gradient metaphor.
    ๐Ÿ”— textgrad.com

  222. luodian/Otter โญ 3,286
    🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
    ๐Ÿ”— otter-ntu.github.io

  223. ruc-nlpir/FlashRAG โญ 3,277
    FlashRAG is a Python toolkit for the reproduction and development of RAG research. Our toolkit includes 36 pre-processed benchmark RAG datasets and 15 state-of-the-art RAG algorithms.
    ๐Ÿ”— arxiv.org/abs/2405.13576

  224. deepseek-ai/Engram โญ 3,252
    Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models.

  225. googleapis/python-genai โญ 3,251
    Google Gen AI Python SDK provides an interface for developers to integrate Google's generative models into their Python applications.
    ๐Ÿ”— googleapis.github.io/python-genai

  226. cohere-ai/cohere-toolkit โญ 3,156
    Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications.

  227. microsoft/torchscale โญ 3,132
    Foundation Architecture for (M)LLMs
    ๐Ÿ”— aka.ms/generalai

  228. mistralai/mistral-finetune โญ 3,068
    A light-weight codebase that enables memory-efficient and performant finetuning of Mistral's models. It is based on LoRA.

  229. argilla-io/distilabel โญ 3,065
    Distilabel is the framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
    ๐Ÿ”— distilabel.argilla.io

  230. truera/trulens โญ 3,057
    Evaluation and Tracking for LLM Experiments and AI Agents
    ๐Ÿ”— www.trulens.org

  231. noahshinn/reflexion โญ 3,042
    [NeurIPS 2023] Reflexion: Language Agents with Verbal Reinforcement Learning

  232. hegelai/prompttools โญ 2,998
    Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chroma, Weaviate, LanceDB).
    ๐Ÿ”— prompttools.readthedocs.io

  233. li-plus/chatglm.cpp โญ 2,968
    C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)

  234. freedomintelligence/LLMZoo โญ 2,951
    ⚡ LLM Zoo is a project that provides data, models, and evaluation benchmarks for large language models. ⚡

  235. baichuan-inc/Baichuan-13B โญ 2,950
    A 13B large language model developed by Baichuan Intelligent Technology
    ๐Ÿ”— huggingface.co/baichuan-inc/baichuan-13b-chat

  236. eladlev/AutoPrompt โญ 2,915
    A prompt optimization framework designed to enhance and perfect your prompts for real-world use cases

  237. deepseek-ai/DualPipe โญ 2,910
    DualPipe is an innovative bidirectional pipeline parallelism algorithm introduced in the DeepSeek-V3 Technical Report.

  238. alpha-vllm/LLaMA2-Accessory โญ 2,801
    An Open-source Toolkit for LLM Development
    ๐Ÿ”— llama2-accessory.readthedocs.io

  239. juncongmoo/pyllama โญ 2,799
    LLaMA: Open and Efficient Foundation Language Models

  240. janhq/cortex.cpp โญ 2,761
    Cortex is a Local AI API Platform that is used to run and customize LLMs.
    ๐Ÿ”— cortex.so

  241. langwatch/langwatch โญ 2,742
    LangWatch is an open platform for Observing, Evaluating and Optimizing your LLM and Agentic applications.
    ๐Ÿ”— langwatch.ai

  242. paperswithcode/galai โญ 2,738
    Model API for GALACTICA

  243. roboflow/maestro โญ 2,656
    Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL
    ๐Ÿ”— maestro.roboflow.com

  244. vllm-project/llm-compressor โญ 2,615
    Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
    ๐Ÿ”— docs.vllm.ai/projects/llm-compressor

  245. spcl/graph-of-thoughts โญ 2,587
    Official Implementation of "Graph of Thoughts: Solving Elaborate Problems with Large Language Models"
    ๐Ÿ”— arxiv.org/pdf/2308.09687.pdf

  246. intel/neural-compressor โญ 2,573
    SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime
    ๐Ÿ”— intel.github.io/neural-compressor

  247. ofa-sys/OFA โญ 2,552
    Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

  248. young-geng/EasyLM โญ 2,507
    Large language models (LLMs) made easy: EasyLM is a one-stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Flax.

  249. huggingface/nanotron โญ 2,479
    Minimalistic large language model 3D-parallelism training

  250. illuin-tech/colpali โญ 2,468
    Code used for training the vision retrievers in the ColPali: Efficient Document Retrieval with Vision Language Models paper
    ๐Ÿ”— huggingface.co/vidore

  251. protectai/llm-guard โญ 2,442
    Sanitization, detection of harmful language, prevention of data leakage, and resistance against prompt injection attacks for LLMs
    ๐Ÿ”— protectai.github.io/llm-guard

  252. azure-samples/graphrag-accelerator โญ 2,410
    One-click deploy of a Knowledge Graph powered RAG (GraphRAG) in Azure
    ๐Ÿ”— github.com/microsoft/graphrag

  253. civitai/sd_civitai_extension โญ 2,379
    All of the Civitai models inside Automatic 1111 Stable Diffusion Web UI

  254. uptrain-ai/uptrain โญ 2,338
    An open-source unified platform to evaluate and improve Generative AI applications. Provide grades for 20+ preconfigured evaluations (covering language, code, embedding use cases)
    ๐Ÿ”— uptrain.ai

  255. facebookresearch/large_concept_model โญ 2,330
    Large Concept Models: Language modeling in a sentence representation space

  256. akariasai/self-rag โญ 2,307
    This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
    ๐Ÿ”— selfrag.github.io

  257. casper-hansen/AutoAWQ โญ 2,307
    AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
    ๐Ÿ”— casper-hansen.github.io/autoawq

  258. huggingface/lighteval โญ 2,280
    LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally with the recently released LLM data processing library datatrove and LLM training library nanotron.
    ๐Ÿ”— huggingface.co/docs/lighteval/en/index

  259. ist-daslab/gptq โญ 2,247
    Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
    ๐Ÿ”— arxiv.org/abs/2210.17323

  260. microsoft/Megatron-DeepSpeed โญ 2,220
    Ongoing research training transformer language models at scale, including: BERT & GPT-2

  261. gepa-ai/gepa โญ 2,154
    GEPA (Genetic-Pareto) is a framework for optimizing arbitrary systems composed of text components (like AI prompts, code snippets, or textual specs) against any evaluation metric

  262. epfllm/meditron โญ 2,136
    Meditron is a suite of open-source medical Large Language Models (LLMs).
    ๐Ÿ”— huggingface.co/epfl-llm

  263. tairov/llama2.mojo โญ 2,115
    Inference Llama 2 in one file of pure 🔥
    ๐Ÿ”— www.modular.com/blog/community-spotlight-how-i-built-llama2-by-aydyn-tairov

  264. ai-hypercomputer/maxtext โญ 2,106
    MaxText is a high-performance, highly scalable, open-source LLM written in pure Python/JAX and targeting Google Cloud TPUs and GPUs for training and inference.
    ๐Ÿ”— maxtext.readthedocs.io

  265. facebookresearch/chameleon โญ 2,081
    Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
    ๐Ÿ”— arxiv.org/abs/2405.09818

  266. openai/image-gpt โญ 2,079
    Archived. Code and models from the paper "Generative Pretraining from Pixels"

  267. lucidrains/toolformer-pytorch โญ 2,057
    Implementation of Toolformer, Language Models That Can Use Tools, by MetaAI

  268. google-gemini/genai-processors โญ 2,043
    GenAI Processors is a lightweight Python library that enables efficient, parallel content processing.

  269. huggingface/picotron โญ 2,008
    Minimalist & most-hackable repository for pre-training Llama-like models with 4D Parallelism (Data, Tensor, Pipeline, Context parallel)

  270. neulab/prompt2model โญ 2,006
    A system that takes a natural language task description and trains a small special-purpose model that is conducive to deployment.

  271. minishlab/model2vec โญ 1,988
    Model2Vec is a technique to turn any sentence transformer into a really small static model, reducing model size by 15x and making the models up to 500x faster, with a small drop in performance
    ๐Ÿ”— minish.ai/packages/model2vec

  272. noamgat/lm-format-enforcer โญ 1,979
    Enforce the output format (JSON Schema, Regex etc) of a language model

  273. agentops-ai/tokencost โญ 1,914
    Easy token price estimates for 400+ LLMs. TokenOps.
    ๐Ÿ”— agentops.ai

  274. aiming-lab/SimpleMem โญ 1,871
    SimpleMem addresses the fundamental challenge of efficient long-term memory for LLM agents through a three-stage pipeline grounded in Semantic Lossless Compression.

  275. qwenlm/Qwen-Audio โญ 1,865
    The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

  276. ray-project/llm-applications โญ 1,844
    A comprehensive guide to building RAG-based LLM applications for production.

  277. openai/gpt-discord-bot โญ 1,831
    Example Discord bot written in Python that uses the completions API to have conversations with the text-davinci-003 model, and the moderations API to filter the messages.

  278. jennyzzt/dgm โญ 1,800
    Self-improving system that iteratively modifies its own code and empirically validates each change

  279. 1rgs/nanocode โญ 1,669
    Minimal Claude Code alternative. Single Python file, zero dependencies, ~250 lines.

  280. alexzhang13/rlm โญ 1,668
    Recursive Language Models (RLMs) are a task-agnostic inference paradigm for language models (LMs) to handle near-infinite length contexts
    ๐Ÿ”— arxiv.org/abs/2512.24601v1

  281. meetkai/functionary โญ 1,592
    Chat language model that can use tools and interpret the results

  282. answerdotai/rerankers โญ 1,590
    Welcome to rerankers! Our goal is to provide users with a simple API to use any reranking models.

  283. jina-ai/thinkgpt โญ 1,583
    Agent techniques to augment your LLM and push it beyond its limits

  284. leochlon/pythea โญ 1,578
    Hallucination Risk Calculator & Prompt Re-engineering Toolkit (OpenAI-only)

  285. run-llama/semtools โญ 1,563
    Semantic search and document parsing tools for the command line

  286. chatarena/chatarena โญ 1,529
    ChatArena (or Chat Arena) provides multi-agent language game environments for LLMs. The goal is to develop the communication and collaboration capabilities of AIs.

  287. nirdiamant/Controllable-RAG-Agent โญ 1,529
    An advanced Retrieval-Augmented Generation (RAG) solution designed to tackle complex questions that simple semantic similarity-based retrieval cannot solve

  288. run-llama/llama-lab โญ 1,514
    Llama Lab is a repo dedicated to building cutting-edge projects using LlamaIndex

  289. mlc-ai/xgrammar โญ 1,502
    XGrammar is an open-source library for efficient, flexible, and portable structured generation. It supports general context-free grammar to enable a broad range of structures while bringing careful system optimizations to enable fast executions.
    ๐Ÿ”— xgrammar.mlc.ai/docs

  290. cstankonrad/long_llama โญ 1,463
    LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.

  291. farizrahman4u/loopgpt โญ 1,459
    Re-implementation of Auto-GPT as a python package, written with modularity and extensibility in mind.

  292. sumandora/remove-refusals-with-transformers โญ 1,445
    A proof-of-concept implementation to remove refusals from an LLM model without using TransformerLens

  293. mlfoundations/dclm โญ 1,409
    DataComp for Language Models

  294. facebookresearch/MobileLLM โญ 1,405
    Training code of MobileLLM introduced in our work: "MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases"

  295. protectai/rebuff โญ 1,399
    Rebuff is designed to protect AI applications from prompt injection (PI) attacks through a multi-layered defense
    ๐Ÿ”— playground.rebuff.ai

  296. explosion/spacy-llm โญ 1,361
    🦙 Integrating LLMs into structured NLP pipelines
    ๐Ÿ”— spacy.io/usage/large-language-models

  297. deepseek-ai/EPLB โญ 1,336
    Expert Parallelism Load Balancer across GPUs

  298. keirp/automatic_prompt_engineer โญ 1,336
    This repo contains code for the paper "Large Language Models Are Human-Level Prompt Engineers"

  299. hao-ai-lab/LookaheadDecoding โญ 1,315
    Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
    ๐Ÿ”— arxiv.org/abs/2402.02057

  300. centerforaisafety/hle โญ 1,312
    Humanity's Last Exam (HLE) is a multi-modal benchmark at the frontier of human knowledge, designed to be the final closed-ended academic benchmark of its kind with broad subject coverage
    ๐Ÿ”— lastexam.ai

  301. ray-project/ray-llm โญ 1,264
    RayLLM - LLMs on Ray (Archived). Read README for more info.
    ๐Ÿ”— docs.ray.io/en/latest

  302. srush/MiniChain โญ 1,235
    A tiny library for coding with large language models.
    ๐Ÿ”— srush-minichain.hf.space

  303. nousresearch/Hermes-Function-Calling โญ 1,188
    Code for the Hermes Pro Large Language Model to perform function calling based on the provided schema. It allows users to query the model and retrieve information related to stock prices, company fundamentals, and financial statements

  304. cagostino/npcpy โญ 1,177
    This repo leverages the power of LLMs to understand your natural language commands and questions, executing tasks, answering queries, and providing relevant information from local files and the web.

  305. cyberark/FuzzyAI โญ 1,155
    A powerful tool for automated LLM fuzzing. It is designed to help developers and security researchers identify and mitigate potential jailbreaks in their LLM APIs.

  306. ibm/Dromedary โญ 1,144
    Dromedary: towards helpful, ethical and reliable LLMs.

  307. lupantech/chameleon-llm โญ 1,139
    Code for "Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models".
    ๐Ÿ”— chameleon-llm.github.io

  308. safety-research/bloom โญ 1,132
    Bloom generates evaluation suites that probe LLMs for specific behaviors (sycophancy, self-preservation, political bias, etc.)

  309. utkusen/promptmap โญ 1,098
    Vulnerability scanning tool that automatically tests prompt injection attacks on your LLM applications. It analyzes your LLM system prompts, runs them, and sends attack prompts to them.

  310. rlancemartin/auto-evaluator โญ 1,091
    Evaluation tool for LLM QA chains
    ๐Ÿ”— autoevaluator.langchain.com

  311. datadreamer-dev/DataDreamer โญ 1,088
    DataDreamer is a powerful open-source Python library for prompting, synthetic data generation, and training workflows. It is designed to be simple, extremely efficient, and research-grade.
    ๐Ÿ”— datadreamer.dev

  312. ctlllll/LLM-ToolMaker โญ 1,056
    Large Language Models as Tool Makers

  313. wandb/weave โญ 1,047
    Weave is a toolkit for developing AI-powered applications, built by Weights & Biases.
    ๐Ÿ”— wandb.me/weave

  314. prometheus-eval/prometheus-eval โญ 1,032
    Evaluate your LLM's response with Prometheus and GPT4 💯

  315. pinecone-io/canopy โญ 1,027
    Retrieval Augmented Generation (RAG) framework and context engine powered by Pinecone
    ๐Ÿ”— www.pinecone.io

  316. huggingface/optimum-nvidia โญ 1,027
    Optimum-NVIDIA delivers the best inference performance on the NVIDIA platform through Hugging Face. Run LLaMA 2 at 1,200 tokens/second (up to 28x faster than the framework)

  317. microsoft/Llama-2-Onnx โญ 1,026
    A Microsoft-optimized version of the Llama 2 model, available from Meta

  318. nomic-ai/pygpt4all โญ 1,016
    Officially supported Python bindings for llama.cpp + gpt4all
    ๐Ÿ”— nomic-ai.github.io/pygpt4all

  319. langchain-ai/langsmith-cookbook โญ 997
    LangSmith is a platform for building production-grade LLM applications.
    ๐Ÿ”— langsmith-cookbook.vercel.app

  320. ajndkr/lanarky โญ 995
    The web framework for building LLM microservices [deprecated]
    ๐Ÿ”— lanarky.ajndkr.com

  321. likejazz/llama3.np โญ 992
    llama3.np is a pure NumPy implementation for Llama 3 model.

  322. thinking-machines-lab/batch_invariant_ops โญ 951
    Defeating Nondeterminism in LLM Inference: fixing floating-point non-associativity

  323. soulter/hugging-chat-api โญ 936
    HuggingChat Python API 🤗

  324. opengvlab/OmniQuant โญ 886
    [ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

  325. salesforceairesearch/promptomatix โญ 878
    An Automatic Prompt Optimization Framework. Structured approach to prompt optimization, ensuring consistency, cost-effectiveness, and high-quality outputs

  326. bytedtsinghua-sia/MemAgent โญ 871
    A MemAgent framework that can be extrapolated to 3.5M-token contexts, along with a training framework for RL training of any agent workflow.

  327. safety-research/petri โญ 847
    Autonomously crafts environments, runs multi-turn audits against a target model using human-like messages and simulated tools, and then scores transcripts
    ๐Ÿ”— safety-research.github.io/petri

  328. facebookresearch/cwm โญ 804
    Code World Model (CWM) is a 32-billion-parameter open-weights LLM, to advance research on code generation with world models.

  329. junruxiong/IncarnaMind โญ 797
    Connect and chat with your multiple documents (pdf and txt) through GPT 3.5, GPT-4 Turbo, Claude and Local Open-Source LLMs
    ๐Ÿ”— www.incarnamind.com

  330. tag-research/TAG-Bench โญ 766
    Table-Augmented Generation (TAG) is a unified and general-purpose paradigm for answering natural language questions over databases
    ๐Ÿ”— arxiv.org/pdf/2408.14717

  331. meta-llama/prompt-ops โญ 752
    PDO (Prompt Duel Optimizer) - an efficient label-free prompt optimization method using dueling bandits and Thompson sampling

  332. developersdigest/llm-api-engine โญ 750
    Build and deploy AI-powered APIs in seconds. This project allows you to create custom APIs that extract structured data from websites using natural language descriptions, powered by LLMs and web scraping technology.
    ๐Ÿ”— www.youtube.com/watch?v=8kuek1bo4mm

  333. microsoft/sammo โญ 747
    A library for prompt engineering and optimization (SAMMO = Structure-aware Multi-Objective Metaprompt Optimization)

  334. metauto-ai/agent-as-a-judge โญ 717
    👩‍⚖️ Agent-as-a-Judge: The Magic for Open-Endedness
    ๐Ÿ”— arxiv.org/pdf/2410.10934

  335. microsoft/VPTQ โญ 674
    Extreme Low-bit Vector Post-Training Quantization for Large Language Models

  336. modal-labs/llm-finetuning โญ 647
    Guide for fine-tuning Llama/Mistral/CodeLlama models and more

  337. qixucen/atom โญ 637
    Atom of Thoughts (AoT) is a new reasoning framework that represents the solution as a composition of atomic questions. This approach transforms the reasoning process into a Markov process with atomic states

  338. judahpaul16/gpt-home โญ 633
    ChatGPT at home! A better alternative to commercial smart home assistants, built on the Raspberry Pi using LiteLLM and LangGraph.
    ๐Ÿ”— hub.docker.com/r/judahpaul/gpt-home

  339. huggingface/text-clustering โญ 592
    Easily embed, cluster and semantically label text datasets

  340. deepseek-ai/DeepSeek-Prover-V1.5 โญ 553
    DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search

  341. continuum-llms/chatgpt-memory โญ 535
    Allows scaling the ChatGPT API to multiple simultaneous sessions with infinite contextual and adaptive memory powered by GPT and Redis datastore.

  342. codelion/adaptive-classifier โญ 526
    A flexible, adaptive classification system that allows for dynamic addition of new classes and continuous learning from examples. Built on top of transformers from HuggingFace, this library provides an easy-to-use interface for creating and updating text classifiers.

  343. xaviviro/python-toon โญ 320
    A compact data format optimized for transmitting structured information to Large Language Models (LLMs) with 30-60% fewer tokens than JSON.

  344. quotient-ai/judges โญ 316
    judges is a small library to use and create LLM-as-a-Judge evaluators. The purpose of judges is to have a curated set of LLM evaluators in a low-friction format across a variety of use cases

  345. stanford-oval/suql โญ 296
    SUQL: Conversational Search over Structured and Unstructured Data with LLMs
    ๐Ÿ”— arxiv.org/abs/2311.09818

  346. emissary-tech/legit-rag โญ 274
    A modular Retrieval-Augmented Generation (RAG) system built with FastAPI, Qdrant, and OpenAI.

  347. dottxt-ai/outlines-core โญ 273
    Core functionality for structured generation, formerly implemented in Outlines, with a focus on performance and portability.
    ๐Ÿ”— docs.rs/outlines-core

  348. jina-ai/llm-query-expansion โญ 64
    Query Expansion for Better Query Embedding using LLMs

Math and Science

Mathematical, numerical and scientific libraries.

  1. numpy/numpy โญ 31,307
    The fundamental package for scientific computing with Python.
    ๐Ÿ”— numpy.org

  2. camdavidsonpilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers โญ 28,453
    aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
    ๐Ÿ”— camdavidsonpilon.github.io/probabilistic-programming-and-bayesian-methods-for-hackers

  3. taichi-dev/taichi โญ 27,916
    Productive, portable, and performant GPU programming in Python: Taichi Lang is an open-source, imperative, parallel programming language for high-performance numerical computation.
    ๐Ÿ”— taichi-lang.org

  4. experience-monks/math-as-code โญ 15,461
    This is a reference to ease developers into mathematical notation by showing comparisons with Python code

  5. scipy/scipy โญ 14,389
    SciPy library main repository
    ๐Ÿ”— scipy.org

  6. sympy/sympy โญ 14,338
    A computer algebra system written in pure Python
    ๐Ÿ”— sympy.org

  7. google/or-tools โญ 13,016
    Google Optimization Tools (a.k.a., OR-Tools) is an open-source, fast and portable software suite for solving combinatorial optimization problems.
    ๐Ÿ”— developers.google.com/optimization

  8. z3prover/z3 โญ 11,848
    Z3 is a theorem prover from Microsoft Research with a Python language binding.

  9. cupy/cupy โญ 10,736
    NumPy & SciPy for GPU
    ๐Ÿ”— cupy.dev

  10. cvxpy/cvxpy โญ 6,096
    A Python-embedded modeling language for convex optimization problems.
    ๐Ÿ”— www.cvxpy.org

  11. google-deepmind/alphageometry โญ 4,755
    Solving Olympiad Geometry without Human Demonstrations

  12. pim-book/programmers-introduction-to-mathematics โญ 3,635
    Code for A Programmer's Introduction to Mathematics
    ๐Ÿ”— pimbook.org

  13. talalalrawajfeh/mathematics-roadmap โญ 3,319
    A Comprehensive Roadmap to Mathematics

  14. pyro-ppl/numpyro โญ 2,598
    Probabilistic programming with NumPy powered by JAX for autograd and JIT compilation to GPU/TPU/CPU.
    ๐Ÿ”— num.pyro.ai

  15. mckinsey/causalnex โญ 2,426
    A Python library that helps data scientists to infer causation rather than observing correlation.
    ๐Ÿ”— causalnex.readthedocs.io

  16. facebookresearch/theseus โญ 1,992
    A library for differentiable nonlinear optimization

  17. pymc-labs/CausalPy โญ 1,093
    A Python package for causal inference in quasi-experimental settings
    ๐Ÿ”— causalpy.readthedocs.io

  18. extropic-ai/thrml โญ 987
    A JAX library for building and sampling probabilistic graphical models, with a focus on efficient block Gibbs sampling and energy-based models
    ๐Ÿ”— docs.thrml.ai

  19. brandondube/prysm โญ 322
    Prysm is an open-source library for physical and first-order modeling of optical systems and analysis of related data: numerical and physical optics, integrated modeling, phase retrieval, segmented systems, polynomials and fitting, sequential raytracing.
    ๐Ÿ”— prysm.readthedocs.io/en/stable

  20. lean-dojo/ReProver โญ 316
    Retrieval-Augmented Theorem Provers for Lean
    ๐Ÿ”— leandojo.org/leandojo.html

  21. albahnsen/pycircular โญ 105
    pycircular is a Python module for circular data analysis

  22. gbillotey/Fractalshades โญ 35
    Arbitrary-precision fractal explorer - Python package

Machine Learning - General

General and classical machine learning libraries. See below for other sections covering specialised ML areas.

  1. openai/openai-cookbook โญ 71,101
    Examples and guides for using the OpenAI API
    ๐Ÿ”— cookbook.openai.com

  2. scikit-learn/scikit-learn โญ 64,753
    scikit-learn: machine learning in Python
    ๐Ÿ”— scikit-learn.org

  3. suno-ai/bark โญ 38,929
    🔊 Text-Prompted Generative Audio Model

  4. facebookresearch/faiss โญ 38,864
    A library for efficient similarity search and clustering of dense vectors.
    ๐Ÿ”— faiss.ai

  5. tencentarc/GFPGAN โญ 37,353
    GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.

  6. google-research/google-research โญ 37,132
    This repository contains code released by Google Research
    ๐Ÿ”— research.google

  7. roboflow/supervision โญ 36,378
    We write your reusable computer vision tools. 💜
    ๐Ÿ”— supervision.roboflow.com

  8. google/jax โญ 34,681
    Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
    ๐Ÿ”— docs.jax.dev

  9. google/mediapipe โญ 33,447
    Cross-platform, customizable ML solutions for live and streaming media.
    ๐Ÿ”— ai.google.dev/edge/mediapipe

  10. open-mmlab/mmdetection โญ 32,319
    OpenMMLab Detection Toolbox and Benchmark
    ๐Ÿ”— mmdetection.readthedocs.io

  11. lutzroeder/netron โญ 32,250
    Visualizer for neural network, deep learning and machine learning models
    ๐Ÿ”— netron.app

  12. ageron/handson-ml2 โญ 29,799
    A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.

  13. dmlc/xgboost โญ 27,898
    Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
    ๐Ÿ”— xgboost.readthedocs.io

  14. facebookresearch/fastText โญ 26,471
    A library for efficient learning of word representations and sentence classification.
    ๐Ÿ”— fasttext.cc

  15. modular/modular โญ 25,497
    The Modular Accelerated Xecution (MAX) platform is an integrated suite of AI libraries, tools, and technologies that unifies commonly fragmented AI deployment workflows
    ๐Ÿ”— docs.modular.com

  16. harisiqbal88/PlotNeuralNet โญ 24,366
    LaTeX code for making neural network diagrams

  17. ml-explore/mlx โญ 23,585
    MLX is an array framework for machine learning on Apple silicon, brought to you by Apple machine learning research.
    ๐Ÿ”— ml-explore.github.io/mlx

  18. jina-ai/serve โญ 21,828
    โ˜๏ธ Build multimodal AI applications with cloud-native stack
    ๐Ÿ”— jina.ai/serve

  19. onnx/onnx โญ 20,211
    Open standard for machine learning interoperability
    ๐Ÿ”— onnx.ai

  20. huggingface/candle โญ 19,143
    Candle is a minimalist ML framework for Rust with a focus on performance (including GPU support) and ease of use.

  21. microsoft/onnxruntime โญ 19,069
    ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
    ๐Ÿ”— onnxruntime.ai

  22. microsoft/LightGBM โญ 18,027
    A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
    ๐Ÿ”— lightgbm.readthedocs.io/en/latest

  23. tensorflow/tensor2tensor โญ 16,935
    Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.

  24. google-gemini/cookbook โญ 16,259
    A collection of guides and examples for the Gemini API, including quickstart tutorials for writing prompts.
    ๐Ÿ”— ai.google.dev/gemini-api/docs

  25. ddbourgin/numpy-ml โญ 16,241
    Machine learning, in numpy
    ๐Ÿ”— numpy-ml.readthedocs.io

  26. neonbjb/tortoise-tts โญ 14,785
    A multi-voice TTS system trained with an emphasis on quality

  27. aleju/imgaug โญ 14,726
    Image augmentation for machine learning experiments.
    ๐Ÿ”— imgaug.readthedocs.io

  28. deepmind/deepmind-research โญ 14,645
    This repository contains implementations and illustrative code to accompany DeepMind publications

  29. microsoft/nni โญ 14,338
    An open source AutoML toolkit to automate the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
    ๐Ÿ”— nni.readthedocs.io

  30. jindongwang/transferlearning โญ 14,255
    Transfer learning / domain adaptation / domain generalization / multi-task learning etc. Papers, codes, datasets, applications, tutorials.
    ๐Ÿ”— transferlearning.xyz

  31. deepmind/alphafold โญ 14,224
    Implementation of the inference pipeline of AlphaFold v2

  32. spotify/annoy โญ 14,135
    Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

  33. ggerganov/ggml โญ 13,866
    Tensor library for machine learning

  34. optuna/optuna โญ 13,421
    A hyperparameter optimization framework
    ๐Ÿ”— optuna.org

  35. facebookresearch/AnimatedDrawings โญ 12,761
    Code to accompany "A Method for Animating Children's Drawings of the Human Figure"

  36. thudm/CogVideo โญ 12,361
    Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

  37. cleanlab/cleanlab โญ 11,281
    Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
    ๐Ÿ”— cleanlab.ai

  38. statsmodels/statsmodels โญ 11,210
    Statsmodels: statistical modeling and econometrics in Python
    ๐Ÿ”— www.statsmodels.org/devel

  39. wandb/wandb โญ 10,768
    The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.
    ๐Ÿ”— wandb.ai

  40. twitter/the-algorithm-ml โญ 10,510
    Source code for Twitter's Recommendation Algorithm
    ๐Ÿ”— blog.twitter.com/engineering/en_us/topics/open-source/2023/twitter-recommendation-algorithm

  41. facebookresearch/xformers โญ 10,292
    Hackable and optimized Transformers building blocks, supporting a composable construction.
    ๐Ÿ”— facebookresearch.github.io/xformers

  42. megvii-basedetection/YOLOX โญ 10,291
    YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/

  43. epistasislab/tpot โญ 10,041
    A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
    ๐Ÿ”— epistasislab.github.io/tpot

  44. awslabs/autogluon โญ 9,828
    Fast and Accurate ML in 3 Lines of Code
    ๐Ÿ”— auto.gluon.ai

  45. pycaret/pycaret โญ 9,677
    An open-source, low-code machine learning library in Python
    ๐Ÿ”— www.pycaret.org

  46. open-mmlab/mmsegmentation โญ 9,586
    OpenMMLab Semantic Segmentation Toolbox and Benchmark.
    ๐Ÿ”— mmsegmentation.readthedocs.io/en/main

  47. huggingface/accelerate โญ 9,461
    🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support
    ๐Ÿ”— huggingface.co/docs/accelerate

  48. pymc-devs/pymc โญ 9,459
    Bayesian Modeling and Probabilistic Programming in Python
    ๐Ÿ”— www.pymc.io

  49. uberi/speech_recognition โญ 8,927
    Speech recognition module for Python, supporting several engines and APIs, online and offline.
    ๐Ÿ”— pypi.python.org/pypi/speechrecognition

  50. catboost/catboost โญ 8,768
    A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
    ๐Ÿ”— catboost.ai

  51. ml-explore/mlx-examples โญ 8,168
    Examples in the MLX framework

  52. lmcinnes/umap โญ 8,072
    Uniform Manifold Approximation and Projection
    ๐Ÿ”— umap-learn.readthedocs.io

  53. automl/auto-sklearn โญ 8,042
    Automated Machine Learning with scikit-learn
    ๐Ÿ”— automl.github.io/auto-sklearn

  54. py-why/dowhy โญ 7,924
    DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.
    ๐Ÿ”— www.pywhy.org/dowhy

  55. project-monai/MONAI โญ 7,773
    AI Toolkit for Healthcare Imaging
    ๐Ÿ”— project-monai.github.io

  56. hyperopt/hyperopt โญ 7,608
    Distributed Asynchronous Hyperparameter Optimization in Python
    ๐Ÿ”— hyperopt.github.io/hyperopt

  57. featurelabs/featuretools โญ 7,598
    An open source Python library for automated feature engineering
    ๐Ÿ”— www.featuretools.com

  58. hips/autograd โญ 7,446
    Efficiently computes derivatives of NumPy code.

  59. open-mmlab/mmagic โญ 7,369
    OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image generation, image/video restoration/enhancement, etc.
    ๐Ÿ”— mmagic.readthedocs.io/en/latest

  60. scikit-learn-contrib/imbalanced-learn โญ 7,078
    A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
    ๐Ÿ”— imbalanced-learn.org

  61. yangchris11/samurai โญ 7,037
    Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
    ๐Ÿ”— yangchris11.github.io/samurai

  62. probml/pyprobml โญ 6,995
    Python code for the "Probabilistic Machine Learning" book by Kevin Murphy

  63. nicolashug/Surprise โญ 6,757
    A Python scikit for building and analyzing recommender systems
    ๐Ÿ”— surpriselib.com

  64. google-deepmind/graphcast โญ 6,488
    GraphCast: Learning skillful medium-range global weather forecasting

  65. google/automl โญ 6,452
    Google Brain AutoML

  66. cleverhans-lab/cleverhans โญ 6,401
    An adversarial example library for constructing attacks, building defenses, and benchmarking both

  67. open-mmlab/mmcv โญ 6,388
    OpenMMLab Computer Vision Foundation
    ๐Ÿ”— mmcv.readthedocs.io/en/latest

  68. kevinmusgrave/pytorch-metric-learning โญ 6,298
    The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.
    ๐Ÿ”— kevinmusgrave.github.io/pytorch-metric-learning

  69. uber/causalml โญ 5,704
    Uplift modeling and causal inference with machine learning algorithms

  70. online-ml/river โญ 5,683
    🌊 Online machine learning in Python
    ๐Ÿ”— riverml.xyz

  71. priorlabs/TabPFN โญ 5,562
    TabPFN is a neural network that learned to do tabular data prediction. This is the original CUDA-supporting PyTorch implementation.
    ๐Ÿ”— priorlabs.ai

  72. google-deepmind/graph_nets โญ 5,394
    Graph Nets is DeepMind's library for building graph networks in TensorFlow and Sonnet.
    ๐Ÿ”— arxiv.org/abs/1806.01261

  73. skvark/opencv-python โญ 5,165
    Automated CI toolchain to produce precompiled opencv-python, opencv-python-headless, opencv-contrib-python and opencv-contrib-python-headless packages.
    ๐Ÿ”— pypi.org/project/opencv-python

  74. mdbloice/Augmentor โญ 5,148
    Image augmentation library in Python for machine learning.
    ๐Ÿ”— augmentor.readthedocs.io/en/stable

  75. apple/coremltools โญ 5,122
    Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.
    ๐Ÿ”— coremltools.readme.io

  76. rasbt/mlxtend โญ 5,092
    A library of extension and helper modules for Python's data analysis and machine learning libraries.
    ๐Ÿ”— rasbt.github.io/mlxtend

  77. nmslib/hnswlib โญ 5,067
    Header-only C++/Python library for fast approximate nearest neighbors
    ๐Ÿ”— github.com/nmslib/hnswlib

  78. marqo-ai/marqo โญ 5,009
    Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
    ๐Ÿ”— www.marqo.ai

  79. sanchit-gandhi/whisper-jax โญ 4,665
    JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.

  80. huggingface/autotrain-advanced โญ 4,551
    AutoTrain Advanced: faster and easier training and deployments of state-of-the-art machine learning models
    ๐Ÿ”— huggingface.co/autotrain

  81. py-why/EconML โญ 4,478
    ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to brin...
    ๐Ÿ”— www.microsoft.com/en-us/research/project/alice

  82. huggingface/notebooks โญ 4,438
    Notebooks using the Hugging Face libraries 🤗

  83. nv-tlabs/GET3D โญ 4,425
    Generative Model of High Quality 3D Textured Shapes Learned from Images

  84. districtdatalabs/yellowbrick โญ 4,395
    Visual analysis and diagnostic tools to facilitate machine learning model selection.
    ๐Ÿ”— www.scikit-yb.org

  85. lucidrains/deep-daze โญ 4,332
    Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network). Technique was originally created by https://twitter.com/advadnoun

  86. zjunlp/DeepKE โญ 4,304
    [EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction
    ๐Ÿ”— deepke.zjukg.cn

  87. microsoft/FLAML โญ 4,283
    A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.
    ๐Ÿ”— microsoft.github.io/flaml

  88. huggingface/speech-to-speech โญ 4,274
    Speech To Speech: an effort toward an open-source, modular GPT-4o

  89. cmusphinx/pocketsphinx โญ 4,261
    A small speech recognizer

  90. rucaibox/RecBole โญ 4,235
    A unified, comprehensive and efficient recommendation library
    ๐Ÿ”— recbole.io

  91. ourownstory/neural_prophet โญ 4,233
    NeuralProphet: A simple forecasting package
    ๐Ÿ”— neuralprophet.com

  92. facebookresearch/flow_matching โญ 4,032
    Flow Matching (FM) is a recent framework for generative modeling that has achieved state-of-the-art performance across various domains, including image, video, audio, speech, and biological structures
    ๐Ÿ”— facebookresearch.github.io/flow_matching

  93. cornellius-gp/gpytorch โญ 3,819
    GPyTorch is a Gaussian process library implemented using PyTorch. GPyTorch is designed for creating scalable, flexible, and modular Gaussian process models with ease.

  94. lightly-ai/lightly โญ 3,666
    A Python library for self-supervised learning on images.
    ๐Ÿ”— docs.lightly.ai/self-supervised-learning

  95. huggingface/safetensors โญ 3,597
    Implements a new simple format for storing tensors safely (as opposed to pickle) and that is still fast (zero-copy).
    ๐Ÿ”— huggingface.co/docs/safetensors

  96. yoheinakajima/instagraph โญ 3,540
    Converts text input or a URL into a knowledge graph and displays it

  97. petarv-/GAT โญ 3,492
    Implementation of a Graph Attention Network (GAT) layer in TensorFlow
    ๐Ÿ”— petar-v.com/gat

  98. neuraloperator/neuraloperator โญ 3,337
    Comprehensive library for learning neural operators in PyTorch. It is the official implementation for Fourier Neural Operators and Tensorized Neural Operators.
    ๐Ÿ”— neuraloperator.github.io/dev/index.html

  99. pytorch/glow โญ 3,328
    Compiler for Neural Network hardware accelerators

  100. hrnet/HRNet-Semantic-Segmentation โญ 3,311
    The OCR approach is rephrased as Segmentation Transformer: https://arxiv.org/abs/1909.11065. This is an official implementation of semantic segmentation for HRNet. https://arxiv.org/abs/1908.07919

  101. facebookresearch/vissl โญ 3,294
    VISSL is FAIR's library of extensible, modular and scalable components for SOTA Self-Supervised Learning with images.
    ๐Ÿ”— vissl.ai

  102. lucidrains/musiclm-pytorch โญ 3,293
    Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in PyTorch

  103. shankarpandala/lazypredict โญ 3,285
    Lazy Predict helps build many basic models with little code and helps identify which models work better without any parameter tuning

  104. huggingface/huggingface_hub โญ 3,277
    The official Python client for the Hugging Face Hub.
    ๐Ÿ”— huggingface.co/docs/huggingface_hub

  105. huggingface/optimum โญ 3,268
    🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy-to-use hardware optimization tools
    ๐Ÿ”— huggingface.co/docs/optimum/main

  106. mljar/mljar-supervised โญ 3,234
    Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
    ๐Ÿ”— mljar.com

  107. nvidia/cuda-python โญ 3,145
    CUDA Python: Performance meets Productivity
    ๐Ÿ”— nvidia.github.io/cuda-python

  108. google-research/t5x โญ 2,938
    T5X is a modular, composable, research-friendly framework for high-performance, configurable, self-service training, evaluation, and inference of sequence models (starting with language) at many scales.

  109. eric-mitchell/direct-preference-optimization โญ 2,834
    Reference implementation for DPO (Direct Preference Optimization)

  110. rom1504/clip-retrieval โญ 2,716
    Easily compute clip embeddings and build a clip retrieval system with them
    ๐Ÿ”— rom1504.github.io/clip-retrieval

  111. freedmand/semantra โญ 2,686
    Semantra is a multipurpose tool for semantically searching documents. Query by meaning rather than just by matching text.

  112. apple/ml-ane-transformers โญ 2,672
    Reference implementation of the Transformer architecture optimized for Apple Neural Engine (ANE)

  113. qdrant/fastembed โญ 2,651
    Fast, accurate, lightweight Python library for generating state-of-the-art embeddings
    ๐Ÿ”— qdrant.github.io/fastembed

  114. huggingface/evaluate โญ 2,404
    🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
    ๐Ÿ”— huggingface.co/docs/evaluate

  115. benedekrozemberczki/karateclub โญ 2,273
    Karate Club is an unsupervised machine learning extension library for NetworkX.
    ๐Ÿ”— karateclub.readthedocs.io

  116. microsoft/Olive โญ 2,240
    Olive: Simplify ML Model Finetuning, Conversion, Quantization, and Optimization for CPUs, GPUs and NPUs.
    ๐Ÿ”— microsoft.github.io/olive

  117. castorini/pyserini โญ 2,008
    Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
    ๐Ÿ”— pyserini.io

  118. linkedin/greykite โญ 1,854
    A flexible, intuitive and fast forecasting library

  119. rentruewang/koila โญ 1,832
    Prevent PyTorch's "CUDA error: out of memory" in just 1 line of code.
    ๐Ÿ”— koila.rentruewang.com

  120. laekov/fastmoe โญ 1,829
    A fast Mixture-of-Experts (MoE) implementation for PyTorch
    ๐Ÿ”— fastmoe.ai

  121. visual-layer/fastdup โญ 1,817
    fastdup is a powerful, free tool designed to rapidly generate valuable insights from image and video datasets. It helps enhance the quality of both images and labels, while significantly reducing data operation costs, all with unmatched scalability.
    ๐Ÿ”— docs.visual-layer.com/fastdup_docs_old/first%20steps/getting-started

  122. microsoft/i-Code โญ 1,707
    The ambition of the i-Code project is to build integrative and composable multimodal AI. The "i" stands for integrative multimodal learning.

  123. google/vizier โญ 1,623
    Python-based research interface for blackbox and hyperparameter optimization, based on the internal Google Vizier Service.
    ๐Ÿ”— oss-vizier.readthedocs.io

  124. microsoft/Semi-supervised-learning โญ 1,559
    A Unified Semi-Supervised Learning Codebase (NeurIPS'22)
    ๐Ÿ”— usb.readthedocs.io

  125. spotify/voyager โญ 1,535
    ๐Ÿ›ฐ๏ธ An approximate nearest-neighbor search library for Python and Java with a focus on ease of use, simplicity, and deployability.
    ๐Ÿ”— spotify.github.io/voyager

  126. jina-ai/finetuner โญ 1,508
    🎯 Task-oriented embedding tuning for BERT, CLIP, etc.
    ๐Ÿ”— finetuner.jina.ai

  127. patchy631/machine-learning โญ 1,505
    Machine Learning Tutorials Repository

  128. lightning-ai/lightning-thunder โญ 1,437
    Thunder is a source-to-source compiler for PyTorch. It makes PyTorch programs faster by combining and using different hardware executors at once

  129. gradio-app/trackio โญ 1,230
    A lightweight, local-first, and 🆓 experiment tracking library from Hugging Face 🤗

  130. huggingface/optimum-quanto โญ 1,020
    A PyTorch quantization backend for Optimum

  131. criteo/autofaiss โญ 893
    Automatically create Faiss kNN indices with optimal similarity search parameters.
    ๐Ÿ”— criteo.github.io/autofaiss

  132. trent-b/iterative-stratification โญ 882
    Provides scikit-learn compatible cross validators with stratification for multilabel data.

  133. minishlab/semhash โญ 878
    SemHash is a lightweight and flexible tool for deduplicating datasets using semantic similarity. It combines fast embedding generation from Model2Vec with efficient ANN-based similarity search through Vicinity
    ๐Ÿ”— minish.ai/packages/semhash

  134. nomic-ai/contrastors โญ 773
    Contrastive learning toolkit that enables researchers and engineers to train and evaluate contrastive models efficiently.

  135. intel/intel-npu-acceleration-library โญ 701
    The Intel NPU Acceleration Library is a Python library designed to boost the efficiency of your applications by leveraging the power of the Intel Neural Processing Unit (NPU) to perform high-speed computations on compatible hardware.

  136. nicolas-hbt/pygraft โญ 696
    Configurable Generation of Synthetic Schemas and Knowledge Graphs at Your Fingertips
    ๐Ÿ”— pygraft.readthedocs.io/en/latest

  137. eleutherai/sparsify โญ 685
    This library trains k-sparse autoencoders (SAEs) on the residual stream activations of HuggingFace language models, roughly following the recipe detailed in Scaling and evaluating sparse autoencoders (Gao et al. 2024)

  138. hkust-knowcomp/AutoSchemaKG โญ 669
    A Knowledge Graph Construction Framework with Schema Generation and Knowledge Graph Completion
    ๐Ÿ”— hkust-knowcomp.github.io/autoschemakg

  139. google-deepmind/limit โญ 619
    On the Theoretical Limitations of Embedding-Based Retrieval
    ๐Ÿ”— arxiv.org/abs/2508.21038

  140. apple/ml-l3m โญ 229
    A flexible library for training any type of large model, regardless of modality. Instead of more traditional approaches, we opt for a config-heavy approach

  141. dylanhogg/gptauthor โญ 97
    GPTAuthor is an AI tool for writing long form, multi-chapter stories given a story prompt.

  142. awslabs/stickler โญ 23
    A library for evaluating structured data and AI outputs with weighted field comparison and custom comparators
    ๐Ÿ”— awslabs.github.io/stickler

  143. wjbmattingly/gliner-finetune โญ 14
    A library to generate synthetic data using LLM models, process this data, and then use it to train a GLiNER model. GLiNER is a Named Entity Recognition (NER) framework.

Machine Learning - Deep Learning

Machine learning libraries that cross over with deep learning in some way.

  1. tensorflow/tensorflow โญ 193,464
    An Open Source Machine Learning Framework for Everyone
    ๐Ÿ”— tensorflow.org

  2. pytorch/pytorch โญ 96,869
    Tensors and Dynamic neural networks in Python with strong GPU acceleration
    ๐Ÿ”— pytorch.org

  3. openai/whisper โญ 93,624
    Robust Speech Recognition via Large-Scale Weak Supervision

  4. keras-team/keras โญ 63,738
    Deep Learning for humans
    ๐Ÿ”— keras.io

  5. deepfakes/faceswap โญ 54,915
    Deepfakes Software For All
    ๐Ÿ”— www.faceswap.dev

  6. facebookresearch/segment-anything โญ 53,249
    The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

  7. microsoft/DeepSpeed โญ 41,382
    DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
    ๐Ÿ”— www.deepspeed.ai

  8. rwightman/pytorch-image-models โญ 36,256
    The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
    ๐Ÿ”— huggingface.co/docs/timm

  9. xinntao/Real-ESRGAN โญ 34,015
    Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.

  10. facebookresearch/detectron2 โญ 33,986
    Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
    ๐Ÿ”— detectron2.readthedocs.io/en/latest

  11. openai/CLIP โญ 32,375
    CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image

  12. lightning-ai/pytorch-lightning โญ 30,771
    The deep learning framework to pretrain, finetune and deploy AI models. PyTorch Lightning is just organized PyTorch - Lightning disentangles PyTorch code to decouple the science from the engineering.
    ๐Ÿ”— lightning.ai/pytorch-lightning/?utm_source=ptl_readme&utm_medium=referral&utm_campaign=ptl_readme

  13. google-research/tuning_playbook โญ 29,721
    A playbook for systematically maximizing the performance of deep learning models.

  14. facebookresearch/Detectron โญ 26,409
    FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.

  15. matterport/Mask_RCNN โญ 25,497
    Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

  16. lucidrains/vit-pytorch โญ 24,921
    Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch

  17. paddlepaddle/Paddle โญ 23,586
    PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning and machine learning)
    ๐Ÿ”— www.paddlepaddle.org

  18. pyg-team/pytorch_geometric โญ 23,414
    Graph Neural Network Library for PyTorch
    ๐Ÿ”— pyg.org

  19. sanster/IOPaint โญ 22,642
    Image inpainting tool powered by SOTA AI models. Remove any unwanted objects, defects, or people from your pictures, or erase and replace (powered by Stable Diffusion) anything in your pictures.
    ๐Ÿ”— www.iopaint.com

  20. danielgatis/rembg โญ 21,624
    Rembg is a tool to remove image backgrounds

  21. apache/mxnet โญ 20,840
    Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
    ๐Ÿ”— mxnet.apache.org

  22. rasbt/deeplearning-models โญ 17,369
    A collection of various deep learning architectures, models, and tips

  23. microsoft/Swin-Transformer โญ 15,667
    This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
    ๐Ÿ”— arxiv.org/abs/2103.14030

  24. albumentations-team/albumentations โญ 15,266
    Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
    ๐Ÿ”— albumentations.ai

  25. facebookresearch/detr โญ 15,071
    End-to-End Object Detection with Transformers

  26. nvidia/DeepLearningExamples โญ 14,697
    State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

  27. dmlc/dgl โญ 14,228
    Python package built to ease deep learning on graphs, on top of existing DL frameworks.
    ๐Ÿ”— dgl.ai

  28. mlfoundations/open_clip โญ 13,289
    Open source implementation of OpenAI's CLIP (Contrastive Language-Image Pre-training).

  29. tencent-hunyuan/HunyuanVideo โญ 11,635
    HunyuanVideo: A Systematic Framework For Large Video Generation Model
    ๐Ÿ”— aivideo.hunyuan.tencent.com

  30. kornia/kornia โญ 11,031
    ๐Ÿ Geometric Computer Vision Library for Spatial AI
    ๐Ÿ”— kornia.readthedocs.io

  31. facebookresearch/pytorch3d โญ 9,761
    PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
    ๐Ÿ”— pytorch3d.org

  32. modelscope/facechain โญ 9,497
    FaceChain is a deep-learning toolchain for generating your Digital-Twin.

  33. arogozhnikov/einops โญ 9,360
    Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)
    ๐Ÿ”— einops.rocks

  34. keras-team/autokeras โญ 9,290
    AutoML library for deep learning
    ๐Ÿ”— autokeras.com

  35. bytedance/monolith โญ 9,259
    A deep learning framework for large scale recommendation modeling with collisionless embedding and real time training captures.

  36. pyro-ppl/pyro โญ 8,959
    Deep universal probabilistic programming with Python and PyTorch
    ๐Ÿ”— pyro.ai

  37. facebookresearch/ImageBind โญ 8,955
    ImageBind: One Embedding Space to Bind Them All

  38. nvidia/apex โญ 8,899
    A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch

  39. lucidrains/imagen-pytorch โญ 8,409
    Implementation of Imagen, Google's Text-to-Image Neural Network, in PyTorch

  40. google/trax โญ 8,302
    Trax: Deep Learning with Clear Code and Speed

  41. xpixelgroup/BasicSR โญ 8,062
    Open Source Image and Video Restoration Toolbox for Super-resolution, Denoise, Deblurring, etc. Currently, it includes EDSR, RCAN, SRResNet, SRGAN, ESRGAN, EDVR, BasicVSR, SwinIR, ECBSR, etc. Also supports StyleGAN2 and DFDNet.
    ๐Ÿ”— basicsr.readthedocs.io/en/latest

  42. google/flax โญ 7,046
    Flax is a neural network library for JAX that is designed for flexibility.
    ๐Ÿ”— flax.readthedocs.io

  43. skorch-dev/skorch โญ 6,149
    A scikit-learn compatible neural network library that wraps PyTorch

  44. facebookresearch/mmf โญ 5,616
    A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
    ๐Ÿ”— mmf.sh

  45. mosaicml/composer โญ 5,458
    Supercharge Your Model Training
    ๐Ÿ”— docs.mosaicml.com

  46. nvidiagameworks/kaolin โญ 5,021
    A PyTorch Library for Accelerating 3D Deep Learning Research

  47. deci-ai/super-gradients โญ 4,997
    Easily train or fine-tune SOTA computer vision models with one open source training library. The home of Yolo-NAS.
    ๐Ÿ”— www.supergradients.com

  48. pytorch/ignite โญ 4,727
    High-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently.
    ๐Ÿ”— pytorch-ignite.ai

  49. facebookincubator/AITemplate โญ 4,701
    AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

  50. cvg/LightGlue โญ 4,322
    LightGlue: Local Feature Matching at Light Speed (ICCV 2023)

  51. modelscope/ClearerVoice-Studio โญ 3,856
    An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

  52. google-research/scenic โญ 3,757
    Scenic: A JAX Library for Computer Vision Research and Beyond

  53. williamyang1991/VToonify โญ 3,598
    [SIGGRAPH Asia 2022] VToonify: Controllable High-Resolution Portrait Video Style Transfer

  54. pytorch/botorch โญ 3,452
    Bayesian optimization in PyTorch
    ๐Ÿ”— botorch.org

  55. facebookresearch/PyTorch-BigGraph โญ 3,419
    Generate embeddings from large-scale graph-structured data.
    ๐Ÿ”— torchbiggraph.readthedocs.io

  56. alpa-projects/alpa โญ 3,175
    Training and serving large-scale neural networks with auto parallelization.
    ๐Ÿ”— alpa.ai

  57. deepmind/dm-haiku โญ 3,172
    JAX-based neural network library
    ๐Ÿ”— dm-haiku.readthedocs.io

  58. nerdyrodent/VQGAN-CLIP โญ 2,661
    Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.

  59. pytorch/torchrec โญ 2,462
    PyTorch domain library for recommendation systems
    ๐Ÿ”— pytorch.org/torchrec

  60. danielegrattarola/spektral โญ 2,393
    Graph Neural Networks with Keras and TensorFlow 2.
    ๐Ÿ”— graphneural.network

  61. google-research/electra โญ 2,369
    ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

  62. fepegar/torchio โญ 2,346
    Medical imaging processing for AI applications.
    ๐Ÿ”— docs.torchio.org

  63. neuralmagic/sparseml โญ 2,143
    Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

  64. jeshraghian/snntorch โญ 1,855
    Deep and online learning with spiking neural networks in Python
    ๐Ÿ”— snntorch.readthedocs.io/en/latest

  65. sakanaai/continuous-thought-machines โญ 1,728
    Continuous Thought Machine (CTM), a model designed to unfold and then leverage neural activity as the underlying mechanism for observation and action
    ๐Ÿ”— pub.sakana.ai/ctm

  66. xl0/lovely-tensors โญ 1,353
    Tensors, for human consumption
    ๐Ÿ”— xl0.github.io/lovely-tensors

  67. allenai/reward-bench โญ 683
    RewardBench is a benchmark designed to evaluate the capabilities and safety of reward models (including those trained with Direct Preference Optimization, DPO)
    ๐Ÿ”— huggingface.co/spaces/allenai/reward-bench

Machine Learning - Interpretability

Machine learning interpretability libraries. Covers explainability, prediction explanations, dashboards, understanding knowledge development in training.

  1. slundberg/shap โญ 24,945
    A game theoretic approach to explain the output of any machine learning model.
    ๐Ÿ”— shap.readthedocs.io

  2. marcotcr/lime โญ 12,090
    Lime: Explaining the predictions of any machine learning classifier

  3. arize-ai/phoenix โญ 8,344
    AI Observability & Evaluation
    ๐Ÿ”— arize.com/docs/phoenix

  4. interpretml/interpret โญ 6,768
    Fit interpretable models. Explain blackbox machine learning.
    ๐Ÿ”— interpret.ml/docs

  5. pytorch/captum โญ 5,537
    Model interpretability and understanding for PyTorch
    ๐Ÿ”— captum.ai

  6. tensorflow/lucid โญ 4,705
    A collection of infrastructure and tools for research in neural network interpretability.

  7. pair-code/lit โญ 3,628
    The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic interface.
    ๐Ÿ”— pair-code.github.io/lit

  8. maif/shapash โญ 3,119
    🔅 Shapash: User-friendly Explainability and Interpretability to Develop Reliable and Transparent Machine Learning Models
    ๐Ÿ”— maif.github.io/shapash

  9. transformerlensorg/TransformerLens โญ 3,012
    A library for mechanistic interpretability of GPT-style language models
    ๐Ÿ”— transformerlensorg.github.io/transformerlens

  10. eleutherai/pythia โญ 2,715
    Interpretability analysis and scaling laws to understand how knowledge develops and evolves during training in autoregressive transformers

  11. seldonio/alibi โญ 2,607
    Algorithms for explaining machine learning models
    ๐Ÿ”— docs.seldon.io/projects/alibi/en/stable

  12. oegedijk/explainerdashboard โญ 2,469
    Quickly build Explainable AI dashboards that show the inner workings of so-called "blackbox" machine learning models.
    ๐Ÿ”— explainerdashboard.readthedocs.io

  13. jalammar/ecco โญ 2,073
    Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT2, BERT, RoBERTa, T5, and T0).
    ๐Ÿ”— ecco.readthedocs.io

  14. google-deepmind/penzai โญ 1,851
    A JAX library for writing models as legible, functional pytree data structures, along with tools for visualizing, modifying, and analyzing them. Penzai focuses on making it easy to do stuff with models after they have been trained
    ๐Ÿ”— penzai.readthedocs.io

  15. stanfordnlp/pyreft โญ 1,554
    Stanford NLP Python library for Representation Finetuning (ReFT)
    ๐Ÿ”— arxiv.org/abs/2404.03592

  16. selfexplainml/PiML-Toolbox โญ 1,285
    PiML (Python Interpretable Machine Learning) toolbox for model development & diagnostics
    ๐Ÿ”— selfexplainml.github.io/piml-toolbox

  17. ethicalml/xai โญ 1,221
    XAI is a Machine Learning library that is designed with AI explainability at its core. XAI contains various tools that enable analysis and evaluation of data and models
    ๐Ÿ”— ethical.institute/principles.html#commitment-3

  18. jbloomaus/SAELens โญ 1,174
    Training Sparse Autoencoders on LLMs. Analyse sparse autoencoders and neural network internals.
    ๐Ÿ”— decoderesearch.github.io/saelens

  19. andyzoujm/representation-engineering โญ 942
    Representation Engineering: A Top-Down Approach to AI Transparency
    ๐Ÿ”— www.ai-transparency.org

  20. ndif-team/nnsight โญ 782
    The nnsight package enables interpreting and manipulating the internals of deep learned models.
    ๐Ÿ”— nnsight.net

  21. labmlai/inspectus โญ 705
    Inspectus provides visualization tools for attention mechanisms in deep learning models. It provides a set of comprehensive views, making it easier to understand how these models work.

Machine Learning - Ops

MLOps tools, frameworks and libraries: intersection of machine learning, data engineering and DevOps; deployment, health, diagnostics and governance of ML models.

  1. apache/airflow โญ 43,992
    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
    ๐Ÿ”— airflow.apache.org

  2. ray-project/ray โญ 40,957
    Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
    ๐Ÿ”— ray.io

  3. kestra-io/kestra โญ 26,269
    Event Driven Orchestration & Scheduling Platform for Mission Critical Applications
    ๐Ÿ”— kestra.io

  4. mlflow/mlflow โญ 23,795
    The open source developer platform to build AI agents and models with confidence. Enhance your AI applications with end-to-end tracking, observability, and evaluations, all in one integrated platform.
    ๐Ÿ”— mlflow.org

  5. jlowin/fastmcp โญ 22,279
    FastMCP is the standard framework for building MCP servers and clients. FastMCP 1.0 was incorporated into the official MCP Python SDK.
    ๐Ÿ”— gofastmcp.com

  6. prefecthq/prefect โญ 21,412
    Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
    ๐Ÿ”— prefect.io

  7. langfuse/langfuse โญ 21,017
    🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, LangChain, OpenAI SDK, LiteLLM, and more. 🍊 YC W23
    ๐Ÿ”— langfuse.com/docs

  8. spotify/luigi โญ 18,628
    Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.

  9. iterative/dvc โญ 15,302
    🦉 Data Versioning and ML Experiments
    ๐Ÿ”— dvc.org

  10. dagster-io/dagster โญ 14,798
    An orchestration platform for the development, production, and observation of data assets.
    ๐Ÿ”— dagster.io

  11. horovod/horovod โญ 14,655
    Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
    ๐Ÿ”— horovod.ai

  12. dbt-labs/dbt-core โญ 12,132
    dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
    ๐Ÿ”— getdbt.com

  13. bentoml/OpenLLM โญ 12,063
    Run any open-source LLM, such as DeepSeek or Llama, as an OpenAI-compatible API endpoint in the cloud.
    ๐Ÿ”— bentoml.com

  14. ludwig-ai/ludwig โญ 11,644
    Low-code framework for building custom LLMs, neural networks, and other AI models
    ๐Ÿ”— ludwig.ai

  15. great-expectations/great_expectations โญ 11,095
    Always know what to expect from your data.
    ๐Ÿ”— docs.greatexpectations.io

  16. huggingface/text-generation-inference โญ 10,739
    A Rust, Python and gRPC server for text generation inference. Used in production at HuggingFace to power Hugging Chat, the Inference API and Inference Endpoint.
    ๐Ÿ”— hf.co/docs/text-generation-inference

  17. kedro-org/kedro โญ 10,718
    Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
    ๐Ÿ”— kedro.org

  18. netflix/metaflow โญ 9,728
    Build, Manage and Deploy AI/ML Systems
    ๐Ÿ”— metaflow.org

  19. activeloopai/deeplake โญ 8,982
    Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
    ๐Ÿ”— activeloop.ai

  20. mage-ai/mage-ai โญ 8,622
    🧙 Build, run, and manage data pipelines for integrating and transforming data.
    ๐Ÿ”— www.mage.ai

  21. bentoml/BentoML โญ 8,384
    The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
    ๐Ÿ”— bentoml.com

  22. internlm/lmdeploy โญ 7,553
    LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
    ๐Ÿ”— lmdeploy.readthedocs.io/en/latest

  23. evidentlyai/evidently โญ 7,041
    Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. From tabular data to Gen AI. 100+ metrics.
    ๐Ÿ”— discord.gg/xzjkranp8b

  24. flyteorg/flyte โญ 6,694
    Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
    ๐Ÿ”— flyte.org

  25. feast-dev/feast โญ 6,645
    The Open Source Feature Store for AI/ML
    ๐Ÿ”— feast.dev

  26. adap/flower โญ 6,598
    Flower: A Friendly Federated AI Framework
    ๐Ÿ”— flower.ai

  27. allegroai/clearml โญ 6,468
    ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
    ๐Ÿ”— clear.ml/docs

  28. aimhubio/aim โญ 5,966
    Aim 💫 - An easy-to-use & supercharged open-source experiment tracker.
    ๐Ÿ”— aimstack.io

  29. zenml-io/zenml โญ 5,166
    ZenML 🙏: One AI Platform from Pipelines to Agents. https://zenml.io.
    ๐Ÿ”— zenml.io

  30. internlm/xtuner โญ 5,061
    A Next-Generation Training Engine Built for Ultra-Large MoE Models
    ๐Ÿ”— xtuner.readthedocs.io/zh-cn/latest

  31. orchest/orchest โญ 4,144
    Build data pipelines, the easy way 🛠️
    ๐Ÿ”— orchest.readthedocs.io/en/stable

  32. kubeflow/pipelines โญ 4,062
    Machine Learning Pipelines for Kubeflow
    ๐Ÿ”— www.kubeflow.org/docs/components/pipelines

  33. polyaxon/polyaxon โญ 3,692
    MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle
    ๐Ÿ”— polyaxon.com

  34. ploomber/ploomber โญ 3,622
    The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️
    ๐Ÿ”— docs.ploomber.io

  35. towhee-io/towhee โญ 3,449
    Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
    ๐Ÿ”— towhee.io

  36. azure/PyRIT โญ 3,342
    The Python Risk Identification Tool for generative AI (PyRIT) is an open access automation framework to empower security professionals and ML engineers to red team foundation models and their applications.
    ๐Ÿ”— azure.github.io/pyrit

  37. determined-ai/determined โญ 3,211
    Determined is an open-source machine learning platform that simplifies distributed training, hyperparameter tuning, experiment tracking, and resource management. Works with PyTorch and TensorFlow.
    ๐Ÿ”— determined.ai

  38. leptonai/leptonai โญ 2,804
    A Pythonic framework to simplify AI service building
    ๐Ÿ”— lepton.ai

  39. michaelfeil/infinity โญ 2,635
    Infinity is a high-throughput, low-latency REST API for serving text-embeddings, reranking models, clip, clap and colpali
    ๐Ÿ”— michaelfeil.github.io/infinity

  40. apache/hamilton โญ 2,374
    Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.
    ๐Ÿ”— hamilton.apache.org

  41. meltano/meltano โญ 2,326
    Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
    ๐Ÿ”— meltano.com

  42. labmlai/labml โญ 2,293
    🔎 Monitor deep learning model training and hardware usage from your mobile phone 📱
    ๐Ÿ”— labml.ai

  43. vllm-project/production-stack โญ 2,122
    vLLM's reference system for K8S-native cluster-wide deployment with community-driven performance optimization
    ๐Ÿ”— docs.vllm.ai/projects/production-stack

  44. dstackai/dstack โญ 2,018
    dstack is an open-source control plane for running development, training, and inference jobs on GPUs, across hyperscalers, neoclouds, or on-prem.
    ๐Ÿ”— dstack.ai/docs

  45. dagworks-inc/burr โญ 1,891
    Build applications that make decisions (chatbots, agents, simulations, etc...). Monitor, trace, persist, and execute on your own infrastructure.
    ๐Ÿ”— burr.apache.org

  46. substratusai/kubeai โญ 1,133
    AI Inference Operator for Kubernetes. The easiest way to serve ML models in production. Supports VLMs, LLMs, embeddings, and speech-to-text.
    ๐Ÿ”— www.kubeai.org

  47. arize-ai/openinference โญ 833
    OpenInference is a set of conventions and plugins that is complementary to OpenTelemetry to enable tracing of AI applications.
    ๐Ÿ”— arize-ai.github.io/openinference

  48. lightonai/pylate โญ 690
    Built on Sentence Transformers, designed to simplify fine-tuning, inference, and retrieval with state-of-the-art ColBERT models
    ๐Ÿ”— lightonai.github.io/pylate

Machine Learning - Reinforcement

Machine learning libraries and toolkits that cross over with reinforcement learning in some way: agent reinforcement learning, agent environments, RLHF

  1. openai/gym โญ 36,974
    A toolkit for developing and comparing reinforcement learning algorithms.
    ๐Ÿ”— www.gymlibrary.dev

  2. lvwerra/trl โญ 17,119
    Train transformer language models with reinforcement learning.
    ๐Ÿ”— hf.co/docs/trl

  3. openai/baselines โญ 16,627
    OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

  4. farama-foundation/Gymnasium โญ 11,179
    An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
    ๐Ÿ”— gymnasium.farama.org

  5. google/dopamine โญ 10,842
    Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
    ๐Ÿ”— github.com/google/dopamine

  6. thu-ml/tianshou โญ 9,822
    An elegant PyTorch deep reinforcement learning library.
    ๐Ÿ”— tianshou.org

  7. deepmind/pysc2 โญ 8,247
    StarCraft II Learning Environment

  8. lucidrains/PaLM-rlhf-pytorch โญ 7,878
    Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

  9. tensorlayer/TensorLayer โญ 7,387
    Deep Learning and Reinforcement Learning Library for Scientists and Engineers
    ๐Ÿ”— tensorlayerx.com

  10. keras-rl/keras-rl โญ 5,558
    Deep Reinforcement Learning for Keras.
    ๐Ÿ”— keras-rl.readthedocs.io

  11. deepmind/dm_control โญ 4,424
    Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.

  12. ai4finance-foundation/ElegantRL โญ 4,279
    Massively Parallel Deep Reinforcement Learning. 🔥
    ๐Ÿ”— ai4finance.org

  13. deepmind/acme โญ 3,904
    A library of reinforcement learning components and agents

  14. facebookresearch/ReAgent โญ 3,684
    A platform for Reasoning systems (Reinforcement Learning, Contextual Bandits, etc.)
    ๐Ÿ”— reagent.ai

  15. opendilab/DI-engine โญ 3,582
    DI-engine is a generalized decision intelligence engine for PyTorch and JAX. It provides python-first and asynchronous-native task and middleware abstractions
    ๐Ÿ”— di-engine-docs.readthedocs.io

  16. pettingzoo-team/PettingZoo โญ 3,286
    An API standard for multi-agent reinforcement learning environments, with popular reference environments and related utilities
    ๐Ÿ”— pettingzoo.farama.org

  17. pytorch/rl โญ 3,263
    A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
    ๐Ÿ”— pytorch.org/rl

  18. eureka-research/Eureka โญ 3,107
    Official Repository for "Eureka: Human-Level Reward Design via Coding Large Language Models" (ICLR 2024)
    ๐Ÿ”— eureka-research.github.io

  19. kzl/decision-transformer โญ 2,755
    Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.

  20. anthropics/hh-rlhf โญ 1,808
    Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"
    ๐Ÿ”— arxiv.org/abs/2204.05862

  21. google-deepmind/meltingpot โญ 781
    A suite of test scenarios for multi-agent reinforcement learning.

  22. open-tinker/OpenTinker โญ 598
    OpenTinker is an RL-as-a-Service infrastructure for foundation models, providing a flexible environment design framework that supports diverse training scenarios over data and interaction modes.

Natural Language Processing

Natural language processing libraries and toolkits: text processing, topic modelling, tokenisers, chatbots. Also see the LLMs and ChatGPT category for crossover.

  1. huggingface/transformers โญ 155,622
    🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
    ๐Ÿ”— huggingface.co/transformers

  2. myshell-ai/OpenVoice โญ 35,840
    Instant voice cloning by MIT and MyShell. Audio foundation model.
    ๐Ÿ”— research.myshell.ai/open-voice

  3. explosion/spaCy โญ 33,097
    💫 Industrial-strength Natural Language Processing (NLP) in Python
    ๐Ÿ”— spacy.io

  4. pytorch/fairseq โญ 32,110
    Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

  5. vikparuchuri/marker โญ 31,151
    Marker converts PDF, EPUB, and MOBI to markdown. It's 10x faster than nougat, more accurate on most documents, and has low hallucination risk.
    ๐Ÿ”— www.datalab.to

  6. microsoft/unilm โญ 21,979
    Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
    ๐Ÿ”— aka.ms/generalai

  7. huggingface/datasets โญ 21,122
    🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools
    ๐Ÿ”— huggingface.co/docs/datasets

  8. m-bain/whisperX โญ 19,783
    WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

  9. vikparuchuri/surya โญ 19,159
    OCR, layout analysis, reading order, table recognition in 90+ languages
    ๐Ÿ”— www.datalab.to

  10. ukplab/sentence-transformers โญ 18,144
    State-of-the-Art Text Embeddings
    ๐Ÿ”— www.sbert.net

  11. openai/tiktoken โญ 17,074
    tiktoken is a fast BPE tokeniser for use with OpenAI's models.

  12. nvidia/NeMo โญ 16,608
    A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
    ๐Ÿ”— docs.nvidia.com/nemo-framework/user-guide/latest/overview.html

  13. rare-technologies/gensim โญ 16,333
    Topic Modelling for Humans
    ๐Ÿ”— radimrehurek.com/gensim

  14. gunthercox/ChatterBot โญ 14,478
    ChatterBot is a machine learning, conversational dialog engine for creating chat bots
    ๐Ÿ”— docs.chatterbot.us

  15. nltk/nltk โญ 14,468
    NLTK Source
    ๐Ÿ”— www.nltk.org

  16. flairnlp/flair โญ 14,343
    A very simple framework for state-of-the-art Natural Language Processing (NLP)
    ๐Ÿ”— flairnlp.github.io/flair

  17. jina-ai/clip-as-service โญ 12,812
    🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
    ๐Ÿ”— clip-as-service.jina.ai

  18. neuml/txtai โญ 12,049
    💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows
    ๐Ÿ”— neuml.github.io/txtai

  19. allenai/allennlp โญ 11,888
    An open-source NLP research library, built on PyTorch.
    ๐Ÿ”— www.allennlp.org

  20. facebookresearch/seamless_communication โญ 11,738
    Foundational Models for State-of-the-Art Speech and Text Translation

  21. google/sentencepiece โญ 11,599
    Unsupervised text tokenizer for Neural Network-based text generation.

  22. speechbrain/speechbrain โญ 11,084
    A PyTorch-based Speech Toolkit
    ๐Ÿ”— speechbrain.github.io

  23. facebookresearch/ParlAI โญ 10,628
    A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
    ๐Ÿ”— parl.ai

  24. doccano/doccano โญ 10,501
    Open source annotation tool for machine learning practitioners.

  25. facebookresearch/nougat โญ 9,809
    Implementation of Nougat Neural Optical Understanding for Academic Documents
    ๐Ÿ”— facebookresearch.github.io/nougat

  26. espnet/espnet โญ 9,700
    End-to-End Speech Processing Toolkit
    ๐Ÿ”— espnet.github.io/espnet

  27. sloria/TextBlob โญ 9,486
    Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
    ๐Ÿ”— textblob.readthedocs.io

  28. togethercomputer/OpenChatKit โญ 9,015
    OpenChatKit provides a powerful, open-source base to create both specialized and general purpose chatbots

  29. clips/pattern โญ 8,854
    Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
    ๐Ÿ”— github.com/clips/pattern/wiki

  30. maartengr/BERTopic โญ 7,346
    Leveraging BERT and c-TF-IDF to create easily interpretable topics.
    ๐Ÿ”— maartengr.github.io/bertopic

  31. quivrhq/MegaParse โญ 7,267
    File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.
    ๐Ÿ”— megaparse.com

  32. deeppavlov/DeepPavlov โญ 6,963
    An open source library for deep learning end-to-end dialog systems and chatbots.
    ๐Ÿ”— deeppavlov.ai

  33. facebookresearch/metaseq โญ 6,542
    A codebase for working with Open Pre-trained Transformers, originally forked from fairseq.

  34. kingoflolz/mesh-transformer-jax โญ 6,363
    Model parallel transformers in JAX and Haiku

  35. aiwaves-cn/agents โญ 5,858
    An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents

  36. layout-parser/layout-parser โญ 5,642
    A Unified Toolkit for Deep Learning Based Document Image Analysis
    ๐Ÿ”— layout-parser.github.io

  37. salesforce/CodeGen โญ 5,172
    CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.

  38. minimaxir/textgenrnn โญ 4,932
    Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.

  39. argilla-io/argilla โญ 4,820
    Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
    ๐Ÿ”— argilla-io.github.io/argilla/latest

  40. makcedward/nlpaug โญ 4,641
    Data augmentation for NLP
    ๐Ÿ”— makcedward.github.io

  41. promptslab/Promptify โญ 4,536
    Prompt Engineering | Prompt Versioning | Use GPT or other prompt based models to get structured output. Join our discord for Prompt-Engineering, LLMs and other latest research
    ๐Ÿ”— discord.gg/m88xfymbk6

  42. facebookresearch/DrQA โญ 4,480
    Reading Wikipedia to Answer Open-Domain Questions

  43. thilinarajapakse/simpletransformers โญ 4,227
    Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
    ๐Ÿ”— simpletransformers.ai

  44. maartengr/KeyBERT โญ 4,086
    A minimal and easy-to-use keyword extraction technique that leverages BERT embeddings to create keywords and keyphrases that are most similar to a document.
    ๐Ÿ”— maartengr.github.io/keybert

  45. rapidfuzz/RapidFuzz โญ 3,717
    Rapid fuzzy string matching in Python using various string metrics
    ๐Ÿ”— rapidfuzz.github.io/rapidfuzz

  46. chonkie-inc/chonkie โญ 3,635
    🦛 CHONK docs with Chonkie ✨ - The lightweight ingestion library for fast, efficient and robust RAG pipelines
    ๐Ÿ”— docs.chonkie.ai

  47. life4/textdistance โญ 3,512
    📐 Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.

  48. bytedance/lightseq โญ 3,304
    LightSeq: A High Performance Library for Sequence Processing and Generation

  49. neuralmagic/deepsparse โญ 3,159
    Sparsity-aware deep learning inference runtime for CPUs
    ๐Ÿ”— neuralmagic.com/deepsparse

  50. huawei-noah/Pretrained-Language-Model โญ 3,155
    Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.

  51. ddangelov/Top2Vec โญ 3,108
    Top2Vec learns jointly embedded topic, document and word vectors.

  52. salesforce/CodeT5 โญ 3,096
    Home of CodeT5: Open Code LLMs for Code Understanding and Generation
    ๐Ÿ”— arxiv.org/abs/2305.07922

  53. bigscience-workshop/promptsource โญ 2,994
    Toolkit for creating, sharing and using natural language prompts.

  54. jbesomi/texthero โญ 2,917
    Text preprocessing, representation and visualization from zero to hero.
    ๐Ÿ”— texthero.org

  55. huggingface/neuralcoref โญ 2,890
    ✨ Fast Coreference Resolution in spaCy with Neural Networks
    ๐Ÿ”— huggingface.co/coref

  56. nvidia/nv-ingest โญ 2,817
    NVIDIA-Ingest is a scalable, performance-oriented document content and metadata extraction microservice.
    ๐Ÿ”— docs.nvidia.com/nemo/retriever/latest/extraction/overview

  57. urchade/GLiNER โญ 2,726
    Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024
    ๐Ÿ”— urchade.github.io/gliner

  58. huggingface/setfit โญ 2,673
    SetFit is an efficient and prompt-free framework for few-shot fine-tuning of Sentence Transformers.
    ๐Ÿ”— hf.co/docs/setfit

  59. alibaba/EasyNLP โญ 2,182
    EasyNLP: A Comprehensive and Easy-to-use NLP Toolkit

  60. thudm/P-tuning-v2 โญ 2,075
    An optimized deep prompt tuning strategy comparable to fine-tuning across scales and tasks

  61. featureform/featureform โญ 1,961
    The Virtual Feature Store. Turn your existing data infrastructure into a feature store.
    ๐Ÿ”— www.featureform.com

  62. marella/ctransformers โญ 1,878
    Python bindings for the Transformer models implemented in C/C++ using the GGML library.

  63. nomic-ai/nomic โญ 1,859
    Nomic Developer API SDK
    ๐Ÿ”— atlas.nomic.ai

  64. intellabs/fastRAG โญ 1,760
    Efficient Retrieval Augmentation and Generation Framework

  65. pemistahl/lingua-py โญ 1,625
    The most accurate natural language detection library for Python, suitable for short text and mixed-language text

  66. answerdotai/ModernBERT โญ 1,620
    Bringing BERT into modernity via both architecture changes and scaling
    ๐Ÿ”— arxiv.org/abs/2412.13663

  67. xhluca/bm25s โญ 1,464
    Fast lexical search implementing BM25 in Python using NumPy, Numba and SciPy
    ๐Ÿ”— bm25s.github.io

  68. openai/grade-school-math โญ 1,384
    GSM8K, a dataset of 8.5K high quality linguistically diverse grade school math word problems

  69. jonasgeiping/cramming โญ 1,361
    Cramming the training of a (BERT-type) language model into limited compute.

  70. abertsch72/unlimiformer โญ 1,066
    Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input"

  71. webis-de/small-text โญ 635
    Small-Text provides state-of-the-art Active Learning for Text Classification. Several pre-implemented Query Strategies, Initialization Strategies, and Stopping Criteria are provided, which can be easily mixed and matched to build active learning experiments or applications.
    ๐Ÿ”— small-text.readthedocs.io

  72. fastino-ai/GLiNER2 โญ 539
    GLiNER2 unifies Named Entity Recognition, Text Classification, and Structured Data Extraction into a single 205M parameter model. It provides efficient CPU-based inference without requiring complex pipelines or external API dependencies.

Packaging

Python packaging, dependency management and bundling.

  1. astral-sh/uv โญ 77,640
    An extremely fast Python package installer and resolver, written in Rust. Designed as a drop-in replacement for pip and pip-compile.
    ๐Ÿ”— docs.astral.sh/uv

  2. pyenv/pyenv โญ 44,127
    pyenv lets you easily switch between multiple versions of Python.

  3. python-poetry/poetry โญ 34,157
    Python packaging and dependency management made easy
    ๐Ÿ”— python-poetry.org

  4. pypa/pipenv โญ 25,106
    A virtualenv management tool that supports a multitude of systems and nicely bridges the gaps between pip, python and virtualenv.
    ๐Ÿ”— pipenv.pypa.io

  5. mitsuhiko/rye โญ 14,304
    a Hassle-Free Python Experience
    ๐Ÿ”— rye.astral.sh

  6. pyinstaller/pyinstaller โญ 12,850
    Freeze (package) Python programs into stand-alone executables
    ๐Ÿ”— www.pyinstaller.org

  7. pypa/pipx โญ 12,457
    Install and Run Python Applications in Isolated Environments
    ๐Ÿ”— pipx.pypa.io

  8. conda-forge/miniforge โญ 9,199
    A conda-forge distribution.
    ๐Ÿ”— conda-forge.org/download

  9. pdm-project/pdm โญ 8,541
    A modern Python package and dependency manager supporting the latest PEP standards
    ๐Ÿ”— pdm-project.org

  10. jazzband/pip-tools โญ 7,990
    A set of tools to keep your pinned Python dependencies fresh (pip-compile + pip-sync)
    ๐Ÿ”— pip-tools.rtfd.io

  11. mamba-org/mamba โญ 7,877
    The Fast Cross-Platform Package Manager: mamba is a reimplementation of the conda package manager in C++
    ๐Ÿ”— mamba.readthedocs.io

  12. conda/conda โญ 7,283
    A system-level, binary package and environment manager running on all major operating systems and platforms.
    ๐Ÿ”— docs.conda.io/projects/conda

  13. pypa/hatch โญ 7,109
    Modern, extensible Python project management
    ๐Ÿ”— hatch.pypa.io/latest

  14. prefix-dev/pixi โญ 6,197
    pixi is a cross-platform, multi-language package manager and workflow tool built on the foundation of the conda ecosystem.
    ๐Ÿ”— pixi.sh

  15. indygreg/PyOxidizer โญ 6,060
    A modern Python application packaging and distribution tool

  16. pypa/virtualenv โญ 5,007
    A tool to create isolated Python environments. Since Python 3.3, a subset of it has been integrated into the standard lib venv module.
    ๐Ÿ”— virtualenv.pypa.io

  17. spack/spack โญ 4,931
    A flexible package manager that supports multiple versions, configurations, platforms, and compilers.
    ๐Ÿ”— spack.io

  18. pantsbuild/pex โญ 4,170
    A tool for generating .pex (Python EXecutable) files, lock files and venvs.
    ๐Ÿ”— docs.pex-tool.org

  19. pypa/flit โญ 2,241
    Simplified packaging of Python modules
    ๐Ÿ”— flit.pypa.io

  20. ofek/pyapp โญ 1,913
    Runtime installer for Python applications
    ๐Ÿ”— ofek.dev/pyapp

  21. python-poetry/install.python-poetry.org โญ 247
    The official Poetry installation script
    ๐Ÿ”— install.python-poetry.org

Pandas

Pandas and dataframe libraries: data analysis, statistical reporting, pandas GUIs, pandas performance optimisations.

  1. pandas-dev/pandas โญ 47,680
    Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
    ๐Ÿ”— pandas.pydata.org

  2. pola-rs/polars โญ 37,116
    Extremely fast Query Engine for DataFrames, written in Rust
    ๐Ÿ”— docs.pola.rs

  3. duckdb/duckdb โญ 35,624
    DuckDB is an analytical in-process SQL database management system
    ๐Ÿ”— www.duckdb.org

  4. gventuri/pandas-ai โญ 23,060
    Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
    ๐Ÿ”— pandas-ai.com

  5. kanaries/pygwalker โญ 15,593
    PyGWalker: Turn your dataframe into an interactive UI for visual analysis
    ๐Ÿ”— kanaries.net/pygwalker

  6. ydataai/ydata-profiling โญ 13,343
    1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
    ๐Ÿ”— docs.sdk.ydata.ai

  7. rapidsai/cudf โญ 9,468
    cuDF is a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data
    ๐Ÿ”— docs.rapids.ai/api/cudf/stable

  8. eventual-inc/Daft โญ 5,144
    High-performance data engine for AI and multimodal workloads. Process images, audio, video, and structured data at any scale
    ๐Ÿ”— daft.ai

  9. deepseek-ai/smallpond โญ 4,905
    A lightweight data processing framework built on DuckDB and 3FS.

  10. unionai-oss/pandera โญ 4,180
    A light-weight, flexible, and expressive statistical data testing library
    ๐Ÿ”— www.union.ai/pandera

  11. aws/aws-sdk-pandas ⭐ 4,093
    pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
    🔗 aws-sdk-pandas.readthedocs.io

  12. nalepae/pandarallel ⭐ 3,806
    A simple and efficient tool to parallelize Pandas operations on all available CPUs
    🔗 nalepae.github.io/pandarallel

  13. adamerose/PandasGUI ⭐ 3,265
    A GUI for Pandas DataFrames

  14. delta-io/delta-rs ⭐ 3,114
    A native Rust library for Delta Lake, with bindings into Python
    🔗 delta-io.github.io/delta-rs

  15. jmcarpenter2/swifter ⭐ 2,641
    A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner

  16. fugue-project/fugue ⭐ 2,136
    A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.
    🔗 fugue-tutorials.readthedocs.io

  17. pyjanitor-devs/pyjanitor ⭐ 1,477
    Clean APIs for data cleaning. Python implementation of R package Janitor
    🔗 pyjanitor-devs.github.io/pyjanitor

  18. renumics/spotlight ⭐ 1,245
    Interactively explore unstructured datasets from your dataframe.
    🔗 renumics.com

Performance

Performance, parallelisation and low level libraries.

  1. celery/celery ⭐ 27,904
    Distributed Task Queue (development branch)
    🔗 docs.celeryq.dev

  2. google/flatbuffers ⭐ 25,451
    FlatBuffers: Memory Efficient Serialization Library
    🔗 flatbuffers.dev

  3. pybind/pybind11 ⭐ 17,658
    Seamless operability between C++11 and Python
    🔗 pybind11.readthedocs.io

  4. exaloop/codon ⭐ 16,566
    A high-performance, zero-overhead, extensible Python compiler with built-in NumPy support
    🔗 docs.exaloop.io

  5. dask/dask ⭐ 13,728
    Parallel computing with task scheduling
    🔗 dask.org

  6. numba/numba ⭐ 10,865
    NumPy aware dynamic Python compiler using LLVM
    🔗 numba.pydata.org

  7. modin-project/modin ⭐ 10,350
    Modin: Scale your Pandas workflows by changing a single line of code
    🔗 modin.readthedocs.io

  8. vaexio/vaex ⭐ 8,469
    Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
    🔗 vaex.io

  9. nebuly-ai/optimate ⭐ 8,353
    A collection of libraries to optimise AI model performance
    🔗 www.nebuly.com

  10. python-trio/trio ⭐ 7,110
    Trio – a friendly Python library for async concurrency and I/O
    🔗 trio.readthedocs.io

  11. mher/flower ⭐ 7,096
    Real-time monitor and web admin for Celery distributed task queue
    🔗 flower.readthedocs.io

  12. airtai/faststream ⭐ 4,901
    FastStream is a powerful and easy-to-use asynchronous Python framework for building asynchronous services interacting with event streams such as Apache Kafka, RabbitMQ, NATS and Redis.
    🔗 faststream.ag2.ai/latest

  13. tlkh/asitop ⭐ 4,419
    Perf monitoring CLI tool for Apple Silicon
    🔗 tlkh.github.io/asitop

  14. facebookincubator/cinder ⭐ 3,757
    This is Meta's fork of the CPython runtime. The name "cinder" here is historical, see https://github.com/facebookincubator/cinderx for the Python extension / JIT compiler.

  15. agronholm/anyio ⭐ 2,367
    High level asynchronous concurrency and networking framework that works on top of either Trio or asyncio

  16. tiangolo/asyncer ⭐ 2,343
    Asyncer, async and await, focused on developer experience.
    🔗 asyncer.tiangolo.com

  17. intel/intel-extension-for-transformers ⭐ 2,173
    ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

  18. intel/intel-extension-for-pytorch ⭐ 2,005
    A Python package that extends the official PyTorch to easily improve performance on Intel platforms

  19. faster-cpython/ideas ⭐ 1,727
    Discussion and work tracker for Faster CPython project.

Profiling

Memory and CPU/GPU profiling tools and libraries.

  1. benfred/py-spy ⭐ 14,865
    Sampling profiler for Python programs

  2. bloomberg/memray ⭐ 14,792
    Memray is a memory profiler for Python
    🔗 bloomberg.github.io/memray

  3. plasma-umass/scalene ⭐ 13,243
    Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals

  4. joerick/pyinstrument ⭐ 7,596
    🚴 Call stack profiler for Python. Shows you why your code is slow!
    🔗 pyinstrument.readthedocs.io

  5. gaogaotiantian/viztracer ⭐ 7,523
    A debugging and profiling tool that can trace and visualize Python code execution
    🔗 viztracer.readthedocs.io

  6. pythonprofilers/memory_profiler ⭐ 4,549
    Monitor memory usage of Python code
    🔗 pypi.python.org/pypi/memory_profiler

  7. pyutils/line_profiler ⭐ 3,188
    Line-by-line profiling for Python

  8. reloadware/reloadium ⭐ 2,996
    Hot Reloading and Profiling for Python

Security

Security related libraries: vulnerability discovery, SQL injection, environment auditing.

  1. swisskyrepo/PayloadsAllTheThings ⭐ 74,615
    A list of useful payloads and bypasses for Web Application Security and Pentest/CTF
    🔗 swisskyrepo.github.io/payloadsallthethings

  2. sqlmapproject/sqlmap ⭐ 36,374
    Automatic SQL injection and database takeover tool
    🔗 sqlmap.org

  3. certbot/certbot ⭐ 32,781
    Certbot is EFF's tool to obtain certs from Let's Encrypt and (optionally) auto-enable HTTPS on your server. It can also act as a client for any other CA that uses the ACME protocol.

  4. aquasecurity/trivy ⭐ 31,095
    Find vulnerabilities, misconfigurations, secrets, SBOM in containers, Kubernetes, code repositories, clouds and more
    🔗 trivy.dev

  5. bridgecrewio/checkov ⭐ 8,423
    Checkov is a static code analysis tool for infrastructure as code (IaC) and also a software composition analysis (SCA) tool for images and open source packages.
    🔗 www.checkov.io

  6. stamparm/maltrail ⭐ 8,167
    Malicious traffic detection system

  7. pycqa/bandit ⭐ 7,688
    Bandit is a tool designed to find common security issues in Python code.
    🔗 bandit.readthedocs.io

  8. nccgroup/ScoutSuite ⭐ 7,519
    Multi-Cloud Security Auditing Tool

  9. microsoft/presidio ⭐ 6,716
    Context aware, pluggable and customizable PII de-identification service for text and images
    🔗 microsoft.github.io/presidio

  10. rhinosecuritylabs/pacu ⭐ 5,040
    The AWS exploitation framework, designed for testing the security of Amazon Web Services environments.
    🔗 rhinosecuritylabs.com/aws/pacu-open-source-aws-exploitation-framework

  11. dashingsoft/pyarmor ⭐ 4,913
    A tool used to obfuscate Python scripts, bind obfuscated scripts to a fixed machine, or set obfuscated scripts to expire.
    🔗 pyarmor.dashingsoft.com

  12. fadi002/de4py ⭐ 950
    Toolkit for Python reverse engineering

Simulation

Simulation libraries: robotics, economic, agent-based, traffic, physics, astronomy, chemistry, quantum simulation. Also see the Maths and Science category for crossover.

  1. atsushisakai/PythonRobotics ⭐ 28,351
    Python sample code and textbook for robotics algorithms.
    🔗 atsushisakai.github.io/pythonrobotics

  2. genesis-embodied-ai/Genesis ⭐ 28,015
    Genesis is a physics platform, and generative data engine, designed for general purpose Robotics/Embodied AI/Physical AI applications
    🔗 genesis-world.readthedocs.io

  3. bulletphysics/bullet3 ⭐ 14,183
    Bullet Physics SDK: real-time collision detection and multi-physics simulation for VR, games, visual effects, robotics, machine learning etc.
    🔗 bulletphysics.org

  4. isl-org/Open3D ⭐ 13,256
    Open3D: A Modern Library for 3D Data Processing
    🔗 www.open3d.org

  5. dlr-rm/stable-baselines3 ⭐ 12,596
    Stable Baselines3 (SB3) is a set of reliable implementations of reinforcement learning algorithms in PyTorch
    🔗 stable-baselines3.readthedocs.io

  6. nvidia/Cosmos ⭐ 8,085
    NVIDIA Cosmos is a developer-first world foundation model platform designed to help Physical AI developers build their Physical AI systems better and faster.
    🔗 github.com/nvidia-cosmos

  7. qiskit/qiskit ⭐ 6,956
    Qiskit is an open-source SDK for working with quantum computers at the level of extended quantum circuits, operators, and primitives.
    🔗 www.ibm.com/quantum/qiskit

  8. nvidia/warp ⭐ 6,138
    A Python framework for accelerated simulation, data generation and spatial computing.
    🔗 nvidia.github.io/warp

  9. nvidia-omniverse/IsaacLab ⭐ 6,136
    Unified framework for robot learning built on NVIDIA Isaac Sim
    🔗 isaac-sim.github.io/isaaclab

  10. astropy/astropy ⭐ 5,012
    Astronomy and astrophysics core library
    🔗 www.astropy.org

  11. quantumlib/Cirq ⭐ 4,851
    Python framework for creating, editing, and running Noisy Intermediate-Scale Quantum (NISQ) circuits.
    🔗 quantumai.google/cirq

  12. chakazul/Lenia ⭐ 3,728
    Lenia is a 2D cellular automaton with continuous space, time and states. It produces a huge variety of interesting mathematical life forms
    🔗 chakazul.github.io/lenia/javascript/lenia.html

  13. openai/mujoco-py ⭐ 3,105
    MuJoCo is a physics engine for detailed, efficient rigid body simulations with contacts. mujoco-py allows using MuJoCo from Python 3.

  14. pennylaneai/pennylane ⭐ 3,037
    PennyLane is a cross-platform Python library for quantum computing, quantum machine learning, and quantum chemistry. Built by researchers, for research.
    🔗 pennylane.ai

  15. google/brax ⭐ 3,033
    Massively parallel rigid body physics simulation on accelerator hardware.

  16. nvidia-omniverse/IsaacGymEnvs ⭐ 2,817
    Example RL environments for the NVIDIA Isaac Gym high performance environments

  17. facebookresearch/habitat-lab ⭐ 2,807
    A modular high-level library to train embodied AI agents across a variety of tasks and environments.
    🔗 aihabitat.org

  18. tencent-hunyuan/Hunyuan3D-2.1 ⭐ 2,788
    Tencent Hunyuan3D-2.1 is a scalable 3D asset creation system that advances state-of-the-art 3D generation
    🔗 3d.hunyuan.tencent.com

  19. taichi-dev/difftaichi ⭐ 2,707
    10 differentiable physical simulators built with Taichi differentiable programming (DiffTaichi, ICLR 2020)

  20. dlr-rm/rl-baselines3-zoo ⭐ 2,691
    A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.
    🔗 rl-baselines3-zoo.readthedocs.io

  21. isaac-sim/IsaacSim ⭐ 2,407
    NVIDIA Isaac Sim is a simulation platform built on NVIDIA Omniverse, designed to develop, test, train, and deploy AI-powered robots in realistic virtual environments.
    🔗 developer.nvidia.com/isaac/sim

  22. microsoft/PromptCraft-Robotics ⭐ 2,084
    Community for applying LLMs to robotics and a robot simulator with ChatGPT integration
    🔗 aka.ms/chatgpt-robotics

  23. eloialonso/diamond ⭐ 1,952
    DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model
    🔗 diamond-wm.github.io

  24. polymathicai/the_well ⭐ 1,655
    15TB of Physics Simulations: collection of machine learning datasets containing numerical simulations of a wide variety of spatiotemporal physical systems.
    🔗 polymathic-ai.org/the_well

  25. bowang-lab/scGPT ⭐ 1,446
    scGPT: Towards Building a Foundation Model for Single-Cell Multi-omics Using Generative AI
    🔗 scgpt.readthedocs.io/en/latest

  26. altera-al/project-sid ⭐ 1,168
    Project Sid: Many-agent simulations toward AI civilization technical report

  27. google-deepmind/materials_discovery ⭐ 1,114
    Graph Networks for Materials Science (GNoME) is a project centered around scaling machine learning methods to tackle materials science.

  28. viblo/pymunk ⭐ 1,038
    Pymunk is an easy-to-use Pythonic 2D physics library that can be used whenever you need 2D rigid body physics from Python
    🔗 www.pymunk.org

  29. eureka-research/DrEureka ⭐ 915
    Official Repository for "DrEureka: Language Model Guided Sim-To-Real Transfer" (RSS 2024)
    🔗 eureka-research.github.io/dr-eureka

  30. ur-whitelab/chemcrow-public ⭐ 868
    Chemcrow

  31. vandijklab/cell2sentence ⭐ 801
    Cell2Sentence (C2S-Scale) framework for applying Large Language Models (LLMs) to single-cell transcriptomics.

  32. sakanaai/ShinkaEvolve ⭐ 799
    A framework that combines LLMs with evolutionary algorithms to drive scientific discovery. Leveraging creative capabilities of LLMs and the optimization power of evolutionary search, enables automated exploration and improvement of scientific code.

  33. sakanaai/asal ⭐ 449
    Automating the Search for Artificial Life with Foundation Models!
    🔗 pub.sakana.ai/asal

  34. arshka/PhysiX ⭐ 110
    A Foundation Model for physics simulations

  35. ur-whitelab/chemcrow-runs ⭐ 94
    ur-whitelab/chemcrow-runs

Study

Miscellaneous study resources: algorithms, general resources, system design, code repos for textbooks, best practices, tutorials.

  1. thealgorithms/Python ⭐ 217,119
    All Algorithms implemented in Python
    🔗 thealgorithms.github.io/python

  2. microsoft/generative-ai-for-beginners ⭐ 105,570
    Learn the fundamentals of building Generative AI applications with our 21-lesson comprehensive course by Microsoft Cloud Advocates.

  3. rasbt/LLMs-from-scratch ⭐ 83,698
    Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
    🔗 amzn.to/4fqvn0d

  4. mlabonne/llm-course ⭐ 73,812
    Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
    🔗 mlabonne.github.io/blog

  5. labmlai/annotated_deep_learning_paper_implementations ⭐ 65,496
    🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
    🔗 nn.labml.ai

  6. jakevdp/PythonDataScienceHandbook ⭐ 46,573
    Python Data Science Handbook: full text in Jupyter Notebooks
    🔗 jakevdp.github.io/pythondatasciencehandbook

  7. realpython/python-guide ⭐ 29,471
    Python best practices guidebook, written for humans.
    🔗 docs.python-guide.org

  8. d2l-ai/d2l-en ⭐ 28,013
    Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.
    🔗 d2l.ai

  9. christoschristofidis/awesome-deep-learning ⭐ 27,358
    A curated list of awesome Deep Learning tutorials, projects and communities.

  10. hannibal046/Awesome-LLM ⭐ 26,079
    Awesome-LLM: a curated list of Large Language Model resources

  11. huggingface/agents-course ⭐ 24,945
    This repository contains the Hugging Face Agents Course.

  12. wesm/pydata-book ⭐ 24,209
    Materials and IPython notebooks for "Python for Data Analysis" by Wes McKinney, published by O'Reilly Media

  13. affaan-m/everything-claude-code ⭐ 23,150
    Complete Claude Code configuration collection - agents, skills, hooks, commands, rules, MCPs. Battle-tested configs from an Anthropic hackathon winner.

  14. microsoft/recommenders ⭐ 21,384
    Best Practices on Recommendation Systems
    🔗 recommenders-team.github.io/recommenders/intro.html

  15. karpathy/nn-zero-to-hero ⭐ 20,041
    Neural Networks: Zero to Hero

  16. handsonllm/Hands-On-Large-Language-Models ⭐ 19,993
    Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
    🔗 www.llm-book.com

  17. fchollet/deep-learning-with-python-notebooks ⭐ 19,876
    Jupyter notebooks for the code samples of the book "Deep Learning with Python"

  18. nirdiamant/agents-towards-production ⭐ 16,944
    The open-source playbook for turning AI agents into real-world products.

  19. mrdbourke/pytorch-deep-learning ⭐ 16,921
    Materials for the Learn PyTorch for Deep Learning: Zero to Mastery course.
    🔗 learnpytorch.io

  20. zhanymkanov/fastapi-best-practices ⭐ 16,156
    FastAPI Best Practices and Conventions we used at our startup

  21. naklecha/llama3-from-scratch ⭐ 15,243
    llama3 implementation one matrix multiplication at a time

  22. graykode/nlp-tutorial ⭐ 14,840
    Natural Language Processing Tutorial for Deep Learning Researchers
    🔗 www.reddit.com/r/machinelearning/comments/amfinl/project_nlptutoral_repository_who_is_studying

  23. shangtongzhang/reinforcement-learning-an-introduction ⭐ 14,515
    Python Implementation of Reinforcement Learning: An Introduction

  24. karpathy/micrograd ⭐ 14,428
    A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API

  25. chiphuyen/aie-book ⭐ 13,206
    Code for AI Engineering: Building Applications with Foundation Models (Chip Huyen 2025)

  26. eugeneyan/open-llms ⭐ 12,602
    📋 A list of open LLMs available for commercial use.

  27. rucaibox/LLMSurvey ⭐ 12,070
    The official GitHub page for the survey paper "A Survey of Large Language Models".
    🔗 arxiv.org/abs/2303.18223

  28. srush/GPU-Puzzles ⭐ 11,902
    Teaching beginner GPU programming in a completely interactive fashion

  29. openai/spinningup ⭐ 11,548
    An educational resource to help anyone learn deep reinforcement learning.
    🔗 spinningup.openai.com

  30. nielsrogge/Transformers-Tutorials ⭐ 11,479
    This repository contains demos I made with the Transformers library by HuggingFace.

  31. mooler0410/LLMsPracticalGuide ⭐ 10,142
    A curated list of practical guide resources for LLMs (LLMs Tree, Examples, Papers)
    🔗 arxiv.org/abs/2304.13712v2

  32. kalyanks-nlp/llm-engineer-toolkit ⭐ 9,688
    A curated list of 120+ LLM libraries, organised by category.
    🔗 www.linkedin.com/in/kalyanksnlp

  33. roboflow/notebooks ⭐ 9,117
    A collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-edge models like RF-DETR, YOLO11, SAM 3, and Qwen3-VL.
    🔗 roboflow.com/models

  34. udlbook/udlbook ⭐ 8,940
    Understanding Deep Learning - Simon J.D. Prince

  35. engineer1999/A-Curated-List-of-ML-System-Design-Case-Studies ⭐ 8,205
    Curated collection of 300+ case studies from over 80 companies, detailing practical applications and insights into machine learning (ML) system design

  36. alirezadir/Machine-Learning-Interviews ⭐ 7,644
    This repo is meant to serve as a guide for Machine Learning/AI technical interviews.

  37. firmai/industry-machine-learning ⭐ 7,442
    A curated list of applied machine learning and data science notebooks and libraries across different industries (by @firmai)
    🔗 www.sov.ai

  38. gkamradt/langchain-tutorials ⭐ 7,366
    Overview and tutorial of the LangChain Library

  39. huggingface/smol-course ⭐ 6,578
    A practical course on aligning language models for your specific use case. It's a handy way to get started with aligning language models, because everything runs on most local machines.

  40. neetcode-gh/leetcode ⭐ 6,258
    Leetcode solutions for NeetCode.io

  41. mrdbourke/tensorflow-deep-learning ⭐ 5,831
    All course materials for the Zero to Mastery Deep Learning with TensorFlow course.
    🔗 dbourke.link/ztmtfcourse

  42. udacity/deep-learning-v2-pytorch ⭐ 5,469
    Projects and exercises for the latest Deep Learning ND program https://www.udacity.com/course/deep-learning-nanodegree--nd101

  43. promptslab/Awesome-Prompt-Engineering ⭐ 5,296
    This repository contains hand-curated resources for Prompt Engineering with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM etc
    🔗 discord.gg/m88xfymbk6

  44. timofurrer/awesome-asyncio ⭐ 4,987
    A curated list of awesome Python asyncio frameworks, libraries, software and resources

  45. rasbt/machine-learning-book ⭐ 4,943
    Code Repository for Machine Learning with PyTorch and Scikit-Learn
    🔗 sebastianraschka.com/books/#machine-learning-with-pytorch-and-scikit-learn

  46. huggingface/deep-rl-class ⭐ 4,719
    This repo contains the Hugging Face Deep Reinforcement Learning Course.

  47. zotroneneis/machine_learning_basics ⭐ 4,402
    Plain Python implementations of basic machine learning algorithms

  48. huggingface/diffusion-models-class ⭐ 4,263
    Materials for the Hugging Face Diffusion Models Course

  49. amanchadha/coursera-deep-learning-specialization ⭐ 4,149
    Notes, programming assignments and quizzes from all courses within the Coursera Deep Learning specialization offered by deeplearning.ai: (i) Neural Networks and Deep Learning; (ii) Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization; (iii) Structuring Machine Learning Projects; (iv...

  50. fluentpython/example-code-2e ⭐ 3,928
    Example code for Fluent Python, 2nd edition (O'Reilly 2022)
    🔗 amzn.to/3j48u2j

  51. cosmicpython/book ⭐ 3,708
    A Book about Pythonic Application Architecture Patterns for Managing Complexity. Cosmos is the Opposite of Chaos you see. O'R. wouldn't actually let us call it "Cosmic Python" tho.
    🔗 www.cosmicpython.com

  52. mrdbourke/zero-to-mastery-ml ⭐ 3,583
    All course materials for the Zero to Mastery Machine Learning and Data Science course.
    🔗 dbourke.link/ztmmlcourse

  53. krzjoa/awesome-python-data-science ⭐ 3,318
    Probably the best curated list of data science software in Python.
    🔗 krzjoa.github.io/awesome-python-data-science

  54. huggingface/cookbook ⭐ 2,580
    Community-driven practical examples of building AI applications and solving various tasks with AI using open-source tools and models.
    🔗 huggingface.co/learn/cookbook

  55. gerdm/prml ⭐ 2,532
    Repository of notes, code and notebooks in Python for the book Pattern Recognition and Machine Learning by Christopher Bishop

  56. cerlymarco/MEDIUM_NoteBook ⭐ 2,141
    Repository containing notebooks of my posts on Medium

  57. aburkov/theLMbook ⭐ 2,080
    Code for Hundred-Page Language Models Book by Andriy Burkov
    🔗 www.thelmbook.com

  58. huggingface/evaluation-guidebook ⭐ 2,040
    Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!

  59. atcold/NYU-DLSP21 ⭐ 1,651
    NYU Deep Learning Spring 2021
    🔗 atcold.github.io/nyu-dlsp21

  60. davidadsp/Generative_Deep_Learning_2nd_Edition ⭐ 1,443
    The official code repository for the second edition of the O'Reilly book Generative Deep Learning: Teaching Machines to Paint, Write, Compose and Play.
    🔗 www.oreilly.com/library/view/generative-deep-learning/9781098134174

  61. rasbt/LLM-workshop-2024 ⭐ 1,061
    A 4-hour coding workshop to understand how LLMs are implemented and used

  62. cfregly/ai-performance-engineering ⭐ 973
    AI Systems Performance Engineering code and resources for the O'Reilly book covering GPU optimization, distributed training, inference scaling

  63. dylanhogg/awesome-python ⭐ 439
    🐍 Hand-picked awesome Python libraries and frameworks, organised by category
    🔗 www.awesomepython.org

Template

Template tools and libraries: cookiecutter repos, generators, quick-starts.

  1. tiangolo/full-stack-fastapi-template ⭐ 41,003
    Full stack, modern web application template. Using FastAPI, React, SQLModel, PostgreSQL, Docker, GitHub Actions, automatic HTTPS and more.

  2. cookiecutter/cookiecutter ⭐ 24,573
    A cross-platform command-line utility that creates projects from cookiecutters (project templates), e.g. Python package projects, C projects.
    🔗 pypi.org/project/cookiecutter

  3. drivendata/cookiecutter-data-science ⭐ 9,620
    A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
    🔗 cookiecutter-data-science.drivendata.org

  4. buuntu/fastapi-react ⭐ 2,570
    🚀 Cookiecutter Template for FastAPI + React Projects. Using PostgreSQL, SQLAlchemy, and Docker

  5. cjolowicz/cookiecutter-hypermodern-python ⭐ 1,902
    Cookiecutter template for a Python package based on the Hypermodern Python article series.
    🔗 cookiecutter-hypermodern-python.readthedocs.io

  6. fmind/mlops-python-package ⭐ 1,385
    Best practices designed to support your MLOps initiatives. You can use this package as part of your MLOps toolkit or platform e.g. Model Registry, Experiment Tracking, Realtime Inference
    🔗 fmind.github.io/mlops-python-package

  7. fpgmaas/cookiecutter-uv ⭐ 1,223
    A modern cookiecutter template for Python projects that use uv for dependency management
    🔗 fpgmaas.github.io/cookiecutter-uv

  8. tezromach/python-package-template ⭐ 1,096
    🚀 Your next Python package needs a bleeding-edge project structure.

  9. callmesora/llmops-python-package ⭐ 887
    Best practices designed to support your LLMOps initiatives. You can use this package as part of your LLMOps toolkit or platform e.g. Model Registry, Experiment Tracking, Realtime Inference

Terminal

Terminal and console tools and libraries: CLI tools, terminal based formatters, progress bars.

  1. anthropics/claude-code ⭐ 60,052
    Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows
    🔗 code.claude.com/docs/en/overview

  2. willmcgugan/rich ⭐ 55,226
    Rich is a Python library for rich text and beautiful formatting in the terminal.
    🔗 rich.readthedocs.io/en/latest

  3. aider-ai/aider ⭐ 40,036
    Aider lets you pair program with LLMs, to edit code in your local git repository
    🔗 aider.chat

  4. willmcgugan/textual ⭐ 33,821
    The lean application framework for Python. Build sophisticated user interfaces with a simple Python API. Run your apps in the terminal and a web browser.
    🔗 textual.textualize.io

  5. tqdm/tqdm ⭐ 30,889
    ⚡ A Fast, Extensible Progress Bar for Python and CLI
    🔗 tqdm.github.io

  6. google/python-fire ⭐ 28,072
    Python Fire is a library for automatically generating command line interfaces (CLIs) from absolutely any Python object.

  7. tiangolo/typer ⭐ 18,686
    Typer, build great CLIs. Easy to code. Based on Python type hints.
    🔗 typer.tiangolo.com

  8. pallets/click ⭐ 17,141
    Python composable command line interface toolkit
    🔗 click.palletsprojects.com

  9. simonw/llm ⭐ 10,950
    A CLI utility and Python library for interacting with Large Language Models, both via remote APIs and models that can be installed and run on your own machine.
    🔗 llm.datasette.io

  10. prompt-toolkit/python-prompt-toolkit ⭐ 10,225
    Library for building powerful interactive command line applications in Python
    🔗 python-prompt-toolkit.readthedocs.io

  11. saulpw/visidata ⭐ 8,755
    A terminal spreadsheet multitool for discovering and arranging data
    🔗 visidata.org

  12. xxh/xxh ⭐ 5,886
    🚀 Bring your favorite shell wherever you go over SSH. Xonsh shell, fish, zsh, osquery and so on.

  13. tconbeer/harlequin ⭐ 5,646
    The SQL IDE for Your Terminal.
    🔗 harlequin.sh

  14. manrajgrover/halo ⭐ 3,086
    💫 Beautiful spinners for terminal, IPython and Jupyter

  15. textualize/trogon ⭐ 2,791
    Easily turn your Click CLI into a powerful terminal application

  16. darrenburns/elia ⭐ 2,416
    A snappy, keyboard-centric terminal user interface for interacting with large language models. Chat with ChatGPT, Claude, Llama 3, Phi 3, Mistral, Gemma and more.

  17. shobrook/wut ⭐ 1,406
    Just type wut and an LLM will help you understand whatever's in your terminal. You'll be surprised how useful this can be.

  18. 1j01/textual-paint ⭐ 1,083
    🎨 MS Paint in your terminal.
    🔗 pypi.org/project/textual-paint

Testing

Testing libraries: unit testing, load testing, acceptance testing, code coverage, browser automation, plugins.

  1. mitmproxy/mitmproxy ⭐ 42,031
    An interactive TLS-capable intercepting HTTP proxy for penetration testers and software developers.
    🔗 mitmproxy.org

  2. locustio/locust ⭐ 27,382
    Write scalable load tests in plain Python 🚗💨

  3. microsoft/playwright-python ⭐ 14,182
    Playwright is a Python library to automate Chromium, Firefox and WebKit browsers with a single API.
    🔗 playwright.dev/python

  4. pytest-dev/pytest ⭐ 13,483
    The pytest framework makes it easy to write small tests, yet scales to support complex functional testing
    🔗 pytest.org

  5. confident-ai/deepeval ⭐ 13,158
    LLM evaluation framework similar to Pytest but specialized for unit testing LLM outputs. DeepEval incorporates the latest research to evaluate LLM outputs based on metrics such as G-Eval, hallucination, answer relevancy, RAGAS, etc
    🔗 deepeval.com

  6. seleniumbase/SeleniumBase ⭐ 12,116
    Python APIs for web automation, testing, and bypassing bot-detection with ease.
    🔗 seleniumbase.io

  7. robotframework/robotframework ⭐ 11,370
    Generic automation framework for acceptance testing and RPA
    🔗 robotframework.org

  8. hypothesisworks/hypothesis ⭐ 8,407
    The property-based testing library for Python
    🔗 hypothesis.works

  9. getmoto/moto ⭐ 8,161
    A library that allows you to easily mock out tests based on AWS infrastructure.
    🔗 docs.getmoto.org/en/latest

  10. newsapps/beeswithmachineguns ⭐ 6,618
    A utility for arming (creating) many bees (micro EC2 instances) to attack (load test) targets (web applications).
    🔗 apps.chicagotribune.com

  11. codium-ai/qodo-cover ⭐ 5,262
    Qodo-Cover: An AI-Powered Tool for Automated Test Generation and Code Coverage Enhancement! 💻🤖🧪🐞
    🔗 qodo.ai

  12. spulec/freezegun ⭐ 4,485
    Let your Python tests travel through time

  13. getsentry/responses ⭐ 4,316
    A utility for mocking out the Python Requests library.

  14. tox-dev/tox ⭐ 3,892
    Command line driven CI frontend and development task automation tool.
    🔗 tox.wiki

  15. nedbat/coveragepy ⭐ 3,308
    The code coverage tool for Python
    🔗 coverage.readthedocs.io

Machine Learning - Time Series

Machine learning and classical timeseries libraries: forecasting, seasonality, anomaly detection, econometrics.

  1. facebook/prophet ⭐ 19,969
    Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
    🔗 facebook.github.io/prophet

  2. sktime/sktime ⭐ 9,464
    A unified framework for machine learning with time series
    🔗 www.sktime.net

  3. unit8co/darts ⭐ 9,161
    A Python library for user-friendly forecasting and anomaly detection on time series.
    🔗 unit8co.github.io/darts

  4. blue-yonder/tsfresh ⭐ 9,086
    Automatic extraction of relevant features from time series
    🔗 tsfresh.readthedocs.io

  5. google-research/timesfm ⭐ 7,665
    TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting.
    🔗 research.google/blog/a-decoder-only-foundation-model-for-time-series-forecasting

  6. facebookresearch/Kats ⭐ 6,278
    Kats, a kit to analyze time series data, a lightweight, easy-to-use, generalizable, and extendable framework to perform time series analysis, from understanding the key statistics and characteristics, detecting change points and anomalies, to forecasting future trends.

  7. awslabs/gluonts ⭐ 5,123
    Probabilistic time series modeling in Python
    🔗 ts.gluon.ai

  8. amazon-science/chronos-forecasting ⭐ 4,691
    Chronos: Pretrained Models for Time Series Forecasting
    🔗 arxiv.org/abs/2510.15821

  9. nixtla/statsforecast โญ 4,661
    Lightning โšก๏ธ fast forecasting with statistical and econometric models.
    ๐Ÿ”— nixtlaverse.nixtla.io/statsforecast

  10. salesforce/Merlion โญ 4,474
    Merlion: A Machine Learning Framework for Time Series Intelligence

  11. tdameritrade/stumpy โญ 4,055
    STUMPY is a powerful and scalable Python library for modern time series analysis
    ๐Ÿ”— stumpy.readthedocs.io/en/latest

  12. yuqinie98/PatchTST ⭐ 2,409
    An official implementation of PatchTST: A Time Series is Worth 64 Words: Long-term Forecasting with Transformers

  13. aistream-peelout/flow-forecast ⭐ 2,268
    Deep learning PyTorch library for time series forecasting, classification, and anomaly detection (originally for flood forecasting).
    🔗 flow-forecast.atlassian.net/wiki/spaces/ff/overview

  14. uber/orbit ⭐ 2,028
    A Python package for Bayesian forecasting with object-oriented design and probabilistic models under the hood.
    🔗 orbit-ml.readthedocs.io/en/stable

  15. time-series-foundation-models/lag-llama ⭐ 1,537
    Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting

  16. ngruver/llmtime ⭐ 819
    LLMTime, a method for zero-shot time series forecasting with large language models (LLMs) by encoding numbers as text and sampling possible extrapolations as text completions
    🔗 arxiv.org/abs/2310.07820

  17. google/temporian ⭐ 708
    Temporian is an open-source Python library for preprocessing ⚡ and feature engineering 🛠 temporal data 📈 for machine learning applications 🤖
    🔗 temporian.readthedocs.io

Typing

Typing libraries: static and run-time type checking, annotations.

  1. python/mypy ⭐ 20,144
    Optional static typing for Python
    🔗 www.mypy-lang.org

  2. astral-sh/ty ⭐ 16,731
    An extremely fast Python type checker and language server, written in Rust.
    🔗 docs.astral.sh/ty

  3. microsoft/pyright ⭐ 15,153
    Static Type Checker for Python

  4. facebook/pyre-check ⭐ 7,141
    Performant type-checking for Python.
    🔗 pyre-check.org

  5. python-attrs/attrs ⭐ 5,711
    Python Classes Without Boilerplate
    🔗 www.attrs.org

  6. facebook/pyrefly ⭐ 5,247
    A fast type checker and IDE for Python. (A new version of Pyre)
    🔗 pyrefly.org

  7. google/pytype ⭐ 5,033
    A static type analyzer for Python code
    🔗 google.github.io/pytype

  8. instagram/MonkeyType ⭐ 4,989
    A Python library that generates static type annotations by collecting runtime types

  9. python/typeshed ⭐ 4,988
    Collection of library stubs for Python, with static types

  10. koxudaxi/datamodel-code-generator ⭐ 3,718
    Python data model generator (Pydantic, dataclasses, TypedDict, msgspec) from OpenAPI, JSON Schema, GraphQL, and raw data (JSON/YAML/CSV).
    🔗 koxudaxi.github.io/datamodel-code-generator

  11. detachhead/basedpyright ⭐ 3,054
    Basedpyright is a fork of pyright with various type checking improvements, pylance features and more.
    🔗 docs.basedpyright.com

  12. mtshiba/pylyzer ⭐ 2,875
    A fast, feature-rich static code analyzer & language server for Python
    🔗 mtshiba.github.io/pylyzer

  13. microsoft/pylance-release ⭐ 1,977
    Fast, feature-rich language support for Python. Documentation and issues for Pylance.

  14. robertcraigie/pyright-python ⭐ 258
    Python command line wrapper for pyright, a static type checker
    🔗 pypi.org/project/pyright

Utility

General utility libraries: miscellaneous tools, linters, code formatters, version management, package tools, documentation tools.

  1. yt-dlp/yt-dlp ⭐ 143,790
    A feature-rich command-line audio/video downloader
    🔗 discord.gg/h5mncfw63r

  2. home-assistant/core ⭐ 84,357
    🏡 Open source home automation that puts local control and privacy first.
    🔗 www.home-assistant.io

  3. abi/screenshot-to-code ⭐ 71,484
    Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
    🔗 screenshottocode.com

  4. python/cpython ⭐ 71,201
    The Python programming language
    🔗 www.python.org

  5. localstack/localstack ⭐ 64,183
    💻 A fully functional local AWS cloud stack. Develop and test your cloud & Serverless apps offline
    🔗 localstack.cloud

  6. ggerganov/whisper.cpp ⭐ 46,061
    Port of OpenAI's Whisper model in C/C++

  7. faif/python-patterns ⭐ 42,688
    A collection of design patterns/idioms in Python

  8. mingrammer/diagrams ⭐ 41,945
    🎨 Diagram as Code for prototyping cloud system architectures
    🔗 diagrams.mingrammer.com

  9. openai/openai-python ⭐ 29,758
    The official Python library for the OpenAI API
    🔗 pypi.org/project/openai

  10. blakeblackshear/frigate ⭐ 29,705
    NVR with realtime local object detection for IP cameras
    🔗 frigate.video

  11. pydantic/pydantic ⭐ 26,549
    Data validation using Python type hints
    🔗 docs.pydantic.dev

  12. squidfunk/mkdocs-material ⭐ 25,858
    Documentation that simply works
    🔗 squidfunk.github.io/mkdocs-material

  13. keon/algorithms ⭐ 24,956
    Minimal examples of data structures and algorithms in Python

  14. norvig/pytudes ⭐ 24,251
    Python programs, usually short, of considerable difficulty, to perfect particular skills.

  15. delgan/loguru ⭐ 23,492
    Python logging made (stupidly) simple
    🔗 loguru.readthedocs.io

  16. facebookresearch/audiocraft ⭐ 22,929
    Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

  17. chriskiehl/Gooey ⭐ 22,040
    Turn (almost) any Python command line program into a full GUI application with one line

  18. rustpython/RustPython ⭐ 21,718
    A Python Interpreter written in Rust
    🔗 rustpython.github.io

  19. mkdocs/mkdocs ⭐ 21,625
    Project documentation with Markdown.
    🔗 www.mkdocs.org

  20. micropython/micropython ⭐ 21,373
    MicroPython - a lean and efficient Python implementation for microcontrollers and constrained systems
    🔗 micropython.org

  21. higherorderco/Bend ⭐ 19,144
    A massively parallel, high-level programming language
    🔗 higherorderco.com

  22. kivy/kivy ⭐ 18,828
    Open source UI framework written in Python, running on Windows, Linux, macOS, Android and iOS
    🔗 kivy.org

  23. openai/triton ⭐ 18,225
    Development repository for the Triton language and compiler
    🔗 triton-lang.org

  24. comet-ml/opik ⭐ 17,482
    Opik is an open-source platform for evaluating, testing and monitoring LLM applications.
    🔗 www.comet.com/docs/opik

  25. ipython/ipython ⭐ 16,662
    Official repository for IPython itself. Other repos in the IPython organization contain things like the website, documentation builds, etc.
    🔗 ipython.readthedocs.org

  26. alievk/avatarify-python ⭐ 16,550
    Avatars for Zoom, Skype and other video-conferencing apps.

  27. caronc/apprise ⭐ 15,648
    Apprise - Push Notifications that work with just about every platform!
    🔗 hub.docker.com/r/caronc/apprise

  28. pyo3/pyo3 ⭐ 15,202
    Rust bindings for the Python interpreter
    🔗 pyo3.rs

  29. google/brotli ⭐ 14,548
    Brotli is a generic-purpose lossless compression algorithm that compresses data using a combination of a modern variant of the LZ77 algorithm, Huffman coding and 2nd order context modeling

  30. nuitka/Nuitka ⭐ 14,417
    Nuitka is a Python compiler written in Python. It's fully compatible with Python 2.6, 2.7, 3.4-3.13. You feed it your Python app, it does a lot of clever things, and spits out an executable or extension module.
    🔗 nuitka.net

  31. zulko/moviepy ⭐ 14,268
    Video editing with Python
    🔗 zulko.github.io/moviepy

  32. pyodide/pyodide ⭐ 14,148
    Pyodide is a Python distribution for the browser and Node.js based on WebAssembly
    🔗 pyodide.org/en/stable

  33. python-pillow/Pillow ⭐ 13,332
    The Python Imaging Library adds image processing capabilities to Python (Pillow is the friendly PIL fork)
    🔗 python-pillow.github.io

  34. pytube/pytube ⭐ 13,066
    A lightweight, dependency-free Python library (and command-line utility) for downloading YouTube Videos.
    🔗 pytube.io

  35. ninja-build/ninja ⭐ 12,636
    Ninja is a small build system with a focus on speed.
    🔗 ninja-build.org

  36. asweigart/pyautogui ⭐ 12,236
    A cross-platform GUI automation Python module for human beings. Used to programmatically control the mouse & keyboard.

  37. dbader/schedule ⭐ 12,225
    Python job scheduling for humans.
    🔗 schedule.readthedocs.io

  38. secdev/scapy ⭐ 12,003
    Scapy: the Python-based interactive packet manipulation program & library.
    🔗 scapy.net

  39. magicstack/uvloop ⭐ 11,607
    Ultra fast asyncio event loop.

  40. icloud-photos-downloader/icloud_photos_downloader ⭐ 11,439
    A command-line tool to download photos from iCloud

  41. pallets/jinja ⭐ 11,404
    A very fast and expressive template engine.
    🔗 jinja.palletsprojects.com

  42. aristocratos/bpytop ⭐ 10,854
    Linux/OSX/FreeBSD resource monitor

  43. cython/cython ⭐ 10,580
    The most widely used Python to C compiler
    🔗 cython.org

  44. facebookresearch/hydra ⭐ 10,139
    Hydra is a framework for elegantly configuring complex applications
    🔗 hydra.cc

  45. py-pdf/pypdf ⭐ 9,761
    A pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
    🔗 pypdf.readthedocs.io/en/latest

  46. boto/boto3 ⭐ 9,665
    Boto3, an AWS SDK for Python
    🔗 aws.amazon.com/sdk-for-python

  47. paramiko/paramiko ⭐ 9,661
    The leading native Python SSHv2 protocol library.
    🔗 paramiko.org

  48. aws/serverless-application-model ⭐ 9,546
    The AWS Serverless Application Model (AWS SAM) transform is an AWS CloudFormation macro that transforms SAM templates into CloudFormation templates.
    🔗 aws.amazon.com/serverless/sam

  49. xonsh/xonsh ⭐ 9,186
    🐚 Python-powered shell. Full-featured and cross-platform.
    🔗 xon.sh

  50. arrow-py/arrow ⭐ 9,015
    🏹 Better dates & times for Python
    🔗 arrow.readthedocs.io

  51. googleapis/google-api-python-client ⭐ 8,678
    🐍 The official Python client library for Google's discovery based APIs.
    🔗 googleapis.github.io/google-api-python-client/docs

  52. eternnoir/pyTelegramBotAPI ⭐ 8,662
    Python Telegram Bot API.

  53. theskumar/python-dotenv ⭐ 8,619
    Reads key-value pairs from a .env file and can set them as environment variables. It helps in developing applications following the 12-factor principles.
    🔗 saurabh-kumar.com/python-dotenv

  54. kellyjonbrazil/jc ⭐ 8,510
    CLI tool and Python library that converts the output of popular command-line tools, file-types, and common strings to JSON, YAML, or Dictionaries. This allows piping of output to tools like jq and simplifying automation scripts.

  55. jasonppy/VoiceCraft ⭐ 8,456
    Zero-Shot Speech Editing and Text-to-Speech in the Wild

  56. jd/tenacity ⭐ 8,287
    Retrying library for Python
    🔗 tenacity.readthedocs.io

  57. googlecloudplatform/python-docs-samples ⭐ 7,956
    Code samples used on cloud.google.com

  58. timdettmers/bitsandbytes ⭐ 7,912
    Accessible large language models via k-bit quantization for PyTorch.
    🔗 huggingface.co/docs/bitsandbytes/main/en/index

  59. ijl/orjson ⭐ 7,827
    Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy

  60. pygithub/PyGithub ⭐ 7,647
    Typed interactions with the GitHub API v3
    🔗 pygithub.readthedocs.io

  61. sphinx-doc/sphinx ⭐ 7,627
    The Sphinx documentation generator
    🔗 www.sphinx-doc.org

  62. google/latexify_py ⭐ 7,592
    A library to generate LaTeX expression from Python code.

  63. pyca/cryptography ⭐ 7,446
    cryptography is a package designed to expose cryptographic primitives and recipes to Python developers.
    🔗 cryptography.io

  64. bndr/pipreqs ⭐ 7,417
    pipreqs - Generate pip requirements.txt file based on imports of any project. Looking for maintainers to move this project forward.

  65. agronholm/apscheduler ⭐ 7,257
    Task scheduling library for Python
    🔗 apscheduler.readthedocs.io

  66. gorakhargosh/watchdog ⭐ 7,234
    Python library and shell utilities to monitor filesystem events.
    🔗 packages.python.org/watchdog

  67. marshmallow-code/marshmallow ⭐ 7,231
    A lightweight library for converting complex objects to and from simple Python datatypes.
    🔗 marshmallow.readthedocs.io

  68. hugapi/hug ⭐ 6,906
    Embrace the APIs of the future. Hug aims to make developing APIs as simple as possible, but no simpler.

  69. pdfminer/pdfminer.six ⭐ 6,871
    Community maintained fork of pdfminer - we fathom PDF
    🔗 pdfminersix.readthedocs.io

  70. openai/point-e ⭐ 6,849
    Point cloud diffusion for 3D model synthesis

  71. traceloop/openllmetry ⭐ 6,784
    Open-source observability for your GenAI or LLM application, based on OpenTelemetry
    🔗 www.traceloop.com/openllmetry

  72. sdispater/pendulum ⭐ 6,609
    Python datetimes made easy
    🔗 pendulum.eustace.io

  73. scikit-image/scikit-image ⭐ 6,431
    Image processing in Python
    🔗 scikit-image.org

  74. pytransitions/transitions ⭐ 6,392
    A lightweight, object-oriented finite state machine implementation in Python with many extensions

  75. wireservice/csvkit ⭐ 6,328
    A suite of utilities for converting to and working with CSV, the king of tabular file formats.
    🔗 csvkit.readthedocs.io

  76. rsalmei/alive-progress ⭐ 6,221
    A new kind of Progress Bar, with real-time throughput, ETA, and very cool animations!

  77. spotify/pedalboard ⭐ 5,941
    🎛 🔊 A Python library for audio.
    🔗 spotify.github.io/pedalboard

  78. pywinauto/pywinauto ⭐ 5,863
    Windows GUI Automation with Python (based on text properties)
    🔗 pywinauto.github.io

  79. tebelorg/RPA-Python ⭐ 5,442
    Python package for doing RPA

  80. buildbot/buildbot ⭐ 5,423
    Python-based continuous integration testing framework; your pull requests are more than welcome!
    🔗 www.buildbot.net

  81. prompt-toolkit/ptpython ⭐ 5,394
    A better Python REPL

  82. pythonnet/pythonnet ⭐ 5,385
    Python for .NET is a package that gives Python programmers nearly seamless integration with the .NET Common Language Runtime (CLR) and provides a powerful application scripting tool for .NET developers.
    🔗 pythonnet.github.io

  83. pyo3/maturin ⭐ 5,317
    Build and publish crates with pyo3, cffi and uniffi bindings as well as Rust binaries as Python packages
    🔗 maturin.rs

  84. pycqa/pycodestyle ⭐ 5,149
    Simple Python style checker in one Python file
    🔗 pycodestyle.pycqa.org

  85. ashleve/lightning-hydra-template ⭐ 5,116
    PyTorch Lightning + Hydra. A very user-friendly template for ML experimentation. ⚡🔥⚡

  86. pytoolz/toolz ⭐ 5,112
    A functional standard library for Python.
    🔗 toolz.readthedocs.org

  87. bogdanp/dramatiq ⭐ 5,098
    A fast and reliable background task processing library for Python 3.
    🔗 dramatiq.io

  88. gitpython-developers/GitPython ⭐ 5,063
    GitPython is a Python library used to interact with Git repositories.
    🔗 gitpython.readthedocs.org

  89. jorgebastida/awslogs ⭐ 4,975
    AWS CloudWatch logs for Humans™

  90. ets-labs/python-dependency-injector ⭐ 4,779
    Dependency injection framework for Python
    🔗 python-dependency-injector.ets-labs.org

  91. pyinvoke/invoke ⭐ 4,695
    Pythonic task management & command execution.
    🔗 pyinvoke.org

  92. pyinfra-dev/pyinfra ⭐ 4,673
    🔧 pyinfra turns Python code into shell commands and runs them on your servers. Execute ad-hoc commands and write declarative operations. Target SSH servers, local machine and Docker containers. Fast and scales from one server to thousands.
    🔗 pyinfra.com

  93. spotify/basic-pitch ⭐ 4,609
    A lightweight yet powerful audio-to-MIDI converter with pitch bend detection
    🔗 basicpitch.io

  94. blealtan/efficient-kan ⭐ 4,566
    An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).

  95. pydantic/monty ⭐ 4,523
    A minimal, secure Python interpreter written in Rust for use by AI

  96. hynek/structlog ⭐ 4,517
    Simple, powerful, and fast logging for Python.
    🔗 www.structlog.org

  97. adafruit/circuitpython ⭐ 4,455
    CircuitPython - a Python implementation for teaching coding with microcontrollers
    🔗 circuitpython.org

  98. miguelgrinberg/python-socketio ⭐ 4,309
    Python Socket.IO server and client

  99. evhub/coconut ⭐ 4,299
    Coconut (coconut-lang.org) is a variant of Python that adds on top of Python syntax new features for simple, elegant, Pythonic functional programming.
    🔗 coconut-lang.org

  100. pydata/xarray ⭐ 4,070
    N-D labeled arrays and datasets in Python
    🔗 xarray.dev

  101. pydantic/logfire ⭐ 3,955
    AI observability platform for production LLM and agent systems.
    🔗 logfire.pydantic.dev/docs

  102. tartley/colorama ⭐ 3,765
    Simple cross-platform colored terminal text in Python

  103. camelot-dev/camelot ⭐ 3,576
    A Python library to extract tabular data from PDFs
    🔗 camelot-py.readthedocs.io

  104. jorisschellekens/borb ⭐ 3,551
    borb is a library for reading, creating and manipulating PDF files in Python.
    🔗 borbpdf.com

  105. jcrist/msgspec ⭐ 3,530
    A fast serialization and validation library, with builtin support for JSON, MessagePack, YAML, and TOML
    🔗 jcristharif.com/msgspec

  106. osohq/oso ⭐ 3,494
    Deprecated: See README

  107. pyserial/pyserial ⭐ 3,490
    Python serial port access library
    🔗 pyserial.readthedocs.io/en/latest

  108. karpathy/reader3 ⭐ 3,261
    A lightweight, self-hosted EPUB reader that lets you read through EPUB books one chapter at a time.

  109. libaudioflux/audioFlux ⭐ 3,241
    A library for audio and music analysis, feature extraction.
    🔗 audioflux.top

  110. rhettbull/osxphotos ⭐ 3,230
    Python app to work with pictures and associated metadata from Apple Photos on macOS. Also includes a package to provide programmatic access to the Photos library, pictures, and metadata.

  111. cdgriffith/Box ⭐ 2,815
    Python dictionaries with advanced dot notation access
    🔗 github.com/cdgriffith/box/wiki

  112. whylabs/whylogs ⭐ 2,788
    An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collection, ensuring safety & robustness. 📈
    🔗 whylogs.readthedocs.io

  113. liiight/notifiers ⭐ 2,728
    The easy way to send notifications
    🔗 notifiers.readthedocs.io

  114. litl/backoff ⭐ 2,702
    Python library providing function decorators for configurable backoff and retry

  115. anthropics/anthropic-sdk-python ⭐ 2,651
    SDK providing access to Anthropic's safety-first language model APIs

  116. dosisod/refurb ⭐ 2,519
    A tool for refurbishing and modernizing Python codebases

  117. pyston/pyston ⭐ 2,509
    (No longer maintained) A faster and highly-compatible implementation of the Python programming language.
    🔗 www.pyston.org

  118. astanin/python-tabulate ⭐ 2,508
    Pretty-print tabular data in Python, a library and a command-line utility. Repository migrated from bitbucket.org/astanin/python-tabulate.
    🔗 pypi.org/project/tabulate

  119. omry/omegaconf ⭐ 2,330
    Flexible Python configuration system. The last one you will ever need.

  120. ariebovenberg/whenever ⭐ 2,285
    ⏰ Modern datetime library for Python
    🔗 whenever.rtfd.io

  121. open-telemetry/opentelemetry-python ⭐ 2,285
    OpenTelemetry Python API and SDK
    🔗 opentelemetry.io

  122. p0dalirius/Coercer ⭐ 2,164
    A Python script to automatically coerce a Windows server to authenticate on an arbitrary machine through 12 methods.
    🔗 podalirius.net

  123. pygments/pygments ⭐ 2,098
    Pygments is a generic syntax highlighter written in Python
    🔗 pygments.org

  124. mkdocstrings/mkdocstrings ⭐ 2,043
    📘 Automatic documentation from sources, for MkDocs.
    🔗 mkdocstrings.github.io

  125. karpathy/rendergit ⭐ 2,011
    Render any git repo into a single static HTML page for humans or LLMs

  126. chrishayuk/mcp-cli ⭐ 1,844
    A protocol-level CLI designed to interact with a Model Context Protocol server. The client allows users to send commands, query data, and interact with various resources provided by the server.

  127. extensityai/symbolicai ⭐ 1,659
    Compositional Differentiable Programming Library - divide-and-conquer approach to break down a complex problem into smaller, more manageable problems.

  128. pypy/pypy ⭐ 1,635
    PyPy is a very fast and compliant implementation of the Python language.
    🔗 pypy.org

  129. lcompilers/lpython ⭐ 1,627
    Python compiler
    🔗 lpython.org

  130. juanbindez/pytubefix ⭐ 1,445
    Python 3 library for downloading YouTube Videos.
    🔗 pytubefix.readthedocs.io

  131. daveebbelaar/python-whatsapp-bot ⭐ 1,438
    This guide will walk you through the process of creating a WhatsApp bot using the Meta (formerly Facebook) Cloud API with pure Python and Flask
    🔗 www.datalumina.com

  132. pydantic/pydantic-settings ⭐ 1,227
    Settings management using pydantic
    🔗 docs.pydantic.dev/latest/usage/pydantic_settings

  133. barracuda-fsh/pyobd ⭐ 1,140
    An OBD-II compliant car diagnostic tool

  134. modal-labs/modal-examples ⭐ 1,087
    Examples of programs built using Modal
    🔗 modal.com/docs

  135. lastmile-ai/aiconfig ⭐ 1,077
    AIConfig saves prompts, models and model parameters as source control friendly configs. This allows you to iterate on prompts and model parameters separately from your application code.
    🔗 aiconfig.lastmileai.dev

  136. tavily-ai/tavily-python ⭐ 981
    The Tavily Python wrapper allows for easy interaction with the Tavily API, offering the full range of our search and extract functionalities directly from your Python programs.
    🔗 pypi.org/project/tavily-python

  137. tox-dev/filelock ⭐ 926
    A platform independent file lock in Python, which provides a simple way of inter-process communication
    🔗 py-filelock.readthedocs.io

  138. secretiveshell/MCP-Bridge ⭐ 895
    A middleware to provide an openAI compatible endpoint that can call MCP tools

  139. google/pyglove ⭐ 708
    Manipulating Python Programs

  140. neuml/annotateai ⭐ 400
    Automatically annotates papers using Large Language Models (LLMs)

Visualisation

Visualisation tools and libraries: application frameworks, 2D/3D plotting, dashboards, WebGL.

  1. apache/superset ⭐ 70,248
    Apache Superset is a Data Visualization and Data Exploration Platform
    🔗 superset.apache.org

  2. streamlit/streamlit ⭐ 43,180
    Streamlit - A faster way to build and share data apps.
    🔗 streamlit.io

  3. gradio-app/gradio ⭐ 41,415
    Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
    🔗 www.gradio.app

  4. danny-avila/LibreChat ⭐ 33,310
    LibreChat is a free, open source AI chat platform. This Web UI offers vast customization, supporting numerous AI providers, services, and integrations.
    🔗 librechat.ai

  5. plotly/dash ⭐ 24,412
    Data Apps & Dashboards for Python. No JavaScript Required.
    🔗 plotly.com/dash

  6. matplotlib/matplotlib ⭐ 22,262
    matplotlib: plotting with Python
    🔗 matplotlib.org/stable

  7. bokeh/bokeh ⭐ 20,307
    Interactive Data Visualization in the browser, from Python
    🔗 bokeh.org

  8. plotly/plotly.py ⭐ 18,208
    The interactive graphing library for Python ✨
    🔗 plotly.com/python

  9. microsoft/data-formulator ⭐ 14,772
    Transform data and create rich visualizations iteratively with AI
    🔗 arxiv.org/abs/2408.16119

  10. visgl/deck.gl ⭐ 13,775
    WebGL2 powered visualization framework
    🔗 deck.gl

  11. mwaskom/seaborn ⭐ 13,696
    Statistical data visualization in Python
    🔗 seaborn.pydata.org

  12. nvidia/TensorRT-LLM ⭐ 12,718
    TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performa...
    🔗 nvidia.github.io/tensorrt-llm

  13. marceloprates/prettymaps ⭐ 12,109
    Draw pretty maps from OpenStreetMap data! Built with osmnx + matplotlib + shapely
    🔗 prettymaps.streamlit.app

  14. altair-viz/altair ⭐ 10,222
    Declarative visualization library for Python
    🔗 altair-viz.github.io

  15. renpy/renpy ⭐ 6,169
    The Ren'Py Visual Novel Engine
    🔗 www.renpy.org

  16. holoviz/panel ⭐ 5,577
    Panel: The powerful data exploration & web app framework for Python
    🔗 panel.holoviz.org

  17. lux-org/lux ⭐ 5,369
    Automatically visualize your pandas dataframe via a single print! 📊 💡

  18. man-group/dtale ⭐ 5,045
    Visualizer for pandas data structures
    🔗 alphatechadmin.pythonanywhere.com

  19. has2k1/plotnine ⭐ 4,490
    A Grammar of Graphics for Python
    🔗 plotnine.org

  20. pyqtgraph/pyqtgraph ⭐ 4,282
    Fast data visualization and GUI tools for scientific / engineering applications
    🔗 www.pyqtgraph.org

  21. residentmario/missingno ⭐ 4,183
    missingno provides a small toolset of flexible and easy-to-use missing data visualizations and utilities that allows you to get a quick visual summary of the completeness (or lack thereof) of your dataset.

  22. mckinsey/vizro ⭐ 3,561
    Vizro is a low-code toolkit for building high-quality data visualization apps.
    🔗 vizro.readthedocs.io/en/stable

  23. pyvista/pyvista ⭐ 3,483
    3D plotting and mesh analysis through a streamlined interface for the Visualization Toolkit (VTK)
    🔗 docs.pyvista.org

  24. ml-tooling/opyrator ⭐ 3,135
    🪄 Turns your machine learning code into microservices with web API, interactive GUI, and more.
    🔗 opyrator-playground.mltooling.org

  25. netflix/flamescope ⭐ 3,097
    FlameScope is a visualization tool for exploring different time ranges as Flame Graphs.

  26. facebookresearch/hiplot ⭐ 2,800
    HiPlot makes understanding high dimensional data easy
    🔗 facebookresearch.github.io/hiplot

  27. napari/napari ⭐ 2,584
    A fast, interactive, multi-dimensional image viewer for Python. It's designed for browsing, annotating, and analyzing large multi-dimensional images.
    🔗 napari.org

  28. holoviz/holoviz ⭐ 906
    High-level tools to simplify visualization in Python.
    🔗 holoviz.org

  29. hazyresearch/meerkat ⭐ 852
    Explore and understand your training and validation data.

  30. anvaka/word2vec-graph ⭐ 711
    Exploring word2vec embeddings as a graph of nearest neighbors
    🔗 anvaka.github.io/pm/#/galaxy/word2vec-wiki?cx=-4651&cy=4492&cz=-1988&lx=-0.0915&ly=-0.9746&lz=-0.2030&lw=0.0237&ml=300&s=1.75&l=1&v=d50_clean_small

Web

Web related frameworks and libraries: webapp servers, WSGI, ASGI, asyncio, HTTP, REST, user management.

  1. tiangolo/fastapi ⭐ 94,400
    FastAPI framework, high performance, easy to learn, fast to code, ready for production
    🔗 fastapi.tiangolo.com

  2. django/django ⭐ 86,553
    The Web framework for perfectionists with deadlines.
    🔗 www.djangoproject.com

  3. sherlock-project/sherlock ⭐ 72,113
    Hunt down social media accounts by username across social networks
    🔗 sherlockproject.xyz

  4. pallets/flask ⭐ 71,080
    The Python micro framework for building web applications.
    🔗 flask.palletsprojects.com

  5. psf/requests ⭐ 53,675
    A simple, yet elegant, HTTP library.
    🔗 requests.readthedocs.io/en/latest

  6. reflex-dev/reflex ⭐ 28,008
    🕸️ Web apps in pure Python 🐍
    🔗 reflex.dev

  7. tornadoweb/tornado ⭐ 22,432
    Tornado is a Python web framework and asynchronous networking library, originally developed at FriendFeed.
    🔗 www.tornadoweb.org

  8. vincigit00/Scrapegraph-ai ⭐ 22,365
    ScrapeGraphAI is a web scraping Python library that uses LLMs and direct graph logic to create scraping pipelines for websites and local documents
    🔗 scrapegraphai.com

  9. wagtail/wagtail ⭐ 20,076
    A Django content management system focused on flexibility and user experience
    🔗 wagtail.org

  10. pyscript/pyscript ⭐ 18,691
    A framework that allows users to create rich Python applications in the browser using HTML's interface and the power of Pyodide, WASM, and modern web technologies.
    🔗 pyscript.net

  11. huge-success/sanic ⭐ 18,630
    Accelerate your web app development | Build fast. Run fast.
    🔗 sanic.dev

  12. aio-libs/aiohttp ⭐ 16,222
    Asynchronous HTTP client/server framework for asyncio and Python
    🔗 docs.aiohttp.org

  13. flet-dev/flet ⭐ 15,387
    Flet enables developers to easily build realtime web, mobile and desktop apps in Python. No frontend experience required.
    🔗 flet.dev

  14. zauberzeug/nicegui ⭐ 15,172
    Create web-based user interfaces with Python. The nice way.
    🔗 nicegui.io

  15. encode/httpx ⭐ 14,932
    A next generation HTTP client for Python. 🦋
    🔗 www.python-httpx.org

  16. getpelican/pelican ⭐ 13,195
    Static site generator that supports Markdown and reST syntax. Powered by Python.
    🔗 getpelican.com

  17. encode/starlette ⭐ 11,863
    The little ASGI framework that shines. 🌟
    🔗 starlette.dev

  18. aws/chalice ⭐ 11,053
    Python Serverless Microframework for AWS

  19. benoitc/gunicorn ⭐ 10,404
    gunicorn 'Green Unicorn' is a WSGI HTTP Server for UNIX, fast clients and sleepy applications.
    🔗 www.gunicorn.org

  20. encode/uvicorn ⭐ 10,325
    An ASGI web server, for Python. 🦄
    🔗 uvicorn.dev

  21. falconry/falcon ⭐ 9,780
    The no-magic web API and microservices framework for Python developers, with a focus on reliability and performance at scale.
    🔗 falcon.readthedocs.io

  22. vitalik/django-ninja ⭐ 8,856
    💨 Fast, Async-ready, OpenAPI, type hints based framework for building APIs
    🔗 django-ninja.dev

  23. bottlepy/bottle ⭐ 8,734
    bottle.py is a fast and simple micro-framework for Python web applications.
    🔗 bottlepy.org

  24. graphql-python/graphene ⭐ 8,244
    GraphQL framework for Python
    🔗 graphene-python.org

  25. reactive-python/reactpy ⭐ 8,154
    ReactPy is a library for building user interfaces in Python without JavaScript
    🔗 reactpy.dev

  26. starlite-api/litestar ⭐ 7,934
    Light, flexible and extensible ASGI framework | Built to scale
    🔗 docs.litestar.dev

  27. pallets/werkzeug โญ 6,830
    The comprehensive WSGI web application library.
    ๐Ÿ”— werkzeug.palletsprojects.com

  28. pyeve/eve โญ 6,744
    REST API framework designed for human beings
    ๐Ÿ”— python-eve.org

  29. fastapi-users/fastapi-users โญ 5,948
    Ready-to-use and customizable users management for FastAPI
    ๐Ÿ”— fastapi-users.github.io/fastapi-users

  30. webpy/webpy โญ 5,933
    web.py is a web framework for python that is as simple as it is powerful.
    ๐Ÿ”— webpy.org

  31. pywebio/PyWebIO ⭐ 4,822
    Write interactive web apps in a script-like way.
    🔗 pywebio.readthedocs.io

  32. nameko/nameko ⭐ 4,765
    A microservices framework for Python that lets service developers concentrate on application logic and encourages testability.
    🔗 www.nameko.io

  33. strawberry-graphql/strawberry ⭐ 4,590
    A GraphQL library for Python that leverages type annotations 🍓
    🔗 strawberry.rocks

  34. freddyaboulton/fastrtc ⭐ 4,500
    Turn any Python function into a real-time audio and video stream over WebRTC or WebSockets.
    🔗 fastrtc.org

  35. h2oai/wave ⭐ 4,223
    H2O Wave is a software stack for building beautiful, low-latency, realtime, browser-based applications and dashboards entirely in Python/R without using HTML, JavaScript, or CSS.
    🔗 wave.h2o.ai

  36. fastapi-admin/fastapi-admin ⭐ 3,684
    A fast admin dashboard based on FastAPI and Tortoise ORM with the Tabler UI, inspired by Django admin
    🔗 fastapi-admin-docs.long2ice.io

  37. pallets/quart ⭐ 3,577
    An async Python micro framework for building web applications.
    🔗 quart.palletsprojects.com

  38. s3rius/FastAPI-template ⭐ 2,716
    Feature-rich, robust FastAPI template.

  39. flipkart-incubator/Astra ⭐ 2,629
    Automated security testing for REST APIs

  40. dot-agent/nextpy ⭐ 2,335
    🤖 Self-Modifying Framework from the Future 🔮 World's First AMS
    🔗 dotagent.ai

  41. neoteroi/BlackSheep ⭐ 2,302
    Fast ASGI web framework for Python
    🔗 www.neoteroi.dev/blacksheep

  42. dmontagu/fastapi-utils ⭐ 2,297
    Reusable utilities for FastAPI: a number of utilities to help reduce boilerplate and reuse common functionality across projects
    🔗 fastapiutils.github.io/fastapi-utils

  43. python-restx/flask-restx ⭐ 2,240
    Fork of Flask-RESTPlus: fully featured framework for fast, easy and documented API development with Flask
    🔗 flask-restx.readthedocs.io/en/latest

  44. jordaneremieff/mangum ⭐ 2,059
    An adapter for running ASGI applications in AWS Lambda to handle Function URL, API Gateway, ALB, and Lambda@Edge events
    🔗 mangum.fastapiexpert.com

  45. long2ice/fastapi-cache ⭐ 1,812
    fastapi-cache is a tool for caching FastAPI responses and function results, with support for Redis and Memcached backends.
    🔗 github.com/long2ice/fastapi-cache

  46. rstudio/py-shiny ⭐ 1,678
    Shiny for Python
    🔗 shiny.posit.co/py

  47. awtkns/fastapi-crudrouter ⭐ 1,676
    A dynamic FastAPI router that automatically creates CRUD routes for your models
    🔗 fastapi-crudrouter.awtkns.com

  48. whitphx/stlite ⭐ 1,583
    A port of Streamlit to WebAssembly, powered by Pyodide.
    🔗 edit.share.stlite.net


Interactive version: www.awesomepython.org, Hugging Face Dataset: awesome-python

Please raise a new issue to suggest a Python repo that you would like to see added.

1,553 hand-picked awesome Python libraries and frameworks, updated 11 Feb 2026