
Starred repositories
A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research, and we are continuously improving the project. PRs adding works are welcome (p…
Done in a safe environment for educational purposes :)
Visualizing various metrics collected from various cryptographies
A generative world for general-purpose robotics & embodied AI learning.
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
A high-throughput and memory-efficient inference and serving engine for LLMs
Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥
On-device AI across mobile, embedded and edge for PyTorch
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
A curated list of neural network pruning resources.
ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python.
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
OpenBao exists to provide a software solution to manage, store, and distribute sensitive data including secrets, certificates, and keys.
Monocle is a framework for tracing GenAI app code. This repo contains implementation of Monocle for GenAI apps written in Python.
GenAI components at micro-service level; GenAI service composer to create mega-service
A scikit-learn compatible neural network library that wraps PyTorch