A flexible framework for experiencing cutting-edge LLM inference optimizations
KTransformers is a flexible Python framework designed to enhance the Hugging Face Transformers experience by providing advanced kernel optimizations and placement/parallelism strategies for LLM inference. It offers a user-friendly injection system that allows researchers to easily replace original modules with optimized variants, supporting features like CPU/GPU offloading of quantized models and integration with tools like Llamafile and Marlin.
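To make the injection idea concrete, here is a minimal sketch of pattern-based module replacement. It is not the KTransformers API: the `Module`, `Linear`, and `OptimizedLinear` classes and the `inject` helper are hypothetical stand-ins written in plain Python (no PyTorch), assuming only that modules form a named tree and that a replacement can be built from the module it replaces.

```python
import re


class Module:
    """Hypothetical stand-in for a model module that holds named children."""

    def __init__(self, **children):
        self.children = dict(children)

    def forward(self, x):
        # Run the children in insertion order.
        for child in self.children.values():
            x = child.forward(x)
        return x


class Linear(Module):
    """Toy leaf module: multiplies its input by a fixed scale."""

    def __init__(self, scale):
        super().__init__()
        self.scale = scale

    def forward(self, x):
        return x * self.scale


class OptimizedLinear(Linear):
    """Hypothetical optimized variant that computes the same result."""

    @classmethod
    def from_original(cls, original):
        # Build the replacement from the module being replaced.
        return cls(original.scale)


def inject(root, pattern, target_cls, factory, prefix=""):
    """Replace each target_cls module whose dotted name fully matches pattern.

    Returns the dotted names of the replaced modules.
    """
    replaced = []
    for name, child in list(root.children.items()):
        full_name = f"{prefix}.{name}" if prefix else name
        if isinstance(child, target_cls) and re.fullmatch(pattern, full_name):
            root.children[name] = factory(child)
            replaced.append(full_name)
        else:
            # Not a match: keep searching below this module.
            replaced += inject(child, pattern, target_cls, factory, full_name)
    return replaced


model = Module(
    layer0=Module(attn=Linear(2), mlp=Linear(3)),
    layer1=Module(attn=Linear(5), mlp=Linear(7)),
)

before = model.forward(1)
replaced = inject(model, r"layer\d+\.mlp", Linear, OptimizedLinear.from_original)
after = model.forward(1)
```

In this sketch only the modules named by the pattern are swapped, and the model's output is unchanged, which is the property a drop-in optimized variant is expected to preserve.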
How the donated funds are distributed
Kivach runs on the Obyte network, so every donation can be tracked.