[DAC'25] Official implementation of "HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference"
HybriMoE is a system designed to improve the efficiency of Mixture-of-Experts (MoE) inference by optimizing CPU-GPU scheduling and cache management. It introduces hybrid scheduling to balance workloads, impact-driven prefetching to prioritize expert loading, and MoE-specialized caching to reduce cache misses. This project is intended for researchers and developers working on large-scale MoE models, particularly those seeking to optimize inference performance and resource utilization.
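To make the caching idea concrete, below is a minimal, illustrative Python sketch of impact-driven cache management, where a router-derived "impact" score decides which experts are prefetched to the GPU and which are evicted when the cache is full. The class name, the scoring rule, and the `load_fn` callback are assumptions for illustration only, not HybriMoE's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass
class ExpertCache:
    """Toy GPU-side expert cache keyed by an impact score (illustrative only)."""
    capacity: int                                              # max experts resident on GPU
    resident: Dict[int, float] = field(default_factory=dict)   # expert_id -> impact score

    def prefetch(self, expert_id: int, impact: float,
                 load_fn: Callable[[int], None]) -> None:
        """Load a high-impact expert, evicting the lowest-impact one if full."""
        if expert_id in self.resident:
            # Already resident: just refresh its impact score.
            self.resident[expert_id] = max(self.resident[expert_id], impact)
            return
        if len(self.resident) >= self.capacity:
            victim = min(self.resident, key=self.resident.__getitem__)
            if self.resident[victim] >= impact:
                return  # candidate is no more impactful than any resident expert
            del self.resident[victim]    # evict (real code would free GPU weights)
        load_fn(expert_id)               # e.g., copy expert weights CPU -> GPU
        self.resident[expert_id] = impact

# Example: cache holds 2 experts; the third prefetch evicts the lowest-impact one.
cache = ExpertCache(capacity=2)
load = lambda e: print(f"loading expert {e} to GPU")
cache.prefetch(0, impact=0.2, load_fn=load)
cache.prefetch(1, impact=0.8, load_fn=load)
cache.prefetch(2, impact=0.5, load_fn=load)  # evicts expert 0 (impact 0.2)
```

In a real hybrid CPU-GPU setting, the impact scores would come from the router's gate weights and the eviction policy would also account for transfer cost and expected reuse; this sketch only shows the prioritization skeleton.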