huangzhenhua111

Popular repositories

mllm Public

Forked from UbiquitousLearning/mllm

Reproducible edge LLM profiling/benchmark toolkit (KV/attention memory + prefill/decode breakdown) to pinpoint NPU bottlenecks, plus a minimal graph-capture export for v2/static-graph IR design & v…

C++

d9000_llm_policy_diag Public

Topology-aware bottleneck diagnosis for mobile LLM inference on Dimensity 9000

C++