I build systems infrastructure for AI at scale. Background in distributed databases, storage engines, and disaggregated memory. Currently focused on LLM inference optimization and the infrastructure layer for AI agents.
CS grad @ USTC. Open-source contributor to RocksDB, Mooncake, Apache Kvrocks, ColossalAI.
KV cache scheduling, memory-efficient serving, and disaggregated inference architectures.
Inference optimization for video diffusion models. Making generation faster and more memory-efficient.
Memory systems, retrieval acceleration, and orchestration primitives for autonomous AI agents.
Inference pipelines for multi-modal reasoning. Vision-language-code integration at the systems level.
CXL, PMEM, and RDMA-based memory architectures. Far-memory management for data-intensive workloads.
LSM-tree optimization, KV store performance, and transaction processing on modern hardware.
I'd rather build a system that solves a class of problems than manually solve each instance. Automation, good abstractions, and AI-augmented workflows compound over time.
A paper matters when it changes how a system works in production. I'm interested in closing the gap between what's published and what's deployed.
The best infrastructure is invisible. Clean interfaces, minimal abstractions, graceful degradation. Complexity should live in the problem, not in the solution.
Coming soon — notes on systems, inference, and building.