I build systems infrastructure for AI at scale. Background in distributed databases, storage engines, and disaggregated memory. Currently focused on LLM inference optimization and the infrastructure layer for AI agents.
CS grad @ USTC. Open-source contributor to RocksDB, Mooncake, Apache Kvrocks, ColossalAI.
KV cache scheduling, memory-efficient serving, and disaggregated inference architectures.
Inference optimization for video diffusion models. Making generation faster and more memory-efficient.
Memory systems, retrieval acceleration, and orchestration primitives for autonomous AI agents.
Inference pipelines for multi-modal reasoning. Vision-language-code integration at the systems level.
CXL, PMEM, and RDMA-based memory architectures. Far-memory management for data-intensive workloads.
LSM-tree optimization, KV store performance, and transaction processing on modern hardware.
I'd rather build a system that solves a class of problems than manually solve each instance. Automation, good abstractions, and AI-augmented workflows compound over time.
A paper matters when it changes how a system works in production. I'm interested in closing the gap between what's published and what's deployed.
The best infrastructure is invisible. Clean interfaces, minimal abstractions, graceful degradation. Complexity should live in the problem, not in the solution.
Coming soon — notes on systems, inference, and building.