2026
an archive of posts from this year
| Date | Post |
|---|---|
| Apr 18, 2026 | CuTe DSL - Notes |
| Apr 18, 2026 | CUTLASS WGMMA on Hopper - Notes |
| Apr 14, 2026 | Investigating Flaky `test_eagle_dp` — Batch Invariance Failure on L4 GPUs |
| Mar 29, 2026 | GEMM Kernel Optimization Notes |
| Mar 25, 2026 | SiLU+Mul+FP8 Block Quant Pattern Matching Pipeline - vLLM Notes |
| Mar 25, 2026 | Fused SiLU+Mul+FP8 Block Quantization CUDA Kernel - vLLM Notes |
| Mar 10, 2026 | Anatomy of a Spark Job Run |
| Feb 13, 2026 | Transformer Block FLOPs & Parameters Calculations |
| Jan 26, 2026 | Distributed Systems - Lecture 1 |
| Jan 24, 2026 | LLMR - Lecture 1 |