kernelbench.com

Agentic GPU kernel benchmark results

Design principles

  • Percent of theoretical maximum, not speedup over PyTorch. Scores are grounded in hardware ceilings instead of baseline quirks.
  • Modern coding-agent harnesses. Runs use Claude Code, Codex CLI, Cursor, Gemini CLI, Kimi CLI, OpenCode, Grok, and MiniMax, each where it is the natural interface.
  • Public transcript viewers. Browse the run index or open a scored Hard run to inspect tool calls, solution files, checks, timing, and costs.
  • Judge-assisted audit. A judge model helps flag reward hacking, rubric leaks, and suspicious shortcuts for human review.
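
As a rough illustration of the first principle, the sketch below contrasts a percent-of-peak score with a speedup-over-baseline score. It is not the site's scoring code: the function names, the workload size, the timings, and the 1e15 FLOP/s ceiling are all hypothetical values chosen for the example.

```python
def percent_of_peak(work: float, seconds: float, peak_rate: float) -> float:
    """Achieved throughput as a percentage of a hardware ceiling.

    work      -- operations (or bytes) the kernel must process
    seconds   -- measured kernel time
    peak_rate -- theoretical maximum in the same unit per second
    """
    if seconds <= 0 or peak_rate <= 0:
        raise ValueError("seconds and peak_rate must be positive")
    return 100.0 * (work / seconds) / peak_rate


def speedup(baseline_seconds: float, seconds: float) -> float:
    """Classic speedup metric: how many times faster than a baseline."""
    if baseline_seconds <= 0 or seconds <= 0:
        raise ValueError("timings must be positive")
    return baseline_seconds / seconds


# Hypothetical kernel: 2e12 FLOPs in 4 ms on a device assumed to peak at 1e15 FLOP/s.
WORK_FLOPS = 2e12
KERNEL_SECONDS = 0.004
PEAK_FLOPS_PER_SECOND = 1e15

score = percent_of_peak(WORK_FLOPS, KERNEL_SECONDS, PEAK_FLOPS_PER_SECOND)

# The same kernel measured against two hypothetical baselines:
# one well tuned (8 ms) and one poorly tuned (40 ms).
speedup_vs_tuned = speedup(0.008, KERNEL_SECONDS)
speedup_vs_untuned = speedup(0.040, KERNEL_SECONDS)

print(f"percent of peak: {score:.1f}%")
print(f"speedup vs tuned baseline: {speedup_vs_tuned:.1f}x")
print(f"speedup vs untuned baseline: {speedup_vs_untuned:.1f}x")
```

The speedup figure changes with the quality of the baseline even though the kernel itself is unchanged, while the percent-of-peak figure depends only on the kernel and the assumed hardware ceiling.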

Citation

Cite this benchmark suite

If you use these results, cite the website for the public benchmark view, the relevant benchmark repository for problem definitions and harness code, and the Hugging Face dataset for the run transcripts. This is a results and artifact site, not Stanford's original KernelBench benchmark.

@misc{arledge2026kernelbenchcom,
  title        = {kernelbench.com: Agentic GPU Kernel Benchmark Results and Run Artifacts},
  author       = {Arledge, Elliot},
  year         = {2026},
  howpublished = {\url{https://kernelbench.com}},
  note         = {Website, benchmark results, transcript viewers, and citation index}
}

@misc{arledge2026hard,
  title        = {Hard: Agentic CUDA Kernel Result Suite},
  author       = {Arledge, Elliot},
  year         = {2026},
  howpublished = {\url{https://github.com/Infatoshi/KernelBench-Hard}},
  note         = {CUDA benchmark suite, harness, results, and annotations}
}

@misc{arledge2026v3,
  title        = {v3: Multi-GPU Agentic Kernel Result Suite},
  author       = {Arledge, Elliot},
  year         = {2026},
  howpublished = {\url{https://github.com/Infatoshi/KernelBench-v3}},
  note         = {Multi-GPU benchmark suite, harness, and result artifacts}
}

@misc{arledge2026hardruns,
  title        = {KernelBench-Hard Run Artifacts},
  author       = {Arledge, Elliot},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/datasets/Infatoshi/kernelbench-hard-runs}},
  note         = {Run transcripts, solutions, checks, timing, and cost metadata}
}

@misc{arledge2026v3runs,
  title        = {KernelBench-v3 Run Artifacts},
  author       = {Arledge, Elliot},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/datasets/Infatoshi/kernelbench-v3-runs}},
  note         = {Run artifacts and benchmark result data}
}

Contact

Open to inquiries: collaborations, model evals, custom benchmark builds, kernel-engineering consulting, anything kernel-adjacent.

Reach out: infatoshi@gmail.com