BENCHMARKING LLM CODE FOR TIME-COMPLEXITY COMPLIANCE (C3E)
A JCST 'Just Accepted' paper introduces Complexity-Constraint Code Evaluation (C3E), a benchmark to check whether LLM-generated code meets stated time-complexity constraints. For teams using AI to write algorithms, this offers a way to catch solutions that pass functional tests but violate performance budgets.
LLM-generated code can be functionally correct yet asymptotically inefficient, driving up cost and latency in production.
A standard benchmark helps compare models and enforce performance guardrails in review and CI.
- Add empirical complexity checks (vary input sizes, fit complexity curves) to CI for AI-generated code paths.
- Evaluate your preferred LLMs on tasks with explicit complexity targets to select defaults and set guardrails.
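The empirical check described above can be sketched as a log-log fit: time a function at several input sizes, then estimate the polynomial exponent k in T(n) ≈ c·n^k from the slope of log(time) against log(n). This is a minimal illustration, not the paper's methodology; `estimate_exponent` and `all_pairs_sum` are hypothetical names.

```python
import math
import timeit

def estimate_exponent(fn, make_input, sizes, repeats=3):
    """Fit a least-squares line to log(time) vs log(n); the slope
    approximates the polynomial exponent k in T(n) ~ c * n^k."""
    xs, ys = [], []
    for n in sizes:
        data = make_input(n)
        # take the minimum over repeats to reduce scheduling noise
        best = min(timeit.repeat(lambda: fn(data), number=1, repeat=repeats))
        xs.append(math.log(n))
        ys.append(math.log(best))
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

# Example: a quadratic all-pairs scan should fit an exponent near 2.
def all_pairs_sum(data):
    return sum(x + y for x in data for y in data)

k = estimate_exponent(all_pairs_sum, lambda n: list(range(n)),
                      sizes=[200, 400, 800, 1600])
```

In CI, a gate could assert that the fitted exponent stays below a target (e.g. fail if k exceeds 1.5 for a path budgeted at O(n log n)), with generous tolerance to absorb timer noise.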
Legacy codebase integration strategies...
01. Retrofit performance budgets into existing tests for hotspots and fail PRs when complexity regressions appear.
02. Expect flaky timings; stabilize by pinning hardware, warming caches, and using statistical thresholds.
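One way to sidestep flaky wall-clock timings entirely is to count operations deterministically and budget on the count. The sketch below (all names hypothetical, not from the paper) counts comparisons in a sort via a wrapper class and fails a quadratic implementation against an O(n log n) budget:

```python
import math
import random

class CountedInt:
    """Wraps a value and counts `<` comparisons, so the budget check
    is deterministic rather than timing-dependent."""
    comparisons = 0
    def __init__(self, v):
        self.v = v
    def __lt__(self, other):
        CountedInt.comparisons += 1
        return self.v < other.v

def comparison_count(sort_fn, n, seed=0):
    rng = random.Random(seed)          # fixed seed: reproducible input
    data = [CountedInt(rng.random()) for _ in range(n)]
    CountedInt.comparisons = 0
    sort_fn(data)
    return CountedInt.comparisons

def insertion_sort(xs):                # deliberately O(n^2) comparisons
    for i in range(1, len(xs)):
        j = i
        while j > 0 and xs[j] < xs[j - 1]:
            xs[j], xs[j - 1] = xs[j - 1], xs[j]
            j -= 1
    return xs

n = 2048
budget = 3 * n * math.log2(n)          # generous O(n log n) budget
assert comparison_count(sorted, n) <= budget         # within budget
assert comparison_count(insertion_sort, n) > budget  # quadratic: fails
```

A PR gate built this way flags complexity regressions without pinned hardware, since the count depends only on the algorithm and the seeded input.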
Fresh architecture paradigms...
01. Define algorithmic specs with explicit complexity requirements and auto-generate harnesses to verify them.
02. Prefer coding patterns and libraries with predictable complexity and document them in generation prompts.
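A spec-plus-harness setup like point 01 might look like the following sketch: a spec pairs a task with an explicit step budget, and a generated harness rejects any candidate that exceeds it. `AlgorithmSpec`, `make_harness`, and the candidates are illustrative assumptions, not an API from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class AlgorithmSpec:
    """An algorithmic spec with an explicit complexity requirement."""
    name: str
    max_steps: Callable[[int], float]       # budget as a function of n
    make_input: Callable[[int], List[int]]

def make_harness(spec, sizes):
    """Auto-generate a verifier: candidates return (result, step_count)
    and fail if any input size blows the budget."""
    def harness(candidate):
        for n in sizes:
            _, steps = candidate(spec.make_input(n))
            if steps > spec.max_steps(n):
                return False, f"{spec.name}: {steps} steps at n={n} over budget"
        return True, "within budget"
    return harness

# Candidates instrument their own step counts.
def linear_max(xs):                         # O(n): one pass
    steps, best = 0, xs[0]
    for x in xs[1:]:
        steps += 1
        if x > best:
            best = x
    return best, steps

def quadratic_max(xs):                      # O(n^2): all-pairs check
    steps = 0
    for x in xs:
        beaten = False
        for y in xs:
            steps += 1
            if y > x:
                beaten = True
        if not beaten:
            return x, steps
    return xs[-1], steps

spec = AlgorithmSpec("max in O(n)", max_steps=lambda n: 4 * n,
                     make_input=lambda n: list(range(n)))
check = make_harness(spec, sizes=[256, 1024])
```

The same spec objects could also be embedded in generation prompts, so the complexity requirement travels with the task rather than living only in the test suite.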