C3E: BENCHMARKING TIME-COMPLEXITY COMPLIANCE IN LLM-GENERATED CODE
JCST has a just-accepted paper proposing C3E, a benchmark to check whether LLM-generated code meets specified time-complexity constraints, not just functional correctness. This gives teams a way to detect algorithmic regressions when using AI coding assistants, especially for performance-sensitive backends and data pipelines.
- Prevents hidden Big-O regressions when AI-generated code replaces optimized routines.
- Enables standardized comparison of models and prompt patterns against performance budgets.
Add C3E-style tasks to your internal LLM eval harness with explicit complexity targets and verify outputs against input sizes.
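One way to encode such a task in an internal harness is a record that pairs the prompt with an explicit complexity target and the input sizes used to probe scaling. A minimal sketch; the field names and the example task are illustrative, not C3E's actual schema:

```python
# Illustrative task entry for an internal eval harness.
# Field names are assumptions, not C3E's published format.
task = {
    "task_id": "dedupe-sorted-merge",
    "prompt": "Merge two sorted lists and remove duplicates.",
    "complexity_target": "O(n + m)",
    # Probe sizes double each step so scaling is easy to read off.
    "probe_sizes": [1_000, 2_000, 4_000, 8_000],
}

def reference_merge(a, b):
    """Linear-time reference solution that model outputs are
    checked against, both for correctness and as a timing baseline."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        x = a[i] if a[i] <= b[j] else b[j]
        if not out or out[-1] != x:
            out.append(x)
        if a[i] <= b[j]:
            i += 1
        else:
            j += 1
    for x in a[i:] + b[j:]:
        if not out or out[-1] != x:
            out.append(x)
    return out
```

Model outputs can then be run over `probe_sizes` and compared against the reference on both results and runtime growth.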
Gate AI-authored PRs by checking algorithmic complexity (e.g., via worst-case tests and input-scaling curves) before merging.
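The input-scaling curve mentioned above can be sketched as follows: time the routine at geometrically growing sizes, fit the slope of log(time) against log(n), and gate on the estimated exponent. The function names and the quadratic example are illustrative, assuming wall-clock timing is stable enough for best-of-N runs:

```python
import math
import time

def _timed(func, data):
    """One wall-clock measurement of func(data)."""
    start = time.perf_counter()
    func(data)
    return time.perf_counter() - start

def fit_scaling_exponent(func, sizes, make_input, repeats=3):
    """Estimate k in time ~ n^k via a least-squares fit of
    log(time) against log(n), taking the best of `repeats`
    runs per size to damp timer noise."""
    xs, ys = [], []
    for n in sizes:
        data = make_input(n)
        best = min(_timed(func, data) for _ in range(repeats))
        xs.append(math.log(n))
        ys.append(math.log(best))
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    # Closed-form simple linear regression slope.
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )

# Example gate: a quadratic routine should blow an O(n log n) budget.
def quadratic_pairs(xs):
    return sum(1 for a in xs for b in xs if a < b)

exponent = fit_scaling_exponent(
    quadratic_pairs, [200, 400, 800, 1600], lambda n: list(range(n))
)
assert exponent > 1.5, f"expected ~quadratic scaling, got {exponent:.2f}"
```

A CI gate would run this against the PR's hot functions and fail when the fitted exponent exceeds the budget for that path.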
Legacy codebase integration strategies
- 01. Integrate complexity checks into CI for hot paths, and compare AI-generated alternatives against current baselines before rollout.
- 02. Start with a narrow set of latency-critical endpoints or ETL kernels, and expand once false-positive and false-negative rates are understood.
Fresh architecture paradigms
- 01. Define complexity budgets in coding guidelines for GenAI usage, and include complexity assertions in test suites from day one.
- 02. Curate a domain-specific task set (data transforms, joins, graph ops) with labeled complexity targets to steer prompts and model selection.
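A complexity assertion of the kind described above can be a simple doubling test: for an O(n log n) budget, doubling the input should roughly double the runtime, so a much larger ratio signals a regression. A minimal sketch under that assumption; the helper names and slack factor are illustrative:

```python
import time

def best_runtime(func, data, repeats=5):
    """Best-of-N wall-clock time to damp scheduler noise."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(data)
        best = min(best, time.perf_counter() - start)
    return best

def assert_within_loglinear_budget(func, make_input, n=50_000,
                                   slack=3.0, repeats=5):
    """Doubling test: for O(n log n), t(2n)/t(n) stays near 2
    (the log factor adds little), so a ratio above slack * 2
    suggests a superlinear regression."""
    t1 = best_runtime(func, make_input(n), repeats)
    t2 = best_runtime(func, make_input(2 * n), repeats)
    ratio = t2 / t1
    assert ratio < slack * 2, f"doubling ratio {ratio:.1f} exceeds budget"

# sorted() is O(n log n); a deterministic pseudo-random input avoids
# Timsort's linear-time fast path on already-ordered data.
shuffle = lambda n: [(i * 2654435761) % 1_000_003 for i in range(n)]
assert_within_loglinear_budget(sorted, shuffle)
```

Dropping a few such assertions next to the unit tests for latency-critical routines makes the complexity budget executable rather than aspirational.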