Works with MSSQL 2025 (and probably earlier versions).
AI-driven automatic hypothesis generation and testing for SQL Server query optimization.
Point it at a slow SQL Server command. The tool uses AI to inspect the underlying stored procedures and views, form optimization hypotheses, apply each one, benchmark it, verify result integrity via checksums, revert the change, and report which hypothesis won — all fully automated, zero manual intervention.
```
Slow SQL command
        |
        v
[1] Schema Discovery
    AI reads the stored procedure / view source,
    follows dependencies, identifies base tables
        |
        v
[2] Hypothesis Generation
    AI proposes N optimization strategies
    (indexes, columnstore, query rewrites, statistics, ...)
        |
        v
[3] For each hypothesis:
    a. Apply optimization SQL
    b. Warm up + measure timing (CPU ms / elapsed ms)
    c. Checksum base tables -> verify result correctness
    d. Revert optimization SQL
    e. Record outcome
        |
        v
[4] Report
    Markdown + HTML summary:
    best strategy, % improvement, all timings, integrity status
```
The operations, including checksums and reverting changes, are performed by AI. AI is well known to fail, or to do weird things like deleting all data permanently.
It is important that you understand this risk.
Example: I recently got help from AI to change a column. To do this it decided to create a new table, copy over the data, delete the old table, and rename the new table into place. It did all of that perfectly, except the copy-data step failed. The table was lost, the data gone.
So, whatever database you give it access to: just assume it will mess it up. I am not to be held liable for any damages caused by this source code.
- **Checksums before and after** — row count + `CHECKSUM_AGG(BINARY_CHECKSUM(*))` on all touched base tables. If any checksum differs, the attempt is flagged as a data integrity failure.
- **Always reverts** — every hypothesis is rolled back after measurement. The database is left in its original state.
- **Cache flushed between runs** — `CHECKPOINT`, `DBCC DROPCLEANBUFFERS`, `DBCC FREEPROCCACHE` between each iteration to prevent warm-cache skew.
- **Statistics updated** — `UPDATE STATISTICS WITH FULLSCAN` applied after schema changes to ensure a fair comparison.
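The integrity check and cache flush can be sketched in T-SQL roughly like this. The table name `dbo.Orders` is an illustrative assumption; the tool generates the equivalent statements for each touched base table itself:

```sql
-- Integrity snapshot for one base table (repeat per touched table):
-- row count plus an order-independent checksum over all columns.
SELECT COUNT_BIG(*)                      AS RowCnt,
       CHECKSUM_AGG(BINARY_CHECKSUM(*)) AS TableChecksum
FROM dbo.Orders;   -- hypothetical table name

-- Flush caches between iterations to prevent warm-cache skew.
CHECKPOINT;               -- write dirty pages so the next command clears everything
DBCC DROPCLEANBUFFERS;    -- cold buffer pool (requires sysadmin)
DBCC FREEPROCCACHE;       -- discard cached query plans
```

If the snapshot taken after the revert differs from the one taken before the hypothesis was applied, the run is flagged as a data integrity failure.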
- .NET 10 SDK
- SQL Server (2019+ recommended; requires `sysadmin` or equivalent for cache-clearing commands)
- An OpenAI API key (or compatible endpoint)
```shell
git clone https://github.com/tedd/Tedd.AutoSqlOptimizer.git
cd Tedd.AutoSqlOptimizer
```

Copy the template and fill in your credentials:

```shell
cp src/Tedd.AutoSqlOptimizer/appsettings.json \
   src/Tedd.AutoSqlOptimizer/appsettings.Development.json
```

Edit `appsettings.Development.json`:
```json
{
  "ConnectionString": "Server=YOUR_SERVER;Database=YOUR_DATABASE;User Id=YOUR_USER;Password=YOUR_PASSWORD;Encrypt=True;TrustServerCertificate=True;Connection Timeout=600;",
  "OpenAI": {
    "ApiKey": "sk-..."
  }
}
```

`appsettings.Development.json` is loaded on top of `appsettings.json` and is git-ignored — your secrets stay local.
Alternative: environment variables
```shell
export OPENAI_API_KEY=sk-...
```

Or place the key in `~/.openaikey` (plain text, no newline needed).
To give it a fair start, and to keep it from messing with anything it shouldn't, I use a development server (Linux, Docker) and limit its CPUs to the same count as our Azure instance. I also restore the DB from scratch before every run:
```sql
USE master;
ALTER DATABASE [devdb-03]
    SET SINGLE_USER
    WITH ROLLBACK IMMEDIATE;
CHECKPOINT;
RESTORE DATABASE [devdb-03]
    FROM DISK = '/var/opt/mssql/backup/devdb-01.bak'
    WITH
        MOVE 'devdb-01'
            TO '/var/opt/mssql/data/devdb-03.mdf',
        MOVE 'devdb-01_log'
            TO '/var/opt/mssql/data/devdb-03_log.ldf',
        RECOVERY,
        REPLACE;
CHECKPOINT;
ALTER DATABASE [devdb-03]
    SET MULTI_USER;
CHECKPOINT;
```

Create a folder under `src/Tedd.AutoSqlOptimizer/Optimize/` named with a numeric prefix and a description:
```
Optimize/
├── init.sql             <- optional DB initialization / restore script
└── 001_MySlowProcedure/
    ├── AI_Input.txt     <- freetext description of the test (optional — see below)
    ├── 1_before.sql     <- SQL to benchmark (required, OR generated from AI_Input.txt)
    ├── 2_optimize.sql   <- optimization SQL (omit for AI mode)
    ├── 3_after.sql      <- after SQL (omit for AI mode; defaults to 1_before.sql if absent)
    └── 4_revert.sql     <- revert SQL (omit for AI mode)
```
`AI_Input.txt` is an optional plain-text file that can be placed in any optimization folder. It serves two purposes:

- **Extra context for the AI optimizer.** When present it is injected into every schema-discovery and optimization prompt, giving the AI human-written domain knowledge (e.g. "this proc is called with millions of rows, PartyId is always filtered, the result is sorted by date").
- **Generates missing SQL files on the fly.** If any of the four SQL files are absent, the AI uses `AI_Input.txt` to generate them before the benchmark run starts:
  - `1_before.sql` missing → AI generates the benchmark query from the description.
  - `3_after.sql` missing → defaults to `1_before.sql` (same query, different schema state).
The system works correctly with any combination of present/absent files as long as at least `1_before.sql` or `AI_Input.txt` is provided.
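A hypothetical `AI_Input.txt` for a folder like `001_MySlowProcedure` might read (the contents are purely illustrative):

```
usp_MySlowProcedure aggregates order totals per customer.
It is called with millions of rows; CustomerId is always filtered
and the result is sorted by date. Do not change the output columns.
```

Short, specific hints like these are injected into every prompt, so the AI's hypotheses reflect real usage patterns rather than guesses from the schema alone.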
**AI mode** (recommended): provide only `1_before.sql` (and optionally `AI_Input.txt`). The tool discovers the schema and generates all optimization hypotheses automatically.

**Description-only mode**: provide only `AI_Input.txt`. The AI generates `1_before.sql` from the description, then runs in AI mode.

**Manual mode**: provide all four SQL files. The tool applies `2_optimize.sql`, benchmarks `3_after.sql`, then runs `4_revert.sql`. All files fully support multi-batch scripts using `GO` statement separators.
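For manual mode, a minimal `2_optimize.sql` / `4_revert.sql` pair could look like this sketch. The index, table, and column names are illustrative assumptions, not part of the tool:

```sql
-- 2_optimize.sql
-- Hypothesis: a covering index speeds up the date-filtered aggregate.
CREATE INDEX IX_Orders_OrderDate_Customer
    ON dbo.Orders (OrderDate)
    INCLUDE (CustomerId, Amount);
GO
-- Refresh statistics so before/after plans are compared fairly.
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;
GO
```

```sql
-- 4_revert.sql
-- Undo the hypothesis so the database returns to its original state.
DROP INDEX IX_Orders_OrderDate_Customer ON dbo.Orders;
GO
```

Each script may contain multiple batches separated by `GO`, exactly as you would write them in SSMS.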
```sql
EXEC [dbo].[usp_MySlowProcedure] @Param1 = 1
```

Or a plain query:

```sql
SELECT c.CustomerName, SUM(o.Amount) AS Total
FROM Orders o
JOIN Customers c ON c.Id = o.CustomerId
WHERE o.OrderDate >= '2024-01-01'
GROUP BY c.CustomerName
ORDER BY Total DESC
```

```shell
cd src/Tedd.AutoSqlOptimizer
dotnet run
```

Optionally filter to a single optimization folder by passing part of its name:

```shell
dotnet run -- MySlowProcedure
```

All settings live in `appsettings.json` (defaults) and can be overridden in `appsettings.Development.json` or via environment variables.
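Assuming the tool uses the standard .NET configuration builder, nested keys can be overridden from the shell with double-underscore environment variables (`__` maps to the `:` section separator). The values below are examples, not defaults:

```shell
# Override config keys via environment variables before running `dotnet run`.
# Standard .NET configuration maps "__" to the ":" section separator,
# so OpenAI__Model overrides the "OpenAI:Model" key.
export OpenAI__Model=gpt-4o-mini     # hypothetical model override
export BenchmarkIterations=10        # more timed iterations per hypothesis
```

This is handy in CI, where editing `appsettings.Development.json` is awkward.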
| Key | Default | Description |
|---|---|---|
| `ConnectionString` | (placeholder) | SQL Server connection string |
| `OpenAI:ApiKey` | `""` | OpenAI API key (or set `OPENAI_API_KEY` env var) |
| `OpenAI:Model` | `gpt-4o` | Model used for schema analysis and hypothesis generation |
| `BenchmarkIterations` | `5` | Number of timed iterations per hypothesis |
| `WarmUpIterations` | `3` | Warm-up iterations before timing begins |
| `AiOptimizationCount` | `10` | Number of hypotheses the AI generates per target |
| `AiMaxRetries` | `4` | Max retries if the AI generates invalid SQL |
| `TimingMetric` | `Lowest` | `Lowest` or `Average` — which timing value to use for comparison |
| `OptimizationsPath` | `Optimize` | Folder containing optimization targets |
| `OutputPath` | `Runs` | Folder where run results are written |
| `IntegrityCheckSkipPattern` | `^SYS_MON\.` | Regex — skip checksum for tables matching this pattern |
| `IncludePatterns` | `[]` | List of regexes — only run folders whose name matches at least one pattern (empty = run all) |
| `ExcludePatterns` | `[]` | List of regexes — skip folders whose name matches any pattern (empty = skip none) |
| `RunInitBeforeEachTest` | `false` | Run `init.sql` before every optimization folder instead of once at startup |
| `RunInitBeforeNextTestIfRevertFailed` | `false` | Run `init.sql` before the next test if the previous revert failed, to restore a clean DB state |
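Pulling a few of these keys together, an `appsettings.Development.json` override that runs only one folder with more iterations might look like this (the values are illustrative, not defaults):

```json
{
  "BenchmarkIterations": 10,
  "TimingMetric": "Average",
  "IncludePatterns": [ "^001_" ],
  "OpenAI": {
    "Model": "gpt-4o"
  }
}
```

Only the keys you specify are overridden; everything else falls back to `appsettings.json`.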
Each run creates a timestamped folder under `Runs/`:

```
Runs/
└── 2026-03-14 153000/
    ├── run.log              <- full timestamped log
    ├── summary.html         <- live-updating summary across all targets
    ├── summary.md
    └── 001_MySlowProcedure/
        ├── AI_Input.txt     <- copied from source folder (if present)
        ├── 1_before.sql     <- captured/generated before SQL
        ├── 3_after.sql      <- captured after SQL (may equal 1_before.sql)
        ├── ai_opt_1/
        │   ├── description.txt   <- AI hypothesis description
        │   ├── ai_prompt.txt     <- the prompt sent to the AI
        │   ├── 2_optimize.sql    <- generated optimization SQL
        │   └── 4_revert.sql      <- generated revert SQL
        ├── ai_opt_2/ ...
        ├── ai_opt_combined/ <- combined best-of-all attempt (AI mode)
        ├── results.md       <- per-target report (Markdown)
        └── results.html     <- per-target report (HTML, open in browser)
```
The HTML report shows:
- Before baseline timing (CPU ms + elapsed ms)
- Each hypothesis: name, description, timing, improvement %, integrity status
- Best overall strategy highlighted
Place an `init.sql` in the `Optimize/` folder. It runs once before any benchmarks by default — useful for restoring a snapshot or seeding test data. Set `RunInitBeforeEachTest: true` to re-run it before every folder, or `RunInitBeforeNextTestIfRevertFailed: true` to re-run it only when a revert left the DB dirty:
```
Optimize/
├── init.sql        <- optional, runs first (or per-test — see config)
└── 001_MyTarget/
    └── 1_before.sql
```
If `init.sql` contains `RESTORE DATABASE`, the tool automatically connects to `master` for that step.
```
Program.cs                  Entry point, configuration, logging
Services/
  BenchmarkRunner.cs        Orchestrates the full run lifecycle
  AiOptimizer.cs            Schema discovery, AI hypothesis loop, integrity checks
  SqlExecutor.cs            SQL execution, timing, cache clearing, checksums
  ReportGenerator.cs        Markdown and HTML report generation
Models/
  BenchmarkConfig.cs        Configuration model
  OptimizationFolder.cs     Loads and parses an optimization folder
  BenchmarkResult.cs        Timing measurements
  OptimizationSummary.cs    Cross-folder run summary
  SqlTimingResult.cs        Single execution timing (CPU + elapsed ms)
```

