Query Optimization
Overview
Section titled “Overview”Submit a slow SQL query and AutoDB returns a structured execution plan, ranked rewrite alternatives validated by pglast for syntactic correctness, and index recommendations with storage and write-overhead impact estimates. When RAG documents are available, the optimizer uses your business context to inform its analysis.
Execution Plans
Section titled “Execution Plans”AutoDB parses the PostgreSQL EXPLAIN output into a structured tree that you can traverse programmatically. Key fields in execution_plan:
root— the root node of the plan tree, withnode_type(e.g.,"Seq Scan","Index Scan","Hash Join"), cost estimates, andchildrenfor nested operationstotal_cost— total estimated cost in PostgreSQL cost unitsraw_plan_text— the originalEXPLAINtext from PostgreSQLanalyzed— whether real timing data is included (see below)
EXPLAIN ANALYZE
Section titled “EXPLAIN ANALYZE”Add ?analyze=true to the query string to run EXPLAIN ANALYZE, which returns actual execution timing alongside the estimates.
curl -X POST "https://api.autodb.app/api/v1/connections/550e8400-e29b-41d4-a716-446655440000/queries/optimize?analyze=true" \ -H "Content-Type: application/json" \ -H "X-API-Key: YOUR_API_KEY" \ -d '{"sql": "SELECT id FROM orders WHERE created_at > '\''2024-01-01'\'';"}'When analyzed is true, plan nodes include actual_startup_time, actual_total_time, actual_rows, and actual_loops in addition to the estimated values.
Rewrite Alternatives
Section titled “Rewrite Alternatives”AutoDB generates ranked query rewrites in the alternatives array. Each alternative includes:
rank— position in the ranking (1 = best recommendation)rewritten_sql— the optimized query, validated with pglast for syntactic correctness before inclusionexplanation— plain-English description of what was changed and whyestimated_improvement— rough performance estimate (e.g.,"~1.5-2x faster")
All rewrites are validated with pglast to ensure they are syntactically correct SQL. Invalid rewrites are filtered out before they reach you.
Index Recommendations
Section titled “Index Recommendations”The index_recommendations array combines two detection methods:
rule_based— heuristic detection of sequential scans and missing indexes from the execution planllm_suggested— AI-generated recommendations based on query patterns and schema context
Each recommendation includes:
create_sql— a ready-to-executeCREATE INDEXstatementreason— why this index would help the queryestimated_storage_impact— approximate additional storage needed (e.g.,"~50MB")write_overhead_warning— impact on write performance (e.g.,"Adds ~5-10% overhead on INSERT/UPDATE")benefits— list of query patterns that would benefit from this indexdetection_method— either"rule_based"or"llm_suggested"
Example: Optimizing a Query
Section titled “Example: Optimizing a Query”curl -X POST https://api.autodb.app/api/v1/connections/550e8400-e29b-41d4-a716-446655440000/queries/optimize \ -H "Content-Type: application/json" \ -H "X-API-Key: YOUR_API_KEY" \ -d '{"sql": "SELECT o.id, c.name, SUM(oi.quantity * oi.unit_price) AS total FROM orders o JOIN customers c ON o.customer_id = c.id JOIN order_items oi ON oi.order_id = o.id WHERE o.created_at > '\''2024-01-01'\'' GROUP BY o.id, c.name ORDER BY total DESC LIMIT 100;"}'Response:
{ "connection_id": "550e8400-e29b-41d4-a716-446655440000", "original_sql": "SELECT * FROM orders WHERE created_at > '2024-01-01'", "execution_plan": { "root": { "node_type": "Seq Scan", "relation_name": "orders", "total_cost": 1250.0, "plan_rows": 5000, "plan_width": 48, "children": [] }, "raw_plan_text": "Seq Scan on orders (cost=0.00..1250.00 rows=5000 width=48)\n Filter: (created_at > '2024-01-01'::date)", "total_cost": 1250.0, "analyzed": false }, "alternatives": [ { "rank": 1, "original_sql": "SELECT * FROM orders WHERE created_at > '2024-01-01'", "rewritten_sql": "SELECT id, customer_id, total, created_at FROM orders WHERE created_at > '2024-01-01'", "explanation": "Replace SELECT * with specific columns to reduce I/O.", "estimated_improvement": "~1.5-2x faster" } ], "index_recommendations": [ { "table_name": "orders", "columns": ["created_at"], "index_type": "btree", "create_sql": "CREATE INDEX CONCURRENTLY idx_orders_created_at ON orders(created_at);", "reason": "Sequential scan on orders filtered by created_at", "estimated_storage_impact": "~50MB", "write_overhead_warning": "Adds ~5-10% overhead on INSERT/UPDATE", "benefits": ["Date range queries on orders"], "detection_method": "rule_based" } ], "context_used": true}Context-Aware Optimization
Section titled “Context-Aware Optimization”The context_used field indicates whether RAG document context was available to inform the analysis. When you upload business documentation via the /documents endpoint, the optimizer uses that context to make more relevant recommendations — for example, understanding which columns are frequently queried together or which tables are performance-critical.
When context_used is false, the optimizer relies solely on the execution plan and schema structure.