SQL Optimization Case Study: Using '=' Instead of 'IN' for Instant Performance Gains!

We need to quickly retrieve information about employees with the highest salary. The common approach is:
|
|
When dealing with millions of records, response time can jump from milliseconds to several seconds, potentially causing full table scans!
You might assume IN subqueries work like this:
But in reality (especially in MySQL 5.7 and earlier):
|
|
Metric | Original IN Query | Optimized = Query | Improvement |
---|---|---|---|
Execution Time | 3.8 seconds | 0.2 seconds | 95% |
Rows Scanned | 2 million Ă 2 | 2 million | 50% |
Memory Usage | 480MB | 90MB | 81% |
(Test Environment: MySQL 5.7, 2 million rows in employees table, 4-core 8GB server)
– Original IN query plan (dangerous signals!) -> Nested Loop (cost=2.3 rows=1) -> Index scan on employees # full table scan -> Materialize (cost=1.1 rows=1) -> Aggregate: max(salary)
– Optimized “=” query plan (clean and efficient) -> Filter: (salary = (select #2)) (cost=0.35 rows=1) -> Index scan on employees using idx_salary # uses index -> Select #2 (subquery in condition) -> Aggregate: max(salary) (cost=0.15 rows=1)
If the subquery may return multiple values, IN must be used Without an index, a full table scan is still required (but it’s still faster than the original approach)
Even after rewriting to “=”, ensure there’s an index on the salary field:
|
|
After optimization comparison:
Metric | Without Index | With Index | Improvement |
---|---|---|---|
Execution Time | 0.5 seconds | 0.02 seconds | 96% |
You can see that when SQLFLASH determines the result returns only one row, it rewrites the SQL from ‘in’ to ‘=’, thus significantly improving performance and avoiding the overhead of repeatedly calculating subqueries in the outer query for each row.
By rewriting with “=”, we achieved:
Key Takeaway: When the subquery result is unique, “=” is more efficient than “IN”! Next time you encounter a similar issue, check if the subquery returns a single value and try this optimization.
SQLFlash is your AI-powered SQL Optimization Partner.
Based on AI models, we accurately identify SQL performance bottlenecks and optimize query performance, freeing you from the cumbersome SQL tuning process so you can fully focus on developing and implementing business logic.
Join us and experience the power of SQLFlash today!.