Leetcode Hardest SQL 185: Is the Best Solution Still Slow? | SQLFlash

LeetCode SQL Optimization: Window Functions vs. Subqueries

Developers who frequently use LeetCode are certainly familiar with the problem 185: Department Top Three Salaries, a classic question included in the Top SQL 50. The official provides two reference solutions, which can be found at: Official Solution. These two approaches have sparked extensive discussions in the LeetCode comment section. Let’s conduct a practical test to compare the differences between these two correct answers under real-world data scenarios. 185: Department Top Three Salaries

Official Solutions Comparison

Solution 1: Subquery Approach (Traditional Method)

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
SELECT d.name AS 'Department', 
       e1.name AS 'Employee', 
       e1.salary AS 'Salary' 
FROM Employee e1
JOIN Department d
ON e1.departmentId = d.id 
WHERE
    3 > (SELECT COUNT(DISTINCT e2.salary)
        FROM Employee e2
        WHERE e2.salary > e1.salary AND e1.departmentId = e2.departmentId);

This solution uses correlated subqueries to determine the salary ranking row by row. It calculates the number of employees in the same department with higher salaries than the current employee. If the count is less than 3, the record is retained.

Solution 2: Window Function Approach (Modern Method)

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
WITH employee_department AS
    (
    SELECT d.id, 
        d.name AS Department, 
        salary AS Salary, 
        e.name AS Employee, 
        DENSE_RANK() OVER (PARTITION BY d.id ORDER BY salary DESC) AS rnk
    FROM Department d
    JOIN Employee e
    ON d.id = e.departmentId
    )
SELECT Department, Employee, Salary
FROM employee_department
WHERE rnk <= 3;

This solution leverages the DENSE_RANK() window function to rank employees by descending salary within each department and directly filters records ranked in the top 3.

Performance Testing: Battle of Queries with Ten Thousand-Level Data

Let’s conduct a performance test of these two solutions.

Building the Testing Environment

Let AI Help Generate Scripts:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
Table: Employee

+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| id           | int     |
| name         | varchar |
| salary       | int     |
| departmentId | int     |
+--------------+---------+
id is the primary key for this table.
departmentId is a foreign key referencing the ID column of the Department table.
Each row indicates an employee's ID, name, salary, and department ID.

Table: Department

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| id          | int     |
| name        | varchar |
+-------------+---------+
id is the primary key for this table.
Each row indicates a department's ID and name.

Table Creation Statements:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
-- Create Department table
CREATE TABLE Department (
    id INT PRIMARY KEY,
    name VARCHAR(100)
);

-- Create Employee table
CREATE TABLE Employee (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    salary INT,
    departmentId INT,
    FOREIGN KEY (departmentId) REFERENCES Department(id)
);

Stored Procedure:

This stored procedure creates 10 departments and inserts 1,000 employees into each department, resulting in a total of 10,000 Employee records.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
DELIMITER $$

CREATE PROCEDURE generate_test_data()
BEGIN
    DECLARE dept_id INT DEFAULT 1;
    DECLARE emp_id INT DEFAULT 1;
    DECLARE i INT;
    DECLARE j INT;

    SET i = 1;
    WHILE i <= 10 DO
        INSERT INTO Department (id, name) VALUES (i, CONCAT('Dept_', i));
        SET i = i + 1;
    END WHILE;

    SET i = 1;
    WHILE i <= 10 DO
        SET j = 1;
        WHILE j <= 1000 DO
            INSERT INTO Employee (id, name, salary, departmentId)
            VALUES (emp_id, CONCAT('Employee_', emp_id), FLOOR(3000 + RAND() * 7000), i);
            SET emp_id = emp_id + 1;
            SET j = j + 1;
        END WHILE;
        SET i = i + 1;
    END WHILE;
END$$

DELIMITER ;

Execute the Stored Procedure:

1
CALL generate_test_data();

Performance Comparison Results:

SQL QUERY DIFF

The performance difference turned out to be significant. The SQL using DENSE_RANK() executed in milliseconds, while the subquery version took over 5 seconds, which is unacceptable.

Why Use Window Functions?

Starting from MySQL version 8.0, window functions were introduced, greatly enhancing complex analysis capabilities in SQL. Compared to traditional methods like subqueries or temporary tables, window functions offer the following advantages:

  • Avoid Repeat Scans and Multiple Calculations: Traditional methods like subqueries often require multiple table scans, especially for calculations like ranking or cumulative sums, which can lead to performance bottlenecks. Window functions complete multiple calculations in a single scan.
  • Reduce Intermediate Results and Temporary Table Overheads: Subqueries or JOINs typically generate intermediate result sets, potentially requiring temporary tables or worktables, increasing memory and disk usage.
  • Avoid Unnecessary Materialization: Window functions share computation results under the same PARTITION BY and ORDER BY conditions, avoiding repeated calculations.

SQL Optimization with SQLFLASH

After SQLFLASH Optimization

Although there are differences in SQL syntax, the results remain consistent. It seems that SQLFlash can easily handle it!

Summary and Recommendations

  1. Prioritize window functions: In MySQL 8.0+ or databases that support window functions, they are the superior choice.
  2. For more detailed analysis on SQL Server, refer to my previous article: SQL Server Performance Optimization: Window Functions to Double Your Query Efficiency

References:

  1. Window Functions in MySQL 8.0
  2. Use Window Functions to Speed Up Correlated Subqueries

What is SQLFlash?

SQLFlash is your AI-powered SQL Optimization Partner.

Based on AI models, we accurately identify SQL performance bottlenecks and optimize query performance, freeing you from the cumbersome SQL tuning process so you can fully focus on developing and implementing business logic.

How to use SQLFlash in a database?

Ready to elevate your SQL performance?

Join us and experience the power of SQLFlash today!.