How to use ROW_NUMBER() and PARTITION BY In MySQL
The ROW_NUMBER() function, combined with PARTITION BY, allows efficient grouping and filtering based on assigned row numbers.
SELECT
your_columns,
ROW_NUMBER() OVER (PARTITION BY your_grouping_column ORDER BY your_ordering_column DESC) AS row_num
FROM
your_table;
Example 1: Restricting Sales Data to the Top 2 Products per Category
Let’s consider a scenario where you have a table named sales with columns product, category, and revenue. The goal is to retrieve the top 2 products based on revenue for each category.
-- Sample Data
CREATE TABLE sales (
product VARCHAR(50),
category VARCHAR(50),
revenue INT
);
INSERT INTO sales VALUES ('Laptop', 'Electronics', 1000);
INSERT INTO sales VALUES ('Smartphone', 'Electronics', 1200);
INSERT INTO sales VALUES ('Refrigerator', 'Appliances', 800);
INSERT INTO sales VALUES ('Washing Machine', 'Appliances', 900);
INSERT INTO sales VALUES ('Tablet', 'Electronics', 500);
INSERT INTO sales VALUES ('Coffee Maker', 'Appliances', 600);
-- Query to Retrieve Top 2 Products per Category
SELECT
product,
category,
revenue
FROM (
SELECT
product,
category,
revenue,
ROW_NUMBER() OVER (PARTITION BY category ORDER BY revenue DESC) AS row_num
FROM
sales
) ranked
WHERE
row_num <= 2;
Output:
product |
category |
revenue |
---|---|---|
Smartphone |
Electronics |
1200 |
Laptop |
Electronics |
1000 |
Washing Machine |
Appliances |
900 |
Refrigerator |
Appliances |
800 |
Explanation: The query retrieves the top 2 products per category from the ‘sales’ table based on descending revenue. The output displays columns ‘product‘, ‘category‘, and ‘revenue‘. It utilizes the ROW_NUMBER() window function to assign row numbers within each category, and filters rows where the row number is 1 or 2.
Example 2: Retrieving Top 3 Employees per Department Based on Salary
Consider a table named employees with columns employee_id, department, and salary. You want to retrieve the top 3 employees based on salary for each department.
-- Sample Data
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
department VARCHAR(50),
salary INT
);
INSERT INTO employees VALUES (1, 'HR', 50000);
INSERT INTO employees VALUES (2, 'IT', 60000);
INSERT INTO employees VALUES (3, 'HR', 55000);
INSERT INTO employees VALUES (4, 'IT', 65000);
INSERT INTO employees VALUES (5, 'Finance', 70000);
-- Query to Retrieve Top 3 Employees per Department
SELECT
employee_id,
department,
salary
FROM (
SELECT
employee_id,
department,
salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num
FROM
employees
) ranked
WHERE
row_num <= 3;
Output:
employee_id |
department |
salary |
---|---|---|
2 |
IT |
60000 |
4 |
IT |
65000 |
5 |
Finance |
70000 |
1 |
HR |
50000 |
3 |
HR |
55000 |
Explanation: The output displays the top 3 employees with the highest salaries in each department. It includes their ‘employee_id‘, ‘department‘, and ‘salary’. The result is obtained by ranking employees within each department based on salary in descending order using ROW_NUMBER().
How to Restrict Results to Top N Rows per Group in MySQL
When working with MySQL databases, the need to limit results to the top N rows per group is a common requirement. This is particularly useful in scenarios where you aim to retrieve a specific number of records for each distinct group in your dataset. Achieving this involves leveraging the ROW_NUMBER() window function along with the PARTITION BY clause to effectively partition the data into groups.