18. MySQL: HAVING Clause

18. MySQL: HAVING Clause

The HAVING clause in MySQL is used in conjunction with the GROUP BY clause to filter the results of a query based on aggregate values. It is used to filter groups based on a specified condition, similar to how the WHERE clause is used to filter individual rows.

POINTS TO REMEMBER:

  1. The HAVING clause can only be used in a SELECT statement, it cannot be used in any other type of SQL statement.

  2. The HAVING clause is used in conjunction with the GROUP BY clause to filter the results of a query based on aggregate values.

  3. The HAVING clause can only be used with aggregate functions such as COUNT, SUM, AVG, etc.

  4. The HAVING clause is applied after the GROUP BY clause, so all the rows are grouped together before the filter is applied.

  5. The HAVING clause can be used in combination with the WHERE clause to further filter the results of a query.

  6. The HAVING clause can be used to filter on aggregate values as well as on non-aggregated columns that are included in the SELECT statement via GROUP BY clause.

HAVING Clause:

The HAVING clause in MySQL is used in conjunction with the GROUP BY clause to filter the results of a query based on aggregate values.

It is used to filter groups based on a specified condition, similar to how the WHERE clause is used to filter individual rows.

Syntax:

SELECT column_name
FROM table
WHERE condition
GROUP BY column_name
HAVING condition
ORDER BY column_name;

Why the Having Clause? Can't we just filter using the WHERE keyword?

  • The HAVING clause was added to SQL to provide a way to filter groups based on aggregate values because the WHERE keyword cannot be used with aggregate functions.

  • This is because the WHERE clause is used to filter individual rows before any grouping has taken place, whereas aggregate functions are used to summarize data across multiple rows.

  • Therefore, a separate clause, the HAVING clause, is required to filter groups based on the results of aggregate functions.

You can test this by trying to use a WHERE clause with an aggregate function in a SQL query. For example, the following query will result in an error:

SELECT COUNT(*) 
FROM actor
WHERE COUNT(*) > 5;

Output:

This is because the WHERE clause is trying to filter rows based on the result of an aggregate function (the count of all rows), but the aggregate function has not been applied yet. Instead, you would need to use the HAVING clause in this case, like this:

SELECT first_name,COUNT(*)
FROM actor
GROUP BY first_name
HAVING count(*) >= 4 ;

Output:


Examples:

The following query groups the film_category data by category_id and uses the HAVING clause to only show the groups that have more than 60 films.

SELECT category_id, COUNT(film_id) as film_count
FROM film_category
GROUP BY category_id
HAVING COUNT(film_id) > 60;

Output:


Let us introduce a join in the above query to have a better look at the data.

SELECT film_category.category_id, category.name, COUNT(film_id) as film_count
FROM film_category
JOIN category ON film_category.category_id = category.category_id 
GROUP BY category_id
HAVING COUNT(film_id) > 60;

Output:


The HAVING clause can be used with aggregate functions such as SUM, COUNT, AVG, etc. to filter groups based on specific conditions.

SELECT film_category.category_id, category.name,SUM(rental_rate*rental_duration) as revenue
FROM film
JOIN inventory ON film.film_id = inventory.film_id
JOIN rental ON inventory.inventory_id = rental.inventory_id
JOIN film_category ON film_category.film_id = film.film_id
JOIN category ON category.category_id = film_category.category_id
GROUP BY category_id
HAVING revenue > 13000;

In this example, the query groups the data from the film, inventory,film_category, category, and rental table by category_id and uses the HAVING clause to only show the groups that have revenue greater than 13000.

Output:


To see more examples, you can always refer w3 schools; it has always been my go-to website to understand things.

References: w3schools.com/mysql/default.asp

*****All the outputs provided are a snippet of the actual result.