Order By and Group By in SQL Tutorial
Last updated on 08th Oct 2020, Blog, Tutorials
What is SQL?
SQL stands for Structured Query Language. A query language is a kind of programming language that’s designed to facilitate retrieving specific information from databases, and that’s exactly what SQL does. To put it simply, SQL is the language of databases.
That matters because most companies store their data in databases. And while there are many types of databases (like MySQL, PostgreSQL, Microsoft SQL Server), most of them speak SQL, so once you’ve got SQL basics under your belt, you’ll be able to work with any of them.
Even if you’re planning to do your analysis with another language like Python, at most companies, chances are you’ll need to use SQL to retrieve the data you need from the company’s database. As of this writing, there are more than 80,000 SQL jobs listed on Indeed in the US alone.
Subscribe For Free Demo
Error: Contact form not found.
SQL Basics: Relational Databases
A relational database is a database that stores related information across multiple tables and allows you to query information in more than one table at the same time.
It’s easier to understand how this works by thinking through an example. Imagine you’re a business and you want to keep track of your sales information. You could set up a spreadsheet in Excel with all of the information you want to keep track of as separate columns: Order number, date, amount due, shipment tracking number, customer name, customer address, and customer phone number.
This setup would work fine for tracking the information you need to begin with, but as you start to get repeat orders from the same customer you’ll find that their name, address and phone number gets stored in multiple rows of your spreadsheet.
As your business grows and the number of orders you’re tracking increases, this redundant data will take up unnecessary space and generally decrease the efficiency of your sales tracking system. You might also run into issues with data integrity. There’s no guarantee, for example, that every field will be populated with the correct data type or that the name and address will be entered exactly the same way every time.
With a relational database, like the one in the above diagram, you avoid all of these issues. You could set up two tables, one for orders and one for customers. The ‘customers’ table would include a unique ID number for each customer, along with the name, address and phone number we were already tracking. The ‘orders’ table would include your order number, date, amount due, tracking number and, instead of a separate field for each item of customer data, it would have a column for the customer ID.
This enables us to pull up all of the customer info for any given order, but we only have to store it once in our database rather than listing it out again for every single order.
How do GROUP and ORDER BY Differ?
This is a question I’m frequently asked. On the surface both clauses appear to do the same thing; that is sort sort data. But this is where their similarities end. In fact, both serve entirely different purposes.
- The ORDER BY clause’s purpose is to sort the query result by specific columns.
- The GROUP BY clause’s purpose is summarize unique combinations of columns values.
Before we get into their differences consider the general setup of the SELECT statement:
- SELECT column-list
- From table
- GROUP BY columnA, columnB
- ORDER BY columnlist
ORDER BY
The ORDER BY statement is used to sort values. You probably already knew that! So
- SELECT SalesOrderID,
- ProductID,
- OrderQty* UnitPrice As ExtendedPrice
- FROM Sales.SalesOrderDetail
- ORDER BY SalesOrderID
GROUP BY
Contrast this to the GROUP BY clause, which is used to group like column values into a single row.
This is useful as it allows you to summarize information. For instance you can use aggregate functions such as SUM and AVERAGE to calculate values.
In this example
- SELECT SalesOrderID,
- SUM(OrderQty* UnitPrice) As TotalPrice
- FROM Sales.SalesOrderDetail
- GROUP BY SalesOrderID
We are grouping by SalesOrderID and summing each order’s product prices to return the total. This is the magic of the GROUP BY clause: it allows you to perform summary calculations on multiple rows.
- SELECT SalesOrderID,
- SUM(OrderQty* UnitPrice) As TotalPrice
- FROM Sales.SalesOrderDetail
- GROUP BY SalesOrderID
- ORDER BY TotalPrice
To summarize, the key difference between order by and group by is:
- ORDER BY is used to sort a result by a list of columns or expressions.
- GROUP BY is used to create unique combinations of a list of columns that can be used to form summaries. A byproduct of this operation is that the grouping tend to be sorted; however, this isn’t a guarantee.
The SQL ORDER BY clause is used to sort the data in ascending or descending order, based on one or more columns. Some databases sort the query results in an ascending order by default.
Syntax
The basic syntax of the ORDER BY clause is as follows −
- SELECT column-list
- FROM table_name
- [WHERE condition]
- [ORDER BY column1, column2, .. columnN] [ASC | DESC];
You can use more than one column in the ORDER BY clause. Make sure whatever column you are using to sort that column should be in the column-list.
Example
Consider the CUSTOMERS table having the following records −
- | 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
- | 2 | Khilan | 25 | Delhi | 1500.00 |
- | 3 | kaushik | 23 | Kota | 2000.00 |
- | 4 | Chaitali | 25 | Mumbai | 6500.00 |
- | 5 | Hardik | 27 | Bhopal | 8500.00 |
- | 6 | Komal | 22 | MP | 4500.00 |
- | 7 | Muffy | 24 | Indore | 10000.00 |
The following code block has an example, which would sort the result in an ascending order by the NAME and the SALARY −
- SQL> SELECT * FROM CUSTOMERS
This would produce the following result −
- | 4 | Chaitali | 25 | Mumbai | 6500.00 |
- | 5 | Hardik | 27 | Bhopal | 8500.00 |
- | 3 | kaushik | 23 | Kota | 2000.00 |
- | 2 | Khilan | 25 | Delhi | 1500.00 |
- | 6 | Komal | 22 | MP | 4500.00 |
- | 7 | Muffy | 24 | Indore | 10000.00 |
- | 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
The following code block has an example, which would sort the result in the descending order by NAME.
- SQL> SELECT * FROM CUSTOMERS
This would produce the following result −
- | 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
- | 7 | Muffy | 24 | Indore | 10000.00 |
- | 6 | Komal | 22 | MP | 4500.00 |
- | 2 | Khilan | 25 | Delhi | 1500.00 |
- | 3 | kaushik | 23 | Kota | 2000.00 |
- | 5 | Hardik | 27 | Bhopal | 8500.00 |
- | 4 | Chaitali | 25 | Mumbai | 6500.00 |
MySQL GROUP BY clause
The GROUP BY clause groups a set of rows into a set of summary rows by values of columns or expressions. The GROUP BY clause returns one row for each group. In other words, it reduces the number of rows in the result set.
You often use the GROUP BY clause with aggregate functions such as SUM, AVG, MAX, MIN, and COUNT. The aggregate function that appears in the SELECT clause provides information about each group.
The GROUP BY clause is an optional clause of the SELECT statement. The following illustrates the GROUP BY clause syntax:
- SELECT
- c1, c2,…, cn, aggregate_function(ci)
- FROM
- table
- WHERE
- where_conditions
- GROUP BY c1 , c2,…,cn;
The GROUP BY clause must appear after the FROM and WHERE clauses. Following the GROUP BY keywords is a list of comma-separated columns or expressions that you want to use as criteria to group rows.
MySQL evaluates the GROUP BY clause after the FROM, WHERE and SELECT clauses and before the HAVING , ORDER BY and LIMIT clauses:
The SQL GROUP BY clause is used in collaboration with the SELECT statement to arrange identical data into groups. This GROUP BY clause follows the WHERE clause in a SELECT statement and precedes the ORDER BY clause.
Syntax
The basic syntax of a GROUP BY clause is shown in the following code block. The GROUP BY clause must follow the conditions in the WHERE clause and must precede the ORDER BY clause if one is used.
- SELECT column1, column2
- FROM table_name
- WHERE [ conditions ]
- GROUP BY column1, column2
- ORDER BY column1, column2
Example
Consider the CUSTOMERS table is having the following records −
- | 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
- | 2 | Khilan | 25 | Delhi | 1500.00 |
- | 3 | kaushik | 23 | Kota | 2000.00 |
- | 4 | Chaitali | 25 | Mumbai | 6500.00 |
- | 5 | Hardik | 27 | Bhopal | 8500.00 |
- | 6 | Komal | 22 | MP | 4500.00 |
- | 7 | Muffy | 24 | Indore | 10000.00 |
If you want to know the total amount of the salary on each customer, then the GROUP BY query would be as follows.
- SQL> SELECT NAME, SUM(SALARY) FROM CUSTOMERS
This would produce the following result −
- | Chaitali | 6500.00 |
- | Hardik | 8500.00 |
- | kaushik | 2000.00 |
- | Khilan | 1500.00 |
- | Komal | 4500.00 |
- | Muffy | 10000.00 |
- | Ramesh | 2000.00 |
Now, let us look at a table where the CUSTOMERS table has the following records with duplicate names −
- | 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
- | 2 | Ramesh | 25 | Delhi | 1500.00 |
- | 3 | kaushik | 23 | Kota | 2000.00 |
- | 4 | kaushik | 25 | Mumbai | 6500.00 |
- | 5 | Hardik | 27 | Bhopal | 8500.00 |
- | 6 | Komal | 22 | MP | 4500.00 |
- | 7 | Muffy | 24 | Indore | 10000.00 |
Now again, if you want to know the total amount of salary on each customer, then the GROUP BY query would be as follows −
This would produce the following result −
- | Hardik | 8500.00 |
- | kaushik | 8500.00 |
- | Komal | 4500.00 |
- | Muffy | 10000.00 |
- | Ramesh | 3500.00 |
Are you looking training with Right Jobs?
Contact Us- Different types of Joins in SQL Server
- SQL Applications Tutorial
- SQL Server Tutorial
Related Articles
Popular Courses
- SAP Training
11025 Learners
- Postgre sql Training
12022 Learners
- Snow Flake Training
11141 Learners
- What is Dimension Reduction? | Know the techniques
- Difference between Data Lake vs Data Warehouse: A Complete Guide For Beginners with Best Practices
- What is Dimension Reduction? | Know the techniques
- What does the Yield keyword do and How to use Yield in python ? [ OverView ]
- Agile Sprint Planning | Everything You Need to Know