Difference between UNION and UNION ALL in SQL


The main difference between UNION and UNION ALL is how they handle duplicate records. 

UNION combines the results of two or more SELECT statements and removes duplicate rows, whereas UNION ALL concatenates all rows from every SELECT and keeps the duplicates.
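
The contrast can be sketched with literal values, without any tables. This is a minimal illustration, assuming a database that allows SELECT without a FROM clause (MySQL, PostgreSQL, SQLite, and SQL Server do); the column alias n is just a placeholder.

```sql
-- UNION: the repeated value 1 is collapsed into a single row
SELECT 1 AS n
UNION
SELECT 1
UNION
SELECT 2;
-- returns 2 rows: 1 and 2 (row order is not guaranteed)

-- UNION ALL: every row is kept
SELECT 1 AS n
UNION ALL
SELECT 1
UNION ALL
SELECT 2;
-- returns 3 rows: 1, 1 and 2
```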

What is UNION in SQL?

UNION combines the result sets of two or more SELECT statements and removes duplicate rows from the combined result. The source tables themselves are not modified.

Syntax:

SELECT column1, column2, column3, ...
FROM table1
WHERE condition
UNION
SELECT column1, column2, column3, ...
FROM table2
WHERE condition;

Example:

CREATE TABLE customers (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    city VARCHAR(50)
);
CREATE TABLE suppliers (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    city VARCHAR(50)
);
INSERT INTO customers (id, name, city) VALUES
(1, 'Bahadhur', 'Kolkata'),
(2, 'Hema', 'Tamil Nadu'),
(3, 'Chahar', 'Delhi'),
(4, 'Dan', 'Kerala');
INSERT INTO suppliers (id, name, city) VALUES
(1, 'Supplier A', 'Kolkata'),
(2, 'Supplier B', 'Tamil Nadu'),
(3, 'Supplier C', 'Delhi'),
(4, 'Supplier D', 'Kerala');
SELECT city FROM customers
UNION
SELECT city FROM suppliers;

Output:

city
----------
Kolkata
Tamil Nadu
Delhi
Kerala

(Row order may differ, since the query has no ORDER BY.)

Explanation: UNION combined the city column from both tables, removed the duplicates, and returned each city once. Both tables contain the same four cities, so the result has four rows rather than eight.
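
To see the effect of deduplication on this data, the two operators can be compared by counting the rows each one returns. This is a small sketch that assumes the customers and suppliers tables created above; the derived-table aliases u and ua are placeholders.

```sql
-- Rows returned by UNION: duplicates removed
SELECT COUNT(*) AS union_rows
FROM (
    SELECT city FROM customers
    UNION
    SELECT city FROM suppliers
) AS u;
-- union_rows = 4

-- Rows returned by UNION ALL: duplicates kept
SELECT COUNT(*) AS union_all_rows
FROM (
    SELECT city FROM customers
    UNION ALL
    SELECT city FROM suppliers
) AS ua;
-- union_all_rows = 8
```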

What is UNION ALL in SQL?

UNION ALL works like UNION in that it combines the results of two or more SELECT statements, but it does not remove duplicates. Every row from every SELECT appears in the result set, including rows that repeat.

Syntax:

SELECT column1, column2, column3, ...
FROM table1
WHERE condition
UNION ALL
SELECT column1, column2, column3, ...
FROM table2
WHERE condition;

Example:

CREATE TABLE sales_2023 (
    order_id INT PRIMARY KEY,
    customer_id INT,
    amount DECIMAL(10,2)
);
CREATE TABLE sales_2024 (
    order_id INT PRIMARY KEY,
    customer_id INT,
    amount DECIMAL(10,2)
);
-- Insert sales data for 2023
INSERT INTO sales_2023 (order_id, customer_id, amount) VALUES
(101, 1, 150.75),
(102, 2, 220.50),
(103, 3, 340.00),
(104, 4, 180.25);
-- Insert sales data for 2024
INSERT INTO sales_2024 (order_id, customer_id, amount) VALUES
(201, 2, 200.00),
(202, 3, 400.75),
(203, 5, 120.50),
(204, 6, 310.90);
SELECT order_id, customer_id, amount, '2023' AS year FROM sales_2023
UNION ALL
SELECT order_id, customer_id, amount, '2024' AS year FROM sales_2024;

Output:

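order_id | customer_id | amount | year
101      | 1           | 150.75 | 2023
102      | 2           | 220.50 | 2023
103      | 3           | 340.00 | 2023
104      | 4           | 180.25 | 2023
201      | 2           | 200.00 | 2024
202      | 3           | 400.75 | 2024
203      | 5           | 120.50 | 2024
204      | 6           | 310.90 | 2024

(Row order may differ, since the query has no ORDER BY.)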

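Explanation: UNION ALL appends the rows of sales_2024 to the rows of sales_2023 without checking for duplicates, so all eight rows are returned. The literal year column shows which table each row came from.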
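
A common follow-up is to aggregate over the combined rows. The sketch below assumes the sales_2023 and sales_2024 tables created above; the alias sales_year (used instead of year to avoid the YEAR keyword) and the derived-table name all_sales are illustrative choices.

```sql
-- Total sales per year across both tables
SELECT sales_year, SUM(amount) AS total_amount
FROM (
    SELECT amount, '2023' AS sales_year FROM sales_2023
    UNION ALL
    SELECT amount, '2024' AS sales_year FROM sales_2024
) AS all_sales
GROUP BY sales_year
ORDER BY sales_year;
-- 2023 | 891.50
-- 2024 | 1032.15
```

UNION ALL is the safer choice here: with UNION, two orders in the same year that happened to have the same amount would collapse into one row and the total would be too low.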

Key Difference Between UNION and UNION ALL

UNION | UNION ALL
Removes duplicate rows from the combined result. | Keeps all rows, including duplicates.
Slower, because it has to detect and remove duplicates. | Faster, because it doesn't need to sort or compare rows.
May need more memory or temporary space for the deduplication step. | Needs no extra memory for deduplication.
Returns only unique rows. | Returns all rows, including duplicates.

Rules for UNION and UNION ALL in MySQL

  1. Every SELECT statement must return the same number of columns.
  2. Corresponding columns must have compatible data types.
  3. UNION removes duplicate rows.
  4. UNION ALL includes all rows, including duplicates.
  5. To sort the combined result, place a single ORDER BY clause after the last SELECT statement.
  6. UNION is slower because it has to detect and remove duplicates.
  7. UNION ALL is faster because it skips that step.
  8. When UNION removes duplicates, NULL values are treated as equal to each other, so two rows with NULL in the same columns and equal values elsewhere collapse into one. UNION ALL keeps both.
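
Two of these rules are easier to see in a query: where ORDER BY goes and how NULL values are handled. The sketch below assumes the customers and suppliers tables from the earlier example; the alias v is a placeholder.

```sql
-- Rule 5: one ORDER BY after the last SELECT sorts the whole combined result
SELECT name, city FROM customers
UNION ALL
SELECT name, city FROM suppliers
ORDER BY city, name;

-- Rule 8: UNION treats the two NULLs as duplicates of each other
SELECT NULL AS v
UNION
SELECT NULL;
-- returns 1 row

-- UNION ALL keeps both NULL rows
SELECT NULL AS v
UNION ALL
SELECT NULL;
-- returns 2 rows
```

The column names of the combined result come from the first SELECT, which is why ORDER BY refers to name and city here.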

Real-world Examples of UNION and UNION ALL

1. Merging Job Positions With Freelancers Using UNION:

UNION merges job titles from several sources, removes the duplicates, and returns each title once.

Example:

CREATE TABLE full_time_jobs (
    id INT PRIMARY KEY,
    job_title VARCHAR(100),
    company VARCHAR(100),
    location VARCHAR(50)
);
CREATE TABLE freelance_jobs (
    id INT PRIMARY KEY,
    job_title VARCHAR(100),
    client VARCHAR(100),
    remote BOOLEAN
);

-- Insert full-time job listings
INSERT INTO full_time_jobs (id, job_title, company, location) VALUES
(1, 'Software Engineer', 'ERV', 'Delhi'),
(2, 'Data Analyst', 'PQR', 'Srinagar'),
(3, 'Marketing Manager', 'ABC', 'Lucknow'),
(4, 'Software Engineer', 'XYZ', 'Hyderabad');

-- Insert freelance job listings
INSERT INTO freelance_jobs (id, job_title, client, remote) VALUES
(1, 'Software Engineer', 'Client A', TRUE),
(2, 'Graphic Designer', 'Client B', TRUE),
(3, 'Data Analyst', 'Client C', TRUE),
(4, 'SEO Specialist', 'Client D', TRUE);

SELECT job_title FROM full_time_jobs
UNION
SELECT job_title FROM freelance_jobs;

Output:

job_title
-----------------
Software Engineer
Data Analyst
Marketing Manager
Graphic Designer
SEO Specialist

(Row order may differ, since the query has no ORDER BY.)

Explanation: UNION combined the job titles from the full-time and freelance tables and returned each title once. 'Software Engineer' and 'Data Analyst' appear in both tables but only once in the result.
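
Keep in mind that UNION compares whole rows, not individual columns. As a sketch, assuming the full_time_jobs and freelance_jobs tables above, adding a label column (the source column and its values are illustrative additions, not part of the original tables) changes what counts as a duplicate:

```sql
-- Each row is now a (job_title, source) pair, so the same title
-- from two different sources is no longer a duplicate
SELECT job_title, 'full-time' AS source FROM full_time_jobs
UNION
SELECT job_title, 'freelance' AS source FROM freelance_jobs;
-- returns 7 rows: 3 distinct full-time titles + 4 distinct freelance titles
```

'Software Engineer' still appears only once with the full-time label, because the two full-time rows with that title are identical after projection.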

2. Hospital Records Using UNION ALL: 

A hospital database stores patient diagnoses for the general ward and the ICU in separate tables. UNION ALL combines the diagnoses from both wards and keeps every row, including diagnoses that occur more than once.

Example:

CREATE TABLE general_ward (
    id INT PRIMARY KEY,
    patient_name VARCHAR(100),
    diagnosis VARCHAR(100)
);
CREATE TABLE icu_ward (
    id INT PRIMARY KEY,
    patient_name VARCHAR(100),
    diagnosis VARCHAR(100)
);

-- Insert patient records in the General Ward
INSERT INTO general_ward (id, patient_name, diagnosis) VALUES
(1, 'Alex', 'Pneumonia'),
(2, 'Baskar', 'Diabetes'),
(3, 'Sneha', 'Hypertension'),
(4, 'Shekhar', 'Pneumonia');  -- Duplicate diagnosis

-- Insert patient records in ICU
INSERT INTO icu_ward (id, patient_name, diagnosis) VALUES
(1, 'Eric', 'COVID-19'),
(2, 'Bishnoi', 'Pneumonia'),  -- Duplicate diagnosis
(3, 'Govind', 'Diabetes'),    -- Duplicate diagnosis
(4, 'Hari', 'Cardiac Arrest');

SELECT diagnosis FROM general_ward
UNION ALL
SELECT diagnosis FROM icu_ward;

Output:

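diagnosis
--------------
Pneumonia
Diabetes
Hypertension
Pneumonia
COVID-19
Pneumonia
Diabetes
Cardiac Arrest

(Row order may differ, since the query has no ORDER BY.)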

Explanation: The query returns the diagnosis of every patient in the general ward and the ICU, eight rows in total, including repeated diagnoses.
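
Keeping the duplicates is what makes counting possible. A minimal sketch, assuming the general_ward and icu_ward tables above (the names all_patients and patient_count are placeholders):

```sql
-- Number of patients per diagnosis across both wards
SELECT diagnosis, COUNT(*) AS patient_count
FROM (
    SELECT diagnosis FROM general_ward
    UNION ALL
    SELECT diagnosis FROM icu_ward
) AS all_patients
GROUP BY diagnosis
ORDER BY patient_count DESC;
-- Pneumonia: 3, Diabetes: 2,
-- Hypertension, COVID-19 and Cardiac Arrest: 1 each
```

With UNION instead of UNION ALL in the subquery, every diagnosis would appear once and every count would be 1.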

Best Practices for Using UNION and UNION ALL

  • Use UNION when each row must appear only once in the result.
  • Opt for UNION ALL when duplicates are acceptable, as it improves performance by avoiding the overhead of sorting and filtering.
  • Ensure that all SELECT statements have the same number of columns and compatible data types.
  • Apply filters like WHERE clauses in individual SELECT statements to reduce the dataset before using UNION or UNION ALL.
  • List columns explicitly in each SELECT instead of using SELECT *, so the column count and order stay aligned across the queries.
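
As an illustration of the last two points, the following sketch filters each branch before combining and names every column. It assumes the sales_2023 and sales_2024 tables from the earlier example; the threshold of 200 is arbitrary.

```sql
-- Filter inside each SELECT so fewer rows reach the UNION ALL step
SELECT order_id, customer_id, amount
FROM sales_2023
WHERE amount >= 200
UNION ALL
SELECT order_id, customer_id, amount
FROM sales_2024
WHERE amount >= 200;
-- returns 5 rows: orders 102, 103, 201, 202 and 204
```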

Performance Considerations: UNION vs UNION ALL

  • UNION requires an additional step (typically a sort or a hash) to remove duplicates, which can slow down performance, especially on large datasets.
  • UNION ALL is generally faster as it directly combines the results without filtering duplicates.
  • In high-volume data scenarios, UNION ALL is preferred unless uniqueness is essential.
  • Using appropriate indexing and limiting returned rows with clauses like LIMIT or TOP can further enhance performance.
  • Analyze query execution plans to identify bottlenecks and optimize accordingly.
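
One way to check the last point is to compare the plans of the two variants. The sketch below uses MySQL's EXPLAIN on the customers and suppliers tables from the earlier example; the exact output depends on the database and version, so treat any differences you see as something to verify on your own system.

```sql
-- Plan for the deduplicating variant
EXPLAIN
SELECT city FROM customers
UNION
SELECT city FROM suppliers;

-- Plan for the variant that keeps duplicates
EXPLAIN
SELECT city FROM customers
UNION ALL
SELECT city FROM suppliers;
```

In MySQL, the UNION plan typically shows an extra UNION RESULT step that uses a temporary table for deduplication, which the UNION ALL plan usually does not need.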

Use cases of UNION and UNION ALL

UNION:

  • Merging customer lists from multiple branches into one list without duplicates.
  • Combining employee records from different departments when the same employee may appear in more than one.
  • Building sales reports from overlapping transaction histories so that no transaction is listed twice.

UNION ALL:

  • Collecting logs from several servers, where every event record must be kept even if it repeats.
  • Combining monthly sales from all regions while keeping a record of every sale, including identical ones.

Conclusion

UNION and UNION ALL are SQL operators that combine the results of two or more SELECT queries. UNION returns unique rows after removing duplicates; UNION ALL retains every row, including duplicates. The main difference between them is how they handle duplicates. UNION avoids redundant rows, but it is slower than UNION ALL because of the extra deduplication step. UNION ALL is faster because it returns all rows without that step.

FAQs

1. When should I use UNION instead of UNION ALL?

Use UNION when you want to combine results from multiple queries and remove duplicate rows, ensuring each record appears only once in the final result.

2. Does UNION ALL affect performance?

UNION ALL is generally faster than UNION because it doesn’t check for duplicate records, reducing processing time, especially with large datasets.

3. Can UNION and UNION ALL combine data from different tables?

Yes, both can combine results from multiple tables as long as each query returns the same number of columns and the corresponding data types are compatible.


