SQL Bribery: SQL - Intersect

INTERSECT (Transact-SQL)

INTERSECT returns distinct rows that are output by both the left and right input queries operator.

INTERSECT ALL operator that is not implemented in SQL Server.

The INTERSECT operator removes the duplicate rows from the final intersected result set values.
The SQL INTERSECT clause/operator is used to combine two SELECT statements, but returns rows only from the first SELECT statement that are identical to a row in the second SELECT statement.

CREATE TABLE t1

(

Col1 INT,

Col2 INT,

Col3 INT

)

-- Create the 2nd table

CREATE TABLE t2

(

Col1 INT,

Col2 INT

)

-- Insert some records into both tables

INSERT INTO t1 VALUES (1, 1, 1), (2, 2, 2), (2, 2, 2), (2, 2, 3), (2, 2, 4), (3, 3, 3) , (4, 4, 4) , (4, 4, 4), (2, 2, 5)

INSERT INTO t2 VALUES (2, 2), (2, 2), (2, 2), (3, 3), (4, 4), (4, 4), (6, 6), (2, 2)

For table t2, the 2nd table consists of a duplicate record – we able to see the record with the values 2 appears 4 times in the table. When you now perform an INTERSECT between both tables, the records with the values of 2 just appear once in the intersect result set. The duplicate row was just eliminated from intersect result set!

WITH IntersectRemoveDuplicate AS

(

SELECT

Col1,

Col2

FROM t1

INTERSECT

SELECT

Col1,

Col2

FROM t2

)

SELECT Col1, Col2 FROM IntersectRemoveDuplicate

If you want to keep duplicate rows in the table (IntersectAll), you have to make them unique with using the ROW_NUMBER(). With that function you just generate a unique row number for every duplicate record. So, duplicate records become unique, and the “duplicate” row is returned 4 times as expected.

WITH IntersectAllWithDuplicate AS

(

SELECT

ROW_NUMBER() OVER (PARTITION BY Col1, Col2

ORDER BY (SELECT 0))

AS RowNumber,

Col1,

Col2

FROM t1

INTERSECT

SELECT

ROW_NUMBER() OVER (PARTITION BY Col1, Col2

ORDER BY (SELECT 0))

AS RowNumber,

Col1,

Col2

FROM t2

)

SELECT Col1, Col2, RowNumber FROM IntersectAllWithDuplicate

Order By Col1, Col2

Practical Example :

We want to query from table customer and table vendor to find out who are our customer and vendor as well.

-- Create the 1st table

CREATE TABLE Vendor

(

VendorID INT,

VendorName nvarchar(100),

VendorPurchase decimal

)

-- Create the 2nd table

CREATE TABLE Customer

(

CustomerID INT,

CustomerName nvarchar(100),

CustomerOrder decimal

)

-- Insert some records into both tables

INSERT INTO Vendor VALUES (1, 'CompanyA', 100),

(2, 'CompanyB', 200),

(2, 'CompanyB', 230),

(2, 'CompanyB', 300),

(2, 'CompanyB', 400),

(3, 'CompanyC', 330),

(4, 'CompanyD', 450),

(4, 'CompanyD', 40),

(5, 'CompanyE', 480),

(2, 'CompanyB', 500)

INSERT INTO Customer VALUES (2, 'CompanyB', 120),

(2, 'CompanyB', 200),

(2, 'CompanyB', 250),

(3, 'CompanyC', 300),

(4, 'CompanyD', 140),

(4, 'CompanyD', 240),

(6, 'CompanyF', 600), (2, 'CompanyB', 200)

WITH IntersectRemoveDuplicate AS

(

SELECT

VendorID As ID,

VendorName As Name

FROM Vendor

INTERSECT

SELECT

CustomerID As ID,

CustomerName As Name

FROM Customer

)

SELECT ID, Name FROM IntersectRemoveDuplicate

GO

Output :

2 CompanyB

3 CompanyC

4 CompanyD

WITH IntersectAllWithDuplicate AS

(

SELECT

ROW_NUMBER() OVER (PARTITION BY VendorID, VendorName

ORDER BY (SELECT 0)) AS RowNumber,

VendorID As ID,

VendorName As Name

FROM Vendor

INTERSECT

SELECT

ROW_NUMBER() OVER (PARTITION BY CustomerID, CustomerName

ORDER BY (SELECT 0)) AS RowNumber,

CustomerID As ID,

CustomerName As Name

FROM Customer

)

SELECT ID, Name, RowNumber FROM IntersectAllWithDuplicate

Order By ID, Name

Output :

2 CompanyB 1

2 CompanyB 2

2 CompanyB 3

2 CompanyB 4

2 CompanyB 5

2 CompanyB 6

2 CompanyB 7

2 CompanyB 8

3 CompanyC 1

4 CompanyD 1

4 CompanyD 2

4 CompanyD 3

4 CompanyD 4

SQL Bribery

Saturday, December 7, 2019

SQL - Intersect

No comments:

Post a Comment