SQL Not Exists: Filter Out Records that Exist in a Subquery
Subqueries and SQL Statements
A subquery is basically a query within a query. The following
query is a standard, simple SQL select statement:
select * from customers where
customerId = 5
The “select *” part tells the SQL database to return all
columns. The asterisk in SQL indicates that you want to return all columns, but
you can also specify columns individually.
The “where” clause indicates that you want to return just
customers with an id of 5. You can, of course, return more than one record. You
can assume from the above statement that “customerId” is the unique column. A
customerId should only be one value per customer, so you can assume the above
query only returns one record.
Instead of using the “where customerId = 5” clause, you can add
a subquery. That’s where NOT EXISTS works. For instance, take the following query:
select * from customers where NOT EXISTS (select customerId from orders)
The above
statement basically says “give me all customer records where the customerId
does not exist in the orders table.” Therefore, the business logic is “I want
to see all customers who do not have orders.” The NOT EXISTS is the opposite of
the EXISTS statement, so you just reverse the business logic if you want to see
the opposite results. For instance, the following statement gives you the
opposite results:
select * from customers where EXISTS (select customerId from orders)
The
above statement says “give me all customer records where the customer has
placed an order.” You’ll probably need to use EXISTS and NOT EXISTS in
different parts of your application depending on your business logic.
When you run
a query such as the ones above, it should be noted that the subquery runs first. The
subquery in the EXISTS and NOT EXISTS statements is the query that returns order records.
This query runs first. Then, the main or “outer” query runs. In the above
examples, the outer query is the select statement based on the customers table.
Consider the following query:
select * from customers where NOT EXISTS (select customerId from orders where createdate >= getdate() - 30)
In this
statement, a list of orders is returned based on the order date. It’s assumed
that “createdate” is the column that contains the order’s creation date.
Therefore, any order placed within the last 30 days is returned from the
subquery. These records are then used for the outer query, which is again the
select statement run against the customers table.
Combining NOT EXISTS with Other Where Clause Filters
The where
clause can take multiple logic filters. Suppose you want to get a list of
customers who placed an order, but you only want customers who have signed up
within the last 60 days. For instance, you might want to know how many of your
customers placed orders, but you want “new” customers. “New” is an arbitrary
value, so you set that value at 60 days. You can write SQL statements that
match this business logic. You can usually write SQL statements in several ways
to match the business logic, but in this example you want to use the NOT EXISTS
statement. The following code is a SQL statement that codes for the specified
business logic:
select * from customers where EXISTS (select customerId from orders where createdate >= getdate() - 30) and NOT EXISTS (select customerid from customers where createdate <= getdate() – 60)
The above
statement combines the EXISTS and NOT EXISTS statements.
First,
the EXISTS subquery runs. This is a subquery that gets a list of customers
where orders were placed within the last 30 days.
Next,
the NOT EXISTS subquery runs. This subquery gets a list of customers that were
created prior to 60 days ago.
Since
the second subquery uses the NOT EXISTS statement, the main query does a match
with the NOT EXISTS subquery against the customer database, and filters out
records where they exist in the subquery. Therefore, it only gets records where
the customerId is not located in the NOT EXISTS subquery.
Insight
If
subquery
without any matching key with the main query like following, no rows will be
return; For example, if there is not price column in the customers table, the
following statement will not procure any result.
select * from customers where EXISTS (select price from pricelist)
Example
SELECT ColA, ColB, Col1, Col2, Col3, Col4
FROM TableA
WHERE (NOT EXISTS
(SELECT ColA, ColB, Col5, Col6, Col7, Col8
FROM TableB
--WHERE (TableA.ColA = ColA) AND
-- (TableA.ColB = ColB)
))
ColA and ColB must be Keys.
SELECT device, purchdoc, processhr, receivedt, invdate,
invno, pay_no_pay, potp, flag, uploadeddt,
FROM TableA
WHERE ( NOT EXISTS
(SELECT timelog, assignedlotno, remainingqty, do_status, todo, sumgoodqty,
sumrejqty, bihr, devicename, invoiceno, datecode, matcheddate, item,
sourcedevice, purchdoc, netprice, processhr, receivedt, invdate, invno, pay_no_pay, devnumber
FROM TableB
--WHERE (TableA.processhr = processhr) AND
-- (TableA.sourcedevice = devnumber) AND
-- (TableA.purchdoc = purchdoc)
))