While querying the production server to solve one critical issue, suddenly one query completely caught my attention. We had written a query which was created to find records which exist in Table A but do not exist in Table B, based on a certain column. The query was syntactically correct and ran without any errors, but it did not return any results. There were some records which we were hoping to get in the results, so we investigated the reason why the correct query did not return any.
The NOT IN clause returns rows from the outer table which do not exist in the inner table used in the subquery. In this tutorial, we will examine using the NOT IN clause with null values.
Example of the NOT IN Clause
SELECT * FROM OuterTable WHERE PK_Id NOT IN (SELECT FK_id from InnerTable);
Example of the NOT IN Clause with NULL Values in InnerTable.
CREATE TABLE PRODUCT ( PK_Product_Id INT PRIMARY KEY , Name VARCHAR( 255) ); INSERT INTO PRODUCT VALUES ( 1, 'Coke'), (2, 'Pepsi'),(3 ,'Mango'), (4 , '7 Up' ); CREATE TABLE PRODUCT_DETAILS ( PK_Details_Id INT PRIMARY KEY , [Description] VARCHAR( 500), FK_Product_Id INT FOREIGN KEY REFERENCES PRODUCT (PK_Product_Id ) ); INSERT INTO PRODUCT_DETAILS VALUES ( 100,'500 ML is good' ,1); INSERT INTO PRODUCT_DETAILS VALUES ( 101,'500 ML is good' ,2); INSERT INTO PRODUCT_DETAILS VALUES ( 102,'500 ML is good' ,3); INSERT INTO PRODUCT_DETAILS VALUES ( 103,'500 ML is good' ,NULL);
As shown in the image above, the PRODUCT_DETAILS table contains one NULL value.
Now, our aim is to find out the names of the products from the Product table whose details are not available in the Product_Details table. Ideally, it should return Product 4 from the Product table since the details of Product 4 do not exist in the Product_Details table.
We might think this can be easily achieved using the NOT IN predicate by writing the following query.
SELECT * FROM PRODUCT WHERE PK_Product_Id NOT IN (SELECT Fk_Product_Id FROM PRODUCT_DETAILS);
The query above does not return anything although syntactically it is correct. Because of the existence of a NULL value in the Product_Details table, it fails to return the expected results.
SQL Server uses the Three-Valued logic concept here.
As we are aware, a NULL value does exist in the Product_Details table in the fk_product_id column. A NULL value is an unknown or missing value.
SQL Server converts the NOT IN clause using three-value logic and evaluates it in the following manner.
NOT IN (SELECT 1 OR 2 OR 3 OR NULL) NOT IN (Fk_product_id = 1 OR Fk_product_id = 2 OR Fk_product_id = 3 OR Fk_product_id = NULL) NOT IN (Fk_product_id = 1 OR Fk_product_id = 2 OR Fk_product_id = 3 OR UNKNOWN ) NOT IN (TRUE OR TRUE OR TRUE OR UNKNOWN ) NOT IN (TRUE OR TRUE OR UNKNOWN) NOT IN (TRUE OR UNKNOWN) NOT IN (UNKNOWN )
As the final result is evaluated as UNKNOWN, the NOT IN query does not return any result because of the existence of a NULL value.
The solution is to make the NOT IN queries work with the existence of NULL values and use two-valued logic, only TRUE or FALSE.
--NOT IN WITH IS NOT NULL Filter SELECT * FROM PRODUCT WHERE PK_Product_Id NOT IN (SELECT Fk_Product_Id FROM PRODUCT_DETAILS WHERE Fk_Product_Id IS NOT NULL); --NOT Exists SELECT * FROM PRODUCT Prd WHERE NOT EXISTS (SELECT Fk_Product_Id FROM PRODUCT_DETAILS WHERE Fk_Product_Id = Prd.PK_Product_Id);
NOT EXISTS also gives us the right results because it uses two-valued Boolean logic, only TRUE or False, to filter out the rows.
--Using Left Join SELECT * FROM PRODUCT Prd LEFT JOIN PRODUCT_DETAILS PrdDetails ON Prd. PK_Product_Id = PrdDetails .FK_Product_Id WHERE PrdDetails. FK_Product_Id IS NULL
Using Left join, we can also retrieve records which exist in the Product table but do not exist in the Product_Details page when we do a join based on the primary key (pk_product_id) and the foreign key (fk_product_id).