sql not in - ghdrako/doc_snipets GitHub Wiki

 SELECT email
 FROM erp.customer_contact_details
 WHERE state NOT IN (SELECT state
 FROM erp.suppliers);

The predicate state NOT IN (SELECT state FROM erp.suppliers) can never return TRUE if even one NULL is present. The expression evaluates to “unknown” if the subquery returns any null values, effectively negating the predicate and leading to an empty result set.

 SELECT email, state
 FROM erp.customer_contact_details
 WHERE state NOT IN ('Fake state')
 LIMIT 1;
 email | state
---------------------------+------
[email protected] | NY
 (1 row)

With NULLs:

 SELECT email, state
 FROM erp.customer_contact_details
 WHERE state NOT IN ('Fake state', null)
 LIMIT 1;
 email | state
-------+------
(0 rows)

This is the same as writing NOT (state IN ('Fake state')), so basically, NOT (FALSE); therefore, TRUE. Adding a NULL makes the predicate unknown because NOT (state IN ('Fake state', null)) evaluates to NOT (NULL), which is the same as NULL, and so it cannot be TRUE.

 SELECT ccd.email
 FROM erp.customer_contact_details ccd
 WHERE NOT EXISTS (SELECT FROM erp.suppliers s
 WHERE ccd.state = s.state)
 AND ccd.state IS NOT NULL;
 email
--------------------------
[email protected]
 [email protected]
 (2 rows)

This query finally gives the correct results. It should also be faster to execute, as the query planner this time chooses an anti-join.

The following alternative query explicitly demonstrates the anti-join mechanism and allows you to better visualize how rows are getting excluded based on the lack of matches:

 SELECT ccd.email
 FROM erp.customer_contact_details ccd
 LEFT JOIN erp.suppliers s USING (state)
 WHERE s.state IS NULL;