tags : SQL

Join Types in relational algebra

Cartesian Product

Theta Join

  • Joins Horizontally
  • AKA Qualified Join (See SQL join flavors)
  • Similar to INTERSECTION but allows different schema

Natural Join

  • Better than Cartesian Product in ways that gets rid of duplicates, adds filters etc.
  • A natural join finds all pairs of columns with the same name and uses them as join criteria.
  • we don’t specify matching column names at all. Think of using USING but w/o even specifying the column name
  • Like Theta join, it can be inner, left, right, or full
  • Natural Join is a bad idea because of the implicit match. So mostly DB engines will be implementing Theta Joins.

Lateral Join

  • This allows you to reference one subquery from another
  • More powerful than Parition Join, better supported
  • The for each loop of SQL

Others

  • Semi Join
  • Anti Join
  • Equi Join
  • Self Join
  • Partition Join
    • Only implemented by Oracle

Client Side Join

  • This is not related to Relational Algebra
  • This is basically doing 2 regular query and then doing the join in the application code. Eg. If we need to join data from a mysql tables and a PostgreSQL table.

Join Types in SQL

  • Not all joins can be described though Venn diagrams, Venn diagrams do not give the full pictures of joins but they give good enough intuition
  • There is no concept of INNER(LEFT,RIGHT) JOINs
    • LEFT OUTER = LEFT
    • RIGHT OUTER = RIGHT
    • FULL OUTER = FULL

CROSS JOIN

  • Is the Cartesian Product, Not related to Cross Product from Linear Algebra.
  • All other JOIN types can be derived from CROSS JOIN, with additional filters, and perhaps unions.

INNER JOIN

  • Commonly used
  • Inner join produces only the set of records that match in both Table A and Table B.

OUTER JOIN

Outer Joins are useful when we want to retain those rows from either the LEFT side or the RIGHT or both (FULL) sides

LEFT OUTER JOIN / LEFT JOIN

  • Produces a complete set of records from Table A, with the matching records (where available) in Table B. If there is no match, the right side will contain null.
  • LEFT JOIN queries are among the most optimizable in every database server.

RIGHT OUTER JOIN / RIGHT JOIN

  • right outer join behaves almost identically to the left outer join, but the roles of the tables are switched

FULL OUTER JOIN

  • Combines the results of the left and right outer joins
  • Produces the set of all records in Table A and Table B, with matching records from both sides where available. If there is no match, the missing side will contain null.

Join Syntax

  • TODO: Verify this
-- "Classic" ANSI JOIN syntax
SELECT *
FROM author a
JOIN book b ON a.author_id = b.author_id
 
-- "Nice" ANSI JOIN syntax
SELECT *
FROM author a
JOIN book b USING (author_id)

Tips

  • You can do multiple join in the same statement one after another
  • When writing JOIN statements, refer to columns as table_name.column_name
  • Atleast one of the joined columns should have an unique value, otherwise we can get strange results

Join Complexity

More Resources