Wednesday, April 17, 2013

Joining Tables in SQL


Introduction

This document tries to explain how joining two tables works. To help explain this I created two temp tables with specific rows in them to prove the point. The SQL to create the sample temp tables is in Appendix A. Table TEMP_A has four rows; the IDs of these four rows are unique and numbered 1, 2, 3 and 4. Table TEMP_B has five rows, with IDs 1, 2, 3, 3 and 5. Note that rows 1 and 2 from table A have one reference each in table B.
Row 3 has two references in table B, row 4 has no references in table B at all, and there is an orphan row in table B (row 5) that has no parent row in table A.
Also note that the reserved words inner and outer are optional: left outer join and left join mean exactly the same thing.
OK, now on to the fun stuff.

Normal Join (Or Inner Join)

Joining (or inner joining) the two tables on the ID fields returns all rows in the intersection of the two sets, meaning the rows where both tables have the same value.
Using the data sample created in Appendix A we will get the following result:
select *
from TEMP_A
INNER JOIN TEMP_B
ON TEMP_A.Tbl_ID = TEMP_B.Tbl_ID
Tbl_ID | Tbl_Data    | Tbl_ID | Tbl_Data
1      | Tbl A Row 1 | 1      | Tbl B Row 1
2      | Tbl A Row 2 | 2      | Tbl B Row 2
3      | Tbl A Row 3 | 3      | Tbl B Row 3a
3      | Tbl A Row 3 | 3      | Tbl B Row 3b
Note that row 3 in table A appears twice, once for each corresponding row in table B.
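To try this without a database server, here is a minimal sketch that loads the Appendix A data into an in-memory SQLite database through Python's built-in sqlite3 module. SQLite and Python are assumptions made for illustration and are not the environment the queries above were written for. The inserts are compacted into multi-row statements, and the sketch also checks that join and inner join return the same rows.

```python
import sqlite3

# In-memory SQLite database holding the Appendix A sample data
# (SQLite is an assumption; it is not the database used in the post).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table TEMP_A (Tbl_ID int not null, Tbl_Data varchar(50) not null);
    create table TEMP_B (Tbl_ID int not null, Tbl_Data varchar(50) not null);
    insert into TEMP_A values (1, 'Tbl A Row 1'), (2, 'Tbl A Row 2'),
                              (3, 'Tbl A Row 3'), (4, 'Tbl A Row 4');
    insert into TEMP_B values (1, 'Tbl B Row 1'), (2, 'Tbl B Row 2'),
                              (3, 'Tbl B Row 3a'), (3, 'Tbl B Row 3b'),
                              (5, 'Tbl B Row 5');
""")

# Same query twice, once with "inner join" and once with plain "join".
QUERY = """
    select TEMP_A.Tbl_ID, TEMP_A.Tbl_Data, TEMP_B.Tbl_ID, TEMP_B.Tbl_Data
    from TEMP_A
    {join} TEMP_B on TEMP_A.Tbl_ID = TEMP_B.Tbl_ID
    order by TEMP_A.Tbl_ID, TEMP_B.Tbl_Data
"""
inner_rows = conn.execute(QUERY.format(join="inner join")).fetchall()
plain_rows = conn.execute(QUERY.format(join="join")).fetchall()

for row in inner_rows:
    print(row)
print("same result with plain join:", inner_rows == plain_rows)
```

The order by clause is only there to make the output order predictable; without it the engine is free to return the rows in any order.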

Left Join (Or Left Outer Join)

The left join returns the intersection of the two tables and, in addition, the rows from the left table that do not have corresponding rows in the right table. What is left and what is right? The left table is the first table specified and the right table is the second. Put another way, the right table is the one named after the join keyword, and the left side is the rest of the data SQL is working with.
Using the data sample created in Appendix A we will get the following result.
select *
from TEMP_A
LEFT OUTER JOIN TEMP_B
ON TEMP_A.Tbl_ID = TEMP_B.Tbl_ID
Tbl_ID | Tbl_Data    | Tbl_ID | Tbl_Data
1      | Tbl A Row 1 | 1      | Tbl B Row 1
2      | Tbl A Row 2 | 2      | Tbl B Row 2
3      | Tbl A Row 3 | 3      | Tbl B Row 3a
3      | Tbl A Row 3 | 3      | Tbl B Row 3b
4      | Tbl A Row 4 | NULL   | NULL
Note that row 4 from table A is now included, but since there is no corresponding row in table B all the fields from table B contain NULLs.
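The following is a rough sketch of the same left join run from Python against an in-memory SQLite database (both are assumptions chosen for illustration). It shows how the NULLs reach client code: sqlite3 hands them back as None, so the unmatched rows from table A can be picked out with a simple check.

```python
import sqlite3

# Appendix A sample data in an in-memory SQLite database (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table TEMP_A (Tbl_ID int not null, Tbl_Data varchar(50) not null);
    create table TEMP_B (Tbl_ID int not null, Tbl_Data varchar(50) not null);
    insert into TEMP_A values (1, 'Tbl A Row 1'), (2, 'Tbl A Row 2'),
                              (3, 'Tbl A Row 3'), (4, 'Tbl A Row 4');
    insert into TEMP_B values (1, 'Tbl B Row 1'), (2, 'Tbl B Row 2'),
                              (3, 'Tbl B Row 3a'), (3, 'Tbl B Row 3b'),
                              (5, 'Tbl B Row 5');
""")

left_rows = conn.execute("""
    select TEMP_A.Tbl_ID, TEMP_A.Tbl_Data, TEMP_B.Tbl_ID, TEMP_B.Tbl_Data
    from TEMP_A
    left outer join TEMP_B on TEMP_A.Tbl_ID = TEMP_B.Tbl_ID
    order by TEMP_A.Tbl_ID, TEMP_B.Tbl_Data
""").fetchall()

# SQL NULL arrives in Python as None; rows whose table B ID is None
# are the table A rows that found no partner.
unmatched = [row for row in left_rows if row[2] is None]

print(len(left_rows), "rows in total")
print("unmatched rows from table A:", unmatched)
```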

Right Join (Or Right Outer Join)

The right join is very much like the left join, but in addition to the intersection it returns the rows from the right table that have no corresponding rows in the left table.
Using the data sample created in Appendix A we will get the following result.
select *
from TEMP_A
RIGHT OUTER JOIN TEMP_B
ON TEMP_A.Tbl_ID = TEMP_B.Tbl_ID
Tbl_ID | Tbl_Data    | Tbl_ID | Tbl_Data
1      | Tbl A Row 1 | 1      | Tbl B Row 1
2      | Tbl A Row 2 | 2      | Tbl B Row 2
3      | Tbl A Row 3 | 3      | Tbl B Row 3a
3      | Tbl A Row 3 | 3      | Tbl B Row 3b
NULL   | NULL        | 5      | Tbl B Row 5
This set also contains 5 rows, but this time the last row contains data from table B and all the fields from table A are NULL.
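Since left and right only depend on the order in which the tables are named, a right join can always be rewritten as a left join with the two tables swapped. The sketch below illustrates this with Python's sqlite3 module and an in-memory SQLite database (both assumptions for illustration). It uses the swapped left join rather than right outer join itself because older SQLite builds do not accept right joins.

```python
import sqlite3

# Appendix A sample data in an in-memory SQLite database (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table TEMP_A (Tbl_ID int not null, Tbl_Data varchar(50) not null);
    create table TEMP_B (Tbl_ID int not null, Tbl_Data varchar(50) not null);
    insert into TEMP_A values (1, 'Tbl A Row 1'), (2, 'Tbl A Row 2'),
                              (3, 'Tbl A Row 3'), (4, 'Tbl A Row 4');
    insert into TEMP_B values (1, 'Tbl B Row 1'), (2, 'Tbl B Row 2'),
                              (3, 'Tbl B Row 3a'), (3, 'Tbl B Row 3b'),
                              (5, 'Tbl B Row 5');
""")

# TEMP_B is now named first, so it is the left table. The column list keeps
# table A's columns first so the output matches the right join shown above.
swapped_rows = conn.execute("""
    select TEMP_A.Tbl_ID, TEMP_A.Tbl_Data, TEMP_B.Tbl_ID, TEMP_B.Tbl_Data
    from TEMP_B
    left outer join TEMP_A on TEMP_A.Tbl_ID = TEMP_B.Tbl_ID
    order by TEMP_B.Tbl_ID, TEMP_B.Tbl_Data
""").fetchall()

for row in swapped_rows:
    print(row)
```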

Full Join (Or Full Outer Join)

This is like a left and a right join combined. It returns the intersection of the two tables, plus all the rows from table A that have no corresponding rows in B and all the rows from B that have no corresponding rows in A.
Using the data sample created in Appendix A we will get the following result:
select *
from TEMP_A
FULL OUTER JOIN TEMP_B
ON TEMP_A.Tbl_ID = TEMP_B.Tbl_ID
Tbl_ID | Tbl_Data    | Tbl_ID | Tbl_Data
1      | Tbl A Row 1 | 1      | Tbl B Row 1
2      | Tbl A Row 2 | 2      | Tbl B Row 2
3      | Tbl A Row 3 | 3      | Tbl B Row 3a
3      | Tbl A Row 3 | 3      | Tbl B Row 3b
4      | Tbl A Row 4 | NULL   | NULL
NULL   | NULL        | 5      | Tbl B Row 5
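Some database engines, including older SQLite builds, do not support full outer join. One common workaround, sketched here with Python's sqlite3 module and an in-memory database (both assumptions for illustration), is to take the left join and append the rows from table B that found no partner in table A. This is a swapped-in technique and not the full outer join statement itself, but on the sample data it should produce the same six rows.

```python
import sqlite3

# Appendix A sample data in an in-memory SQLite database (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table TEMP_A (Tbl_ID int not null, Tbl_Data varchar(50) not null);
    create table TEMP_B (Tbl_ID int not null, Tbl_Data varchar(50) not null);
    insert into TEMP_A values (1, 'Tbl A Row 1'), (2, 'Tbl A Row 2'),
                              (3, 'Tbl A Row 3'), (4, 'Tbl A Row 4');
    insert into TEMP_B values (1, 'Tbl B Row 1'), (2, 'Tbl B Row 2'),
                              (3, 'Tbl B Row 3a'), (3, 'Tbl B Row 3b'),
                              (5, 'Tbl B Row 5');
""")

full_rows = conn.execute("""
    -- Part 1: the intersection plus table A rows without a partner.
    select a.Tbl_ID, a.Tbl_Data, b.Tbl_ID, b.Tbl_Data
    from TEMP_A a
    left outer join TEMP_B b on a.Tbl_ID = b.Tbl_ID
    union all
    -- Part 2: only the table B rows without a partner in table A.
    select a.Tbl_ID, a.Tbl_Data, b.Tbl_ID, b.Tbl_Data
    from TEMP_B b
    left outer join TEMP_A a on a.Tbl_ID = b.Tbl_ID
    where a.Tbl_ID is null
""").fetchall()

for row in full_rows:
    print(row)
```

union all is used instead of union so that genuinely duplicated matches are not collapsed; the where clause in the second part already keeps the intersection from being counted twice.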

Cross Join

A cross join is not really a join: you do not specify the fields to join on, you just specify the names of the tables. It returns every row from table A matched up with every row in table B, so the result has as many rows as the two row counts multiplied together (4 x 5 = 20 with the sample data).
Using the data sample created in Appendix A we will get the following result.
select *
from TEMP_A, TEMP_B
Tbl_ID | Tbl_Data    | Tbl_ID | Tbl_Data
1      | Tbl A Row 1 | 1      | Tbl B Row 1
1      | Tbl A Row 1 | 2      | Tbl B Row 2
1      | Tbl A Row 1 | 3      | Tbl B Row 3a
1      | Tbl A Row 1 | 3      | Tbl B Row 3b
1      | Tbl A Row 1 | 5      | Tbl B Row 5
2      | Tbl A Row 2 | 1      | Tbl B Row 1
2      | Tbl A Row 2 | 2      | Tbl B Row 2
2      | Tbl A Row 2 | 3      | Tbl B Row 3a
2      | Tbl A Row 2 | 3      | Tbl B Row 3b
2      | Tbl A Row 2 | 5      | Tbl B Row 5
3      | Tbl A Row 3 | 1      | Tbl B Row 1
3      | Tbl A Row 3 | 2      | Tbl B Row 2
3      | Tbl A Row 3 | 3      | Tbl B Row 3a
3      | Tbl A Row 3 | 3      | Tbl B Row 3b
3      | Tbl A Row 3 | 5      | Tbl B Row 5
4      | Tbl A Row 4 | 1      | Tbl B Row 1
4      | Tbl A Row 4 | 2      | Tbl B Row 2
4      | Tbl A Row 4 | 3      | Tbl B Row 3a
4      | Tbl A Row 4 | 3      | Tbl B Row 3b
4      | Tbl A Row 4 | 5      | Tbl B Row 5
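The comma form above can also be written with an explicit cross join keyword, which makes the intent easier to spot in a long query. The sketch below, using Python's sqlite3 module and an in-memory SQLite database (both assumptions for illustration), runs both spellings and checks that they return the same rows, and that the row count is the product of the two table sizes.

```python
import sqlite3

# Appendix A sample data in an in-memory SQLite database (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table TEMP_A (Tbl_ID int not null, Tbl_Data varchar(50) not null);
    create table TEMP_B (Tbl_ID int not null, Tbl_Data varchar(50) not null);
    insert into TEMP_A values (1, 'Tbl A Row 1'), (2, 'Tbl A Row 2'),
                              (3, 'Tbl A Row 3'), (4, 'Tbl A Row 4');
    insert into TEMP_B values (1, 'Tbl B Row 1'), (2, 'Tbl B Row 2'),
                              (3, 'Tbl B Row 3a'), (3, 'Tbl B Row 3b'),
                              (5, 'Tbl B Row 5');
""")

comma_rows = conn.execute("select * from TEMP_A, TEMP_B").fetchall()
explicit_rows = conn.execute(
    "select * from TEMP_A cross join TEMP_B"
).fetchall()

count_a = conn.execute("select count(*) from TEMP_A").fetchone()[0]
count_b = conn.execute("select count(*) from TEMP_B").fetchone()[0]

# Row order is not guaranteed, so compare the two results after sorting.
print("same rows:", sorted(comma_rows) == sorted(explicit_rows))
print(count_a, "x", count_b, "=", len(comma_rows))
```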

Why not using a unique key is bad

During this explanation I used table A with unique values in the Tbl_ID field. You should avoid joining tables where the joining key is not unique in at least one of the two tables. If the key is duplicated on both sides, the rows that share a key value behave like a cross join: each of them on one side is matched with each of them on the other. Let's add a row to table A to show this.
insert into TEMP_A values (3, 'Tbl A Row 3 dup')
Now run the normal join as before and watch the results.
select *
from TEMP_A
INNER JOIN TEMP_B
ON TEMP_A.Tbl_ID = TEMP_B.Tbl_ID
Tbl_ID | Tbl_Data        | Tbl_ID | Tbl_Data
1      | Tbl A Row 1     | 1      | Tbl B Row 1
2      | Tbl A Row 2     | 2      | Tbl B Row 2
3      | Tbl A Row 3     | 3      | Tbl B Row 3a
3      | Tbl A Row 3     | 3      | Tbl B Row 3b
3      | Tbl A Row 3 dup | 3      | Tbl B Row 3a
3      | Tbl A Row 3 dup | 3      | Tbl B Row 3b
See how we now returned 6 rows: the two rows with ID 3 in table A are each matched with both rows with ID 3 in table B.
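Before joining on a field it can help to check whether its values really are unique. One way to do that, sketched here with Python's sqlite3 module and an in-memory SQLite database (both assumptions for illustration), is to group by the key and keep only the groups with more than one row. The sketch adds the duplicate row from this section first, then lists the duplicated keys and counts the joined rows.

```python
import sqlite3

# Appendix A sample data in an in-memory SQLite database (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table TEMP_A (Tbl_ID int not null, Tbl_Data varchar(50) not null);
    create table TEMP_B (Tbl_ID int not null, Tbl_Data varchar(50) not null);
    insert into TEMP_A values (1, 'Tbl A Row 1'), (2, 'Tbl A Row 2'),
                              (3, 'Tbl A Row 3'), (4, 'Tbl A Row 4');
    insert into TEMP_B values (1, 'Tbl B Row 1'), (2, 'Tbl B Row 2'),
                              (3, 'Tbl B Row 3a'), (3, 'Tbl B Row 3b'),
                              (5, 'Tbl B Row 5');
""")

# The extra row that makes Tbl_ID non-unique in table A.
conn.execute("insert into TEMP_A values (3, 'Tbl A Row 3 dup')")

# Keys that occur more than once in table A, with their row counts.
dup_keys = conn.execute("""
    select Tbl_ID, count(*)
    from TEMP_A
    group by Tbl_ID
    having count(*) > 1
""").fetchall()

joined_count = conn.execute("""
    select count(*)
    from TEMP_A
    inner join TEMP_B on TEMP_A.Tbl_ID = TEMP_B.Tbl_ID
""").fetchone()[0]

print("duplicated keys in TEMP_A:", dup_keys)
print("rows returned by the inner join:", joined_count)
```

Running the same group by query against TEMP_B would flag ID 3 there as well, which is why the join multiplies those rows.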

Using Join to select orphans

So how do I get all the rows in one table that do not have corresponding rows in the other table? Simple: use a left or right join and a where clause that removes the unwanted rows.
Using the sample data:
select *
from TEMP_A
LEFT OUTER JOIN TEMP_B
ON TEMP_A.Tbl_ID = TEMP_B.Tbl_ID
where TEMP_B.Tbl_ID is null
Tbl_ID | Tbl_Data    | Tbl_ID | Tbl_Data
4      | Tbl A Row 4 | NULL   | NULL
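The same idea works in the other direction, and the left join plus where pattern can also be written with not exists. The sketch below, using Python's sqlite3 module and an in-memory SQLite database (both assumptions for illustration), finds the orphan in table B with a swapped left join and finds the unmatched row in table A with not exists. Which form performs better depends on the database engine, so treat this as an alternative spelling and not a recommendation.

```python
import sqlite3

# Appendix A sample data in an in-memory SQLite database (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table TEMP_A (Tbl_ID int not null, Tbl_Data varchar(50) not null);
    create table TEMP_B (Tbl_ID int not null, Tbl_Data varchar(50) not null);
    insert into TEMP_A values (1, 'Tbl A Row 1'), (2, 'Tbl A Row 2'),
                              (3, 'Tbl A Row 3'), (4, 'Tbl A Row 4');
    insert into TEMP_B values (1, 'Tbl B Row 1'), (2, 'Tbl B Row 2'),
                              (3, 'Tbl B Row 3a'), (3, 'Tbl B Row 3b'),
                              (5, 'Tbl B Row 5');
""")

# Rows in table B with no parent in table A: left join from B, keep the NULLs.
orphans_b = conn.execute("""
    select TEMP_B.Tbl_ID, TEMP_B.Tbl_Data
    from TEMP_B
    left outer join TEMP_A on TEMP_A.Tbl_ID = TEMP_B.Tbl_ID
    where TEMP_A.Tbl_ID is null
""").fetchall()

# Rows in table A with no match in table B, written with not exists.
unmatched_a = conn.execute("""
    select Tbl_ID, Tbl_Data
    from TEMP_A
    where not exists (
        select 1 from TEMP_B where TEMP_B.Tbl_ID = TEMP_A.Tbl_ID
    )
""").fetchall()

print("orphans in TEMP_B:", orphans_b)
print("unmatched rows in TEMP_A:", unmatched_a)
```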

Appendix A (Sample Data)

-- Create the first temp table
create table TEMP_A (
Tbl_ID int not null,
Tbl_Data varchar(50) not null
)
-- Insert sample data into the first temp table
insert into TEMP_A values (1, 'Tbl A Row 1')
insert into TEMP_A values (2, 'Tbl A Row 2')
insert into TEMP_A values (3, 'Tbl A Row 3')
insert into TEMP_A values (4, 'Tbl A Row 4')
-- Create the second temp table
create table TEMP_B (
Tbl_ID int not null,
Tbl_Data varchar(50) not null
)
-- Insert sample data into the second temp table
insert into TEMP_B values (1, 'Tbl B Row 1')
insert into TEMP_B values (2, 'Tbl B Row 2')
insert into TEMP_B values (3, 'Tbl B Row 3a')
insert into TEMP_B values (3, 'Tbl B Row 3b')
insert into TEMP_B values (5, 'Tbl B Row 5')
