Koha How-To

UNION or JOIN in my query???

If you have taken a look at the Koha Reports Library page you’ve seen some interesting report syntax that you know works when you plug it into the Reports editor in Koha but may not be linguistically clear when reading the SQL. In my last posting I reviewed Joins and how they are used. Today I’m reviewing UNIONs and what they are and how they are used.

To review, a join is used when you have a primary key in one table that you link to a second table. This linking allows you to pull data from both tables simultaneously and accurately. For example, pulling patron name and current checkouts links the borrower table and issues table on borrowernumber. Your results are a list of patrons with each item checked out to them.

A UNION works differently, instead of linking two tables it “links” two queries. Let me explain…a UNION is a way to combine the results of two or more queries into one result set. In order for the UNION to work correctly you need to remember a few things:

  1. The queries you are placing into a UNION need to have the same structure, that is, the same number of columns and identical data types for each column. The columns from each query do not need to be named the same, just be similar in nature (text, integer, etc).
  2. The final data set column headings will be the column names designated by the first query in the UNION.
  3. A UNION will default to only showing the DISTINCT rows in the result set. If you want to see every result you need to specify UNION ALL in your SQL statement.

A simple example is shown below. This query will show us which patrons (name and branch) borrowed from the A_branch combined with a list of patrons that borrowed from the B_branch.

(SELECT b.surname, b.firstname, b.branchcode
FROM borrowers b
LEFT JOIN old_issues o on b.borrowernumber=o.borrowernumber
WHERE o.branchcode = ‘A_BRANCH’)
UNION
(SELECT b.surname, b.firstname, b.branchcode
FROM borrowers b
LEFT JOIN old_issues o on b.borrowernumber=o.borrowernumber
WHERE o.branchcode=’B_BRANCH’)

However, you probably noticed that this is not the best method to run this particular query. Using a simple JOIN makes a simpler and more elegant way to get the same information.

SELECT distinct b.surname, b.firstname, b.branchcode
FROM borrowers b
LEFT JOIN old_issues o on o.borrowernumber=b.borrowernumber
WHERE o.branchcode in (“WOOD”, “TENSED”);

What then is the point of UNION? A UNION’s strongest role in SQL is to allow us to do things in one query that we couldn’t do with just JOINS. One of the handiest use of UNION is to create a subset of data that we can then query again! If we take a look at the query “Count Active Patrons” below (copied from the Koha Reports Library) we see there is a SELECT in the first line that is ‘selecting’ data from the subset of data created by the two queries following the FROM. The two queries are joined together into one set of data by the UNION and that set of data is now queried to gather the data needed.

SELECT YEAR(issuedate), MONTH(issuedate), categorycode, COUNT(DISTINCT borrowernumber)
FROM (
SELECT issuedate, borrowernumber FROM old_issues
UNION ALL
SELECT issuedate, borrowernumber FROM issues
) AS all_issues
LEFT JOIN borrowers USING (borrowernumber)
GROUP BY YEAR(issuedate), MONTH(issuedate), categorycode

Read through the Koha Report Library and you’ll see UNION used a number of times. It’s a handy tool to keep in your SQL toolbox!

Read more by Joy Nelson

Tags reports tutorial, reports, SQL