Home » Top 20 SQL JOINs Interview Questions and Answers (2023)

Top 20 SQL JOINs Interview Questions and Answers (2023)

Data is very valuable to organizations. Actionable insights that give an organization competitive advantage and help it run more efficiently can be extracted from the organization’s data. Therefore, data must be collected and stored.

Databases are an organized way to store and query data. There are two main types of databases: relational and non-relational.

Relational databases are very popular because they’re structured and data is organized in tables. The data in the various tables may have one-to-one, one-to-many, or many-to-many relationships. The way the data is organized in a relational database is defined in its schema.

Non-relational databases store data using simple key-value pairs in non-tabular form. Although relational databases support Binary Large Objects (BLOB) for storing semi-structured and unstructed data, storing and retrieving these types of data is easier with non-relational databases.

The Structured Query Language (SQL) is the language for accessing and interacting with relational databases. If you want to add, delete, edit, or query information on a relational database, the easiest way to do it is through SQL.

Relational databases dominate the database market and they are projected to grow by more than 30 percent between 2021 and 2026. SQL is the most in-demand skill for data related jobs. As more and more organizations embrace the use of relational databases, the demand for data professionals with SQL skills will continue to grow.

More often than not, you may need to combine two or more tables in a relational database to get the data needed for performing your analysis. You can combine tables in SQL with JOIN clauses. There are several SQL JOIN clauses; understanding how each of them works can be challenging.

SQL JOIN questions are, therefore, interviewers’ favorites. They tend to show up in most SQL interviews. In this tutorial, we’ll take you step-by-step through the answers to the top 20 SQL JOIN interview questions — and equip you with the knowledge to ace your upcoming SQL interviews. We’ll be writing a lot of SQL queries. This SQL Cheat Sheet will help you get up to speed if your SQL skills are a little rusty.


SQL JOINs Interview Questions

In a SQL interview, you’ll most likely be asked questions that require you to combine tables. SQL JOIN clauses are used to combine data from two or more tables in a relational database.

SQL JOIN clauses are often used when the tables share some sort of relationship. The most common conditional expression to join tables is the equality conditional (equi-join). However, you can join tables that don’t share any relationships and use other conditional expressions besides equality.

Here are some commoly asked SQL JOIN interview questions and their answers. We recommend that you draw your own diagrams and repeatedly practice these questions to master how SQL JOINs work.

1. What are SQL JOINs?

Answer

SQL JOIN clauses are used to combine rows from two or more tables. Usually, tables are joined based on a related column between them, but you can also join tables that have no relationships. The term “SQL JOIN” is used to refer to a SQL query with the JOIN keyword. This is known as an explicit join. However, you can also join tables in SQL without using the JOIN keyword. This is known as an implicit join.

2. What are the general explicit and implicit join notation syntaxes?

Answer

EXPLICIT JOIN:

SELECT *
FROM [left table]
[JOIN CLAUSE] [right table]
ON conditional expression;

IMPLICIT JOIN:

SELECT *
FROM [left table], [right table]
WHERE conditional expression;

The conditional expression is not required for some types of SQL JOINs.

3. What are the different types of JOINs in SQL?

Answer

image-1.pngimage-1.png

The main types of SQL JOINs are:

  • CROSS JOIN: matches every row of the left table with every row of the right table and returns their Cartesian product. It’s also known as Cartesian JOIN.
  • [INNER] JOIN: returns rows that have matching values on both the right and left tables, based on the specified conditional expression.
  • NATURAL JOIN: a type of equi-join that combines the left and right tables using common column names.
  • LEFT [OUTER] JOIN: returns all the rows from the left table with the matching rows from the right table, based on the specified conditional expression.
  • RIGHT [OUTER] JOIN: returns all the rows from the right table with the matching rows from the left table, based on the specified conditional expression.
  • FULL [OUTER] JOIN: returns all rows from the left and the right tables, regardless of whether the conditional expression is matched.
  • SELF JOIN: allows you to join a table to itself as if the tables were two different tables.

It’s worth noting that the SQL syntax placed inside square brackets is optional and can be excluded from the query. For example, the SQL query below executes an INNER JOIN:

SELECT *
FROM left_table
JOIN right_table
ON conditional expression;

4. What type of JOIN is executed without conditional expression in the explict and implict join notations shown below?

Explicit join:

SELECT *
FROM [left table]
[JOIN CLAUSE] [right table]

Implict join:

SELECT *
FROM [left table], [right table]

Answer

Let’s set up a SQLite database with tables. When you try to connect to a database that doesn’t exist, it’s created for you. First:

pip install ipython-sql

Then create the database and tables:

%%capture
%load_ext sql
sql sqlite:///joins.db
%%sql

-- create a students table:

CREATE TABLE students (
    student_id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name TEXT NOT NULL
);

INSERT INTO students VALUES (1, "Mary", "Wilson");
INSERT INTO students VALUES (2, "Tim", "Ben");
INSERT INTO students VALUES (3, "Alice", "Robinson");
INSERT INTO students VALUES (4, "Reece", "Bells");

-- create a student_contact table:

CREATE TABLE student_contact (
    student_id,
    email_address,
    FOREIGN KEY (student_id) REFERENCES students(student_id)
);

INSERT INTO student_contact VALUES (1, "[email protected]");
INSERT INTO student_contact VALUES (2, "[email protected]");
INSERT INTO student_contact VALUES (3, "[email protected]");

-- create a staff table:

CREATE TABLE staff (
    staff_id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name TEXT NOT NULL
);

INSERT INTO staff VALUES (1, "Ada", "Lovelace");
INSERT INTO staff VALUES (2, "Adam ", "Smith");
INSERT INTO staff VALUES (3, "Nikolo", "Tesla");

-- create a staff_contact table:

CREATE TABLE staff_contact (
    staff_id,
    email_address,
    FOREIGN KEY (staff_id) REFERENCES staff(staff_id)
);

INSERT INTO staff_contact VALUES (1, "[email protected]");
INSERT INTO staff_contact VALUES (2, "[email protected]");
INSERT INTO staff_contact VALUES (3, "[email protected]");
     * sqlite:///joins.db
    Done.
    1 rows affected.
    1 rows affected.
    1 rows affected.
    1 rows affected.
    Done.
    1 rows affected.
    1 rows affected.
    1 rows affected.
    Done.
    1 rows affected.
    1 rows affected.
    1 rows affected.
    Done.
    1 rows affected.
    1 rows affected.
    1 rows affected.

We’ll use the student_contact as the left table and staff_contact as the right table to answer this question. Let’s see them:

%sql SELECT * FROM student_contact;
     * sqlite:///joins.db
    Done.
%sql SELECT * FROM staff_contact;
     * sqlite:///joins.db
    Done.

If we don’t specify the join condition, SQL does it for us. It assumes a CROSS JOIN. If the left table has n rows and y columns, and the right table has m rows and z columns, a CROSS JOIN will return y + z columns and m x n rows. Our example will return:

This is true for JOIN clauses that don’t have conditional expressions built into them. The NATURAL JOIN has an equality expression built into it and will not result in a CROSS JOIN because we don’t even need to specify the conditional expression when using it.

Let’s evaluate whether this is true for the the following explict joins: INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN:

%%sql

-- INNER JOIN without conditional expression

SELECT *
FROM student_contact
INNER JOIN staff_contact;
     * sqlite:///joins.db
    Done.
%%sql

-- LEFT OUTER JOIN without conditional expression

SELECT *
FROM student_contact
LEFT OUTER JOIN staff_contact;
     * sqlite:///joins.db
    Done.
%%sql

-- RIGHT OUTER JOIN without conditional expression

SELECT *
FROM student_contact
RIGHT OUTER JOIN staff_contact;
     * sqlite:///joins.db
    Done.
%%sql

-- FULL OUTER JOIN without conditional expression

SELECT *
FROM student_contact
FULL OUTER JOIN staff_contact;
     * sqlite:///joins.db
    Done.

Let’s evaluate whether this is also true for the implicit join:

%%sql

-- IMPLICIT JOIN without conditional expression

SELECT *
FROM student_contact, staff_contact;
     * sqlite:///joins.db
    Done.

Lastly, let’s evaluate whether this is true for a NATURAL JOIN:

%%sql

-- NATURAL JOIN has in-built equality conditional

SELECT *
FROM student_contact
NATURAL JOIN staff_contact;
     * sqlite:///joins.db
    Done.
student_idemail_addressstaff_id

NATURAL JOIN does not return a CROSS JOIN because of its in-built equality conditional. It searches for the same column name(s) in the left and right tables and applies the equality (=) conditional. For this example, email_address is the similar column in both tables. However, there are no matching email addresses in the tables, so an empty table is returned.

5. How is the INNER JOIN executed?

Answer

The INNER JOIN returns the matching values in both the right and left tables. A CROSS JOIN is first executed, then the conditional expression specified is used to limit the rows returned. Let’s illustrate with the students and student_contact tables.

The cross join of these tables returns:

%%sql

-- CROSS JOIN of the students and student_contact

SELECT *
FROM students
CROSS JOIN student_contact;
     * sqlite:///joins.db
    Done.

Let’s assume we’re using the equality conditional. Then rows matching in the student_id columns from both tables are returned:

There is no matching value for Reece Bells on both tables. So, values from this row are not returned. This SQL query can be evaluated using an INNER JOIN as:

%%sql

-- INNER JOIN of the students and student_contact

SELECT *
FROM students
INNER JOIN student_contact
    ON students.student_id = student_contact.student_id;
     * sqlite:///joins.db
    Done.

6. How is the LEFT OUTER JOIN executed?

Answer

First LEFT OUTER JOIN returns all the rows in the left table:

student_idfirst_namelast_namestudent_idemail_address
1MaryWilson
2TimBen
3AliceRobinson
4ReeceBells

Next, matching values from CROSS JOIN with the right tables are also returned:

Finally, unmatched rows from the left table are added to the result. Their values are padded with either NULL or None:

The SQL query for a LEFT OUTER JOIN with equality conditional is shown below:

%%sql

-- LEFT OUTER JOIN of the students and student_contact

SELECT *
FROM students
LEFT JOIN student_contact
    ON students.student_id = student_contact.student_id;
     * sqlite:///joins.db
    Done.

7. How is the FULL OUTER JOIN executed?

Answer

The FULL OUTER JOIN returns all rows from the left and the right tables, regardless of whether the conditional expression is matched. To show how this works, we’ll create two additional tables. The courses table lists the courses presently being offered in school, and the last_enrolment table lists the courses students last enrolled in.

%%sql

-- CREATE the courses table

CREATE TABLE courses (
    course_id INTEGER PRIMARY KEY,
    course_name TEXT NOT NULL
);

INSERT INTO courses VALUES (100, "Advanced Physics");
INSERT INTO courses VALUES (200, "Computer Science");
INSERT INTO courses VALUES (300, "Economics");
INSERT INTO courses VALUES (400, "Quantum Computing");
INSERT INTO courses VALUES (500, "Cryptography");

-- CREATE the last_enrolment table

CREATE TABLE last_enrolment (
    student_id,
    course_id,
    FOREIGN KEY (student_id) REFERENCES students(student_id),
    FOREIGN KEY (course_id) REFERENCES courses(course_id)
);

INSERT INTO last_enrolment VALUES (2, 500);
INSERT INTO last_enrolment VALUES (1, 500);
INSERT INTO last_enrolment VALUES (3, 400);
INSERT INTO last_enrolment VALUES (2, 400);
INSERT INTO last_enrolment VALUES (4, 111);
     * sqlite:///joins.db
    Done.
    1 rows affected.
    1 rows affected.
    1 rows affected.
    1 rows affected.
    1 rows affected.
    Done.
    1 rows affected.
    1 rows affected.
    1 rows affected.
    1 rows affected.
    1 rows affected.

Let’s view these tables:

%sql SELECT * FROM courses;
     * sqlite:///joins.db
    Done.
course_idcourse_name
100Advanced Physics
200Computer Science
300Economics
400Quantum Computing
500Cryptography
%sql SELECT * FROM last_enrolment;
     * sqlite:///joins.db
    Done.
student_idcourse_id
2500
1500
3400
2400
4111

Let’s get back to how the FULL OUTER JOIN works. First, the CROSS JOIN is evaluated:

%%sql
SELECT * 
FROM last_enrolment
CROSS JOIN courses;
     * sqlite:///joins.db
    Done.
student_idcourse_idcourse_id_1course_name
2500100Advanced Physics
2500200Computer Science
2500300Economics
2500400Quantum Computing
2500500Cryptography
1500100Advanced Physics
1500200Computer Science
1500300Economics
1500400Quantum Computing
1500500Cryptography
3400100Advanced Physics
3400200Computer Science
3400300Economics
3400400Quantum Computing
3400500Cryptography
2400100Advanced Physics
2400200Computer Science
2400300Economics
2400400Quantum Computing
2400500Cryptography
4111100Advanced Physics
4111200Computer Science
4111300Economics
4111400Quantum Computing
4111500Cryptography

Next, the entire row of the left table, last_enrolment, is returned:

student_idcourse_id
2500
1500
3400
2400
4111

Then the matching values in the right table, courses, are returned from the CROSS JOIN:

student_idcourse_idcourse_idcourse_name
2500500Cryptography
1500500Crypography
3400400Quantum Computing
2400400Quantum Computing
4111

Next, the non-matching rows in the right table are returned:

student_idcourse_idcourse_idcourse_name
2500500Cryptography
1500500Crypography
3400400Quantum Computing
2400400Quantum Computing
4111
100Advanced Physics
200Computer Science
300Economics

Finally, the empty cells are padded with NULL or None:

student_idcourse_idcourse_idcourse_name
2500500Cryptography
1500500Crypography
3400400Quantum Computing
2400400Quantum Computing
4111NULLNULL
NULLNULL100Advanced Physics
NULLNULL200Computer Science
NULLNULL300Economics

Let’s confirm that our result is correct by running a FULL OUTER JOIN query:

%%sql

SELECT * 
FROM last_enrolment
FULL OUTER JOIN courses
    ON last_enrolment.course_id = courses.course_id;
     * sqlite:///joins.db
    Done.
student_idcourse_idcourse_id_1course_name
2500500Cryptography
1500500Cryptography
3400400Quantum Computing
2400400Quantum Computing
4111NoneNone
NoneNone100Advanced Physics
NoneNone200Computer Science
NoneNone300Economics

8. How do you eliminate redundant columns from SQL JOINs?

Answer

The first way to eliminate redundant columns when running JOIN queries is to use the USING clause as our conditional expression:

SELECT *
FROM left_table
[JOIN CLAUSE] right_table
    USING [col_name, ..];

Let’s first run an INNER JOIN query:

%%sql

-- INNER JOIN of the students and student_contact

SELECT *
FROM students
INNER JOIN student_contact
    ON students.student_id = student_contact.student_id;
     * sqlite:///joins.db
    Done.

There are duplicate student_id columns. Let’s eliminate this redundancy with the USING clause:

%%sql

-- INNER JOIN of the students and student_contact with USING

SELECT *
FROM students
INNER JOIN student_contact
    USING (student_id);
     * sqlite:///joins.db
    Done.

You can see that the duplicate column has been removed.

The second way to eliminate the redundant column is to use the NATURAL JOIN. The NATURAL JOIN clause is semantically similar to the INNER JOIN .. USING clause, in which the conditional expression is equality.

Let try NATURAL JOIN with the previous example:

%%sql

-- NATURAL JOIN instead of INNER JOIN ... USING

SELECT *
FROM students
NATURAL JOIN student_contact;
     * sqlite:///joins.db
    Done.

A third way to eliminate duplicate columns is to list the column names you want displayed in SELECT:

%%sql

-- INNER JOIN list column names in SELECT and using ALIASES

SELECT 
    s.student_id, 
    s.first_name, 
    s.last_name, 
    sc.email_address
FROM students AS s
INNER JOIN student_contact AS sc
    ON s.student_id = sc.student_id;
     * sqlite:///joins.db
    Done.

So far, we’ve seen how the SQL JOIN clauses are used to combine two tables. Some of the SQL JOIN interview questions that we’ll see next will require combining and filtering data from more than two tables. Writing SQL JOIN queries for these types of questions may be challenging. Take the Combining Tables in SQL Course to reinforce your knowledge of how SQL JOINS work and prepare yourself for the next questions.

9. How is a UNION clause different from a JOIN clause?

Answer

The JOIN and UNION clauses are used to combine data from two or more tables. With the JOIN clause, the columns and matching rows from both tables are returned. The number of columns in the combined table is the sum of the number of columns on both tables.

UNION combines data from tables by stacking them vertically. The number of columns in the SELECT statement and the order of their data types must be the same. The staff and students tables can be combined with UNION as follows:

%%sql 

/*
* Both tables have columns with the same data types in the same order
* We can use the * wild card instead of specifying columns in SELECT
*/
SELECT * FROM staff

UNION

SELECT * FROM students;
     * sqlite:///joins.db
    Done.
staff_idfirst_namelast_name
1AdaLovelace
1MaryWilson
2Adam Smith
2TimBen
3AliceRobinson
3NikoloTesla
4ReeceBells
%%sql 

/*
 * We specify the columns we want returned in SELECT
*/
SELECT 
    first_name, 
    last_name
FROM staff

UNION

SELECT 
    first_name, 
    last_name
FROM students;
     * sqlite:///joins.db
    Done.
first_namelast_name
AdaLovelace
Adam Smith
AliceRobinson
MaryWilson
NikoloTesla
ReeceBells
TimBen

10. What do you understand by Non-EQUI JOIN?

Answer

Non-EQUI JOINs combine tables using other conditional operators besides equality. The other operators may include: less than (<), less than or equals (<=), greater than (>), greater than or equals (>=), not equals (<>), and BETWEEN. For example:

%%sql

SELECT * 
FROM last_enrolment
FULL OUTER JOIN courses
    ON last_enrolment.course_id > 300
    AND courses.course_id < 300
WHERE courses.course_id IS NOT NULL
    AND last_enrolment.course_id IS NOT NULL;
     * sqlite:///joins.db
    Done.
student_idcourse_idcourse_id_1course_name
2500100Advanced Physics
2500200Computer Science
1500100Advanced Physics
1500200Computer Science
3400100Advanced Physics
3400200Computer Science
2400100Advanced Physics
2400200Computer Science

11. How will you join a table to itself?

Answer

You can join a table to itself if it contains related columns. Let’s assume that Ada Lovelace is the supervisor to the other staff. We’ll add a supervisor column with Ada Lovelace’s staff_id. The supervisor and staff_id columns are related.

%sql SELECT * FROM staff;
     * sqlite:///joins.db
    Done.
staff_idfirst_namelast_name
1AdaLovelace
2Adam Smith
3NikoloTesla
%%sql

-- Add supervisor column

ALTER TABLE staff
ADD supervisor INTEGER;

SELECT * FROM staff;
     * sqlite:///joins.db
    Done.
    Done.
staff_idfirst_namelast_namesupervisor
1AdaLovelaceNone
2Adam SmithNone
3NikoloTeslaNone
%%sql

-- Update supervisor column with Ada Lovelace staff_id

UPDATE staff
SET supervisor = 1
WHERE staff_id <> 1;

SELECT * FROM staff;
     * sqlite:///joins.db
    Done.
    Done.
staff_idfirst_namelast_namesupervisor
1AdaLovelaceNone
2Adam Smith1
3NikoloTesla1

Let’s perform self join on this table to list the staff and their supervisor:

%%sql

SELECT * 
FROM staff s1
LEFT JOIN staff s2
    ON s1.supervisor = s2.staff_id
     * sqlite:///joins.db
    Done.
staff_idfirst_namelast_namesupervisorstaff_id_1first_name_1last_name_1supervisor_1
1AdaLovelaceNoneNoneNoneNoneNone
2Adam Smith11AdaLovelaceNone
3NikoloTesla11AdaLovelaceNone

You can observe that the left table, s1, contains the information for the staff and the right table; s2 contains the information about their supervisors. Let’s make things clearer by specifying column names in SELECT:

%%sql

SELECT
    s1.staff_id AS staff_id,
    s1.first_name AS first_name,
    s1.last_name AS last_name,
    s2.first_name AS supervisor_first_name,
    s2.last_name AS supervisor_last_name
FROM staff s1
LEFT JOIN staff s2
    ON s1.supervisor = s2.staff_id
     * sqlite:///joins.db
    Done.
staff_idfirst_namelast_namesupervisor_first_namesupervisor_last_name
1AdaLovelaceNoneNone
2Adam SmithAdaLovelace
3NikoloTeslaAdaLovelace

Let’s drop the supervisor column we added to the staff table:

%%sql

ALTER TABLE staff
DROP COLUMN supervisor;

SELECT * FROM staff;
     * sqlite:///joins.db
    Done.
    Done.
staff_idfirst_namelast_name
1AdaLovelace
2Adam Smith
3NikoloTesla

12. What do you understand about Nested JOINs?

Answer

Nested JOIN simply means combining more than two tables. Let’s use nested join to create a table containing students’ full information: id, names, contact, and courses last enrolled.

%%sql

SELECT *
FROM students
LEFT JOIN student_contact
    USING (student_id)
LEFT JOIN last_enrolment
    USING (student_id)
LEFT JOIN courses
    USING (course_id);
     * sqlite:///joins.db
    Done.

At the start, the students and student_contact tables are the left and right tables respectively. When these two tables are joined, the resulting table is the left table for the next join operation with the last_enrolment table, which is the right table. The resulting table from this join operation is the left table when joining with the courses table.

13. How will you use a subquery to execute INNER JOINs for more than two tables?

Answer

Both subquery and join can be used to combine data from two or more tables. Subqueries can be used to perform inner join operations if the equality conditional operator is used with the WHERE clause. This is an example of implict join:

%%sql

SELECT *
FROM (
        -- Third join operation

        SELECT * 
        FROM

            -- Second join operation

            (SELECT * 
            FROM
                (

                -- First join operation

                SELECT * 
                FROM students, student_contact
                WHERE students.student_id = student_contact.student_id

                -- End of first join operation

                ) AS sub_query_one, last_enrolment
            WHERE sub_query_one.student_id = last_enrolment.student_id

             -- End of second join operation

            ) AS sub_query_two, courses

        WHERE sub_query_two.course_id = courses.course_id

        -- End of third join operation   
)
ORDER BY 1;
     * sqlite:///joins.db
    Done.

There are duplicate columns in the result. These can be removed by listing the names of the columns in the outermost SELECT statement.

If you’re having a hard time understanding how subqueries work, our SQL Subqueries course will help you get started. If you’re new to SQL and you’re having difficulties understanding the SQL queries, we recommend that you learn the basics from our SQL Fundamentals Skill Path.

14. Write the SQL queries to execute the operations shown in the diagrams

image-2.pngimage-2.png

Answer

Diagram 1 set notation is $A – B$, read as A difference B. We want all the elements that belong to A but not B.

Let’s assume A is students table and B is student_contact. We want to return only records that are in students but not in student_contact.

Let’s view the tables:

%sql SELECT * FROM students;
     * sqlite:///joins.db
    Done.
student_idfirst_namelast_name
1MaryWilson
2TimBen
3AliceRobinson
4ReeceBells
%sql SELECT * FROM student_contact;
     * sqlite:///joins.db
    Done.

Only Reece Bells satisfies the condition in diagram 1. The query is given below:

%%sql

SELECT *
FROM students
LEFT JOIN student_contact
    USING (student_id)
WHERE student_contact.student_id IS NULL;
     * sqlite:///joins.db
    Done.
student_idfirst_namelast_nameemail_address
4ReeceBellsNone

LEFT JOIN returns all the rows in students table, and student_contact.student_id IS NULL sets the intersecting rows to NULL.

Diagram 2 set notation is $(A \cap B)^c$. This is the elements of both A and B that are not common to them. We’ll use the last_enrolment and courses tables.

Let’s view the tables:

%sql SELECT * from last_enrolment;
     * sqlite:///joins.db
    Done.
student_idcourse_id
2500
1500
3400
2400
4111
%sql SELECT * FROM courses;
     * sqlite:///joins.db
    Done.
course_idcourse_name
100Advanced Physics
200Computer Science
300Economics
400Quantum Computing
500Cryptography

The common keys are 400 and 500. These will be excluded. The returned table contains 111, 100, 200, and 300. The query is shown below:

%%sql

SELECT * 
FROM last_enrolment
FULL OUTER JOIN courses
    USING(course_id)
WHERE last_enrolment.course_id IS NULL
    OR courses.course_id is NULL;
     * sqlite:///joins.db
    Done.
student_idcourse_idcourse_name
4111None
None100Advanced Physics
None200Computer Science
None300Economics

15. Discuss a common pitfall with using JOIN

Answer

There are different types of JOINs. We may need to implement a particular type of JOIN to answer our question correctly. As you’re already aware, if you don’t specify the conditional expression, your SQL query will not fail. It will return a CROSS JOIN. If you go on to perform your analysis on the CROSS JOIN, you’ll get an incorrect result. You’ll also get incorrect results when you don’t join your tables with the appropriate conditional expressions.

You’ll also get incorrect results if you incorrectly filter your JOIN with the WHERE clause. Incorrectly filtering can result in an unintended type of JOIN. For example, a LEFT JOIN can be transformed to an INNER JOIN with an incorrect WHERE clause.

Let’s see the LEFT JOIN result:

%%sql

-- LEFT OUTER JOIN of the students and student_contact

SELECT *
FROM students
LEFT JOIN student_contact
    ON students.student_id = student_contact.student_id;
     * sqlite:///joins.db
    Done.

Now let’s incorrectly filter the LEFT JOIN to transform it into an INNER JOIN:

%%sql

-- LEFT OUTER JOIN transformed to INNER JOIN by incorrect filtering

SELECT *
FROM students
LEFT JOIN student_contact
    ON students.student_id = student_contact.student_id
WHERE student_contact.student_id > 0;
     * sqlite:///joins.db
    Done.

Logical errors are introduced into our program this way. Logical errors do not return error messages, making them difficult to detect — especially when we’re working with large tables.

Close the connection to the joins.db database:

%sql -x / --close sqlite:///joins.db

16. How will you structure data for facilitating JOIN in a one-to-many relationship condition?

Answer

In a one-to-many relationship, one record in table A can be associated to more than one record in table B. So the primary key of A is a foreign key in B. The students and last_enrolment tables share a one-to-many relationship, as a student can be enrolled in more than one course.

-- create a students table:

CREATE TABLE students (
    student_id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name TEXT NOT NULL
)

--create last_enrolment table

CREATE TABLE last_enrolment (
    student_id,
    course_id,
    FOREIGN KEY (student_id) REFERENCES students(student_id),
    FOREIGN KEY (course_id) REFERENCES courses(course_id)
);

In the above code snippet, the student_id PRIMARY KEY in the students table is set as FOREIGN KEY in the last_enrolment table.

17. How will you structure data for facilitating JOIN in a many-to-many relationship condition?

Answer

In a many-to-many relationship, multiple records in table A are associated with multiple records in table B. There is a many-to-many relationship between the students and courses tables. A student can take more than one course, and there can be more than one student in a course.

The many-to-many relationship in our example is broken into two one-to-many relationships. The first one-to-many relationship is the relationship between students and last_enrolment, and the second one-to-many relationship is the relationship between courses and last_enrolment. The last_enrolment table connects the students and courses tables and is referred to as a join table.

-- create a students table:

CREATE TABLE students (
    student_id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name TEXT NOT NULL
)

-- CREATE the courses table

CREATE TABLE courses (
    course_id INTEGER PRIMARY KEY,
    course_name TEXT NOT NULL
)

--create last_enrolment table

CREATE TABLE last_enrolment (
    student_id,
    course_id,
    FOREIGN KEY (student_id) REFERENCES students(student_id),
    FOREIGN KEY (course_id) REFERENCES courses(course_id)
);

18. Write a query to return information on people that love Jazz music

Write a query to return the email and full_names of people that have bought Jazz music. You’ll be working with the chinook.db. You can find the schema and download the file from here.

Answer

First, we connect to the database and list the tables in it:

%sql sqlite:///chinook.db --connect to the database
%%sql

-- List the tables in the database

SELECT name 
FROM sqlite_master 
WHERE type = 'table';
     * sqlite:///chinook.db
    Done.
name
album
artist
customer
employee
genre
invoice
invoice_line
media_type
playlist
playlist_track
track

We’ll return the email, first_name, and last_name columns on the customer table for the subset where the genre is Jazz. We want to connect the customer table to the genre table. The schema will be helpful here. In the schema, we’ll need to join the customer, invoice, invoice_line, track, and genre tables to get the information we want.

%%sql

SELECT
    DISTINCT customer.email,
    customer.first_name || " " || customer.last_name AS full_name
FROM customer
INNER JOIN invoice
    ON customer.customer_id = invoice.customer_id
INNER JOIN invoice_line 
    ON invoice_line.invoice_id = invoice.invoice_id
INNER JOIN track
    ON track.track_id = invoice_line.track_id
INNER JOIN genre
    ON genre.genre_id = track.genre_id
WHERE genre.name = 'Jazz'
ORDER BY 1;
     * sqlite:///chinook.db
    Done.

It’s possible for a customer to buy several Jazz music items. So the customer details will be returned as many times as they bought Jazz music. We’ve used the DISTINCT keyword to return only one detail for this type of customer.

19. Write a query to return the top 5 spenders on Jazz and Rock

Answer

We’ll continue to work with the chinook.db. We need to join the customer and genre tables and filter by the type of genre. We’ll be returning the customer full_name, and the amount_spent. To get the amount spent, we’ll look at the invoice_line table for the unit_price and quantity purchased.

A single customer can buy music from a genre more than once. Their information will apply multiple times. You need to GROUP BY the customer full_name and SUM the values from the product of unit_price and quantity to get the total amount spent on that genre:

%%sql

-- Top 5 Spenders on Jazz

SELECT
    customer.first_name || " " || customer.last_name AS full_name,
    ROUND(SUM(invoice_line.unit_price * invoice_line.quantity), 2) AS amount_spent
FROM customer
INNER JOIN invoice
    ON customer.customer_id = invoice.customer_id
INNER JOIN invoice_line 
    ON invoice_line.invoice_id = invoice.invoice_id
INNER JOIN track
    ON track.track_id = invoice_line.track_id
INNER JOIN genre
    ON genre.genre_id = track.genre_id
WHERE genre.name = 'Jazz'
GROUP BY 1
ORDER BY 2 DESC
LIMIT 5;
     * sqlite:///chinook.db
    Done.
full_nameamount_spent
Fernanda Ramos15.84
Enrique Muñoz15.84
Hannah Schneider14.85
Astrid Gruber14.85
Kara Nielsen3.96
%%sql

-- Top 5 Spenders on Rock

SELECT
    customer.first_name || " " || customer.last_name AS full_name,
    ROUND(SUM(invoice_line.unit_price * invoice_line.quantity), 2) AS amount_spent
FROM customer
INNER JOIN invoice
    ON customer.customer_id = invoice.customer_id
INNER JOIN invoice_line 
    ON invoice_line.invoice_id = invoice.invoice_id
INNER JOIN track
    ON track.track_id = invoice_line.track_id
INNER JOIN genre
    ON genre.genre_id = track.genre_id
WHERE genre.name = 'Rock'
GROUP BY 1
ORDER BY 2 DESC
LIMIT 5;
     * sqlite:///chinook.db
    Done.
full_nameamount_spent
Helena Holý75.24
François Tremblay74.25
Luís Gonçalves71.28
Hugh O’Reilly71.28
João Fernandes67.32

You can easily get the amount_spent values wrong if you use the total column in the invoice table. This is because a customer can buy songs from more than one genre on the same invoice, and the total column will contain the full amount spent, not the amount spent on a particular genre.

This is demonstrated in the query below. Ellie Sullivan bought 10 songs from different genres on 2017-04-16. The total amount spent on all the songs is 9.9, which is different from the amount spent on a particular genre.

%%sql

-- Common mistake made using invoice total to calculate amount spend on a genre

SELECT
    customer.first_name || " " || customer.last_name AS full_name,
    genre.name,
    invoice_line.unit_price,
    invoice_line.quantity,
    invoice.total
FROM customer
INNER JOIN invoice
    ON customer.customer_id = invoice.customer_id
INNER JOIN invoice_line 
    ON invoice_line.invoice_id = invoice.invoice_id
INNER JOIN track
    ON track.track_id = invoice_line.track_id
INNER JOIN genre
    ON genre.genre_id = track.genre_id
WHERE customer.first_name = 'Ellie' AND customer.last_name = 'Sullivan' AND invoice.invoice_date = '2017-04-16 00:00:00';
     * sqlite:///chinook.db
    Done.
full_namenameunit_pricequantitytotal
Ellie SullivanMetal0.9919.9
Ellie SullivanRock0.9919.9
Ellie SullivanRock0.9919.9
Ellie SullivanRock0.9919.9
Ellie SullivanRock0.9919.9
Ellie SullivanRock0.9919.9
Ellie SullivanRock0.9919.9
Ellie SullivanRock0.9919.9
Ellie SullivanMetal0.9919.9
Ellie SullivanAlternative & Punk0.9919.9

20. Simplify the query to return the top 5 spenders on Jazz with CTE

CTE stands for Common Table Expression. It enables us to simplify complex queries by deconstructing them into simple blocks. We initiate a CTE with the WITH clause. The SQL query that answers the previous question is a bit complex. We can make the query more readable by performing most of the join operations with CTE.

%%sql

-- Start of JOIN operations with CTE

WITH cte_table AS (
    SELECT
        invoice.customer_id,
        invoice_line.unit_price,
        invoice_line.quantity
    FROM invoice
        INNER JOIN invoice_line 
            ON invoice_line.invoice_id = invoice.invoice_id
        INNER JOIN track
            ON track.track_id = invoice_line.track_id
        INNER JOIN genre
            ON genre.genre_id = track.genre_id
    WHERE genre.name = 'Jazz'
)

-- End of JOIN operations with CTE

-- Simplified query to return the top 5 spenders on Jazz

SELECT
    customer.first_name || " " || customer.last_name AS full_name,
    ROUND(SUM(cte_table.unit_price * cte_table.quantity), 2) AS amount_spent
FROM customer
INNER JOIN cte_table
    ON customer.customer_id = cte_table.customer_id
GROUP BY 1
ORDER BY 2 DESC
LIMIT 5;
     * sqlite:///chinook.db
    Done.
full_nameamount_spent
Fernanda Ramos15.84
Enrique Muñoz15.84
Hannah Schneider14.85
Astrid Gruber14.85
Kara Nielsen3.96

Close connection to chinook.db:

%sql -x / --close sqlite:///chinook.db

Takeaway

In this tutorial, we discussed the top 20 SQL JOINs interview questions and provided a step-by-step guide to their answers. If you aspire to become a business analyst, data analyst, or data scientist, knowledge of SQL is one of the leading requirements for your role. No two ways about it: you must learn SQL programming.

As you work with relational databases, one of the most common SQL operations you’ll perform is joining tables. This is why SQL JOIN questions are so popular with interviewers. At your SQL interviews, you’ll get several questions on SQL JOINs to test your knowledge of how they work. We’ve provided answers to some of the most common questions you may be asked.

If you haven’t learned the basics of SQL yet, this tutorial may not be helpful. Learn how to explore, query, join, and filter data from relational databases in our SQL Fundamentals Skill Path. You’ll master the SQL syntaxes and basic commands that you’ll need to start working with relational databases. On our interactive learning platform, you get to practice writing your own SQL queries.

We understand that porfolio projects are very important, and they help showcase your SQL programming and data analysis skills to potential recruiters. You get to complete real-life SQL projects on the SQL Fundamentals Skill Path. We’ve also curated 10 exiciting SQL project ideas to help you get more practice and projects in your SQL portfolio.

Related Posts

Leave a Reply

Your email address will not be published. Required fields are marked *