Interviewer Insights  
About Us  
Comments, Suggestions  
  or General Feedback  
Interested in writing for  
Tech Talk Tips?  

 

 


Database Admin
Using SQL statements

by Mike Wilkins

This month's database question is on SQL and is authored by Mike Wilkins. The question assesses the skill of low to mid-level skilled Developers and DBA's using SQL. Mike has depicted a scenario that many database pros will need to be able to work around. Posing scenarios to your candidates is one of the best ways to assess their real world experience. Ask a number of questions around the scenario to get a broad understanding of the candidate's skills. Next month's newsletter will include a continuation of this scenario to address more senior-level skills.



Background information:

"A company has a simple database. It contains two tables, customer and customer_location. The customer table contains basic information about our customer, including customer name, date they became a customer, national contact, etc.; while the customer_location table contains detailed information about our customer's offices. For example, the customer_location table will contain the address, city, state, zip code, location contact, phone number, fax number, etc.
One customer may have many branch locations, so the relationship is one-to-many. The customer table has a primary key on the customer_id column, which is a unique 'serial' number, like '762345'. The customer_location table also contains a column called customer_id, which references the column in the customer table via a foreign key constraint. Therefore, there is a primary key - foreign key relationship between the customer_table.customer_id and customer_location.customer_id. The data-entry department just finished loading the first 500 customers into the database"

Provide a SQL statement that returns all columns and rows from these two tables, joined by the customer_id (PK-FK).

What to look for in your candidate's response:

SQL is quite flexible, so there are several ways to get the desired result.

ANSWERS (see comments after the answer variations)

ANSWER variation 1 (partially correct)
SELECT * from customer, customer_location
WHERE customer.customer_id=customer_location.customer_id

ANSWER variation 2 (partially correct)
Select customer.customer_id, customer.customer_date, customer.founded_date, customer.web_site,
customer.contact, customer.stock_symbol,
customer_location.location_id, customer_location.address1, customer_location.address2,
customer_location.city, customer_location.state, customer_location.zip,
customer_location.phone
FROM Customer, customer_location
WHERE customer.customer_id = customer_location.customer_id

NOTE: The qualifiers shown in blue are only necessary when the column name is common to tables referenced in the query. Since it is the only column that is common in this query, only customer.customer_id MUST be qualified. Also, an alias may be substituted for the table name, specified in the FROM clause. For example: "FROM Customer C, customer_location CL"



CONCEPT ONE: JOINING THE TABLES

The first important concept a candidate should know is that a JOIN is required. If we simply use:

"select * from customer, customer_location"

the query will run, but it will return the 'CARTESIAN PRODUCT' of the tables. The Cartesian Product is EVERY row from the customer table joined with EVERY row from the customer_location table, without regard for values in any column. The WHERE clause is used to JOIN the tables.


CONCEPT TWO: QUALIFYING COMMON COLUMNS

The second important concept is that common columns in a multi-table query must be qualified with the table name or alias. Joining the table using the WHERE clause, "WHERE customer_id = customer_id", will cause the query to fail because the columns are ambiguous. The database cannot (and should not) assume to which table the column belongs. The same rules apply to common columns referenced in the SELECT or other portions of a query.


CONCEPT THREE: PROBLEMS ENCOUNTERED WITH EQUI-JOINS (BONUS)!!

The more experienced candidate will recognize that a JOIN involving a one-to-many relationship may drop some rows from the query output if there is not a matching customer_location row for every customer row. The primary-to-foreign key constraint does not allow insertion of a customer_location record without a matching value customer_id in the customer table. So there will never be a customer_location (child) row without a matching customer row (parent). However, we can insert a row into customer without having a matching customer_id value (row) in the child table, customer_location. In this case, the simple equi-join, "customer.customer_id = customer_location.customer_id", is not true, and the row from the customer table is not returned.

For example:

customer.customer_id customer_location.customer_id
100 100
100 100
200 200
300 300
300 300
300 300
400
** No matching record **
500 500
500 500


The queries listed above would exclude the customer having the customer_id value of 400 because there is no matching value in the customer_location table.

The knowledgeable candidate will mention that an OUTER JOIN will resolve the situation.

More questions on this scenario next time...


About the author


Mike Wilkins is an Oracle Master DBA with more than twenty years IT experience in a variety of technologies. Mike's roles include, Database Designer/modeler, Programmer-Analyst, Database Administrator, UNIX Systems and Network Administrator, Network Designer/Analyst, and others. Mike has performed 300+ technical interviews in the last five years.



REVIEWNET is a trademark and service mark of ReviewNet Corporation © 1997-2001,
Jenkintown, PA - All Rights Reserved. This information is confidential and proprietary
to ReviewNet Corporation. Use, duplication or misappropriation is strictly prohibited.