What are the tree types of results that a subquery can return give an example of each?
Starting here? This lesson is part of a full-length tutorial in using SQL for Data Analysis. Check out the beginning. Show
In this lesson we'll cover:
In this lesson, you will continue to work with the same San Francisco Crime data used in a previous lesson. Subquery basicsSubqueries (also known as inner queries or nested queries) are a tool for performing operations in multiple steps. For example, if you wanted to take the sums of several columns, then average all of those values, you'd need to do each aggregation in a distinct step. Subqueries can be used in several places within a
query, but it's easiest to start with the
Let's break down what happens when you run the above query: First, the database runs the "inner query"—the part between the parentheses:
If you were to run this on its own, it would produce a result set like any other query. It might sound like a no-brainer, but it's important: your inner query must actually run on its own, as the database will treat it as an independent query. Once the inner query runs, the outer query will run using the results from the inner query as its underlying table:
Subqueries are required to have names, which are added after parentheses the same way you would add an alias to a normal table. In this case, we've used the name "sub." A quick note on formatting: The important thing to remember when using subqueries is to provide some way to for the reader to easily determine which parts of the query will be executed together. Most people do this by indenting the subquery in some way. The examples in this tutorial are indented quite far—all the way to the parentheses. This isn't practical if you nest many subqueries, so it's fairly common to only indent two spaces or so. Practice Problem Write a query that selects all Warrant Arrests from the Try it out See the answer The above examples, as well as the practice problem don't really require subqueries—they solve problems that could also be solved by adding multiple conditions to the Using subqueries to aggregate in multiple stagesWhat if you wanted to figure out how many incidents get reported on each day of the week? Better yet, what if you wanted to know how many incidents happen, on average, on a Friday in December? In January? There are two steps to this process: counting the number of incidents each day (inner query), then determining the monthly average (outer query):
If you're having trouble figuring out what's happening, try running the inner query individually to get a sense of what its results look like. In general, it's easiest to write inner queries first and revise them until the results make sense to you, then to move on to the outer query. Practice Problem Write a query that displays the average number of monthly incidents for each category. Hint: use Try it out See the answer Subqueries in conditional logicYou can use subqueries in conditional logic (in conjunction with
The above query works because the result of the subquery is only one cell. Most conditional logic will work with subqueries containing one-cell results. However,
Note that you should not include an alias when you write a subquery in a conditional statement. This is because the subquery is treated as an individual value (or set of values in the
Joining subqueriesYou may remember that you can filter queries in joins. It's fairly common to join a subquery that hits the same table as the outer query rather than filtering in the
This can be particularly useful when combined with aggregations. When you join, the requirements for your subquery output aren't as stringent as when you use the
Practice ProblemWrite a query that displays all rows from the three categories with the fewest incidents reported. Try it out See the answer Subqueries can be very helpful in improving the performance of your queries. Let's revisit the Crunchbase Data briefly. Imagine you'd like to aggregate all of the companies receiving investment and companies acquired each month. You could do that without subqueries if you wanted to, but don't actually run this as it will take minutes to return:
Note that in order to do this properly, you must join on date fields, which causes a massive
"data explosion." Basically, what happens is that you're joining every row in a given month from one table onto every month in a given row on the other table, so the number of rows returned is incredibly great. Because of this multiplicative effect, you must use The following query shows 7,414 rows:
The following query shows 83,893 rows:
The following query shows 6,237,396 rows:
If you'd like to understand this a little better, you can do some extra research on cartesian products. It's also worth noting that the Of course, you could solve this much more efficiently by aggregating the two tables separately, then joining them together so that the counts are performed across far smaller datasets:
Note: We used a Practice ProblemWrite a query that counts the number of companies founded and acquired by quarter starting in Q1 2012. Create the aggregations in two separate queries, then join them. Try it out See the answer Subqueries and UNIONsFor this next section, we will borrow directly from the lesson on UNIONs—again using the Crunchbase data:
It's certainly not uncommon for a dataset to come split into several parts, especially if the data passed through Excel at any point (Excel can only handle ~1M rows per spreadsheet). The two tables used above can be thought of as different parts of the same dataset—what you'd almost certainly like to do is perform operations on the entire combined dataset rather than on the individual parts. You can do this by using a subquery:
This is pretty straightforward. Try it for yourself: Practice ProblemWrite a query that ranks investors from the combined dataset above by the total number of investments they have made. Try it out See the answer Practice Problem Write a query that does the same thing as in the previous problem, except only
for companies that are still operating. Hint: operating status is in Try it out See the answer What are the four different types of results that a subquery can returned explain what each type is and when it is used?Type of Subqueries
Single row subquery : Returns zero or one row. Multiple row subquery : Returns one or more rows. Multiple column subqueries : Returns one or more columns. Correlated subqueries : Reference one or more columns in the outer SQL statement.
What are the four different types of results that a subquery can Returned?Subqueries can return different types of information:. A scalar subquery returns a single value.. A column subquery returns a single column of one or more values.. A row subquery returns a single row of one or more values.. A table subquery returns a table of one or more rows of one or more columns.. Which two types of results can subqueries return?Types of SQL Subqueries
Returns zero or one row in results. Returns one or more rows in results.
What is subquery explain types of subqueries with example?A subquery is best defined as a query within a query. Subqueries enable you to write queries that select data rows for criteria that are actually developed while the query is executing at run time. More formally, it is the use of a SELECT statement inside one of the clauses of another SELECT statement.
|