Welcome back. You're now in course number two of the data analyst specialization. Here's where we get to look at fun concepts like bringing new data into BigQuery and visualizing it. Now, one of the core building blocks of data analysis is creating and running your SQL queries on raw data sources, and then saving those results into new tables that you can access later. And here, we'll cover the difference between permanent and temporary data tables, and how to store the results of your queries. Okay so, so far you've queried data sets that have already existed out in the public space in BigQuery. Now it's time to create your own and store some new tables permanently for you to access later. And the benefit here again is, if you had a SQL query and you have these amazing insights and results and you want to store it. You needed to do so in a permanent table or some of the other options that we're going to show you. So let's actually hop into a demo which will explain this a little bit more clearly. And here we are back in the BigQuery web UI. And our main goal here is to write a query and store the results into a new table. So how do we do that? Well, first, we need to write a query. Our question that we want to answer is give us all charities or nonprofits that have filed as a school, and have also filed with a paper file instead of electronic for 2015. So first, we have to figure out what table that we're going to query. We're actually going to go to the filings tables, which is the irs_990_2015. And what I like to do is make sure that you have standard SQL mode again enabled by the Show Options, disabling that legacy SQL, query the table, confirm the new query. And the reason why again I like to do this is I'll often misspell the table name or forget the backticks. So I just use this query table function to automatically give me that table name for the from statement of the SQL query. And I don't have all the fields memorized, so I'm actually in the schema and I'm looking for my key fields that I want to filter on. So the first thing is the efiling indicator, which we see right up here. So I know I'm going to immediately format this query, remove the limit. And I want to filter the results where the electronic filing indicator is for paper. And again, since it's a string value, we're going to add that in single quotes. Next, if you remember, we're going to find the school code. I'm just doing a simple find in my web browser, and this brings me down to the schools field here. And to chain multiple filters for your where clause, what you do is you just add the and operator there. And of course you could use and or or a combination of ands and ors. But if you start getting really complex, be sure to use parentheses, so things are a little bit more clear. And instead of copying and pasting this, I can actually just click on that, and it'll bring the field name right into our query editor. So operates as a school, we want that to be yes, as you saw in some of our previous queries. And we want to include all the results from this table, so we're actually just going to do a SELECT *. And on my clipboard from earlier I actually copied and pasted our standard SQL hashtag. And the comment of what we're actually doing, which is selecting the schools that used paper filings in the 2015 filing table. Last check, of course, is make sure that it's a valid query, format it and check the query validator. It's going to process a certain amount of data, which is great. Again, you're processing everything that's in this table. Let's verify the size of that table again by clicking on Details, it's 445 megabytes. You're processing all 445 again, even though you're filtering it. Because BigQuery, the query engine, needs to look at all of those records to see whether or not they meet those conditions. So that all makes sense. So this is nothing new that you all haven't seen before. But what we want to do now is, this is the irs_990 dataset that's created by the BigQuery team and stored under the, I'm going to scroll up here, the bigquery-public-data set. But we don't have anything in our own private data sets that we want to create and store these results of these queries in. So what you can do, and we're introducing here, is if you run this query, we'll show you two ways you can do it. You can run a query, and after it's done processing the result, you can actually do quite a few things with the resulting query results that you see here. So first the key insight here is that for 2015, there were about 3,600 schools that filed with a paper filing for 2015, so that's our quick insight there. And you can do a lot of things with these results. You can actually save it to a Google spreadsheet, you can download it as JSON or download it as a CSV should you want. But what we actually want to do is save this as a table that we can use later on in our queries. So we don't have to keep repeating this query right here. And how you do that is, after you've run the query, you can actually click on this button Save as Table. That's going to bring up a box here. You can give your table a name. But one of the interesting things that you're going to see here is we have our project, and as you remember with how the query is set up from the from statement, you have a project.dataset.table name. And since we don't have a data set created yet, or you might not already for your project, ours is blank, we're going to need to go ahead and create one first. So very simple, for our project again, this is our quick lab sandbox environment. All we have to do is select the drop down arrow, create a new dataset, specify a name. Here we're just going to use irs_990 for this dataset. And again, a dataset is just a collection of tables. In here, that arrow is grayed out for this dataset because there's no tables that we've stored yet. But now that we have a dataset, we can go back and just save it as table. And you see the destination dataset where we're going to be storing this in our particular Google Cloud Platform project, now it does exist. So we're going to give this a creative name like irs_school_paper, and we'll add in 2015. And again, best practice is to use underscores and all lowercase for your table names, just something so you can remember it. So now we've actually kicked off another query job. If you caught it very quickly, under Job History, there was actually one that was there running that job. And if you click into your new dataset, you can actually see that new table that was just created right there. If you don't see that off the bat, feel free to refresh you browser and see if that'll do it for you. So we want to make sure that we save this query. I'm going to save it as IRS 2015 Schools, Who had a paper filing, just so we don't lose our work. And you could access all those previous save and load jobs through your Job History and access all the queries you've executed through your Query History. And again, this is stored for about six months. But be sure, if there's queries that you want to store forever, make sure you save those queries, and you can access those saved queries at any time. Okay, let's preview the results of the table that we just created. Let’s first look to see when it was created. So here’s the schema, and nothing new here since we just did a SELECT *. Now, if you look into the details, you’ll see at the time of this recording, that’s exactly now, and you see that the number of rows as we saw in the pagination was about 3,600. And let's confirm our filters took place in the preview. So p for paper filers, for the electronic filing indicator. And if we scroll over, operates as a school, we have all yeses here in the preview that we have. So fantastic, we've stored this query now permanently. And now another way that we want to cover actually storing this table. So one is you can actually execute that query and then click on that Save Table Results. But if you knew ahead of time that you wanted to store that query into a table, what you can actually do is the following. So go ahead and click on Show Options, when you get to this in your labs. And then in that first option where it says Destination Table, that's your permanent table where you're actually going to be writing to these results. So you can select a table. This is the dataset ID. I'm going to give it the exact same name for a reason that you're going to see in just a moment. Store the results of this query that we're about to run or run again, and here's the key option. You can store the results in that table. If the table is empty, then you can write to it. Or if it already exists and there's records that are present, you can actually append or keep adding on more and more and more rows and more and more records. Or you can completely overwrite the table. So I'm going to show you what happens, I'm going to hide these options now. Very valid query, you can see the destination table is now shown here next to your standard SQL dialogs as well. These are all your options, right, for this particular query. Now, what do you think's going to happen if I run this? Let's go ahead and run it. I see a query failed, right? So as you saw in the options, the table already exists and we did not specify the overwrite option or the append option. So let's go back into Show Options. If you wanted to, I mean, this is a good thing, right? So that table already exists so it's not going to blast out those results unless you explicitly tell it to. And for this particular case, we don't want to keep appending records because that's just going to create duplicative records for us. So we want to click on overwrite table if it exists, hide the options again, rerun the query. And it gives you this helpful pop up, by saying, hey, by the way, this table already exists, are you 100% sure that you want to overwrite it with this query? Because it could be a new query that you've written on top of it, and you just accidentally typed in the same name. So this is a good double-check mechanism there. So rerunning this query, and one of the neat things that I'm going to show you after this, it executed as another job for loading that table. Inside of that new table, now, what you can see is that it indeed is not a new table because this table is still the old table that we created first. It's just been modified, right, just a couple minutes later. So that's how you can tell where you created a table and then new records were added, because the creation time and the last modified time are actually different. Okay, so that's how you create a permanent table. And we'll go into a few of the other topics around what temporary tables are and how that relates to query cache. And then also how to save a query as a view instead. So let's jump back into the lecture slides.