And here we find ourselves in Lab 5. We're going to create some new permanent tables and some views, and look at cache along the way. So fresh off your new knowledge of permanent tables, temporary tables, and views, let's put that knowledge to the test. If you haven't already attempted Lab 5, feel free to pause this video again and come back to it for the full walkthrough. All right, so here we go. The first thing we want to do is actually create that dataset, which is going to be the bucket that holds all our tables and views. So back in BigQuery, if you haven't created that dataset already (this is still live from the lecture content that we covered), to review, create that new dataset here. You can have multiple datasets for your project, and you have those tables and views that you created earlier. So now that we have a dataset, we need a query whose results we want to store. So, saving query results into a permanent table. As you saw before, there are two different ways of storing the results of a query as a permanent table. One is before the query is run, in Show Options, setting a destination table; the second is after you've run the query, when you can click that Save as Table button. So you've got this monstrous query here. I'm going to copy that to my clipboard, paste it in, and run this big old query. And for those of you wondering what this query actually does: if you're interested in subqueries and breaking apart complex queries into different pieces, that's covered in the next course, Achieving Advanced Insights with BigQuery. We're going to be breaking apart complex queries and introducing the concept of common table expressions, which are pretty much just like those temporary tables that you can use to break apart a very difficult query. For understanding, we're going to cover this query in particular in greater depth in that course.
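If you'd rather script the dataset creation than click through the UI, here's a minimal sketch in BigQuery DDL, assuming the irs_990 dataset name used later in this lab and a placeholder project ID:

```sql
-- Sketch: create the dataset (BigQuery DDL calls it a schema)
-- that will hold this lab's tables and views.
-- Replace `your-project` with your own project ID.
CREATE SCHEMA `your-project.irs_990`;
```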
But for now, suffice it to say that it brings together a lot of different data from multiple tables for 2015 and pulls some of the common fields for a lot of those EINs as well. So you have the names of the charities and their identification numbers, and some key facts about them, like employees, contributions, a bunch of other interesting fields, and whether or not they've electronically filed. So that's our query, but we don't want to keep rerunning this, right? We want to store these results, and the way to do that again is to save this as a table, so let's follow the instructions and see what we have here. Save it as a table, perfect. And we already have the irs_990 dataset. I think this is irs_990_2015_reporting; let's confirm that. Perfect, that's the destination name, leaving the rest of those default values. And you see a job immediately kicks off. It stores it really, really fast. That's because we've already run this query and it's just copying over those 95 MB. And what it's doing behind the scenes, if you remember from the content that we just covered in the module: there is a temporary, anonymous table as soon as this query is run. Those results are stored somewhere, in an anonymous temporary table, and when you specify a permanent table, what BigQuery wisely does behind the scenes is say, hey, temporary table, you've just been promoted to a permanent table. We're just going to give you a fancy new name and have you be accessible through the UI. And if you don't promote it, the temporary tables themselves are cleared after 24 hours. And again, that's the basis of that query cache that we talked about before. Your new permanent table will show up here, which is fantastic, and next up is creating that view. Okay, so we have the query that's still open from last time and I'm going to save this as a view. As we reviewed in our demo, the way you want to do that is click on Save View. We want to give this a similar name.
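The same promotion from query results to a permanent table can also be done in SQL instead of through the UI. A sketch, using the lab's dataset and table names, with a placeholder SELECT standing in for the full monstrous query:

```sql
-- Sketch: store query results in a permanent table in one step.
-- Replace `your-project` with your own project ID; the full lab
-- query goes where the placeholder SELECT is.
CREATE TABLE `your-project.irs_990.irs_990_2015_reporting` AS
SELECT
  ein,   -- charity identification number
  name   -- charity name; plus employees, contributions, e-file status, etc.
FROM
  `bigquery-public-data.irs_990.irs_990_ein`;  -- stand-in source table
```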
And it's always good, in my personal experience, to make it very clear that it's a view. So as a matter of personal preference, I always put _view at the end, so you don't have to rely on folks realizing that, hey, it's a different icon. And that's good practice because, again, when you run a view, you're re-running that query underneath every single time, and it just lets folks know that it's a view. Okay, and let's confirm that it's a view. We don't see the Preview option to review the data because, to review, there are no materialized views in BigQuery. The data is not stored as part of the view; the query is rerun every single time. So scrolling down to the bottom of the Details pane, we'll see again the query that makes up the results here. And if we wanted to make an edit to the view, say to filter out duplicate records here, just a very simple addition, we can save that view. And again, there's no need to rerun the query that created the view, because when anyone actually accesses it, like does a SELECT * from your view, it's going to reference the saved query and run that anyway. Compare that to a permanent table: if you changed the query that created a permanent table, the permanent table is stale until you rerun that query. The table itself is not going to change until you've overwritten it, appended to it, or hit that collision error. If you said write the records only if the table is empty, and the table's not empty, again you'll get that collision error where it says, hey, I'm not going to overwrite this table unless you explicitly tell me to. And back in the lab, editing an existing view: we've done that, we've saved it. Exporting results, so just a brief review: you can export the results into a CSV or a Google spreadsheet. Let's paste over in our new Query Editor. We are done with this view; I'm going to give us a little bit more screen real estate here. Select all that and run this query.
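In SQL terms, that _view naming habit looks like the sketch below, again assuming the project and dataset names from this lab. Note that a view stores only the query, not the data:

```sql
-- Sketch: create a view (a saved query; no data is stored) with an
-- explicit _view suffix so readers don't have to rely on the icon alone.
CREATE VIEW `your-project.irs_990.irs_990_2015_reporting_view` AS
SELECT *
FROM `your-project.irs_990.irs_990_2015_reporting`;
```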
So I have access to this data-to-insights project, but what we're going to be doing is making sure that you're using the project name that you created the dataset in. And this is your project name, then the dataset, and then the view, if you named it exactly as such right here. Or, if you wanted to, I'm going to pull open the comments here and delete all of this, I can query that view directly again. Click on that view, click on Query View, and I'm going to paste in what we want to do, which is filter on greater than 1,000 employees, and format this. Select all the columns. This is just recreating what we did before, clicking on the column here if you don't want to type it out, where the number of employees is greater than 1,000. And then terminate that query with a semicolon. You can run that query. It's basically running a filter on that view. Again, to review what's happening behind the scenes, I'm going to access that view and show you again the query that's running. So when we do SELECT * from this massive piece right here, this is actually running almost like a nested query: it's running all of the SQL that makes up that view and then filtering it for the number of employees greater than 1,000. So we have the query that we're going to run. We've run it. There are almost 3,000 charities in our view that have greater than 1,000 employees for the 2015 filings. And then you can go ahead and save that to Google Sheets. This is going to ask you to sign in to the account and allow Google Drive. It's saving it to Google Sheets, and then pretty soon it should provide you a link to view, which it does. Great, go ahead and click on that, and you can see the data stored here. So again, if you want to collaborate with other folks, you can use something like Google Sheets, or maybe export this to a CSV in Google Drive and do offline manipulation.
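The filter we just ran can be sketched like this; the employee-count column name below is a hypothetical stand-in for whatever the field is actually called in your view:

```sql
-- Sketch: filtering the view. Behind the scenes BigQuery runs the
-- view's saved SQL almost like a nested query, then applies this
-- WHERE clause on top of it.
SELECT *
FROM `your-project.irs_990.irs_990_2015_reporting_view`
WHERE employees > 1000;  -- hypothetical column name for employee count
```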
But, again, this data gets stale quickly, and, again, it breaks the practice of storing one fact in one place. So as soon as you start exporting things out of BigQuery, just be very cognizant of underlying data changes. And it's very easy to change and delete values in a spreadsheet, whereas in a database it's much harder to do so. Okay, back in the lab, what else have we got? And again, don't try to export everything. The reason why we store a lot of this stuff in an analytics warehouse like BigQuery is that it can handle millions and billions of records; if you're exporting a billion records into Google Sheets or even a CSV, you could potentially just bomb out whatever you're storing it in. And let's see, last but not least, let's take a look at the query cache and just confirm that cache is working as it should, pasting over the query. We have the same view as before. I'm just going to go ahead and copy over our project name. Your project name, your dataset, which for us again is irs_990, as I have here. If you have that same convention, great. And let's run this query. It's going to process 60 MB and run in however many seconds. And now let's rerun it, and it's much, much faster, because it's using that cache. Again, behind the scenes, all of these queries that we've run are deterministic, meaning they don't include anything that cannot be known at the time of running, like the current timestamp. So you have the ability to pull from query cache, because all these results are stored in a temporary table unless you've promoted it to a permanent table, as we discussed a little bit before. Great, you've made it to the end of Lab 5. So, just to recap what we've done: projects in BigQuery can contain multiple different datasets. We have one dataset here, and inside of that dataset you can have different tables and views, multiple of them of course.
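To see why determinism matters for the cache, compare a cacheable query with one that can never be served from cache. A sketch, assuming the lab's table name:

```sql
-- This query is deterministic: rerunning it unchanged can be
-- answered straight from the query cache.
SELECT COUNT(*) AS charity_count
FROM `your-project.irs_990.irs_990_2015_reporting`;

-- This one includes a value unknowable until run time, so BigQuery
-- cannot serve the result from the query cache.
SELECT CURRENT_TIMESTAMP() AS run_at, COUNT(*) AS charity_count
FROM `your-project.irs_990.irs_990_2015_reporting`;
```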
And should you need to delete them or share them, you can actually do that through here, and export as well. Your views have a different view icon, and keep in mind that you can actually scroll down, edit the details of the view, and see the query that makes it up. This is very interesting for performance reasons as well. If you're inheriting views from other people, you can actually see how well written their code is, and sometimes what you're running could be a very simple query, but the underlying view itself is complex, which could delay your processing. And that's it. Next up, since we reviewed how to store data from data sources that already exist inside BigQuery, like all these queries on that IRS data that existed before, we're going to cover how you can actually ingest new data, like datasets that you own in CSVs or other ingestion options, into BigQuery itself by creating those brand new datasets. Stick around.