All right. In this lecture, we're going to look at Galaxy's Workflow capabilities. In the last lecture, we looked at overlapping exons with repeats, and we looked at a very specific dataset: human chromosome 22, the coding exons there, and the repeat elements there. But there's nothing about that analysis that is inherently linked to the human genome, or in particular to the fact that we're looking at exons or that we're looking at repeats. It's just a series of steps for overlapping one set of elements with another and then computing a score. Galaxy Workflows provide a way to take this sort of abstract analysis and represent it so that it can then be run on other datasets. A workflow is an abstract representation of a multi-step analysis. The way these are represented in Galaxy is as a set of tools and the flow of datasets between them, without being tied to any particular datasets.
There are two ways to create workflows in Galaxy. We can do it from scratch, using the Galaxy Workflow editor to select a set of tools and then make connections between them. But more commonly, we create workflows by example. This means that we perform an analysis interactively within the Galaxy environment and then extract the workflow from the history that resulted from that analysis.
So let's go ahead and extract the workflow from the analysis that was done in the previous lecture. We go to the web browser again, starting back at galaxyproject.org, but what you want to do is go back to usegalaxy.org, the main instance of Galaxy. Assuming it hasn't been too long since you looked at this before, you'll still be logged in. If not, make sure that you log in, so that your previous history is available to you. You'll see, though, that I actually have a new blank history. So what we can do is go to History Options again and go to Saved Histories.
This is only available as a logged-in user, and in this case you can see all of my previous histories. You should have a previous history that has the results of the analysis we did in the last lecture. If you go ahead and click on that, it will become your current history. I'm going to go ahead and click up here, where it says Unnamed history. That's the history name. I'm going to put a name in here, Exons and Repeats, and hit Enter, to make it easier to recognize this history in the future.
As an aside, if you're using a newer version of Galaxy or the main Galaxy instance, you'll see this button up here, which is called View all histories. This gives you an even easier way to look at all of your different histories, so you can actually see the contents of all the histories you might have. You see your current history over here, you can copy datasets into your current history by dragging, and you can switch to any of your other histories by clicking Switch to. So this is a feature that makes it very easy to navigate between many histories. You'll find as you use Galaxy that it makes sense to have a lot of histories, because this is a way that you can divide up all of your different analyses. You can just click Done or click on Analyze Data to go back to the main Galaxy user interface.
Okay. So we want to extract a workflow that represents the analysis that we did here. We click on the cog in the History panel again and select Extract Workflow. Now we get a view where we can select which steps of the history we want to keep for our workflow. In this case, we want to keep them all. But sometimes you might make mistakes during your analysis, and so you might want to select just a subset of tools to keep. You'll see here, for UCSC Main, a statement that this tool cannot be used in workflows. That's because it's a Get Data tool, and there's not an easy way to reproduce a query.
That is true of most external databases, so it makes sense to treat its result as an input dataset for this workflow, and so we're going to have two input datasets. These are datasets that we provide when we run the workflow, the exons and the repeats in this case. Then the workflow will have the steps Join, Group, Join two Datasets, and Cut. So we can go ahead and click Create Workflow.
Now, if we go over to the Workflow tab, this is the workflow view of Galaxy. If this is your first time using Galaxy, you only have the one workflow, which is this workflow constructed from history Exons and Repeats. I can select it and click Edit, and this pulls up the Galaxy Workflow editor. In the Galaxy Workflow editor, we can make modifications to workflows or, as I said, we could create a workflow from scratch. This is the workflow canvas, and you can move it around by dragging the background or by dragging this blue box in this little mini map over here. This is showing the flow of our workflow. First, this was our exons. We joined it against repeats, we grouped on the output, we joined that back to the original exons, and then we performed a cut operation.
I'm going to make a minor modification here. By default, this is just called Input dataset. I'm going to give it the name Exons, and for the other I'm going to give it the name Other feature. Okay. So click the cog and say Save. We've saved this workflow, and we'll go back to Analyze Data.
Now what I want to do is run this workflow on a new type of data. I'm going to go ahead and say Get Data, UCSC Main table browser again. The dataset I'm going to get is in the section Regulation, and it's called CpG islands. These are just dense regions of CpG dinucleotides, and they tend to be associated with promoters. We still have chromosome 22 selected here, but make sure you're just getting chromosome 22. Pick the output format and send it to Galaxy.
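
To make the four workflow steps concrete, here is a minimal Python sketch of what join, group, join, and cut compute on interval data. It is not how Galaxy implements these tools: the function names, the tuple layout, and the toy coordinates are all made up for illustration, and the second input can be repeats, CpG islands, or any other feature set.

```python
from collections import Counter

# Toy intervals as (chrom, start, end, name). All values are invented.
exons = [
    ("chr22", 100, 200, "exon_a"),
    ("chr22", 300, 400, "exon_b"),
    ("chr22", 500, 600, "exon_c"),
]
features = [
    ("chr22", 150, 160, "feat_1"),
    ("chr22", 190, 210, "feat_2"),
    ("chr22", 350, 360, "feat_3"),
]


def overlaps(a, b):
    """True if two half-open intervals on the same chromosome share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def join_intervals(left, right):
    """Step 1 (join): pair each left interval with each right interval it overlaps."""
    return [(l, r) for l in left for r in right if overlaps(l, r)]


def group_count(pairs):
    """Step 2 (group): count the overlapping features per exon name."""
    counts = Counter(l[3] for l, _ in pairs)
    return list(counts.items())  # rows of (name, count)


def join_back(left, grouped):
    """Step 3 (join two datasets): append each grouped row to its exon record."""
    by_name = dict(grouped)
    return [l + (l[3], by_name[l[3]]) for l in left if l[3] in by_name]


def cut_columns(rows, keep=(0, 1, 2, 3, 5)):
    """Step 4 (cut): keep chrom, start, end, name, and the count."""
    return [tuple(row[i] for i in keep) for row in rows]


def score_overlaps(left, right):
    """Run the four steps in order on any two interval sets."""
    pairs = join_intervals(left, right)
    return cut_columns(join_back(left, group_count(pairs)))


result = score_overlaps(exons, features)
for row in result:
    print(*row, sep="\t")
```

Because `score_overlaps` only takes two interval lists, the same steps run unchanged on a different second dataset, which is the point of extracting the analysis into a workflow. Exons with no overlapping feature drop out at the join-back step in this sketch.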
So what I'm going to do now is, rather than using the repeats, use our workflow to run exactly the same set of steps that I ran previously, but using the exons and the CpG islands. So how do I run a workflow? Down at the bottom of the tool panel, you'll see a button that says All Workflows. If you click that, you'll see a list of all of the workflows that you have within your Galaxy account load in the main panel of Galaxy. They should be ordered by creation time, and so at the top here I have my workflow constructed from Exons and Repeats. Selecting it brings up the run workflow view. You can see it has hidden the various details of the other steps of the workflow, but the first two steps are input datasets, so for those you need to select a dataset. I'm going to do so, and you see that they're labeled with the labels that I provided in the workflow editor. I'm going to use the exons for the first input and CpG islands for the second.
The other thing I'm going to do here, to make my life real simple, is say Send results to a new history. Then it'll ask me to provide a name, and I'm going to call this one Exons and CpG islands. Now, if I run the workflow, it tells me that the following datasets have been created, and that they've been created in the new history called Exons and CpG islands. I'll use the View all histories link again, and I can switch to that history by clicking Switch to. You'll see what's happened is that all of the steps that I did before, the four steps Join, Group, Join two Datasets, and Cut, have all been added to the history for me. So basically, it's taken that workflow representation and run this multi-step analysis all at once. The join is already completed and the other steps are in the queue, and so it's going to work through this analysis.
Now Group is running, and we will get the results of running exactly that same analysis, but now on a different dataset, the CpG islands.
While that's finishing, I'll show you a couple of other features of the workflow editor. Oftentimes, even though your workflow has many steps, you're not actually interested in seeing all of the intermediate datasets. So one feature of the workflow editor is that next to every output there is a little button which, when you hover over it, says mark a dataset as a workflow output. If you use this, then any dataset that is not flagged in this way will be hidden in your history once it completes successfully. In this case, we might only want the output of Cut. By flagging that, if I run this workflow again, only the output of Cut will be shown in the history.
Another advanced feature of workflows is the post step actions. If you look over here in the details, there's Edit Step Actions, and you can select any step and you'll have this. What it allows you to do is a variety of things. You can rename a dataset. By default, it's just going to have the Galaxy-generated name, Cut on data, whatever. I can say Rename Dataset, Create an action, and call it scored overlaps, for example. I can change the datatype. In this case, I went through a series of steps and the result is back in BED format, so I can have the workflow change the datatype back to BED. I can get an email notification, and there are other features. So these are some of the more advanced features of the Galaxy Workflow editor.
I'm just going to save the workflow and return to Analyze Data, and we'll see here that our workflow has completed. Now we have the number of CpG islands overlapping each of these exons.
So, in summary, Galaxy Workflows allow abstract multi-step analyses to be reused.
They can be constructed either from scratch using the workflow editor or by example from an existing history. Then, within the workflow editor, you can customize workflows: adding additional steps, associating actions with those steps, modifying step parameters, et cetera.
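
As a rough sketch of the idea that a workflow is a set of tools and the connections between them, with no datasets attached, the following Python models a workflow that is defined once and then run on different inputs. `Step`, `Workflow`, and the two toy tools are hypothetical names for illustration only; this is not Galaxy's API or its workflow file format. The `is_output` and `rename` fields loosely mirror marking a workflow output and adding a rename action.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Step:
    name: str                   # label of the step
    tool: Callable[..., list]   # function standing in for a tool
    inputs: List[str]           # names of workflow inputs or earlier steps
    is_output: bool = False     # only flagged steps stay visible after a run
    rename: str = ""            # optional new name for the visible result


@dataclass
class Workflow:
    input_names: List[str]
    steps: List[Step] = field(default_factory=list)

    def run(self, datasets: Dict[str, list]) -> Dict[str, list]:
        """Run every step in order and return only the flagged outputs."""
        missing = [n for n in self.input_names if n not in datasets]
        if missing:
            raise ValueError(f"missing input datasets: {missing}")
        results = {n: datasets[n] for n in self.input_names}
        visible = {}
        for step in self.steps:  # steps are assumed to be in dependency order
            results[step.name] = step.tool(*(results[i] for i in step.inputs))
            if step.is_output:
                visible[step.rename or step.name] = results[step.name]
        return visible


# Toy stand-ins for tools; they work on lists of made-up identifiers.
def keep_shared(left, right):
    return [x for x in left if x in right]


def count_items(items):
    return [len(items)]


# The workflow names its inputs and steps but holds no data.
workflow = Workflow(
    input_names=["exons", "other_feature"],
    steps=[
        Step("join", keep_shared, ["exons", "other_feature"]),
        Step("count", count_items, ["join"], is_output=True,
             rename="scored overlaps"),
    ],
)

# The same workflow run on two different second inputs.
first_run = workflow.run({"exons": ["e1", "e2", "e3"], "other_feature": ["e1", "e3"]})
second_run = workflow.run({"exons": ["e1", "e2", "e3"], "other_feature": ["e2"]})
print(first_run)   # {'scored overlaps': [2]}
print(second_run)  # {'scored overlaps': [1]}
```

The intermediate `join` result is computed on every run but never returned, in the same spirit as unflagged datasets being hidden in the history.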