What are you doing?
The team has been working a lot of late nights testing our new algorithm. Researchers, as you know, are fueled by pop and pizza. Well, the lab's starting to look like a scrapyard.
So are you going to ban food in the lab?
No, you monster. I was thinking of programming a robot to drive around the lab and collect empty cans.
That sounds cool. How would it work?
Well, I think we need a vision system to recognize cans, obstacles, and people. The robot will also need to build a map and localize itself in the lab. Then we need to write some servoing routines too.
I'm going to have to stop you right there. I think there might be an easier way to do this. Let's solve it with Reinforcement Learning.
All right, RL. How would you do it, Martha?
Well, the reward can be the number of cans the robot collects. The agent could simply learn how to collect as many cans as possible through trial and error. In principle, it wouldn't even need a map of the lab. It can learn everything from scratch.
Right. That way, when somebody moves some furniture around, the robot could adapt automatically through learning. In fact, if one of the students starts drinking some wild new type of pop in pink cans, that would totally break a pre-learned perception system.
Yes. A reinforcement learning agent could simply try collecting the pink cans and find out for itself if they are worth collecting.
Okay. It sounds like we're going to need a good exploration method, and we'll definitely need function approximation if we want to learn from an on-board camera.
Yeah, and planning, so that the agent can revisit parts of the lab it hasn't been to in a while to check for new cans.
But what about the reward function? What tells us whether the robot was successful?
Well, we could get one of the students to count the cans in the bin at the end of the day. If there were six cans in the bin, plus six. No cans, zero reward.
So we'll probably need a TD-based algorithm to handle this delayed feedback.
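To make the reward idea above concrete, here is a minimal sketch, not the method the course itself uses. It assumes a made-up four-step "day" in which the only feedback is the final can count, and it uses tabular TD(0) as one example of a TD-based algorithm. The names `end_of_day_reward` and `td0_episode`, the step size, and the discount are all illustrative assumptions.

```python
def end_of_day_reward(cans_in_bin: int) -> float:
    """Reward as described in the dialogue: +1 per can counted, 0 if the bin is empty."""
    return float(cans_in_bin)


def td0_episode(values, rewards, alpha=0.5, gamma=1.0):
    """Run one pass of tabular TD(0) over a fixed sequence of states 0..T-1.

    rewards[t] is the reward received when leaving state t.
    The state after the last step is terminal and has value 0.
    alpha (step size) and gamma (discount) are illustrative choices.
    """
    T = len(rewards)
    for t in range(T):
        next_value = values[t + 1] if t + 1 < T else 0.0
        td_error = rewards[t] + gamma * next_value - values[t]
        values[t] += alpha * td_error
    return values


# Hypothetical day: three steps with no feedback, then the student counts six cans.
rewards = [0.0, 0.0, 0.0, end_of_day_reward(6)]
values = [0.0] * len(rewards)
for _ in range(200):
    td0_episode(values, rewards)
```

After the first pass only the last state's estimate moves. With repeated days, the delayed reward is passed backward one state at a time through bootstrapping, so the earlier states also come to predict the end-of-day count.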
Before we get started, we'd better review the basics of Reinforcement Learning.
Well, the good news is, everything we've just talked about will be covered in our new Coursera specialization on Reinforcement Learning.
That's amazing. Are the students really going to implement a can-collecting robot?
No, but once they finish the specialization, they'll have all the conceptual tools they need to do so.
Either way, that's a lot of material. I think we're going to need a big team to cover all that stuff.
A team of RL experts.
What a team.
Join us over the coming weeks right here in the RL AI Lab at the University of Alberta. Welcome to the Reinforcement Learning specialization from the University of Alberta on Coursera.