Okay. So this is the really difficult part, which is

the code for our derivative.

So I showed in

a previous lecture how to compute

some of those derivatives or,

in other words, how to write down

a mathematical expression for the derivative.

Really, the most difficult part, though, is to say,

"How do we convert that mathematical expression

into a block of code and, what's

more, a block of code that's actually efficient?"

Okay. So the first thing we're going to do, again,

is unpack our vector Theta.

So we now have our current estimates for Alpha,

Beta_u for each user and Beta_i for each item,

and we're going to build some new data structures

that will store our derivatives.

So we have a derivative for each user,

a derivative for each item as well

as a derivative for our offset term Alpha.

So we need to have one value

for each of those derivatives.

So all the users,

all the items plus the offset term.
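A minimal sketch of this setup step might look like the following. The function name `unpack`, the layout of Theta (Alpha first, then the user biases, then the item biases), the dictionaries keyed by user and item ID, and the toy values are all assumptions made for illustration.

```python
def unpack(theta, users, items):
    """Split the flat vector theta into alpha, user biases, and item biases.

    Assumed layout: [alpha, beta_u for each user..., beta_i for each item...].
    """
    alpha = theta[0]
    beta_u = dict(zip(users, theta[1:1 + len(users)]))
    beta_i = dict(zip(items, theta[1 + len(users):]))
    return alpha, beta_u, beta_i


# Hypothetical toy inputs.
users = ["u1", "u2"]
items = ["i1", "i2", "i3"]
theta = [3.5, 0.1, -0.2, 0.0, 0.3, -0.1]

alpha, beta_u, beta_i = unpack(theta, users, items)

# One derivative slot per parameter: the offset, every user, every item.
d_alpha = 0.0
d_beta_u = {u: 0.0 for u in users}
d_beta_i = {i: 0.0 for i in items}
```

Starting every derivative at zero matters here, because the later steps add contributions to these slots rather than assigning them.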

You could think about taking each user one at

a time and trying to write

down this derivative expression,

which you see in the equation.

This would work just fine, but it would be rather inefficient.

What's more efficient is to iterate through

the entire dataset just

once and update each of

the relevant derivatives as we go.

So if we look at a particular point in

the dataset, (u, i), well,

that's going to update the derivative of Beta_u,

it's going to update the derivative of Beta_i,

and it's also going to update the derivative of Alpha.

So the code iterates through that dataset.

Each time we see a user u and an item i,

it will update the derivative of Alpha,

the derivative of Beta_u,

and the derivative of

Beta_i, following this expression that I've given.
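The single pass described here could be sketched roughly as follows. This assumes the data term of the objective is the mean squared error of the prediction Alpha + Beta_u + Beta_i, so each data point contributes 2 * (prediction - rating) / N to all three derivatives. The function name, the (user, item, rating) tuple format, and the toy data are hypothetical.

```python
def accumulate_data_gradients(data, alpha, beta_u, beta_i):
    """One pass over the data, updating every relevant derivative as we go.

    Assumes the data term is (1/N) * sum of (alpha + beta_u + beta_i - r)^2.
    """
    n = len(data)
    d_alpha = 0.0
    d_beta_u = {u: 0.0 for u in beta_u}
    d_beta_i = {i: 0.0 for i in beta_i}
    for u, i, r in data:
        diff = alpha + beta_u[u] + beta_i[i] - r
        g = 2.0 * diff / n
        # The same contribution goes to all three parameters
        # involved in this prediction.
        d_alpha += g
        d_beta_u[u] += g
        d_beta_i[i] += g
    return d_alpha, d_beta_u, d_beta_i


# Hypothetical toy data: (user, item, rating).
data = [("u1", "i1", 4.0), ("u2", "i1", 2.0)]
d_alpha, d_beta_u, d_beta_i = accumulate_data_gradients(
    data, 3.0, {"u1": 0.0, "u2": 0.0}, {"i1": 0.0}
)
```

The cost is one visit per rating, instead of rescanning the data once for every user and every item.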

Okay. After that, we have two more for

loops, which are going to encode

the derivatives of our regularizers

for each user and for each item.
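As a sketch of those two loops, assume the regularizer is lambda times the sum of squared user and item biases, with Alpha left unregularized. Each bias then adds 2 * lambda * bias to its own derivative. The function name and the parameter name `lamb` are hypothetical.

```python
def add_regularizer_gradients(d_beta_u, d_beta_i, beta_u, beta_i, lamb):
    """Add the regularizer's derivative to each user and item derivative.

    Assumes a regularizer of lamb * (sum of beta_u^2 + sum of beta_i^2).
    """
    for u in beta_u:
        d_beta_u[u] += 2.0 * lamb * beta_u[u]
    for i in beta_i:
        d_beta_i[i] += 2.0 * lamb * beta_i[i]


# Hypothetical values: derivatives from the data pass, plus current biases.
d_beta_u = {"u1": 1.0}
d_beta_i = {"i1": 0.0}
add_regularizer_gradients(d_beta_u, d_beta_i, {"u1": 0.5}, {"i1": -0.25}, 1.0)
```

These loops run over users and items rather than over ratings, because the regularizer applies once per parameter.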

Finally, we've updated our derivative values

for Alpha, Beta_u, and Beta_i.

What the library function wants

is a vector of derivatives.

So the code is going to convert those values back into a vector.

So we'll take the derivative of Alpha,

the derivatives of our user biases for all users,

and the derivatives of our item biases for

all items, and we'll return that vector of derivatives.
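A minimal sketch of that final packing step is below. It assumes the derivative vector must use the same ordering as Theta (Alpha, then users, then items), and the name `pack` is hypothetical. A plain list is returned here; if the optimizer expects an array type, it would need converting.

```python
def pack(d_alpha, d_beta_u, d_beta_i, users, items):
    """Flatten the derivatives into one vector, in the same order as theta."""
    return (
        [d_alpha]
        + [d_beta_u[u] for u in users]
        + [d_beta_i[i] for i in items]
    )


# Hypothetical derivative values.
users = ["u1", "u2"]
items = ["i1"]
gradient = pack(0.5, {"u1": -1.0, "u2": 1.0}, {"i1": 0.25}, users, items)
```

Iterating over the `users` and `items` lists, rather than over the dictionaries, keeps each derivative in the same position as the parameter it belongs to.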