w4 is going to be 2 by 4, and w5 is going to be 1 by 2, okay?

So the general formula to check, when you're implementing the matrix w for layer L, is that the dimensions of that matrix should be nL by nL-1.
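To make that formula concrete, here's a minimal numpy sketch. The layer sizes in layer_dims are my assumption, chosen to be consistent with the dimensions quoted in this clip (so w4 comes out 2 by 4 and w5 comes out 1 by 2), with n0 = 2 assumed for the input:

```python
import numpy as np

# Assumed layer sizes n0 through n5, consistent with the dimensions quoted above.
layer_dims = [2, 3, 5, 4, 2, 1]

W = {}
for l in range(1, len(layer_dims)):
    # The matrix for layer l should be nL by nL-1.
    W[l] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
    assert W[l].shape == (layer_dims[l], layer_dims[l - 1])

print(W[4].shape)  # (2, 4), i.e. n4 by n3
print(W[5].shape)  # (1, 2), i.e. n5 by n4
```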

Now let's think about the dimension of this vector b.

This is going to be a 3 by 1 vector, so you have to add that to another

3 by 1 vector in order to get a 3 by 1 vector as the output.

Or in this example, we need to add this, which is going to be 5 by 1, to another 5 by 1 vector, in order for the sum of these two things I have in the boxes to itself be a 5 by 1 vector.

So the more general rule is that in the example on the left,

b1 is n1 by 1, right, that's 3 by 1,

and in the second example, this is n2 by 1.

And so the more general case is that

bL should be nL by 1 dimensional.

So hopefully these two equations help you double-check that the dimensions of your matrices w, as well as your vectors b, are correct.

And of course, if you're implementing backpropagation, then the dimensions of dw should be the same as the dimensions of w, and db should be the same dimensions as b.
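Extending the sketch above, here's a hedged check for the bias vectors and those gradient shapes; dW and db below are just zero-filled stand-ins for the gradients you'd actually compute in backprop, not real gradient computations:

```python
b, dW, db = {}, {}, {}
for l in range(1, len(layer_dims)):
    b[l] = np.zeros((layer_dims[l], 1))   # bL should be nL by 1
    assert b[l].shape == (layer_dims[l], 1)

    dW[l] = np.zeros_like(W[l])           # stand-in gradient for wL
    db[l] = np.zeros_like(b[l])           # stand-in gradient for bL
    assert dW[l].shape == W[l].shape
    assert db[l].shape == b[l].shape
```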

Now the other key set of quantities whose dimensions to check are z, x, as well as a of L, which we didn't talk too much about here. But because a of L is equal to g of z of L, applied element-wise, z and a should have the same dimensions in these types of networks.
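As a quick sanity check, since the activation g is applied element-wise, it preserves shape exactly; here's a tiny sketch, with tanh standing in for g:

```python
z = np.random.randn(3, 1)    # z for a single example, shape (n1, 1)
a = np.tanh(z)               # element-wise activation, tanh as an example g
assert a.shape == z.shape    # a of l has the same dimension as z of l
```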

Now let's see what happens when you have a vectorized implementation that looks at

multiple examples at a time.

Even for a vectorized implementation, of course, the dimensions of w, b, dw, and db will stay the same.

But the dimensions of z, a, as well as x will

change a bit in your vectorized implementation.

So previously, we had z1 = w1 x + b1, where z1 was n1 by 1, w1 was n1 by n0, x was n0 by 1, and b1 was n1 by 1.

Now, in a vectorized implementation, you would have Z1 = w1 X + b1, where Z1 is obtained by taking the z1's for the individual examples, so there's z1(1), z1(2), up to z1(m), and stacking them as columns, and this gives you Z1. So the dimension of Z1, instead of being n1 by 1, ends up being n1 by m, where m is the size of your training set.

The dimensions of w1 stay the same, so it's still n1 by n0.

And X, instead of being n0 by 1, is now all your training examples stacked horizontally, so it's now n0 by m. And so you notice that when you take an n1 by n0 matrix and multiply it by an n0 by m matrix, together they give you an n1 by m dimensional matrix, as expected.

Now, the final detail is that b1 is still n1 by 1, but when you take w1 X and add b1 to it, then through Python broadcasting, b1 will get duplicated into an n1 by m matrix and then added element-wise.
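Here's a small sketch of that broadcasting step, with made-up sizes n1 = 3, n0 = 2, and m = 4:

```python
n0, n1, m = 2, 3, 4
w1 = np.random.randn(n1, n0)    # n1 by n0
X = np.random.randn(n0, m)      # m training examples stacked as columns
b1 = np.random.randn(n1, 1)     # still n1 by 1
Z1 = w1 @ X + b1                # numpy broadcasts b1 across the m columns
assert Z1.shape == (n1, m)
```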

So on the previous slide, we talked about the dimensions of w, b, dw, and db.

Here, what we see is that whereas zL as well as aL are of dimension nL by 1, we now have instead that ZL as well as AL are nL by m.

And a special case of this is when L is equal to 0, in which case A0, which is equal to just your training set input features X, is going to be n0 by m, as expected.
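Putting the shape bookkeeping together, here's a minimal vectorized forward pass continuing the sketch above (tanh again stands in for the activations, and m is an assumed number of examples), asserting that ZL and AL come out nL by m at every layer:

```python
m = 10                                  # assumed number of training examples
A = np.random.randn(layer_dims[0], m)   # A0 = X, shape n0 by m

for l in range(1, len(layer_dims)):
    Z = W[l] @ A + b[l]                 # (nL, nL-1) @ (nL-1, m) + (nL, 1)
    A = np.tanh(Z)                      # element-wise, so shape is preserved
    assert Z.shape == A.shape == (layer_dims[l], m)
```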

And of course, when you're implementing this in backpropagation, as we'll see later, you end up computing dZ as well as dA, and so these will of course have the same dimensions as Z and A.

So I hope the little exercise we went through helps clarify the dimensions of the various matrices you'll be working with.

When you implement backpropagation for a deep neural network, so long as you work through your code and make sure that all the matrix dimensions are consistent, that will usually help, and it'll go some way toward eliminating some causes of possible bugs.

So I hope that exercise for figuring out the dimensions of the various matrices you'll be working with is helpful.

When you implement a deep neural network, if you keep straight the dimensions of the various matrices and vectors you're working with, hopefully that'll help you eliminate some causes of possible bugs; it certainly helps me get my code right.

So, we've now seen some of the mechanics of how to do forward propagation in a neural network.

But why are deep neural networks so effective, and

why do they do better than shallow representations?

Let's spend a few minutes in the next video to discuss that.