Wrong derivative of first layer in get_grad_and_error? #2
Hello, sorry for the late reply. The reason for this is to make sure that the log in the cross-entropy cost function doesn't complain. It has no effect on the performance of the model, but it ensures that you get no division-by-zero error.
Please let me know if you have any other questions. Thanks.
Best regards,
Lars Maaløe
Email: [email protected], [email protected]
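For illustration only, a minimal NumPy sketch of the kind of epsilon guard described above (the function and variable names are hypothetical, not the repository's actual code): a tiny constant inside the log keeps the cross-entropy finite when a predicted probability is exactly zero.

```python
import numpy as np

def cross_entropy(p_pred, x_norm, eps=1e-12):
    """Cross-entropy between normalised word counts `x_norm` and predicted
    word probabilities `p_pred`. The small `eps` inside the log prevents
    log(0) when a probability is exactly zero.
    (Hypothetical helper for illustration, not the repo's code.)"""
    return -np.sum(x_norm * np.log(p_pred + eps))
```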
Hi, yes, I see that you need to avoid division by zero. My question is not why you are using the probability of words in the cross-entropy error function, but rather: why are you using the probability-of-words array instead of the word-count array when evaluating the gradient of the first layer (i=0)?
Hello again,
It is a common trick to compare the probabilities to the normalised word counts to avoid sampling from the multinomial distribution.
Best regards,
Lars Maaløe
Email: [email protected], [email protected]
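As a rough illustration of this trick (hypothetical names, not the repository's code): the output-layer error term compares the predicted word probabilities against the normalised counts x / N, i.e. the expected value of the multinomial divided by N, rather than against a sampled word-count vector, which removes the sampling noise.

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical illustration: one document with raw word counts `x`
# and model activations over the same vocabulary.
x = np.array([[3., 0., 1., 2.]])                 # raw word counts
n = x.sum(axis=1, keepdims=True)                 # document length N
p = softmax(np.array([[0.2, -1.0, 0.5, 0.1]]))   # predicted word probabilities

# Compare probabilities to the normalised counts x / N instead of
# drawing a count vector from Multinomial(N, p) and comparing to that.
delta_out = p - x / n
```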
Hi Lars,
I have a question about the derivative you use during backpropagation.
I haven't quite understood why you are using the normalized data in the gradient evaluation for the weights of the input layer.
Shouldn't it be just the unnormalized x values?
In the get_grad_and_error function of the fine-tuning, you calculate:
Is it not:
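For concreteness, a hedged sketch of the two variants being contrasted (the names x, n, delta1 and the gradient variables are illustrative, not the repository's get_grad_and_error code): the first uses the normalised counts x / N as the first-layer input when forming the gradient, the second uses the raw counts x as the question proposes.

```python
import numpy as np

# Hypothetical sketch of the two first-layer gradients under discussion.
x = np.array([[3., 0., 1., 2.]])        # raw word counts, one row per document
n = x.sum(axis=1, keepdims=True)        # document lengths N
delta1 = np.array([[0.1, -0.2, 0.05]])  # back-propagated error at the first hidden layer

grad_w0_normalised = (x / n).T @ delta1  # using the normalised input x / N
grad_w0_raw        = x.T @ delta1        # using the raw counts x
```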