the bound enforce for log_prob in line 103 of model.py #44

Roboticyang · 2024-01-07T18:03:55Z

I do not agree, mathematically, with the bound enforcement applied to the log_prob offset in your Gauss_policy. For the pdfs of x and y in the multivariate case, the offset is the logarithm of the determinant of the Jacobian matrix of the transformation y = tanh(x). Because tanh is applied element-wise, the Jacobian is a diagonal matrix, so the offset should be the logarithm of the product of its diagonal elements (equivalently, the sum of their logarithms). Please let me know whether my understanding of how a pdf transforms under an element-wise change of vector variables is correct.
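To make the point above concrete, here is a minimal NumPy sketch of the change-of-variables correction, assuming x is drawn from a diagonal Gaussian before squashing. The function names, the `eps` stabiliser, and the diagonal-Gaussian assumption are all hypothetical illustrations and are not taken from model.py. It uses the fact that the diagonal of the Jacobian of y = tanh(x) is 1 - tanh(x_i)^2, so log p_Y(y) = log p_X(x) - sum_i log(1 - tanh(x_i)^2).

```python
import numpy as np


def gaussian_log_prob(x, mean, std):
    """Log-density of a diagonal Gaussian, summed over the last axis."""
    var = std ** 2
    per_dim = -0.5 * ((x - mean) ** 2 / var + np.log(2.0 * np.pi * var))
    return per_dim.sum(axis=-1)


def tanh_squashed_log_prob(x, mean, std, eps=1e-6):
    """Log-density of y = tanh(x) when x follows a diagonal Gaussian.

    The Jacobian dy/dx is diagonal with entries 1 - tanh(x_i)^2, so its
    log-determinant is the sum of the logs of those entries.
    eps is a hypothetical stabiliser that keeps the log finite near |y| = 1.
    """
    y = np.tanh(x)
    log_det_jacobian = np.log(1.0 - y ** 2 + eps).sum(axis=-1)
    return gaussian_log_prob(x, mean, std) - log_det_jacobian
```

The correction here is a single scalar per sample, summed over action dimensions, which is what the log of the product of the diagonal elements amounts to.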

Look forward to hearing from you.

Cheers,

Old Yang
