I noticed that there are two slightly different ways to calculate the product of the observation and action space shapes in examples/baselines/sac/sac.py.
Specifically, in the SoftQNetwork class, the following code is used:
Here, np.array(env.single_observation_space.shape).prod() is used for the observation space, while np.prod(env.single_action_space.shape) is used for the action space.
It seems that both methods achieve the same result, but the first one (converting the tuple to a NumPy array) is slightly redundant, as np.prod() can directly handle tuples.
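As a quick check of the equivalence claimed above, here is a minimal sketch comparing the two expressions. The shapes are hypothetical stand-ins for `env.single_observation_space.shape` and `env.single_action_space.shape`, and the helper names are illustrative only; none of this is taken from `sac.py`.

```python
import numpy as np

# Hypothetical shapes, standing in for the env's observation and action
# space shapes; they are not values from the repository.
obs_shape = (3, 64, 64)
act_shape = (6,)


def flat_dim_via_array(shape):
    """Convert the tuple to an ndarray first, then call the .prod() method."""
    return np.array(shape).prod()


def flat_dim_via_prod(shape):
    """Pass the tuple straight to np.prod, which accepts any array-like."""
    return np.prod(shape)


for shape in (obs_shape, act_shape):
    a = flat_dim_via_array(shape)
    b = flat_dim_via_prod(shape)
    # Both expressions should yield the same flattened dimension.
    assert a == b
    print(shape, int(a), int(b))
```

If the two forms agree for the shapes in use, as they do here, unifying them on `np.prod(...)` would be a readability change only.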
I'm curious about the reason for using these two different approaches. Is there a specific reason for this, or would it be beneficial to unify them for consistency?
Thank you for your time and effort in maintaining this project!