Inconsistency in calculating product of observation/action space shapes #895

songyuc · 2025-02-28T14:17:09Z

Hi,

I noticed that there are two slightly different ways to calculate the product of the observation and action space shapes in examples/baselines/sac/sac.py.

Specifically, in the SoftQNetwork class, the following code is used:

nn.Linear(np.array(env.single_observation_space.shape).prod() + np.prod(env.single_action_space.shape), 256),

Here, np.array(env.single_observation_space.shape).prod() is used for the observation space, while np.prod(env.single_action_space.shape) is used for the action space.

It seems that both expressions produce the same result, but the first one is slightly redundant: np.prod() accepts a tuple directly, so converting the shape to a NumPy array first is unnecessary.
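A minimal sketch to check the equivalence, assuming made-up shapes in place of env.single_observation_space.shape and env.single_action_space.shape. The names obs_shape, act_shape, and flat_dim are hypothetical and do not come from sac.py:

```python
import numpy as np

# Hypothetical shapes standing in for env.single_observation_space.shape
# and env.single_action_space.shape (not taken from any real environment).
obs_shape = (42,)
act_shape = (8,)

# Form used for the observation space: tuple -> ndarray -> .prod()
via_array = np.array(obs_shape).prod()

# Form used for the action space: np.prod() applied to the tuple directly
via_prod = np.prod(obs_shape)


def flat_dim(shape):
    """Hypothetical helper: flattened size of a shape tuple as a Python int."""
    return int(np.prod(shape))


# One consistent way to compute the input width of the first linear layer.
input_dim = flat_dim(obs_shape) + flat_dim(act_shape)

print(via_array == via_prod, input_dim)
```

Both forms give the same number for non-empty shape tuples like these, so using np.prod() on both shapes would only be a consistency change. The int() cast in the helper is an optional design choice, not something the existing code does.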

I'm curious about the reason for using these two different approaches. Is there a specific reason for this, or would it be beneficial to unify them for consistency?

Thank you for your time and effort in maintaining this project!
