I have successfully run "AS_SEND_LAT=3 AS_NVLS_ENABLE=1 ./bin/SimAI_simulator -t 16 -w ./example/microAllReduce.txt -n ./HPN_7_0_128_gpus_8_in_one_server_with_100Gbps_A100 -c astra-sim-alibabacloud/inputs/config/SimAI.conf".
However, I don't know how to analyze the results.
Could you give me some guidance?
In this case, the workload performs two AllReduce operations within a TP (tensor-parallel) group. The "Fwd xx comm" section of the results shows the execution time of each communication, along with its Algobw (algorithm bandwidth) and Busbw (bus bandwidth). These two metrics are the key indicators of collective communication performance; for their definitions, refer to https://github.com/NVIDIA/nccl-tests/blob/master/doc/PERFORMANCE.md.
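To make the two metrics concrete, here is a minimal sketch of how they are defined for AllReduce in the nccl-tests document linked above: algorithm bandwidth is the data size divided by the elapsed time, and bus bandwidth scales it by 2(n-1)/n for n ranks. This assumes SimAI reports Algobw and Busbw with those same definitions. The function name and the input values are hypothetical and are not taken from SimAI's output.

```python
def allreduce_bandwidth(size_bytes: float, time_s: float, n_ranks: int) -> tuple[float, float]:
    """Return (algbw, busbw) in GB/s for an AllReduce, per the nccl-tests definitions."""
    if time_s <= 0 or n_ranks < 1:
        raise ValueError("time_s must be positive and n_ranks at least 1")
    # Algorithm bandwidth: message size divided by elapsed time (1 GB = 1e9 bytes).
    algbw = size_bytes / time_s / 1e9
    # Bus bandwidth: AllReduce correction factor 2 * (n - 1) / n from nccl-tests.
    busbw = algbw * 2 * (n_ranks - 1) / n_ranks
    return algbw, busbw


# Hypothetical example: a 1 GiB AllReduce that took 10 ms across 8 ranks.
algbw, busbw = allreduce_bandwidth(size_bytes=1 << 30, time_s=0.01, n_ranks=8)
print(f"algbw = {algbw:.2f} GB/s, busbw = {busbw:.2f} GB/s")
```

Busbw is the more useful figure when comparing against the hardware link speed, because the correction factor removes the dependence on the number of ranks.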