CRM Testing Framework
The gist of testing the CRM is to create baselines in 2-D and 3-D with 1-moment microphysics for full coverage. Each baseline is run at two optimization levels, -O0 and -O2, to approximate how bit-level differences amplify non-linearly over the course of one model day of simulation. The difference between the -O0 and -O2 baselines serves as an envelope for how different your refactored code should be after one model day: you perform a 3-way diff that compares your difference from -O2 against the difference between -O0 and -O2. You shouldn't be much more than 2x outside that envelope. Note that some variables exist but are filled with junk (such as UNICON variables we'll probably never use), and I have no idea why.
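As a rough illustration of the envelope idea, the sketch below reads three columns of numbers (the -O0 baseline, the -O2 baseline, and the refactored run) and prints how far the refactored run sits from -O2 relative to the spread between the two baselines. This is not how nccmp3.py computes its metrics (that script isn't shown here); the function name `envelope_ratio`, the three-column text input, and the maximum-absolute-difference metric are all assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: reads lines of "o0 o2 test" values and prints
# max|test - o2| / max|o2 - o0|. A ratio of roughly 2 or less would sit
# inside the "not much more than 2x" envelope described above.
envelope_ratio() {
  awk '
    function absval(x) { return x < 0 ? -x : x }
    {
      d_base = absval($2 - $1)   # spread between the two baselines
      d_test = absval($3 - $2)   # distance of the refactored run from -O2
      if (d_base > max_base) max_base = d_base
      if (d_test > max_test) max_test = d_test
    }
    END {
      # Identical baselines give no envelope to compare against.
      if (max_base == 0) { print "undefined"; exit }
      printf "%.2f\n", max_test / max_base
    }
  '
}

# Example with made-up values (not real model output):
printf '%s\n' "1.0 1.5 2.0" "2.0 2.25 2.5" | envelope_ratio
```

In practice the comparison would be done per variable on the restart files, which is what the 3-way diff script used later on this page is for.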
- For PGI
  - Use -O2 -Mvect=nosimd (the default) and -O0
  - IMPORTANT: You must remove the -Mvect=nosimd for the -O0 case because that flag will force -O2, which you don't want.
- For everything else
  - Use -O2 and -O0
You will change the optimization levels in Macros.make and Macros.cmake. After simulating at each optimization level, you'll need to copy the restart file to the case directory under a name that records the level, e.g.,
# Run the -O0 case
cp $RUNDIR/$CASE.cam.r.0001-01-02-00000.nc ./$CASE.cam.r.0001-01-02-00000.optO0.nc
# Run the -O2 case
cp $RUNDIR/$CASE.cam.r.0001-01-02-00000.nc ./$CASE.cam.r.0001-01-02-00000.optO2.nc
With the -O0 and -O2 baselines in place for 2D and 3D, you can then run your regressions and compare against baseline.
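Switching a Macros file from -O2 to -O0 can be scripted. The following is a minimal sketch, assuming the flags appear literally as -O2 and -Mvect=nosimd in the file; the function name `set_opt_O0` and the sample FFLAGS line are made up for illustration, and a real Macros.make or Macros.cmake may be laid out differently, so check the result by hand.

```shell
#!/bin/bash
# Hypothetical helper: rewrite optimization flags in a Macros file in place,
# keeping a .bak copy of the original. -Mvect=nosimd is removed first because,
# per the note above, leaving it in would force -O2 under PGI.
set_opt_O0() {
  local macros_file=$1
  sed -i.bak \
      -e 's/[[:space:]]*-Mvect=nosimd//g' \
      -e 's/-O2/-O0/g' \
      "$macros_file"
}

# Demonstration on a made-up fragment, not a real Macros.make:
demo=$(mktemp)
echo 'FFLAGS := -i4 -O2 -Mvect=nosimd -Mflushz' > "$demo"
set_opt_O0 "$demo"
cat "$demo"
```

Restoring the .bak copy returns the file to its -O2 state for the second baseline.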
The following script will set up your baselines using the PGI compiler on the CPU:
#!/bin/bash
E3SM_HOME=~/ACME-ECP
COMPILER=pgi
MACH=summit-cpu
PES=84x1
RES=ne4_ne4
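PROJ=stf006
# create_newcase lives in cime/scripts, and the cases are created there
cd $E3SM_HOME/cime/scripts || exit -1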
CASE=sp1vfast2d_baseline
./create_newcase -compset FSP1FAST -case $CASE -compiler $COMPILER -mach $MACH -project $PROJ -pecount $PES -res $RES --handle-preexisting-dirs r || exit -1
cd $CASE
./xmlchange ATM_NCPL=144,STOP_N=1
./xmlchange CHARGE_ACCOUNT=$PROJ
./xmlchange CAM_CONFIG_OPTS="-phys cam5 -use_SPCAM -crm_adv MPDATA -nlev 30 -crm_nz 28 -crm_dx 4000 -crm_dt 20 -microphys mg2 -cppdefs ' -DSP_DIR_NS ' -rad rrtmg -crm_nx 4 -crm_ny 1 -crm_nx_rad 1 -crm_ny_rad 1 -SPCAM_microp_scheme sam1mom -chem none -bc_dep_to_snow_updates"
cat > user_nl_cam << 'eof'
prescribed_aero_cycle_yr = 2000
prescribed_aero_file = 'mam3_1.9x2.5_L30_2000clim_c130319.nc'
prescribed_aero_datapath = '/gpfs/alpine/world-shared/csc190/e3sm/cesm/inputdata/atm/cam/chem/trop_mam/aero'
use_hetfrz_classnuc = .false.
prescribed_aero_type = 'CYCLICAL'
aerodep_flx_type = 'CYCLICAL'
aerodep_flx_datapath = '/gpfs/alpine/world-shared/csc190/e3sm/cesm/inputdata/atm/cam/chem/trop_mam/aero'
aerodep_flx_file = 'mam3_1.9x2.5_L30_2000clim_c130319.nc'
aerodep_flx_cycle_yr = 2000
srf_flux_avg = 1
eof
cd ..
CASE=sp1vfast3d_baseline
./create_newcase -compset FSP1FAST -case $CASE -compiler $COMPILER -mach $MACH -project $PROJ -pecount $PES -res $RES --handle-preexisting-dirs r || exit -1
cd $CASE
./xmlchange ATM_NCPL=144,STOP_N=1
./xmlchange CHARGE_ACCOUNT=$PROJ
cat > user_nl_cam << 'eof'
prescribed_aero_cycle_yr = 2000
prescribed_aero_file = 'mam3_1.9x2.5_L30_2000clim_c130319.nc'
prescribed_aero_datapath = '/gpfs/alpine/world-shared/csc190/e3sm/cesm/inputdata/atm/cam/chem/trop_mam/aero'
use_hetfrz_classnuc = .false.
prescribed_aero_type = 'CYCLICAL'
aerodep_flx_type = 'CYCLICAL'
aerodep_flx_datapath = '/gpfs/alpine/world-shared/csc190/e3sm/cesm/inputdata/atm/cam/chem/trop_mam/aero'
aerodep_flx_file = 'mam3_1.9x2.5_L30_2000clim_c130319.nc'
aerodep_flx_cycle_yr = 2000
srf_flux_avg = 1
eof
From there, just cd sp1vfast[23]d_baseline and ./case.setup, change Macros.[c]make as needed, ./case.build, and ./case.submit to create your baseline netCDF files. You must be on a clean and updated master branch before you generate baselines, e.g.: git checkout master && git fetch origin && git reset --hard origin/master.
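The baseline procedure can be summarized as a checklist. The sketch below only prints the steps for one case at one optimization level; nothing is executed. The function names `step` and `baseline_pass` are made up for illustration, and the Macros edit is left as a manual step.

```shell
#!/bin/bash
# Dry-run sketch: prints the steps for one baseline pass; nothing is executed.
step() { echo "+ $*"; }

baseline_pass() {
  local case_name=$1 opt=$2   # opt is O0 or O2
  local restart="$case_name.cam.r.0001-01-02-00000"
  step "cd $case_name"
  step "./case.setup"
  step "edit Macros.make and Macros.cmake for -$opt"
  step "./case.build"
  step "./case.submit"
  # Copy only after the submitted run has finished and written the restart file.
  step "cp \$RUNDIR/$restart.nc ./$restart.opt$opt.nc"
}

baseline_pass sp1vfast2d_baseline O0
```

Calling it again with O2, and then for sp1vfast3d_baseline, lists the remaining three passes.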
Now you're ready to perform quick-running regression tests. The following script will set up your regression test cases:
#!/bin/bash
E3SM_HOME=~/ACME-ECP
COMPILER=pgigpu
MACH=summit
PES=18x1
RES=ne4_ne4
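PROJ=stf006
# create_newcase lives in cime/scripts, and the cases are created there
cd $E3SM_HOME/cime/scripts || exit -1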
CASE=sp1vfast2d_regression
./create_newcase -compset FSP1FAST -case $CASE -compiler $COMPILER -mach $MACH -project $PROJ -pecount $PES -res $RES --handle-preexisting-dirs r || exit -1
cd $CASE
./xmlchange ATM_NCPL=144,STOP_N=1
./xmlchange CHARGE_ACCOUNT=$PROJ
./xmlchange CAM_CONFIG_OPTS="-phys cam5 -use_SPCAM -crm_adv MPDATA -nlev 30 -crm_nz 28 -crm_dx 4000 -crm_dt 20 -microphys mg2 -cppdefs ' -DSP_DIR_NS ' -rad rrtmg -crm_nx 4 -crm_ny 1 -crm_nx_rad 1 -crm_ny_rad 1 -SPCAM_microp_scheme sam1mom -chem none -bc_dep_to_snow_updates"
cat > user_nl_cam << 'eof'
prescribed_aero_cycle_yr = 2000
prescribed_aero_file = 'mam3_1.9x2.5_L30_2000clim_c130319.nc'
prescribed_aero_datapath = '/gpfs/alpine/world-shared/csc190/e3sm/cesm/inputdata/atm/cam/chem/trop_mam/aero'
use_hetfrz_classnuc = .false.
prescribed_aero_type = 'CYCLICAL'
aerodep_flx_type = 'CYCLICAL'
aerodep_flx_datapath = '/gpfs/alpine/world-shared/csc190/e3sm/cesm/inputdata/atm/cam/chem/trop_mam/aero'
aerodep_flx_file = 'mam3_1.9x2.5_L30_2000clim_c130319.nc'
aerodep_flx_cycle_yr = 2000
srf_flux_avg = 1
eof
cd ..
CASE=sp1vfast3d_regression
./create_newcase -compset FSP1FAST -case $CASE -compiler $COMPILER -mach $MACH -project $PROJ -pecount $PES -res $RES --handle-preexisting-dirs r || exit -1
cd $CASE
./xmlchange ATM_NCPL=144,STOP_N=1
./xmlchange CHARGE_ACCOUNT=$PROJ
cat > user_nl_cam << 'eof'
prescribed_aero_cycle_yr = 2000
prescribed_aero_file = 'mam3_1.9x2.5_L30_2000clim_c130319.nc'
prescribed_aero_datapath = '/gpfs/alpine/world-shared/csc190/e3sm/cesm/inputdata/atm/cam/chem/trop_mam/aero'
use_hetfrz_classnuc = .false.
prescribed_aero_type = 'CYCLICAL'
aerodep_flx_type = 'CYCLICAL'
aerodep_flx_datapath = '/gpfs/alpine/world-shared/csc190/e3sm/cesm/inputdata/atm/cam/chem/trop_mam/aero'
aerodep_flx_file = 'mam3_1.9x2.5_L30_2000clim_c130319.nc'
aerodep_flx_cycle_yr = 2000
srf_flux_avg = 1
eof
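Now, the following script will run the 2-D and 3-D regressions and perform a 3-way diff against baseline. The dim2 and dim3 switches at the top select which cases run; as written, dim3 is 0, so only the 2-D case runs: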
#!/bin/bash
#BSUB -P stf006
#BSUB -W 02:00
#BSUB -nnodes 1
#BSUB -J regression
#BSUB -o regdim23.%J
#BSUB -e regdim23.%J
#BSUB -alloc_flags gpumps
source $MODULESHOME/init/bash
ulimit -s unlimited
dim2=1
dim3=0
clean=0
build=1
submit=1
E3SM_HOME=~/ACME-ECP
cd $E3SM_HOME/cime/scripts
if [[ $dim2 -eq 1 ]]; then
  CASE=sp1vfast2d_regression
  if [ ! -d "$CASE" ]; then
    echo "************* ERROR: 2D CASE DOES NOT EXIST *************"
    # Stop here; otherwise the build would run in the wrong directory
    exit -1
  else
    cd $CASE
  fi
  if [[ $clean -eq 1 ]]; then
    echo "************* CLEANING 2D CASE *************"
    ./case.build --clean-all
  fi
  if [[ $build -eq 1 ]]; then
    echo "************* BUILDING 2D CASE *************"
    ./case.build || exit -1
  fi
  if [[ $submit -eq 1 ]]; then
    echo "************* SUBMITTING 2D CASE *************"
    ./case.submit --no-batch || exit -1
    cp /gpfs/alpine/scratch/imn/stf006/e3sm/$CASE/run/$CASE.cam.r.0001-01-02-00000.nc .
  fi
  echo "************* DIFF'ING 2D *************"
  module add python/3.7.0-anaconda3-5.3.0
  source activate rrtmgp-env
  python $E3SM_HOME/cime/tools/nccmp/nccmp3.py $E3SM_HOME/cime/scripts/sp1vfast2d_baseline/sp1vfast2d_baseline.cam.r.0001-01-02-00000.optO0.nc \
    $E3SM_HOME/cime/scripts/sp1vfast2d_baseline/sp1vfast2d_baseline.cam.r.0001-01-02-00000.optO2.nc \
    $E3SM_HOME/cime/scripts/sp1vfast2d_regression/sp1vfast2d_regression.cam.r.0001-01-02-00000.nc
  source deactivate
  module rm python
  echo ""
  cd ..
fi
if [[ $dim3 -eq 1 ]]; then
  CASE=sp1vfast3d_regression
  if [ ! -d "$CASE" ]; then
    echo "************* ERROR: 3D CASE DOES NOT EXIST *************"
    # Stop here; otherwise the build would run in the wrong directory
    exit -1
  else
    cd $CASE
  fi
  if [[ $clean -eq 1 ]]; then
    echo "************* CLEANING 3D CASE *************"
    ./case.build --clean-all
  fi
  if [[ $build -eq 1 ]]; then
    echo "************* BUILDING 3D CASE *************"
    ./case.build || exit -1
  fi
  if [[ $submit -eq 1 ]]; then
    echo "************* SUBMITTING 3D CASE *************"
    ./case.submit --no-batch || exit -1
    cp /gpfs/alpine/scratch/imn/stf006/e3sm/$CASE/run/$CASE.cam.r.0001-01-02-00000.nc .
  fi
  echo "************* DIFF'ING 3D *************"
  module add python/3.7.0-anaconda3-5.3.0
  source activate rrtmgp-env
  python $E3SM_HOME/cime/tools/nccmp/nccmp3.py $E3SM_HOME/cime/scripts/sp1vfast3d_baseline/sp1vfast3d_baseline.cam.r.0001-01-02-00000.optO0.nc \
    $E3SM_HOME/cime/scripts/sp1vfast3d_baseline/sp1vfast3d_baseline.cam.r.0001-01-02-00000.optO2.nc \
    $E3SM_HOME/cime/scripts/sp1vfast3d_regression/sp1vfast3d_regression.cam.r.0001-01-02-00000.nc
  source deactivate
  module rm python
  echo ""
fi
In the script above, you'll have to replace rrtmgp-env in source activate rrtmgp-env with the name of an anaconda environment you've created that includes netCDF. As an example of how to create one:
module load python/3.7.0-anaconda3-5.3.0
conda create -n rrtmgp-env python=3.7 openssl=1.1.1b numpy netcdf4 xarray
You only need to create this environment once. From then on, you can just source activate it. Note that having an anaconda environment loaded tends to break the E3SM python scripts, so it's best to source deactivate before you run any E3SM script.
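One way to avoid running an E3SM script with an environment still loaded is a small guard. This is a sketch under an assumption: it relies on conda setting the CONDA_DEFAULT_ENV variable while an environment is active, and the function name `require_no_conda_env` is made up for illustration. It is not something the E3SM scripts do themselves.

```shell
#!/bin/bash
# Hypothetical guard: refuse to continue while a conda environment is active.
# Assumes conda exports CONDA_DEFAULT_ENV on activation.
require_no_conda_env() {
  if [ -n "${CONDA_DEFAULT_ENV:-}" ]; then
    echo "conda env '$CONDA_DEFAULT_ENV' is active; run 'source deactivate' first" >&2
    return 1
  fi
}

# Usage: put the guard in front of any E3SM script call.
if require_no_conda_env; then
  echo "ok to run E3SM scripts"
fi
```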
Currently we are using FFLAGS in Depends.summit.cmake and Depends.summit.[compiler].cmake to apply the compiler flags for GPU offloading. The reason is that the PGI compiler gives wrong answers at runtime if you use the offloading flags for all files, so we apply them only to the files that need them. Note that we also have to change LDFLAGS in Macros.[c]make for linking purposes.
For OpenMP offload porting of the CRM, I recommend that you delete all !$acc statements in crm_module.F90 that are outside the "main time stepping loop". The reason is that they currently use class and derived type data, which causes compilers some issues. Inside the main time stepping loop, however, you'll find that there aren't any pointers or direct references to derived type data.
For OpenMP porting, I recommend not bothering with optimizing data movement up front. The XL compiler will move all data for you in Fortran, so you don't need to worry about data statements. I also recommend leaving out the depend(inout:asyncid) nowait clauses at first, to avoid potential wrong answers caused by forgetting to put in !$omp taskwait. Finally, I don't recommend porting everything at once. Instead, work your way through the code about 10 kernels at a time. You will encounter segfaults with OpenMP if you do everything at once. I tried it a few weeks ago.
To get a full traceback in XL, you'll need to specify -g -qtbtable=full. It actually does an admirable job doing a full traceback.
Also, the XL compiler does not recognize simd as a useful clause in OpenMP offload. So in Fortran, just use !$omp target teams distribute parallel do collapse(N) private(...) to replace !$acc parallel loop collapse(N) private(...)
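The directive swap is mechanical enough to script as a first pass. The sketch below uses sed and the Fortran spelling (parallel do), since crm_module.F90 is Fortran. The function name `acc_to_omp` and the sample line are made up for illustration. It only matches lowercase !$acc parallel loop at the start of a line, and leaves every other OpenACC directive or clause for manual porting.

```shell
#!/bin/bash
# Mechanical first pass: rewrite "!$acc parallel loop" as the OpenMP target
# directive. Clauses such as collapse(N) and private(...) follow the
# directive and are passed through untouched.
acc_to_omp() {
  sed -e 's/^\([[:space:]]*\)!\$acc[[:space:]]\{1,\}parallel[[:space:]]\{1,\}loop/\1!$omp target teams distribute parallel do/'
}

# Made-up directive line for demonstration:
echo '    !$acc parallel loop collapse(3) private(tmp)' | acc_to_omp
```

Running a batch of kernels through this filter and rebuilding fits the advice above to port in small groups, not all at once.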