
Concatenate CICE daily output #31

Closed

Conversation

anton-seaice (Contributor)

Add script to concatenate daily CICE output into one file per month (still containing daily data) and delete the individual daily files. This relies on adding nco to the payu environment per ACCESS-NRI/payu-condaenv#24.

Following Aidan's suggestion, the script is taken from https://github.com/COSIMA/1deg_jra55_ryf/blob/master/sync_data.sh#L87-L108, and on that basis I haven't tested beyond checking that it concatenates data.

The only change I made was to change the netCDF output type to -4 (netcdf4) instead of -7 (netcdf4_classic).

The ncdump of the output looks correct, i.e. it shows the time and time-bounds dimensions with length 31 (one per day) for January.
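
For example, the check was essentially something like this (illustrative path, using the monthly file name the script produces):

ncdump -h archive/output001/ice/OUTPUT/iceh.1901-01-daily.nc | grep "time ="
# expected to show something like: time = UNLIMITED ; // (31 currently)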

Maybe @aekiss would like to review too?


❌ Automated testing cannot be run on this branch ❌
Source and Target branches must be of the form dev-<config> and release-<config> respectively, and <config> must match between them.
Rename the Source branch or check the Target branch, and try again.

@anton-seaice self-assigned this Mar 22, 2024
@aidanheerdegen (Member) left a comment

It would be good to get an idea of how long it takes to concatenate some of the high res data and post the results in the PR or Issue.

If it's time-consuming, the script could have PBS directives added and run as a postscript which is then submitted to the queue.
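
For reference, a minimal sketch of what that might look like (untested; the queue, resource requests and script name are placeholders, not part of this PR):

#!/usr/bin/env bash
#PBS -q copyq
#PBS -l ncpus=1
#PBS -l mem=4GB
#PBS -l walltime=00:30:00
#PBS -l wd
# ... concatenation commands as in this script ...

with something like postscript: ./concat_ice_daily.sh (hypothetical name) added to config.yaml so payu submits it after each run.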

#concatenate sea-ice daily output
#script inspired from https://github.com/COSIMA/1deg_jra55_ryf/blob/master/sync_data.sh#L87-L108

for d in archive/output*/ice/OUTPUT; do
Member

This loops over all the output directories. If we're running this each time we shouldn't have to do that.

I can see two options:

  1. Determine the most recent output directory and just run there
  2. Invoke this with the run userscript hook and do the concatenation in the work/ice/OUTPUT directory before it is archived.

The issue with option 2 is that there is already a run userscript. I honestly have no idea what would happen if you tried to run two scripts in a single line, say with && or separated with a ;.
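
For illustration, the chained form being discussed might look something like this in config.yaml (untested; script names hypothetical):

userscripts:
    run: ./existing_run_script.sh && ./concat_ice_daily.sh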

Another point: apparently there has been a requirement in the past to concatenate 6-hourly data:

https://github.com/COSIMA/01deg_jra55_iaf/blob/01deg_jra55v140_iaf_cycle4/concat_ice_6hourlies.sh#L6

Is there a way we could accommodate that use case as well in a general way, I wonder.

Contributor (Author)

If we assume that the user might configure the output to be saved at any number of hours, this gets hard ...

We could assume there will always be data saved at 12 hours (i.e. every combination of 1/2/3/4/6/12-hourly output would include a 12-hour file), and then we can find the days with something like $output_dir/iceh*.????-??-01-43200.nc. Messy but probably OK.

The other complexity is that CICE timestamps are at the end of the time period, e.g. with hourly data there is a file named for midnight at the end of the month. So for January, an archive/output001/ice/OUTPUT/iceh_03h.1901-02-01-00000.nc file is made, but this contains January data.

So I don't know how, in Bash, to make a list of all files for a month that handles both those conditions? We would probably need to use a calendar tool?
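
For illustration only, one untested sketch (assuming GNU date and the iceh_03h.* naming above; out_dir, year and month are assumed to be set already):

# list one month of sub-daily files, allowing for CICE's end-of-period timestamps
next=$(date -d "${year}-${month}-01 +1 month" +%Y-%m)
files=()
for f in "${out_dir}"/iceh_03h.${year}-${month}-*.nc; do
    # skip the midnight file on the 1st; it holds the previous month's last record
    [[ $f == *".${year}-${month}-01-00000.nc" ]] || files+=("$f")
done
# the end-of-month record sits in the midnight file dated the 1st of the next month
endfile="${out_dir}/iceh_03h.${next}-01-00000.nc"
[[ -f $endfile ]] && files+=("$endfile")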

Member

Thanks for the thoughtful engagement. It sounds like we should probably shoot for the common use case to begin with and make an issue to update to a more general form at a later date.

Comment on lines 6 to 23
for f in $d/iceh.????-??-01.nc; do
    if [[ ! -f ${f/-01.nc/-IN-PROGRESS} ]] && [[ ! -f ${f/-01.nc/-daily.nc} ]];
    then
        touch ${f/-01.nc/-IN-PROGRESS}
        echo "doing ncrcat -O -L 5 -4 ${f/-01.nc/-??.nc} ${f/-01.nc/-daily.nc}"
        ${PAYU_PATH}/ncrcat -O -L 5 -4 ${f/-01.nc/-??.nc} ${f/-01.nc/-daily.nc} && chmod g+r ${f/-01.nc/-daily.nc} && rm ${f/-01.nc/-IN-PROGRESS}
        if [[ ! -f ${f/-01.nc/-IN-PROGRESS} ]] && [[ -f ${f/-01.nc/-daily.nc} ]];
        then
            for daily in ${f/-01.nc/-??.nc}
            do
                # mv $daily $daily-DELETE # rename individual daily files - user to delete
                rm $daily
            done
        else
            rm ${f/-01.nc/-IN-PROGRESS}
        fi
    fi
done
Member

Suggested change
# Don't error if there are no matching patterns
shopt -s nullglob
# Assuming `$d` contains the directory where the data resides
for first_file in $d/iceh.????-??-01.nc
do
    # Make a list of all files we wish to concatenate
    icefiles=(${first_file/-01.nc/-??.nc})
    if [ ${#icefiles[@]} -gt 0 ]
    then
        iceout="${first_file/-*.nc/-daily.nc}"
        ncrcat -O -L 5 -4 "${icefiles[@]}" ${iceout} && rm "${icefiles[@]}"
    fi
done

Personally I prefer to just delete the files if the return status of the ncrcat command is ok. Making temporary files ends up introducing extra logic to deal with them.

Note the above is untested, just a suggestion for how to reduce the complexity of the logic.

Contributor (Author)

Making temporary files ends up introducing extra logic to deal with them.

This is copied from the COSIMA scripts. I assume the temporary files were needed for some edge case? @aekiss - Do you know why the temporary files were used?

Member

Sorry to badger you @aekiss but I'm also curious if there were cases of data loss that prompted the design you implemented?

Contributor (Author)

Andrew said it was just for sanity checking / debugging in case of failure. I am happy to remove it.

@anton-seaice (Contributor, Author)

It would be good to get an idea of how long it takes to concatenate some of the high res data and post the results in the PR or Issue.

How would one do this? Are there ways to log the PBS "CPU Time" between user scripts?
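
One simple option (a sketch, not something from this PR) would be to bracket the step with timestamps inside the userscript so the duration ends up in the job's stdout log:

start=$(date +%s)
# ... ncrcat concatenation commands ...
echo "ice concatenation took $(( $(date +%s) - start )) seconds"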

@anton-seaice (Contributor, Author)

It would be good to get an idea of how long it takes to concatenate some of the high res data and post the results in the PR or Issue.

If it's time-consuming the script could have PBS directives added and run as a postscript which is then submitted to the queue

With one month of 0.1-degree data, this takes ~2.5 minutes to run on the login node, compared to approx. 1.6 hours of walltime for the model run. (Amazingly it turns 3.6GB into 1.5GB too!)

This doesn't parallelise, and that's ~2-3% of the walltime, so I guess it is worth worrying about?

(Reducing compression to level 1 reduces the time to ~1m50sec, but file size goes up ~6%.)

@jo-basevi (Collaborator)

Instead of using an archive userscript, could we instead use the sync userscript? This will run prior to running any sync commands, and it has the benefit of running on a job with fewer resources. The only con I can think of is that it'll only run when sync is enabled, but I guess that'll be similar to how the post-script sync-data.sh script originally worked.
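
For illustration, the sync-userscript approach might be configured like this in config.yaml (untested; script name hypothetical):

sync:
    enable: true
userscripts:
    sync: ./concat_ice_daily.sh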

@aidanheerdegen (Member)

Amazingly it turns 3.6GB into 1.5GB too!

IKR. There is a reason this is worth doing.

Only con I can think of is that it'll only run when sync is enabled, but I guess that'll be similar to how the post-script sync-data.sh script originally worked

That is a good idea thanks @jo-basevi.

I think it does run into the issue that if someone turns sync on part way through their experiment not all of the ice data files will be smooshed together (technical term). But we can be explicit that users should set sync for this reason (and for other reasons like their data evaporating). We could also give instructions on how to run the command directly on output directories that haven't had their ice files smooshed.

I did wonder if we couldn't define a collate option for CICE that did this work. Ultimately it is a nice idea, but not doable in the time frames we have available and we should plough ahead with using a sync userscript.

@anton-seaice (Contributor, Author)

I think moving to a payu postscript is the best plan: as it runs as a separate PBS job, this reduces the resources held waiting for a single-PE job to complete?

I did wonder if we couldn't define a collate option for CICE that did this work. Ultimately it is a nice idea, but not doable in the time frames we have available and we should plough ahead with using a sync userscript.

I think we might get rid of the need for this step in OM3, or at least remove the grid from the CICE output.

Also - we've added nco as a dependency in some cases. Do we need to document this somewhere (for users who don't use vk83)?

@anton-seaice (Contributor, Author)

It looks like setting this as a postscript would stop the sync from running?

https://payu.readthedocs.io/en/latest/config.html#postprocessing


❌ Automated testing cannot be run on this branch ❌
Source and Target branches must be of the form dev-<config> and release-<config> respectively, and <config> must match between them.
Rename the Source branch or check the Target branch, and try again.

@anton-seaice (Contributor, Author)

@aidan - I have updated based on the review comments. Back to you.

I switched to using the system nco module, rather than adding it to payu-env?

I cleaned up the script to remove the unneeded operations and only check the last archive folder.

@jo-basevi (Collaborator)

Yeah, if postscript is used and sync is enabled, then it won't rsync the latest output, as payu has no idea when the postscript job completes or whether it modifies the current output. It'll still rsync outputs prior to the last one, if they haven't already been synced.

Also, if syncing is enabled, a sync job by default runs on the copyq queue. It starts after the main payu run job, or the collate job if that's enabled, has completed. Its resources can be configured similarly to the collate config, so it can be given more or fewer resources if needed.
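
For example, something along these lines in config.yaml (values illustrative; key names assumed to mirror the collate section, as described above):

sync:
    enable: true
    queue: copyq
    ncpus: 1
    mem: 4GB
    walltime: 00:30:00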

#concatenate sea-ice daily output
#script inspired from https://github.com/COSIMA/1deg_jra55_ryf/blob/master/sync_data.sh#L87-L108

out_dir=$(ls -td archive/output??? | head -1)/ice/OUTPUT #latest output dir only
Contributor (Author)

Suggested change
out_dir=$(ls -td archive/output??? | head -1)/ice/OUTPUT #latest output dir only
out_dir=$(ls -dr archive/output??? | head -1)/ice/OUTPUT #latest output dir only

for f in $out_dir/iceh.????-??-01.nc; do
    #concat daily files for this month
    echo "doing ncrcat -O -L 5 -4 ${f/-01.nc/-??.nc} ${f/-01.nc/-daily.nc}"
    ncrcat -O -L 5 -4 ${f/-01.nc/-??.nc} ${f/-01.nc/-daily.nc}
Contributor (Author)

Suggested change
ncrcat -O -L 5 -4 ${f/-01.nc/-??.nc} ${f/-01.nc/-daily.nc}
${PAYU_PATH}/ncrcat -O -L 5 -4 ${f/-01.nc/-??.nc} ${f/-01.nc/-daily.nc}


modules:
    load:
        - nco/5.0.5
Contributor (Author)

Suggested change
- nco/5.0.5

@aidanheerdegen (Member)

Looking down the barrel of adding this to every config, and not being confident we wouldn't have to update it in the future (see the conversation about 6-hourly concatenation), I've made a new repo and moved the code to a PR there:

ACCESS-NRI/om2-scripts#1

When we've got that merged I'll manually pop it in vk83 (like I did with mppnccombine-fast) and we'll work on a longer term solution later.

Sorry for mucking you about @anton-seaice

@anton-seaice (Contributor, Author)

Ok - no worries. I'll put my changes there. Do we still need to update the config.yaml here?

@aidanheerdegen (Member)

Do we still need to update the config.yaml here?

Maybe we'll leave this open and just update the config.yaml as an exemplar when we have a final location of the concat script.

@aidanheerdegen (Member)

This was superseded by a script in a separate repo, in this PR:

ACCESS-NRI/om2-scripts#1
