
GEMPAK inefficiency exposed during case run of HR5 with post-processing #3300

Open
ChristopherHill-NOAA opened this issue Feb 4, 2025 · 0 comments
Labels: bug (Something isn't working), triage (Issues that are triage)

@ChristopherHill-NOAA (Contributor)

What is wrong?

A non-fatal error consumes extra file space and run time during one or more GRIB2-to-GEMPAK conversion processes within the gfs_gempak_fFFF-fFFF job.

What should have happened?

Each GEMPAK process converting the GFS atmospheric forecast GRIB2 products should run without error.

What machines are impacted?

WCOSS2

What global-workflow hash are you using?

7886699

Steps to reproduce

Using the hash listed above, a non-fatal error occurs repeatedly during the ecFlow execution of gfs_gempak_f207-f219 within the HR5 workflow, specifically for cycle 2019120300. GEMPAK v7.14.1 is known to have been sourced. The frequency of the error in other cycle cases has yet to be determined. Please consult @RuiyuSun for complete testing details.

Additional information

While each NAGRIB2 conversion process runs to completion, repeated GEMPAK error messages accumulate within the gfs_gempak_f207-f219 job log file. That file occupies far more space (19 GB) than the next-largest log file (4.3 MB), and generating the errors adds approximately 15 seconds to the run time of the ecFlow job.

These errors did not occur within the HR4 workflow, where the conversion process for each forecast-hour product was performed in a separate ecFlow job. Concatenating five GRIB2-to-GEMPAK forecast-hour conversion processes into one ecFlow job in the HR5 workflow may be producing an aggregation of concurrently unfinished GEMPAK processes, exceeding a message-queue threshold sourced from the GEMPAK (v7.14.1) environment, as sketched below.
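To make the hypothesis concrete, here is a minimal sketch of how five conversions condensed into one job could leave several GEMPAK message queues allocated at once. The loop structure, file names, and NAGRIB2 parameter values are assumptions for illustration only, not the actual gfs_gempak job script.

```sh
#!/bin/sh
# Hypothetical condensation of five per-forecast-hour conversions into a
# single ecFlow job (illustrative paths and parameters; not the actual
# gfs_gempak script).
for fhr in 207 210 213 216 219; do
  (
    nagrib2 << EOF
GBFILE   = gfs.t00z.pgrb2.0p25.f${fhr}
GDOUTF   = gfs_0p25_f${fhr}.gem
CPYFIL   = gds
MAXGRD   = 4999
OUTPUT   = T
run

exit
EOF
  ) &   # each subshell starts its own GEMPAK process; without 'gpend',
        # the message queue it allocates is not released promptly
done
wait    # five concurrently unfinished GEMPAK processes could exceed the
        # message-queue threshold hypothesized above
```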

Do you have a proposed solution?

Initial potential solutions for this issue:

  • source GEMPAK v7.15.1 vice v7.14.1
  • refine each ecFlow job to convert fewer forecast-hour products (e.g., 3 vice 5)
  • increase the message-queue threshold defined within the GEMPAK environment
  • include a time delay between each GEMPAK process within each gfs_gempak_fFFF-fFFF ecFlow job
  • ensure the GEMPAK command 'gpend' is invoked with each completed or failed GEMPAK process (a cleanup sketch follows this list)
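If the 'gpend' route is pursued, a hedged sketch of the cleanup pattern follows. The run_nagrib2 function, file names, and parameter values are hypothetical stand-ins, not the actual workflow script; the point is only that 'gpend' runs after every NAGRIB2 process, whether it completes or fails, so each message queue is released before the next conversion begins.

```sh
#!/bin/sh
# Illustrative wrapper (hypothetical names): guarantee 'gpend' runs after
# every NAGRIB2 invocation so its GEMPAK message queue is released.
run_nagrib2 () {
  fhr=$1
  nagrib2 << EOF
GBFILE   = gfs.t00z.pgrb2.0p25.f${fhr}
GDOUTF   = gfs_0p25_f${fhr}.gem
CPYFIL   = gds
MAXGRD   = 4999
OUTPUT   = T
run

exit
EOF
  rc=$?
  gpend   # release the GEMPAK message queue even when NAGRIB2 fails
  return ${rc}
}

for fhr in 207 210 213 216 219; do
  run_nagrib2 "${fhr}" || echo "WARNING: NAGRIB2 failed for f${fhr}" >&2
done
```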
@ChristopherHill-NOAA added the bug and triage labels on Feb 4, 2025.