vcs.download_sample_data_files() keeps on downloading th_yr.nc #392

Open
jypeter opened this issue Mar 12, 2019 · 4 comments

@jypeter
Member

jypeter commented Mar 12, 2019

@doutriaux1 I wonder if there is a problem with a) vcs.download_sample_data_files() or with b) th_yr.nc

When I execute a) in a CDAT installation where it has already been run, it apparently sees that the files are already there, except for b), which it downloads 3 times. The same thing happens if I re-execute a).

-rw-r--r-- 1 jypeter lsce   332776 Mar 12 14:44 th_yr.nc

(cdatm18_py2) jypeter@obelix2 - ...jypeter - 61 >python -c 'import vcs; vcs.download_sample_data_files(); print "\nFinished downloading sample data to", vcs.sample_data'
Downloading: 'th_yr.nc' from 'https://cdat.llnl.gov/cdat/sample_data/' in: /home/share/unix_files/cdat/miniconda3/envs/cdatm18_py2/share/cdat/sample_data/th_yr.nc
Downloading: 'th_yr.nc' from 'https://cdat.llnl.gov/cdat/sample_data/' in: /home/share/unix_files/cdat/miniconda3/envs/cdatm18_py2/share/cdat/sample_data/th_yr.nc
Downloading: 'th_yr.nc' from 'https://cdat.llnl.gov/cdat/sample_data/' in: /home/share/unix_files/cdat/miniconda3/envs/cdatm18_py2/share/cdat/sample_data/th_yr.nc

Finished downloading sample data to /home/share/unix_files/cdat/miniconda3/envs/cdatm18_py2/share/cdat/sample_data

(cdatm18_py2) jypeter@obelix2 - ...jypeter - 62 >ls -ltr /home/share/unix_files/cdat/miniconda3/envs/cdatm18_py2/share/cdat/sample_data | tail
-rw-r--r-- 1 jypeter lsce     3216 Mar  7 16:27 tas_gavg_rnl_ecm.nc
-rw-r--r-- 1 jypeter lsce   510144 Mar  7 16:27 tas_ecm_1979.nc
-rw-r--r-- 1 jypeter lsce   128360 Mar  7 16:27 tas_cru_1979.nc
-rw-r--r-- 1 jypeter lsce  2107996 Mar  7 16:27 psl_6h.nc
-rw-r--r-- 1 jypeter lsce   366116 Mar  7 16:27 ts_da.nc
-rw-r--r-- 1 jypeter lsce  2678584 Mar  7 16:27 tas_mo.nc
-rw-r--r-- 1 jypeter lsce   159468 Mar  7 16:27 tas_mo_clim.nc
-rw-r--r-- 1 jypeter lsce  6280312 Mar  7 16:27 tas_6h.nc
-rw-r--r-- 1 jypeter lsce 34487602 Mar  7 16:27 geos5-sample.nc
-rw-r--r-- 1 jypeter lsce   332776 Mar 12 14:46 th_yr.nc

(cdatm18_py2) jypeter@obelix2 - ...jypeter - 63 >python -c 'import vcs; vcs.download_sample_data_files(); print "\nFinished downloading sample data to", vcs.sample_data'
Downloading: 'th_yr.nc' from 'https://cdat.llnl.gov/cdat/sample_data/' in: /home/share/unix_files/cdat/miniconda3/envs/cdatm18_py2/share/cdat/sample_data/th_yr.nc
Downloading: 'th_yr.nc' from 'https://cdat.llnl.gov/cdat/sample_data/' in: /home/share/unix_files/cdat/miniconda3/envs/cdatm18_py2/share/cdat/sample_data/th_yr.nc
Downloading: 'th_yr.nc' from 'https://cdat.llnl.gov/cdat/sample_data/' in: /home/share/unix_files/cdat/miniconda3/envs/cdatm18_py2/share/cdat/sample_data/th_yr.nc

Finished downloading sample data to /home/share/unix_files/cdat/miniconda3/envs/cdatm18_py2/share/cdat/sample_data

(cdatm18_py2) jypeter@obelix2 - ...jypeter - 64 >ls -ltr /home/share/unix_files/cdat/miniconda3/envs/cdatm18_py2/share/cdat/sample_data | tail
-rw-r--r-- 1 jypeter lsce     3216 Mar  7 16:27 tas_gavg_rnl_ecm.nc
-rw-r--r-- 1 jypeter lsce   510144 Mar  7 16:27 tas_ecm_1979.nc
-rw-r--r-- 1 jypeter lsce   128360 Mar  7 16:27 tas_cru_1979.nc
-rw-r--r-- 1 jypeter lsce  2107996 Mar  7 16:27 psl_6h.nc
-rw-r--r-- 1 jypeter lsce   366116 Mar  7 16:27 ts_da.nc
-rw-r--r-- 1 jypeter lsce  2678584 Mar  7 16:27 tas_mo.nc
-rw-r--r-- 1 jypeter lsce   159468 Mar  7 16:27 tas_mo_clim.nc
-rw-r--r-- 1 jypeter lsce  6280312 Mar  7 16:27 tas_6h.nc
-rw-r--r-- 1 jypeter lsce 34487602 Mar  7 16:27 geos5-sample.nc
-rw-r--r-- 1 jypeter lsce   332776 Mar 12 14:56 th_yr.nc
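
One pattern that would produce exactly three "Downloading:" lines per run is a verify-then-retry loop whose reference checksum never matches the file that is served. The sketch below is a hypothetical illustration of that pattern, not the actual vcs code: `sync_file`, `fetch`, and `MAX_ATTEMPTS` are invented names, and the download is simulated in memory.

```python
import hashlib

MAX_ATTEMPTS = 3  # assumed retry limit, chosen to match the three log lines


def sync_file(store, name, expected_md5, fetch):
    """Re-fetch `name` until its MD5 matches `expected_md5` or attempts run out.

    `store` is a dict standing in for the sample_data directory and `fetch`
    is any callable returning the file's bytes (both hypothetical).
    Returns the number of downloads performed.
    """
    downloads = 0
    for _ in range(MAX_ATTEMPTS):
        data = store.get(name)
        if data is not None and hashlib.md5(data).hexdigest() == expected_md5:
            break  # local copy verified, nothing more to do
        store[name] = fetch(name)
        downloads += 1
    return downloads


payload = b"pretend netCDF bytes"
good = hashlib.md5(payload).hexdigest()
store = {"th_yr.nc": payload}

# Correct reference checksum: the existing file is accepted, no download.
print(sync_file(store, "th_yr.nc", good, lambda n: payload))  # prints 0
# Wrong reference checksum: every attempt fails verification.
print(sync_file(store, "th_yr.nc", "0" * 32, lambda n: payload))  # prints 3
```

If the real downloader works along these lines, an unchanged file with a fresh timestamp after every run is what one would expect to see.
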
@doutriaux1
Contributor

doutriaux1 commented Mar 12, 2019 via email

@jypeter
Member Author

jypeter commented Mar 14, 2019

The file I got seems valid enough. You can compare my md5sum below with that of the source file.

jypeter@obelix4 - ...sample_data - 54 >ls -l th_yr.nc
-rw-r--r-- 1 jypeter lsce 332776 Mar 12 14:56 th_yr.nc

jypeter@obelix4 - ...sample_data - 55 >md5sum th_yr.nc
00f26c388be3a13fecc1d7583e234353  th_yr.nc

jypeter@obelix4 - ...sample_data - 56 >ncdump th_yr.nc | head -30
netcdf th_yr {
dimensions:
        Time_th = UNLIMITED ; // (10 currently)
        Latitude = 64 ;
        bound = 2 ;
        Longitude_th = 128 ;
variables:
        float Time_th(Time_th) ;
                Time_th:units = "years since 1" ;
                Time_th:calendar = "proleptic_gregorian" ;
                Time_th:axis = "T" ;
        float Latitude(Latitude) ;
                Latitude:bounds = "bounds_Latitude" ;
                Latitude:units = "Degrees" ;
                Latitude:title = "" ;
                Latitude:time = "11:47:39" ;
                Latitude:source = "" ;
                Latitude:date = "25/04/02" ;
                Latitude:axis = "Y" ;
        double bounds_Latitude(Latitude, bound) ;
        float Longitude_th(Longitude_th) ;
                Longitude_th:bounds = "bounds_Longitude_th" ;
                Longitude_th:axis = "X" ;
                Longitude_th:units = "degrees_east" ;
                Longitude_th:modulo = 360. ;
                Longitude_th:topology = "circular" ;
        double bounds_Longitude_th(Longitude_th, bound) ;
        float th(Time_th, Latitude, Longitude_th) ;
                th:missing_value = 1.e+20f ;
                th:date = "25/04/02" ;

jypeter@obelix4 - ...sample_data - 57 >ncdump th_yr.nc | tail -30
    249.6738, 249.7106, 249.7536, 249.8022, 249.8556, 249.9131, 249.9741,
    250.0379, 250.1039, 250.1716, 250.2402, 250.3094, 250.3787, 250.4476,
    250.5156, 250.5823, 250.6474, 250.7104, 250.771, 250.8291, 250.8843,
    250.9366, 250.9859, 251.0325, 251.0763, 251.1176, 251.1569, 251.1942,
    251.23, 251.2647, 251.2986, 251.3321, 251.3653, 251.3986, 251.4322,
    251.4664, 251.5014, 251.5375, 251.5748, 251.6137, 251.6545, 251.6972,
    251.7423, 251.7899, 251.8402, 251.8932, 251.949, 252.0076, 252.0686,
    252.1319, 252.1971, 252.2638, 252.3314, 252.3996, 252.4674, 252.5345,
    252.6001, 252.6638, 252.725, 252.7832, 252.838, 252.8891, 252.9364,
    252.9797, 253.019,
  252.6102, 252.6246, 252.6373, 252.6483, 252.6574, 252.6647, 252.6702,
    252.6738, 252.6755, 252.6754, 252.6733, 252.6694, 252.6637, 252.6559,
    252.6463, 252.6347, 252.6212, 252.6057, 252.5882, 252.5687, 252.5471,
    252.5236, 252.4979, 252.4702, 252.4405, 252.4087, 252.3749, 252.3391,
    252.3015, 252.2621, 252.2209, 252.1782, 252.134, 252.0885, 252.0419,
    251.9945, 251.9464, 251.8978, 251.8491, 251.8004, 251.7521, 251.7045,
    251.6576, 251.6119, 251.5676, 251.5249, 251.4841, 251.4452, 251.4086,
    251.3745, 251.3445, 251.3172, 251.2925, 251.2707, 251.2515, 251.2354,
    251.2219, 251.2113, 251.2034, 251.1982, 251.1957, 251.1955, 251.1978,
    251.2023, 251.209, 251.2176, 251.228, 251.2401, 251.2538, 251.2688,
    251.2851, 251.3024, 251.3206, 251.3396, 251.3592, 251.3794, 251.4,
    251.4209, 251.442, 251.4633, 251.4846, 251.5059, 251.5271, 251.5483,
    251.5694, 251.5903, 251.6112, 251.6319, 251.6526, 251.6733, 251.694,
    251.7147, 251.7355, 251.7564, 251.7776, 251.799, 251.8208, 251.8428,
    251.8654, 251.8883, 251.9118, 251.9357, 251.9603, 251.9853, 252.0109,
    252.0369, 252.0636, 252.0906, 252.1181, 252.1458, 252.1738, 252.202,
    252.2303, 252.2586, 252.2866, 252.3145, 252.3419, 252.3688, 252.3951,
    252.4206, 252.4453, 252.4689, 252.4914, 252.5138, 252.5361, 252.5569,
    252.5763, 252.5941 ;
}

@doutriaux1
Contributor

@jypeter it's possible the md5 is wrong in our check list; I'll double-check.
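
A quick way to test that hypothesis is to compute the MD5 of the local file in Python and compare it with the value in the check list. This is a minimal sketch using only `hashlib`; the helper names `file_md5` and `matches_listing` are made up for illustration, and the format of the check list itself is not assumed here.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_md5):
    """True when the local file's MD5 equals the value taken from the listing."""
    return file_md5(path) == listed_md5.strip().lower()
```

Calling `file_md5("th_yr.nc")` on the downloaded file should reproduce the `md5sum` digest reported above; if the check list holds a different value for th_yr.nc, `matches_listing` returns False and the listing is the likely culprit.
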

@jypeter
Member Author

jypeter commented May 17, 2021

@downiec @jasonb5 I have just noticed the problem again and almost created a new issue!

I'm using the latest stable vcs (and cdms)

 >conda list | egrep '(vcs|cdms)'
cdms2                     3.1.5                    pypi_0    pypi
libcdms                   3.1.2              h981a4fd_113    conda-forge
vcs                       8.2.1              pyh9f0ad1d_0    cdat/label/v8.2.1
vcsaddons                 8.2.1            py38h1e0a361_0    cdat/label/v8.2.1

I have downloaded the full sample data to a new installation

 >ls -ltr /home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm19_nompi_py3/share/cdat/sample_data | tail -3
-rw-r--r-- 1 jypeter lsce  6280312 May 17 16:39 tas_6h.nc
-rw-r--r-- 1 jypeter lsce 34487602 May 17 16:39 geos5-sample.nc
-rw-r--r-- 1 jypeter lsce   332776 May 17 16:49 th_yr.nc

 >md5sum /home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm19_nompi_py3/share/cdat/sample_data/th_yr.nc
00f26c388be3a13fecc1d7583e234353  /home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm19_nompi_py3/share/cdat/sample_data/th_yr.nc

But each time I execute vcs.download_sample_data_files(), it downloads th_yr.nc again, three times!

>>> vcs.download_sample_data_files()
Downloading: 'th_yr.nc' from 'https://cdat.llnl.gov/cdat/sample_data/' in: /home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm19_nompi_py3/share/cdat/sample_data/th_yr.nc
Downloading: 'th_yr.nc' from 'https://cdat.llnl.gov/cdat/sample_data/' in: /home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm19_nompi_py3/share/cdat/sample_data/th_yr.nc
Downloading: 'th_yr.nc' from 'https://cdat.llnl.gov/cdat/sample_data/' in: /home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm19_nompi_py3/share/cdat/sample_data/th_yr.nc

The file has not changed (according to md5sum), of course

 >ls -ltr /home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm19_nompi_py3/share/cdat/sample_data | tail -3
-rw-r--r-- 1 jypeter lsce  6280312 May 17 16:39 tas_6h.nc
-rw-r--r-- 1 jypeter lsce 34487602 May 17 16:39 geos5-sample.nc
-rw-r--r-- 1 jypeter lsce   332776 May 17 17:06 th_yr.nc

 >md5sum /home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm19_nompi_py3/share/cdat/sample_data/th_yr.nc
00f26c388be3a13fecc1d7583e234353  /home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm19_nompi_py3/share/cdat/sample_data/th_yr.nc

Well, you can even see this in the output of the tutorials, e.g. https://cdat.llnl.gov/Jupyter-notebooks/vcs/VCS_Example/VCS_Example.html

I'm afraid that if people run the notebooks on a server where CDAT and the data files were installed by somebody else, the notebook may fail when the download tries to write the data file into a directory they don't have write access to.
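
Until the repeated download is fixed, a notebook could guard the call with a write-permission check on the sample data directory. The sketch below is only a possible workaround: `can_write_sample_data` is a hypothetical helper, not part of vcs, and it relies on the standard `os.access` check, which reflects the permissions of the current user.

```python
import os


def can_write_sample_data(directory):
    """Return True if `directory` exists and the current user may create files in it."""
    return os.path.isdir(directory) and os.access(directory, os.W_OK | os.X_OK)
```

A notebook cell could then call `vcs.download_sample_data_files()` only when `can_write_sample_data(vcs.sample_data)` is true, and otherwise rely on the files that are already installed.
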
