diff --git a/_sources/execution.rst.txt b/_sources/execution.rst.txt
index 975c395..bd0f2df 100644
--- a/_sources/execution.rst.txt
+++ b/_sources/execution.rst.txt
@@ -5,14 +5,77 @@ Running the Pipeline
 Running for a single dataset
 ============================
-::
+Run all single-dataset processes with the `single_run.py` script.
+```
 usage: single_run.py [-h] [-w WORKDIR] [-g GROUPDIR] [-G GROUPID] [-p PROJ_DIR] [-n NEW_VERSION] [-m MODE] [-t TIME_ALLOWED] [-b] [-s SUBSET] [-r REPEAT_ID] [-f] [-v] [-d] [-Q] phase proj_code
-::
-Required Parameters:
-:phase: Pipeline phase to execute
-:proj_code: Project code or ID for this dataset execution
+positional arguments:
+  phase                 Phase of the pipeline to initiate
+  proj_code             Project identifier code
+
+options:
+  -h, --help            show this help message and exit
+  -w WORKDIR, --workdir WORKDIR
+                        Working directory for pipeline
+  -g GROUPDIR, --groupdir GROUPDIR
+                        Group directory for pipeline
+  -G GROUPID, --groupID GROUPID
+                        Group identifier label
+  -p PROJ_DIR, --proj_dir PROJ_DIR
+                        Project directory for pipeline
+  -n NEW_VERSION, --new_version NEW_VERSION
+                        If present, create a new version
+  -m MODE, --mode MODE  Print or record information (log or std)
+  -t TIME_ALLOWED, --time-allowed TIME_ALLOWED
+                        Time limit for this job
+  -b, --bypass-errs     Bypass all error messages - skip failed jobs
+  -s SUBSET, --subset SUBSET
+                        Size of subset within group
+  -r REPEAT_ID, --repeat_id REPEAT_ID
+                        Repeat id (1 if first time running, <phase>_<repeat> otherwise)
+  -f                    Force overwrite of steps if previously done
+  -v, --verbose         Print helpful statements while running
+  -d, --dryrun          Perform dry-run (i.e no new files/dirs created)
+  -Q, --quality         Quality assured checks - thorough run
+```
 ===============================
 Running for a group of datasets
-===============================
\ No newline at end of file
+===============================
+
+Run all multi-dataset group processes within the pipeline using the `group_run.py` script.
+
+```
+usage: group_run.py [-h] [-s SOURCE] [-e VENVPATH] [-w WORKDIR] [-g GROUPDIR] [-p PROJ_DIR] [-n NEW_VERSION] [-m MODE] [-t TIME_ALLOWED] [-b] [-i INPUT] [-S SUBSET] [-r REPEAT_ID] [-f] [-v] [-d] [-Q] phase groupID
+
+positional arguments:
+  phase                 Phase of the pipeline to initiate
+  groupID               Group identifier code
+
+options:
+  -h, --help            show this help message and exit
+  -s SOURCE             Path to directory containing master scripts (this one)
+  -e VENVPATH           Path to virtual (e)nvironment (excludes /bin/activate)
+  -w WORKDIR, --workdir WORKDIR
+                        Working directory for pipeline
+  -g GROUPDIR, --groupdir GROUPDIR
+                        Group directory for pipeline
+  -p PROJ_DIR, --proj_dir PROJ_DIR
+                        Project directory for pipeline
+  -n NEW_VERSION, --new_version NEW_VERSION
+                        If present, create a new version
+  -m MODE, --mode MODE  Print or record information (log or std)
+  -t TIME_ALLOWED, --time-allowed TIME_ALLOWED
+                        Time limit for this job
+  -b, --bypass-errs     Bypass all error messages - skip failed jobs
+  -i INPUT, --input INPUT
+                        input file (for init phase)
+  -S SUBSET, --subset SUBSET
+                        Size of subset within group
+  -r REPEAT_ID, --repeat_id REPEAT_ID
+                        Repeat id (1 if first time running, <phase>_<repeat> otherwise)
+  -f                    Force overwrite of steps if previously done
+  -v, --verbose         Print helpful statements while running
+  -d, --dryrun          Perform dry-run (i.e no new files/dirs created)
+  -Q, --quality         Quality assured checks - thorough run
+```
\ No newline at end of file
diff --git a/execution.html b/execution.html
index fd83fd6..daaf7a3 100644
--- a/execution.html
+++ b/execution.html
@@ -85,14 +85,129 @@

Running the Pipeline

Running for a single dataset

-

:: +

Run all single-dataset processes with the single_run.py script.
+```
 usage: single_run.py [-h] [-w WORKDIR] [-g GROUPDIR] [-G GROUPID] [-p PROJ_DIR] [-n NEW_VERSION] [-m MODE] [-t TIME_ALLOWED] [-b] [-s SUBSET] [-r REPEAT_ID] [-f] [-v] [-d] [-Q] phase proj_code

-

Required Parameters: -:phase: Pipeline phase to execute -:proj_code: Project code or ID for this dataset execution

+
+
positional arguments:

phase Phase of the pipeline to initiate +proj_code Project identifier code

+
+
options:
+
-h, --help
+

show this help message and exit

+
+
-w WORKDIR, --workdir WORKDIR
+

Working directory for pipeline

+
+
-g GROUPDIR, --groupdir GROUPDIR
+

Group directory for pipeline

+
+
-G GROUPID, --groupID GROUPID
+

Group identifier label

+
+
-p PROJ_DIR, --proj_dir PROJ_DIR
+

Project directory for pipeline

+
+
-n NEW_VERSION, --new_version NEW_VERSION
+

If present, create a new version

+
+
-m MODE, --mode MODE
+

Print or record information (log or std)

+
+
-t TIME_ALLOWED, --time-allowed TIME_ALLOWED
+

Time limit for this job

+
+
-b, --bypass-errs
+

Bypass all error messages - skip failed jobs

+
+
-s SUBSET, --subset SUBSET
+

Size of subset within group

+
+
-r REPEAT_ID, --repeat_id REPEAT_ID
+

Repeat id (1 if first time running, <phase>_<repeat> otherwise)

+
+
-f
+

Force overwrite of steps if previously done

+
+
-v, --verbose
+

Print helpful statements while running

+
+
-d, --dryrun
+

Perform dry-run (i.e no new files/dirs created)

+
+
-Q, --quality
+

Quality assured checks - thorough run

+
+
+
+
+

```
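
The help text above looks like standard `argparse` output. As a rough illustration only, a parser with the same flags could be declared as in the sketch below. This is not the actual source of `single_run.py`: the function name `build_single_run_parser`, the destination name `forceful` for `-f`, the choice of `store_true` for the value-less flags, and the absence of defaults are all assumptions. The phase name `scan` in the usage note is likewise assumed (the API index lists a `run_scan` function).

```python
import argparse


def build_single_run_parser() -> argparse.ArgumentParser:
    """Hypothetical reconstruction of the single_run.py command line."""
    parser = argparse.ArgumentParser(prog="single_run.py")

    # Positional arguments, in the order shown in the usage line.
    parser.add_argument("phase", help="Phase of the pipeline to initiate")
    parser.add_argument("proj_code", help="Project identifier code")

    # Options that take a value (types and defaults are assumptions).
    parser.add_argument("-w", "--workdir", help="Working directory for pipeline")
    parser.add_argument("-g", "--groupdir", help="Group directory for pipeline")
    parser.add_argument("-G", "--groupID", help="Group identifier label")
    parser.add_argument("-p", "--proj_dir", help="Project directory for pipeline")
    parser.add_argument("-n", "--new_version", help="If present, create a new version")
    parser.add_argument("-m", "--mode", help="Print or record information (log or std)")
    parser.add_argument("-t", "--time-allowed", help="Time limit for this job")
    parser.add_argument("-s", "--subset", help="Size of subset within group")
    parser.add_argument("-r", "--repeat_id", help="Repeat id")

    # Boolean switches (assumed to be simple on/off flags).
    parser.add_argument("-b", "--bypass-errs", action="store_true",
                        help="Bypass all error messages - skip failed jobs")
    parser.add_argument("-f", dest="forceful", action="store_true",
                        help="Force overwrite of steps if previously done")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Print helpful statements while running")
    parser.add_argument("-d", "--dryrun", action="store_true",
                        help="Perform dry-run")
    parser.add_argument("-Q", "--quality", action="store_true",
                        help="Quality assured checks - thorough run")
    return parser
```

With this sketch, `build_single_run_parser().parse_args(["scan", "example_project", "-w", "/tmp/work", "-d"])` yields a namespace in which `phase`, `proj_code`, `workdir` and `dryrun` are set and every other option keeps its default.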

Running for a group of datasets

+

Run all multi-dataset group processes within the pipeline using the group_run.py script.

+

``` +usage: group_run.py [-h] [-s SOURCE] [-e VENVPATH] [-w WORKDIR] [-g GROUPDIR] [-p PROJ_DIR] [-n NEW_VERSION] [-m MODE] [-t TIME_ALLOWED] [-b] [-i INPUT] [-S SUBSET] [-r REPEAT_ID] [-f] [-v] [-d] [-Q] phase groupID

+
+
positional arguments:

phase Phase of the pipeline to initiate +groupID Group identifier code

+
+
options:
+
-h, --help
+

show this help message and exit

+
+
-s SOURCE
+

Path to directory containing master scripts (this one)

+
+
-e VENVPATH
+

Path to virtual (e)nvironment (excludes /bin/activate)

+
+
-w WORKDIR, --workdir WORKDIR
+

Working directory for pipeline

+
+
-g GROUPDIR, --groupdir GROUPDIR
+

Group directory for pipeline

+
+
-p PROJ_DIR, --proj_dir PROJ_DIR
+

Project directory for pipeline

+
+
-n NEW_VERSION, --new_version NEW_VERSION
+

If present, create a new version

+
+
-m MODE, --mode MODE
+

Print or record information (log or std)

+
+
-t TIME_ALLOWED, --time-allowed TIME_ALLOWED
+

Time limit for this job

+
+
-b, --bypass-errs
+

Bypass all error messages - skip failed jobs

+
+
-i INPUT, --input INPUT
+

input file (for init phase)

+
+
-S SUBSET, --subset SUBSET
+

Size of subset within group

+
+
-r REPEAT_ID, --repeat_id REPEAT_ID
+

Repeat id (1 if first time running, <phase>_<repeat> otherwise)

+
+
-f
+

Force overwrite of steps if previously done

+
+
-v, --verbose
+

Print helpful statements while running

+
+
-d, --dryrun
+

Perform dry-run (i.e no new files/dirs created)

+
+
-Q, --quality
+

Quality assured checks - thorough run

+
+
+
+
+

```
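
When `group_run.py` is launched from another script, the argument list can be assembled programmatically. The helper below is a minimal sketch under stated assumptions: the function name `build_group_run_command`, the interpreter name `python`, and the relative script path are hypothetical, and only flags that appear in the usage text above are emitted. It builds the command without running it.

```python
from typing import List, Optional


def build_group_run_command(
    phase: str,
    group_id: str,
    workdir: Optional[str] = None,
    input_file: Optional[str] = None,
    subset: Optional[int] = None,
    verbose: bool = False,
    dryrun: bool = False,
) -> List[str]:
    """Assemble an argv list for group_run.py (hypothetical helper)."""
    # Positional arguments come first: phase, then groupID.
    command = ["python", "group_run.py", phase, group_id]
    if workdir is not None:
        command += ["-w", workdir]      # working directory for pipeline
    if input_file is not None:
        command += ["-i", input_file]   # input file (for init phase)
    if subset is not None:
        command += ["-S", str(subset)]  # size of subset within group
    if verbose:
        command.append("-v")
    if dryrun:
        command.append("-d")            # dry-run: no new files/dirs created
    return command
```

The returned list can be passed to `subprocess.run` as-is. Starting with `dryrun=True` is a cautious way to check an invocation, since the help text describes that mode as creating no new files or directories.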

diff --git a/searchindex.js b/searchindex.js index b946d07..8202cf3 100644 --- a/searchindex.js +++ b/searchindex.js @@ -1 +1 @@ -Search.setIndex({docnames:["assess","assess-overview","compute","examples","execution","execution-source","index","init","pipeline-overview","scan","serial-process","validate"],envversion:{"sphinx.domains.c":2,"sphinx.domains.changeset":1,"sphinx.domains.citation":1,"sphinx.domains.cpp":4,"sphinx.domains.index":1,"sphinx.domains.javascript":2,"sphinx.domains.math":2,"sphinx.domains.python":3,"sphinx.domains.rst":2,"sphinx.domains.std":2,sphinx:56},filenames:["assess.rst","assess-overview.rst","compute.rst","examples.rst","execution.rst","execution-source.rst","index.rst","init.rst","pipeline-overview.rst","scan.rst","serial-process.rst","validate.rst"],objects:{"":[[0,0,0,"-","assess"],[5,0,0,"-","group_run"],[5,0,0,"-","single_run"]],"pipeline.compute":[[10,0,0,"-","serial_process"]],"pipeline.compute.serial_process":[[10,2,1,"","Converter"],[10,4,1,"","KerchunkDriverFatalError"],[10,1,1,"","init_logger"]],"pipeline.compute.serial_process.Converter":[[10,3,1,"","hdf5_to_zarr"],[10,3,1,"","ncf3_to_zarr"],[10,3,1,"","tiff_to_zarr"]],"pipeline.init":[[7,1,1,"","get_input"],[7,1,1,"","get_removals"],[7,1,1,"","get_updates"],[7,1,1,"","init_config"],[7,1,1,"","init_logger"],[7,1,1,"","load_from_input_file"],[7,1,1,"","make_dirs"],[7,1,1,"","make_filelist"],[7,1,1,"","text_file_to_csv"]],assess:[[0,1,1,"","check_errs"],[0,1,1,"","extract_keys"],[0,1,1,"","find_codes"],[0,1,1,"","get_attribute"],[0,1,1,"","get_code_from_val"],[0,1,1,"","init_logger"],[0,1,1,"","save_sel"],[0,1,1,"","show_options"]],group_run:[[5,1,1,"","get_attribute"],[5,1,1,"","get_group_len"],[5,1,1,"","init_logger"],[5,1,1,"","main"]],pipeline:[[7,0,0,"-","init"]],single_run:[[5,4,1,"","ExpectTimeoutError"],[5,4,1,"","MissingVariableError"],[5,4,1,"","ProjectCodeError"],[5,1,1,"","get_proj_code"],[5,1,1,"","init_logger"],[5,1,1,"","main"],[5,1,1,"","run_compute"],[5,1,1,"",
"run_init"],[5,1,1,"","run_scan"],[5,1,1,"","run_validation"]]},objnames:{"0":["py","module","Python module"],"1":["py","function","Python function"],"2":["py","class","Python class"],"3":["py","method","Python method"],"4":["py","exception","Python exception"]},objtypes:{"0":"py:module","1":"py:function","2":"py:class","3":"py:method","4":"py:exception"},terms:{"0":[0,5],"1":5,"2":0,"class":10,"final":7,"function":5,"int":0,"var":[0,5],The:6,all:[0,10],an:6,ar:6,archiv:6,arg:[0,5,7],argument:[0,5],assembl:[0,5],assess:6,assessor:6,associ:7,attribut:7,b:4,base:7,bypass_err:10,can:0,check:0,check_err:0,code:[0,4,5],collect:0,command:[0,7],complet:0,comput:[5,6,10],config:[0,7],configur:[0,5,7,10],consid:0,convers:10,convert:[7,10],correct:[5,6],correspond:0,creat:[6,7],csv:7,current:[0,5],d:4,data:0,dataset:[0,5,6],debug:0,defin:0,determin:0,differ:6,directori:[0,7],displai:0,driver:10,easi:6,ensur:6,env:[0,5],environ:[0,5],error:0,examin:0,exampl:6,except:[5,10],execut:[4,6],expecttimeouterror:5,extract:0,extract_kei:0,f:4,fail:[0,10],fals:10,fetch:0,file:[0,5,6,7,10],filepath:0,find:0,find_cod:0,format:[0,5,7,10],from:[0,5,6,7],g:4,get:[5,7],get_attribut:[0,5],get_code_from_v:0,get_group_len:5,get_input:7,get_proj_cod:5,get_remov:7,get_upd:7,given:[0,5],group:[0,5,6],group_run:5,groupdir:[0,4,5],groupid:[0,4],h:4,hdf5:10,hdf5_to_zarr:10,hdf:6,id:[0,4,5],ignor:0,implement:5,index:[0,6],individu:5,info:0,init:[0,5,6,7,10],init_config:7,init_logg:[0,5,7,10],initialis:5,input:7,job:5,kei:[0,7],kerchunkdriverfatalerror:10,kwarg:10,label:0,level:0,like:0,line:[0,7],list:[0,7],load:7,load_from_input_fil:7,log:0,logger:[0,5,7,10],m:4,main:[5,7],make:6,make_dir:7,make_filelist:7,master:0,md:8,messag:10,metadata:7,missingvariableerror:5,mode:[0,4,5,7,10],modul:6,multipl:6,n:4,name:[0,5,7,10],ncf3_to_zarr:10,netcdf3:10,netcdf:6,new_vers:4,nfile:10,none:[0,7],object:[0,5,7,10],oper:0,option:0,output:[0,6],overview:6,p:4,packag:6,page:6,pair:7,parallel:[5,6],paramet:[4,5],parti
cular:0,pass:[0,5],path:0,pattern:7,perform:10,phase:[0,4],pid:5,pipelin:[0,6,7,10],prefix:7,process:[2,5,6],progress:0,proj_cod:4,proj_dir:[4,7],project:[0,4,5,7],projectcodeerror:5,py:4,python:6,q:4,r:4,rang:0,re:0,read:5,readm:8,redo_pcod:0,remov:7,repeat:0,repeat_id:[4,5],requir:[0,4,5],result:0,run:[0,5,6],run_comput:5,run_init:5,run_scan:5,run_valid:5,s:4,save:0,save_sel:0,savetyp:0,sbatch:5,scan:[5,6],script:[0,5,7],search:6,select:0,serial:[2,6],serial_process:10,set:[6,7],setup:5,show_opt:0,singl:[5,6],single_run:[4,5],slurm:5,some:0,sourc:[6,7],specif:[0,7],stage:0,start:5,step:6,str:0,structur:7,subset:[4,5],summaris:0,t:4,take:[0,5],text:7,text_file_to_csv:7,tfile:10,thi:[0,4,7],tiff:[6,10],tiff_to_zarr:10,time_allow:4,tool:6,type:[0,5,10],unus:0,up:7,updat:7,usag:4,v:4,valid:[5,6],valu:[0,7],variabl:[0,5],variou:0,verbos:[0,5,7,10],w:4,warn:0,when:10,which:0,whole:0,work:[0,6,7],workdir:[0,4,5]},titles:["Assess Module","Assessor Tool","Initialisation Module","Worked Examples","Running the Pipeline","Pipeline Execution","Welcome to kerchunk-builder\u2019s documentation!","Initialisation Module","<no title>","Scanner Module","Serial Processor","Validation Module"],titleterms:{advanc:6,assess:0,assessor:1,builder:6,content:6,dataset:4,document:6,exampl:3,execut:5,group:4,indic:6,initialis:[2,7],kerchunk:6,modul:[0,2,7,9,11],pipelin:[4,5],processor:[2,10],run:4,s:6,scanner:9,serial:10,singl:4,tabl:6,tool:1,valid:11,welcom:6,work:3}}) \ No newline at end of file 
+Search.setIndex({docnames:["assess","assess-overview","compute","examples","execution","execution-source","index","init","pipeline-overview","scan","serial-process","validate"],envversion:{"sphinx.domains.c":2,"sphinx.domains.changeset":1,"sphinx.domains.citation":1,"sphinx.domains.cpp":4,"sphinx.domains.index":1,"sphinx.domains.javascript":2,"sphinx.domains.math":2,"sphinx.domains.python":3,"sphinx.domains.rst":2,"sphinx.domains.std":2,sphinx:56},filenames:["assess.rst","assess-overview.rst","compute.rst","examples.rst","execution.rst","execution-source.rst","index.rst","init.rst","pipeline-overview.rst","scan.rst","serial-process.rst","validate.rst"],objects:{"":[[0,0,0,"-","assess"],[5,0,0,"-","group_run"],[5,0,0,"-","single_run"]],"pipeline.compute":[[10,0,0,"-","serial_process"]],"pipeline.compute.serial_process":[[10,2,1,"","Converter"],[10,4,1,"","KerchunkDriverFatalError"],[10,1,1,"","init_logger"]],"pipeline.compute.serial_process.Converter":[[10,3,1,"","hdf5_to_zarr"],[10,3,1,"","ncf3_to_zarr"],[10,3,1,"","tiff_to_zarr"]],"pipeline.init":[[7,1,1,"","get_input"],[7,1,1,"","get_removals"],[7,1,1,"","get_updates"],[7,1,1,"","init_config"],[7,1,1,"","init_logger"],[7,1,1,"","load_from_input_file"],[7,1,1,"","make_dirs"],[7,1,1,"","make_filelist"],[7,1,1,"","text_file_to_csv"]],assess:[[0,1,1,"","check_errs"],[0,1,1,"","extract_keys"],[0,1,1,"","find_codes"],[0,1,1,"","get_attribute"],[0,1,1,"","get_code_from_val"],[0,1,1,"","init_logger"],[0,1,1,"","save_sel"],[0,1,1,"","show_options"]],group_run:[[5,1,1,"","get_attribute"],[5,1,1,"","get_group_len"],[5,1,1,"","init_logger"],[5,1,1,"","main"]],pipeline:[[7,0,0,"-","init"]],single_run:[[5,4,1,"","ExpectTimeoutError"],[5,4,1,"","MissingVariableError"],[5,4,1,"","ProjectCodeError"],[5,1,1,"","get_proj_code"],[5,1,1,"","init_logger"],[5,1,1,"","main"],[5,1,1,"","run_compute"],[5,1,1,"","run_init"],[5,1,1,"","run_scan"],[5,1,1,"","run_validation"]]},objnames:{"0":["py","module","Python 
module"],"1":["py","function","Python function"],"2":["py","class","Python class"],"3":["py","method","Python method"],"4":["py","exception","Python exception"]},objtypes:{"0":"py:module","1":"py:function","2":"py:class","3":"py:method","4":"py:exception"},terms:{"0":[0,5],"1":[4,5],"2":0,"class":10,"final":7,"function":5,"int":0,"new":4,"var":[0,5],"while":4,If:4,The:6,_:4,activ:4,all:[0,4,10],allow:4,an:6,ar:6,archiv:6,arg:[0,5,7],argument:[0,4,5],assembl:[0,5],assess:6,assessor:6,associ:7,assur:4,attribut:7,b:4,base:7,bin:4,bypass:4,bypass_err:10,can:0,check:[0,4],check_err:0,code:[0,4,5],collect:0,command:[0,7],complet:0,comput:[5,6,10],config:[0,7],configur:[0,5,7,10],consid:0,contain:4,convers:10,convert:[7,10],correct:[5,6],correspond:0,creat:[4,6,7],csv:7,current:[0,5],d:4,data:0,dataset:[0,5,6],debug:0,defin:0,determin:0,differ:6,dir:4,directori:[0,4,7],displai:0,done:4,driver:10,dry:4,dryrun:4,e:4,easi:6,ensur:6,env:[0,5],environ:[0,5],err:4,error:[0,4],examin:0,exampl:6,except:[5,10],exclud:4,execut:6,exit:4,expecttimeouterror:5,extract:0,extract_kei:0,f:4,fail:[0,4,10],fals:10,fetch:0,file:[0,4,5,6,7,10],filepath:0,find:0,find_cod:0,first:4,forc:4,format:[0,5,7,10],from:[0,5,6,7],g:4,get:[5,7],get_attribut:[0,5],get_code_from_v:0,get_group_len:5,get_input:7,get_proj_cod:5,get_remov:7,get_upd:7,given:[0,5],group:[0,5,6],group_run:[4,5],groupdir:[0,4,5],groupid:[0,4],h:4,hdf5:10,hdf5_to_zarr:10,hdf:6,help:4,i:4,id:[0,4,5],identifi:4,ignor:0,implement:5,index:[0,6],individu:5,info:0,inform:4,init:[0,4,5,6,7,10],init_config:7,init_logg:[0,5,7,10],initi:4,initialis:5,input:[4,7],job:[4,5],kei:[0,7],kerchunkdriverfatalerror:10,kwarg:10,label:[0,4],level:0,like:0,limit:4,line:[0,7],list:[0,7],load:7,load_from_input_fil:7,log:[0,4],logger:[0,5,7,10],m:4,main:[5,7],make:6,make_dir:7,make_filelist:7,master:[0,4],md:8,messag:[4,10],metadata:7,missingvariableerror:5,mode:[0,4,5,7,10],modul:6,multi:4,multipl:6,n:4,name:[0,5,7,10],ncf3_to_zarr:10,netcdf3:10,netcdf:6,n
ew_vers:4,nfile:10,none:[0,7],nviron:4,object:[0,5,7,10],one:4,oper:0,option:[0,4],otherwis:4,output:[0,6],overview:6,overwrit:4,p:4,packag:6,page:6,pair:7,parallel:[5,6],paramet:5,particular:0,pass:[0,5],path:[0,4],pattern:7,perform:[4,10],phase:[0,4],pid:5,pipelin:[0,6,7,10],posit:4,prefix:7,present:4,previous:4,print:4,process:[2,4,5,6],progress:0,proj_cod:4,proj_dir:[4,7],project:[0,4,5,7],projectcodeerror:5,py:4,python:6,q:4,qualiti:4,r:4,rang:0,re:0,read:5,readm:8,record:4,redo_pcod:0,remov:7,repeat:[0,4],repeat_id:[4,5],requir:[0,5],result:0,run:[0,5,6],run_comput:5,run_init:5,run_scan:5,run_valid:5,s:4,save:0,save_sel:0,savetyp:0,sbatch:5,scan:[5,6],script:[0,4,5,7],search:6,select:0,serial:[2,6],serial_process:10,set:[6,7],setup:5,show:4,show_opt:0,singl:[5,6],single_run:[4,5],size:4,skip:4,slurm:5,some:0,sourc:[4,6,7],specif:[0,7],stage:0,start:5,statement:4,std:4,step:[4,6],str:0,structur:7,subset:[4,5],summaris:0,t:4,take:[0,5],text:7,text_file_to_csv:7,tfile:10,thi:[0,4,7],thorough:4,tiff:[6,10],tiff_to_zarr:10,time:4,time_allow:4,tool:6,type:[0,5,10],unus:0,up:7,updat:7,us:4,usag:4,v:4,valid:[5,6],valu:[0,7],variabl:[0,5],variou:0,venvpath:4,verbos:[0,4,5,7,10],version:4,virtual:4,w:4,warn:0,when:10,which:0,whole:0,within:4,work:[0,4,6,7],workdir:[0,4,5]},titles:["Assess Module","Assessor Tool","Initialisation Module","Worked Examples","Running the Pipeline","Pipeline Execution","Welcome to kerchunk-builder\u2019s documentation!","Initialisation Module","<no title>","Scanner Module","Serial Processor","Validation Module"],titleterms:{advanc:6,assess:0,assessor:1,builder:6,content:6,dataset:4,document:6,exampl:3,execut:5,group:4,indic:6,initialis:[2,7],kerchunk:6,modul:[0,2,7,9,11],pipelin:[4,5],processor:[2,10],run:4,s:6,scanner:9,serial:10,singl:4,tabl:6,tool:1,valid:11,welcom:6,work:3}}) \ No newline at end of file