Diviner Data Processing
Paul Hayne, JPL -- June 11, 2015
Adapted for this Wiki by K.-Michael Aye, LASP -- August 21, 2018
For access to the Diviner team data system, you have two options:
- Link: https://luna1.diviner.ucla.edu/divproc/rel/pipetext?op=begin&cat=rdr
- This option is great for simple one-time queries, but gets cumbersome quickly if you are doing serious data processing
Both the UCLA and JPL servers should work in exactly the same way:
- UCLA: luna{i}.diviner.ucla.edu (where i={1, 2} )
- JPL: cabeus.jpl.nasa.gov
This second option, logging in to a server and working on the command line, is tricky to learn, but much more powerful than the web query.
To log in to one of the file servers, simply ssh using the "-Y" flag to pass display information for your screen:
phayne@localhost:~$ ssh -Y cabeus
or more explicitly,
phayne@localhost:~$ ssh -Y [email protected]
If you are off-lab, you must be connected to the VPN. If you cannot get onto the JPL VPN, just connect to UCLA instead:
phayne@localhost:~$ ssh -Y [email protected]
The program divdata is at the core of the data processing pipeline.
A basic help message is displayed by simply typing "divdata" with no options:
phayne@cabeus:data$ divdata
Using new divbase
Quick info:
divdata (No arguments, prints this usage statement)
divdata [type=datatype] fields (to just print out selectable fields)
----
Piping data into other pipes commands:
divdata [type=datatype] [noindex] daterange=BEGIN,END [clat=MIN,MAX] [c=MIN,MAX]
[FIELD=MIN,MAX FIELD=MIN,MAX ...] | PIPES_COMMANDS ...
BEGIN and END for daterange can be the following format:
YYYYMM - A month, gets you all the days in that month.
YYYYMMDD - A day, gets you all the hours in that day.
YYYYMMDDHH - An hour, gets you all the minutes in that hour.
If BEGIN and END are equal, e.g. 200907, you can just use: daterange=200907
Multiple daterange=BEGIN,END arguments specify disjoint times.
Other fields (except for 'c', only one instance of each may be allowed):
clat=MIN,MAX - Center latitude of observation, greatly improves performance
c=MIN,MAX - Channel number, greatly improves performance
You can specify multiple arguments for this, e.g.:
c=1,1 c=5,6 c=8,9
but don't mix inclusive (MIN<=MAX) with exclusive (MIN>MAX)
FIELD=MIN,MAX - Any other FIELD in the dataset,
moderately improves performance
type=DATATYPE - Output data format. Default is 'div38'.
noindex - Do not use indexing to match data constraints.
Significantly SLOWS DOWN your data access.
Use only for debugging, a sanity check to make
sure you are getting all your data. Using this
option *should* not alter your results except in
terms of speed. Let us know otherwise.
nodel - Do not delete the catfile this program creates.
debug=N - Debug level where N is one of:
0 - Normal, only high level messages.
1 - Detailed
2 - Extra detailed.
All debugging messages are printed to standard error.
A note on constraints:
When specifying a MIN,MAX, if MIN<=MAX, you get all the data
between MIN and MAX, inclusively. If MIN>MAX, you get all
the data OUTSIDE of the inclusive MIN,MAX range. Examples:
clat=-70,50 - All latitudes between -70 and 50, inclusively.
clat=50,-70 - (-90,-70.0000000001) + (50.0000000001,90)
c=3,3 - Channel 3 only.
c=3,5 - Channels 3,4,5
c=5,3 - Channels 1,2,6,7,8,9
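The inclusive/exclusive rule in the note on constraints can be sketched in a few lines of Python. This is only an illustration of the rule as the help text states it, not how divdata implements it; the function name `matches` is made up for this example.

```python
def matches(value, lo, hi):
    """Return True if value satisfies a FIELD=lo,hi constraint as described in the help text."""
    if lo <= hi:
        # MIN <= MAX: keep values inside the inclusive range [lo, hi].
        return lo <= value <= hi
    # MIN > MAX: keep values OUTSIDE the inclusive range [hi, lo].
    return value < hi or value > lo


channels = range(1, 10)
print([c for c in channels if matches(c, 3, 5)])  # c=3,5 -> [3, 4, 5]
print([c for c in channels if matches(c, 5, 3)])  # c=5,3 -> [1, 2, 6, 7, 8, 9]
```

The same function reproduces the latitude example: `matches(0, 50, -70)` is False, while `matches(60, 50, -70)` is True.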
Here is the basic pipeline concept behind the Diviner data processing system:
The "pipes" routines were developed in the 1990s, but they are still the most efficient and flexible way to manipulate the Diviner data! You can picture a data "stream" flowing through pipes. Each of these pipes manipulates the data in some way (e.g. coordinate projection, unit conversion, averaging, histograms, etc.). Pipes can be linked end-to-end to apply multiple processing steps in one command line. More information on pipes can be found here:
http://luna1.diviner.ucla.edu/~marks/pipes.html
Below are several examples of using divdata to access the database, apply constraints, and output the results in a useful format.
The following query applies constraints on the date, longitude, latitude, and channel #, in order to pull out local time and Tb for all channel-7 data within a 1°x1° box:
phayne@cabeus:data$ divdata daterange=20090601,20100630 clat=0,1 clon=0,1 c=7,7 af=0,199 cemis=0,10 \
| pextract extract=cloctime,tb \
| pprint > mydata_c7.txt
Let's look at each step in this command:
- First, divdata pulls out all the data within the specified range of June 1, 2009 to June 30, 2010. Then, divdata restricts the data to the 1°x1° box in latitude and longitude, pulls out only channel 7 data acquired in normal mapping mode ("af=0,199") with emission angle <= 10° (eliminates spacecraft rolls), and passes the stream to pextract.
- Next, pextract stops the flow of all data fields except local time and brightness temperature, which it pipes to pprint.
- Finally, pprint converts the binary data stream into ASCII text, and then we write the results to the file mydata_c7.txt.
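If it helps to think about the three steps in code, here is a rough Python analogy using generators, where each stage consumes a stream of records and passes a modified stream on. It is only a conceptual sketch: the function names are made up, the real pipes tools work on binary streams, and the `c` and `cemis` values in the sample records are invented for illustration (the `cloctime` and `tb` values are copied from the sample output on this page).

```python
def source(records):
    """Stand-in for divdata: emit records one at a time."""
    for rec in records:
        yield rec


def constrain(stream, field, lo, hi):
    """Pass only records whose field lies in the inclusive range [lo, hi]."""
    for rec in stream:
        if lo <= rec[field] <= hi:
            yield rec


def extract(stream, fields):
    """Analogous to pextract: drop every field except the named ones."""
    for rec in stream:
        yield tuple(rec[f] for f in fields)


def to_text(stream):
    """Analogous to pprint: turn each row into a line of ASCII text."""
    for row in stream:
        yield " ".join(f"{v:.9f}" for v in row)


records = [
    {"c": 7, "cemis": 2.0, "cloctime": 17.246389389, "tb": 234.947006226},
    {"c": 7, "cemis": 35.0, "cloctime": 17.245559692, "tb": 232.264999390},
    {"c": 3, "cemis": 1.0, "cloctime": 17.244720459, "tb": 234.897003174},
]

stream = source(records)
stream = constrain(stream, "c", 7, 7)        # channel 7 only
stream = constrain(stream, "cemis", 0, 10)   # emission angle <= 10 deg
stream = extract(stream, ["cloctime", "tb"])
lines = list(to_text(stream))
print("\n".join(lines))
```

Only the first record survives both constraints, so a single line of text comes out the end. Nothing is computed until the last stage pulls records through, which is the same reason the real pipes can handle very large queries.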
If everything goes smoothly, you will see this message on the screen:
Latitude disks used: 10
Channels selected : 7
Files matching your date range: 8481
Files skipped due to indexing : 7651
------------------
Files used: 830
Executing: .catfile.2015_06_11-10:00:26.10535.53729390 | extract_c38
| pcons des=/u/marks/luner/c38/src/div38.des clat=0,1 clon=0,1 af=0,199 cemis=0,10
pcons: 66808 records written, 0.020309632 GB
The output data look like this:
phayne@cabeus:data$ head mydata_c7.txt
17.246389389 234.947006226
17.245559692 232.264999390
17.244720459 234.897003174
17.243890762 237.087005615
17.243330002 232.552993774
17.242500305 236.507003784
17.241670609 235.634994507
17.240829468 234.841995239
17.239999771 233.440994263
17.239170074 230.014007568
In this case, we just have the two columns we selected with pextract: cloctime and tb. Below is an example of how you might plot these data in Matlab. Here, I'm using my own binning function "bin1d" to see the mean value and standard deviation at each local time. For comparison, I also plotted the radiative equilibrium temperature, which matches nicely during mid-day.
Matlab session:
>> data = dlmread('mydata_c7.txt');
>> plot(data(:,1),data(:,2),'.k','markersize',1)
>> set(gca,'fontsize',14,'color','none','box','on','fontweight','normal','tickdir','in','xminortick','on','yminortick','on')
>> xlabel('Local Time')
>> ylabel('Brightness Temperature (K)')
>> text(2,375,'C7','fontsize',14)
>> help bin1d
function [xbins, ymean, ystd, ymin, ymax, n] = bin1d(x, y, xbins)
------------------------------------------------------------------
Bin (x,y) data into bins centered at points in the array xbins. Returns
the mean, standard deviation, minimum, maximum, and number of points in
each bin.
>> ltbins = [1.0,2.8,4.5,6.5,8.2,10,12,13.8,15.5,17.2,19.5,21.2,23]; % I just eyeballed these bin centers
>> [loctime,tbmean,tbstd] = bin1d(data(:,1),data(:,2),ltbins);
>> figure(1), hold on
>> errorbar(loctime,tbmean,tbstd,tbstd,'sr','markerfacecolor','r')
>> lday = 6.0:0.1:18.0;
>> albedo = 0.1;
>> sigma = 5.67e-8; % Stefan-Boltzmann constant (W m^-2 K^-4)
>> emissivity = 0.95;
>> hourangle = (2*pi/24)*(lday-12);
>> F0 = 1361; % solar irradiance (W m^-2)
>> F = F0.*cos(hourangle);
>> T_radeq = ( (1-albedo)/(emissivity*sigma) * F ).^(1/4);
>> plot(lday,T_radeq,'b','linewidth',1)
[Figure: channel-7 brightness temperature versus local time, with binned means, error bars, and the radiative equilibrium curve]
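For those not using Matlab, here is a minimal Python sketch of the same two calculations. The source of bin1d is not shown on this page, so the binning rule below (assign each point to the nearest bin center, population standard deviation) is an assumption, and both function signatures are made up for this example. The radiative equilibrium formula is the one from the Matlab session, which assumes an equatorial site.

```python
import math
from statistics import mean, pstdev


def bin1d(x, y, centers):
    """Group y by the nearest bin center in x.

    Returns one (center, mean, std, n) tuple per non-empty bin.
    Assumption: nearest-center assignment and population std.
    """
    groups = {c: [] for c in centers}
    for xi, yi in zip(x, y):
        nearest = min(centers, key=lambda c: abs(xi - c))
        groups[nearest].append(yi)
    return [(c, mean(v), pstdev(v), len(v)) for c, v in groups.items() if v]


def t_radeq(local_time, albedo=0.1, emissivity=0.95, f0=1361.0):
    """Radiative equilibrium temperature (K) at the equator for a local time in hours."""
    sigma = 5.67e-8  # Stefan-Boltzmann constant (W m^-2 K^-4)
    flux = f0 * math.cos(2 * math.pi / 24 * (local_time - 12))
    if flux <= 0:
        return 0.0  # night side: no absorbed sunlight in this simple model
    return ((1 - albedo) * flux / (emissivity * sigma)) ** 0.25


# First three rows of the sample output above (cloctime, tb).
loctime = [17.246389389, 17.245559692, 17.244720459]
tb = [234.947006226, 232.264999390, 234.897003174]

for center, tbmean, tbstd, n in bin1d(loctime, tb, [15.5, 17.2, 19.5]):
    print(center, round(tbmean, 3), round(tbstd, 3), n)

print(round(t_radeq(12.0), 1))  # noon value of the equilibrium curve
```

With the parameters from the Matlab session, the noon equilibrium temperature comes out at roughly 388 K.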
Often we end up running very similar data queries multiple times, with slight modifications towards the end of the query. For example, we might want to do the above query for each Diviner channel. Instead of running the whole query nine times, we can do the following:
phayne@cabeus:data$ divdata daterange=200906,201006 clat=0,1 clon=0,1 af=0,199 cemis=0,10 \
| pextract extract=c,cloctime,tb nodes > mydata.bin
Now the data for all nine channels are in the binary file 'mydata.bin'. We can read this file back in using the pipes command pcons, along with a "descriptor file":
phayne@cabeus:data$ pcons des=mydescriptor.des c=1,1 < mydata.bin | pprint > mydata_c1.txt
This reads in the binary file we created above, and restricts the data stream to only channel 1. The descriptor file must match the format of the binary file, e.g. like this:
phayne@cabeus:data$ cat mydescriptor.des
'Example descriptor file'
'c' 'channel number (1-9)'
'cloctime' 'local solar time'
'tb' 'brightness temperature'
Now that we have the binary file, it's straightforward to extract all the other channels at a later time (using a for-loop in this case):
phayne@cabeus:data$ for chnum in `seq 2 9`; do pcons des=mydescriptor.des c=${chnum},${chnum} < mydata.bin | pprint > mydata_c${chnum}.txt; echo "processed chan. ${chnum}"; done
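If you prefer to drive this loop from a script, the per-channel command can be generated in Python. The sketch below only builds and prints the command strings, using exactly the pcons and pprint arguments from the loop above; the helper name `channel_command` is made up for this example.

```python
def channel_command(chnum, des="mydescriptor.des", infile="mydata.bin"):
    """Build the shell command that extracts one channel from the binary file."""
    outfile = f"mydata_c{chnum}.txt"
    return f"pcons des={des} c={chnum},{chnum} < {infile} | pprint > {outfile}"


# Channels 2 through 9, matching `seq 2 9` in the shell loop.
for chnum in range(2, 10):
    print(channel_command(chnum))
```

To actually run the commands on one of the servers, each string could be passed to `subprocess.run(cmd, shell=True, check=True)`.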
Switching to Matlab, here is what the data from all seven channels look like:
[Figure: brightness temperature versus local time for each channel]
There's a pipes tool called pbin3d, which we use to "bin" the data along up to three dimensions. Most often, we treat the first two dimensions as spatial coordinates (e.g. lon/lat) and the third dimension as either time, or a dummy variable. In the case below, I'm making it a dummy variable using tb.
phayne@cabeus:data$ divdata daterange=20090601,20100630 clat=20,25 \
clon=-50,-45 c=7,7 cloctime=20,22 af=0,199 cemis=0,10 \
| pextract extract=clon,clat,tb \
| pbin3d x=clon y=clat z=tb t=tb xrange=-50,-45 yrange=20,25 \
zrange=0,400 deltax=0.02 deltay=0.02 deltaz=400 \
> mybinneddata.txt
Plotting it up in Matlab:
>> data = dlmread('mybinneddata.txt');
>> lon = reshape_pbin3d(data,1);
>> lat = reshape_pbin3d(data,2);
>> tb = reshape_pbin3d(data,4);
>> tb(tb<=0) = nan;
>> pcolor(lon,lat,tb), shading flat, axis equal tight
>> cb = colorbar;
>> set(gca,'fontsize',14,'color','none','box','on','fontweight','normal','tickdir','in','xminortick','on','yminortick','on')
>> xlabel('East Longitude')
>> ylabel('Latitude (deg.)')
>> ylabel('Latitude')
[Figure: map of binned channel-7 brightness temperature for the 20-22 h local time window]
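To make the binning idea concrete, here is a small Python sketch that averages values onto a regular lon/lat grid with the same 0.02° cell size as the query above. It illustrates the concept only: it does not reproduce pbin3d's algorithm or output format, the function name `bin2d_mean` is made up, and the sample points are invented.

```python
import math
from collections import defaultdict


def bin2d_mean(lons, lats, values, lon0, lat0, delta):
    """Average values in square cells of size delta.

    Cells are keyed by integer (ix, iy) indices counted from (lon0, lat0).
    Cells that receive no data are simply absent from the result.
    """
    cells = defaultdict(list)
    for lon, lat, v in zip(lons, lats, values):
        ix = math.floor((lon - lon0) / delta)
        iy = math.floor((lat - lat0) / delta)
        cells[(ix, iy)].append(v)
    return {key: sum(v) / len(v) for key, v in cells.items()}


# Invented sample points inside the clon=-50,-45 clat=20,25 box.
lons = [-49.99, -49.985, -47.49]
lats = [20.01, 20.015, 22.51]
tbs = [100.0, 110.0, 95.0]

grid = bin2d_mean(lons, lats, tbs, lon0=-50.0, lat0=20.0, delta=0.02)
for cell, tbmean in sorted(grid.items()):
    print(cell, tbmean)
```

The first two points fall in the same cell and are averaged; the third lands in its own cell. Leaving empty cells out of the dictionary plays the same role as setting them to NaN in the Matlab session.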
This map is fairly sparse (we're only looking at a 2-hr local time window), so I ran the same query, this time using the full mission (200906 to 201506):
[Figure: the same channel-7 map using the full-mission date range]
Here I did the same query on channel 8, and then made a difference map (rocky areas have large ∆Tb):
[Figure: brightness temperature difference map between channels 7 and 8]
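A difference map is just a cell-by-cell subtraction of two binned maps that share a grid. Here is a minimal Python sketch, assuming each map is stored as a dictionary from grid-cell index to mean Tb. That data layout, the function name, the sample values, and the sign convention (channel 7 minus channel 8) are all assumptions made for illustration.

```python
def difference_map(grid_a, grid_b):
    """Return grid_a - grid_b for every cell that has data in both grids."""
    shared = grid_a.keys() & grid_b.keys()
    return {cell: grid_a[cell] - grid_b[cell] for cell in shared}


# Invented binned maps keyed by (ix, iy) cell index.
tb7 = {(0, 0): 105.0, (0, 1): 98.0}
tb8 = {(0, 0): 101.5, (1, 1): 97.0}

delta_tb = difference_map(tb7, tb8)
print(delta_tb)
```

Only cell (0, 0) has data in both channels, so it is the only cell in the result; cells covered by just one channel are dropped rather than filled.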