Difference between revisions of "Hdf2nc"
Revision as of 18:00, 6 October 2019
Converts any subset of LOFS raw data to a single netCDF file.
Usage: hdf2nc --histpath=[histpath] --base=[ncbase] --x0=[X0] --y0=[Y0] --x1=[X1] --y1=[Y1] --z0=[Z0] --z1=[Z1] --time=[time] [varname1 ... varnameN]
histpath
: Top level directory that contains the 3D model data
ncbase
: base name of netCDF file
(X0,Y0,Z0)(X1,Y1,Z1)
: This defines the volume (or space) that you wish to convert to netCDF data, referenced by integer array indices that span the full model domain, namely (0,0,0) to (nx-1,ny-1,nz-1). Each of these is optional. If none of them is passed to hdf2nc, the full 3D range of saved data will be converted to netCDF. If only some of them are provided, the remainder default to the min/max values from the saved data (the code extracts these values from what has been saved, making no assumptions).
time
: The model time requested in seconds
varname1...varnameN
: The list of variables, separated by whitespace, that you wish to convert
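The usage line above can be assembled programmatically. The following is a minimal Python sketch, not part of hdf2nc itself: the helper name build_hdf2nc_command is hypothetical, and the options are emitted in the order of the usage line on the assumption that hdf2nc accepts them that way. Index options that are not supplied are simply left out, so hdf2nc falls back to its defaults.

```python
import shlex

def build_hdf2nc_command(histpath, base, time, variables,
                         x0=None, y0=None, x1=None, y1=None, z0=None, z1=None):
    """Return an hdf2nc argument list; index options left as None are omitted."""
    cmd = ["hdf2nc", f"--histpath={histpath}", f"--base={base}"]
    # Emit only the index options that were supplied, in usage-line order
    indices = {"x0": x0, "y0": y0, "x1": x1, "y1": y1, "z0": z0, "z1": z1}
    for name, value in indices.items():
        if value is not None:
            cmd.append(f"--{name}={value}")
    cmd.append(f"--time={time}")
    # Variable names follow the options, separated by whitespace
    cmd.extend(variables)
    return cmd

cmd = build_hdf2nc_command("3D", "frob", 7101, ["uinterp", "winterp", "dbz"],
                           x0=2300, y0=2300, x1=2400, y1=2400, z1=10)
print(shlex.join(cmd))
```

On a machine where hdf2nc is installed, the resulting list could be passed to subprocess.run(cmd, check=True).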
hdf2nc Example
h2ologin2:~/project-bagm/brainstorm2017/15m/history.fs8-15m-a% hdf2nc --histpath=3D --base=frob --time=7101 --x0=2300 --y0=2300 --x1=2400 --y1=2400 --z1=10 uinterp winterp dbz
histpath = 3D
ncbase = frob
time = 7101.000000
X0 = 2300
Y0 = 2300
X1 = 2400
Y1 = 2400
Z1 = 10
Setting Z0 to default value of 0
Read cached num_time_dirs of 146
ntimedirs: 146
Read cached sorted time dirs
Read cached num node dirs
Read cached nodedir
Read cached firstfilename and all times
We are requesting the following fields: uinterp winterp dbz
Working on surface 2D thrhopert and dbz ( acca czza czza aaaa acca czza czza aaaa )
Working on uinterp ( acca czza czza aaaa )
Working on winterp ( acca czza czza aaaa )
Working on dbz ( acca czza czza aaaa )
h2ologin2:~/project-bagm/brainstorm2017/15m/history.fs8-15m-a% lt
total 168656
drwxr-xr-x 10 orf PRAC_bagm 4096 Jul 31 11:12 ./
-rw-r--r-- 1 orf PRAC_bagm 1476356 Jul 31 11:12 frob.07101.000000.nc
-rw-r--r-- 1 orf PRAC_bagm 116 Jul 31 11:12 frob.07101.000000.nc.cmd
Discussion
If no array index options are provided on the command line, hdf2nc will attempt to convert the entire model domain to a netCDF file. In other words, X0,Y0,Z0,X1,Y1,Z1 are optional arguments to hdf2nc. However, for typical use cases with large amounts of data, you will want to convert only a subset of the full model domain!
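The defaulting behavior described here can be illustrated with a short Python sketch. It is not the hdf2nc implementation: the function name resolve_indices and the saved extents used below are hypothetical, chosen only to show how missing indices fall back to the min/max of the saved data.

```python
def resolve_indices(requested, saved_min, saved_max):
    """Fill in any index the user did not supply from the saved data extents."""
    defaults = {
        "x0": saved_min[0], "y0": saved_min[1], "z0": saved_min[2],
        "x1": saved_max[0], "y1": saved_max[1], "z1": saved_max[2],
    }
    unknown = set(requested) - set(defaults)
    if unknown:
        raise ValueError(f"unknown index option(s): {sorted(unknown)}")
    # Start from the saved extents, then override with what was requested
    resolved = dict(defaults)
    resolved.update(requested)
    return resolved

# Hypothetical saved extents; hdf2nc reads the real ones from the LOFS files
saved_min = (0, 0, 0)
saved_max = (4999, 4999, 299)

requested = {"x0": 2300, "y0": 2300, "x1": 2400, "y1": 2400, "z1": 10}
box = resolve_indices(requested, saved_min, saved_max)
print(box)
```

With the request from the example above, only z0 is missing, so it takes the saved minimum of 0, matching the "Setting Z0 to default value of 0" message in the hdf2nc output.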
The output of hdf2nc includes information on some basic metadata, plus some output that tracks the reading of the individual hdf5 files that comprise LOFS. Each letter that comprises the output that looks like
acca czza czza aaaa
represents the successful reading of data from a single hdf5 file. It is a kind of 'base 26' representation of the percentage of data (in the horizontal) requested from each file. If a z is printed, the full horizontal range of data was requested. If an a is printed, only a tiny piece of the horizontal range was selected. All intermediate letters represent the space between these two extremes. This output is for your entertainment only; you are essentially watching the assembly of the netCDF file from LOFS data in real time.
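To make the letter code concrete, here is a small Python sketch of one way such a 'base 26' encoding could work. The exact mapping hdf2nc uses is not documented here, so the linear mapping below (and the function name fraction_to_letter) is an assumption; it is chosen so that only a full-range request yields z and a very small request yields a.

```python
import string

def fraction_to_letter(fraction):
    """Map the fraction of a file's horizontal extent requested to 'a'..'z'.

    Assumed mapping, not taken from hdf2nc: 25 equal steps, with 'z'
    reserved for a request covering the full horizontal range (1.0).
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in the interval (0, 1]")
    index = int(fraction * 25)  # 0..25; only fraction == 1.0 gives 25
    return string.ascii_lowercase[index]

# Full range, half the range, and a tiny corner of a file
print(fraction_to_letter(1.0), fraction_to_letter(0.5), fraction_to_letter(0.01))
```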
hdf2nc always writes two 2D surface fields: the density potential temperature perturbation from the base state (proportional to buoyancy) and the (calculated) surface radar reflectivity. These fields are used so often that they are always written to the netCDF file, whether they are requested or not.
Regarding the mention of cached data: the LOFS read routines look for existing cache files before going out and gathering all the metadata from the hdf5 files, which is very expensive for large amounts of data. Since the data layout never changes (unless you change it), the cache files speed things up quite a bit. If you ever change your LOFS data (say, by adding new time directories), you must remove the cache files and let LOFS regenerate them so that they contain the new information. Cache files are all prefixed by .cm1hdf5_ and can always be removed, as they will always be regenerated.
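Clearing the cache can be scripted. The Python sketch below is a minimal illustration, assuming the cache files sit directly in a directory you pass in (this page does not say where LOFS writes them); the function name remove_lofs_cache and the file names in the demonstration are hypothetical. Only the .cm1hdf5_ prefix comes from the text above.

```python
import tempfile
from pathlib import Path

CACHE_PREFIX = ".cm1hdf5_"

def remove_lofs_cache(directory, dry_run=True):
    """Delete (or, with dry_run, just list) files whose names start with the cache prefix."""
    matched = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.name.startswith(CACHE_PREFIX):
            if not dry_run:
                path.unlink()
            matched.append(path.name)
    return matched

# Demonstration in a throwaway directory with made-up file names
with tempfile.TemporaryDirectory() as tmp:
    for name in (".cm1hdf5_example_a", ".cm1hdf5_example_b", "frob.07101.000000.nc"):
        (Path(tmp) / name).touch()
    listed = remove_lofs_cache(tmp, dry_run=True)    # nothing deleted yet
    removed = remove_lofs_cache(tmp, dry_run=False)  # cache files deleted
    remaining = sorted(p.name for p in Path(tmp).iterdir())

print(removed, remaining)
```

The dry-run default is a deliberate choice: it lets you inspect what would be deleted before anything is removed.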
In this example, the output file name is frob.07101.000000.nc, indicating data that was retrieved at t=07101.000000 seconds. Note that LOFS allows for the saving and retrieval of data saved at intervals of less than one second, as time is represented as a floating point variable.
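The naming pattern can be reproduced with a short Python sketch. The format below is inferred from the single example on this page (a zero-padded time with five integer digits and six decimals) and is not taken from the hdf2nc source; the function name hdf2nc_output_name is hypothetical.

```python
def hdf2nc_output_name(base, time_seconds):
    """Build a file name like frob.07101.000000.nc from a base name and model time."""
    # Width 12 = 5 integer digits + decimal point + 6 fractional digits, zero-padded
    return f"{base}.{time_seconds:012.6f}.nc"

print(hdf2nc_output_name("frob", 7101))
print(hdf2nc_output_name("frob", 7101.5))
```

Because the time is formatted as a floating point value, sub-second save times such as 7101.5 map to distinct file names.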
Rationale
LOFS splits the model domain and times into files that, in large simulations, are spread across hundreds of directories. Often you will want to analyze, plot, or visualize a subset of the full model domain at a given time, or feed it into visualization software that understands netCDF, one of the most commonly used data formats in atmospheric science.