Difference between revisions of "Hdf2nc"
Latest revision as of 13:02, 26 March 2024
hdf2nc (also called lofs2nc) converts any subset of LOFS raw data to a single netCDF file.
Typical usage: hdf2nc --time=[time] --histpath=[histpath] --base=[ncbase] --x0=[X0] --y0=[Y0] --x1=[X1] --y1=[Y1] --z0=[Z0] --z1=[Z1] [varname1 ... varnameN]
--time=[time] (required)
: The model time requested in seconds
--histpath=[histpath] (required)
: Top level directory that contains the 3D model data
--ncbase=[ncbase] (optional)
: base name of netCDF file, otherwise generated from the LOFS basename
--x0=X0 (optional)
westmost index in X. Defaults to westmost index in X found in the saved data.
--x1=X1 (optional)
eastmost index in X. Defaults to eastmost index in X found in the saved data.
--y0=Y0 (optional)
southmost index in Y. Defaults to southmost index in Y found in the saved data.
--y1=Y1 (optional)
northmost index in Y. Defaults to northmost index in Y found in the saved data.
--z0=Z0 (optional)
bottommost index in Z. Defaults to 0.
--z1=Z1 (optional)
topmost index in Z. Defaults to topmost index in Z found in the saved data.
(X0,Y0,Z0)(X1,Y1,Z1)
: This defines the volume (or space) that you wish to convert to netCDF data, with respect to integer array indices that span the full model domain, namely (0,0,0) to (nx-1,ny-1,nz-1). Each of these is optional. If none of them are passed to hdf2nc, the full 3D range of saved data will be converted to netCDF. If only some of them are provided, the remainder default to the min/max values from the saved data (the code extracts these values from what has been saved, making no assumptions).
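The defaulting rule described above can be sketched in a few lines of Python. This is only an illustration, not the hdf2nc source: the function name `resolve_bounds`, the dictionary layout, and the sample saved extents are all assumptions made for the example.

```python
def resolve_bounds(requested, saved):
    """Fill in any index bounds the user did not supply.

    requested: dict holding any of x0, x1, y0, y1, z0, z1 (hypothetical layout)
    saved:     dict holding the extents found in the saved data
    """
    bounds = {}
    for key in ("x0", "x1", "y0", "y1", "z0", "z1"):
        value = requested.get(key)
        if value is None:
            # Z0 falls back to 0; every other bound falls back to the
            # corresponding extent of what was actually saved.
            value = 0 if key == "z0" else saved[key]
        bounds[key] = value
    return bounds


# Made-up saved extents, for illustration only.
saved = {"x0": 500, "x1": 1000, "y0": 600, "y1": 1100, "z1": 99}
print(resolve_bounds({"x0": 550, "z1": 10}, saved))
```

Only x0 and z1 are supplied here, so the remaining horizontal bounds come from the saved extents and z0 becomes 0.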
varname1...varnameN (optional)
The list of variables, separated by whitespace, that you wish to convert.
--debug (optional)
Turn on debugging output.
--verbose (optional)
Turn on verbose output.
--recache (optional)
Force regeneration of cache files.
--swaths (optional)
Read and write all LOFS 2D swath data to netCDF files.
--zfp (optional)
Turn on lossy ZFP compression for 3D data saved in netCDF files.
--inprogress (optional)
Operate on a 3D directory where a simulation is in progress (since the last directory will be full of zeroed out files that haven't been written to disk yet).
--offset (optional)
Supplied X0,X1,Y0,Y1 values are with respect to what was saved, not (0,0). For instance, if only a subset of the domain was saved, say ranging from 500 to 1000 in X and 600 to 1100 in Y, the following two sets of arguments would produce identical results:
--x0=50 --y0=100 --x1=250 --y1=300 --offset
--x0=550 --y0=700 --x1=750 --y1=900
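The arithmetic behind the offset example is just a shift by the lower-left corner of the saved subdomain. A minimal Python sketch, in which the function name `offset_to_absolute` and its argument order are hypothetical and not part of hdf2nc:

```python
def offset_to_absolute(x0, y0, x1, y1, saved_x0, saved_y0):
    """Convert indices given relative to the saved subdomain into
    indices relative to the full model domain origin (0,0)."""
    return (x0 + saved_x0, y0 + saved_y0, x1 + saved_x0, y1 + saved_y0)


# Saved subdomain starts at X=500, Y=600, as in the example above.
print(offset_to_absolute(50, 100, 250, 300, saved_x0=500, saved_y0=600))
# -> (550, 700, 750, 900)
```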
NOTE: u, v, and w (corresponding to the ua, va, wa CM1 3D arrays that exist on the Arakawa C staggered mesh) must be saved in LOFS files for diagnostics to be calculated. This is enforced in order to achieve the highest order of accuracy possible with diagnostics (specifically those involving derivatives of velocity components), rather than using velocity variables that have already been averaged to the scalar mesh. However, if u, v, and w have been saved and uinterp, vinterp, winterp have been requested, the code will interpolate those values from ua, va, and wa so that, for post-processing and visualization, all variables lie on the same mesh. It is strongly recommended for LOFS to save the native staggered velocity variables (ua, va, wa CM1 arrays, corresponding to output_[uvw] = 1 in the namelist.input file) if you wish to calculate diagnostics. Alternatively, you may wish to calculate diagnostics from within CM1, and save interpolated velocity data. We choose to save the minimum amount of data possible to save on disk space, and push the diagnostic calculations to the post-processing stage.
The following is a list of available additional fields that can be calculated within the hdf2nc code (u, v, w must be saved, as opposed to uinterp, vinterp, winterp):
uinterp | u component of wind interpolated to scalar mesh
vinterp | v component of wind interpolated to scalar mesh
winterp | w component of wind interpolated to scalar mesh
hwin_sr | storm relative horizontal wind speed
hwin_gr | ground relative horizontal wind speed
xvort | x component of vorticity
yvort | y component of vorticity
zvort | z component of vorticity
hvort | horizontal vorticity vector magnitude
vortmag | 3D vorticity vector magnitude
hdiv | horizontal divergence (du/dx + dv/dy)
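To illustrate why the staggered velocities are required, here is a one-dimensional Python sketch of the two basic operations on an Arakawa C mesh: averaging face values to the scalar point (as for uinterp) and differencing face values to get a derivative at the scalar point (as for the du/dx term of hdiv). It is not taken from the hdf2nc source; the function names, the sample values, and the uniform grid spacing are assumptions for the example.

```python
def interp_to_scalar(faces):
    """Average adjacent staggered (face) values: n+1 faces -> n scalar points."""
    return [0.5 * (faces[i] + faces[i + 1]) for i in range(len(faces) - 1)]


def face_difference(faces, spacing):
    """Difference adjacent face values; the result sits on the scalar points."""
    return [(faces[i + 1] - faces[i]) / spacing for i in range(len(faces) - 1)]


u_faces = [0.0, 2.0, 6.0, 12.0]  # made-up u values on four cell faces
dx = 2.0                         # made-up uniform grid spacing

print(interp_to_scalar(u_faces))     # [1.0, 4.0, 9.0]
print(face_difference(u_faces, dx))  # [1.0, 2.0, 3.0]
```

Differencing the face values directly uses points one grid spacing apart, whereas differencing already-averaged values would span two grid spacings, which is the accuracy concern raised in the note above.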
hdf2nc Example
h2ologin2:~/project-bagm/brainstorm2017/15m/history.fs8-15m-a% hdf2nc --histpath=3D --base=frob --time=7101 --x0=2300 --y0=2300 --x1=2400 --y1=2400 --z1=10 uinterp winterp dbz
histpath = 3D
ncbase = frob
time = 7101.000000
X0 = 2300
Y0 = 2300
X1 = 2400
Y1 = 2400
Z1 = 10
Setting Z0 to default value of 0
Read cached num_time_dirs of 146
ntimedirs: 146
Read cached sorted time dirs
Read cached num node dirs
Read cached nodedir
Read cached firstfilename and all times
We are requesting the following fields: uinterp winterp dbz
Working on surface 2D thrhopert and dbz ( acca czza czza aaaa acca czza czza aaaa )
Working on uinterp ( acca czza czza aaaa )
Working on winterp ( acca czza czza aaaa )
Working on dbz ( acca czza czza aaaa )
h2ologin2:~/project-bagm/brainstorm2017/15m/history.fs8-15m-a% lt
total 168656
drwxr-xr-x 10 orf PRAC_bagm 4096 Jul 31 11:12 ./
-rw-r--r-- 1 orf PRAC_bagm 1476356 Jul 31 11:12 frob.07101.000000.nc
-rw-r--r-- 1 orf PRAC_bagm 116 Jul 31 11:12 frob.07101.000000.nc.cmd
Discussion
If no array index options are provided on the command line, hdf2nc will attempt to convert the entire saved model domain to a netCDF file. In other words, X0,Y0,Z0,X1,Y1,Z1 are optional arguments to hdf2nc. However, for typical use cases with large amounts of data, you will want to convert only a subset of the full model domain!
The output of hdf2nc includes some basic metadata, plus some output that tracks the reading of the individual hdf5 files that comprise LOFS. Each letter in output that looks like
acca czza czza aaaa
represents the successful reading of data from a single hdf5 file. It's kind of a 'base 26' representation of the percentage of data (in the horizontal) requested from each file. If 'z' is printed, that means the full horizontal range of data was requested. If 'a' is printed, a tiny piece of the horizontal range was selected. All intermediate letters represent the space between these two extremes. This output is for your entertainment only; you are essentially watching the assembly of the netCDF file from LOFS data in real time.
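One plausible way to produce such a 'base 26' letter is to scale the requested fraction onto the 26 letters of the alphabet. The Python sketch below is a guess at the idea only; the exact rounding hdf2nc uses is not documented here, and the function name `fraction_to_letter` is hypothetical.

```python
def fraction_to_letter(fraction):
    """Map a fraction in [0, 1] of a file's horizontal extent to a letter,
    with 'a' for a tiny piece and 'z' for the full range.
    The truncation used here is an assumption, not hdf2nc's actual rule."""
    fraction = min(max(fraction, 0.0), 1.0)  # clamp to the valid range
    return chr(ord("a") + int(fraction * 25))


print(fraction_to_letter(0.0), fraction_to_letter(0.5), fraction_to_letter(1.0))
```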
hdf2nc always produces a 2D surface plot of density potential temperature perturbation from base state (proportional to buoyancy) and surface (calculated) radar reflectivity. These fields are used so much that they are always written to the netCDF files whether they are requested or not.
Regarding the mention of cached data, the LOFS read routines will look for existing cache files before going out and getting all the metadata from hdf5 files, which, for large amounts of data, is very expensive. Since the data layout never changes (unless you change it), the cached files speed things up quite a bit. If you ever change your LOFS data (say, by adding new time directories), you must remove the cache files (or run with --recache) and let LOFS regenerate them so they will contain the new information. Cache files are all prefixed by .cm1hdf5_ and can always be removed, as they will always be regenerated.
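Clearing the cache by hand can be scripted. The following Python sketch removes files whose names start with the .cm1hdf5_ prefix from a directory you pass in; which directory actually holds the cache files is not stated here, so that argument is left to the caller, and the function name `remove_cache_files` is hypothetical.

```python
from pathlib import Path

CACHE_PREFIX = ".cm1hdf5_"  # prefix named in the text above


def remove_cache_files(directory, dry_run=True):
    """Return the names of cache files in `directory`; delete them
    only when dry_run is False."""
    matched = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.name.startswith(CACHE_PREFIX):
            if not dry_run:
                path.unlink()
            matched.append(path.name)
    return matched
```

Calling `remove_cache_files(".")` lists what would be deleted in the current directory; pass `dry_run=False` to actually remove the files.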
In this example, the output file name is frob.07101.000000.nc, indicating data that was retrieved at t=07101.000000 seconds. Note that LOFS allows for the saving and retrieval of data saved in intervals of less than one second, as time is represented as a floating point variable.
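The naming pattern can be reproduced with a zero-padded fixed-point format. This Python sketch infers the field width (five integer digits, six decimals) from the single example file name above, so treat the format as an assumption rather than a specification; `netcdf_name` is a hypothetical helper.

```python
def netcdf_name(ncbase, time_seconds):
    """Build a file name like frob.07101.000000.nc from a base name and
    a model time in seconds (format inferred from one example)."""
    return f"{ncbase}.{float(time_seconds):012.6f}.nc"


print(netcdf_name("frob", 7101))    # frob.07101.000000.nc
print(netcdf_name("frob", 7101.5))  # frob.07101.500000.nc
```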
Rationale
LOFS splits the model domain and times into files spread across hundreds of directories in large simulations. Often you may wish to analyze, plot, or visualize a subset of the full model domain at a given time, perhaps to make plots or to feed into visualization software that understands the netCDF format, which is one of the most commonly used data formats in atmospheric science.