Data Format: HDF5

What is HDF5?

All NISAR standard products are in Hierarchical Data Format version 5 (HDF5). HDF5 is a programming library and file format designed to store, organize, and access large scientific datasets. NISAR uses HDF5 to systematically organize radar data and metadata in a way that is both efficient and easy to read, share, and analyze.

HDF was originally developed by the University of Illinois’ National Center for Supercomputing Applications (NCSA) to support data sharing within the scientific community. HDF5 represents a significant redesign compared to earlier versions of HDF, with a more flexible and powerful internal structure. For additional details, users can consult the official HDF documentation at https://support.hdfgroup.org/documentation/.

At a high level, an HDF5 file functions as a container that organizes data into a hierarchy of objects, such as groups, datasets, and datatypes. In NISAR products, radar layers are generally organized into two groups: frequencyA/ and (potentially) frequencyB/. Note that no data are stored directly at the root group (/).

Groups

An HDF5 group is analogous to a folder within an HDF5 file. Groups can hold datasets, datatypes, and other groups (subfolders), much like directories in a file system. In a NISAR product, datasets are organized by nesting groups. For example, in a NISAR GCOV product, a dataset may be stored at a path such as:

/science/LSAR/GCOV/grids/frequencyA/HH

In this path, science, LSAR, GCOV, grids, and frequencyA are groups, and HH is a dataset contained within the frequencyA group.
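The nesting described above can be sketched with a small example. This is a minimal illustration, assuming the third-party h5py and NumPy packages (neither is named in this section); the file name and array contents are hypothetical, and the file is kept in memory rather than read from a real NISAR product.

```python
import h5py
import numpy as np

# Build a tiny in-memory file that mimics the path layout described above.
# The file name and the array contents are placeholders, not real NISAR data.
with h5py.File("example_gcov.h5", "w", driver="core", backing_store=False) as f:
    # h5py creates the intermediate groups along the path automatically.
    f.create_dataset(
        "/science/LSAR/GCOV/grids/frequencyA/HH",
        data=np.zeros((4, 4), dtype=np.float32),
    )

    # Walk down the path one component at a time and record each object's kind.
    kinds = {}
    node = f
    for name in "science/LSAR/GCOV/grids/frequencyA/HH".split("/"):
        node = node[name]
        kinds[name] = "group" if isinstance(node, h5py.Group) else "dataset"
        print(f"{node.name}: {kinds[name]}")
```

With a real product, the same loop could be run after opening the file in read mode instead of creating it.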

Datasets

An HDF5 dataset is where the actual data lives, typically as an array or a table stored within the HDF5 file. Each dataset includes the data, a dataspace (the dimensions of the array), a datatype, and optional attributes such as units, range, time, and other descriptions.
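As a rough sketch of how these parts appear in practice, the example below creates a small dataset and inspects it. It assumes the h5py and NumPy packages; the dataset name, shape, and attribute value are made up for illustration.

```python
import h5py
import numpy as np

with h5py.File("example_dataset.h5", "w", driver="core", backing_store=False) as f:
    # Hypothetical 3 x 5 raster with placeholder values.
    values = np.arange(15, dtype=np.float32).reshape(3, 5)
    dset = f.create_dataset("HH", data=values)
    dset.attrs["units"] = "1"  # optional attribute (illustrative value)

    shape = dset.shape                    # dataspace: the dimensions of the array
    dtype = dset.dtype                    # datatype: how each element is interpreted
    first_row = dset[0, :]                # the data itself, read as a NumPy array
    attr_names = list(dset.attrs.keys())  # names of the attached attributes
```

Slicing the dataset reads only the requested part from the file, which is useful when a layer is too large to load at once.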

Attributes

An HDF5 attribute is a small piece of information that describes a group or dataset. Note that an attribute does not store the data itself. Attributes provide important context that helps correctly interpret values within a dataset. Common examples include units, valid range, time, and other descriptions.

Storing this information with the data helps ensure that datasets can be understood and used correctly without relying on external documentation.
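A minimal sketch of attaching and reading attributes follows, assuming the h5py and NumPy packages. The attribute names and values are illustrative only and are not taken from the NISAR product specification.

```python
import h5py
import numpy as np

with h5py.File("example_attrs.h5", "w", driver="core", backing_store=False) as f:
    grp = f.create_group("frequencyA")
    dset = grp.create_dataset("HH", data=np.ones((2, 2), dtype=np.float32))

    # Attribute names and values are hypothetical.
    dset.attrs["units"] = "1"
    dset.attrs["description"] = "Example backscatter layer"
    grp.attrs["note"] = "attributes can also be attached to groups"

    # Read the attributes back as ordinary Python values.
    dataset_attrs = dict(dset.attrs.items())
    group_note = grp.attrs["note"]
    print(dataset_attrs)
```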

Datatypes

An HDF5 datatype describes the kind of data that is being stored. A datatype explains both how to interpret a dataset and how it is stored. Datatypes fall into three categories: atomic datatypes, composite datatypes, and named datatypes.

A summary of some important datatypes is given below. For more details on HDF5 datatypes and their uses, see the official HDF5 Datatypes documentation.

Atomic Datatypes

Atomic datatypes are typically the simplest datatypes. They serve as building blocks for more complex datatypes. Common atomic datatypes include integers, floating-point numbers, and strings.

Derived datatypes are customized atomic datatypes, commonly used for N-bit integers, floating-point formats, and other nonstandard data representations. They enable efficient and precise storage when data do not conform to standard numeric formats.

Composite Datatypes

Composite datatypes are combinations of other datatypes. Some important composite datatypes are described below.

Array datatypes represent fixed-size, multi-dimensional arrays of a specified base datatype, where the array shape is defined as part of the datatype.

Variable-length datatypes represent one-dimensional arrays of a specified base datatype, with a variable number of items.

Compound datatypes represent collections of named fields, each with its own datatype.

Enumeration datatypes map integer values to a predefined set of named labels, improving user readability and consistency.
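The four composite datatypes described above can be sketched as follows. This assumes the h5py and NumPy packages; all dataset names, field names, and values are hypothetical. The array datatype is shown as a field of a compound datatype, which is one common way it appears.

```python
import h5py
import numpy as np

with h5py.File("example_types.h5", "w", driver="core", backing_store=False) as f:
    # Variable-length: each element is a 1-D int32 array of arbitrary length.
    vlen_dt = h5py.vlen_dtype(np.dtype("int32"))
    ragged = f.create_dataset("ragged", (2,), dtype=vlen_dt)
    ragged[0] = [1, 2, 3]
    ragged[1] = [4]
    first_ragged = ragged[0]

    # Enumeration: integer codes mapped to named labels.
    enum_dt = h5py.enum_dtype({"INVALID": 0, "VALID": 1}, basetype="i1")
    flags = f.create_dataset("flags", (3,), dtype=enum_dt)
    flags[:] = [0, 1, 1]
    enum_map = h5py.check_enum_dtype(flags.dtype)

    # Compound: named fields, each with its own datatype.
    # The "offset" field is an array datatype: a fixed-size block of 3 float32.
    compound_dt = np.dtype(
        [("label", "S8"), ("value", np.float64), ("offset", np.float32, (3,))]
    )
    rows = np.zeros(2, dtype=compound_dt)
    rows[0] = (b"first", 2.5, (1.0, 2.0, 3.0))
    table = f.create_dataset("table", data=rows)
    record = table[0]
```

Here `record["offset"]` is a three-element array and `enum_map` recovers the label-to-integer mapping stored with the dataset.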

Named Datatypes

Named datatypes are stored as objects within an HDF5 file. Any datatype (atomic, derived, or composite) can be named and then referenced throughout the file. Naming allows a datatype to be defined once and shared by multiple datasets and attributes.
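A minimal sketch of a named datatype, assuming the h5py and NumPy packages: assigning a NumPy dtype to a path in the file stores it as a datatype object, which can then be passed when creating datasets. The names `float32_t`, `a`, and `b` are hypothetical.

```python
import h5py
import numpy as np

with h5py.File("example_named.h5", "w", driver="core", backing_store=False) as f:
    # Store (commit) a datatype in the file under a hypothetical name.
    f["float32_t"] = np.dtype("float32")
    named = f["float32_t"]
    is_datatype_object = isinstance(named, h5py.Datatype)

    # Reuse the same named datatype for two datasets.
    a = f.create_dataset("a", (4,), dtype=named)
    b = f.create_dataset("b", (2, 2), dtype=named)

    # Low-level check that dataset "a" refers to a committed (named) type.
    a_uses_named_type = a.id.get_type().committed()
    same_dtype = a.dtype == b.dtype == np.dtype("float32")
```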