HDF5 - BKJackson/BKJackson_Wiki GitHub Wiki
What is HDF5
Hierarchical Data Formats - What is HDF5? - A short, easy-to-read explainer from NEON Science.
h5py Project
How to use HDF5 files in Python
HDF5 for Python - The h5py package is a Pythonic interface to the HDF5 binary data format.
h5py Quick Start Guide
Creating an HDF5 file with h5py
The File object is your starting point.
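import h5py
import numpy as np

# Create (or overwrite) the file and add a dataset of 100 32-bit integers.
f = h5py.File("mytestfile.hdf5", "w")
dset = f.create_dataset("mydataset", (100,), dtype='i')

Alternatively, use a context manager so the file is closed automatically when the block ends (use this instead of the two lines above, not in addition to them):

with h5py.File("mytestfile.hdf5", "w") as f:
    dset = f.create_dataset("mydataset", (100,), dtype='i')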
What is stored in this file? An h5py.File object acts like a Python dictionary, so we can list its keys while the file is open.
>>> list(f.keys())
['mydataset']
Examine the dataset as a Dataset object
>>> dset = f['mydataset']
>>> dset.shape
(100,)
>>> dset.dtype
dtype('int32')
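The steps above can be combined into a small round trip: create the dataset, fill it, close the file, then reopen it read-only and inspect it. This is a minimal sketch rather than the quick start's exact code; the temporary directory and the values written (the integers 0 to 99) are assumptions made for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch location so the example does not touch the working directory.
path = os.path.join(tempfile.mkdtemp(), "mytestfile.hdf5")

# Write: create the dataset and fill it with the integers 0..99.
with h5py.File(path, "w") as f:
    dset = f.create_dataset("mydataset", (100,), dtype="i")
    dset[...] = np.arange(100)

# Read: reopen the file read-only and inspect its contents.
with h5py.File(path, "r") as f:
    names = list(f.keys())          # File behaves like a dictionary
    shape = f["mydataset"].shape
    dtype = f["mydataset"].dtype
    first_five = f["mydataset"][:5]  # slicing returns a NumPy array

print(names, shape, dtype, first_five)
```

Everything read inside the second `with` block is copied into ordinary Python and NumPy objects, so it remains usable after the file is closed.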
About the h5py project
h5py lets you store huge amounts of numerical data, and easily manipulate that data from NumPy. For example, you can slice into multi-terabyte datasets stored on disk, as if they were real NumPy arrays. Thousands of datasets can be stored in a single file, categorized and tagged however you want.
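As an illustration of slicing and tagging, the sketch below stores a small two-dimensional dataset inside a group, attaches an attribute to it, and then reads back only a sub-block. The file name, the group and dataset names, the `units` attribute, and the array size are all hypothetical; a real use case would involve far larger data.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "slicing_demo.hdf5")

with h5py.File(path, "w") as f:
    data = np.arange(1000 * 50, dtype="f8").reshape(1000, 50)
    # A slash-separated name places the dataset inside a group (like a folder);
    # the intermediate group "experiment1" is created automatically.
    dset = f.create_dataset("experiment1/temperature", data=data)
    # Attributes attach small pieces of metadata ("tags") to a dataset or group.
    dset.attrs["units"] = "kelvin"

with h5py.File(path, "r") as f:
    dset = f["experiment1/temperature"]
    # Only the requested region is read from disk: rows 10-19, every 10th column.
    block = dset[10:20, ::10]
    units = dset.attrs["units"]

print(block.shape, units)
```

Groups give the "categorized" part of the description and attributes give the "tagged" part; the slice expression is the same one you would write for an in-memory NumPy array.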
h5py uses straightforward NumPy and Python metaphors, like dictionary and NumPy array syntax. For example, you can iterate over datasets in a file, or check out the .shape or .dtype attributes of datasets. You don't need to know anything special about HDF5 to get started.
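A short sketch of the dictionary-style iteration described above: it writes two datasets and then loops over the file to collect each one's shape and dtype. The dataset names `counts` and `image` and the file name are made up for the example.

```python
import os
import tempfile

import h5py

# Hypothetical scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "iterate_demo.hdf5")

with h5py.File(path, "w") as f:
    f.create_dataset("counts", (100,), dtype="i")
    f.create_dataset("image", (64, 64), dtype="f4")

summary = {}
with h5py.File(path, "r") as f:
    # Iterating over a File yields the names of its members, like a dict.
    for name in f:
        dset = f[name]
        summary[name] = (dset.shape, str(dset.dtype))

for name, (shape, dtype) in summary.items():
    print(name, shape, dtype)
```

`f.items()` and `f.values()` are also available when you want the objects themselves rather than their names.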
In addition to the easy-to-use high-level interface, h5py rests on an object-oriented Cython wrapping of the HDF5 C API. Almost anything you can do from C in HDF5, you can do from h5py.
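To give a feel for the two layers, the sketch below reads a dataset's shape once through the high-level `Dataset.shape` attribute and once through the low-level identifier objects reachable via `.id`. It assumes the low-level methods `get_space()` and `get_simple_extent_dims()`, which mirror the C functions `H5Dget_space` and `H5Sget_simple_extent_dims`; the file name is hypothetical, and most code never needs to drop to this level.

```python
import os
import tempfile

import h5py

# Hypothetical scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "lowlevel_demo.hdf5")

with h5py.File(path, "w") as f:
    dset = f.create_dataset("mydataset", (100,), dtype="i")

    # High-level interface: a NumPy-style attribute.
    high_level_shape = dset.shape

    # Low-level interface: .id exposes the wrapped HDF5 identifier object.
    dsid = dset.id                                     # h5py.h5d.DatasetID
    space = dsid.get_space()                           # dataspace of the dataset
    low_level_shape = space.get_simple_extent_dims()   # dimensions as a tuple

print(high_level_shape, low_level_shape)
```

Both routes report the same dimensions; the high-level attribute is simply a convenience over the same underlying HDF5 calls.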
Best of all, the files you create are in a widely-used standard binary format, which you can exchange with other people, including those who use programs like IDL and MATLAB.