Dataframe#

Module related to DataFrames.

class h5pandas.dataframe.DatasetAccessor(pandas_obj)[source]#

Accessor to the dataset backing a pandas object created by h5pandas.

property attrs#: Return the attributes of the dataset backing the pandas object.

property dataset#: Return the dataset backing the pandas object.

property file#: Return the file backing the pandas object.

property name#: Return the name of the dataset backing the pandas object.

h5pandas.dataframe.dataframe_to_hdf(dataframe: DataFrame, h5file: str | Group, dataset_name: str = 'dataframe', index: list | None | Index = None, columns: list[str] | None = None, metadata: dict = {}, *args, **kwargs) → Dataset[source]#

High-level function to write a DataFrame into an HDF5 file.

DataFrame column names (dataframe.columns) and attributes (dataframe.attrs) are written to the dataset attributes and can be retrieved later when accessing the file with h5pandas.

Parameters#

dataframe: pandas.DataFrame: The dataframe to write.
h5file: str or h5py.File or h5py.Group: If it is a string, the name of the HDF5 file in which the dataframe will be written. If the file already exists, the dataframe is added to it; otherwise the file is created. If h5file is an h5py.File or h5py.Group object, the dataframe is written inside that object.
dataset_name: str, optional: The name of the dataset that will contain the dataframe. Default = “dataframe”.
index: list, None or pandas.Index, optional: Default = None. If not None, the index is written inside the HDF5 file and can be retrieved later with h5pandas.
columns: list, optional: Names of the columns of the dataframe to save, if any. If columns is None, the dataframe's own column names are used.
metadata: dict, optional: Additional metadata to save with the dataframe as dataset attributes, for example units or a description.
*args and **kwargs: Additional parameters passed directly to h5py.Group.create_dataset, for example compression options. See https://docs.h5py.org/en/stable/high/group.html#h5py.Group.create_dataset and https://pypi.org/project/hdf5plugin/

Returns#

dataset: h5py.Dataset or None: The dataset created inside h5file. If h5file is a string, returns None.

h5pandas.dataframe.dataset_to_dataframe(dataset: Dataset, columns=None, index=None, copy=False)[source]#

Transform a dataset into a DataFrame.

Parameters#

dataset: h5py.Dataset: The dataset to convert into a DataFrame.
columns: iterable, optional: Column labels to use for the resulting frame when the data does not have them, defaulting to RangeIndex(0, 1, 2, …, n). If the data contains column labels, column selection is performed instead.
index: Index or array-like, optional: Index to use for the resulting frame. Defaults to RangeIndex if the input data carries no indexing information and no index is provided.
copy: bool, optional: Copy data from inputs. For dict data, the default of None behaves like copy=True. For DataFrame or 2d ndarray input, the default of None behaves like copy=False. If data is a dict containing one or more Series (possibly of different dtypes), copy=False will ensure that these inputs are not copied.

Returns#

pandas.DataFrame: A DataFrame backed by the dataset. If you change the dataset values, the DataFrame will change as well.

h5pandas.dataframe.group_to_dataframe(group) → DataFrame[source]#

Transform a group into a DataFrame.

Parameters#

group: h5py.Group: The group to convert into a DataFrame.

Returns#

pandas.DataFrame: A DataFrame backed by the dataset. If you change the dataset values, the DataFrame will change as well.

h5pandas.dataframe.ndarray_to_hdf(array: ndarray, h5file: str | Group, dataset_name: str = 'array', index: list | None | Index = None, columns: list[str] | None = None, metadata: dict = {}, *args, **kwargs) → Dataset[source]#

High-level function to write a NumPy array into an HDF5 file.

Parameters#

array: np.ndarray: The array to write.
h5file: str or h5py.File or h5py.Group: If it is a string, the name of the HDF5 file in which the array will be written. If the file already exists, the array is added to it; otherwise the file is created. If h5file is an h5py.File or h5py.Group object, the array is written inside that object.
dataset_name: str, optional: The name of the dataset that will contain the array. Default = “array”.
index: list, None or pandas.Index, optional: Default = None. If not None, the index is written inside the HDF5 file and can be retrieved later with h5pandas.
columns: list, optional: Names of the columns of the array to save, if any. If the array is a structured array and columns is None, the structured field names are used. Otherwise, if columns is None, no column names are written.
metadata: dict, optional: Additional metadata to save with the array as dataset attributes, for example units or a description.
*args and **kwargs: Additional parameters passed directly to h5py.Group.create_dataset, for example compression options. See https://docs.h5py.org/en/stable/high/group.html#h5py.Group.create_dataset and https://pypi.org/project/hdf5plugin/

Returns#

dataset: h5py.Dataset or None: The dataset created inside h5file. If h5file is a string, returns None.
