Dataframe#
Module related to DataFrames.
- class h5pandas.dataframe.DatasetAccessor(pandas_obj)[source]#
Accessor to the dataset backing a pandas object created by h5pandas.
- property attrs#
Return the attributes of the dataset backing the pandas object.
- property dataset#
Return the dataset backing the pandas object.
- property file#
Return the file backing the pandas object.
- property name#
Return the name of the dataset backing the pandas object.
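The `pandas_obj` constructor argument matches the pandas extension-accessor pattern. The sketch below shows that pattern with plain pandas only. The accessor name `demo_backing`, the class `DemoBackingAccessor`, and the `source_name` key are hypothetical names chosen for illustration; the name under which h5pandas exposes `DatasetAccessor` is not stated here.

```python
import pandas as pd


@pd.api.extensions.register_dataframe_accessor("demo_backing")  # hypothetical name
class DemoBackingAccessor:
    """Toy accessor reporting where a DataFrame came from (not h5pandas code)."""

    def __init__(self, pandas_obj):
        # pandas passes the DataFrame instance to the accessor constructor.
        self._obj = pandas_obj

    @property
    def name(self):
        # Toy lookup in DataFrame.attrs; an accessor backed by a real
        # h5py dataset would report the dataset name instead.
        return self._obj.attrs.get("source_name")


df = pd.DataFrame({"x": [1.0, 2.0]})
df.attrs["source_name"] = "/dataframe"
source = df.demo_backing.name
```

A real accessor of this kind would presumably expose `attrs`, `dataset`, `file` and `name` in the same way, each as a read-only property.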
- h5pandas.dataframe.dataframe_to_hdf(dataframe: DataFrame, h5file: str | Group, dataset_name: str = 'dataframe', index: list | None | Index = None, columns: list[str] | None = None, metadata: dict = {}, *args, **kwargs) → Dataset[source]#
High-level function to write a DataFrame into an HDF5 file.
The DataFrame's column names (dataframe.columns) and attributes (dataframe.attrs) are written to the dataset attributes and can be retrieved later when accessing the file with h5pandas.
Parameters#
- dataframe: pandas.DataFrame
The dataframe to write.
- h5file: str or h5py.File or h5py.Group
If it is a string: the name of the HDF5 file in which the dataframe will be written. If the file already exists, the dataframe is added to it; otherwise the file is created. If h5file is an h5py.File or h5py.Group object, the dataframe is written inside that object.
- dataset_name: str, optional
The name of the dataset that will contain the dataframe. Default = “dataframe”.
- index: list, None or pandas.Index, optional
Default=None. If not None, index will be written inside the HDF5 file and can be retrieved later with h5pandas.
- columns: list, optional
Names of the columns of the dataframe to save, if any. If columns is None, the dataframe's column names are used.
- metadata: dict, optional
Additional metadata to save with the dataframe as dataset attributes, for example units or a description.
- *args and **kwargs: additional parameters passed directly to h5py.Group.create_dataset
These can be compression options, for example. See https://docs.h5py.org/en/stable/high/group.html#h5py.Group.create_dataset and https://pypi.org/project/hdf5plugin/
Returns#
- dataset: h5py.Dataset or None
The dataset created inside h5file. If h5file is a string, returns None.
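The storage idea described for this function (values in a dataset, column names and `DataFrame.attrs` in the dataset attributes) can be sketched with plain h5py. This is only an illustration under stated assumptions: the attribute key `columns`, the file name `sketch.h5`, and the in-memory file driver are choices made for the example, and h5pandas may lay out its attributes differently.

```python
import h5py
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [0.0, 1.0, 2.0], "y": [10.0, 11.0, 12.0]})
df.attrs["units"] = "m"

# In-memory HDF5 file, so the sketch leaves nothing on disk.
with h5py.File("sketch.h5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset("dataframe", data=df.to_numpy())

    # Hypothetical attribute key; variable-length strings hold the names.
    dset.attrs["columns"] = np.array(list(df.columns), dtype=h5py.string_dtype())
    for key, value in df.attrs.items():
        dset.attrs[key] = value

    # Read everything back and rebuild an equivalent DataFrame.
    names = [
        n.decode() if isinstance(n, bytes) else str(n)
        for n in dset.attrs["columns"]
    ]
    restored = pd.DataFrame(dset[...], columns=names)
    restored.attrs["units"] = dset.attrs["units"]
```

Going by the signature above, the corresponding h5pandas call would presumably be `dataframe_to_hdf(df, "file.h5", dataset_name="dataframe")`, with any extra attributes passed through `metadata`.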
- h5pandas.dataframe.dataset_to_dataframe(dataset: Dataset, columns=None, index=None, copy=False)[source]#
Transform a dataset into a DataFrame.
Parameters#
- dataset: h5py.Dataset
The dataset to convert into a DataFrame.
- columns: iterable, optional
Column labels to use for resulting frame when data does not have them, defaulting to RangeIndex(0, 1, 2, …, n). If data contains column labels, will perform column selection instead.
- index: Index or array-like, optional
Index to use for the resulting frame. Defaults to RangeIndex if no indexing information is part of the input data and no index is provided.
- copy: bool, optional
Copy data from inputs. For dict data, the default of None behaves like copy=True. For DataFrame or 2d ndarray input, the default of None behaves like copy=False. If data is a dict containing one or more Series (possibly of different dtypes), copy=False will ensure that these inputs are not copied.
Returns#
- pandas.DataFrame
A DataFrame backed by the dataset. If you change the dataset values, the DataFrame changes as well.
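The `columns` and `index` descriptions follow the wording of the `pandas.DataFrame` constructor. The sketch below shows those two `columns` behaviours and the default index on plain pandas objects. It does not call h5pandas, and whether `dataset_to_dataframe` behaves identically for every input is an assumption.

```python
import numpy as np
import pandas as pd

# Unlabeled data: `columns` supplies the labels.
unlabeled = pd.DataFrame(np.zeros((2, 3)), columns=["a", "b", "c"])

# Labeled data: `columns` selects (and orders) existing columns.
labeled = pd.DataFrame(
    {"a": [1, 2], "b": [3, 4], "c": [5, 6]},
    columns=["c", "a"],
)

# No index given and none carried by the data: pandas uses a RangeIndex.
default_index = unlabeled.index
```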
- h5pandas.dataframe.group_to_dataframe(group) → DataFrame[source]#
Transform a group into a DataFrame.
Parameters#
- group: h5py.Group
The group to convert into a DataFrame.
Returns#
- pandas.DataFrame
A DataFrame backed by the dataset. If you change the dataset values, the DataFrame changes as well.
- h5pandas.dataframe.ndarray_to_hdf(array: ndarray, h5file: str | Group, dataset_name: str = 'array', index: list | None | Index = None, columns: list[str] | None = None, metadata: dict = {}, *args, **kwargs) → Dataset[source]#
High-level function to write a NumPy array into an HDF5 file.
Parameters#
- array: np.ndarray
The array to write.
- h5file: str or h5py.File or h5py.Group
If it is a string: the name of the HDF5 file in which the array will be written. If the file already exists, the array is added to it; otherwise the file is created. If h5file is an h5py.File or h5py.Group object, the array is written inside that object.
- dataset_name: str, optional
The name of the dataset that will contain the array. Default = “array”.
- index: list, None or pandas.Index, optional
Default=None. If not None, index will be written inside the HDF5 file and can be retrieved later with h5pandas.
- columns: list, optional
Names of the columns of the array to save, if any. If columns is None and the array is a structured array, the structured field names are used. If columns is None and the array is not structured, no column names are written.
- metadata: dict, optional
Additional metadata to save with the array as dataset attributes, for example units or a description.
- *args and **kwargs: additional parameters passed directly to h5py.Group.create_dataset
These can be compression options, for example. See https://docs.h5py.org/en/stable/high/group.html#h5py.Group.create_dataset and https://pypi.org/project/hdf5plugin/
Returns#
- dataset: h5py.Dataset or None
The dataset created inside h5file. If h5file is a string, returns None.
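Since the extra arguments are forwarded to h5py's `create_dataset`, it may help to see what such options look like on a direct h5py call. The sketch below uses only h5py and NumPy. The file name, chunk shape, and compression level are arbitrary choices for the example.

```python
import h5py
import numpy as np

array = np.arange(10_000, dtype="float64").reshape(1000, 10)

# In-memory HDF5 file, so the sketch leaves nothing on disk.
with h5py.File("sketch.h5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset(
        "array",
        data=array,
        chunks=(100, 10),      # compression requires a chunked layout
        compression="gzip",    # built-in HDF5 filter
        compression_opts=4,    # gzip level, 0 (fastest) to 9 (smallest)
        shuffle=True,          # byte-shuffle filter, often helps numeric data
    )
    compression = dset.compression
    chunks = dset.chunks
    roundtrip = dset[...]
```

Under the documented forwarding, the same keywords (for instance `compression="gzip"`) would presumably be passed as extra keyword arguments to `ndarray_to_hdf` or `dataframe_to_hdf`.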