gavo.grammars.hdf5grammar module

A grammar producing rows from a table within an HDF5 file.

class gavo.grammars.hdf5grammar.AstropyHDF5TableIterator(grammar, sourceToken, sourceRow=None)[source]

Bases: RowIterator

A row iterator generating rawdicts from Astropy-serialised HDF5 tables.

The table is assumed to contain record arrays; NULL values are properly handled through associated .mask columns.
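The mask convention can be illustrated with a minimal sketch in plain Python. This is not the DaCHS implementation: the function name `iter_masked_rows` and the dict-of-lists input are hypothetical stand-ins for a record array read from an HDF5 file, and the only thing taken from the description above is that a column `name` may be accompanied by a boolean column `name.mask`.

```python
def iter_masked_rows(columns):
    """Yield one dict per row, replacing masked values with None.

    columns maps column names to equal-length sequences (a hypothetical
    stand-in for a record array).  A column called name + ".mask" holds
    booleans that mark the masked entries of the column called name.
    """
    # Mask columns are bookkeeping only; they do not show up in the rows.
    data_names = [n for n in columns if not n.endswith(".mask")]
    n_rows = len(columns[data_names[0]]) if data_names else 0
    for i in range(n_rows):
        row = {}
        for name in data_names:
            mask = columns.get(name + ".mask")
            masked = mask is not None and bool(mask[i])
            row[name] = None if masked else columns[name][i]
        yield row


# Illustrative data: the second mag value is masked, so it becomes None.
sample = {
    "ra": [10.5, 11.0, 12.25],
    "mag": [14.2, 0.0, 15.1],
    "mag.mask": [False, True, False],
}
rows = list(iter_masked_rows(sample))
```

Columns without a companion mask column pass through unchanged, which is why `ra` never yields None in this example.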

class gavo.grammars.hdf5grammar.HDF5Grammar(parent, **kwargs)[source]

Bases: Grammar

A grammar for parsing single tables from HDF5 files.

Rows produced by this grammar are typed records, i.e., values normally come in the types they are supposed to have. The keys in the rows are the column names as given in the HDF5 file.

Regrettably, there are about as many conventions to serialise tables in HDF5 as there are programmes writing HDF5. This grammar supports a few styles; ask to have more included.

Styles currently implemented:

Astropy:

The table comes as a record array. The grammar is aware of the astropy convention of adding mask columns named name+".mask" and will turn masked values into Nones.

Vaex:

The table comes as a group whose members represent the columns; each member holds its values as an individual array in its data dataset. Put the parent of the columns group into the dataset attribute here.

This class is not intended for ingesting large HDF5 files, as it will only process a few thousand rows per second on typical hardware. Use a direct grammar (see Element directGrammar) for large files.

attrSeq = [<gavo.base.attrdef.UnicodeAttribute object>, <gavo.base.attrdef.UnicodeAttribute object>, <gavo.base.parsecontext.IdAttribute object>, <gavo.base.complexattrs.StructAttribute object>, <gavo.base.parsecontext.OriginalAttribute object>, <gavo.base.complexattrs.PropertyAttribute object>, <gavo.rscdef.common.RDAttribute object>, <gavo.base.complexattrs.StructListAttribute object>, <gavo.base.complexattrs.StructAttribute object>, <gavo.base.attrdef.EnumeratedUnicodeAttribute object>]
clearProperty(name)
completedCallbacks = []
getFullId()
getProperty(name, default=<Undefined>)
hasProperty(name)
managedAttrs = {'dataset': <gavo.base.attrdef.UnicodeAttribute object>, 'enc': <gavo.base.attrdef.UnicodeAttribute object>, 'id': <gavo.base.parsecontext.IdAttribute object>, 'ignoreOn': <gavo.base.complexattrs.StructAttribute object>, 'original': <gavo.base.parsecontext.OriginalAttribute object>, 'properties': <gavo.base.complexattrs.PropertyAttribute object>, 'property': <gavo.base.complexattrs.PropertyAttribute object>, 'rd': <gavo.rscdef.common.RDAttribute object>, 'rowfilter': <gavo.base.complexattrs.StructListAttribute object>, 'rowfilters': <gavo.base.complexattrs.StructListAttribute object>, 'sourceFields': <gavo.base.complexattrs.StructAttribute object>, 'style': <gavo.base.attrdef.EnumeratedUnicodeAttribute object>}
name_ = 'hdf5Grammar'
onElementComplete()[source]
property rd
rowIterator

alias of AstropyHDF5TableIterator

setProperty(name, value)
class gavo.grammars.hdf5grammar.VaexHDF5TableIterator(grammar, sourceToken, sourceRow=None)[source]

Bases: RowIterator

A row iterator generating rawdicts from Vaex-serialised HDF5 tables.

Here, the columns come in separate arrays, much like FITS tables.
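Turning separate column arrays into row dictionaries might look roughly like the sketch below. It is an illustration only, not the DaCHS code: the function name `iter_column_rows` is hypothetical, and a plain dict of lists stands in for the per-column arrays that would be read from the file.

```python
def iter_column_rows(columns):
    """Yield one dict per row from column-wise data.

    columns maps column names to equal-length sequences (a hypothetical
    stand-in for the per-column arrays of a column-oriented table).
    """
    names = list(columns)
    # zip(*...) walks all columns in lockstep, producing one tuple per row.
    for values in zip(*(columns[name] for name in names)):
        yield dict(zip(names, values))


# Illustrative data with two columns of three values each.
vaex_like = {
    "id": [1, 2, 3],
    "parallax": [0.5, 1.25, 2.0],
}
column_rows = list(iter_column_rows(vaex_like))
```

Walking the columns in lockstep keeps memory use low for the row side, since only one row dictionary exists at a time while iterating.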