gavo.rscdef.dddef module

Definition of data.

Data descriptors describe what to do with data. They contain a grammar, information on where to obtain source data from, and “makes”, a specification of the tables to be generated and how they are made from the grammar output.

class gavo.rscdef.dddef.DataDescriptor(parent, **kwargs)[source]

Bases: Structure, ComputedMetaMixin, IVOMetaMixin, PublishableDataMixin, ExpansionDelegator

A description of how to process data from a given set of sources.

Data descriptors bring together a grammar, a source specification and “makes”, each giving a table and a rowmaker to feed the table from the grammar output.

They are the “executable” parts of a resource descriptor. Their ids are used as arguments to gavo imp for partial imports.

attrSeq = [<gavo.base.meta.MetaAttribute object>, <gavo.base.complexattrs.StructListAttribute object>, <gavo.base.complexattrs.StructAttribute object>, <gavo.base.complexattrs.StructListAttribute object>, <gavo.base.complexattrs.MultiStructAttribute object>, <gavo.base.complexattrs.StructListAttribute object>, <gavo.rscdef.common.ColumnListAttribute object>, <gavo.base.attrdef.BooleanAttribute object>, <gavo.base.complexattrs.PropertyAttribute object>, <gavo.rscdef.common.RDAttribute object>, <gavo.base.parsecontext.OriginalAttribute object>, <gavo.base.parsecontext.IdAttribute object>, <gavo.base.complexattrs.StructAttribute object>, <gavo.base.complexattrs.ListOfAtomsAttribute object>, <gavo.base.attrdef.BooleanAttribute object>, <gavo.base.attrdef.BooleanAttribute object>]
clearProperty(name)
completedCallbacks = []
copyShallowly()[source]

returns a shallow copy of self.

Sources are not copied.

getFullId()
getPrimary()[source]

returns the “primary” table definition in the data descriptor.

“primary” means the only table in a one-table DD, or the table with the role “primary” if there are several. If no matching table is found, a StructureError is raised.
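The selection rule described here can be sketched in a few lines. This is a minimal illustration, not the DaCHS implementation: `pick_primary`, the `(role, table)` pairs and the local `StructureError` class are hypothetical stand-ins for the data descriptor's makes and the real exception type.

```python
class StructureError(Exception):
    """Hypothetical stand-in for the StructureError named in the text."""


def pick_primary(makes):
    """Return the primary table from a list of (role, table) pairs.

    The pairs are an assumed simplification of a data descriptor's makes.
    """
    # A one-table DD: that table is primary, whatever its role.
    if len(makes) == 1:
        return makes[0][1]
    # Otherwise, look for the table explicitly given the role "primary".
    for role, table in makes:
        if role == "primary":
            return table
    # Nothing matched, so signal the error the documentation describes.
    raise StructureError("no primary table in this data descriptor")
```

With several tables and no "primary" role, the sketch raises rather than guessing, which mirrors the documented behaviour.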

getProperty(name, default=<Undefined>)
getTableDefById(id)[source]
getTableDefWithRole(role)[source]
getURL(rendName, absolute=True)[source]
hasProperty(name)
iterSources(connection=None)[source]
iterTableDefs()[source]

iterates over the definitions of all the tables built by this DD.

This will not include system tables.

managedAttrs = {'auto': <gavo.base.attrdef.BooleanAttribute object>, 'binaryGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'cdfHeaderGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'columnGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'contextGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'csvGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'customGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'dependents': <gavo.base.complexattrs.ListOfAtomsAttribute object>, 'dictlistGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'directGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'embeddedGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'fitsProdGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'fitsTableGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'freeREGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'grammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'hdf5Grammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'id': <gavo.base.parsecontext.IdAttribute object>, 'keyValueGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'make': <gavo.base.complexattrs.StructListAttribute object>, 'makes': <gavo.base.complexattrs.StructListAttribute object>, 'meta': <gavo.base.meta.MetaAttribute object>, 'meta_': <gavo.base.meta.MetaAttribute object>, 'mySQLDumpGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'nullGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'odbcGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'original': <gavo.base.parsecontext.OriginalAttribute object>, 'param': <gavo.rscdef.common.ColumnListAttribute object>, 'params': <gavo.rscdef.common.ColumnListAttribute object>, 'pdsGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'properties': <gavo.base.complexattrs.PropertyAttribute object>, 
'property': <gavo.base.complexattrs.PropertyAttribute object>, 'publish': <gavo.base.complexattrs.StructAttribute object>, 'rd': <gavo.rscdef.common.RDAttribute object>, 'reGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'recreateAfter': <gavo.base.complexattrs.ListOfAtomsAttribute object>, 'register': <gavo.base.complexattrs.StructAttribute object>, 'registration': <gavo.base.complexattrs.StructAttribute object>, 'remakeOnDataChange': <gavo.base.attrdef.BooleanAttribute object>, 'rowmaker': <gavo.base.complexattrs.StructListAttribute object>, 'rowmakers': <gavo.base.complexattrs.StructListAttribute object>, 'rowsetGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'sources': <gavo.base.complexattrs.StructAttribute object>, 'table': <gavo.base.complexattrs.StructListAttribute object>, 'tables': <gavo.base.complexattrs.StructListAttribute object>, 'transparentGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'unionGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'updating': <gavo.base.attrdef.BooleanAttribute object>, 'voTableGrammar': <gavo.base.complexattrs.MultiStructAttribute object>, 'xmlGrammar': <gavo.base.complexattrs.MultiStructAttribute object>}
metaModel = 'title(1), creationDate(1), description(1),subject, referenceURL(1)'
name_ = 'data'
onElementComplete()[source]
property parent
property rd
resType = 'data'
setProperty(name, value)
validate()[source]
class gavo.rscdef.dddef.IgnoreSpec(parent, **kwargs)[source]

Bases: Structure

A specification of sources to ignore.

Sources mentioned here are compared against the inputsDir-relative paths of the sources generated by the sources element (cf. Element sources). If there is a match, the corresponding source will not be processed.

You can get ignored files from various sources. If you give more than one source, the set of ignored files is the union of the individual sets.

fromdbUpdating is a bit special in that the query must return UTC timestamps of the file’s mtime during the last ingest in addition to the accrefs (see the reference documentation for an example).

Macros are expanded in the RD.

attrSeq = [<gavo.base.attrdef.UnicodeAttribute object>, <gavo.base.attrdef.UnicodeAttribute object>, <gavo.rscdef.common.ResdirRelativeAttribute object>, <gavo.base.parsecontext.IdAttribute object>, <gavo.base.complexattrs.ListOfAtomsAttribute object>, <gavo.rscdef.common.RDAttribute object>]
completeElement(ctx)[source]
completedCallbacks = []
property fromfile
getFullId()
ignoresVaryingStuff()[source]
isIgnored(path)[source]

returns true if path, made inputsDir-relative, should be ignored.
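The behaviour described for this class (several sets of ignored files combined by union, then a test against the inputsDir-relative path) can be sketched as follows. This is not DaCHS code: `build_is_ignored` and its parameters are hypothetical, and matching the patterns shell-style with `fnmatch` is an assumption made for the illustration.

```python
import fnmatch
import os


def build_is_ignored(inputs_dir, ignored_sets, patterns=()):
    """Return a predicate telling whether a source path should be skipped.

    inputs_dir, ignored_sets and patterns are hypothetical stand-ins for
    what an ignoreSources element collects.
    """
    # Several sources of ignored files combine as a set union.
    ignored = set()
    for one_set in ignored_sets:
        ignored |= set(one_set)

    def is_ignored(path):
        # Compare the inputsDir-relative form of the path, not the full path.
        rel_path = os.path.relpath(path, inputs_dir)
        if rel_path in ignored:
            return True
        # Assumption: patterns are shell-style and matched with fnmatch.
        return any(fnmatch.fnmatch(rel_path, pat) for pat in patterns)

    return is_ignored
```

Building the union once and returning a closure plays the role that prepare() has for isIgnored(): the expensive setup happens a single time, and each per-file check stays cheap.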

managedAttrs = {'fromdb': <gavo.base.attrdef.UnicodeAttribute object>, 'fromdbUpdating': <gavo.base.attrdef.UnicodeAttribute object>, 'fromfile': <gavo.rscdef.common.ResdirRelativeAttribute object>, 'id': <gavo.base.parsecontext.IdAttribute object>, 'pattern': <gavo.base.complexattrs.ListOfAtomsAttribute object>, 'patterns': <gavo.base.complexattrs.ListOfAtomsAttribute object>, 'rd': <gavo.rscdef.common.RDAttribute object>}
name_ = 'ignoreSources'
prepare(connection)[source]

sets attributes to speed up isIgnored()

property rd
class gavo.rscdef.dddef.Make(parent, **kwargs)[source]

Bases: Structure, ScriptingMixin

A build recipe for tables belonging to a data descriptor.

All makes belonging to a DD will be processed in the order in which they appear in the file.

acceptedScriptTypes = {'beforeDrop', 'newSource', 'postCreation', 'preCreation', 'preImport', 'preIndex', 'sourceDone'}
attrSeq = [<gavo.base.parsecontext.IdAttribute object>, <gavo.base.parsecontext.ReferenceAttribute object>, <gavo.base.attrdef.UnicodeAttribute object>, <gavo.base.attrdef.EnumeratedUnicodeAttribute object>, <gavo.base.parsecontext.ReferenceAttribute object>, <gavo.base.complexattrs.StructListAttribute object>, <gavo.base.parsecontext.ReferenceAttribute object>]
completedCallbacks = []
create(connection, parseOptions, tableFactory, **kwargs)[source]

returns a new empty instance of the table this is making.

getExpander()[source]

used by the scripts for expanding their source.

We always return the expander of the table being made.

managedAttrs = {'id': <gavo.base.parsecontext.IdAttribute object>, 'parmaker': <gavo.base.parsecontext.ReferenceAttribute object>, 'role': <gavo.base.attrdef.UnicodeAttribute object>, 'rowSource': <gavo.base.attrdef.EnumeratedUnicodeAttribute object>, 'rowmaker': <gavo.base.parsecontext.ReferenceAttribute object>, 'script': <gavo.base.complexattrs.StructListAttribute object>, 'scripts': <gavo.base.complexattrs.StructListAttribute object>, 'table': <gavo.base.parsecontext.ReferenceAttribute object>}
name_ = 'make'
onParentComplete()[source]
runParmakerFor(grammarParameters, destTable)[source]

feeds grammarParameters to destTable.

class gavo.rscdef.dddef.SourceSpec(parent, **kwargs)[source]

Bases: Structure

A specification of a data descriptor’s inputs.

This will typically be files taken from a file system. If so, DaCHS will, in each directory, process the files in alphabetical order. No guarantees are made as to the order in which directories are processed.

Multiple patterns are processed in the order given in the RD.
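The ordering rules above can be illustrated with a small sketch built on the standard library. It is not the DaCHS implementation: `iter_sources` here is a hypothetical free function, and expanding patterns with `glob` relative to a base directory is an assumption. Sorting the full paths also fixes a directory order, which is stronger than what the documentation promises.

```python
import glob
import os


def iter_sources(base_dir, patterns):
    """Yield source files for each pattern, in the order the patterns are given.

    base_dir and patterns are hypothetical stand-ins for what a sources
    element would provide.
    """
    for pattern in patterns:
        # Patterns are handled one after the other, in the order given.
        matches = glob.glob(os.path.join(base_dir, pattern))
        # Within a directory, files come out in alphabetical order.
        for path in sorted(matches):
            if os.path.isfile(path):
                yield path
```

Note that a file matching two patterns would be yielded twice in this sketch; whether DaCHS deduplicates such matches is not stated here.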

attrSeq = [<gavo.base.structure.DataContent object>, <gavo.base.parsecontext.IdAttribute object>, <gavo.base.complexattrs.StructAttribute object>, <gavo.base.complexattrs.ListOfAtomsAttribute object>, <gavo.base.parsecontext.OriginalAttribute object>, <gavo.base.complexattrs.ListOfAtomsAttribute object>, <gavo.base.attrdef.BooleanAttribute object>]
completeElement(ctx)[source]
completedCallbacks = []
iterSources(connection=None)[source]
managedAttrs = {'content_': <gavo.base.structure.DataContent object>, 'id': <gavo.base.parsecontext.IdAttribute object>, 'ignoreSources': <gavo.base.complexattrs.StructAttribute object>, 'ignoredSources': <gavo.base.complexattrs.StructAttribute object>, 'item': <gavo.base.complexattrs.ListOfAtomsAttribute object>, 'items': <gavo.base.complexattrs.ListOfAtomsAttribute object>, 'original': <gavo.base.parsecontext.OriginalAttribute object>, 'pattern': <gavo.base.complexattrs.ListOfAtomsAttribute object>, 'patterns': <gavo.base.complexattrs.ListOfAtomsAttribute object>, 'recurse': <gavo.base.attrdef.BooleanAttribute object>}
name_ = 'sources'