gavo.grammars.xmlgrammar module

A grammar for generic XML documents.

class gavo.grammars.xmlgrammar.XMLGrammar(parent, **kwargs)[source]

Bases: Grammar

A grammar parsing from generic XML files.

Use this grammar to parse from generic XML files. For now, one rawdict per document is returned (later extensions might let you define elements that will yield rows).

The keys are xpaths (e.g., root/element or root/element/@attr); the values are the (joined) text nodes that are immediate children of the element.

When elements are repeated within an element, [ct] is appended to the path element (e.g., root/element[0]).
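To make the shape of such a rawdict concrete, here is a minimal sketch using only the standard library. It is not the grammar's actual implementation: the function name flatten_to_rawdict is hypothetical, and it assumes that counters are zero-based and that an index is only added when a tag occurs more than once among its siblings. It also ignores namespaces and whitespace normalization.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def flatten_to_rawdict(xml_text):
    """Return a dict mapping path-like keys to joined immediate text.

    Hypothetical helper for illustration only; not part of gavo.
    """
    root = ET.fromstring(xml_text)
    result = {}

    def own_text(el):
        # Text directly inside el: its leading text plus the tail text
        # that follows each of its children.
        return "".join([el.text or ""] + [child.tail or "" for child in el])

    def walk(el, path):
        result[path] = own_text(el)
        for name, value in el.attrib.items():
            result[f"{path}/@{name}"] = value
        # Assumption: only tags repeated among siblings get an index.
        totals = Counter(child.tag for child in el)
        seen = Counter()
        for child in el:
            step = child.tag
            if totals[child.tag] > 1:
                step = f"{child.tag}[{seen[child.tag]}]"
                seen[child.tag] += 1
            walk(child, f"{path}/{step}")

    walk(root, root.tag)
    return result


sample = '<root><element attr="x">a</element><item>1</item><item>2</item></root>'
rawdict = flatten_to_rawdict(sample)
```

With this sample, rawdict has the keys root, root/element, root/element/@attr, root/item[0], and root/item[1].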

For now, this grammar ignores namespaces.

Because most of the keys are not valid Python identifiers, you cannot use the @key syntax when mapping this. Use vars[key] instead (or <map key="dest" source="path"/>).
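The point about identifiers can be checked in plain Python. In the following sketch, raw_vars is a hypothetical stand-in for the dictionary that is available as vars in mapping code; the keys and values are made up for illustration.

```python
# Hypothetical stand-in for the rawdict seen by mapping code.
raw_vars = {"root/element": "some text", "root/element/@attr": "42"}

key = "root/element/@attr"

# Slashes and @ are not allowed in Python identifiers, so a shorthand
# that expects an identifier cannot name this key.
key_is_identifier = key.isidentifier()

# Plain dictionary access works for any string key.
value = raw_vars[key]
```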

Do not use this for VOTables; use VOTableGrammar instead.

attrSeq = [<gavo.base.attrdef.UnicodeAttribute object>, <gavo.base.parsecontext.IdAttribute object>, <gavo.base.complexattrs.StructAttribute object>, <gavo.base.attrdef.BooleanAttribute object>, <gavo.base.parsecontext.OriginalAttribute object>, <gavo.base.complexattrs.PropertyAttribute object>, <gavo.rscdef.common.RDAttribute object>, <gavo.base.complexattrs.StructListAttribute object>, <gavo.base.complexattrs.StructAttribute object>]
clearProperty(name)
completedCallbacks = []
getFullId()
getProperty(name, default=<Undefined>)
hasProperty(name)
managedAttrs = {'enc': <gavo.base.attrdef.UnicodeAttribute object>, 'id': <gavo.base.parsecontext.IdAttribute object>, 'ignoreOn': <gavo.base.complexattrs.StructAttribute object>, 'normalizeWhitespace': <gavo.base.attrdef.BooleanAttribute object>, 'original': <gavo.base.parsecontext.OriginalAttribute object>, 'properties': <gavo.base.complexattrs.PropertyAttribute object>, 'property': <gavo.base.complexattrs.PropertyAttribute object>, 'rd': <gavo.rscdef.common.RDAttribute object>, 'rowfilter': <gavo.base.complexattrs.StructListAttribute object>, 'rowfilters': <gavo.base.complexattrs.StructListAttribute object>, 'sourceFields': <gavo.base.complexattrs.StructAttribute object>}
name_ = 'xmlGrammar'
property rd
rowIterator

alias of XMLRowIterator

setProperty(name, value)
class gavo.grammars.xmlgrammar.XMLRowIterator(grammar, sourceToken, sourceRow=None)[source]

Bases: RowIterator

An iterator for XMLGrammars.

gavo.grammars.xmlgrammar.iterEventsCounting(inputFile, normalizeWhitespace)[source]

Wraps etree.iterparse so that a [ct] index is appended to element names when they are repeated.

This currently takes some pains to strip namespaces, which probably just uglify the keys in almost all applications I can see for this.
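As an illustration of the namespace stripping mentioned here, the following sketch wraps the standard library's ElementTree.iterparse and reduces each tag to its local name. It is a simplified example under stated assumptions, not the code of iterEventsCounting: the name iter_events_without_namespaces is hypothetical, and repetition counting and whitespace normalization are left out.

```python
import io
import xml.etree.ElementTree as ET


def iter_events_without_namespaces(input_file):
    """Yield (event, local_tag, element) triples from ET.iterparse.

    Hypothetical helper for illustration only; not part of gavo.
    """
    for event, elem in ET.iterparse(input_file, events=("start", "end")):
        # ElementTree spells namespaced tags as "{uri}local"; keep "local".
        local_tag = elem.tag.rsplit("}", 1)[-1]
        yield event, local_tag, elem


doc = io.BytesIO(
    b'<v:root xmlns:v="http://example.org/ns"><v:child>x</v:child></v:root>'
)
events = [(event, tag) for event, tag, _ in iter_events_without_namespaces(doc)]
```

Dropping the namespace keeps keys short (root/child rather than a key containing the full namespace URI), at the cost of conflating elements that share a local name across namespaces.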