gavo.user.rdmanipulator module

Updating table and column metadata.

Originally, this was done by writing into RDs, and the bulk of the code still reflects that.

The problem here is that RDs typically are formatted with lots of love, also within elements – e.g., like this:

<column name="bla" type="text"
        ucd="foo.bar"
        description="A long text carefully
                broken at the right place"
/>

There’s no way one can coax a normal XML parser into giving events that’d allow us to preserve this formatting. Hence, when manipulating RD sources, I need something less sophisticated: the dumb XML parser implemented here.
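
To illustrate the formatting problem, here is a minimal sketch that round-trips the column element from above through the standard library's xml.etree.ElementTree. It is not part of this module; the names SOURCE and roundtrip are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# The carefully formatted element from the text above.
SOURCE = '''<column name="bla" type="text"
        ucd="foo.bar"
        description="A long text carefully
                broken at the right place"
/>'''


def roundtrip(xmlSource):
    """Parse xmlSource and serialize it again with the stdlib parser."""
    return ET.tostring(ET.fromstring(xmlSource), encoding="unicode")


result = roundtrip(SOURCE)
# The attributes survive, but the line breaks between and within
# them do not: everything ends up on a single line.
print(result)
```

The parser normalizes line breaks inside attribute values to spaces and never reports the whitespace between attributes at all, so there is nothing left from which the original layout could be restored.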

Except possibly for coverage (and even there I have my doubts) all this has turned out to be a bad idea, best shown by the endless trouble it causes with STREAMs. Hence, we’re moving towards stuffing everything computed by the system into the database. Once that’s done, this shouldn’t be called rdmanipulator any more.

class gavo.user.rdmanipulator.Attribute(t)[source]

Bases: list

a sentinel for XML attributes.

class gavo.user.rdmanipulator.Element(t)[source]

Bases: list

a sentinel for XML elements.

These are constructed with lists of the type [tag,…]; the opening (or empty) tag is always item 0.

append(newChild)[source]

Append object to the end of the list.

countElements(name)[source]

returns the number of name elements that are direct children of self.

findElement(name)[source]

returns the first element called name somewhere within self (an XML-grammar-parsed parse result).

This is a depth-first search, and it will return None if there is no such element.

getAttribute(name)[source]

returns the Attribute element with name within self.

If no such attribute exists, a KeyError is raised.
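
The difference between countElements (direct children only) and findElement (depth-first, None if nothing matches) can be sketched with a toy list subclass. This is not the module's implementation: SketchElement, tagName and el are hypothetical names, and the opener layout (a plain list with the tag name in second position) is an assumption made for the illustration.

```python
class SketchElement(list):
    """Hypothetical stand-in: item 0 is the opening tag, the rest is content."""

    def tagName(self):
        # Assumption: the opener is a list with the tag name in second position.
        return self[0][1]

    def countElements(self, name):
        # Only direct children are counted; nested elements are ignored.
        return sum(1 for child in self[1:]
            if isinstance(child, SketchElement) and child.tagName() == name)

    def findElement(self, name):
        # Depth-first: look at a child, then at its descendants,
        # before moving on to the next sibling.
        for child in self[1:]:
            if isinstance(child, SketchElement):
                if child.tagName() == name:
                    return child
                found = child.findElement(name)
                if found is not None:
                    return found
        return None


def el(name, *children):
    """Build a SketchElement with a made-up opener for name."""
    return SketchElement([["<", name, ">"]] + list(children))


tree = el("resource",
    el("table", el("column"), "some text", el("column")),
    el("table", el("column"), el("values")))

print(tree.countElements("table"))
print(tree.countElements("column"))
print(tree.findElement("values") is tree[2][2])
print(tree.findElement("nope"))
```

Whether the real findElement also considers the element it is called on is not stated here; the sketch only searches below it.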

class gavo.user.rdmanipulator.Manipulator[source]

Bases: object

a base class for processXML manipulators.

Pass instances of these into processXML. You must up-call the constructor without arguments.

Override the gotElement(parseResult) method to do what you want. The parseResult is a pyparsing object with the tag name in second position of the first matched thing and the attributes barely parsed out (if you need them, improve the parsing to get at the attributes with less effort.)

gotElement receives an entire element with opening tag, content, and closing tag (or just an empty tag). To manipulate the thing, just return what you want in the document.

There’s also startElement(parsedOpener) that essentially works analogously; you will, however, not receive startElements for empty elements, so that’s really intended for bookkeeping.

You also have a hasParent(tagName) method on Manipulators returning whether there’s a tagName element somewhere among the ancestors of the current tag.

gotElement(parsedElement)[source]

hasParent(name)[source]

startElement(parsedOpener)[source]

class gavo.user.rdmanipulator.NewElement(elementName, textContent)[source]

Bases: object

an element to be inserted into a parsed xml tree.

flatten()[source]

gavo.user.rdmanipulator.flatten(arg)[source]

returns a string from a (possibly edited) parse tree.

gavo.user.rdmanipulator.getAttribute(parseResult, name)[source]

returns the Attribute element with name within parseResult.

If no such attribute exists, a KeyError is raised.

gavo.user.rdmanipulator.getChangedRD(rdId, limits)[source]

returns a string corresponding to the RD with rdId with limits applied.

limits is a sequence of (table-id, column-name, min, max) tuples. We assume the values elements already exist.

gavo.user.rdmanipulator.getXMLGrammar(manipulator)[source]

gavo.user.rdmanipulator.iterCoverageItems(updater)[source]

yields coverage items for inclusion in RDs.

NOTE: so far, we can only have one coverage item. So, it’s enough to just say “fill this into axis x of coverage”. If and when we have more than one coverage item, we’ll have to rethink that. That’s why there’s the “reserved” value in the tuples. We’ll have to put something in there (presumably the index of the coverage element, but perhaps we’ll have a better identity at some point).

gavo.user.rdmanipulator.iterLimitsForRD(rd, tablesOnly)[source]

returns a list of values to fill in for an entire RD.

See iterLimitsForTable.

gavo.user.rdmanipulator.iterLimitsForTable(tableDef, tablesOnly)[source]

returns a list of values to fill into tableDef.

This will be empty if the table doesn’t exist. Otherwise, it will contain a tuple (“limit”, table-id, column-name, min, max) for every column with a reasonably numeric type that has min and max values.

The other thing that could come back (but currently only does for iterLimitsForRD) is (“coverage”, reserved, axis, literal); see iterCoverageItems for details.
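
The two item shapes described above can be told apart by their first member. The following is a minimal sketch of such a dispatch, not code from this module: splitItems is a hypothetical helper, and the table ids, column names, axis name and literal in the sample data are invented. The resulting limit tuples have the (table-id, column-name, min, max) shape that getChangedRD documents for its limits argument.

```python
def splitItems(items):
    """Separate ("limit", ...) items from ("coverage", ...) items.

    Returns a pair (limits, coverage); limits contains
    (table-id, column-name, min, max) tuples, coverage contains
    (axis, literal) tuples.
    """
    limits, coverage = [], []
    for item in items:
        kind = item[0]
        if kind == "limit":
            _, tableId, columnName, minVal, maxVal = item
            limits.append((tableId, columnName, minVal, maxVal))
        elif kind == "coverage":
            # reserved carries no information so far, so it is dropped here.
            _, reserved, axis, literal = item
            coverage.append((axis, literal))
        else:
            raise ValueError("Unknown item kind: %r" % (kind,))
    return limits, coverage


# Invented sample data in the two documented shapes.
items = [
    ("limit", "main", "ra", 0.5, 359.5),
    ("limit", "main", "dec", -89.0, 89.0),
    ("coverage", None, "temporal", "made-up literal"),
]

limits, coverage = splitItems(items)
print(limits)
print(coverage)
```

Raising on unknown kinds is a design choice of the sketch; it makes it obvious when a new item type appears that the consuming code does not handle yet.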

gavo.user.rdmanipulator.main()[source]

gavo.user.rdmanipulator.parseCmdLine()[source]

gavo.user.rdmanipulator.processXML(document, manipulator)[source]

processes an XML document with manipulator.

document is a string containing the XML, and the function returns the serialized XML. You’re doing yourself a favour if document is a unicode string.

manipulator is an instance of a class derived from Manipulator. There’s a secret handshake between Manipulator and the grammar, so you really need to inherit; just putting in the two methods won’t do.