gavo.rsc.data module

Making data out of descriptors and sources.

class gavo.rsc.data.DDDependencyGraph(dds, spanRDs=True)[source]

Bases: object

a graph giving the dependency structure between DDs.

This is constructed with a list of DDs.

From it, you can get a build sequence (least-depending things built first) or a destroy sequence (most-depending things destroyed first).

If you pass spanRDs=False, only DDs residing within the first DD’s RD are considered.

getBuildSequence()[source]
getDestroySequence()[source]
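
The relationship between the two sequences can be illustrated with a small stand-alone sketch. It does not use gavo at all: the DD ids and the helper functions build_sequence and destroy_sequence are hypothetical, and treating the destroy sequence as the exact reverse of the build sequence is an assumption made for illustration only.

```python
from graphlib import TopologicalSorter

# Hypothetical DD ids, each mapped to the set of DD ids it depends on.
dependencies = {
    "demo/q#import": set(),
    "demo/q#make_view": {"demo/q#import"},
    "demo/q#make_summary": {"demo/q#make_view"},
}


def build_sequence(deps):
    """returns the ids so that everything a DD depends on comes before it."""
    return list(TopologicalSorter(deps).static_order())


def destroy_sequence(deps):
    """returns the ids so that dependents come before what they depend on.

    Assumption: this is simply the reversed build sequence.
    """
    return list(reversed(build_sequence(deps)))


print(build_sequence(dependencies))
print(destroy_sequence(dependencies))
```

With a linear chain like this one, the build sequence starts with the DD nothing else is needed for, and the destroy sequence starts with the DD that depends on everything else.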
class gavo.rsc.data.Data(dd, tables, parseOptions=<ParseOptions validateRows=False maxRows=None keepGoing=False>, overrideMakes=None)[source]

Bases: MetaMixin, ParamMixin

A collection of tables.

Data, in essence, is the instantiation of a DataDescriptor.

It is what makeData returns. In typical one-table situations, you just want to call the getPrimaryTable() method to obtain the table built.

These also have an attribute contributingMetaCarriers, a list of base.MetaCarrier-s used by votablewrite to create Data Origin INFO-s. By default, that’s the first table. You can add to that attribute.

classmethod create(dd, parseOptions=<ParseOptions validateRows=False maxRows=None keepGoing=False>, connection=None)[source]

returns a new data instance for dd.

Existing tables on the database are not touched. To actually re-create them, call recreateTables.

classmethod createWithTable(dd, tableDef, parseOptions=<ParseOptions validateRows=False maxRows=None keepGoing=False>)[source]

builds a table for tableDef with this data item.

This is for when there are many rather similar table structures that can all be built with the same data item.

This can only work if dd only has one make.

dbCatalogChanged()[source]

returns true if a database table has been newly created by this class.

classmethod drop(dd, parseOptions=<ParseOptions validateRows=False maxRows=None keepGoing=False>, connection=None)[source]

drops all tables made by dd if necessary.

dropTables(parseOptions)[source]
getFeeder(**kwargs)[source]
getParam(paramName, default=<Not given/empty>)[source]

returns self’s parameter of paramName, or, failing that, paramName from self’s primaryTable.

getPrimaryTable()[source]

returns the table contained if there is only one, or the one with the role primary.

If no matching table can be found, raise a DataError.
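
The selection rule described for getPrimaryTable can be sketched without gavo as follows. The function pick_primary, the exception ToyDataError and the use of a plain dict from roles to tables are all hypothetical stand-ins; this is not the library's implementation, only a reading of the rule stated above.

```python
class ToyDataError(Exception):
    """a local stand-in for the DataError mentioned above (not gavo's class)."""


def pick_primary(tables_by_role):
    """returns the only table, or the one stored under the role "primary"."""
    if len(tables_by_role) == 1:
        # A single table is the primary one regardless of its role.
        return next(iter(tables_by_role.values()))
    if "primary" in tables_by_role:
        return tables_by_role["primary"]
    raise ToyDataError("No primary table in this data")


print(pick_primary({"aux": "only table"}))
print(pick_primary({"primary": "main table", "aux": "side table"}))
```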

getTableWithRole(role)[source]
recreateTables(connection)[source]

drops and recreates all tables that are onDisk.

System tables are only recreated when the systemImport parseOption is true.

runScripts(phase, **kwargs)[source]
updateMeta()[source]
validateParams()[source]

raises a ValidationError if any required parameters within this data’s tables are still None.

class gavo.rsc.data.MultiForcedSources(seq)[source]

Bases: object

This lets you pass in arbitrary sequences as forceSource in makeData.

Without this, the list will be interpreted as a single source.

iterSources(connection)[source]
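
Why a wrapper is needed at all can be shown with a toy dispatcher. Nothing here is gavo code: ToyMultiSources and iter_forced are hypothetical names, and the dispatch on the presence of an iterSources method is an assumption based on the description of forceSource in makeData.

```python
class ToyMultiSources:
    """a minimal stand-in offering the iterSources(connection) method."""

    def __init__(self, seq):
        self.seq = seq

    def iterSources(self, connection):
        return iter(self.seq)


def iter_forced(forceSource, connection=None):
    """yields the sources a (hypothetical) importer would process."""
    if hasattr(forceSource, "iterSources"):
        # Wrapped sequences are unpacked into individual sources.
        yield from forceSource.iterSources(connection)
    else:
        # Anything else, including a bare list, counts as one source.
        yield forceSource


print(list(iter_forced(["a.txt", "b.txt"])))
print(list(iter_forced(ToyMultiSources(["a.txt", "b.txt"]))))
```

The bare list comes back as one item, while the wrapped list is unpacked into two separate sources.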
gavo.rsc.data.makeCombinedData(baseDD, tablesForRoles)[source]

returns a Data instance containing all of tablesForRoles.

A DD is generated based on baseDD; if baseDD has any tables, they are discarded.

tablesForRoles is a mapping from strings (one of which should be “primary”) to tables; the strings end up as roles.

gavo.rsc.data.makeData(dd, parseOptions=<ParseOptions validateRows=False maxRows=None keepGoing=False>, forceSource=None, connection=None, data=None, runCommit=True)[source]

returns a data instance built from dd.

It will arrange for the parsing of all tables generated from dd’s grammar.

If database tables are being made, you must pass in a connection. The entire operation will then run within a single transaction within this connection (except for building dependents; they will be built in separate transactions).

The connection will be rolled back or committed depending on the success of the operation (unless you pass runCommit=False, in which case even a successful import will not be committed).

You can pass in a data instance created by yourself in data. This makes sense if you want to, e.g., add some meta information up front.

makeData will usually iterate over the sources given in dd. You can override this with forceSource, which can contain a single source passed to a grammar. If you need to pass in multiple sources, use a MultiForcedSources object (or anything that has an iterSources(dbConnection) method).
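
The commit and rollback behaviour described for makeData can be illustrated with the standard library's sqlite3 module. This is only a sketch of the pattern: run_import, the table t and the use of SQLite are hypothetical and say nothing about how gavo talks to its database.

```python
import sqlite3


def run_import(connection, rows, runCommit=True):
    """inserts rows in one transaction; a toy stand-in for an import."""
    try:
        for row in rows:
            connection.execute("INSERT INTO t (v) VALUES (?)", (row,))
    except Exception:
        # On any failure, nothing from this import survives.
        connection.rollback()
        raise
    if runCommit:
        connection.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (v INTEGER NOT NULL)")
conn.commit()

run_import(conn, [1, 2])  # succeeds and is committed

try:
    run_import(conn, [3, None])  # None violates NOT NULL: rolled back
except sqlite3.IntegrityError:
    pass

run_import(conn, [5], runCommit=False)  # succeeds, but stays uncommitted
conn.rollback()  # the caller decides; here the row is discarded

count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(count)
```

Only the two rows from the first call remain: the failed import was rolled back as a whole, and the uncommitted one was discarded by the caller.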

gavo.rsc.data.makeDataById(ddId, parseOptions=<ParseOptions validateRows=False maxRows=None keepGoing=False>, connection=None, inRD=None)[source]

returns the data set built from the DD with ddId (which must be fully qualified).

gavo.rsc.data.makeDependentsFor(dds, parseOptions, connection, sysCatChanged)[source]

rebuilds all data dependent on one of the DDs in the dds sequence.

gavo.rsc.data.processSource(data, source, feeder, opts, connection=None)[source]

ingests source into the Data instance data.

If this builds database tables, you must pass in a connection object.

If opts.keepGoing is True, the system will continue importing even if a particular source has caused an error. In that case, everything contributed by the bad source is rolled back (this will only work when filling database tables).
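
One way to get this kind of per-source rollback is a database savepoint around each source. The sketch below uses sqlite3 from the standard library to show the idea. Whether gavo uses savepoints is not stated here, so that is an assumption, and ingest_all, the table t and the file names are hypothetical.

```python
import sqlite3


def ingest_all(connection, sources, keepGoing=False):
    """feeds each (name, rows) pair into t; returns names of failed sources."""
    failed = []
    for name, rows in sources:
        connection.execute("SAVEPOINT one_source")
        try:
            for row in rows:
                connection.execute(
                    "INSERT INTO t (src, v) VALUES (?, ?)", (name, row))
        except sqlite3.Error:
            # Undo only what this source contributed, then close the savepoint.
            connection.execute("ROLLBACK TO SAVEPOINT one_source")
            connection.execute("RELEASE SAVEPOINT one_source")
            if not keepGoing:
                raise
            failed.append(name)
        else:
            connection.execute("RELEASE SAVEPOINT one_source")
    return failed


# isolation_level=None: transactions are controlled explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (src TEXT, v INTEGER NOT NULL)")

conn.execute("BEGIN")
failed = ingest_all(
    conn,
    [
        ("good.txt", [1, 2]),
        ("bad.txt", [3, None]),  # None violates NOT NULL
        ("also_good.txt", [4]),
    ],
    keepGoing=True,
)
conn.execute("COMMIT")

kept = [r[0] for r in conn.execute("SELECT v FROM t ORDER BY v")]
print(failed, kept)
```

The row with value 3 disappears along with the rest of bad.txt, while the rows from the other two sources are kept.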

gavo.rsc.data.wrapTable(table, rdSource=None, resTypeDefault='results')[source]

returns a Data instance containing only table (or table itself if it is already a Data instance).

If table has no rd, you must pass rdSource, which must be an object having an rd attribute (rds, tabledefs, etc., work).

resTypeDefault will be used as the new data item’s _type meta. If you want to override that later, use setMeta(“_type”…) rather than addMeta.

This will grab info meta from the table.