gavo.user.limits module

Updating table and column metadata.

While column statistics can be explicitly defined in values elements (and there may be cases when manually defining them makes sense), the typical case is to gather statistics from the database and keep them in a few tables in the dc schema.

Starting with DaCHS 2.3.1 (schema version 27), there’s dc.simple_col_stats for floats and “2 sigma” statistics.
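As a rough illustration of what per-column statistics of this kind could contain, the following sketch computes extrema, median, "2 sigma" bounds and a fill factor in plain Python. The function names, the result keys, and the choice of tail fraction (the Gaussian 2-sigma value of about 2.275%) are assumptions made for this example; they are not taken from DaCHS or from the layout of dc.simple_col_stats.

```python
import math


def percentile(sorted_vals, frac):
    """Linearly interpolated percentile of an already sorted, non-empty list.

    frac is a fraction in [0, 1], e.g. 0.5 for the median.
    """
    if not sorted_vals:
        raise ValueError("need at least one value")
    pos = frac * (len(sorted_vals) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    weight = pos - lo
    return sorted_vals[lo] * (1 - weight) + sorted_vals[hi] * weight


def simple_col_stats(values, tail=0.02275):
    """Summarise a numeric column; None stands in for SQL NULL.

    tail is the fraction cut off at each end for the "2 sigma" bounds
    (an assumption for this sketch, not a DaCHS setting).
    """
    present = sorted(v for v in values if v is not None)
    if not present:
        return {"min": None, "max": None, "median": None,
                "low_2sigma": None, "high_2sigma": None, "fill_factor": 0.0}
    return {
        "min": present[0],
        "max": present[-1],
        "median": percentile(present, 0.5),
        "low_2sigma": percentile(present, tail),
        "high_2sigma": percentile(present, 1 - tail),
        # share of rows that actually carry a value
        "fill_factor": len(present) / len(values),
    }
```

Bounds like these are less sensitive to a few outliers than the plain minimum and maximum, which is presumably why they are kept in addition to the extrema.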

Starting with DaCHS 2.5.2 (schema version 30), there’s in addition dc.string_col_dist for statistics of enumerated string columns.
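For enumerated string columns, the natural statistic is how often each distinct value occurs. A minimal sketch in plain Python follows; the function name and the return shape (value, relative frequency) are hypothetical and do not describe the actual columns of dc.string_col_dist.

```python
from collections import Counter


def string_col_dist(values):
    """Return (value, relative frequency) pairs, most frequent first.

    None stands in for SQL NULL and is ignored; an empty or all-NULL
    column yields an empty list.
    """
    present = [v for v in values if v is not None]
    counts = Counter(present)
    total = len(present)
    return [(val, n / total) for val, n in counts.most_common()]
```

For instance, `string_col_dist(["a", "b", "a", None, "a"])` gives `[("a", 0.75), ("b", 0.25)]`.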

The actual acquisition of the statistics is currently done in (and should probably move to rscdef).

gavo.user.limits.dumpStatsForRD(rd, conn)

writes metadata for rd and its tables

gavo.user.limits.dumpTableLevelStats(td, conn)

writes limits metadata for the table td.


yields coverage items for inclusion in RDs.

NOTE: so far, we can only have one coverage item. So, it’s enough to just say “fill this into axis x of coverage”. If and when we have more than one coverage item, we’ll have to rethink that. That’s why there’s the “reserved” value in the tuples. We’ll have to put something in there (presumably the index of the coverage element, but perhaps we’ll have a better identity at some point).

gavo.user.limits.updateForRD(rd, conn, samplePercent=None, acquireColumnMeta=True)

obtains RD- and table-level metadata for rd and writes it to the metadata tables through conn.
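A call from a script might look roughly like the following sketch. The wrapper name refresh_rd_stats and its parameters are hypothetical; obtaining the RD with api.getRD and the connection with api.getWritableAdminConn is an assumption about how the caller gets hold of rd and conn, since this page only documents the signature of updateForRD itself.

```python
def refresh_rd_stats(rd_id, sample_percent=None):
    """Refresh RD- and table-level metadata for the RD with id rd_id.

    Hypothetical helper, not part of gavo.user.limits.
    """
    # Imported lazily so this sketch can be loaded without DaCHS installed.
    from gavo import api
    from gavo.user import limits

    rd = api.getRD(rd_id)
    # Assumption: a writable administrative connection is what conn needs
    # to be, because the metadata tables live in the dc schema.
    with api.getWritableAdminConn() as conn:
        limits.updateForRD(rd, conn, samplePercent=sample_percent)
```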

gavo.user.limits.updateRDLevelMetadata(rd, conn)

Determines RD-level metadata (coverage, mainly) and inserts it into dc.rds.

gavo.user.limits.updateTableLevelStats(td, conn, samplePercent=None, acquireColumnMeta=True)

determines column metadata for the table td and inserts it into dc.*stats.

samplePercent, if given, says how much of the table to look at; giving this on views will fail.

If acquireColumnMeta is False, only the size of the table is estimated.
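One plausible explanation for the restriction on views, stated here as an assumption rather than something this page confirms, is that sampling relies on PostgreSQL's TABLESAMPLE clause, which PostgreSQL accepts only on tables and materialized views. The sketch below shows how such a clause could be attached to a query; the helper names are hypothetical and this is not DaCHS code.

```python
def sample_clause(sample_percent=None):
    """Return a TABLESAMPLE clause for sample_percent, or "" for no sampling."""
    if sample_percent is None:
        return ""
    if not 0 < sample_percent <= 100:
        raise ValueError("sample_percent must be in (0, 100]")
    # SYSTEM samples whole storage pages: fast, but only approximate.
    return " TABLESAMPLE SYSTEM ({:g})".format(sample_percent)


def count_query(table_name, sample_percent=None):
    """Build a row-count query over all or a sample of table_name.

    table_name is interpolated verbatim, so it must come from a trusted
    source such as a table definition, never from user input.
    """
    return "SELECT COUNT(*) FROM {}{}".format(
        table_name, sample_clause(sample_percent))
```

With a sample, counts and distributions obtained this way are estimates and have to be scaled or interpreted accordingly.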