gavo.protocols.datapack module

Dumping our resources to frictionless data packages (henceforth: datapack) and loading from them again.


DaCHS-generated RDs can be recognised by the presence of a dachs-rd-id key in the global metadata. Also, we will always write the RD as the first resource; for good measure, we also mark it by having a dachs-resource-descriptor name.
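The two markers described above can be checked on a parsed descriptor roughly as follows. This is a minimal sketch, not DaCHS code: the helper name looksLikeDaCHSPackage and its strict flag are hypothetical, the sample RD id and paths are made up, and the sketch assumes the descriptor is a parsed datapackage.json with the usual frictionless resources list.

```python
def looksLikeDaCHSPackage(descriptor: dict, strict: bool = False) -> bool:
    """Hypothetical helper: True if descriptor carries the DaCHS markers."""
    # The dachs-rd-id key in the global metadata is the primary marker.
    if "dachs-rd-id" not in descriptor:
        return False
    if not strict:
        return True
    # For good measure, the RD is the first resource and has a fixed name.
    resources = descriptor.get("resources") or []
    return bool(resources) and (
        resources[0].get("name") == "dachs-resource-descriptor")


# Illustrative descriptor; the RD id and paths are invented for this sketch.
sample = {
    "name": "example",
    "dachs-rd-id": "myres/q",
    "resources": [
        {"name": "dachs-resource-descriptor", "path": "q.rd"},
        {"name": "data-1", "path": "data/a.fits"},
    ],
}
print(looksLikeDaCHSPackage(sample, strict=True))
print(looksLikeDaCHSPackage({"name": "foreign", "resources": []}))
```

In a DaCHS installation, getPackageMeta is the documented way to get at the DaCHS-specific metadata; the sketch only illustrates the convention.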

gavo.protocols.datapack.create(args: list)

gavo.protocols.datapack.dumpPackage(rdId: str, destFile) → None

writes a zip of the complete data package for a resource descriptor to destFile.

destFile can be anything that zipfile.ZipFile accepts in w mode.
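To illustrate what kind of object qualifies as destFile, the following sketch writes a zip into an in-memory buffer with the standard library alone. It does not call dumpPackage (which needs a DaCHS installation); the member name datapackage.json and its empty content are placeholders.

```python
import io
import zipfile

# zipfile.ZipFile takes either a file system path or a writable binary
# file-like object; an in-memory buffer is used here.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    # Placeholder member; a real datapack would contain the descriptor,
    # the RD, and the ancillary files.
    archive.writestr("datapackage.json", "{}")

# Read the archive back to see what was written.
with zipfile.ZipFile(io.BytesIO(buffer.getvalue())) as archive:
    names = archive.namelist()
print(names)
```

Going by the signature above, such a buffer (or a path string) would presumably be passed as the second argument of dumpPackage.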

gavo.protocols.datapack.getPackageMeta(packageName: str) → dict

returns a dict of DaCHS-specific metadata items from a DaCHS-produced data package.

gavo.protocols.datapack.getRDForDump(rdId: str) → gavo.rscdesc.RD

loads an RD for later dumping.

The main thing this does is instrument ResdirRelativeAttribute (and possibly later other things) to record what ancillary data the RD has loaded.

This is, of course, not thread-safe or anything, and it could collect false positives when RDs reference or include other RDs.

Only use it while making datapacks.

gavo.protocols.datapack.iterExtraResources(rd: gavo.rscdesc.RD, cleanPath: Callable[[str], str]) → Generator[dict, None, None]

yields datapack resources from the datapack-extrafiles property.

This is a json sequence, and files are only returned if they exist. Directories are ignored.
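The filtering rule described here (a JSON sequence of paths, existing files only, directories skipped) can be sketched with the standard library. This is not the DaCHS implementation: iterExistingFiles is a hypothetical stand-in that yields plain path strings instead of datapack resource dicts, it does not model the cleanPath callback, and it assumes the listed paths are relative to a base directory.

```python
import json
import os
import tempfile


def iterExistingFiles(propertyValue: str, baseDir: str):
    """Hypothetical stand-in: yield the listed paths that are regular files."""
    for relPath in json.loads(propertyValue):
        # os.path.isfile is False for missing paths and for directories,
        # so both are skipped silently.
        if os.path.isfile(os.path.join(baseDir, relPath)):
            yield relPath


with tempfile.TemporaryDirectory() as baseDir:
    # One regular file, one directory, and one path that does not exist.
    os.mkdir(os.path.join(baseDir, "doc"))
    with open(os.path.join(baseDir, "README.txt"), "w") as f:
        f.write("hello\n")
    prop = json.dumps(["README.txt", "doc", "missing.dat"])
    found = list(iterExistingFiles(prop, baseDir))
print(found)
```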

gavo.protocols.datapack.iterRDResources(rd: gavo.rscdesc.RD) → Generator[dict, None, None]

yields datapack resource descriptions for the RD and all ancillary files we can discover.

All path names here are relative to the RD. Anything that is not in the RD will not be exported (without serious trickery, that is).

gavo.protocols.datapack.load(args: list)

does the CLI interaction.

gavo.protocols.datapack.makeBasicMeta(rd: gavo.rscdesc.RD) → dict

returns a basic, resource-less, datapack descriptor from an RD.

gavo.protocols.datapack.makeDescriptor(rd: gavo.rscdesc.RD) → dict

returns a datapack descriptor in a python dictionary.

gavo.protocols.datapack.namer(template: str) → Callable[[], int]