Package gavo :: Package helpers :: Module processing :: Class FileProcessor

Class FileProcessor

An abstract base for a source file processor.

In concrete classes, you need to define a ``process(fName)`` method receiving a source as returned by the data descriptor (dd), which is usually a file name.

You can override the method ``_createAuxiliaries(dataDesc)`` to compute things like source catalogues, etc. Thus, you should not need to override the constructor.

These objects are usually constructed through ``api.procmain`` as discussed in :dachsdoc:`processing.html`.
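
The subclassing pattern described above can be sketched as follows. To keep the example runnable without a DaCHS installation, ``SketchProcessor`` is a hypothetical stand-in that only mimics the two hooks named above; it is not the gavo class. The constructor body, the use of a plain dict as ``dd``, and the ``skipPrefix`` key are all assumptions made for illustration.

```python
class SketchProcessor(object):
    """Hypothetical stand-in for FileProcessor (not the gavo class)."""

    def __init__(self, opts, dd):
        self.opts, self.dd = opts, dd
        # The constructor calls the hook, so subclasses need not override it.
        self._createAuxiliaries(dd)

    def _createAuxiliaries(self, dataDesc):
        # Default: nothing to precompute.
        pass

    def process(self, fName):
        raise NotImplementedError("Concrete processors must define process")


class LineCounter(SketchProcessor):
    """Example concrete processor: counts the data lines of each source."""

    def _createAuxiliaries(self, dataDesc):
        # Assumption: dataDesc is a plain dict here; a real dd is richer.
        self.skipPrefix = dataDesc.get("skipPrefix", "#")

    def process(self, fName):
        # Count every line that does not start with the skip prefix.
        with open(fName) as f:
            return sum(1 for ln in f if not ln.startswith(self.skipPrefix))
```

In a real resource, the processor would derive from FileProcessor and be run through ``api.procmain`` rather than instantiated by hand.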

Instance Methods
__init__(self, opts, dd)
  x.__init__(...) initializes x; see help(type(x)) for signature
classify(self, fName)
process(self, fName)
addClassification(self, fName)
printTableSize(self)
printReport(self, processed, ignored)
printVerboseReport(self, processed, ignored)
iterJobs(self, nParallel)
  executes process() in parallel for all sources and iterates over the results.
iterates over all identifiers that should be processed.
calls the process method of processor for all sources of the data descriptor dd.
getProductKey(self, srcName)

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

Static Methods
addOptions(parser)
Class Variables
  inputsDir = base.getConfig("inputsDir")

Inherited from object: __class__

Method Details

__init__(self, opts, dd)

x.__init__(...) initializes x; see help(type(x)) for signature

Overrides: object.__init__
(inherited documentation)

iterJobs(self, nParallel)

executes process() in parallel for all sources and iterates over the results.

This is used instead of multiprocessing's Pool because Pool cannot call methods; iterJobs works around that limitation.
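
The idea of running ``process()`` for several sources at once and iterating over the results as they arrive can be illustrated with a small sketch. Note that this is not how iterJobs is implemented: it swaps in threads from the standard library's ``concurrent.futures`` instead of separate processes, simply because an executor accepts bound methods directly. ``LengthProcessor`` and ``iterResults`` are hypothetical names invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


class LengthProcessor(object):
    """Hypothetical processor; process() stands in for real per-file work."""

    def process(self, fName):
        return fName, len(fName)

    def iterResults(self, sources, nParallel):
        # Bound methods can be submitted directly to a thread pool.
        with ThreadPoolExecutor(max_workers=nParallel) as pool:
            futures = [pool.submit(self.process, src) for src in sources]
            # Yield each result as soon as its worker finishes,
            # so the order of results is not guaranteed.
            for future in as_completed(futures):
                yield future.result()
```

Because results are yielded in completion order, callers that need a stable order should sort or key them by source name.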


iterates over all identifiers that should be processed.

These are usually the paths of the files to be processed. You can, however, override this method to yield something else if that fits your problem; for example, previews in SSA use the accref.
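
Overriding the identifier iteration might look like the sketch below. The classes are hypothetical stand-ins rather than gavo code, the method name ``iterIdentifiers`` is an assumption (this page does not show the name), and treating an accref as the path relative to the inputs directory is likewise an assumption made for illustration.

```python
import posixpath


class PathProcessor(object):
    """Hypothetical stand-in; not the gavo FileProcessor."""

    def __init__(self, paths, inputsDir):
        self.paths = paths
        self.inputsDir = inputsDir

    def iterIdentifiers(self):  # assumed method name
        # Default behaviour: the identifiers are the source paths themselves.
        for path in self.paths:
            yield path


class AccrefProcessor(PathProcessor):
    """Yields accref-like identifiers instead of full paths."""

    def iterIdentifiers(self):
        # Assumption: an accref is the path relative to inputsDir.
        for path in self.paths:
            yield posixpath.relpath(path, self.inputsDir)
```

Whatever the override yields is what ``process`` would then receive, so the two methods need to agree on the kind of identifier in use.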