Package gavo :: Package helpers :: Module processing :: Class FileProcessor

Class FileProcessor


object --+
         |
        FileProcessor

An abstract base for a source file processor.

In concrete classes, you need to define a ``process(fName)`` method that receives a source as returned by the data descriptor (dd), which is usually a file name.

You can override the method ``_createAuxiliaries(dataDesc)`` to compute things like source catalogues, etc. Thus, you should not need to override the constructor.

These objects are usually constructed through ``api.procmain`` as discussed in :dachsdoc:`processing.html`.
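The subclassing pattern described above can be sketched as follows. This is a minimal illustration, not DaCHS code: ``StubFileProcessor`` is a hypothetical stand-in that imitates only what this page states (the constructor takes ``opts`` and ``dd`` and calls ``_createAuxiliaries(dd)``), and the dictionary used as ``dd``, its keys, and the return value of ``processAll`` are assumptions made so the example runs without a DaCHS installation.

```python
class StubFileProcessor(object):
    """Hypothetical stand-in for FileProcessor; not the DaCHS class."""

    def __init__(self, opts, dd):
        self.opts, self.dd = opts, dd
        # The hook runs inside the constructor, so subclasses
        # do not need to override __init__.
        self._createAuxiliaries(dd)

    def _createAuxiliaries(self, dd):
        pass

    def iterIdentifiers(self):
        # Assumption: the stand-in dd keeps its sources in a plain list.
        return iter(self.dd["sources"])

    def process(self, fName):
        raise NotImplementedError("define process(fName) in concrete classes")

    def processAll(self):
        # Assumption: collecting return values is a choice of this sketch.
        return [self.process(fName) for fName in self.iterIdentifiers()]


class CatalogueChecker(StubFileProcessor):
    """Concrete processor: reports whether each source is in a catalogue."""

    def _createAuxiliaries(self, dd):
        # Computed once, before any source is processed.
        self.catalogue = set(dd["catalogue"])

    def process(self, fName):
        return fName, fName in self.catalogue


dd = {"sources": ["a.fits", "b.fits"], "catalogue": ["a.fits"]}
results = CatalogueChecker(opts=None, dd=dd).processAll()
```

In an actual script the class would derive from ``FileProcessor`` and be constructed through ``api.procmain`` rather than by hand.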

Instance Methods
 
__init__(self, opts, dd)
x.__init__(...) initializes x; see help(type(x)) for signature
 
classify(self, fName)
 
process(self, fName)
 
addClassification(self, fName)
 
printTableSize(self)
 
printReport(self, processed, ignored)
 
printVerboseReport(self, processed, ignored)
 
iterJobs(self, nParallel)
executes process() in parallel for all sources and iterates over the results.
 
iterIdentifiers(self)
iterates over all identifiers that should be processed.
 
processAll(self)
calls the process method of processor for all sources of the data descriptor dd.
 
getProductKey(self, srcName)

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

Static Methods
 
addOptions(parser)

Class Variables
 
inputsDir = base.getConfig("inputsDir")

Properties

Inherited from object: __class__

Method Details

__init__(self, opts, dd)
(Constructor)


x.__init__(...) initializes x; see help(type(x)) for signature

Overrides: object.__init__
(inherited documentation)

iterJobs(self, nParallel)


executes process() in parallel for all sources and iterates over the results.

We use this rather than multiprocessing's Pool because Pool cannot call methods; this method works around that limitation.
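The general shape of "run ``process`` for every source in parallel and iterate over the results" can be sketched with the standard library. Note that this sketch deliberately uses a thread pool from ``concurrent.futures`` instead of worker processes, so it is not how ``iterJobs`` itself is implemented; ``Doubler`` and ``iter_results`` are hypothetical names introduced only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


class Doubler(object):
    """Hypothetical processor with a process() method."""

    def process(self, srcId):
        return srcId, len(srcId) * 2


def iter_results(processor, identifiers, n_parallel):
    """Yields processor.process(srcId) results as workers finish.

    Results arrive in completion order, not in input order.
    """
    with ThreadPoolExecutor(max_workers=n_parallel) as pool:
        # Bound methods can be submitted directly to a thread pool.
        futures = [pool.submit(processor.process, srcId)
                   for srcId in identifiers]
        for future in as_completed(futures):
            yield future.result()


# Sort because completion order is not deterministic.
results = sorted(iter_results(Doubler(), ["a", "bb", "ccc"], 2))
```

Threads keep the sketch short and avoid any pickling of the processor object; for CPU-bound processing, separate processes (as the method above uses) are usually the better fit.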

iterIdentifiers(self)


iterates over all identifiers that should be processed.

These are usually the paths of the files to be processed. You can, however, override this method to yield something else if that fits your problem (for example, previews in SSA use the accref).
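Overriding ``iterIdentifiers`` might look like the sketch below. ``StubProcessor`` is a hypothetical stand-in, not the DaCHS class, and the way an accref is derived here (the path with a leading ``/inputs/`` removed) is an assumption made purely for illustration.

```python
class StubProcessor(object):
    """Hypothetical stand-in for FileProcessor; not the DaCHS class."""

    def __init__(self, sources):
        self.sources = sources

    def iterIdentifiers(self):
        # Default behaviour as described above: iterate over file paths.
        return iter(self.sources)

    def process(self, identifier):
        return "processed " + identifier

    def processAll(self):
        return [self.process(i) for i in self.iterIdentifiers()]


class PreviewProcessor(StubProcessor):
    """Hands access references rather than file paths to process()."""

    def iterIdentifiers(self):
        # Assumption for this sketch: an accref is the path without
        # the leading inputs directory.
        prefix = "/inputs/"
        return (path[len(prefix):] for path in self.sources)


paths = ["/inputs/myres/data/a.fits", "/inputs/myres/data/b.fits"]
by_path = StubProcessor(paths).processAll()
by_accref = PreviewProcessor(paths).processAll()
```

Only ``iterIdentifiers`` changes; ``process`` and ``processAll`` work on whatever identifiers it yields.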