10.2.9. InputDataResolution: giving job access to the data
When a job needs access to data, there are two ways the data can be accessed:
either by downloading the files to the local worker node,
or by reading the data remotely (also known as streaming).
The resolution is done in the JobWrapper (see DIRAC jobs: definitions). By default, the resolution logic is implemented in InputDataResolution. It can be overridden in the job JDL (see InputDataModule in Job Description Language Reference), or by the /Operations/<>/InputDataPolicy/InputDataModule parameter.
You can look into this class for more details, but to summarize:
It will look into the job JDL to see whether it can find an InputDataPolicy option. If so, it will use that as the module.
If not, it will check whether a policy is defined for the site the job is running on (in /Operations/InputDataPolicy/<site>).
If not, it will run the default policy specified in /Operations/InputDataPolicy/Default.
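The lookup order summarized above can be sketched in a few lines of Python. This is only an illustration of the three-step fallback, not the actual InputDataResolution code: the function name resolve_input_data_module, the plain dictionaries standing in for the job JDL and for the /Operations/InputDataPolicy section, and the site and module names are all assumptions made for the example.

```python
from typing import Mapping, Optional


def resolve_input_data_module(
    jdl: Mapping[str, str],
    policy_section: Mapping[str, str],
    site: str,
) -> Optional[str]:
    """Return the module name chosen by the three-step lookup, or None.

    Hypothetical helper: `jdl` mimics the job JDL parameters and
    `policy_section` mimics the /Operations/InputDataPolicy section.
    """
    # 1. An InputDataPolicy option set in the job JDL takes precedence.
    module = jdl.get("InputDataPolicy")
    if module:
        return module
    # 2. Otherwise, use the policy defined for the site we are running on.
    module = policy_section.get(site)
    if module:
        return module
    # 3. Otherwise, fall back to the Default policy (None if it is not set).
    return policy_section.get("Default")


# Made-up configuration values, for illustration only.
policy_section = {
    "Default": "example.DownloadInputData",
    "LCG.Example.org": "example.InputDataByProtocol",
}

# No JDL option, site-specific policy defined: the site policy is used.
print(resolve_input_data_module({}, policy_section, "LCG.Example.org"))
# No JDL option, no site policy: the Default policy is used.
print(resolve_input_data_module({}, policy_section, "LCG.Other.org"))
```

Keeping the three checks as sequential early returns mirrors the order of precedence described above: the job's own choice first, then the site's, then the global default.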
The InputDataPolicy parameter can either be set directly in the JDL, in which case it should be a full module name, or it can be set using the Job class (see setInputDataPolicy()).
10.2.9.1. DownloadInputData
This module will download the files locally on the worker node for processing.
See DownloadInputData for details.
10.2.9.2. InputDataByProtocol
This module will generate the URLs necessary to access the files remotely.
See InputDataByProtocol for details.