scatlastb_utils.pipeline.ModuleConfig.InputFiles#
- class scatlastb_utils.pipeline.ModuleConfig.InputFiles(module_name, dataset_config, output_directory=None)#
Class to handle input files for a specific module.
This class is designed to parse input files from a configuration dictionary, map them to unique identifiers, and provide easy access to these files across different datasets. It also supports writing the file mapping to a specified output directory.
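The mapping described above can be pictured as a nested dictionary keyed first by dataset and then by file identifier. The following is a minimal sketch of that idea using plain dictionaries; it does not use `InputFiles` itself, and the dataset names, file IDs, paths, and helper functions are all hypothetical.

```python
from pathlib import Path

# Hypothetical data: dataset name -> {file ID -> file path}.
file_map = {
    "dataset_a": {"raw": Path("data/a/raw.h5ad"), "meta": Path("data/a/meta.tsv")},
    "dataset_b": {"raw": Path("data/b/raw.h5ad")},
}


def lookup_file(mapping, dataset, file_id):
    """Return the path stored for one dataset and file ID (illustrative helper)."""
    return mapping[dataset][file_id]


def files_for_dataset(mapping, dataset):
    """Return a copy of the file ID -> path mapping of one dataset (illustrative helper)."""
    return dict(mapping[dataset])


print(lookup_file(file_map, "dataset_a", "meta"))
print(sorted(files_for_dataset(file_map, "dataset_b")))
```

The two helpers loosely mirror what `get_file` and `get_files_per_dataset` are documented to return; how the class stores its data internally is not stated on this page.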
Methods table#
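| Method | Description |
| get_file(dataset, file_id) | Get file path for a given dataset and file ID. |
| get_files(as_df=False) | Get the file name to file path mapping for all datasets. |
| get_files_per_dataset(dataset) | Get file name to file path mapping for a given dataset. |
|  | Get input filename wildcards for all datasets. |
| parse(input_files, digest_size=5) | Parse input files. |
|  | Set input files for a given dataset. |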
Methods#
- InputFiles.get_file(dataset, file_id)#
Get file path for a given dataset and file ID.
- Return type:
str or pathlib.Path
- InputFiles.get_files(as_df=False)#
Get the file name to file path mapping for all datasets.
- Return type:
dict or pandas.DataFrame
- InputFiles.get_files_per_dataset(dataset)#
Get file name to file path mapping for a given dataset.
- Return type:
- static InputFiles.parse(input_files, digest_size=5)#
Parse input files.
Convert the given input files into a mapping from file name to file path. If no file names are provided, a unique hash code is created instead.
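The name-or-hash behaviour described above could look roughly like the following sketch. It is not the library's implementation: the function name `parse_input_files_sketch`, the accepted input shapes (a name-to-path dictionary, a single path, or a list of paths), and the use of BLAKE2b on the path string are assumptions, chosen because `digest_size` matches the parameter of the same name in `hashlib.blake2b`.

```python
import hashlib
from pathlib import Path


def parse_input_files_sketch(input_files, digest_size=5):
    """Illustrative stand-in for InputFiles.parse; all behaviour here is assumed."""
    # Assumed case 1: names are already provided, so keep them as they are.
    if isinstance(input_files, dict):
        return {name: str(path) for name, path in input_files.items()}

    # Assumed case 2: a single path is treated as a one-element list.
    if isinstance(input_files, (str, Path)):
        input_files = [input_files]

    mapping = {}
    for path in input_files:
        # Assumption: the generated ID is a BLAKE2b hex digest of the path string.
        # A digest of N bytes yields a hex string of 2 * N characters.
        file_id = hashlib.blake2b(
            str(path).encode("utf-8"), digest_size=digest_size
        ).hexdigest()
        mapping[file_id] = str(path)
    return mapping


print(parse_input_files_sketch({"raw": "data/raw.h5ad"}))
print(parse_input_files_sketch(["data/a.h5ad", "data/b.h5ad"]))
```

Hashing the path makes the generated identifier reproducible across runs, which is one plausible reason for exposing `digest_size`; a short digest keeps the identifiers readable at the cost of a higher collision risk.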