scatlastb_utils.pipeline.ModuleConfig.ModuleConfig
- class scatlastb_utils.pipeline.ModuleConfig.ModuleConfig(module_name, config, parameters=None, default_target=None, wildcard_names=None, mandatory_wildcards=None, config_params=None, rename_config_params=None, explode_by=None, paramspace_kwargs=None, dont_inherit=None, dtypes=None, write_output_files=True, warn=False)
ModuleConfig class to handle module configuration and parameters.
This class is designed to encapsulate the configuration for a specific module in a Snakemake pipeline, including input files, parameters, and output targets. It provides methods to retrieve and manipulate these configurations, as well as to write output files based on the module’s parameters and wildcards.
The class is mainly used to initialise the input files, parameters, and any user-supplied configuration, and to provide structured access to them within the scAtlasTb Snakemake workflow.
Methods table
| Method | Description |
| --- | --- |
| copy() | Create a copy of the ModuleConfig instance. |
| | Get complete config with parsed input files. |
| | Get config for datasets that use the module. |
| | Get defaults for module. |
| get_for_dataset() | Get any key from the config via query |
| get_from_parameters() | Retrieve a specific parameter from the parameters DataFrame. |
| get_input_file() | Retrieve a specific input file for a dataset. |
| get_input_file_wildcards() | Retrieve input file wildcards for the module. |
| get_input_files() | Retrieve all input files for the module. |
| get_input_files_per_dataset() | Retrieve input files for a specific dataset. |
| get_output_files() | Get output file based on wildcards |
| get_parameters() | Retrieve the parameters DataFrame for the module. |
| get_paramspace() | Retrieve the parameter space for the module. |
| get_profile() | Get the resource profile for the given wildcards. |
| get_resource() | Retrieve resource information from config['resources'] |
| get_wildcard_names() | Retrieve wildcard names for the module. |
| get_wildcards() | Retrieve wildcard instances as dictionary |
| | Set dataset configs |
| set_default_target() | Set the default target for the module. |
| set_defaults() | Set default entries for the module in the config. |
| set_defaults_per_dataset() | Set default entries for a specific dataset in the config. |
| update_inputs() | Update input files for a specific dataset. |
| update_parameters() | Update parameters and all dependent attributes |
| write_output_files() | Write output files mapping to a TSV file in the output directory. |
Methods
- ModuleConfig.copy()
Create a copy of the ModuleConfig instance.
- ModuleConfig.get_for_dataset(dataset, query, default=None, warn=False)
Get any key from the config via query
- Args:
dataset (str): dataset key in config['DATASETS']
query (list): list of keys to walk down the config
default (Union[str, bool, float, int, dict, list, None], optional): default value if key not found. Defaults to None.
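The lookup described above, walking a list of keys down a nested config and falling back to a default, can be sketched in plain Python. This is a minimal illustration and not the library's implementation: `walk_config` and the example config layout are hypothetical, and the real method may additionally consult module-level defaults or emit warnings.

```python
from typing import Any


def walk_config(config: dict, dataset: str, query: list, default: Any = None) -> Any:
    """Hypothetical helper: follow `query` keys below config['DATASETS'][dataset]."""
    node = config.get("DATASETS", {}).get(dataset, {})
    for key in query:
        # Stop as soon as the path leaves the nested dictionaries.
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


# Made-up config for illustration only.
example_config = {
    "DATASETS": {
        "demo": {"integration": {"methods": ["scvi", "harmony"]}},
    }
}

methods = walk_config(example_config, "demo", ["integration", "methods"])
missing = walk_config(example_config, "demo", ["integration", "label"], default="none")
```

Here `methods` holds the nested list, while `missing` falls back to the supplied default because the key path does not exist.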
- ModuleConfig.get_from_parameters(query_dict, parameter_key, **kwargs)
Retrieve a specific parameter from the parameters DataFrame.
- ModuleConfig.get_input_file(dataset, file_id, **kwargs)
Retrieve a specific input file for a dataset.
- ModuleConfig.get_input_file_wildcards()
Retrieve input file wildcards for the module.
- ModuleConfig.get_input_files()
Retrieve all input files for the module.
- ModuleConfig.get_input_files_per_dataset(dataset)
Retrieve input files for a specific dataset.
- ModuleConfig.get_output_files(pattern=None, extra_wildcards=None, allow_missing=False, as_dict=False, as_records=False, return_wildcards=False, verbose=False, **kwargs)
Get output file based on wildcards
- Parameters:
- Return type:
- ModuleConfig.get_parameters()
Retrieve the parameters DataFrame for the module.
- ModuleConfig.get_paramspace(**kwargs)
Retrieve the parameter space for the module.
- ModuleConfig.get_profile(wildcards)
Get the resource profile for the given wildcards.
- Parameters:
wildcards ([dict, Any])
- ModuleConfig.get_resource(resource_key, profile='cpu', attempt=1, attempt_to_cpu=1, factor=0.5, verbose=False)
Retrieve resource information from config['resources']
- ModuleConfig.get_wildcard_names()
Retrieve wildcard names for the module.
- ModuleConfig.get_wildcards(**kwargs)
Retrieve wildcard instances as dictionary
- Parameters:
kwargs – arguments passed to WildcardParameters.get_wildcards
- Return type:
[dict, pandas.DataFrame]
- Returns:
dictionary of wildcards that can be applied directly for expanding target files
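To illustrate what "applied directly for expanding target files" can mean, the sketch below zips a dictionary of wildcard lists into a file pattern using only the standard library. The pattern, the wildcard names, and the `expand_zip` helper are assumptions made for illustration, as is the dict-of-lists shape of the wildcards; in a Snakefile the equivalent step would typically be done with Snakemake's `expand` combined with `zip`.

```python
def expand_zip(pattern: str, wildcards: dict) -> list:
    """Hypothetical helper: fill `pattern` once per row of aligned wildcard lists."""
    names = list(wildcards)
    rows = zip(*(wildcards[name] for name in names))
    return [pattern.format(**dict(zip(names, row))) for row in rows]


# Made-up wildcard instances, one list entry per target file.
example_wildcards = {
    "dataset": ["demo", "demo"],
    "file_id": ["raw", "filtered"],
}

targets = expand_zip("out/{dataset}/{file_id}.zarr", example_wildcards)
```

Each position across the lists is treated as one wildcard combination, so two entries per list yield two target paths rather than the four a full cross product would give.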
- ModuleConfig.set_default_target(default_target=None, warn=False)
Set the default target for the module.
- Parameters:
default_target ([str, Any], default: None) – default output pattern for the module; if None, config['output_map'] or the wildcard pattern is used
warn (bool, default: False) – if True, warn if no default target is specified
- ModuleConfig.set_defaults(warn=False)
Set default entries for the module in the config.
- Parameters:
warn (bool)
- ModuleConfig.set_defaults_per_dataset(dataset, warn=False)
Set default entries for a specific dataset in the config.
- ModuleConfig.update_inputs(dataset, input_files)
Update input files for a specific dataset.
- Parameters:
dataset (str)
input_files ([str, dict])
- ModuleConfig.update_parameters(wildcards_df=None, parameters_df=None, wildcard_names=None, **paramspace_kwargs)
Update parameters and all dependent attributes
- Parameters:
wildcards_df (DataFrame, default: None) – dataframe with updated wildcards
parameters_df (DataFrame, default: None) – dataframe with updated parameters
wildcard_names (list, default: None) – list of wildcard names to subset the paramspace by
paramspace_kwargs – additional arguments for snakemake.utils.Paramspace
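One plausible reading of "subset the paramspace by" wildcard names is keeping only those columns and dropping the duplicate rows that result. The sketch below shows that reading on plain records with the standard library; `subset_records` and the example rows are hypothetical, and the actual method may behave differently.

```python
def subset_records(records: list, wildcard_names: list) -> list:
    """Hypothetical helper: keep only `wildcard_names` columns and drop duplicate rows."""
    seen = set()
    subset = []
    for record in records:
        row = tuple(record[name] for name in wildcard_names)
        if row not in seen:
            # First time this combination appears: keep it, preserving order.
            seen.add(row)
            subset.append(dict(zip(wildcard_names, row)))
    return subset


# Made-up parameter rows for illustration only.
example_records = [
    {"dataset": "demo", "file_id": "raw", "method": "a"},
    {"dataset": "demo", "file_id": "raw", "method": "b"},
    {"dataset": "demo", "file_id": "filtered", "method": "a"},
]

subset = subset_records(example_records, ["dataset", "file_id"])
```

Dropping the `method` column collapses the first two rows into one, so the subset contains two wildcard combinations instead of three.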
- ModuleConfig.write_output_files()
Write output files mapping to a TSV file in the output directory.
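As a rough picture of what an output-files mapping in TSV form could look like, the following sketch writes one row per output file with the standard `csv` module. The column names, paths, and `write_mapping_tsv` helper are assumptions for illustration; the columns and file name the method actually produces are not documented here.

```python
import csv
import io


def write_mapping_tsv(records: list, handle) -> None:
    """Hypothetical helper: write one row per output file as tab-separated values."""
    fieldnames = list(records[0])
    writer = csv.DictWriter(
        handle, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)


# Made-up mapping of wildcards to output paths.
example_rows = [
    {"dataset": "demo", "file_id": "raw", "output": "out/demo/raw.zarr"},
    {"dataset": "demo", "file_id": "filtered", "output": "out/demo/filtered.zarr"},
]

# An in-memory buffer stands in for a file in the output directory.
buffer = io.StringIO()
write_mapping_tsv(example_rows, buffer)
tsv_text = buffer.getvalue()
```

Writing to a file handle rather than a path keeps the helper easy to test; passing an open file in the output directory would work the same way.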