scatlastb_utils.pipeline.ModuleConfig.WildcardParameters

class scatlastb_utils.pipeline.ModuleConfig.WildcardParameters(module_name, parameters, input_file_wildcards, dataset_config, default_config, dont_inherit, wildcard_names, mandatory_wildcards=None, config_params=None, rename_config_params=None, explode_by=None, paramspace_kwargs=None, dtypes=None)

Class to handle wildcards and parameters for a specific module.

This class is designed to parse wildcards and parameters from a configuration dictionary, map them to unique identifiers, and provide easy access to these wildcards across different datasets. It also supports writing the wildcard mapping to a specified output directory.

Parameters:
  • module_name (str)

  • parameters (pd.DataFrame)

  • input_file_wildcards (dict)

  • dataset_config (dict)

  • default_config (dict)

  • dont_inherit (list)

  • wildcard_names (list)

  • mandatory_wildcards (list (default: None))

  • config_params (list (default: None))

  • rename_config_params (dict (default: None))

  • explode_by (str | list (default: None))

  • paramspace_kwargs (dict (default: None))

  • dtypes (dict (default: None))

Methods table

  • get_from_parameters(query_dict, parameter_key) – Get entries from the parameters dataframe.

  • get_parameters() – Get the parameters dataframe.

  • get_paramspace([wildcard_names, exclude]) – Get a snakemake.utils.Paramspace object from wildcards_df.

  • get_wildcards([subset_dict, exclude, ...]) – Retrieve wildcard instances as a dictionary.

  • set_wildcards([config_params, ...]) – Set wildcards and parameters for the WildcardParameters instance.

  • subset_by_query(query_dict[, columns, verbose]) – Helper function to subset self.wildcards_df by a query dictionary.

  • update([wildcards_df, parameters_df, ...]) – Update the wildcards and parameters of the WildcardParameters instance.

Methods

WildcardParameters.get_from_parameters(query_dict, parameter_key, wildcards_sub=None, exclude=None, check_query_keys=False, check_null=False, default=None, single_value=True, verbose=False, as_type=None)

Get entries from the parameters dataframe.

Parameters:
  • query_dict (dict | Any) – dictionary mapping columns (which must be present in parameters_df) to values

  • parameter_key (str) – key of the parameter to retrieve

  • wildcards_sub (list | None (default: None)) – list of wildcards used for subsetting the parameters

  • exclude (list | str (default: None)) – list of wildcard names to exclude

  • check_query_keys (bool (default: False)) – whether to check that all keys in query_dict are in wildcards_sub

  • check_null (bool (default: False))

  • default (str | None (default: None))

  • single_value (bool (default: True))

  • verbose (bool (default: False))

  • as_type (type (default: None))

Returns:

single parameter value or list of parameters as specified by column
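The lookup described above can be pictured as a filter over a table followed by a column selection. The following is a minimal plain-Python sketch of that idea, not the library's implementation: `lookup_parameter`, the example rows, and the column names are hypothetical, and the way `default`, `single_value`, and `as_type` behave here is an assumption inferred from the parameter names only.

```python
def lookup_parameter(rows, query_dict, parameter_key,
                     default=None, single_value=True, as_type=None):
    """Hypothetical stand-in for a lookup in a parameters table.

    rows is a list of dicts, one per parameter combination
    (standing in for the parameters dataframe).
    """
    # Keep rows whose columns match every key/value pair of the query.
    matches = [
        row for row in rows
        if all(row.get(col) == val for col, val in query_dict.items())
    ]
    # Collect distinct non-missing values, preserving order.
    values = []
    for row in matches:
        value = row.get(parameter_key)
        if value is not None and value not in values:
            values.append(value)
    # Assumption: as_type casts each retrieved value.
    if as_type is not None:
        values = [as_type(v) for v in values]
    if not single_value:
        return values
    if not values:
        return default  # assumption: default is used when nothing is found
    if len(values) > 1:
        raise ValueError(f"ambiguous {parameter_key!r} for query {query_dict}")
    return values[0]


# Hypothetical parameter table.
parameters = [
    {"dataset": "pbmc", "method": "scvi", "n_epochs": "100"},
    {"dataset": "pbmc", "method": "harmony", "n_epochs": None},
    {"dataset": "lung", "method": "scvi", "n_epochs": "50"},
]

epochs = lookup_parameter(
    parameters, {"dataset": "pbmc", "method": "scvi"}, "n_epochs", as_type=int
)
fallback = lookup_parameter(
    parameters, {"dataset": "pbmc", "method": "harmony"}, "n_epochs", default="10"
)
all_scvi = lookup_parameter(
    parameters, {"method": "scvi"}, "n_epochs", single_value=False, as_type=int
)
```

In this sketch a fully specified query yields one value, a query with no usable value falls back to `default`, and `single_value=False` returns every matching value as a list.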

WildcardParameters.get_parameters()

Get parameters dataframe.

Return type:

DataFrame

WildcardParameters.get_paramspace(wildcard_names=None, exclude=None, **kwargs)

Get snakemake.utils.Paramspace object from wildcards_df.

Parameters:
  • default – whether to return the default paramspace (all wildcards). This parameter is documented but does not appear in the method signature.

  • wildcard_names (list (default: None)) – list of wildcard names to subset the paramspace by

  • kwargs – additional arguments for snakemake.utils.Paramspace

  • exclude (list (default: None))

Return type:

Any

Returns:

snakemake.utils.Paramspace object

WildcardParameters.get_wildcards(subset_dict=None, exclude=None, wildcard_names=None, all_params=False, as_df=False, default_datasets=True, verbose=False)

Retrieve wildcard instances as a dictionary.

Parameters:
  • subset_dict (dict | Any (default: None)) – dictionary mapping columns (which must be present in parameters_df) to values

  • exclude (list | str (default: None)) – list of wildcard names to exclude

  • wildcard_names (list (default: None)) – list of wildcard names to subset the wildcards by

  • all_params (bool (default: False)) – whether to include all parameters. If False, only the defined wildcard names are used

  • as_df (bool (default: False)) – whether to return a dataframe instead of a dictionary

  • default_datasets (bool (default: True)) – whether to subset to default datasets

  • verbose (bool (default: False))

Return type:

dict | pandas.DataFrame

Returns:

dictionary of wildcards that can be applied directly for expanding target files, or a dataframe if as_df is True
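To illustrate what "applied directly for expanding target files" can mean, here is a minimal plain-Python sketch. It assumes the returned dictionary maps each wildcard name to a list of values with one position per row; that shape, the helper names `rows_to_wildcards` and `expand_zipped`, and the path pattern are all hypothetical and not taken from the library.

```python
def rows_to_wildcards(rows, wildcard_names):
    """Turn table rows into a dict of equally long lists, one list per wildcard."""
    return {name: [row[name] for row in rows] for name in wildcard_names}


def expand_zipped(pattern, wildcards):
    """Fill a path pattern once per row (the lists are zipped, not crossed)."""
    names = list(wildcards)
    return [
        pattern.format(**dict(zip(names, combo)))
        for combo in zip(*(wildcards[name] for name in names))
    ]


# Hypothetical wildcard table with one row per dataset/method combination.
rows = [
    {"dataset": "pbmc", "method": "scvi"},
    {"dataset": "pbmc", "method": "harmony"},
    {"dataset": "lung", "method": "scvi"},
]

wildcards = rows_to_wildcards(rows, ["dataset", "method"])
targets = expand_zipped("results/{dataset}/{method}.h5ad", wildcards)
```

Inside a Snakemake workflow, a dictionary of this shape would typically be passed on as `expand(pattern, zip, **wildcards)`, so that only the combinations present in the table are requested rather than the full cross product.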

WildcardParameters.set_wildcards(config_params=None, wildcard_names=None, explode_by=None, config_entries=None, rename_config_params=None, dtypes=None, dont_inherit=None, warn=False)

Set wildcards and parameters for the WildcardParameters instance.

Collect wildcards and parameters from a configuration instance (e.g. a dataset) for a given module. This function assumes that the keys of the given config are the different instances, each of which contains specific parameters for different modules.

Parameters:
  • config_params (list (default: None)) – list of parameters for each config entry of a module, e.g. ['integration', 'label', 'batch']

  • wildcard_names (list (default: None)) – names of the wildcards to be extracted. These must map to the config keys and be preceded by a wildcard name for the config entries, e.g. ['dataset', 'method', 'label', 'batch']

  • explode_by (list (default: None)) – column to explode by; a list entry is expected in that column

  • config_entries (list (default: None)) – list of entries to subset the config by; if not given, all keys are used

  • rename_config_params (dict (default: None))

  • dtypes (dict (default: None))

  • dont_inherit (list (default: None))

  • warn (bool (default: False))
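The collection step can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's code: the nested config layout (entry, then module, then parameters), the helper `collect_wildcards`, and the parameter name `methods` are hypothetical. The first element of `wildcard_names` names the config entries, and the remaining names map one to one onto `config_params`, mirroring the examples given in the parameter list.

```python
def collect_wildcards(config, module_name, config_params, wildcard_names,
                      explode_by=None):
    """Build one row per config entry from the parameters listed under a module."""
    entry_name, *param_names = wildcard_names
    rows = []
    for entry, modules in config.items():
        module_config = modules.get(module_name)
        if module_config is None:
            continue  # this entry does not configure the module
        row = {entry_name: entry}
        for param, name in zip(config_params, param_names):
            row[name] = module_config.get(param)
        rows.append(row)
    if explode_by is None:
        return rows
    # Explode: emit one row per element of the list-valued column.
    exploded = []
    for row in rows:
        values = row[explode_by]
        if not isinstance(values, list):
            values = [values]
        for value in values:
            exploded.append({**row, explode_by: value})
    return exploded


# Hypothetical config: keys are dataset instances, values hold per-module settings.
config = {
    "pbmc": {"integration": {"methods": ["scvi", "harmony"],
                             "label": "cell_type", "batch": "donor"}},
    "lung": {"integration": {"methods": ["scvi"],
                             "label": "cell_type", "batch": "sample"}},
    "blood": {"qc": {"threshold": 5}},
}

wildcard_rows = collect_wildcards(
    config,
    module_name="integration",
    config_params=["methods", "label", "batch"],
    wildcard_names=["dataset", "method", "label", "batch"],
    explode_by="method",
)
```

Here the `pbmc` entry yields two rows because its method list has two elements, and `blood` is skipped because it does not configure the module.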

WildcardParameters.subset_by_query(query_dict, columns=None, verbose=False)

Helper function to subset self.wildcards_df by a query dictionary.

Parameters:
  • query_dict (dict | Any) – mapping of columns to the values to subset by

  • columns (list (default: None)) – columns to subset wildcards_df by

  • verbose (bool (default: False))

Return type:

DataFrame

Returns:

subset of wildcards_df
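As a rough picture of subsetting by a query dictionary, the sketch below filters rows and then optionally projects them onto selected columns. It is plain Python with a hypothetical helper `subset_rows`. Accepting a list of allowed values per column and dropping duplicate rows after projection are assumptions made for illustration, not documented behaviour of the method.

```python
def subset_rows(rows, query_dict, columns=None):
    """Filter rows by a column-to-value mapping, then optionally keep some columns."""
    def matches(row):
        for col, wanted in query_dict.items():
            # Assumption: a query value may be a scalar or a collection of allowed values.
            allowed = wanted if isinstance(wanted, (list, tuple, set)) else [wanted]
            if row.get(col) not in allowed:
                return False
        return True

    subset = [row for row in rows if matches(row)]
    if columns is None:
        return subset
    # Project onto the requested columns and drop duplicate rows.
    projected = []
    for row in subset:
        reduced = {col: row[col] for col in columns}
        if reduced not in projected:
            projected.append(reduced)
    return projected


# Hypothetical wildcard table.
table = [
    {"dataset": "pbmc", "method": "scvi", "label": "cell_type"},
    {"dataset": "pbmc", "method": "harmony", "label": "cell_type"},
    {"dataset": "lung", "method": "scvi", "label": "cell_type"},
]

pbmc_datasets = subset_rows(table, {"dataset": "pbmc"}, columns=["dataset"])
scvi_or_harmony_lung = subset_rows(
    table, {"dataset": "lung", "method": ["scvi", "harmony"]}
)
```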

WildcardParameters.update(wildcards_df=None, parameters_df=None, wildcard_names=None, **kwargs)

Update the wildcards and parameters of the WildcardParameters instance.

Parameters:
  • wildcards_df (DataFrame (default: None)) – dataframe with updated wildcards

  • parameters_df (DataFrame (default: None)) – dataframe with updated parameters

  • wildcard_names (list (default: None)) – list of wildcard names to subset the paramspace by

  • kwargs – additional arguments for snakemake.utils.Paramspace