scatlastb_utils.pp.find_library_obs

scatlastb_utils.pp.find_library_obs(library_obs_names, obs, library_key='library_id', **kwargs)

Find best subset of barcodes that match library groups

Group the input obs on the specified library_key, identify the library whose barcodes best match the input library_obs_names, and return the subset of obs belonging to that library along with summary statistics about the identified overlap.

Parameters:
  • library_obs_names – A list of cell names, each containing the cell barcode, for which to find the best-matching library in obs.

  • obs – A pandas.DataFrame of cell level metadata, with the index having the cell barcode present, and library_key present as a column.

  • library_key (default: 'library_id') – The obs column holding library information.

  • **kwargs – Additional arguments for strip_barcodes()

Returns:

A tuple: a subset of obs for the best matching library, with the index stripped to just the 16 base pair 10X cell barcode, and a pandas.Series with information about the identified overlap: the library ID in obs, the input cell count (i.e. the length of library_obs_names), the number of cells for that library in obs, and the size of the overlap between the two cell pools.
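
The matching described above can be illustrated with a minimal sketch. This is not the implementation in scatlastb_utils: the function find_library_obs_sketch, the helper extract_barcode, the regular expression used to locate the barcode, and the field names in the returned pandas.Series are all assumptions made for illustration, and the sketch assumes obs has a unique index.

```python
import re

import pandas as pd

# Assumption: the cell barcode is the first run of 16 A/C/G/T characters in a cell name.
BARCODE_RE = re.compile(r"[ACGT]{16}")


def extract_barcode(name):
    """Hypothetical helper: return the 16-nt barcode in name, or None if absent."""
    match = BARCODE_RE.search(name)
    return match.group(0) if match else None


def find_library_obs_sketch(library_obs_names, obs, library_key="library_id"):
    """Illustrative stand-in for find_library_obs; not the library's own code."""
    # Barcodes we are trying to place in one of the libraries.
    query = {extract_barcode(n) for n in library_obs_names} - {None}
    # Barcode for every row of obs, aligned to the original index.
    obs_barcodes = pd.Series([extract_barcode(i) for i in obs.index], index=obs.index)

    # Pick the library whose barcodes share the most entries with the query.
    best_id, best_overlap = None, -1
    for lib_id, group in obs.groupby(library_key):
        lib_barcodes = set(obs_barcodes.loc[group.index].dropna())
        overlap = len(query & lib_barcodes)
        if overlap > best_overlap:
            best_id, best_overlap = lib_id, overlap

    # Subset obs to the winning library and re-index by bare barcode.
    subset = obs[obs[library_key] == best_id].copy()
    subset.index = obs_barcodes.loc[subset.index].values

    # Field names below are hypothetical; the real Series may label them differently.
    stats = pd.Series(
        {
            "library_id": best_id,
            "n_input": len(library_obs_names),
            "n_obs": len(subset),
            "overlap": best_overlap,
        }
    )
    return subset, stats


# Toy data: two libraries in obs, and a query list that mostly matches lib1.
obs = pd.DataFrame(
    {"library_id": ["lib1", "lib1", "lib2"]},
    index=["s1_AAACCCAAGAAACACT-1", "s1_AAACCCAAGAAACCAT-1", "s2_AAACGAAAGTTTGGCC-1"],
)
names = ["x_AAACCCAAGAAACACT", "x_AAACCCAAGAAACCAT", "x_TTTTTTTTTTTTTTTT"]
subset, stats = find_library_obs_sketch(names, obs)
print(subset)
print(stats)
```

Comparing the overlap against both the input cell count and the library's cell count in obs helps judge whether the best match is a genuine one or merely the least bad candidate.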