Usage
Getting started
Linkapy is a Python package designed to facilitate the integrative analysis of single-cell multi-omics data, where multi-omics means multiple readouts from the same cell. Although Linkapy aims to be as general as possible, it currently focuses on data that include methylation and transcription layers. Transcriptome data, if available, should be provided as one or more featureCounts tables, which are combined into a single matrix. Methylation data should be provided as ‘allcools’ files: tab-separated files containing methylation information for each cytosine in the genome. Support for additional formats is planned.
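To make the idea of combining several featureCounts tables into one matrix concrete, here is a minimal sketch using only the Python standard library. It is not Linkapy's own implementation. The two in-memory tables, their counts, and the helper names `read_counts` and `combine` are made up for illustration. The only thing it relies on is the standard featureCounts layout: a leading `#` comment line, six annotation columns, then one count column per sample. Filling absent cells with 0 is an assumption of this sketch.

```python
import csv
import io

# Two tiny in-memory stand-ins for featureCounts tables (made-up counts).
# featureCounts writes a leading '#' comment line, then a header whose first
# six columns are Geneid, Chr, Start, End, Strand, Length; the remaining
# columns hold the per-sample counts.
TABLE_A = (
    "# Program:featureCounts\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tcell1.bam\tcell2.bam\n"
    "geneA\tchr1\t100\t500\t+\t401\t3\t0\n"
    "geneB\tchr1\t900\t1500\t-\t601\t7\t2\n"
)
TABLE_B = (
    "# Program:featureCounts\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tcell3.bam\n"
    "geneA\tchr1\t100\t500\t+\t401\t5\n"
    "geneB\tchr1\t900\t1500\t-\t601\t1\n"
)

ANNOTATION_COLUMNS = 6  # Geneid, Chr, Start, End, Strand, Length


def read_counts(handle):
    """Return {gene: {sample: count}} for one featureCounts table."""
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.reader(rows, delimiter="\t")
    header = next(reader)
    samples = header[ANNOTATION_COLUMNS:]
    counts = {}
    for fields in reader:
        values = fields[ANNOTATION_COLUMNS:]
        counts[fields[0]] = {s: int(v) for s, v in zip(samples, values)}
    return counts


def combine(tables):
    """Merge per-table counts into one genes-by-cells mapping."""
    combined = {}
    for table in tables:
        for gene, per_sample in table.items():
            combined.setdefault(gene, {}).update(per_sample)
    return combined


matrix = combine(
    [read_counts(io.StringIO(TABLE_A)), read_counts(io.StringIO(TABLE_B))]
)
cells = sorted({cell for row in matrix.values() for cell in row})

print("gene", *cells, sep="\t")
for gene in sorted(matrix):
    # Cells absent from a table are filled with 0 (an assumption of this sketch).
    print(gene, *(matrix[gene].get(cell, 0) for cell in cells), sep="\t")
```

With real data, the `io.StringIO` objects would be replaced by opened featureCounts files.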
Usage
Linkapy can be used through the command line, or via the API in Python. For the latter, have a look at the API Reference. To get started, example data from the original scNMT-seq paper can be downloaded:
linkapy example -h
Upon successful download, an example command is printed that you can use to get started and familiarize yourself with the data structures.
linkapy CLI
Linkapy CLI - A command line interface to process and analyze single-cell multiome data.
Usage
linkapy CLI [OPTIONS] COMMAND [ARGS]...
Options
- -h, --help
Show this message and exit.
- --version
Show the version and exit.
example
Download test data and get an example command to use Linkapy to generate matrices.
Usage
linkapy CLI example [OPTIONS]
Options
- -h, --help
Show this message and exit.
- -o, --output <output>
Output directory to download the data to.
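If you prefer to drive the download from a Python script, the following sketch assembles the command from the options listed above and runs it only when the executable is found. It assumes the executable is called `linkapy`, as in the getting-started snippet. The directory name `linkapy_example` and the helper names are hypothetical.

```python
import shlex
import shutil
import subprocess


def example_command(output_dir):
    """Build the argv that downloads the example data into output_dir."""
    # The executable name 'linkapy' follows the getting-started snippet.
    return ["linkapy", "example", "-o", output_dir]


def run_if_installed(argv):
    """Run argv if its executable is on PATH; return the exit code, else None."""
    if shutil.which(argv[0]) is None:
        print("not found on PATH, would run:", shlex.join(argv))
        return None
    return subprocess.run(argv, check=False).returncode


# "linkapy_example" is a hypothetical directory name.
command = example_command("linkapy_example")
print(shlex.join(command))
```

Calling `run_if_installed(command)` then starts the download, provided Linkapy is installed in the active environment.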
parsing
Parse single-cell methylation (scMethylation/scNOMe) and/or scRNA data. Either methylation_path or transcriptome_path must be provided.
Usage
linkapy CLI parsing [OPTIONS]
Options
- -h, --help
Show this message and exit.
- -m, --methylation_path <methylation_path>
Path to the directory containing methylation data. Will be searched recursively to match pattern.
- -t, --transcriptome_path <transcriptome_path>
Path to the directory containing transcriptome data. Will be searched recursively to match pattern. Note that these should be featureCounts files.
- -o, --output <output>
Output directory for the results. Default is “linkapy_output”. RNA matrices are written in Arrow format; methylation-derived matrices are written in mtx format. Additionally, if mudata is set, a MuData object is created as well.
- --methylation_pattern <methylation_pattern>
Pattern to match methylation files. Can be specified multiple times. Note that every pattern yields a separate matrix.
- --transcriptome_pattern <transcriptome_pattern>
Pattern to match transcriptome files. Can be specified multiple times. Note that every pattern yields a separate matrix.
- --methylation_pattern_names <methylation_pattern_names>
Labels for every methylation pattern provided. Can be specified multiple times. The label is used to name the output files. If not provided, the asterisks are removed from the pattern to yield the labels.
- --transcriptome_pattern_names <transcriptome_pattern_names>
Labels for every transcriptome pattern provided. Can be specified multiple times. The label is used to name the output files. If not provided, the asterisks are removed from the pattern to yield the labels.
- --NOMe
Assumes data under methylation_path is NOMe data. Setting this flag is the same as using “--methylation_pattern GCHN --methylation_pattern WCGN”.
- -j, --threads <threads>
Number of threads to use for processing. Default is 1.
- -c, --chromsizes <chromsizes>
Path to the chromsizes file for genome reference. Only needed if no regions are provided.
- -r, --regions <regions>
Path to regions file (bed format) to aggregate methylation data over. Can be specified multiple times.
- --blacklist <blacklist>
Path of regions (bed format) to exclude from aggregation. Can be specified multiple times. Note that these are only relevant for methylation data.
- -b, --binsize <binsize>
Size of bins for aggregating methylation data over. Only used if chromsizes are provided.
- -p, --project <project>
Project name. Effectively used as a prefix for the output files.
- -v, --verbose
Enable debugging output.
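As a rough illustration of how the parsing options fit together, the sketch below assembles a command line in Python and shows how default labels could be derived from patterns by dropping the asterisks, as described for the pattern-name options. Several things here are assumptions: the executable name `linkapy` (taken from the getting-started snippet), the glob patterns `*GCHN*` and `*WCGN*`, the paths, and the project name. The helpers `default_label` and `parsing_command` are hypothetical and not part of Linkapy.

```python
import shlex


def default_label(pattern):
    """Derive a label from a glob pattern by dropping its asterisks."""
    return pattern.replace("*", "")


def parsing_command(methylation_path, patterns, regions,
                    output="linkapy_output", threads=1, project=None):
    """Assemble an argv for the parsing subcommand from documented options."""
    argv = ["linkapy", "parsing", "-m", methylation_path,
            "-o", output, "-j", str(threads)]
    for pattern in patterns:
        # The option is repeated once per pattern; each yields its own matrix.
        argv += ["--methylation_pattern", pattern]
    for bed in regions:
        # -r may be given several times, once per BED file.
        argv += ["-r", bed]
    if project:
        argv += ["-p", project]
    return argv


# Hypothetical inputs: paths, patterns, and project name are placeholders.
patterns = ["*GCHN*", "*WCGN*"]
argv = parsing_command("data/methylation", patterns, ["genes.bed"],
                       threads=4, project="demo")

# shlex.join quotes the patterns so the shell does not expand the asterisks.
print(shlex.join(argv))
print({pattern: default_label(pattern) for pattern in patterns})
```

Building the command as a list and quoting it with `shlex.join` avoids accidental glob expansion when the printed line is pasted into a shell.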