Closed
Description
We have agreed that Matrix-API should be a "canonical intermediate format" for omics data, rather than a format that must also absorb every potential upstream and downstream representation [1].
There is interest in capturing and linking the underlying information used to create the aggregated (observation, feature) matrices. These data types are out of scope for storage in Matrix-API, but it would be valuable to identify the use cases for transforming them into Matrix-API representations.
Use cases, with the underlying molecular information in bold:
- In scRNA-seq, a raw data matrix describes the number of **RNA molecules** observed for each gene in each cell [2].
- In scATAC-seq, **genomic alignments** are counted or analyzed to create "peak", "genomic bin" or "gene activity score" features. The underlying data can be stored in WIG, bigWig, or bedGraph formats [3].
- In spatial transcriptomics studies, **RNA molecules** are spatially localized in Euclidean space and assigned to cells by a segmentation algorithm. OME-NGFF is exploring how to represent these data (see "Table spec proposal", ome/ngff#64), and NanoString is developing the CosMx assay, which will generate this kind of information at large scale.
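To make the transformation concrete, here is a minimal sketch of how molecule-level records could be aggregated into an (observation, feature) count matrix. It is not part of Matrix-API: the function name `molecules_to_matrix`, the `(cell_id, gene)` record layout, and the example values are all assumptions made for illustration, and a real pipeline would likely use a sparse matrix rather than nested lists.

```python
from collections import Counter


def molecules_to_matrix(molecules):
    """Aggregate (cell_id, gene) molecule records into a dense count matrix.

    Hypothetical helper for illustration only. Returns (cells, genes, matrix),
    where matrix[i][j] is the number of molecules of genes[j] seen in cells[i].
    """
    counts = Counter(molecules)  # one count per distinct (cell_id, gene) pair
    cells = sorted({cell for cell, _ in counts})   # observations (rows)
    genes = sorted({gene for _, gene in counts})   # features (columns)
    # Counter returns 0 for pairs that were never observed.
    matrix = [[counts[(cell, gene)] for gene in genes] for cell in cells]
    return cells, genes, matrix


# Made-up records: one (cell_id, gene) tuple per detected molecule, e.g. after
# a segmentation step has assigned each molecule to a cell.
molecules = [
    ("cell_1", "GAPDH"), ("cell_1", "GAPDH"), ("cell_1", "ACTB"),
    ("cell_2", "ACTB"), ("cell_2", "CD3E"),
]

cells, genes, matrix = molecules_to_matrix(molecules)
for cell, row in zip(cells, matrix):
    print(cell, dict(zip(genes, row)))
```

The molecule-level input (and, for spatial assays, the coordinates and segmentation that produced the cell assignment) is exactly the information that is lost in the aggregated matrix, which is why linking back to it is of interest.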
Footnotes