{FlowRepositoryQuery} is an R package providing complete data on public experiments from the FlowRepository database, including a dedicated Shiny application for filtering these experiments according to their marker panels.

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md
i-cyto/FlowRepositoryQuery

FlowRepositoryQuery

Lifecycle: stable

{FlowRepositoryQuery} is an R package providing comprehensive data on all public experiments from the FlowRepository database, updated as of August 1, 2024. It also includes a dedicated Shiny application for filtering experiments based on their marker panels, making it easier to explore relevant datasets.

Access the Application via Shiny Server

You can directly use the web tool through the following Shiny Server link:

Installation Guide

To install the latest version of the {FlowRepositoryQuery} package from GitHub, use the following command:

# install.packages("remotes")  # run first if {remotes} is not installed
remotes::install_github("i-cyto/FlowRepositoryQuery", ref = "main")

Launch the Application

Once installed, you can start the Shiny application by running:

FlowRepositoryQuery::run_app()

Application appearance on opening

How To

This tool aims to make it easier to search FlowRepository's public experiments by panel markers. Filtering the panel column is difficult without regular expressions because marker names are typed in by hand and follow no standard spelling.

CD3 is not CD33, even though CD33 contains CD3. You therefore need to search for words rather than strings of characters, as shown in the illustration below.

Illustration of CD3 search: top match examples with CD3
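The difference between searching for a word and searching for a string of characters can be sketched in base R. This is only an illustration: the `panels` vector is made-up example data, and how the application builds its query internally is not described here. The `\\b` anchor marks a word boundary, so `CD3` is matched only when it is not part of a longer name.

```r
# Hypothetical panel descriptions, one string per experiment (made-up data)
panels <- c(
  "CD3, CD4, CD8",
  "CD33, CD14",
  "CD3-FITC, CD19",
  "CD4, CD8"
)

# Plain substring search: also selects the CD33 panel
as_string <- grepl("CD3", panels)

# Word search: \\b requires a word boundary on both sides of CD3
is_cd3 <- grepl("\\bCD3\\b", panels, perl = TRUE)

panels[is_cd3]
```

Note that `_` counts as a word character in regular expressions, so a name such as `anti_CD3` would not be matched by `\\bCD3\\b`.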

You can enter several markers at once; only experiments whose panel contains all of them are kept in the resulting selection.

Illustration of multiple markers search: results of a search for several markers
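A search for several markers can be thought of as one word search per marker, combined with a logical AND. The following base R sketch shows the idea on made-up data; the names `panels`, `markers` and `keep` are hypothetical and do not come from the package.

```r
# Hypothetical panels, named by experiment (made-up data)
panels <- c(
  exp_A = "CD3, CD4, CD8",
  exp_B = "CD33, CD4",
  exp_C = "CD3, CD8, CD45"
)

# Markers that must all be present
markers <- c("CD3", "CD8")

# One word-bounded pattern per marker
patterns <- paste0("\\b", markers, "\\b")

# Test each pattern against every panel, then combine the results with AND
keep <- Reduce(`&`, lapply(patterns, grepl, x = panels, perl = TRUE))

names(panels)[keep]
```

Here `exp_B` is dropped because it contains CD33 rather than CD3 and has no CD8.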

PD-L1 has many spellings, as does PD1. The search must therefore include a wildcard to match separators such as '-', '_' or even '.'. This is the role of '.', which matches any single character.

For example, the query 'PD.L1' will find 'PD-L1' and 'PD_L1', but also 'PDXL1'. The latter is not relevant, but such hits can easily be discarded afterwards. The most important thing is probably not to miss any spelling. If the possible characters are known, they can be listed in brackets, for example 'PD[-_.]L1' for 'PD-L1', 'PD_L1', or 'PD.L1'.

Following a character or a bracketed set with '*' indicates that it may occur zero or more times, so it may be absent. For example, 'PD[-_.]*L1' will also match 'PDL1'. You now probably know enough about regular expressions to use them for finding markers.

Illustration of PD-L1 search: examples of PD-L1 matching
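The three PD-L1 patterns discussed above can be compared side by side in base R. The `spellings` vector is made-up example data, and `grepl()` is used here only to illustrate the regular expressions, not to describe how the application runs its search.

```r
# Made-up spellings of the same marker, plus one irrelevant name
spellings <- c("PD-L1", "PD_L1", "PD.L1", "PDXL1", "PDL1")

# '.' matches any single character, so PDXL1 is a false positive
any_char <- grepl("PD.L1", spellings)

# '[-_.]' matches exactly one of the listed separators
one_sep <- grepl("PD[-_.]L1", spellings)

# '*' allows the separator to be absent, so PDL1 is matched too
opt_sep <- grepl("PD[-_.]*L1", spellings)

# Rows are patterns, columns are spellings
rbind(any_char, one_sep, opt_sep)
```

Only the last pattern matches all four genuine spellings while rejecting 'PDXL1'.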

Context

FlowRepository is an important resource for the community. It contains a large number of public experiments that could be useful for improving analytical pipelines, as external experiments to validate results, or as a source of hypotheses.

To improve analytical pipelines, I'd like to identify experiments and tag those that could serve as a standard for developing or validating one of the steps in the pipeline. Unfortunately, it's very difficult to browse and filter the information presented on the FlowRepository (FR) public page.

I have therefore parsed the FR public pages, collected the information, and compiled it into a table. This table is available as a Google sheet, which should make the information easier to access. A 'dictionary' tab explains the reported columns and their origin.

One interesting column is entitled "Design". It summarizes the experimental design of an experiment when the researcher has provided it. It lists each factor (sample types, time points, tissues, etc.), its values, and the number of FCS files per value. This helps a lot when selecting experiments with a clearly defined design. Unfortunately, I know of experiments whose design has not been annotated in FR, which hinders the reuse of experiments and, perhaps, the development of FR.

The experiment metadata was collected on 2024-08-01. It covers 2133 experiments and about 413k FCS files, for a total volume of about 5 TB.

Contact

If you would like to suggest a feature for the query tool, please open an issue on the GitHub repository for this package.

For further information, you can contact Samuel Granjeaud at samuel.granjeaud@inserm.fr.

Source

Source code is available on GitHub.

Credits

We thank FlowRepository for making the data available to the community. We thank ISAC for supporting FlowRepository.


About

You are reading the documentation for version 0.1.0.

This README was compiled on:

Sys.time()
#> [1] "2024-10-21 15:36:23 CEST"

Here are the test results and package coverage:

devtools::check(quiet = TRUE)
#> ℹ Loading FlowRepositoryQuery
#> ── R CMD check results ────────────────────────── FlowRepositoryQuery 0.1.0 ────
#> Duration: 22.8s
#> 
#> ❯ checking installed package size ... NOTE
#>     installed size is  5.2Mb
#>     sub-directories of 1Mb or more:
#>       data   4.1Mb
#> 
#> 0 errors ✔ | 0 warnings ✔ | 1 note ✖
