Gene Expression — Query Gene Expression Across Tissues
Gene Expression is a tool that allows users to query the expression of any gene across all data in CELLxGENE Discover. A query produces one dot plot per tissue, as explained below.
How to Interpret a Gene Expression Dot Plot
Dot Plot Basics
A dot plot can reveal gross differences in expression patterns across cell types and highlight genes that are moderately or highly expressed in certain cell types.
Dot plots visualize values across two dimensions: color and size (Figure 1). The color of a dot approximates average gene expression. Its size represents the percentage of cells within each cell type that express the gene.
Figure 1. Two metrics are represented in gene expression dot plots, gene expression and percentage of expressing cells.
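The two metrics can be sketched for a single gene in a single cell type as follows. This is a minimal illustration rather than the tool's actual implementation: the function name `dot_metrics` is hypothetical, and averaging over expressing cells only (values greater than 0) is an assumption made for this example.

```python
def dot_metrics(values):
    """Return (average expression, percent of cells expressing).

    `values` holds one normalized expression value per cell of a cell type.
    Assumption for this sketch: a cell "expresses" the gene when its value
    is greater than 0, and the average is taken over expressing cells only.
    """
    expressing = [v for v in values if v > 0]
    # Dot size: share of cells in this cell type that express the gene.
    percent = 100.0 * len(expressing) / len(values) if values else 0.0
    # Dot color: average expression (here, among expressing cells).
    average = sum(expressing) / len(expressing) if expressing else 0.0
    return average, percent


# Four hypothetical cells, two of which express the gene.
example_average, example_percent = dot_metrics([0.0, 0.0, 3.0, 5.0])
print(example_average, example_percent)
```

In this toy input the dot would be drawn at half of the maximum size, with a color corresponding to an average of 4.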
The combination of these metrics in a grid of genes by cell types allows users to make qualitative assessments of gene expression (Figure 2).
Genes that are lowly expressed or expressed in a small percentage of cells are difficult to visually identify in a dot plot. This is particularly important for certain marker genes that are specifically but lowly expressed in their target cell types, for example transcription factors and cell-surface receptors.
Figure 2. Types of possible qualitative assessments in a dot plot.
How to Make Sense of Normalized Values
The data used to create the averages for the dot plot are quantile normalized and range from 0 to 6 (see the "Gene Expression Data Processing" section for details). Roughly, low expression corresponds to normalized values below 2, medium expression to values from 2 to 4, and high expression to values above 4 (Figure 3). These values drive the dot plot color scheme and are constant and comparable across different dot plots. Additionally, users can switch to a relative scale that maps the lowest and highest expression values in a dot plot to the minimum and maximum colors, thus providing a wider color range for what is shown in that dot plot.
Figure 3. Examples of high, medium and low expression.
The examples in Figure 3 have a relatively constant percentage of cells expressing a gene (dot size). To identify highly expressed genes, however, users are advised to pay attention to both the color intensity and the size of the dot.
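The rough thresholds and the two color scales described above can be sketched as follows. The function names are hypothetical, and the linear mapping from value to color position is an assumption for illustration; the actual color mapping used by the tool may differ.

```python
def expression_level(value):
    """Bin a normalized value using the rough thresholds from the text."""
    if value < 2:
        return "low"
    if value <= 4:
        return "medium"
    return "high"


def color_position(value, relative_range=None):
    """Map a value to a position in [0, 1] along the color scale.

    Absolute scale: the fixed 0 to 6 range, comparable across dot plots.
    Relative scale: pass (lowest, highest) values of the current dot plot.
    Assumption: the mapping is linear.
    """
    low, high = relative_range if relative_range else (0.0, 6.0)
    if high == low:
        return 0.0
    # Clamp so out-of-range values land on the end colors.
    return min(1.0, max(0.0, (value - low) / (high - low)))


# A plot whose values only span 1 to 3 uses a third of the absolute scale,
# but the full color range on the relative scale.
absolute_top = color_position(3.0)
relative_top = color_position(3.0, relative_range=(1.0, 3.0))
print(expression_level(3.0), absolute_top, relative_top)
```

The relative scale makes differences within one dot plot easier to see, at the cost of colors no longer being comparable between dot plots.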
How to Navigate Cell Types
Cell types in the dot plot (rows) are ordered by default with a heuristic algorithm that tries to preserve relationships in the Cell Ontology (CL).
The expression values and cell counts of a parent cell type term are supersets of those of its child terms. In other words, the expression of a gene in a parent cell type includes the expression of that gene in all its descendant cell types.
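The superset relationship can be illustrated with a small roll-up over a toy hierarchy. This is a sketch under stated assumptions: the function name, the three-term hierarchy, and the cell identifiers are made up for the example, and the tool's own roll-up procedure may differ.

```python
def rolled_up_cells(cell_type, children, cells_by_type):
    """Collect the cells annotated to a cell type or any of its descendants.

    `children` maps a term to its direct child terms; `cells_by_type` maps a
    term to the set of cell identifiers annotated directly to it. Using sets
    avoids double counting when a term is reachable through several parents.
    """
    visited = set()
    stack = [cell_type]
    cells = set()
    while stack:
        current = stack.pop()
        if current in visited:
            continue
        visited.add(current)
        cells.update(cells_by_type.get(current, ()))
        stack.extend(children.get(current, ()))
    return cells


# Toy hierarchy and annotations (hypothetical data).
children = {"T cell": ["CD4-positive T cell", "CD8-positive T cell"]}
cells_by_type = {
    "T cell": {"cell_1"},
    "CD4-positive T cell": {"cell_2", "cell_3"},
    "CD8-positive T cell": {"cell_4"},
}

parent_cells = rolled_up_cells("T cell", children, cells_by_type)
child_cells = rolled_up_cells("CD4-positive T cell", children, cells_by_type)
print(len(parent_cells), len(child_cells))
```

Here the parent row covers four cells and the child row two, so every cell contributing to a child dot also contributes to the parent dot.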
Caveats of Normalization
Given that data are quantile normalized, all expression is relative to the cells it is measured in. As such, comparisons of absolute expression across cell types could be made if the number of genes measured were equal across all cells. Although this assumption is violated in practice, we attempt to minimize its negative effects by excluding cells with low gene coverage, thus reducing the variance in the number of genes measured across cells.
Nonetheless, caution is advised when interpreting subtle differences in the dot plot across cell types.
Users interested in evaluating the pre-normalized absolute expression data can access it through our CELLxGENE Census API.