Effective Visualization of Single-Cell Embeddings Poster by Ozette

The scientific poster titled “Data Transformations for Effective Visualization of Single-Cell Embeddings” presents a new embedding-based approach by Ozette for visualizing datasets of millions of data records more effectively. An embedding is a simplified representation of the original data in a low-dimensional space. When the embedding is two-dimensional, it can be visualized as a scatter plot, where each data record is shown as a point and similar points form visual clusters. Unfortunately, standard embeddings can group different classes of data records together, which shows up as multi-colored clusters. In contrast, Ozette’s new approach, coined “annotation embedding” [1], is able to visually disentangle such mixed clusters. The visualization is developed for biologists to help them make sense of single-cell data more efficiently by revealing distinct cell types, which are shown as single-colored clusters. Similar to how NASA’s James Webb Space Telescope provides far more fine-grained images of the stars than Hubble, Ozette’s new visualization approach provides a much clearer picture of the landscape of different cell types.

Measuring properties of single cells is a powerful way for biologists, clinical researchers, and doctors to better understand how our immune system or organs work. This improved understanding can ultimately lead to the development of new treatments for diseases like cancer. To make sense of single-cell data, analysts typically create a two-dimensional representation called a “standard embedding” and visualize it as a scatter plot. In these plots, the positions of the points are intended to reflect the similarity of cells and reveal clusters of cell types. To validate and interpret the visual clusters, the points are additionally color-coded by computationally derived labels or by the expression of certain traits.

Analysts are already using embedding visualizations in biological and clinical research. However, matching traits or computationally derived labels to the visual clusters is time-consuming, as standard embeddings are often unable to visually resolve many cell types as distinct clusters. For instance, in the poster under “Standard Embedding”, a single large cluster on the right side consists of red, orange, green, pale-green, and purple points. In contrast, Ozette’s new visualization method disentangles such clusters of mixed cell types and ensures that each cell type is visually represented by a unique cluster. This is shown in the poster under “Annotation Embedding”, where the red, orange, green, pale-green, and purple points form distinct clusters while still being located next to each other. The much higher resolution of cell type clusters enables analysts to interpret and explore single-cell data more effectively and to make new discoveries more rapidly.
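One way to put a number on how "mixed" a visual cluster is would be a per-cluster purity score: the fraction of points in a cluster that share the most common cell type. The following is a minimal sketch for illustration only. It is not part of the poster or of Ozette's software, and the function name `cluster_purity`, the cluster identifiers, and the toy labels are all hypothetical.

```python
from collections import Counter, defaultdict


def cluster_purity(cluster_ids, cell_types):
    """Per cluster, return the fraction of points that carry the majority cell type.

    A value of 1.0 means the cluster is single-colored; lower values mean
    several cell types are mixed within the same visual cluster.
    """
    if len(cluster_ids) != len(cell_types):
        raise ValueError("cluster_ids and cell_types must have the same length")

    # Count how often each cell type occurs within each cluster.
    counts = defaultdict(Counter)
    for cluster, cell_type in zip(cluster_ids, cell_types):
        counts[cluster][cell_type] += 1

    return {
        cluster: max(type_counts.values()) / sum(type_counts.values())
        for cluster, type_counts in counts.items()
    }


# Hypothetical toy data: six cells of three cell types (named after colors).
cell_types = ["red", "red", "green", "green", "purple", "purple"]

# Assumed cluster assignments: the "standard" layout merges red and green
# into one cluster, while the "annotation" layout keeps them apart.
standard_clusters = ["A", "A", "A", "A", "B", "B"]
annotation_clusters = ["A1", "A1", "A2", "A2", "B", "B"]

standard_purity = cluster_purity(standard_clusters, cell_types)
annotation_purity = cluster_purity(annotation_clusters, cell_types)

print(standard_purity)
print(annotation_purity)
```

In this toy setup, the merged cluster scores below 1.0 while every separated cluster scores exactly 1.0, mirroring the multi-colored versus single-colored clusters described above.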

The poster explains Ozette’s new visualization approach through a series of pairwise visual comparisons (“standard embedding” versus the new “annotation embedding”) using immune cell data from a recently published study on a type of head and neck cancer. In the upper third, both approaches are prominently shown, and inline text snippets explain the visual encoding. The central area of the poster describes the new computational method using two area charts as visual explanations. The lower third continues the visual comparison from the beginning and highlights three key features of the new approach.

The poster was presented at the Intelligent Systems for Molecular Biology (ISMB) conference as part of the BioVis track and builds upon the work by Greene et al., 2021 [1]. Together with notebooks containing code examples, the poster and the underlying software are freely and openly available at https://github.com/flekschas-ozette/ismb-biovis-2022.

[1] Greene et al., 2021, New interpretable machine-learning method for single-cell data reveals correlates of clinical response to cancer immunotherapy, Patterns. https://doi.org/10.1016/j.patter.2021.100372
