By EVOBYTE Your partner in bioinformatics
Introduction
If bulk omics is a cityscape viewed from an airplane, single‑cell omics is a street‑level stroll with a notepad. Instead of averaging signals across thousands or millions of cells, we measure one cell at a time. That simple shift changes everything: rare cell states no longer vanish into the mean, developmental trajectories become visible, and disease ecosystems like tumors finally come into focus. It’s why projects such as the Human Cell Atlas are building reference maps of cell types and states across organs, life stages, and ancestries—resources that are rapidly becoming the “ImageNet” for cell biology.
What “single‑cell omics” actually means
Single‑cell omics is an umbrella term for assays that capture molecular layers in individual cells. The best‑known is scRNA‑seq, short for single‑cell RNA sequencing, which counts messenger RNA to approximate gene expression. Because each cell contains very little RNA and only a fraction of it is captured, modern protocols tag each captured molecule with a unique molecular identifier (UMI) before amplification, so that PCR duplicates can be collapsed and amplification bias reduced. After sequencing, we build a cell‑by‑gene count matrix, perform quality control, reduce dimensions with PCA, visualize with UMAP, and cluster to discover putative cell types. Alongside scRNA‑seq, scATAC‑seq profiles open chromatin to infer regulatory elements, transcription factor activity, and cell identity from an epigenomic angle. When you measure both modalities in the same cell (often called “multiome” or “Epi Multiome”), you can directly link regulatory DNA to expressed genes instead of guessing the correspondence.
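To make the UMI idea concrete, here is a minimal sketch of how reads could be collapsed into molecule counts. It assumes reads have already been assigned to a cell barcode and a gene; the function name `count_umis` and the toy barcodes are hypothetical, and real pipelines additionally correct sequencing errors in barcodes and UMIs, which this sketch skips.

```python
from collections import defaultdict

def count_umis(reads):
    """Collapse reads into molecule counts per (cell, gene).

    `reads` is an iterable of (cell_barcode, gene, umi) tuples, one per
    aligned read. Reads that share all three fields are treated as PCR
    duplicates of a single original molecule and counted once.
    """
    molecules = defaultdict(set)
    for cell, gene, umi in reads:
        molecules[(cell, gene)].add(umi)
    return {key: len(umis) for key, umis in molecules.items()}

# Toy input with made-up barcodes and UMIs
reads = [
    ("AAAC", "CD3E", "TTGA"),
    ("AAAC", "CD3E", "TTGA"),   # PCR duplicate of the read above
    ("AAAC", "CD3E", "GGCA"),   # a second CD3E molecule in the same cell
    ("AAAC", "MS4A1", "ACGT"),
    ("CCGT", "MS4A1", "ACGT"),  # same UMI, different cell: a separate molecule
]
counts = count_umis(reads)
for (cell, gene), n in sorted(counts.items()):
    print(cell, gene, n)
```

The resulting dictionary is a sparse form of the cell‑by‑gene matrix: three reads for CD3E in the first cell become two molecules.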
Multimodal power: CITE‑seq, ATAC+RNA, and trajectory methods
Adding proteins on top of RNA resolves ambiguity. CITE‑seq (Cellular Indexing of Transcriptomes and Epitopes by Sequencing) tags antibodies with DNA barcodes so that surface proteins are read out alongside transcripts in each cell. This is especially helpful in the immune system, where protein markers define functional subsets that transcripts alone may miss. Related methods extend the idea to sample multiplexing and CRISPR screens.
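Antibody counts behave differently from transcript counts, so they are usually normalized separately. One common choice, not described above and shown here only as an illustration, is a centered log‑ratio (CLR) transform. The sketch below applies a simple pseudocount variant to one cell; the marker panel and counts are invented, and library implementations differ in details such as whether they center across cells or across features.

```python
import math

def clr(adt_counts):
    """Centered log-ratio transform with a pseudocount of 1.

    Each value becomes log(1 + count) minus the mean of log(1 + count)
    over all antibodies measured in the same cell.
    """
    logs = [math.log1p(c) for c in adt_counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical antibody-derived tag counts for one T cell
cell = {"CD3": 250, "CD4": 180, "CD8": 3, "CD19": 1}
values = clr(list(cell.values()))
normalized = dict(zip(cell, values))

for marker, value in normalized.items():
    print(f"{marker}: {value:+.2f}")
```

After the transform, markers the cell expresses sit above zero and background markers below it, which makes thresholds easier to compare across cells with different total antibody counts.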
A second leap came with single‑cell ATAC plus gene expression in the same nuclei. Because both readouts come from the same nucleus, cells no longer have to be matched across modalities computationally. That removes a source of alignment error, makes regulatory links explicit, improves interpretation of non‑coding variants, and stabilizes downstream integration. For data scientists, the takeaway is practical: jointly profiled data tend to integrate faster and with fewer assumptions than separate scRNA‑seq and scATAC‑seq datasets that you try to align post hoc.
Dynamics matter too. RNA velocity infers the future state of a cell by contrasting unspliced and spliced transcripts, offering arrows over your UMAP that suggest trajectories. Newer models such as DeepVelo and InterVelo update the math with flexible kinetics and joint learning of pseudotime and velocity, which improves robustness in branching systems. When your biological question is “where are cells going?”, velocity adds crucial context to static clusters.
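The core intuition can be sketched with the steady‑state model used by the original RNA velocity method: for each gene, unspliced abundance u is expected to be proportional to spliced abundance s at equilibrium, and the deviation u − γ·s is read as velocity. The snippet below is a simplified illustration for one gene with toy values. It fits γ on all cells, whereas published methods typically fit on extreme quantiles and smooth counts across neighbors first; the function names are hypothetical.

```python
def fit_gamma(unspliced, spliced):
    """Least-squares slope through the origin for u ~ gamma * s."""
    numerator = sum(u * s for u, s in zip(unspliced, spliced))
    denominator = sum(s * s for s in spliced)
    return numerator / denominator

def velocity(unspliced, spliced, gamma):
    """Per-cell velocity for one gene: v = u - gamma * s."""
    return [u - gamma * s for u, s in zip(unspliced, spliced)]

# Toy abundances for one gene across four cells
spliced = [2.0, 4.0, 6.0, 8.0]
unspliced = [1.0, 2.0, 3.0, 6.0]

gamma = fit_gamma(unspliced, spliced)
v = velocity(unspliced, spliced, gamma)

for i, value in enumerate(v):
    trend = "up" if value > 0 else "down"
    print(f"cell {i}: velocity {value:+.3f} ({trend})")
```

A positive value means the cell carries more unspliced transcript than the steady state predicts, suggesting the gene is being induced; a negative value suggests it is being shut down.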
Putting cells back in place: spatial transcriptomics and atlases
Single‑cell profiles are powerful, but dissociation removes tissue context. Spatial transcriptomics measures gene expression directly in intact sections, preserving the neighborhood structure that shapes cell behavior. That spatial layer helps you reveal tumor niches, immune infiltration patterns, and morphogen gradients that don’t appear in dissociated data. Recent reviews show growing adoption in oncology, with spot‑based platforms like 10x Visium, whose standard spots typically cover several cells, and high‑plex imaging methods that reach subcellular resolution and multimodal readouts. When you integrate spatial data with scRNA‑seq, you can deconvolve spots, assign cell types, and study interactions with more confidence.
These technologies feed large‑scale atlasing efforts. The Human Cell Atlas now hosts collections that span developing and adult tissues and highlight the importance of diverse sampling. For anyone building models, these atlases are goldmines for pretraining, label transfer, and benchmarking batch correction and integration methods such as Seurat’s anchors, Harmony, and scVI.
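Label transfer, at its simplest, means borrowing annotations from an atlas for cells that land nearby in a shared embedding. The sketch below uses plain k‑nearest‑neighbor voting on toy 2D coordinates. It assumes reference and query cells have already been projected into the same space, which is the hard part that methods like Seurat’s anchors, Harmony, or scVI address; the function name and data are hypothetical.

```python
import math
from collections import Counter

def transfer_labels(ref_embedding, ref_labels, query_embedding, k=3):
    """Assign each query cell the majority label of its k nearest reference cells."""
    predictions = []
    for q in query_embedding:
        # Rank reference cells by Euclidean distance to the query cell
        nearest = sorted(
            range(len(ref_embedding)),
            key=lambda i: math.dist(q, ref_embedding[i]),
        )[:k]
        votes = Counter(ref_labels[i] for i in nearest)
        predictions.append(votes.most_common(1)[0][0])
    return predictions

# Toy reference atlas: two well-separated populations
ref_embedding = [(0.0, 0.0), (0.5, 0.2), (0.1, 0.6), (5.0, 5.0), (5.3, 4.8), (4.7, 5.2)]
ref_labels = ["T cell", "T cell", "T cell", "B cell", "B cell", "B cell"]

# Unlabeled query cells in the same coordinate system
query_embedding = [(0.2, 0.1), (4.8, 5.1)]

predicted = transfer_labels(ref_embedding, ref_labels, query_embedding)
print(predicted)
```

In practice you would also keep the vote fraction as a confidence score and leave low‑confidence cells unlabeled rather than forcing an atlas label onto a novel state.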
A practical single‑cell analysis workflow you can try
You don’t need a wet lab to start exploring. With public datasets and open‑source tools, you can reproduce core steps on a laptop.
Here’s a minimal Scanpy flow for scRNA‑seq. It loads an AnnData object, runs QC, normalization, clustering, and UMAP. Keep it small and readable; the goal is to understand each step before tuning parameters.
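import scanpy as sc
# Load a cells x genes AnnData object (the file name is a placeholder)
adata = sc.read_h5ad("pbmc_10k.h5ad")
# Basic QC: drop near-empty cells and rarely detected genes
sc.pp.filter_cells(adata, min_genes=200)
sc.pp.filter_genes(adata, min_cells=3)
# Compute pct_counts_mt before filtering on it (human mitochondrial genes start with "MT-")
adata.var["mt"] = adata.var_names.str.startswith("MT-")
sc.pp.calculate_qc_metrics(adata, qc_vars=["mt"], percent_top=None, log1p=False, inplace=True)
adata = adata[adata.obs["pct_counts_mt"] < 10].copy()
# Normalize each cell to 10,000 counts, then log-transform
sc.pp.normalize_total(adata, target_sum=1e4)
sc.pp.log1p(adata)
# Keep the most variable genes; copy so later steps modify a real object, not a view
sc.pp.highly_variable_genes(adata, n_top_genes=3000)
adata = adata[:, adata.var.highly_variable].copy()
sc.pp.scale(adata, max_value=10)
# PCA, neighbor graph, Leiden clustering, and UMAP embedding
sc.tl.pca(adata, n_comps=50)
sc.pp.neighbors(adata, n_neighbors=15, n_pcs=30)
sc.tl.leiden(adata, resolution=0.5)
sc.tl.umap(adata)
sc.pl.umap(adata, color=["leiden"])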
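To see what the normalize_total and log1p steps in this workflow do numerically, here is a small standalone sketch in plain Python. It rescales one cell’s counts so they sum to a target (10,000 in the workflow) and then applies log(1 + x). The helper name `normalize_log1p` and the toy counts are made up for illustration; Scanpy performs the same arithmetic on the whole matrix.

```python
import math

def normalize_log1p(counts, target_sum=1e4):
    """Scale one cell's counts to `target_sum` total, then apply log(1 + x)."""
    total = sum(counts)
    return [math.log1p(c * target_sum / total) for c in counts]

# Two toy cells with identical gene proportions but 10x different depth
shallow_cell = [1, 3, 0, 6]      # 10 counts in total
deep_cell = [10, 30, 0, 60]      # 100 counts in total

norm_shallow = normalize_log1p(shallow_cell)
norm_deep = normalize_log1p(deep_cell)

for a, b in zip(norm_shallow, norm_deep):
    print(f"{a:.3f}  {b:.3f}")
```

Both cells end up with the same values, which is the point: after this step, differences between cells reflect composition rather than sequencing depth, and zeros stay zero.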
If you’re working with multiome data in R, Seurat, together with its companion package Signac for chromatin data, simplifies joint analysis. This sketch shows loading RNA and ATAC assays from the same cells, normalizing each modality, and building a joint neighbor graph and UMAP with weighted nearest neighbors. The specifics will depend on your file formats and genome build.
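library(Seurat)
library(Signac)  # provides CreateChromatinAssay, RunTFIDF, FindTopFeatures, RunSVD
# Read the 10x multiome matrix (the file name is a placeholder); returns a list with both modalities
multi <- Read10X_h5("pbmc_multiome_10k.h5")
obj <- CreateSeuratObject(counts = multi$`Gene Expression`)
# 10x peak names look like "chr1:100-200", so set the separators explicitly
obj[["ATAC"]] <- CreateChromatinAssay(counts = multi$Peaks, sep = c(":", "-"), genome = "hg38")
# RNA: log-normalize, select variable genes, scale, PCA
obj <- NormalizeData(obj) |> FindVariableFeatures() |> ScaleData()
obj <- RunPCA(obj, npcs = 50)
# ATAC: TF-IDF normalization followed by SVD (latent semantic indexing)
obj <- RunTFIDF(obj, assay = "ATAC") |> FindTopFeatures(assay = "ATAC")
obj <- RunSVD(obj, reduction.name = "lsi", assay = "ATAC")
# Weighted nearest neighbors; the first LSI component is skipped because it usually tracks sequencing depth
obj <- FindMultiModalNeighbors(obj, reduction.list = list("pca","lsi"), dims.list = list(1:30, 2:30))
obj <- RunUMAP(obj, nn.name = "weighted.nn", reduction.name = "umap")
DimPlot(obj, reduction = "umap")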
In both cases, emphasize data hygiene. Mitochondrial content, doublets, and ambient RNA can distort clusters. For integration, start simple—Seurat’s anchors or Harmony often suffice—and only escalate to deep latent models when your batch effects are stubborn or your sample count is large.
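As a small illustration of the first hygiene check, the sketch below computes the mitochondrial fraction per cell and applies the same 10% cutoff used in the Scanpy flow. It assumes human gene symbols, where mitochondrial genes carry the "MT-" prefix; the cells and counts are invented, and the right cutoff depends on tissue and protocol.

```python
def pct_mito(cell_counts):
    """Percent of a cell's counts that come from genes starting with 'MT-'."""
    total = sum(cell_counts.values())
    mito = sum(n for gene, n in cell_counts.items() if gene.startswith("MT-"))
    return 100.0 * mito / total if total else 0.0

# Toy cells: gene symbol -> UMI count
cells = {
    "cell_1": {"CD3E": 40, "ACTB": 55, "MT-CO1": 5},
    "cell_2": {"CD3E": 10, "ACTB": 50, "MT-CO1": 25, "MT-ND1": 15},
}

mito_pct = {name: pct_mito(counts) for name, counts in cells.items()}
kept = [name for name, pct in mito_pct.items() if pct < 10]

print(mito_pct)
print("kept:", kept)
```

A high mitochondrial fraction often marks stressed or ruptured cells that have lost cytoplasmic RNA, so these cells tend to form their own spurious cluster if left in.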
A story to make it concrete
Imagine a clinical team profiling biopsies from patients who respond to an immunotherapy and those who don’t. With scRNA‑seq alone, you might spot exhausted T cells and immunosuppressive myeloid states. Add CITE‑seq, and protein markers sharpen those assignments. Layer in ATAC+RNA from the same cells, and you connect exhausted T‑cell signatures to specific regulatory programs and transcription factors. Finally, align everything to spatial transcriptomics to map resistant niches at the invasive margin. The synthesis tells a complete story: which cells sit where, how they’re regulated, and which levers might convert non‑responders—exactly the kind of multi‑modal insight single‑cell omics was built to deliver.
Summary / Takeaways
Single‑cell omics moves you from population averages to cellular resolution, unlocking discovery in development, immunology, neuroscience, and oncology. Start with scRNA‑seq to chart cell types and states. Add scATAC‑seq—or better, a joint ATAC+RNA “multiome”—to tie expression to regulation. Use RNA velocity to give your map a sense of direction. Then restore context with spatial transcriptomics and connect your findings to atlases so they’re comparable and reusable. The tools are mature, the datasets are public, and the payoffs are real. What question in your domain would benefit most from a cell‑by‑cell view?
Further Reading
- Human Cell Atlas publications and resources
- 10x Genomics: Epi Multiome ATAC + Gene Expression
- Comprehensive integration of single‑cell data (Seurat v3, Cell 2019)
- Advances in spatial transcriptomics and data analysis (J Translational Medicine 2023)
- DeepVelo: deep learning for RNA velocity (Genome Biology 2024)
