By EVOBYTE, your partner in bioinformatics
Introduction
If you work in omics, you’ve probably felt the gap between a clean differential expression table and the messy reality of human tissue. Cells don’t float in spreadsheets; they live in neighborhoods, along gradients, and inside structures that shape their behavior. The HuBMAP Data Portal was built to close that gap. It brings together multi‑modal, single‑cell and spatial datasets from healthy human tissues, then anchors them to a coherent, navigable map of the body so you can search, visualize, and download data with context. Think of it as a living atlas where RNA, proteins, lipids, and morphology meet anatomy.
In this post, we’ll introduce HuBMAP’s mission, explain the idea of spatial maps for tissues and organs, show what kinds of datasets are inside the portal, and walk through how to browse and programmatically access the data. Along the way we’ll unpack key terms like CCF, HRA, and Vitessce so you can get from curiosity to queries with minimal friction.
HuBMAP’s mission and why it matters for omics
HuBMAP, the Human BioMolecular Atlas Program, is an NIH‑funded consortium creating an open, multi‑scale atlas of healthy human tissues at single‑cell resolution. The goal is straightforward but ambitious: make high‑quality, interoperable data easy to find and analyze, and tie that information back to where it came from in the body. This emphasis on “where” is crucial for spatial omics because cell identity and function depend on location, neighbors, and local microenvironments.
The data portal is the main window into this effort. It hosts public datasets, provides in‑browser visualization, and links to standardized protocols and quality control reports so you can trace provenance and evaluate data fitness. Access is open for browsing and visualization, while downloads for processed and certain controlled files are managed through Globus, with protections for sensitive content. This balance aims to maximize reuse while preserving participant privacy.
As of October 2025, the portal indexed more than five thousand datasets spanning dozens of organs and many assay types, and it continues to grow. That scale makes it a compelling first stop for benchmarking pipelines, building reference panels, or assembling cross‑organ comparisons.
Spatial maps, the Human Reference Atlas, and the Common Coordinate Framework
The portal’s superpower is not just volume; it’s structure. HuBMAP builds spatial maps using the Human Reference Atlas (HRA) and the Common Coordinate Framework (CCF). The HRA provides 3D reference organs, standard terminologies, and linkages to anatomical structures, cell types, and biomarkers, so datasets can be compared and explored consistently rather than piecemeal.
The CCF is the scaffolding that makes this possible. It defines three complementary views of every sample: who the specimen came from (clinical metadata), what anatomical structure and cell types are involved (semantic context using ASCT+B tables), and where exactly the sample sits in a 3D organ model (spatial registration). With these layers in place, you can search and visualize in ways that respect biology—navigating from whole organ to tissue block to single cells while keeping vocabulary and coordinates aligned.
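One way to picture the three CCF views is as three small records attached to every sample. The sketch below is purely illustrative: the class names, field names, IDs, and coordinates are hypothetical and do not reflect HuBMAP's actual schema.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ClinicalContext:
    """The 'who' view: donor-level clinical metadata (hypothetical fields)."""
    donor_id: str
    age_years: int
    sex: str


@dataclass(frozen=True)
class SemanticContext:
    """The 'what' view: anatomical structures and cell types, as in ASCT+B tables."""
    anatomical_structures: Tuple[str, ...]
    cell_types: Tuple[str, ...]


@dataclass(frozen=True)
class SpatialRegistration:
    """The 'where' view: placement of a tissue block in a 3D reference organ."""
    reference_organ: str
    position_mm: Tuple[float, float, float]
    size_mm: Tuple[float, float, float]


@dataclass(frozen=True)
class RegisteredSample:
    """A sample carrying all three views at once."""
    sample_id: str
    who: ClinicalContext
    what: SemanticContext
    where: SpatialRegistration


def samples_in_structure(samples: List[RegisteredSample], structure: str) -> List[RegisteredSample]:
    """Return the samples whose semantic view lists the given anatomical structure."""
    return [s for s in samples if structure in s.what.anatomical_structures]


# Made-up example records, for illustration only.
kidney_block = RegisteredSample(
    sample_id="sample-001",
    who=ClinicalContext(donor_id="donor-A", age_years=54, sex="F"),
    what=SemanticContext(("kidney", "kidney cortex"), ("podocyte",)),
    where=SpatialRegistration("right kidney", (10.0, 22.5, 4.0), (5.0, 5.0, 2.0)),
)
lung_block = RegisteredSample(
    sample_id="sample-002",
    who=ClinicalContext(donor_id="donor-B", age_years=61, sex="M"),
    what=SemanticContext(("lung", "airway epithelium"), ("ciliated cell",)),
    where=SpatialRegistration("right lung", (30.0, 12.0, 8.0), (4.0, 4.0, 1.5)),
)

hits = samples_in_structure([kidney_block, lung_block], "kidney cortex")
print([s.sample_id for s in hits])
```

Keeping the three views as separate records mirrors the idea that a query can start from any one of them (donor, vocabulary, or coordinates) and still land on the same sample.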
To make this usable day to day, HuBMAP offers two interfaces that researchers reach for again and again. The Registration User Interface (RUI) lets data providers spatially register tissue blocks against 3D organ models. The Exploration User Interface (EUI) then exposes those registrations to everyone else, so you can browse semantically and spatially explicit data—from whole body down to single cell—and inspect how datasets align within an organ. In practice, this means you can jump from “kidney cortex” to a set of Visium slides, or from “airway epithelium” to multiplexed images, and see each sample in anatomical context before you ever download a file.
What’s inside: multimodal spatial and single‑cell datasets
The portal aggregates three main modality families—sequencing, microscopy, and mass spectrometry—with a deep roster of specific assays under each umbrella. On the sequencing side you’ll find single‑cell and single‑nucleus RNA‑seq, ATAC‑seq, multiome profiles, and targeted RNA assays. For spatial transcriptomics, the catalog includes Visium, DBiT‑seq, GeoMx, CosMx, Xenium, and additional emerging platforms, all described using shared metadata schemas to aid filtering and reproducibility. (docs.hubmapconsortium.org)
On the imaging side, there are high‑plex immunofluorescence methods such as CODEX, CyCIF, MIBI, PhenoCycler, and Cell DIVE, plus histology and confocal or light‑sheet microscopy. For mass spectrometry, you’ll see MALDI, DESI, SIMS, LC‑MS proteomics, and related metabolite or lipid imaging, which pair beautifully with spatial transcriptomics to connect gene expression with biochemical state. Because HuBMAP enforces assay‑specific metadata fields—think instrument vendors, total imaging rounds, or panel details—you can slice the catalog precisely to your study design.
Critically, the portal doesn’t stop at metadata search. Many datasets open directly in the browser with Vitessce, a web‑native viewer for spatial and single‑cell experiments. You can synchronize scatter plots with tissue images, explore clusters, toggle markers, and inspect segmentation masks without leaving the page. That immediate feedback loop helps you triage relevance before spending time and storage on downloads.
How to browse the HuBMAP Data Portal
Start on the portal’s search page and approach it as you would a good electronic health record (EHR) filter: begin broad, then tighten. Choose an organ to anchor your search (kidney, lung, uterus, brain) and immediately refine by specimen type or derivation, such as fresh frozen section or single-cell suspension. Next, narrow by assay class. If you’re hunting for spatial transcriptomics, select platforms like Visium or CosMx; if you’re validating antibodies, target CODEX or CyCIF. Every time you refine, the results update with donors, samples, datasets, and curated collections you can step through, with links to protocols, QC reports, and visualization where available. You’ll see consistent HuBMAP IDs, for example HBM123.ABCD.456, that you can carry into programmatic queries and manuscripts.
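If you collect HuBMAP IDs in notes or manuscripts, a small validator can catch typos before they reach a script. The sketch below assumes every ID follows the shape of the example above (the prefix HBM, three digits, four uppercase letters, three digits); that pattern is inferred from the example rather than taken from an official specification, and the helper names are hypothetical.

```python
import re

# Assumed ID shape, inferred from the example HBM123.ABCD.456.
HUBMAP_ID_PATTERN = re.compile(r"\bHBM\d{3}\.[A-Z]{4}\.\d{3}\b")


def is_hubmap_id(candidate: str) -> bool:
    """Return True if the whole string matches the assumed HuBMAP ID shape."""
    return HUBMAP_ID_PATTERN.fullmatch(candidate) is not None


def extract_hubmap_ids(text: str) -> list:
    """Pull every ID-shaped token out of free text, in order of appearance."""
    return HUBMAP_ID_PATTERN.findall(text)


notes = "Compare HBM123.ABCD.456 with HBM789.WXYZ.012, but ignore HBM1.AB.2."
print(extract_hubmap_ids(notes))
```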
When you need to reason across space, pivot to the EUI. There, you can load a 3D organ model, toggle registered tissue blocks, and click into derivatives like Visium slides or multiplexed images. It’s a subtle shift—from filtering rows to exploring anatomy—but it changes how you interpret results. You begin to notice gradients along axes, zonation effects, or how immune cells pool near particular landmarks. A good pattern is to scout candidates in the EUI, verify assay and QC details on the dataset page, preview in Vitessce, then decide what to download.
Access is intentionally simple for discovery. Anyone can browse and visualize public datasets without logging in. To download processed data at scale, sign in via Globus; protected files with human genetic sequences may require additional approvals. The portal’s access model labels datasets as public, consortium, or protected, which helps you plan timelines for analysis and publication.
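When planning an analysis, it can help to sort candidate datasets by access level up front. The sketch below assumes each record is a dictionary with a `hubmap_id` and a `data_access_level` key; treat those field names, the example records, and the note for the consortium level as assumptions to check against the actual search results.

```python
from collections import defaultdict

# What each access level implies for planning. The "consortium" note is an assumption.
ACCESS_NOTES = {
    "public": "browse and visualize without login; sign in via Globus to download at scale",
    "consortium": "assumed to be limited to consortium members",
    "protected": "may require additional approvals",
}


def group_by_access(records):
    """Group dataset IDs by access level; records without a level go under 'unknown'."""
    groups = defaultdict(list)
    for record in records:
        level = record.get("data_access_level", "unknown")
        groups[level].append(record["hubmap_id"])
    return dict(groups)


# Made-up records for illustration only.
candidates = [
    {"hubmap_id": "HBM123.ABCD.456", "data_access_level": "public"},
    {"hubmap_id": "HBM234.BCDE.567", "data_access_level": "protected"},
    {"hubmap_id": "HBM345.CDEF.678", "data_access_level": "public"},
]

for level, ids in group_by_access(candidates).items():
    print(level, ids, "->", ACCESS_NOTES.get(level, "check the dataset page"))
```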
Programmatic access: from filters to API calls in minutes
Once you’ve found your bearings, you’ll likely want to script searches and automate downloads. HuBMAP exposes a Search API with a convenient parameterized endpoint, plus a Python SDK. You can replicate your portal filters with a couple of query parameters and get back JSON you can feed into notebooks, CLIs, or pipelines.
Here’s a minimal Python example that finds Visium datasets from the right lung and prints the first five HuBMAP IDs. You can adapt the organ code (RL for right lung, RK for right kidney, HT for heart, and so on) or swap dataset_type for other assays such as CODEX or ATACseq.
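    # pip install requests
    import requests

    # Base URL of the HuBMAP Search API (v3).
    SEARCH_API = "https://search.api.hubmapconsortium.org/v3/"

    # Parameterized search: datasets from right lung (RL) samples, Visium (with probes) assay.
    query = "param-search/datasets?origin_samples.organ=RL&dataset_type=Visium%20(with%20probes)"

    # Call the endpoint directly over HTTP and fail loudly on a non-2xx response.
    resp = requests.get(SEARCH_API + query, timeout=20)
    resp.raise_for_status()

    # Collect the HuBMAP ID of each returned dataset record and show the first five.
    datasets = [d["hubmap_id"] for d in resp.json()]
    print(datasets[:5])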
Those few lines cover the most common programmatic pattern: identify datasets by organ and assay, grab IDs or a manifest, and hand the results to your favorite workflow engine. The API docs also detail nested attributes you can query, such as targeted vs. whole-transcriptome runs, plus endpoints for donors and samples when you need richer clinical or specimen context.
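To make that pattern reusable, you might wrap the URL construction and the manifest step in two small helpers. This is a sketch under stated assumptions: the helper names are hypothetical, and the manifest layout (one line per dataset, the HuBMAP ID followed by a path) is an assumption to verify against the transfer tool's documentation before use.

```python
from urllib.parse import quote, urlencode

SEARCH_API = "https://search.api.hubmapconsortium.org/v3/"


def build_param_search_url(entity_type: str, filters: dict) -> str:
    """Build a parameterized search URL; spaces in values are encoded as %20."""
    query = urlencode(filters, quote_via=quote)
    return f"{SEARCH_API}param-search/{entity_type}?{query}"


def manifest_lines(hubmap_ids, path="/"):
    """Format one manifest line per dataset: '<hubmap_id> <path>' (assumed layout)."""
    return [f"{hubmap_id} {path}" for hubmap_id in hubmap_ids]


url = build_param_search_url(
    "datasets",
    {"origin_samples.organ": "RL", "dataset_type": "Visium (with probes)"},
)
print(url)

# Example IDs are placeholders, not real datasets.
print("\n".join(manifest_lines(["HBM123.ABCD.456", "HBM789.WXYZ.012"])))
```

Passing filters as a dictionary keeps dotted keys such as `origin_samples.organ` intact and leaves the percent-encoding to the standard library instead of hand-written escapes.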
Summary / Takeaways
HuBMAP’s Data Portal is more than a repository; it’s a spatially aware operating system for omics. You get standardized metadata, QC and protocols for trust, Vitessce for fast triage, and the HRA/CCF backbone to keep results anatomically grounded. You can browse like a clinician reading an imaging study, then pivot to APIs when you’re ready to scale. If you’ve been stitching together single‑cell tables and slide images by hand, this is your chance to stand on a coordinated scaffold and move faster.
If you’re new, start with an organ you know well and follow a dataset from the portal page into Vitessce and then the EUI. If you’re ready to automate, translate your filters into a parameterized search call and capture a manifest for the HuBMAP Command Line Transfer (CLT) tool. And if you’re building atlases of your own, consider aligning to the HRA so your results land in a shared spatial language others can query. The maps are there; the next discovery is about to have an address.
