LLM Integration

Summary

The HuBMAP Data Portal serves a wide range of biological researchers, from experimental biologists to bioinformaticians to clinicians, each with different levels of computational expertise and distinct workflows. Yet all of these groups share two core challenges:

  1. Navigation: “Where do I go to find the right data or answer my research question?”

  2. Understanding: “What is this data telling me in the context of biology?”

To address these gaps, I explored three LLM-enabled experiences that would lower the barrier to insight and make HuBMAP more accessible, intuitive, and context-aware.

Role

UX Research, UI Design

End Users

Experimental Biologists, Bioinformaticians, Clinicians, Educators & Students

Project Duration

2023

Target Users & Pain Points

Experimental Biologist

Expertise in human biology, not computational tools

Navigational Pain Points:

  • “Where do I ask my research question?”

  • “What page actually helps me understand cell distribution, markers, or tissue context?”

Understanding Pain Points:
  • Interpreting data outputs

  • Understanding what analysis results mean biologically

Computational Biologist

Skilled in Python, R, or MATLAB for analyzing biological data

Navigational Pain Points:
  • Finding datasets with the right modalities

  • Knowing whether files are complete or suitable for download

Understanding Pain Points:
  • Biological context behind datasets

  • Understanding data quality / tissue annotation nuances

Both groups would benefit from context-aware AI that helps them find the right data and interpret it.

Three LLM Applications

(1) HuBMAP Assistant (Conversational Navigation)

A GPT-style interactive assistant that:

  • Helps users navigate the portal through natural-language interaction

  • Interprets open-ended biological queries and translates them into actionable searches

  • Directs users to the most relevant datasets, organs, tools, or pages

  • Provides contextual summaries and explanations without requiring menu-based navigation

(2) Generative Search AI

The portal’s existing search relies on exact-match keywords and metadata filters. Generative Search AI enables semantic, biological, and goal-oriented search by interpreting natural-language queries and converting them into structured filters. It can also automatically generate visual summaries based on the user’s query, such as assay distributions or organ-level comparisons, helping researchers understand results at a glance.
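
To make the "natural language to structured filters" idea concrete, here is a minimal sketch of the step that would sit between the LLM and the search backend. It assumes the model has been asked to return its interpretation of the query as JSON, and it only checks that output against a known vocabulary before filtering and summarizing. The facet names, values, and dataset records below are hypothetical placeholders, not the portal's actual schema or API.

```python
import json
from collections import Counter

# Hypothetical facet vocabulary; the real portal's filter names and values may differ.
ALLOWED_FILTERS = {
    "organ": {"kidney", "heart", "lung"},
    "assay": {"snRNA-seq", "CODEX", "ATAC-seq"},
}

# Hypothetical dataset records, used only to illustrate filtering and summaries.
DATASETS = [
    {"id": "A", "organ": "kidney", "assay": "snRNA-seq"},
    {"id": "B", "organ": "kidney", "assay": "CODEX"},
    {"id": "C", "organ": "heart", "assay": "snRNA-seq"},
]


def parse_filters(llm_json: str) -> dict:
    """Turn the model's JSON output into filters, dropping unknown facets and values."""
    raw = json.loads(llm_json)
    filters = {}
    for facet, allowed in ALLOWED_FILTERS.items():
        values = raw.get(facet) or []
        if isinstance(values, str):
            values = [values]  # accept a single value as well as a list
        kept = [v for v in values if v in allowed]
        if kept:
            filters[facet] = kept
    return filters


def apply_filters(datasets: list, filters: dict) -> list:
    """Keep datasets that match every requested facet."""
    return [
        d for d in datasets
        if all(d.get(facet) in values for facet, values in filters.items())
    ]


def assay_distribution(datasets: list) -> dict:
    """Count datasets per assay, e.g. as input for a bar-chart summary."""
    return dict(Counter(d["assay"] for d in datasets))
```

Validating the model's output against a fixed vocabulary is one way to keep a hallucinated organ or assay name from reaching the search layer; the counts from `assay_distribution` could then feed the visual summary described above.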

(3) AI Tooltips (Inline Context + Biological Definitions)

AI tooltips provide on-demand explanations for complex scientific terms, improving interpretability and reducing cognitive load while users explore datasets.

How it works:

When a user hovers over a complex term (e.g., snRNA-seq, CD markers), AI generates:

  • A concise definition

  • Relevant biological or tissue context
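
The hover flow above could be sketched roughly as follows. This is an illustrative outline only: it assumes some text-generation function is available and passes it in as a plain callable, so no particular LLM provider or portal API is implied. The prompt wording, the word limit, and the names `build_tooltip_prompt`, `truncate`, and `TooltipService` are all hypothetical.

```python
def build_tooltip_prompt(term: str, page_context: str) -> str:
    """Ask for the two pieces the tooltip shows: a definition and its context."""
    return (
        f"Define the term '{term}' in one or two sentences for a researcher, "
        f"then add one sentence of biological or tissue context relevant to: {page_context}."
    )


def truncate(text: str, max_words: int) -> str:
    """Keep the tooltip lightweight by capping its length in words."""
    words = text.split()
    if len(words) <= max_words:
        return " ".join(words)
    return " ".join(words[:max_words]) + "..."


class TooltipService:
    """Generates tooltip text on demand and caches it per term and page context."""

    def __init__(self, generate, max_words: int = 40):
        self._generate = generate  # any callable: prompt -> generated text
        self._max_words = max_words  # assumed length budget, not a tested value
        self._cache = {}

    def tooltip(self, term: str, page_context: str) -> str:
        key = (term.lower(), page_context)
        if key not in self._cache:
            prompt = build_tooltip_prompt(term, page_context)
            self._cache[key] = truncate(self._generate(prompt), self._max_words)
        return self._cache[key]
```

Caching by term and page context means a repeated hover does not trigger a second model call, which helps the tooltip stay responsive.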


Why it is useful:

  • Helps experimental biologists understand file structures, assay names, and modality-specific language

  • Helps computational researchers interpret domain-specific biological terminology


Design Intent

Tooltips appear only when needed and stay lightweight to avoid overwhelming users with long blocks of text. They support quick understanding without forcing users to navigate away from the page or read documentation.

Impact

  • Presented the AI proposal to stakeholders, establishing a clear design direction for LLM features in the portal

  • Provided a unified UX vision that aligned engineering, product, and scientific teams

  • Opened opportunities for future intelligent analysis features within HuBMAP’s in-browser data analysis environment

© 2025