Healthcare Knowledge Graph

Structured, standards-based healthcare data built to support safe, accurate, and intelligent AI applications.

Clinical Terminologies & Code Systems (UMLS)

SNOMED CT

Standardized clinical terminology for documenting patient data.

LOINC

Terminology for laboratory tests and clinical observations.

RxNorm

Normalized naming system for medications (linked to NDCs).

ICD-10

International Statistical Classification of Diseases and Related Health Problems.

MeSH

Biomedical vocabulary used for indexing and literature tagging.

CPT

AMA-maintained procedure codes for outpatient/office-based care.

HCPCS Level II

Codes for durable medical equipment, drugs, and supplies.

NDC

FDA-maintained medication product codes.
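
Because NDCs are published as 10 digits in three hyphenated layouts (4-4-2, 5-3-2, 5-4-1) while billing and many drug databases expect an 11-digit 5-4-2 form, records usually need normalizing before they can be joined. A minimal sketch of that conversion follows; the function name `ndc10_to_ndc11` is hypothetical, and it assumes the input still carries its hyphens, since an unhyphenated 10-digit NDC cannot be converted unambiguously.

```python
# Valid segment lengths (labeler, product, package) for a 10-digit NDC.
VALID_LAYOUTS = {(4, 4, 2), (5, 3, 2), (5, 4, 1)}


def ndc10_to_ndc11(ndc: str) -> str:
    """Convert a hyphenated 10-digit NDC to the 11-digit 5-4-2 form (no hyphens)."""
    parts = ndc.strip().split("-")
    if len(parts) != 3 or not all(p.isascii() and p.isdigit() for p in parts):
        raise ValueError(f"expected three hyphen-separated digit groups: {ndc!r}")
    if tuple(len(p) for p in parts) not in VALID_LAYOUTS:
        raise ValueError(f"not a recognized 10-digit NDC layout: {ndc!r}")
    labeler, product, package = parts
    # Exactly one segment is short; left-pad it with a zero to reach 5-4-2.
    return labeler.zfill(5) + product.zfill(4) + package.zfill(2)
```

The sample values used to exercise this sketch are illustrative inputs chosen to cover each layout, not references to specific products.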

MedDRA

Standardized medical terminology for regulatory activities, such as coding adverse events in drug safety reporting.

Clinical Standards and Interoperability (FHIR/HL7)

FHIR R4 / R5 Resources

Standard schemas and APIs for patient, condition, encounter, medication, observation, etc.
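
To make the resource schemas concrete, the sketch below shows roughly what a FHIR R4 Observation looks like as JSON-style data and how its terminology codes can be read out. The instance itself (patient reference, value) is a made-up example, and `codes_for_system` is a hypothetical helper, not part of any FHIR library.

```python
from typing import Any

LOINC = "http://loinc.org"  # canonical FHIR system URI for LOINC

# Hypothetical Observation instance; identifiers and values are illustrative only.
observation: dict[str, Any] = {
    "resourceType": "Observation",
    "status": "final",
    "code": {
        "coding": [
            {
                "system": LOINC,
                "code": "2345-7",
                "display": "Glucose [Mass/volume] in Serum or Plasma",
            }
        ]
    },
    "subject": {"reference": "Patient/example"},
    "valueQuantity": {
        "value": 95,
        "unit": "mg/dL",
        "system": "http://unitsofmeasure.org",
        "code": "mg/dL",
    },
}


def codes_for_system(resource: dict[str, Any], system: str) -> list[str]:
    """Return the codes in resource.code.coding that belong to the given code system."""
    codings = resource.get("code", {}).get("coding", [])
    return [c["code"] for c in codings if c.get("system") == system and "code" in c]
```

Keying on the `system` URI rather than the display text is what lets a knowledge graph link the same observation to LOINC, SNOMED CT, or a local code without ambiguity.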

HL7 v2.x Segments

Message segments used in lab, admit/discharge/transfer (ADT), and billing transactions.
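
As a rough illustration of the segment structure, the sketch below splits a v2 message into segments (one per carriage return) and fields (separated by `|`). The sample message is fabricated, `parse_segments` is a hypothetical helper, and the sketch assumes the default delimiters rather than reading them from MSH; it also ignores escape sequences and repetitions, which a real parser has to handle.

```python
def parse_segments(message: str) -> list[tuple[str, list[str]]]:
    """Split an HL7 v2 message into (segment_id, fields) pairs.

    Assumes the default delimiters: carriage return between segments and
    '|' between fields. For non-MSH segments, fields[n] is field n.
    """
    segments = []
    # Tolerate newline-terminated input by treating '\n' like '\r'.
    for line in message.replace("\n", "\r").split("\r"):
        if not line.strip():
            continue
        fields = line.split("|")
        segments.append((fields[0], fields))
    return segments


# Fabricated lab result (ORU^R01) message for illustration only.
sample = "\r".join(
    [
        "MSH|^~\\&|LAB|HOSP|EHR|HOSP|202401011200||ORU^R01|123|P|2.5",
        "PID|1||12345^^^HOSP^MR||DOE^JANE",
        "OBX|1|NM|2345-7^Glucose^LN||95|mg/dL",
    ]
)

parsed = parse_segments(sample)
obx_fields = next(fields for seg_id, fields in parsed if seg_id == "OBX")
loinc_code = obx_fields[3].split("^")[0]  # OBX-3: observation identifier
result_value = obx_fields[5]              # OBX-5: observation value
```

Note that MSH is numbered differently from other segments, because MSH-1 is the field separator character itself; the sketch sidesteps this by only indexing into OBX.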

US Core FHIR Profiles

Official U.S. FHIR profiles that support compliance with the 21st Century Cures Act.

SMART on FHIR App Gallery

Library of existing apps and associated APIs for EHR integration.

Clinical Quality Language (CQL)

Standard for authoring and sharing logic used in quality measures and CDS.

Value Set Authority Center (VSAC)

NLM-managed collection of value sets used in quality reporting and decision support.

Regulatory, Licensing, and Directory Data

NPPES (NPI Registry)

National Provider Identifier registry of healthcare professionals and organizations.
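
An NPI is a 10-digit number whose last digit is a Luhn check digit computed over the first nine digits prefixed with `80840`, so malformed identifiers can be rejected before any registry lookup. A minimal sketch of that check follows; `is_valid_npi` is a hypothetical name, and passing the check only means the number is well-formed, not that it is assigned to a provider.

```python
def is_valid_npi(npi: str) -> bool:
    """Return True if npi is 10 digits and passes the Luhn check with the 80840 prefix."""
    if len(npi) != 10 or not (npi.isascii() and npi.isdigit()):
        return False
    digits = [int(c) for c in "80840" + npi]
    total = 0
    # Walk right to left; the check digit (index 0) is not doubled,
    # then every second digit is doubled and reduced to a single digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0
```

For example, `is_valid_npi("1234567893")` returns `True`, while changing the final digit to `0` makes it return `False`.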

PECOS

Medicare enrollment data for eligible providers and suppliers.

CLIA Laboratory Database

Directory of all CLIA-certified labs and their testing capabilities.

CMS Medicare Advantage & Part D Plan Data

Details on active health plans, formularies, and star ratings.

CMS Care Compare (formerly Hospital Compare) & Provider Data Catalog

Facility-level data on hospitals, quality scores, services, and affiliations.

AHRQ Quality Indicators

National-level hospital and patient safety metrics.

DEA Controlled Substance Registration Database

Database of providers registered with the DEA to prescribe controlled substances.

State Medical Board Licensure Data

Verified licensure information from state medical boards for physicians and other licensed healthcare professionals.

FDA Registered Establishments Database

Directory of all facilities registered with the FDA, including manufacturers of medical devices and drugs.

Payer, Plan & Benefits Data

HL7 Da Vinci Payer Data Exchange (PDex)

FHIR implementation guide, referenced by CMS interoperability rules, for payers to share claims, encounter, and formulary data.

Health Insurance Marketplace Public Use Files

ACA plan benefit and cost sharing data by county.

NAIC Plan Registry

State-level insurer licensing and financial data.

X12 270/271 Eligibility Formats

Standards for verifying patient insurance eligibility and benefits.

CMS Formulary Reference File

Official list of drugs covered by Part D with tier info.

Public Health & Surveillance Data

CDC Reportable Conditions Knowledge Management System (RCKMS)

State-specific rules for case reporting to public health.

CDC Vaccine Code Sets (CVX, MVX)

Codes for immunizations and vaccine manufacturers.

CDC WONDER / NNDSS / BRFSS

Aggregated datasets on disease surveillance, risk factors, and mortality statistics.

FDA Recalls & Adverse Event Reporting System (FAERS)

FDA recall notices, plus adverse event reports for drugs and therapeutic biologics (FAERS).

Social Determinants & Location-Based Data

CDC/ATSDR Social Vulnerability Index (SVI)

Composite census tract-level index used to assess SDOH risk.

Area Deprivation Index (ADI)

Neighborhood-level socioeconomic disadvantage scores.

HRSA HPSA & MUA Maps

Designations of medically underserved areas and provider shortages.

SAMHSA Treatment Locator

National database of behavioral and addiction treatment centers.

CDC PLACES

Local estimates of chronic disease prevalence, health behaviors, and access to care at the county, place, census tract, and ZIP Code Tabulation Area (ZCTA) levels.

USDA Food Access Research Atlas

Geospatial data identifying food deserts and access to grocery stores and healthy food options.

Training Data

Commercial Claims Databases

De-identified insurance claims data for modeling cost, utilization, and outcomes (e.g., Optum, Truven).

Patient-Generated Health Data (PGHD)

De-identified data from wearables, mobile health apps, and patient-reported outcomes.

MIMIC-IV Clinical Database

De-identified ICU and hospital encounter data including vitals, labs, medications, and outcomes.

National Health and Nutrition Examination Survey (NHANES)

Comprehensive health survey data including physical exams, lab tests, and behavioral data.

SEER Cancer Registry

Population-based cancer incidence and outcomes dataset from the National Cancer Institute.

Synthea Synthetic Health Records

Realistic, synthetic patient records with demographics, encounters, diagnoses, and procedures.