Identify chemicals associated with your genes using enrichment analysis and data from the Comparative Toxicogenomics Database.
Given a list of genes (e.g. from a differential expression analysis), ctdR tells you which chemicals are significantly associated with those genes according to CTD. Two complementary methods are available:
ORA (over-representation analysis): tests whether your gene list overlaps significantly with each chemical's known targets. Best when you have a defined gene set with a clear significance cutoff.
Powered by clusterProfiler::enricher
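To make the overlap test concrete, here is a minimal base R sketch of the hypergeometric calculation that over-representation analysis is built on. All numbers are hypothetical and chosen only for illustration; this is not ctdR's internal code, and the fold-enrichment formula shown is one common definition rather than a confirmed match for the package's `foldEnrichment` column.

```r
# Hypothetical counts, for illustration only
universe_size <- 20000  # genes in the background
target_size   <- 300    # background genes annotated to one chemical
list_size     <- 10     # genes in your list
overlap       <- 9      # genes in your list that are also targets

# P(X >= overlap) when drawing list_size genes at random from the background
p_value <- phyper(overlap - 1, target_size, universe_size - target_size,
                  list_size, lower.tail = FALSE)

# Observed target fraction in the list relative to the background fraction
fold_enrichment <- (overlap / list_size) / (target_size / universe_size)

print(p_value)
print(fold_enrichment)
```

In practice this test is repeated for every chemical, so the raw p-values need a multiple-testing correction before interpretation.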
GSEA (gene set enrichment analysis): uses your full ranked gene list to find chemicals whose targets cluster at the extremes. No arbitrary cutoff needed.
Powered by fgsea::fgsea
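The idea of targets "clustering at the extremes" can be illustrated with the classic weighted running-sum enrichment score. The sketch below uses base R and a toy ranking; the gene names, statistics, and target set are hypothetical, and it shows the general technique rather than ctdR's or fgsea's actual implementation (fgsea additionally estimates p-values and normalized scores).

```r
# Toy ranked statistics (hypothetical), sorted from most positive to most negative
stats <- sort(c(g1 = 3.2, g2 = 2.5, g3 = 1.9, g4 = 0.4,
                g5 = -0.3, g6 = -1.1, g7 = -2.0, g8 = -2.8),
              decreasing = TRUE)
targets <- c("g1", "g2", "g4")  # hypothetical targets of one chemical

in_set <- names(stats) %in% targets

# Walk down the ranking: step up at each target (weighted by |statistic|),
# step down at each non-target
hit  <- cumsum(ifelse(in_set, abs(stats), 0)) / sum(abs(stats[in_set]))
miss <- cumsum(!in_set) / sum(!in_set)
running <- hit - miss

# Enrichment score: the running sum's largest deviation from zero
es <- running[which.max(abs(running))]
print(es)
```

A positive score means the targets sit mostly near the top of the ranking, a negative score near the bottom.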
devtools::install_github("drake69/ctdR")
Download CTD_chem_gene_ixns.csv.gz from ctdbase.org and decompress it.
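If you prefer to decompress the file from within R, the following sketch does it with base R connections only. The helper name `decompress_gz` is hypothetical and not part of ctdR; any standard gunzip tool works just as well.

```r
# Hypothetical helper (not part of ctdR): stream a .gz file to an
# uncompressed copy next to it, using base R connections only.
decompress_gz <- function(gz_path, out_path = sub("\\.gz$", "", gz_path)) {
  gz_path  <- path.expand(gz_path)
  out_path <- path.expand(out_path)

  src <- gzfile(gz_path, open = "rb")
  on.exit(close(src), add = TRUE)
  dst <- file(out_path, open = "wb")
  on.exit(close(dst), add = TRUE)

  repeat {
    chunk <- readBin(src, what = raw(), n = 1e6)  # read up to 1e6 bytes at a time
    if (length(chunk) == 0) break
    writeBin(chunk, dst)
  }
  invisible(out_path)
}
```

For example, `decompress_gz("~/Downloads/CTD_chem_gene_ixns.csv.gz")` would write `CTD_chem_gene_ixns.csv` to the same folder.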
library(ctdR)
import_CTD("~/Downloads/CTD_chem_gene_ixns.csv")
# my_genes: a data frame with an 'entrez_ids' column and a numeric column (see the examples)
results <- enrichment_CTD(my_genes, method = "ORA")
head(results)
You have a set of genes involved in inflammation and want to find which chemicals are known to interact with them.
library(ctdR)
# A set of inflammatory genes (Entrez IDs)
# TNF, IL6, IL1B, PTGS2, CXCL8, CCL2, NFKB1, STAT3, MMP9, ICAM1
genes <- data.frame(
  entrez_ids = c("7124", "3569", "3553", "5743", "3576",
                 "6347", "4790", "6774", "4318", "3383"),
  pvalue = c(0.001, 0.002, 0.003, 0.005, 0.007,
             0.01, 0.015, 0.02, 0.03, 0.04)
)
# Find chemicals enriched for these genes
ora_results <- enrichment_CTD(genes, method = "ORA")
# View top hits
head(ora_results[, c("ChemicalName", "padj",
                     "foldEnrichment", "Count")])
#>          ChemicalName    padj foldEnrichment Count
#> 1 Lipopolysaccharides 1.2e-15            8.3     9
#> 2       Dexamethasone 3.4e-12            6.1     8
#> 3       Acetaminophen 7.8e-10            5.4     7
#> ...
You have differential expression results and want to use the full ranking (not just significant genes) to discover chemical associations.
library(ctdR)
# Load your DEG results (Entrez IDs + p-values)
deg_results <- read.csv("my_deg_results.csv")
# Prepare input: must have 'entrez_ids' column + numeric column
genes <- data.frame(
  entrez_ids = deg_results$entrez_id,
  pvalue = deg_results$pvalue
)
# Run GSEA — uses full ranking, no cutoff needed
gsea_results <- enrichment_CTD(genes, method = "GSEA")
# View top enriched chemicals
head(gsea_results[, c("ChemicalName", "NES",
                      "padj", "size")])
#>     ChemicalName   NES  padj size
#> 1 Benzo(a)pyrene  2.41 0.001  312
#> 2  Valproic Acid  2.18 0.003  287
#> 3      Estradiol -1.95 0.005  445
#> ...
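A typical next step is to keep only the significant chemicals and separate them by the sign of the enrichment score. The sketch below rebuilds the illustrative output shown above as a stand-in data frame so that it runs without ctdR; the 0.05 threshold is an assumption, and how the sign of NES maps onto your biology depends on how the genes were ranked.

```r
# Stand-in for the result of enrichment_CTD(genes, method = "GSEA"),
# copied from the illustrative output above
gsea_results <- data.frame(
  ChemicalName = c("Benzo(a)pyrene", "Valproic Acid", "Estradiol"),
  NES  = c(2.41, 2.18, -1.95),
  padj = c(0.001, 0.003, 0.005),
  size = c(312, 287, 445),
  stringsAsFactors = FALSE
)

# Keep significant chemicals (assumed threshold), strongest effect first
sig <- gsea_results[gsea_results$padj < 0.05, ]
sig <- sig[order(-abs(sig$NES)), ]

# Split by which end of the ranking the targets cluster at
top_end    <- sig$ChemicalName[sig$NES > 0]
bottom_end <- sig$ChemicalName[sig$NES < 0]

print(top_end)
print(bottom_end)
```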
This package does not bundle or redistribute any data from the Comparative Toxicogenomics Database. CTD data are maintained by NC State University. Users must download the data directly from ctdbase.org and comply with the CTD Terms of Service.
Please cite CTD in publications: Davis AP et al. (2023) Nucleic Acids Research, 51(D1), D1257-D1262. doi:10.1093/nar/gkac833