scPairs: Identifying Synergistic Gene Pairs in Single-Cell and Spatial Transcriptomics

License: MIT | R ≥ 4.1.0

Overview

scPairs is an R package for the systematic identification and multi-evidence evaluation of synergistic gene pairs in single-cell and spatial transcriptomics data. Unlike conventional pairwise co-expression analyses that rely on a single correlation metric, scPairs integrates 14 complementary metrics across five orthogonal evidence layers to compute a composite synergy score with optional permutation-based significance testing.

The five evidence layers span cell-level co-expression (Pearson, Spearman, biweight midcorrelation, mutual information, ratio consistency), neighbourhood-aware smoothing (KNN-smoothed correlation, neighbourhood co-expression, cluster pseudo-bulk, cross-cell-type, neighbourhood synergy), prior biological knowledge (GO/KEGG co-annotation Jaccard, pathway bridge score), trans-cellular interaction, and spatial co-variation (Lee's L, co-location quotient). This multi-scale design enables researchers to move beyond simple co-expression towards a comprehensive characterisation of cooperative gene regulation at transcriptomic and spatial resolution.

Table of Contents

  1. Preparation
  2. Quick Start
  3. Architecture
  4. Visualization Gallery
  5. Functions
  6. Built-in Test Dataset
  7. Citation
  8. License
  9. Contact

1. Preparation

1.1 Installation

# Install from GitHub
if (!require("devtools")) install.packages("devtools")
devtools::install_github("zhaoqing-wang/scPairs")

# Optional: prior knowledge integration (GO/KEGG annotation)
if (!require("BiocManager")) install.packages("BiocManager")
BiocManager::install(c("org.Mm.eg.db", "org.Hs.eg.db", "AnnotationDbi"))
# Install missing CRAN dependencies manually
install.packages(c("data.table", "ggplot2", "ggraph", "ggrepel",
                   "igraph", "Matrix", "patchwork", "Seurat",
                   "tidygraph", "tidyr"))
# Optional accelerators and extras
install.packages(c("RANN", "ggExtra", "crayon"))

1.2 Dependencies

Required: Seurat (≥ 4.0), data.table, ggplot2, ggraph, ggrepel, igraph, Matrix, patchwork, tidygraph, tidyr

Optional:

| Package | Purpose |
| --- | --- |
| org.Mm.eg.db / org.Hs.eg.db | GO & KEGG prior knowledge (mouse / human) |
| AnnotationDbi | Gene annotation infrastructure |
| RANN | Fast approximate KNN (10–100× speedup for neighbourhood metrics) |
| ggExtra | Marginal density panels in PlotPairScatter() |
| crayon | Colourised startup message |

2. Quick Start

library(scPairs)
sce <- readRDS("your_data.rds")  # Seurat object, normalised

2.1 Global Pair Discovery

# All 14 metrics (default)
result <- FindAllPairs(sce, n_top_genes = 1000, top_n = 200)

# Expression metrics only — no annotation databases required
result <- FindAllPairs(sce, mode = "expression")

# Prior knowledge only — fast pathway-based screening
result <- FindAllPairs(sce, mode = "prior_only", organism = "mouse")

# With permutation testing (empirical p-values)
result <- FindAllPairs(sce, n_top_genes = 500, n_perm = 999)

# Visualise
PlotPairNetwork(result, top_n = 50)
PlotPairHeatmap(result, top_n = 25)
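
The n_perm argument above requests empirical p-values. The permutation scheme scPairs uses internally is not described here, so the following is only a minimal base-R sketch of the general idea. It assumes that one gene's values are shuffled across cells and that the test statistic is the absolute Pearson correlation. perm_pvalue() is a hypothetical helper, not a package function.

```r
# Hypothetical helper (not part of scPairs): empirical p-value for one pair.
# Assumption: the null is built by shuffling gene y across cells.
perm_pvalue <- function(x, y, n_perm = 999,
                        stat = function(a, b) abs(cor(a, b))) {
  observed <- stat(x, y)
  # Recompute the statistic on n_perm shuffled copies of y
  null <- replicate(n_perm, stat(x, sample(y)))
  # The +1 terms count the observed value itself, so p is never exactly 0
  (1 + sum(null >= observed)) / (n_perm + 1)
}

set.seed(1)
gene_a <- rnorm(50)
gene_b <- gene_a + rnorm(50, sd = 0.5)   # simulated co-expressed partner
p_value <- perm_pvalue(gene_a, gene_b, n_perm = 999)
```

With n_perm = 999 the smallest attainable p-value is 1/1000, which is why permutation counts of the form 10^k - 1 are a common choice.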

2.2 Query-Centric Partner Search

tp53_partners <- FindGenePairs(sce, gene = "TP53", top_n = 20)
PlotPairNetwork(tp53_partners)
PlotPairDimplot(sce, gene1 = "TP53", gene2 = tp53_partners$pairs$gene2[1])

2.3 In-Depth Pair Assessment

assessment <- AssessGenePair(sce, gene1 = "CD8A", gene2 = "CD8B")
print(assessment)                                          # structured summary

PlotPairSmoothed(sce, gene1 = "CD8A", gene2 = "CD8B")    # raw vs KNN-smoothed
PlotPairSummary(sce,  gene1 = "CD8A", gene2 = "CD8B",
                result = assessment)                       # full evidence dashboard

# All visualisation functions accept any result class interchangeably
PlotPairNetwork(assessment)
PlotPairHeatmap(assessment)

2.4 Prior Knowledge & Bridge Gene Network

# Requires: BiocManager::install(c("org.Mm.eg.db", "AnnotationDbi"))
result <- AssessGenePair(sce, gene1 = "Adora2a", gene2 = "Ido1",
                         organism = "mouse")

# 6-panel synergy dashboard (expression + neighbourhood + prior evidence)
PlotPairSynergy(sce, gene1 = "Adora2a", gene2 = "Ido1")

# Standalone bridge gene network
PlotBridgeNetwork(sce, gene1 = "Adora2a", gene2 = "Ido1",
                  organism = "mouse", top_bridges = 15)

# Custom interaction databases (CellChatDB, CellPhoneDB, SCENIC, etc.)
custom_db <- data.frame(gene1 = c("Adora2a"), gene2 = c("Ido1"))
result <- FindGenePairs(sce, gene = "Adora2a", custom_pairs = custom_db)

2.5 Spatial Transcriptomics

Spatial metrics are detected and computed automatically for Visium, MERFISH, Slide-seq, and similar modalities:

result <- FindAllPairs(spatial_obj, n_top_genes = 500)   # spatial metrics auto-enabled
PlotPairSpatial(spatial_obj, gene1 = "EPCAM", gene2 = "KRT8")
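
For intuition about the spatial layer, the sketch below computes Lee's L from its standard textbook definition in base R. It is not taken from the scPairs source. The rook-neighbour weight matrix, the 4 × 4 grid, and the helper name lee_L() are all illustrative assumptions.

```r
# Hypothetical helper (not part of scPairs): bivariate Lee's L statistic.
# x, y: expression of two genes per spot; W: n x n spatial weight matrix.
lee_L <- function(x, y, W) {
  xc <- x - mean(x)
  yc <- y - mean(y)
  lag_x <- as.vector(W %*% xc)   # spatially lagged (neighbour-weighted) x
  lag_y <- as.vector(W %*% yc)   # spatially lagged (neighbour-weighted) y
  n <- length(x)
  (n / sum(rowSums(W)^2)) * sum(lag_x * lag_y) /
    sqrt(sum(xc^2) * sum(yc^2))
}

# Toy 4 x 4 grid of spots with row-standardised rook-neighbour weights
coords <- expand.grid(row = 1:4, col = 1:4)
d <- as.matrix(dist(coords))
W <- (d > 0 & d <= 1) * 1
W <- W / rowSums(W)

gene_x <- coords$col                      # left-to-right gradient
gene_y <- 2 * coords$col + coords$row     # similar gradient plus a row trend
L_xy <- lee_L(gene_x, gene_y, W)
```

When W is the identity matrix (each spot is its own only neighbour), the statistic reduces to the ordinary Pearson correlation, which is a convenient sanity check.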

3. Architecture

3.1 Computation Modes

| Mode | Metrics Computed | Dependencies | Use Case |
| --- | --- | --- | --- |
| "all" (default) | All 14 metrics | All packages | Full multi-evidence analysis |
| "expression" | Co-expression + neighbourhood (10) | No annotation DBs | No external databases needed |
| "prior_only" | Prior knowledge scores only (2) | org.*.db, AnnotationDbi | Fast pathway screening |

3.2 Five Evidence Layers (14 Metrics)

| Layer | Metrics | Weight |
| --- | --- | --- |
| Cell-level (5) | Pearson (cor_pearson), Spearman (cor_spearman), Biweight midcorrelation (cor_biweight), Mutual information (mi_score), Ratio consistency (ratio_consistency) | 1.0 – 1.5 |
| Neighbourhood (5) | KNN-smoothed correlation (smoothed_cor), Neighbourhood score (neighbourhood_score), Cluster correlation (cluster_cor), Cross-cell-type score (cross_celltype_score), Neighbourhood synergy (neighbourhood_synergy) | 1.2 – 1.5 |
| Prior knowledge (2) | GO/KEGG co-annotation Jaccard (prior_score), Pathway bridge score (bridge_score) | 1.8 – 2.0 |
| Spatial (2) | Lee's L (spatial_lee_L), Co-location quotient (spatial_clq) | 1.2 – 1.5 |
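
The co-annotation Jaccard in the prior-knowledge layer measures how much two genes' annotation sets overlap. A minimal sketch of the Jaccard index itself is shown here. The term labels are placeholders rather than real annotations, jaccard() is a hypothetical helper, and how scPairs retrieves and filters GO/KEGG terms is not covered.

```r
# Hypothetical helper (not part of scPairs): Jaccard index of two term sets.
jaccard <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  n_union <- length(union(a, b))
  if (n_union == 0) return(0)   # assumption: two unannotated genes score 0
  length(intersect(a, b)) / n_union
}

# Placeholder annotation sets for two genes
terms_gene1 <- c("term_A", "term_B", "term_C")
terms_gene2 <- c("term_A", "term_B", "term_D")
prior_overlap <- jaccard(terms_gene1, terms_gene2)   # 2 shared / 4 distinct = 0.5
```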

Metrics are rank-normalised to [0, 1] and combined as a weighted mean, where $m_i$ is the $i$-th metric and $w_i$ its weight:

$$\text{synergy\_score} = \frac{\sum_i w_i \cdot \text{rank\_norm}(m_i)}{\sum_i w_i}$$
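
The formula can be traced on a toy example. The sketch below is an assumption-laden illustration, not the package's implementation. rank_norm() is a hypothetical helper that maps average ranks linearly onto [0, 1]. The two weights are picked from the ranges in the table above purely for illustration, and the metric values are invented.

```r
# Hypothetical helper (not part of scPairs): map ranks linearly onto [0, 1].
# Assumptions: ties share their average rank; NA values stay NA.
rank_norm <- function(x) {
  n <- sum(!is.na(x))
  if (n < 2) return(rep(NA_real_, length(x)))
  (rank(x, na.last = "keep", ties.method = "average") - 1) / (n - 1)
}

# Invented metric values for three gene pairs (rows) and two metrics (columns)
metrics <- data.frame(
  cor_pearson = c(0.9, 0.1, 0.5),
  mi_score    = c(0.3, 0.2, 0.8)
)
weights <- c(cor_pearson = 1.0, mi_score = 1.5)   # illustrative weights only

normed <- sapply(metrics, rank_norm)              # 3 x 2 matrix in [0, 1]
synergy_score <- as.vector(normed %*% weights[colnames(normed)]) / sum(weights)
synergy_score   # 0.7 0.0 0.8
```

The third pair scores highest because it ranks first on the more heavily weighted metric, even though the first pair has the largest Pearson correlation.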

3.3 Unified Output Schema

All three workflows produce a pairs data.table with consistent columns: gene1, gene2, synergy_score, rank, confidence, plus all metric columns. The visualisation functions that take a result object (PlotPairNetwork(), PlotPairHeatmap()) accept any result class interchangeably.

Confidence assignment:

| With permutation p-values | Without p-values (score quantiles) |
| --- | --- |
| High: p_adj < 0.01 | High: ≥ 95th percentile |
| Medium: p_adj < 0.05 | Medium: ≥ 80th percentile |
| Low: p_adj < 0.10 | Low: ≥ 50th percentile |
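
The thresholds in this table can be written as a small lookup. The sketch below is a hypothetical helper, not the package's code. The label used for pairs that fall below every threshold is not stated above, so the sketch returns NA for them as a placeholder.

```r
# Hypothetical helper (not part of scPairs): map scores or adjusted p-values
# to confidence labels using the thresholds in the table above.
assign_confidence <- function(score, p_adj = NULL) {
  labels <- c(NA_character_, "Low", "Medium", "High")   # NA = below all cut-offs
  if (!is.null(p_adj)) {
    # 0 to 3 thresholds passed: < 0.10, < 0.05, < 0.01
    level <- (p_adj < 0.10) + (p_adj < 0.05) + (p_adj < 0.01)
  } else {
    # Score quantiles: 50th, 80th and 95th percentiles
    q <- quantile(score, probs = c(0.50, 0.80, 0.95), na.rm = TRUE)
    level <- findInterval(score, q)
  }
  labels[level + 1]
}

by_pvalue   <- assign_confidence(NULL, p_adj = c(0.005, 0.03, 0.08, 0.5))
by_quantile <- assign_confidence(1:21)
```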

4. Visualization Gallery

The following plots are generated from the built-in scpairs_testdata object (100 cells × 20 genes, synthetic co-expression patterns). GENE3 and GENE4 are the injected globally co-expressed pair (Pearson r ≈ 0.89).

Pair Discovery — Network & Heatmap

[Figures: Gene Pair Network (left); Synergy Score Heatmap (right)]

Left: PlotPairNetwork() — synergy-weighted interaction graph; node size ∝ connectivity, edge width ∝ synergy score.
Right: PlotPairHeatmap() — pairwise synergy matrix; top-ranked pairs clustered by score.

Co-expression — UMAP Overlays

Raw expression (3-panel):

PlotPairDimplot() — UMAP coloured by individual gene expression and co-expression product. GENE3 and GENE4 show overlapping high-expression regions, revealing their synergistic co-activation.

Raw vs KNN-smoothed (6-panel):

PlotPairSmoothed() — top row: raw expression; bottom row: KNN-smoothed. Smoothing reduces single-cell noise and clarifies the co-expression domain across the embedding.
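
The smoothing step can be illustrated in base R. This is a minimal sketch under stated assumptions, not the scPairs implementation: neighbours are found by Euclidean distance in the embedding, each cell counts as its own nearest neighbour, and neighbours are averaged with equal weight. knn_smooth() and the simulated data are hypothetical.

```r
# Hypothetical helper (not part of scPairs): average each cell's expression
# over its k nearest cells in the embedding (the cell itself included).
knn_smooth <- function(expr, embedding, k = 5) {
  d <- as.matrix(dist(embedding))
  vapply(seq_len(nrow(d)), function(i) {
    nn <- order(d[i, ])[seq_len(k)]   # indices of the k closest cells
    mean(expr[nn])
  }, numeric(1))
}

set.seed(42)
n_cells <- 60
embedding <- cbind(rnorm(n_cells), rnorm(n_cells))   # stand-in for UMAP coordinates
signal <- embedding[, 1]                             # shared programme along dimension 1
gene_a <- signal + rnorm(n_cells)                    # two noisy readouts of the same signal
gene_b <- signal + rnorm(n_cells)

raw_cor      <- cor(gene_a, gene_b)
smoothed_cor <- cor(knn_smooth(gene_a, embedding), knn_smooth(gene_b, embedding))
```

Because the cell-level noise is independent between the two genes while the signal is shared by neighbouring cells, averaging over neighbours typically raises the correlation relative to the raw values.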

Expression Distribution & Scatter

[Figures: Violin by Cluster (left); Gene–Gene Scatter (right)]

Left: PlotPairViolin() — expression distributions of GENE3, GENE4, and their product per cluster.
Right: PlotPairScatter() — cell-level gene–gene scatter coloured by cluster; positive correlation visible across all three groups.


5. Functions

5.1 Discovery (3 functions)

| Function | Purpose |
| --- | --- |
| FindAllPairs() | Global discovery of synergistic gene pairs across all variable genes |
| FindGenePairs() | Find synergistic partners for a specific query gene |
| AssessGenePair() | In-depth assessment of a specific gene pair with per-cluster detail |

5.2 Visualization (11 functions)

| Function | Input | Purpose |
| --- | --- | --- |
| PlotPairNetwork() | Any result class | Synergy-weighted gene interaction network |
| PlotPairHeatmap() | Any result class | Synergy score heatmap across top gene pairs |
| PlotPairDimplot() | Seurat object | UMAP co-expression overlay (3-panel) |
| PlotPairSmoothed() | Seurat object | Raw + KNN-smoothed UMAP expression (6-panel) |
| PlotPairSummary() | Seurat object | Comprehensive multi-panel summary figure |
| PlotPairSpatial() | Spatial Seurat | Spatial co-expression map (3-panel) |
| PlotPairCrossType() | Seurat object | Cross-cell-type interaction heatmap |
| PlotPairViolin() | Seurat object | Expression distributions by cluster |
| PlotPairScatter() | Seurat object | Gene–gene scatter with optional marginal densities |
| PlotPairSynergy() | Seurat object | Multi-evidence synergy dashboard (4-panel) |
| PlotBridgeNetwork() | Seurat object | Bridge gene network with pathway-weighted edges |

6. Built-in Test Dataset

scpairs_testdata is a minimal synthetic Seurat object shipped with the package for immediate use in examples and tests:

data(scpairs_testdata)
scpairs_testdata
#> An object of class Seurat
#> 20 features across 100 samples within 1 assay
#> Active assay: RNA (20 features, 20 variable features)
#>  3 layers present: counts, data, scale.data
#>  2 dimensional reductions calculated: pca, umap

| Property | Value |
| --- | --- |
| Genes | GENE1–GENE20 (synthetic) |
| Cells | CELL001–CELL100 |
| Clusters | 3 balanced clusters (seurat_clusters) |
| Reductions | PCA (5 components), UMAP (2 dimensions) |
| Co-expression | GENE3/GENE4 globally (r ≈ 0.89); GENE1/GENE2 cluster-1-specific |
| File size | 26 KB (xz-compressed) |

The dataset is generated by data-raw/make_testdata.R with set.seed(7391) for full reproducibility.


7. Citation

Wang Z (2026). scPairs: Identifying Synergistic Gene Pairs in Single-Cell and
Spatial Transcriptomics. R package version 0.1.8.
https://github.com/zhaoqing-wang/scPairs

8. License

MIT


9. Contact

Author: Zhaoqing Wang (ORCID) | Email: [email protected] | Issues: scPairs Issues

Metadata

Version

0.1.8

License

Unknown

Platforms (78)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    uefi
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-uefi
  • aarch64-windows
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-linux
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-uefi
  • x86_64-windows