Description

Traceability Engine for Clinical Submission Readiness.

Quantifies and explains end-to-end traceability between clinical submission artifacts (ADaM (Analysis Data Model) outputs, derivations, SDTM (Study Data Tabulation Model) sources, specs, code). Builds trace models from metadata and mapping sheets, computes trace levels, and emits standardized R4SUB (R for Regulatory Submission) evidence table rows via 'r4subcore'.

r4subtrace

r4subtrace is the traceability engine in the R4SUB ecosystem. It quantifies and explains end-to-end traceability between clinical submission artifacts -- primarily ADaM outputs <-> derivations <-> SDTM sources <-> specs <-> code -- and converts trace evidence into standardized R4SUB Evidence Table rows (from r4subcore).

It focuses on answering one question:

Can we prove where each analysis variable/value came from, and can a reviewer follow it?

Why r4subtrace?

In real submissions, issues are rarely "a single failed rule." Many are trace failures:

  • Missing or ambiguous derivation documentation
  • ADaM variable not linkable to SDTM sources
  • Mismatch between spec and what code produces
  • Inconsistent naming across specs, define.xml, and datasets
  • Reviewer cannot reproduce or validate lineage

r4subtrace formalizes traceability as evidence + measurable indicators.

What r4subtrace measures

Traceability levels

  • L0 -- None: no linkage available
  • L1 -- Spec-only: the ADaM spec defines the derivation, but there is no code mapping
  • L2 -- Spec + source mapping: the ADaM variable is mapped to SDTM variables/domains
  • L3 -- Spec + code mapping: a mapping exists and is backed by high confidence or derivation text
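The level definitions above can be sketched as a small decision rule. This is an illustration only, not r4subtrace's own logic: `assign_trace_level()`, the `has_spec` flag, and the 0.8 confidence threshold are all assumptions made for the example, and the mapping rows are invented.

```r
# Hypothetical helper, NOT part of r4subtrace: assigns a trace level to one
# ADaM variable from a spec flag and that variable's rows in the mapping sheet.
# The 0.8 threshold for "high confidence" is an assumption.
assign_trace_level <- function(has_spec, map_rows, min_confidence = 0.8) {
  if (!has_spec) return(0L)             # L0: no linkage available
  if (nrow(map_rows) == 0) return(1L)   # L1: spec only, nothing mapped
  has_text <- any(!is.na(map_rows$derivation_text) &
                    nzchar(map_rows$derivation_text))
  high_conf <- any(!is.na(map_rows$confidence) &
                     map_rows$confidence >= min_confidence)
  if (has_text || high_conf) 3L else 2L # L3 if backed by text/confidence, else L2
}

# Illustrative mapping rows using the recommended column names.
map <- data.frame(
  adam_dataset    = c("ADSL", "ADSL"),
  adam_var        = c("AGE", "TRTSDT"),
  sdtm_domain     = c("DM", "EX"),
  sdtm_var        = c("AGE", "EXSTDTC"),
  derivation_text = c(NA, "First non-missing EXSTDTC, converted to a date"),
  confidence      = c(0.5, NA),
  stringsAsFactors = FALSE
)

levels_by_var <- c(
  AGE    = assign_trace_level(TRUE,  map[map$adam_var == "AGE", ]),
  TRTSDT = assign_trace_level(TRUE,  map[map$adam_var == "TRTSDT", ]),
  SAFFL  = assign_trace_level(TRUE,  map[map$adam_var == "SAFFL", ]),
  XYZ    = assign_trace_level(FALSE, map[0, ])
)
print(levels_by_var)
```

In this sketch AGE lands at L2 (mapped, but low confidence and no derivation text), TRTSDT at L3, SAFFL at L1, and XYZ at L0.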

Installation

pak::pak(c("R4SUB/r4subcore", "R4SUB/r4subtrace"))

Quick start

1) Create run context

library(r4subcore)
library(r4subtrace)

ctx <- r4sub_run_context(study_id = "ABC123", environment = "DEV")

2) Load metadata

adam_meta <- read.csv("adam_metadata.csv")  # columns: dataset, variable, label, type
sdtm_meta <- read.csv("sdtm_metadata.csv")  # same columns as adam_meta

map <- read.csv("trace_map.csv")
# recommended columns:
# adam_dataset, adam_var, sdtm_domain, sdtm_var,
# derivation_text (optional), confidence (optional)
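Before building a model, it can help to confirm that the mapping sheet carries the recommended columns. The check below is a minimal base R sketch on an in-memory data frame with invented rows; it is not a function provided by r4subtrace, and treating the first four columns as required is an assumption drawn from the column list above.

```r
# Illustrative mapping sheet; in practice this would come from read.csv().
map <- data.frame(
  adam_dataset    = c("ADSL", "ADSL"),
  adam_var        = c("AGE", "TRTSDT"),
  sdtm_domain     = c("DM", "EX"),
  sdtm_var        = c("AGE", "EXSTDTC"),
  derivation_text = c("Copied from DM.AGE", "First EXSTDTC as a date"),
  stringsAsFactors = FALSE
)

# Assumption: the four non-optional recommended columns are treated as required.
required <- c("adam_dataset", "adam_var", "sdtm_domain", "sdtm_var")
optional <- c("derivation_text", "confidence")

missing_cols <- setdiff(required, names(map))
if (length(missing_cols) > 0) {
  stop("Mapping sheet is missing columns: ",
       paste(missing_cols, collapse = ", "))
}

# Report which optional columns are available for this sheet.
optional_present <- intersect(optional, names(map))
print(optional_present)
```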

3) Build trace model and evidence

tm <- build_trace_model(
  adam_meta = adam_meta,
  sdtm_meta = sdtm_meta,
  mapping   = map
)

ev <- trace_model_to_evidence(
  tm,
  ctx            = ctx,
  source_name    = "r4subtrace",
  source_version = "0.1.0"
)

validate_evidence(ev)
evidence_summary(ev)

4) Compute trace indicator scores

ind <- trace_indicator_scores(ev)
ind

Core objects

Trace Model

A list with:

  • nodes: tidy table of assets (dataset/variable/spec/program)
  • edges: tidy table of relationships + confidence
  • diagnostics: issues found (orphans, ambiguities, conflicts)
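To make the three components concrete, here is a hand-built list with the same shape. The column names (`node_id`, `kind`, `from`, `to`, `issue`) are assumptions for illustration; the object returned by `build_trace_model()` may use different ones.

```r
# Hypothetical shape of a trace model; column names are assumptions.
tm_sketch <- list(
  nodes = data.frame(
    node_id = c("ADSL.AGE", "ADSL.SAFFL", "DM.AGE"),
    kind    = c("adam_var", "adam_var", "sdtm_var"),
    stringsAsFactors = FALSE
  ),
  edges = data.frame(
    from       = "ADSL.AGE",   # derived variable
    to         = "DM.AGE",     # its source
    confidence = 1,
    stringsAsFactors = FALSE
  )
)

# In this sketch, an orphan is an ADaM variable node with no edge to a source.
adam_nodes <- tm_sketch$nodes$node_id[tm_sketch$nodes$kind == "adam_var"]
orphans <- setdiff(adam_nodes, tm_sketch$edges$from)

tm_sketch$diagnostics <- data.frame(
  node_id = orphans,
  issue   = rep("orphan", length(orphans)),
  stringsAsFactors = FALSE
)
print(tm_sketch$diagnostics)
```

Here ADSL.SAFFL has no outgoing edge, so it is the only row in `diagnostics`.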

Trace Evidence

Evidence rows are emitted for:

  • each ADaM variable trace level
  • each orphan/ambiguity/conflict
  • aggregate coverage metrics

Indicators

  • TRACE_VAR_COVERAGE_L2PLUS: proportion of ADaM variables with L2+ trace
  • TRACE_VAR_COVERAGE_L3PLUS: proportion of ADaM variables with L3+ trace
  • TRACE_ORPHAN_VAR_COUNT: number of orphan ADaM variables with no SDTM mapping
  • TRACE_AMBIGUOUS_MAPPING_COUNT: number of variables mapped to multiple SDTM sources
  • TRACE_MEAN_TRACE_LEVEL: mean trace level across all ADaM variables
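The arithmetic behind these indicators can be worked through by hand on a toy example. The vectors below are invented, and the calculation is a base R sketch of the definitions as listed; `trace_indicator_scores()` may compute or report them differently.

```r
# Illustrative per-variable inputs (assumed, not package output).
trace_level <- c(AGE = 2L, TRTSDT = 3L, SAFFL = 1L, XYZ = 0L)
n_sources   <- c(AGE = 1L, TRTSDT = 2L, SAFFL = 0L, XYZ = 0L)  # mapped SDTM sources

indicators <- c(
  TRACE_VAR_COVERAGE_L2PLUS     = mean(trace_level >= 2),  # share at L2 or higher
  TRACE_VAR_COVERAGE_L3PLUS     = mean(trace_level >= 3),  # share at L3
  TRACE_ORPHAN_VAR_COUNT        = sum(n_sources == 0),     # no SDTM mapping
  TRACE_AMBIGUOUS_MAPPING_COUNT = sum(n_sources > 1),      # more than one source
  TRACE_MEAN_TRACE_LEVEL        = mean(trace_level)
)
print(indicators)
```

With four variables at levels 2, 3, 1, and 0, L2+ coverage is 0.5, L3+ coverage is 0.25, and the mean trace level is 1.5; two variables are orphans and one is ambiguous.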

Design principles

  • Graph-first: traceability is a graph problem
  • Evidence-first: all conclusions are backed by explicit evidence rows
  • Tool-agnostic: can ingest mapping from any source format
  • Reviewer-centric: emphasize explainability, not just metrics
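As an illustration of the graph-first principle, lineage for a variable can be read off by walking edges back to their sources. The sketch below uses a plain edge table and a breadth-first walk in base R; the `upstream()` helper, the `from`/`to` column names, and the example edges are assumptions, not part of the r4subtrace API.

```r
# Illustrative edges: each row says "from" is derived from "to".
edges <- data.frame(
  from = c("ADAE.TRTEMFL", "ADAE.TRTEMFL", "ADAE.ASTDT", "ADSL.TRTSDT"),
  to   = c("ADAE.ASTDT",   "ADSL.TRTSDT",  "AE.AESTDTC", "EX.EXSTDTC"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: breadth-first walk collecting every upstream node.
upstream <- function(node, edges) {
  seen  <- character(0)
  queue <- node
  while (length(queue) > 0) {
    current <- queue[1]
    queue   <- queue[-1]
    parents <- edges$to[edges$from == current]
    fresh   <- setdiff(parents, seen)   # skip nodes already visited
    seen    <- c(seen, fresh)
    queue   <- c(queue, fresh)
  }
  seen
}

lineage <- upstream("ADAE.TRTEMFL", edges)
print(lineage)
```

The walk returns the intermediate ADaM variables first and then the SDTM sources they rest on, which is the path a reviewer would follow.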

License

MIT.

Metadata

Version

0.1.0

License

Unknown

Platforms (78)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    uefi
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-uefi
  • aarch64-windows
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-linux
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-uefi
  • x86_64-windows