Description

Occupational Risk Integrated Systematic Mapping and Analysis.

A complete pipeline for systematic bibliometric mapping of occupational health and safety (OHS) evidence. Starting from reference files exported from major bibliographic databases such as Web of Science, Scopus, PubMed, Dimensions, EBSCO, and others, 'orisma' automates ingestion, deduplication, relevance filtering, occupational risk category extraction, bibliometric analysis, and report generation. The package is related to bibliometric science mapping and evidence synthesis workflows described by Aria and Cuccurullo (2017) <doi:10.1016/j.joi.2017.08.007>, Westgate (2019) <doi:10.1002/jrsm.1374>, and Lajeunesse (2016) <doi:10.1111/2041-210X.12472>, but adds a domain-specific occupational safety and health layer.

The package implements three original bibliometric indicators: (1) the Worker-Risk Disconnection Index (WRDI), measuring the proportion of studies that characterise an occupational risk without including direct worker exposure data; (2) the Risk Category Saturation Index (RCS), measuring the over- or under-representation of each risk category relative to a uniform baseline; and (3) the Material-Gap Profile (MGP), measuring the ratio between a material's known hazard potential and its coverage in the occupational health literature.

Two additional preventive intelligence indicators are provided: (4) the Abstract Sufficiency Score (ASS, 0-5), a cumulative hierarchical index of the preventively useful information contained in an abstract; and (5) the Bridge Article Score (0-5), identifying studies that simultaneously address technology, hazardous agent, worker population, exposure measurement, and preventive recommendations.

Risk categories are extracted using a built-in occupational risk dictionary of 58 categories anchored in ISO 45001:2018, INSST, NIOSH, and EU-OSHA frameworks, organised in six blocks: Safety, Industrial Hygiene, Ergonomics, Psychosociology, Biological Hazards, and Emerging Technologies. The dictionary is user-extensible.

Outputs include bilingual HTML reports, occupational risk sheets, priority reading rankings, guided extraction matrices for systematic review, and reproducibility certificates with MD5 hashes.

orisma

Occupational Risk Integrated Systematic Mapping and Analysis

License: MIT

orisma is an R package for systematic bibliometric mapping of occupational risk evidence.

It is designed for researchers, occupational safety and health professionals, industrial hygienists, ergonomists, psychosocial risk specialists and prevention practitioners who need to understand whether the scientific literature on a given topic is actually connected to workers, workplaces, exposure conditions and preventive decision-making.

Unlike general bibliometric tools, orisma focuses on the preventive usefulness of scientific evidence. Rather than only counting publications or keywords, it helps identify whether a research field is technically abundant but weakly connected to real occupational exposure, workplace tasks or preventive action.


Why ORISMA?

Emerging technologies, new work processes and complex occupational hazards often generate a large scientific literature before their real workplace risks are fully understood.

This creates a practical and methodological problem:

A topic may appear well studied, but the available evidence may still lack data on workers, real exposure conditions, tasks, sectors, controls or preventive recommendations.

orisma was created to detect this gap.

It helps answer questions such as:

  • Is the literature technically abundant but weakly connected to real workers?
  • Which occupational risk categories are over-represented or under-represented?
  • Which articles are most useful for occupational risk assessment?
  • Which studies connect technical science with applied prevention?
  • Which risks require on-site assessment because the literature lacks worker-level evidence?
  • Which records are probably off-topic, biomedical, clinical or weakly occupational and should be reviewed manually?

What does ORISMA do?

Starting from reference files exported from major bibliographic databases such as Web of Science, Scopus, PubMed, Dimensions, EBSCO and others, orisma runs a complete workflow.

Processing time depends on corpus size, file format, deduplication complexity and the number of risk categories analysed.

  1. Ingestion — reads RIS, BibTeX and CSV files from multiple databases.
  2. Deduplication — applies a three-step pipeline: exact DOI, normalised title and fuzzy matching.
  3. Relevance guard — flags or excludes records with weak topic or occupational relevance.
  4. Risk extraction — scans titles, abstracts and keywords against a 58-category occupational risk dictionary.
  5. Bibliometric analysis — generates matrices, temporal trends, co-occurrence structures and risk distributions.
  6. Preventive indicators — computes WRDI, RCS, MGP, ASS and Bridge Article Score.
  7. Priority ranking — identifies articles with higher preventive usefulness.
  8. Reports — generates bilingual HTML reports, practitioner risk sheets, extraction matrices and validation samples.
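
The three-step deduplication in step 2 can be illustrated with a minimal base-R sketch. This is not orisma's implementation: the column names `doi` and `title`, the helper names `normalise_title()` and `dedup_sketch()`, and the 0.1 normalised edit-distance threshold are assumptions chosen for illustration only.

```r
# Hypothetical helper: lower-case a title and strip punctuation and extra spaces.
normalise_title <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z0-9 ]", " ", x)  # drops punctuation (and non-ASCII letters)
  x <- gsub("\\s+", " ", x)
  trimws(x)
}

# Hypothetical three-step deduplication: exact DOI, normalised title, fuzzy match.
dedup_sketch <- function(refs, max_dist = 0.1) {
  # Step 1: exact DOI match; records without a DOI are never dropped here.
  doi <- tolower(trimws(refs$doi))
  has_doi <- !is.na(doi) & nzchar(doi)
  refs <- refs[!(has_doi & duplicated(doi)), , drop = FALSE]

  # Step 2: identical titles after normalisation.
  key <- normalise_title(refs$title)
  keep <- !duplicated(key)
  refs <- refs[keep, , drop = FALSE]
  key <- key[keep]

  # Step 3: fuzzy match on edit distance scaled by the longer title.
  len <- nchar(key)
  d <- adist(key) / pmax(outer(len, len, pmax), 1)
  drop <- rep(FALSE, length(key))
  for (i in seq_along(key)) {
    if (drop[i]) next
    later <- seq_along(key) > i
    drop[later & d[i, ] <= max_dist] <- TRUE
  }
  refs[!drop, , drop = FALSE]
}

refs <- data.frame(
  doi = c("10.1/a", "10.1/A", NA, NA, NA),
  title = c("Welding fume exposure in shipyards",
            "Welding fume exposure in shipyards.",
            "Noise at work: a review",
            "Noise at work - a review",
            "Noise at work: a reviews"),
  stringsAsFactors = FALSE
)
unique_refs <- dedup_sketch(refs)  # keeps one record per underlying article
```

Running the cheap exact steps first keeps the pairwise distance matrix in step 3 as small as possible, which matters because that step grows quadratically with the number of remaining records.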

Main preventive bibliometric indicators

| Indicator | Full name | What it measures |
| --- | --- | --- |
| WRDI | Worker-Risk Disconnection Index | Proportion of studies characterising a risk without direct worker exposure data |
| RCS | Risk Category Saturation Index | Relative dominance of each risk category compared with a uniform baseline |
| MGP | Material-Gap Profile | Ratio between a material's hazard potential and its coverage in the occupational health literature |
| ASS | Abstract Sufficiency Score | Amount of preventively useful information contained in each abstract, scored from 0 to 5 |
| Bridge score | Bridge Article Score | Degree to which a study connects technical science with applied occupational prevention |

Worker-Risk Disconnection Index (WRDI)

The Worker-Risk Disconnection Index measures the proportion of studies that characterise a risk without reporting direct data on workers or workplace exposure.

A high WRDI suggests that the literature is technically developed but weakly connected to real working conditions.

| WRDI value | Interpretation |
| --- | --- |
| 0.00-0.30 | Reasonable connection with worker-level evidence |
| 0.30-0.70 | Partial disconnection; manual review recommended |
| 0.70-1.00 | High technical-worker disconnection; on-site assessment is especially important |

WRDI is not a substitute for expert judgement. It is a signal that helps prioritise deeper review.
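
As a rough illustration of the definition above, WRDI can be computed as the share of risk-characterising studies that lack worker exposure data. The sketch below is not the package's code: the function names `wrdi_sketch()` and `wrdi_band()` and the logical input vector are hypothetical, and because the published bands share their boundary values, assigning exactly 0.30 and 0.70 to the upper band is an assumption.

```r
# Hypothetical sketch: proportion of studies without direct worker exposure data.
wrdi_sketch <- function(has_worker_data) {
  stopifnot(is.logical(has_worker_data),
            length(has_worker_data) > 0,
            !anyNA(has_worker_data))
  mean(!has_worker_data)
}

# Map a WRDI value to the interpretation bands listed in the table.
# Assumption: 0.30 and 0.70 fall into the upper of the two adjacent bands.
wrdi_band <- function(x) {
  cut(x,
      breaks = c(0, 0.30, 0.70, 1),
      labels = c("reasonable connection",
                 "partial disconnection",
                 "high disconnection"),
      right = FALSE, include.lowest = TRUE)
}

# Ten studies on one risk; only two report worker exposure data.
has_worker_data <- c(TRUE, FALSE, FALSE, FALSE, TRUE,
                     FALSE, FALSE, FALSE, FALSE, FALSE)
value <- wrdi_sketch(has_worker_data)   # 0.8
band <- as.character(wrdi_band(value))  # "high disconnection"
```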


Risk Category Saturation Index (RCS)

The Risk Category Saturation Index measures whether a risk category is over-represented or under-represented compared with a uniform distribution across the dictionary.

It helps identify:

  • dominant risk categories in the corpus;
  • under-studied risk categories;
  • risk areas where literature volume may not match preventive relevance;
  • potential evidence gaps.
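
The exact RCS formula is not given here, so the following is only one plausible reading: each category's observed share of category hits divided by the uniform share 1/K, where K is the number of categories. Under that assumption a value above 1 indicates over-representation and a value below 1 under-representation. The function name `rcs_sketch()` and the counts are hypothetical.

```r
# Hypothetical sketch: observed share of each category relative to a uniform share.
# Assumption: RCS_k = (n_k / N) / (1 / K); the package may define it differently.
rcs_sketch <- function(counts) {
  stopifnot(is.numeric(counts), all(counts >= 0), sum(counts) > 0)
  share <- counts / sum(counts)
  share / (1 / length(counts))
}

# Illustrative counts for four categories (a real run would cover the full dictionary,
# including categories with zero hits).
counts <- c(chemical_agents = 60, noise = 25, manual_handling = 10, harassment = 5)
rcs <- rcs_sketch(counts)
# chemical_agents 2.4, noise 1.0, manual_handling 0.4, harassment 0.2
```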

Material-Gap Profile (MGP)

The Material-Gap Profile is designed for corpora where records can be stratified by material, substance or agent.

It helps identify materials or agents that appear hazardous but remain poorly covered in the occupational health literature.

This is especially useful for topics such as:

  • metal additive manufacturing;
  • nanomaterials;
  • battery manufacturing;
  • advanced materials;
  • chemical agents;
  • biological agents;
  • emerging technologies.
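
A toy version of the ratio behind the MGP is sketched below. Everything in it is hypothetical: the function name `mgp_sketch()`, the placeholder material names, the 1-5 hazard scale, and the choice of "share of records" as the coverage measure are assumptions, since the scoring scheme used by the package is not described in this document.

```r
# Hypothetical sketch: hazard potential divided by literature coverage.
# Assumption: coverage is the share of corpus records mentioning each material.
mgp_sketch <- function(hazard, n_records) {
  stopifnot(length(hazard) == length(n_records),
            all(n_records >= 0), sum(n_records) > 0)
  coverage <- n_records / sum(n_records)
  hazard / coverage  # Inf when a material has no records at all
}

# Placeholder inputs, not real hazard ratings.
hazard    <- c(material_a = 4, material_b = 4, material_c = 1)
n_records <- c(material_a = 80, material_b = 10, material_c = 10)

mgp <- mgp_sketch(hazard, n_records)
largest_gap <- names(which.max(mgp))  # "material_b": same hazard as A, far less coverage
```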

Abstract Sufficiency Score (ASS)

The ASS is a cumulative 0-5 score measuring how much preventively useful information an abstract contains.

| Score | Meaning |
| --- | --- |
| 0 | Non-informative for OHS purposes |
| 1 | Mentions a hazard but no occupational context |
| 2 | Mentions occupational or workplace context |
| 3 | Mentions exposure measurement or quantification |
| 4 | Mentions worker exposure with a result |
| 5 | Complete preventive abstract: worker population, exposure measurement, method and prevention |

The ASS is not a measure of study quality. It is a measure of how informative the abstract is for occupational prevention.
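
One way to read "cumulative" is that each level only counts if every lower level is also met. The sketch below applies that reading to five logical flags per abstract. It is an assumption-laden illustration, not the package's scoring code: the function name `ass_sketch()` and the flag column names are hypothetical, and how the flags would be derived from abstract text is left out.

```r
# Hypothetical sketch: score = number of consecutive levels met, starting at level 1.
ass_sketch <- function(flags) {
  flags <- as.matrix(flags)
  stopifnot(is.logical(flags), ncol(flags) == 5)
  # The cumulative product along a row turns 0 at the first unmet level.
  apply(flags, 1, function(r) sum(cumprod(r)))
}

flags <- data.frame(
  hazard               = c(FALSE, TRUE,  TRUE,  TRUE),
  occupational_context = c(FALSE, TRUE,  FALSE, TRUE),
  exposure_measurement = c(FALSE, FALSE, TRUE,  TRUE),
  worker_result        = c(FALSE, FALSE, TRUE,  TRUE),
  complete             = c(FALSE, FALSE, FALSE, TRUE)
)
scores <- unname(ass_sketch(flags))  # 0 2 1 5
```

The third abstract mentions exposure measurement but no occupational context, so under this hierarchical reading it stays at 1 rather than jumping to 3 or 4.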


Bridge articles

A bridge article connects technical science with applied occupational prevention.

It usually combines:

  1. A technology, process or work task.
  2. A hazardous agent or risk factor.
  3. A real worker population or workplace setting.
  4. Exposure measurement or workplace assessment.
  5. Preventive recommendations or control measures.

Bridge articles are useful because they help practitioners move from general scientific evidence to concrete preventive action.


Installation

# From CRAN (once published)
install.packages("orisma")

# Development version from GitHub
# install.packages("remotes")
remotes::install_github("Aguilar-Elena/orisma")

Minimal usage — 3 lines

library(orisma)

refs   <- orm_load("my_references/")   # load RIS/BibTeX/CSV files
result <- orm_run(refs)                 # full pipeline (2-3 sec)
orm_report(result, lang = "en")         # generate all outputs

For Spanish output:

options(orisma.lang = "es")
refs   <- orm_load("mis_referencias/")
result <- orm_run(refs)
orm_report(result, lang = "es", out_dir = "resultados/")

Complete function reference

| Function | For whom | What it does |
| --- | --- | --- |
| orm_load() | Everyone | Multi-source ingestion with format auto-detection |
| orm_dedup() | Everyone | Three-step deduplication: DOI, title and fuzzy matching |
| orm_relevance_guard() | Both | Flags or excludes records with weak topic or occupational relevance |
| orm_extract() | Researcher | Risk category extraction via occupational risk dictionary |
| orm_analyse() | Researcher | Computes WRDI, RCS and MGP |
| orm_autodim() | Researcher | Automatic dimension discovery |
| orm_dim_matrix() | Researcher | Risk x dimension heatmap |
| orm_ass() | Both | Abstract Sufficiency Score per record |
| orm_ass_plot() | Both | ASS distribution plot |
| orm_bridge() | Both | Bridge article detection and classification |
| orm_ranking() | Both | Priority reading list |
| orm_priority() | Both | RED/AMBER/GREEN/GREY priority classification |
| orm_run() | Everyone | Complete ORISMA pipeline in one call |
| orm_run_guarded() | Everyone | Complete pipeline with relevance-control layer |
| orm_report() | Researcher | Full HTML report with visualisations and tables |
| orm_risk_sheet() | OHS practitioner | Actionable risk sheet |
| orm_extraction_matrix() | Both | Guided extraction template for PDF review |
| orm_validate() | Researcher | Manual validation sample |
| orm_dict() | Everyone | Load or customise the risk dictionary |

Outputs generated automatically

After running orm_report() and orm_risk_sheet():

For researchers

| File | Description |
| --- | --- |
| orisma_report.html | Interactive bilingual executive report with 7 plots |
| orisma_corpus.csv | All records after deduplication |
| orisma_matrix.csv | Binary risk category matrix (records x categories) |
| orisma_indicators.csv | WRDI, RCS, MGP per category |
| prisma_log.csv | PRISMA-compatible selection flow |
| analysis.orisma | Reproducibility certificate (JSON with MD5 hashes) |
| plots/ | 7 publication-ready PNG plots |

For OHS practitioners

| File | Description |
| --- | --- |
| orisma_risk_sheet.html | Actionable risk sheet with RED/AMBER/GREEN traffic light |
| orisma_extraction_matrix.csv | Pre-filled extraction template for PDF review |
| orisma_priority_ranking.csv | Top-20 priority articles by bridge + ASS score |
| orisma_validation_sample.csv | Manual validation sample |

Risk dictionary

The built-in dictionary covers 58 occupational risk categories in 6 blocks.

| Block | Area | Examples |
| --- | --- | --- |
| A | Safety at work | Falls, collision, fire, explosion, work equipment |
| B | Industrial hygiene | Chemical agents, dust, noise, vibration, radiation |
| C | Ergonomics | Postures, manual handling, repetitive movements, workload |
| D | Psychosociology | Mental workload, autonomy, social support, violence, harassment |
| E | Biological hazards | Bacteria, viruses, fungi, parasites, biological agents |
| F | Emerging technologies | Robotics, AI, nanotechnology, additive manufacturing, wearables |

The dictionary can be extended for any domain:

dict <- orm_dict()

# Add terms to an existing category
dict <- orm_dict_add_terms(dict, "nanomaterials", c("nano-aerosol", "NOAA"))

# Add a completely new category
dict <- orm_dict_add_category(dict,
  key      = "exoskeleton_risk",
  label_en = "Exoskeleton-related musculoskeletal risk",
  label_es = "Riesgo musculoesqueletico por exoesqueleto",
  terms    = c("exoskeleton", "powered exosuit", "wearable robot")
)
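
To show what dictionary-based extraction amounts to, the sketch below matches a tiny term list against record text and returns a binary records x categories matrix. It does not use the object returned by `orm_dict()`, whose internal structure is not documented here. The plain named list, the function name `extract_sketch()` and the simple substring matching are assumptions for illustration.

```r
# Hypothetical sketch: binary matrix of records x categories from substring matches.
extract_sketch <- function(text, dict) {
  text <- tolower(text)
  hits <- vapply(dict, function(terms) {
    # A record matches a category if any of its terms occurs in the text.
    Reduce(`|`, lapply(tolower(terms), function(term) grepl(term, text, fixed = TRUE)))
  }, logical(length(text)))
  matrix(as.integer(hits), nrow = length(text),
         dimnames = list(NULL, names(dict)))
}

# Toy dictionary as a named list of term vectors (assumed structure).
toy_dict <- list(
  noise         = c("noise", "hearing loss"),
  nanomaterials = c("nanoparticle", "nano-aerosol")
)
text <- c("Noise exposure and hearing loss in welders",
          "Nanoparticle release during metal additive manufacturing",
          "A review of alloy microstructure")

risk_matrix <- extract_sketch(text, toy_dict)
```

Plain substring matching is deliberately naive: a term such as "noise" would also match unrelated uses of the word, which is one source of the false positives that manual validation is meant to catch.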

Supported databases

| Database | Recommended format | Batch limit |
| --- | --- | --- |
| Web of Science | RIS (Plain text) | 1 000 |
| Scopus | RIS or CSV | 2 000 |
| PubMed | RIS | No limit |
| Dimensions | CSV or RIS | 2 500 |
| EBSCO (CINAHL, BSC) | RIS | 25 000 |
| ProQuest | RIS or BibTeX | 100 |
| Cochrane Library | RIS | No limit |
| Ovid / MEDLINE | RIS | 1 000 |
| ScienceDirect | RIS | No limit |
| The Lens (free) | RIS or CSV | No limit |

Export the records from each database in RIS format, place the files in one folder, and run orm_load("folder/"). ORISMA detects the source database automatically from the filename.


Bridge article classification

The Bridge Article Score counts how many of the following five criteria a study addresses simultaneously:

  1. Technology or process
  2. Hazardous agent
  3. Workers (real workplace population)
  4. Exposure measurement
  5. Preventive recommendation

Articles meeting 4 or 5 criteria are classed as strong bridges (highest priority for reading). Articles meeting 3 criteria, which must include workers and exposure measurement, are classed as partial bridges.
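
The classification rule above can be written down directly. In this sketch the function name `bridge_class()`, the column names and the label "none" for articles that meet neither rule are assumptions. Only the 4-5 and 3-with-workers-and-measurement thresholds come from the text.

```r
# Hypothetical sketch of the strong / partial bridge rule.
bridge_class <- function(criteria) {
  criteria <- as.matrix(criteria)
  stopifnot(is.logical(criteria), ncol(criteria) == 5,
            all(c("workers", "measurement") %in% colnames(criteria)))
  score <- rowSums(criteria)
  core <- criteria[, "workers"] & criteria[, "measurement"]
  label <- ifelse(score >= 4, "strong",
                  ifelse(score == 3 & core, "partial", "none"))
  data.frame(score = score, class = label, stringsAsFactors = FALSE)
}

criteria <- data.frame(
  technology  = c(TRUE, TRUE, FALSE, TRUE,  TRUE),
  agent       = c(TRUE, TRUE, TRUE,  TRUE,  TRUE),
  workers     = c(TRUE, TRUE, TRUE,  FALSE, FALSE),
  measurement = c(TRUE, TRUE, TRUE,  FALSE, FALSE),
  prevention  = c(TRUE, FALSE, FALSE, TRUE, FALSE)
)
bridges <- bridge_class(criteria)
# scores 5 4 3 3 2; classes strong, strong, partial, none, none
```

The fourth article also meets three criteria but lacks workers and measurement, so it does not qualify as a partial bridge.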


Methodological note

ORISMA uses dictionary-based automatic classification, which may produce false positives. Manual validation of a representative sample is therefore recommended using orm_validate(), which computes Cohen's Kappa between the automatic and the manual classification. A Kappa of 0.7 or higher is considered acceptable for publication in high-impact journals.
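
For readers unfamiliar with the statistic, Cohen's Kappa compares the observed agreement between two classifications with the agreement expected by chance. The sketch below implements the standard formula in base R. It is not the code inside orm_validate(): the function name `cohen_kappa()` and the example labels are hypothetical.

```r
# Standard Cohen's Kappa for two label vectors of equal length.
cohen_kappa <- function(auto, manual) {
  stopifnot(length(auto) == length(manual), length(auto) > 0)
  lev <- sort(unique(c(auto, manual)))
  tab <- table(factor(auto, levels = lev), factor(manual, levels = lev))
  n <- sum(tab)
  p_obs <- sum(diag(tab)) / n                      # observed agreement
  p_exp <- sum(rowSums(tab) * colSums(tab)) / n^2  # agreement expected by chance
  (p_obs - p_exp) / (1 - p_exp)                    # NaN if only one class occurs
}

# Illustrative labels: 1 = category assigned, 0 = not assigned.
auto   <- c(1, 1, 1, 1, 0, 0, 0, 0, 1, 0)
manual <- c(1, 1, 1, 0, 0, 0, 0, 0, 1, 1)
kappa <- cohen_kappa(auto, manual)  # observed 0.8, expected 0.5, kappa 0.6
```

In this toy sample the two classifications agree on 8 of 10 records, yet Kappa is only 0.6, below the 0.7 threshold, because half of that agreement would be expected by chance.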

ORISMA does not include country-specific regulations or limit values, as these vary by jurisdiction. The practitioner applies the relevant national/regional regulation based on the risk categories identified.


Limitations

ORISMA relies primarily on bibliographic metadata, titles, abstracts and keywords. It may miss information that appears only in the full text.

Automatic classification may produce false positives or false negatives, especially when terms are used differently across disciplines. This is why ORISMA includes a relevance guard and a validation workflow.

WRDI, ASS and Bridge Score should be interpreted as prioritisation and mapping indicators, not as definitive quality assessment tools.

Country-specific legal requirements, occupational exposure limits and regulatory thresholds are not embedded in ORISMA because they vary by jurisdiction. Practitioners should apply the relevant national or regional legislation after identifying the risk categories.


Citation

If you use orisma in your research, please cite:

Aguilar-Elena, R. & Delgado-Garcia, A. (2025). orisma: Occupational Risk
Integrated Systematic Mapping and Analysis. R package version 0.1.0.
Universidad Internacional de Valencia (VIU) & Universidad de Salamanca (USAL).
https://github.com/Aguilar-Elena/orisma

Authors

PhD. Raul Aguilar-Elena
Occupational Risk Prevention and Occupational Health Research Group (GPRL)
Universidad Internacional de Valencia (VIU), Valencia, Spain

Ana Delgado-Garcia
Universidad de Salamanca (USAL), Salamanca, Spain


License

MIT © 2025 Raul Aguilar-Elena & Ana Delgado-Garcia.

Metadata

Version

0.1.0

