Description

Qualitative Analysis with Large Language Models.

Tools for AI-assisted qualitative data coding using large language models ('LLMs') via the 'ellmer' package, supporting providers including 'OpenAI', 'Anthropic', 'Google', 'Azure', and local models via 'Ollama'. Provides a 'codebook'-based workflow for defining coding instructions and applying them to texts, images, and other data. Includes built-in 'codebooks' for common applications such as sentiment analysis and policy coding, and functions for creating custom 'codebooks' for specific research questions. Supports systematic replication across models and settings, computing inter-coder reliability statistics including Krippendorff's alpha (Krippendorff 2019, <doi:10.4135/9781071878781>) and Fleiss' kappa (Fleiss 1971, <doi:10.1037/h0031619>), as well as gold-standard validation metrics including accuracy, precision, recall, and F1 scores following Sokolova and Lapalme (2009, <doi:10.1016/j.ipm.2009.03.002>). Provides audit trail functionality for documenting coding workflows following Lincoln and Guba's (1985, ISBN:0803924313) framework for establishing trustworthiness in qualitative research.

quallmer

Lifecycle: experimental

The quallmer package is an easy-to-use toolbox for quickly applying AI-assisted qualitative coding to large volumes of text, images, PDFs, tabular data, and other structured data.

Using qlm_code(), users can apply codebook-based qualitative coding powered by large language models (LLMs) to generate structured, interpretable outputs. The package includes built-in codebooks for common applications and allows researchers to create custom codebooks tailored to their specific research questions using qlm_codebook(). To ensure the quality and reliability of AI-generated coding, quallmer provides qlm_compare() for evaluating inter-rater reliability and qlm_validate() for assessing accuracy against gold standards. With qlm_replicate(), researchers can systematically compare results across different models and settings to assess sensitivity and reproducibility. The qlm_trail() function creates complete audit trails following Lincoln and Guba’s (1985) framework for establishing trustworthiness in qualitative research.

The quallmer package makes AI-assisted qualitative coding accessible without requiring deep expertise in R, programming or machine learning.

The quallmer workflow

1. Define codebook

qlm_codebook()

  • Creates custom codebooks tailored to specific research questions and data types.
  • Uses instructions and type specifications from ellmer to define coding instructions and output structure.
  • Example codebook objects (e.g., data_codebook_sentiment) demonstrate how to use built-in codebooks for common qualitative coding tasks.
  • Extensible framework allows researchers to define domain-specific coding schemes.

2. Code data

qlm_code()

  • Applies LLM-based coding to qualitative data using a qlm_codebook.
  • Works with any LLM supported by ellmer.
  • Returns a qlm_coded object containing the coded results and metadata for reproducibility.

3. Replicate with different settings

qlm_replicate()

  • Re-executes coding with optional overrides (different models, codebooks, or parameters).
  • Tracks the provenance chain so that results can be compared across different configurations.
  • Enables systematic assessment of coding reliability and sensitivity to model choices.

4. Compare and validate results

qlm_compare()

  • Compares multiple qlm_coded objects to assess inter-rater reliability.
  • Computes agreement metrics including Krippendorff’s alpha, Cohen’s kappa, and Fleiss’ kappa.
  • Useful for evaluating consistency across different coders, models, or coding runs.
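
As an illustration of what an agreement metric measures, the following base R sketch computes Cohen's kappa for two sets of codes assigned to the same units. It is not quallmer's implementation and does not call qlm_compare(); the function name cohens_kappa() and the example vectors run_1 and run_2 are hypothetical.

```r
# Cohen's kappa for two coders (or two coding runs) over the same units.
# Illustrative only: this is not the code used inside qlm_compare().
cohens_kappa <- function(coder_a, coder_b) {
  stopifnot(length(coder_a) == length(coder_b))
  # Use a shared set of levels so the contingency table is square.
  lv <- sort(unique(c(as.character(coder_a), as.character(coder_b))))
  tab <- table(factor(coder_a, levels = lv), factor(coder_b, levels = lv))
  n <- sum(tab)
  p_observed <- sum(diag(tab)) / n                       # share of units coded identically
  p_expected <- sum(rowSums(tab) * colSums(tab)) / n^2   # agreement expected by chance
  (p_observed - p_expected) / (1 - p_expected)
}

# Hypothetical sentiment codes from two coding runs over the same ten texts.
run_1 <- c("pos", "pos", "neg", "neu", "neg", "pos", "neu", "neg", "pos", "neg")
run_2 <- c("pos", "neg", "neg", "neu", "neg", "pos", "pos", "neg", "pos", "neg")

k <- cohens_kappa(run_1, run_2)
print(round(k, 3))
```

Here the two runs agree on 8 of 10 texts, but kappa is lower than 0.8 because part of that agreement would be expected by chance given how often each code is used.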

qlm_validate()

  • Validates LLM-coded output against gold-standard human coding.
  • Computes classification metrics: accuracy, precision, recall, F1-score, and Cohen’s kappa.
  • Supports multiple averaging methods (macro, micro, weighted) and per-class breakdowns.
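
To make the averaging methods concrete, here is a minimal base R sketch that derives per-class precision, recall, and F1 from a confusion table and then combines them as macro, weighted, and micro averages. It is an assumption-laden illustration, not the code behind qlm_validate(); the function classification_metrics() and the vectors gold and predicted are hypothetical.

```r
# Classification metrics from gold-standard and predicted labels.
# Illustrative only: this is not the implementation used by qlm_validate().
classification_metrics <- function(gold, predicted) {
  stopifnot(length(gold) == length(predicted))
  lv <- sort(unique(c(as.character(gold), as.character(predicted))))
  tab <- table(gold = factor(gold, levels = lv),
               predicted = factor(predicted, levels = lv))
  tp <- diag(tab)
  fp <- colSums(tab) - tp   # predicted as the class, gold says otherwise
  fn <- rowSums(tab) - tp   # gold is the class, prediction missed it
  # Report 0 instead of NaN when a class has no predictions or no gold cases.
  safe_div <- function(num, den) ifelse(den == 0, 0, num / den)
  precision <- safe_div(tp, tp + fp)
  recall <- safe_div(tp, tp + fn)
  f1 <- safe_div(2 * precision * recall, precision + recall)
  support <- rowSums(tab)   # number of gold cases per class
  list(
    accuracy = sum(tp) / sum(tab),
    per_class = data.frame(class = lv, precision, recall, f1, support,
                           row.names = NULL),
    macro_f1 = mean(f1),                             # every class counts equally
    weighted_f1 = sum(f1 * support) / sum(support),  # classes weighted by gold frequency
    micro_f1 = safe_div(2 * sum(tp), 2 * sum(tp) + sum(fp) + sum(fn))  # pooled counts
  )
}

# Hypothetical gold-standard codes and LLM-assigned codes for ten texts.
gold      <- c("pos", "pos", "neg", "neu", "neg", "pos", "neu", "neg", "pos", "neg")
predicted <- c("pos", "neg", "neg", "neu", "neg", "pos", "pos", "neg", "pos", "neg")

m <- classification_metrics(gold, predicted)
print(m$per_class)
```

For single-label coding like this, the micro-averaged F1 equals accuracy, whereas the macro average gives rare classes the same weight as frequent ones.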

5. Document audit trail

qlm_trail()

  • Creates complete audit trails following Lincoln and Guba’s (1985) concept for establishing trustworthiness.
  • Captures the full decision history: models, parameters, timestamps, and parent-child relationships.
  • Stores all coded results for confirmability and dependability.
  • Reconstructs branching workflows when multiple coded objects are compared or validated.
  • Generates replication materials: environment setup, API configuration, and executable R code.
  • Use qlm_trail(..., path = "filename") to save an RDS archive and a Quarto report.
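
The idea of parent-child relationships in a decision history can be sketched in base R as follows. This is a conceptual illustration only: it does not reflect how qlm_trail() stores or structures its records, and new_run(), lineage(), the run identifiers, and the model names are all hypothetical.

```r
# Each run is a plain list; `parent` names the run it was derived from.
# Hypothetical structure, not quallmer's internal representation.
new_run <- function(id, model, params = list(), parent = NA_character_) {
  list(id = id, model = model, params = params, parent = parent,
       timestamp = Sys.time())
}

runs <- list(
  new_run("run_1", model = "model-a", params = list(temperature = 0)),
  new_run("run_2", model = "model-b", params = list(temperature = 0),
          parent = "run_1"),
  new_run("run_3", model = "model-b", params = list(temperature = 0.7),
          parent = "run_2")
)
names(runs) <- vapply(runs, function(r) r$id, character(1))

# Walk the parent links from a run back to the original coding run.
lineage <- function(runs, id) {
  chain <- character(0)
  while (!is.na(id)) {
    chain <- c(chain, id)
    id <- runs[[id]]$parent
  }
  rev(chain)  # oldest run first
}

chain <- lineage(runs, "run_3")
print(chain)

# Persist the records as an RDS file and read them back.
path <- tempfile(fileext = ".rds")
saveRDS(runs, path)
restored <- readRDS(path)
```

Keeping the parent identifier with each record is what allows a chain of replications to be reconstructed later from the saved archive alone.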

Interactive quallmer app

For an interactive Shiny application to perform manual coding, review AI-generated annotations, and compute agreement metrics, see the companion package quallmer.app.

Supported LLMs

The package supports all LLMs currently available through the ellmer package. For authentication and usage of each of these LLMs, please refer to the respective ellmer documentation. See also our tutorials on setting up an OpenAI API key and on getting started with an open-source Ollama model.

Installation

You can install the development version of quallmer from https://github.com/quallmer/quallmer with:

# install.packages("pak")
pak::pak("quallmer/quallmer")

Example use and tutorials

To learn more about how to use the package, please refer to our step-by-step tutorials.

Acknowledgments

Development of this package was assisted by Claude Code, an AI coding assistant by Anthropic, for code refactoring, documentation updates, and package restructuring.

Metadata

Version

0.3.0

License

Unknown

Platforms (78)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    uefi
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-uefi
  • aarch64-windows
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-linux
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-uefi
  • x86_64-windows