MyNixOS website logo
Description

Checks Title, Abstract and Keywords to Optimise Discoverability.

A suite of tools are provided here to support authors in making their research more discoverable. check_keywords() - this function checks the keywords to assess whether they are already represented in the title and abstract. check_fields() - this function compares terminology used across the title, abstract and keywords to assess where terminological diversity (i.e. the use of synonyms) could increase the likelihood of the record being identified in a search. The function looks for terms in the title and abstract that also exist in other fields and highlights these as needing attention. suggest_keywords() - this function takes a full text document and produces a list of unigrams, bigrams and trigrams (1-, 2- or 2-word phrases) present in the full text after removing stop words (words with a low utility in natural language processing) that do not occur in the title or abstract that may be suitable candidates for keywords. suggest_title() - this function takes a full text document and produces a list of the most frequently used unigrams, bigrams and trigrams after removing stop words that do not occur in the abstract or keywords that may be suitable candidates for title words. check_title() - this function carries out a number of sub tasks: 1) it compares the length (number of words) of the title with the mean length of titles in major bibliographic databases to assess whether the title is likely to be too short; 2) it assesses the proportion of stop words in the title to highlight titles with low utility in search engines that strip out stop words; 3) it compares the title with a given sample of record titles from an .ris import and calculates a similarity score based on phrase overlap. This highlights the level of uniqueness of the title. This version of the package also contains functions currently in a non-CRAN package called 'litsearchr' <https://github.com/elizagrames/litsearchr>.

discoverableresearch

This package provides a suite of functions to support authors designing discoverable research articles (i.e. easily findable using bibliographic databases that index titles, abstracts and keywords).

The package offers the following functions:

  1. 'format_keywords()' - accepts a text string containing keywords separated by a given character and outputs a lower case vector list of strings
  2. 'check_keywords()' - checks a set of keywords for an article to assess whether they are already represented in the title and abstract
  3. 'check_title_length()' - checks a given title for an article to assess how discoverable it is based on its length and proportion of words that are non-stop words
  4. 'compare_title()' - checks given title for an article to assess how discoverable it is based on similarity scores relative to a given sample set of titles
  5. 'get_tokens()' - removes stop words from a text
  6. 'suggest_title()' - suggests possible title words by extracting uni-, bi-, and tri-grams from a long text (e.g. article full text), having removed punctuation and stop words. Returns the remaining words as a vector of strings and assesses whether they are already present in the title or abstract
  7. 'suggest_keywords()' - suggests possible keywords by extracting uni-, bi-, and tri-grams from a long text (e.g. article full text), having removed punctuation and stop words. Returns the remaining words as a vector of strings and assesses whether they are already present in the abstract or keywords
  8. 'check_fields()' - scans ngrams (1-, 2-, and 3- word phrases) from titles, abstracts and (original, as supplied) keywords to examine overlap to identify terms in the title and keywords that could be amended or replaced to optimise 'discoverability'.

Example

For example, consider the following record:

title <- "On the use of keywords and the discoverability of systematic reviews"

abstract <- "In this research article, we investigate the use of different strategies for the development of article keywords by academic writers to improve the discoverability of systematic reviews. We appraise the utility of keyword diversity and provide suggestions of how to draft potential keywords for article indexing and cataloging in bibliographic databases in order to improve systematic review discoverability."

keywords <- "systematic reviews; discoverability; keywords; indexing; cataloging; bibliographic databases"

It is immediately obvious that the term 'systematic reviews' exists in the title and the keywords, reducing the 'discoverability' of the record. This package allows the user to automatically identify terms that shoudl be replaced and identify from a full text document, which other terms might be useful replacements. The following example has higher 'discoverability':

title <- "A research article investigating diverse keyword terminology and systematic reviews discoverability"

abstract <- "In this empirical manuscript, we describe the results of a study to assess the impacts of employing different strategies for the development of article keywording by academic writers to improve the effectiveness of academic searches in retrieving relevant articles, specifically within the context of evidence syntheses. We appraise the utility of terminological diversity and provide suggestions of how to draft potential synonyms for article indexing and cataloging in bibliographic databases in order to improve findability and research impact."

keywords <- "systematic reviewing; systematic literature review; abstracting; open discovery; information retrieval; informatics"

Metadata

Version

0.0.1

License

Unknown

Platforms (75)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
Show all
  • aarch64-darwin
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows