Description
Sequence Globally Unique Identifier (SEGUID) Checksums.
Description
Implementation of the original Sequence Globally Unique Identifier (SEGUID) algorithm [Babnigg and Giometti (2006) <doi:10.1002/pmic.200600032>] and SEGUID v2 (<https://www.seguid.org>), which extends SEGUID v1 with support for linear, circular, single- and double-stranded biological sequences, e.g. DNA, RNA, and proteins.
README.md
SEGUID v2: Checksums for Linear, Circular, Single- and Double-Stranded Biological Sequences
This R package, seguid, implements SEGUID v2 together with the original SEGUID algorithm.
Examples
Single-stranded DNA
> library(seguid)
## Linear single-stranded DNA
> lsseguid("TATGCCAA")
[1] "lsseguid=EevrucUNYjqlsxrTEK8JJxPYllk"
## Linear single-stranded DNA (same bases rotated by two; the linear checksum differs)
> lsseguid("AATATGCC")
[1] "lsseguid=XsJzXMxgv7sbpqIzFH9dgrHUpWw"
## Circular single-stranded DNA
> csseguid("TATGCCAA")
[1] "csseguid=XsJzXMxgv7sbpqIzFH9dgrHUpWw"
## Same checksum after rotating by two bases
> csseguid("GCCAATAT")
[1] "csseguid=XsJzXMxgv7sbpqIzFH9dgrHUpWw"
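The circular examples above return the same checksum as `lsseguid("AATATGCC")`, which is consistent with the circular sequence being reduced to its lexicographically smallest rotation before hashing. The following is a minimal base-R sketch of that idea only. The helper `min_rotation()` is hypothetical and not part of the seguid API, and the package may compute the rotation differently.

```r
# Hypothetical helper (not part of seguid): return the lexicographically
# smallest rotation of a sequence, using brute force over all rotations.
min_rotation <- function(x) {
  n <- nchar(x)
  if (n <= 1L) return(x)
  # Every rotation of x is a length-n window of x concatenated with itself
  doubled <- paste0(x, x)
  starts <- seq_len(n)
  rotations <- substring(doubled, starts, starts + n - 1L)
  # method = "radix" sorts in the C locale, so the result does not depend
  # on the user's collation settings
  sort(rotations, method = "radix")[1]
}

print(min_rotation("TATGCCAA"))  # "AATATGCC"
print(min_rotation("GCCAATAT"))  # "AATATGCC"
```

Both inputs reduce to the same string, so any checksum computed from the minimal rotation cannot depend on where the circular sequence was cut.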
Double-stranded DNA
> library(seguid)
## Linear double-stranded DNA
> ldseguid("AATATGCC", "GGCATATT")
[1] "ldseguid=dUxN7YQyVInv3oDcvz8ByupL44A"
## Same checksum when swapping Watson and Crick
> ldseguid("GGCATATT", "AATATGCC")
[1] "ldseguid=dUxN7YQyVInv3oDcvz8ByupL44A"
## Circular double-stranded DNA
> cdseguid("TATGCCAA", "TTGGCATA")
[1] "cdseguid=dUxN7YQyVInv3oDcvz8ByupL44A"
## Same checksum when swapping Watson and Crick
> cdseguid("TTGGCATA", "TATGCCAA")
[1] "cdseguid=dUxN7YQyVInv3oDcvz8ByupL44A"
## Same checksum after rotating by two base pairs (= minimal rotation by Watson)
> cdseguid("AATATGCC", "GGCATATT")
[1] "cdseguid=dUxN7YQyVInv3oDcvz8ByupL44A"
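In each double-stranded example above, the second argument is the reverse complement of the first, so both strands are written in the same reading direction. The sketch below shows one way to derive the Crick strand from the Watson strand in base R. The helper `reverse_complement()` is hypothetical and not part of the seguid API, and it assumes upper-case, unambiguous DNA (A, C, G, T only).

```r
# Hypothetical helper (not part of seguid): reverse complement of an
# upper-case DNA sequence containing only A, C, G, and T.
reverse_complement <- function(x) {
  # Complement each base, then reverse the order of the characters
  complemented <- chartr("ACGT", "TGCA", x)
  paste(rev(strsplit(complemented, "")[[1]]), collapse = "")
}

watson <- "AATATGCC"
crick <- reverse_complement(watson)
print(crick)  # "GGCATATT"
```

The resulting pair could then be passed as `ldseguid(watson, crick)`, matching the arguments used in the linear double-stranded example.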