Description

Split Chromosome 'Fasta' File.

Chromosome files in the 'Fasta' format usually contain large sequences, such as the human genome. Users sometimes need to split these chromosomes into separate files according to chromosome number. The 'chromseq' package helps with this, so that a selected chromosome sequence can be used for downstream analysis such as motif finding. Howard Y. Chang (2019) <doi:10.1038/s41587-019-0206-z>.

chromseq

Some FASTA files contain all chromosomes from one genome, and users sometimes need to split these chromosomes into separate files according to their number label. chromseq helps with this, so that a specific chromosome FASTA file can be used for downstream analysis.

Installation

devtools::install_github("MSQ-123/chromseq")

Usage

Replace verbose chromosome identifiers with a simple format, so that the extracted IDs are easy to manipulate:

data("id")
simpleID <- replaceText(type = "text", input = id)
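
To illustrate the idea of simplifying identifiers, here is a minimal base R sketch. It is not chromseq's implementation: the header strings, the `simplify_header` helper, and the `>chrN` output format are all assumptions made for this example.

```r
# Hypothetical FASTA headers; real headers depend on the genome source.
headers <- c(
  ">NC_000001.11 Homo sapiens chromosome 1, GRCh38.p13 Primary Assembly",
  ">NC_000010.11 Homo sapiens chromosome 10, GRCh38.p13 Primary Assembly",
  ">NC_000023.11 Homo sapiens chromosome X, GRCh38.p13 Primary Assembly"
)

# Hypothetical helper: keep only the label that follows the word "chromosome".
simplify_header <- function(x) {
  label <- sub("^>.*chromosome ([0-9]+|[XY]).*$", "\\1", x)
  paste0(">chr", label)
}

simple <- simplify_header(headers)
print(simple)
```

Short, uniform labels like these are easier to match, sort, and use as file names than the full header text.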

Extract chromosome IDs from a FASTA file:

data("text")
text <- replaceText(type = "text", input = text)
id <- subFasID(text = text)
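
Conceptually, extracting IDs means picking out the header lines of the FASTA text. The following base R sketch shows that idea on a tiny made-up input; it does not call chromseq and should not be read as how `subFasID` works internally.

```r
# A tiny made-up FASTA file held as a character vector, one element per line.
fasta_lines <- c(
  ">chr1",
  "ACGTACGT",
  "GGCC",
  ">chr2",
  "TTTTAAAA"
)

# Header lines start with ">"; every other line is sequence.
is_header <- grepl("^>", fasta_lines)
ids <- fasta_lines[is_header]
print(ids)
```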

Transform the large character object into a list of chromosomes:

fil <- tempfile(fileext = ".data")
write(text,file = fil)
con0 <- file(fil, "r")
tex <- readToList(id,text = text,con = con0)
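
As a rough picture of what a per-chromosome list can look like, the base R sketch below groups FASTA lines into one list element per record. The input is made up and the list layout is an assumption; the structure returned by `readToList` may differ.

```r
fasta_lines <- c(">chr1", "ACGTACGT", "GGCC", ">chr2", "TTTTAAAA")

is_header <- grepl("^>", fasta_lines)
# cumsum() over the header flags gives each line the index of its record.
record_index <- cumsum(is_header)
records <- split(fasta_lines, record_index)
# Name each record after its header, without the leading ">".
names(records) <- sub("^>", "", fasta_lines[is_header])
str(records)
```

Each element holds the header line followed by its sequence lines, so a single chromosome can be pulled out by name, for example `records$chr1`.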

Sort the chromosome list by chromosome number. Note: the “single” and “double” chromosomes should be sorted separately. Note: this data is already sorted; the step is shown for expository purposes only.

tex2 <- sortList(id = id, tex = tex, chrsig = "single")
tex3 <- sortList(id = id, tex = tex, chrsig = "double")
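
Ordering chromosome labels needs some care in general, because labels are text. The base R sketch below contrasts character ordering with numeric ordering on made-up labels. It only illustrates that general point and is not a description of how `sortList` works.

```r
labels <- c("chr10", "chr2", "chr1", "chr21", "chr9")

# Character sorting compares text, so "chr10" comes before "chr2".
# method = "radix" sorts in the C locale, independent of the session locale.
alphabetical <- sort(labels, method = "radix")
print(alphabetical)

# Sorting on the numeric part gives the usual chromosome order.
chr_number <- as.integer(sub("^chr", "", labels))
numeric_order <- labels[order(chr_number)]
print(numeric_order)
```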

Now we can split the chromosome fasta file into different files according to their number.

outdir <- tempdir()
splitChr(tex = tex2,chr=seq(1,9),sex = TRUE,outdir = outdir)
# Chromosome X or Y is "single", so tex should be a
# "single" chromosome list.
# In the call below, sex should be FALSE.
splitChr(tex = tex3,chr=seq(10,22),sex = FALSE,outdir = outdir)
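
For readers who want to see the overall idea without the package, here is a self-contained base R sketch that splits a small made-up FASTA input into one file per record. The output directory name and the `<name>.fa` naming scheme are assumptions for this example; `splitChr` may name and lay out its files differently.

```r
fasta_lines <- c(">chr1", "ACGTACGT", "GGCC", ">chr2", "TTTTAAAA")

# Group lines into records, one per header line.
is_header <- grepl("^>", fasta_lines)
records <- split(fasta_lines, cumsum(is_header))
names(records) <- sub("^>", "", fasta_lines[is_header])

# Write each record to <out_dir>/<name>.fa (naming scheme is an assumption).
out_dir <- file.path(tempdir(), "split_sketch")
dir.create(out_dir, showWarnings = FALSE)
out_files <- file.path(out_dir, paste0(names(records), ".fa"))
for (i in seq_along(records)) {
  writeLines(records[[i]], out_files[i])
}
print(basename(out_files))
```

Reading the whole file into memory, as this sketch does, is fine for small inputs; genome-scale files usually call for reading through a connection in chunks instead.
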
Metadata

Version

0.1.3

License

Unknown

Platforms (75)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows