Description

A parser for web documents according to the HTML5 specification.

mangrove provides HTML parsing for the Willow web browser suite. As such, it has not necessarily been written with a broader audience in mind, but the resulting data structures should still be generic enough to serve as a general parsing library if you need HTML5 compatibility (most likely for its codified error recovery algorithms). If you do use it for other projects, please share any issues, or even just discomforts, that broader usage reveals. Note, however, that mangrove makes no attempt to parse CSS or JavaScript, or to access linked files; it leaves those tasks to other parts of the suite and merely generates a simple document tree from the markup.

About

'mangrove' provides an HTML5-compatible parser for web documents, implemented in Haskell. In keeping with Haskell's immutable data paradigm, the emphasis has been placed on avoiding side effects and mutable structures rather than on strictly following the official algorithms. The resulting document tree can be returned to willow to be styled and rendered.
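
To make the idea of an immutable document tree concrete, here is a minimal sketch. The type Node and the functions textContent and appendChild are hypothetical names invented for this illustration; they are not mangrove's actual API, which should be looked up in the package's own documentation.

```haskell
-- Hypothetical types for illustration only; NOT mangrove's real data structures.
data Node
    = Element String [(String, String)] [Node]  -- tag name, attributes, children
    | Text String
    | Comment String
    deriving (Eq, Show)

-- Collect the text of a subtree in document order, skipping comments.
textContent :: Node -> String
textContent (Text t) = t
textContent (Comment _) = ""
textContent (Element _ _ children) = concatMap textContent children

-- "Modify" a tree by building a new one; the original value is left untouched.
-- Only elements can take children, so any other node is returned unchanged.
appendChild :: Node -> Node -> Node
appendChild child (Element name attrs children) =
    Element name attrs (children ++ [child])
appendChild _ other = other

-- A small sample tree used in the note below.
example :: Node
example = Element "body" [] [Element "p" [("class", "intro")] [Text "Hello, "]]
```

With these definitions, textContent (appendChild (Text "world") example) evaluates to "Hello, world", while textContent example still evaluates to "Hello, ". Nothing is updated in place, so a consumer such as a renderer can hold on to an earlier tree safely.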

This readme is rather sparse, as it has been written for a subfolder of the complete repository; for full info on the project, see the primary readme in either this directory, its parent, or the online host, whichever of those links may work.

Coverage reporting

Unfortunately, the invocation of hpc by cabal-install <= 3.4.0.0 doesn't work properly when multiple packages are developed as part of the same project. Until the next version is released, I recommend that you don't enable coverage reports for mangrove, so that the tests themselves run correctly.
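
One way to follow that recommendation is to state the setting explicitly in the project file. The fragment below is a sketch that assumes mangrove is built from a multi-package cabal.project; the rest of that file is not shown here. Coverage is already off by default, so this only makes the choice explicit for this one package.

```cabal
-- cabal.project (fragment)
-- Assumption: mangrove is one of several packages listed in this project file.
-- Keep hpc coverage instrumentation off for mangrove so its tests run normally.
package mangrove
  coverage: False
```

For the same reason, avoid passing --enable-coverage on the command line when running mangrove's tests with an affected cabal-install version.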

Metadata

Version

0.1.0.0

License

Platforms (75)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows