Description
Fast parsing and extracting information from (possibly malformed) HTML/XML documents.
Fast TagSoup parser. Speeds of 20-200 MB/s were observed.
Works only with strict bytestrings.
This library is intended to be used in conjunction with the original tagsoup package:
import Text.HTML.TagSoup hiding (parseTags, renderTags)
import Text.HTML.TagSoup.Fast
Besides speed, fast-tagsoup correctly handles HTML <script> and <style> tags, converts tags to lower case, and can decode non-UTF-8 XML for you.
This parser is used in production in the BazQux Reader feed and comment crawler.
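
A minimal sketch of how the two packages might be combined, using the import pattern shown above. It assumes that Text.HTML.TagSoup.Fast exports a parseTags that takes a strict ByteString and returns a list of Tag ByteString; the helper extractLinks and the sample input are hypothetical and only for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import qualified Data.ByteString.Char8 as B
-- Tag type and helpers come from the original tagsoup package;
-- its parseTags/renderTags are hidden to avoid name clashes.
import Text.HTML.TagSoup hiding (parseTags, renderTags)
-- Assumption: the fast parser has the type
--   parseTags :: B.ByteString -> [Tag B.ByteString]
import Text.HTML.TagSoup.Fast (parseTags)

-- Hypothetical helper: collect the href attribute of every <a> opening tag.
extractLinks :: B.ByteString -> [B.ByteString]
extractLinks html =
  [ href
  | TagOpen "a" attrs <- parseTags html
  , Just href <- [lookup "href" attrs]
  ]

main :: IO ()
main =
  mapM_ B.putStrLn
    (extractLinks "<p><a href=\"https://example.com\">example</a></p>")
```

Since the parser works only with strict bytestrings, input read lazily (for example from a network response) would need to be converted to a strict ByteString first.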