Fast, effectful byte streams.
This library enables fast and safe streaming of byte data, in either Word8 or Char form. It is a core addition to the streaming ecosystem and avoids the usual pitfalls of combining lazy ByteStrings with lazy IO.
We follow the philosophy shared by streaming that "the best API is the one you already know". Thus this library mirrors the API of the bytestring library as closely as possible.
See the module documentation and the README for more information.
streaming-bytestring
This library enables fast and safe streaming of byte data, in either Word8 or Char form. It is a core addition to the streaming ecosystem and avoids the usual pitfalls of combining lazy ByteStrings with lazy IO.
This library is used by streaming-attoparsec to enable vanilla Attoparsec parsers to work with streaming "for free".
Usage
Importing and Types
Modules from this library are intended to be imported qualified. To avoid conflicts with both the bytestring library and streaming, we recommend Q as the qualified name:
import qualified Streaming.ByteString.Char8 as Q
Like the bytestring library, leaving off the Char8 will expose an API based on Word8. Following the philosophy of streaming that "the best API is the one you already know", these APIs are based closely on bytestring. The core type is ByteStream m r, where:
m: The Monad used to fetch further chunks from the "source", usually IO.
r: The final return value after all streaming has concluded, usually () as in streaming.
You can imagine this type to represent an infinitely-sized collection of bytes, although internally it references a strict ByteString no larger than 32 KB, followed by monadic instructions to fetch further chunks.
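To make the roles of m and r concrete, here is a minimal sketch that builds a small in-memory stream and consumes it in two different monads. The names greeting, pureLength, and printGreeting are hypothetical and exist only for this illustration.

```haskell
import qualified Data.ByteString.Char8 as B
import Data.Functor.Identity (runIdentity)
import qualified Streaming.ByteString.Char8 as Q

-- | A small in-memory stream (hypothetical example data). It stays
-- polymorphic in @m@ because producing it needs no effects, and @r@ is
-- @()@ because there is nothing to return once the bytes run out.
greeting :: Monad m => Q.ByteStream m ()
greeting = Q.fromStrict (B.pack "hello, streams")

-- | Consume the stream purely by choosing 'Identity' for @m@.
pureLength :: Int
pureLength = runIdentity (Q.length_ greeting)

-- | Consume the same stream in 'IO', writing its bytes to stdout.
printGreeting :: IO ()
printGreeting = Q.stdout greeting
```

The same greeting value serves both consumers. Only the choice of m at the use site decides whether any effects run.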
Examples
File Input
To open a file of any size and count its characters:
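import Control.Monad.Trans.Resource (MonadResource, runResourceT)
import qualified Streaming.ByteString.Char8 as Q
-- | Represents a potentially-infinite stream of `Char`.
-- `Q.readFile` needs a `MonadResource` context, so plain `IO` will not do here.
chars :: MonadResource m => Q.ByteStream m ()
chars = Q.readFile "huge-file.txt"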
main :: IO ()
main = runResourceT (Q.length_ chars) >>= print
Note that file IO specifically requires the resourcet library.
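As a further sketch of file IO under resourcet, the following copies a file chunk by chunk, so memory use should stay bounded by the chunk size rather than the file size. The name copyStreaming is hypothetical. It relies on Q.readFile and Q.writeFile from the Word8 module.

```haskell
import Control.Monad.Trans.Resource (runResourceT)
import qualified Streaming.ByteString as Q

-- | Copy @src@ to @dst@ one chunk at a time (hypothetical helper).
-- Both file handles are registered with 'ResourceT' and are released
-- when 'runResourceT' returns, including when an exception is thrown.
copyStreaming :: FilePath -> FilePath -> IO ()
copyStreaming src dst = runResourceT (Q.writeFile dst (Q.readFile src))
```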
Line splitting and Stream interop
In the example above you may have noticed a lack of Of that we usually see with Stream. Our old friend lines hints at this too:
lines :: Monad m => ByteStream m r -> Stream (ByteStream m) m r
A stream-of-streams, yet no Of here either. The return type can't naively be Stream (Of ByteString) m r, since the first line break might be at the very end of a large file. Forcing that whole line into a single strict ByteString could exhaust memory and crash your program.
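Because each line is itself a ByteStream, it can be reduced to a small summary without ever holding a whole line in memory. A minimal sketch, with the hypothetical name longestLine, that finds the length of the longest line:

```haskell
import qualified Streaming.ByteString.Char8 as Q
import qualified Streaming.Prelude as S

-- | Length in bytes of the longest line (hypothetical helper).
-- 'Q.length' drains one line and yields @Of Int r@, which is what
-- 'S.mapped' needs to turn the stream-of-streams into a plain
-- @Stream (Of Int) m r@. 'S.fold_' then keeps a running maximum.
longestLine :: Monad m => Q.ByteStream m r -> m Int
longestLine = S.fold_ max 0 id . S.mapped Q.length . Q.lines
```

Only one Int per line is ever retained, so a file consisting of a single enormous line is processed in the same bounded memory as any other.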
To count the number of lines whose first letter is i:
import qualified Streaming.Prelude as S
-- In the comments below, m is ResourceT IO (required by Q.readFile).
countOfI :: IO Int
countOfI = runResourceT
  . S.length_                   -- m Int
  . S.filter (== 'i')           -- Stream (Of Char) m ()
  . S.concat                    -- Stream (Of Char) m ()
  . S.mapped Q.head             -- Stream (Of (Maybe Char)) m ()
  . Q.lines                     -- Stream (ByteStream m) m ()
  $ Q.readFile "huge-file.txt"  -- ByteStream m ()
Critically, there are several functions which, when combined with mapped, can bring us back into Of-land:
head :: Monad m => ByteStream m r -> m (Of (Maybe Char) r)
last :: Monad m => ByteStream m r -> m (Of (Maybe Char) r)
null :: Monad m => ByteStream m r -> m (Of Bool r)
count :: Monad m => Char -> ByteStream m r -> m (Of Int r)
toLazy :: Monad m => ByteStream m r -> m (Of ByteString r) -- Be careful with this.
toStrict :: Monad m => ByteStream m r -> m (Of ByteString r) -- Be even *more* careful with this.
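As a sketch of how these combine with mapped, the helpers below (hypothetical names strictLines and emptyLineCount) turn the output of lines back into an ordinary Stream. The strictLines version is only reasonable when every line is known to be short, since each line is loaded fully into memory.

```haskell
import qualified Data.ByteString.Char8 as B
import Streaming (Of, Stream)
import qualified Streaming.ByteString.Char8 as Q
import qualified Streaming.Prelude as S

-- | Each line as a strict 'B.ByteString' (hypothetical helper).
-- Assumes lines are small; a huge line would be held in memory whole.
strictLines :: Monad m => Q.ByteStream m r -> Stream (Of B.ByteString) m r
strictLines = S.mapped Q.toStrict . Q.lines

-- | Number of empty lines (hypothetical helper). 'Q.null' yields one
-- Bool per line, so no line content is retained.
emptyLineCount :: Monad m => Q.ByteStream m r -> m Int
emptyLineCount = S.length_ . S.filter id . S.mapped Q.null . Q.lines
```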
When moving in the opposite direction API-wise, consider:
fromChunks :: Stream (Of ByteString) m r -> ByteStream m r
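A minimal sketch of that direction, assuming the chunks already exist as strict ByteStrings. The names joined and joinedStrict, and the sample chunks, are hypothetical.

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Streaming.ByteString.Char8 as Q
import qualified Streaming.Prelude as S

-- | Glue a stream of strict chunks into a single 'Q.ByteStream'
-- (hypothetical example data). 'S.each' lifts the list into a Stream.
joined :: Monad m => Q.ByteStream m ()
joined = Q.fromChunks (S.each (map B.pack ["stream", "ing-", "bytestring"]))

-- | Collect the result strictly. This is fine here only because the
-- input is tiny; see the warnings on toStrict above.
joinedStrict :: Monad m => m B.ByteString
joinedStrict = Q.toStrict_ joined
```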