Haskell library to follow content published on any subject.
Please, see the README on GitHub at https://github.com/waiting-for-dev/follow#readme
Follow
Follow
is a Haskell library to build recipes which allow you to follow the content published about any subject you are interested.
Here bellow you have a quick tutorial you can follow. Just run the snippets of code in the repl.
:set -XOverloadedStrings
import Follow
import Data.Time (LocalTime)
import Control.Monad (join)
import qualified Data.Text.IO as T (writeFile)
import Data.Yaml (decodeFileThrough)
Subject
A subject is just a bunch of information about what is being followed. It consists of a title, a description and a list of tags,
haskell =
Subject
{ sTitle = "Haskell"
, sDescription = "Some resources about Haskell"
, sTags = ["haskell", "programming"]
}
Directory
A directory is just a subject and a list of entries.
An entry is an item meant to contain an URI with content relative to the associated subject along with associated information.
manualDirectory =
Directory
{ dSubject = haskell
, dEntries =
[ Entry
{ eURI =
Just "https://bartoszmilewski.com/2013/06/19/basics-of-haskell/"
, eGUID = Just "basics-of-haskell"
, eTitle = Just "Basics of Haskell"
, eDescription = Just "Introductory material for Haskell"
, eAuthor = Just "Bartosz Milewski"
, ePublishDate = Just (read "2013-06-19 14:14:00" :: LocalTime)
}
]
}
Fetchers
Of course, building list of entries by hand is not very useful. Fetchers are functions which usually reach the outside world to return a list of entries and which can throw an error.
Any fetcher can be used, but Follow
tries to ship with common ones. Right now there are two fetchers available:
- Feed: Take entries from a RSS or Atom feed.
- Web Scraping: Take entries scraping the HTML of a web page.
The function directoryFromFetched
can be used to glue a subject with some fetched content:
import qualified Follow.Fetchers.Feed as Feed
directory =
directoryFromFetched (Feed.fetch "https://bartoszmilewski.com/feed/") haskell
Middlewares
Fetched content may need some further processing in order to fit what is actually desired. A middleware is a function which transforms a directory into another directory, allowing us to do any kind of transformation.
The aim of Follow
is to provide some common middlewares. For now, there are these ones:
- Filter: Filter entries according some predicate.
- Sort: Sort entries.
- Decode: Decodes entries from UTF8 or other encodings.
import qualified Follow.Middlewares.Sort as Sort
sortedDirectory =
Sort.apply (Sort.byGetter eTitle) <$> directory
Digesters
Once you have your distillate content, you need some way to consume it. A Digester
is a function which transforms a directory into anything that can be consumed by an end user.
As before, Follow
aims to provide useful ones out of the box. Right now the following are available:
- Simple Text: Simple textual representation of the directory.
- Pocket: Send fetched links to Pocket.
import qualified Follow.Digesters.SimpleText as SimpleText
content = SimpleText.digest <$> sortedDirectory
Now, for example, you are ready to save the content to a file:
join $ T.writeFile "/your/path/haskell.txt" <$> content
Recipes: Combining sources and middlewares
Content is not limited to be fetched from a single source. Instead, a directory can be built merging the entries fetched from different sources. Also, the stack of middlewares to be applied to each source can be given in a single shot.
This whole process specification is called a Recipe
, and it contains all the information needed to follow a subject.
To build the recipe you need to provide three fields:
- The subject being followed.
- A list of two field tuples where:
- First field is some fetched content.
- Second field is a list of middlewares to apply to the fetched content in the first field.
- A list of middlewares to apply to the directory resulted after applying the list of fetched/middlewares.
haskellRecipe =
Recipe
{ rSubject = haskell
, rSteps =
[ ( Feed.fetch "https://bartoszmilewski.com/feed/"
, [Sort.apply (Sort.byGetter eTitle)])
, (Feed.fetch "https://planet.haskell.org/rss20.xml", [])
]
, rMiddlewares = []
}
You can combine the function directoryFromRecipe
and some digester to quickly consume a recipe:
SimpleText.digest <$> directoryFromRecipe haskellRecipe
Collecting recipes
One nice thing in Follow is that you don't need to create the recipes programmatically each time you need them. Instead, you can store them in a YAML file and just parse them when you need.
For example, the previous recipe can be represented in a file recipe.yml
as the following:
subject:
title: Haskell
description: Some resources about Haskell
tags: [haskell, programming]
steps:
-
- type: feed
options:
url: "https://bartoszmilewski.com/feed/"
-
- type: sort
options:
function:
type: by_field
options:
field: title
middlewares: []
You can use now decode functions in Data.Yaml
to get the recipe back:
recipe' <- decodeFileThrow "/your/path/recipe.yml" :: IO (Recipe IO)
directory' = directoryFromRecipe recipe'
Look at src/Follow/Parser.hs
for details about encoding each kind of fetcher and middleware.
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/waiting-for-dev/follow. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.
License
The package is available as open source under the terms of the GNU LGPLv3 License.