Benchmarks to compare concurrency APIs.
concurrency-benchmarks
Benchmarks to compare the pure concurrency overhead of various flavors of concurrent streamly streams and the async package.
Run the run.sh script to run the benchmarks and create the charts. You can also use cabal new-bench or stack bench to run the benchmarks. To generate charts, run the benchmarks with the --csv-raw=results.csv option and then run makecharts results.csv. Charts are generated in the charts directory.
Methodology
A total of 10,000 tasks are run for each concurrency mechanism being compared. Two independent experiments are performed:
- In the first experiment, each task is just a no-op, i.e. it takes almost no time to execute.
- In the second experiment, each task introduces a 5-second delay.
The first case shows streamly's scheduling, which automatically runs the tasks on fewer threads than there are tasks. When tasks do not block and have very low latency, streamly may run multiple tasks per thread, so it is much faster on this benchmark.
In the second case, a 5-second delay is introduced so that streamly uses one thread per task. This is similar to what async does, and therefore makes for a fair comparison. For the async package, mapConcurrently is used, which can be compared with streamly's ahead style stream.
For streamly, this is the code that is benchmarked. By default streamly limits the buffer size and the number of threads; we set those limits to -1, which means there is no limit:
    let work = (\i -> threadDelay 5000000 >> return i)
    in runStream
        $ aheadly
        $ maxBuffer (-1)
        $ maxThreads (-1)
        $ S.fromFoldableM $ map work [1..10000]
For async, this is the code that is benchmarked:
    let work = (\i -> threadDelay 5000000 >> return i)
    in mapConcurrently work [1..10000]
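To try either experiment outside the benchmark suite, a minimal stand-alone sketch for the async side might look like the following. It assumes the async package is installed and base >= 4.11 (for GHC.Clock). The names runTasks and delayUs are hypothetical and not part of this repository, and timing with getMonotonicTime is a simplification for illustration, not necessarily how the benchmark itself measures.

```haskell
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (mapConcurrently)
import Control.Monad (when)
import GHC.Clock (getMonotonicTime)

-- | Hypothetical helper: run 'n' tasks concurrently with mapConcurrently.
-- Each task sleeps for 'delayUs' microseconds (0 means a no-op task) and
-- returns its index. The result is (task results, elapsed wall-clock seconds).
runTasks :: Int -> Int -> IO ([Int], Double)
runTasks n delayUs = do
    let work i = when (delayUs > 0) (threadDelay delayUs) >> return i
    start   <- getMonotonicTime
    results <- mapConcurrently work [1..n]
    end     <- getMonotonicTime
    return (results, end - start)

main :: IO ()
main = do
    -- Zero-delay experiment; pass 5000000 instead of 0 for the 5-second case.
    (rs, secs) <- runTasks 10000 0
    putStrLn $ "tasks completed: " ++ show (length rs)
    putStrLn $ "elapsed seconds: " ++ show secs
```

Build it with -threaded and run it with +RTS -N so the tasks can use multiple cores.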
Results
These charts compare streamly-0.5.1 and async-2.2.1 on a MacBook Pro with a 2.2 GHz Intel Core i7 processor. When compiling, the GHC options -threaded -with-rtsopts "-N" were used to enable the use of multiple processor cores in parallel. For streamly, results for both async and ahead style streams are shown.
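When reproducing these results, it can help to confirm that the threaded runtime and the -N option actually took effect. The sketch below uses only functions from base; describeRuntime is a hypothetical helper name introduced here for illustration and is not part of this repository.

```haskell
import Control.Concurrent (getNumCapabilities, rtsSupportsBoundThreads)
import GHC.Conc (getNumProcessors)

-- | Hypothetical helper: format the runtime settings that matter for a
-- concurrency benchmark.
describeRuntime :: Bool -> Int -> Int -> String
describeRuntime threaded caps procs =
    "threaded RTS: " ++ show threaded
        ++ ", capabilities: " ++ show caps
        ++ ", processors: " ++ show procs

main :: IO ()
main = do
    caps  <- getNumCapabilities  -- number of capabilities, as set by -N
    procs <- getNumProcessors    -- processors reported by the OS
    -- rtsSupportsBoundThreads is True only when linked with -threaded.
    putStrLn (describeRuntime rtsSupportsBoundThreads caps procs)
```

If the program reports a non-threaded RTS or a single capability on a multi-core machine, the tasks are not running in parallel across cores and the numbers will not be comparable to the charts.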
Zero delay case
Peak Memory Consumed
Time Taken
5 second delay case
Peak Memory Consumed
Time Taken
Note that this time shows only the overhead, not the full time taken by the benchmark. For example, the actual time taken by the async benchmark is 5.135 seconds, but 5 seconds of that is the delay introduced by each parallel task. We compute the overhead of concurrency by deducting those 5 seconds from the actual time taken, so the overhead is 135 ms in the case of async.
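The overhead arithmetic described above can be written as a small function. This is an illustrative sketch only; overheadMs is a hypothetical name, and rounding to whole milliseconds is an assumption made here for display.

```haskell
-- | Hypothetical helper: concurrency overhead in milliseconds, given the
-- total wall-clock time and the per-task delay, both in seconds.
overheadMs :: Double -> Double -> Int
overheadMs totalSecs delaySecs = round ((totalSecs - delaySecs) * 1000)

main :: IO ()
main =
    -- The async figures quoted above: 5.135 s total, 5 s delay.
    print (overheadMs 5.135 5)  -- prints 135
```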
Feedback
Feedback is welcome. Please raise an issue, send a PR or send an email to the author.