Description

Calculate the Revealed Aggregator of Probability Predictions.

Forecasters predicting the chances of a future event may disagree due to differing evidence or noise. To harness the collective evidence of the crowd, Ville Satopää (2021) "Regularized Aggregation of One-off Probability Predictions" <https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3769945> proposes a Bayesian aggregator that is regularized by analyzing the forecasters' disagreement and ascribing over-dispersion to noise. This aggregator requires no user intervention and can be computed efficiently even for a large number of predictions. The author evaluates the aggregator on subjective probability predictions collected during a four-year forecasting tournament sponsored by the US intelligence community. The aggregator improves on the accuracy of simple averaging by around 20% and on that of other state-of-the-art aggregators by 10-25%. The advantage stems almost exclusively from improved calibration.

This aggregator, known as "the revealed aggregator", takes as input (a) the forecasters' probability predictions (p) of a future binary event and (b) the forecasters' common prior (p0) of the future event. In this R package, the function sample_aggregator(p,p0,...) allows the user to calculate the revealed aggregator. Its use is illustrated with a simple example.
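For reference, the simple-averaging baseline mentioned above is just the unweighted mean of the predictions. The following base-R sketch computes it for the five predictions used in the Example section of this README; the variable name simple_average is illustrative and not part of braggR.

```r
# Five example predictions (the same values used in the Example section).
p <- c(1/2, 5/16, 1/8, 1/4, 1/2)

# Simple averaging: the unweighted mean of the predictions.
simple_average <- mean(p)
simple_average
#> [1] 0.3375
```

This value can be compared with the revealed aggregator reported in the Example section for the same predictions.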

README.md

braggR

The goal of braggR is to provide easy access to the revealed aggregator proposed in Satopää (2021).

Installation

You can install the released version of braggR from CRAN with:

install.packages("braggR")

Example

This section illustrates braggR on Scenario B in Satopää (2021).

library(braggR)
# Forecasters' probability predictions:
p = c(1/2, 5/16, 1/8, 1/4, 1/2)

## Aggregate with a fixed common prior of 0.5.

# Sample the posterior distribution:
post_sample = sample_aggregator(p, p0 = 0.5, num_sample = 10^6, seed = 1)
# The posterior means of the model parameters:
colMeans(post_sample[,-1])
#>       rho     gamma     delta        p0 
#> 0.3821977 0.4742795 0.6561926 0.5000000
# The posterior mean of the level of rational disagreement:
mean(post_sample[,3]-post_sample[,2])
#> [1] 0.09208173
# The posterior mean of the level of irrational disagreement:
mean(post_sample[,4]-post_sample[,3])
#> [1] 0.1819131

# The revealed aggregator (a.k.a., the posterior mean of the oracle aggregator):
mean(post_sample[,1])
#> [1] 0.1405172
# The 95% credible interval of the oracle aggregator:
quantile(post_sample[,1], c(0.025, 0.975))
#>        2.5%       97.5% 
#> 0.001800206 0.284216903

This illustration aggregates the predictions in p by sampling the posterior distribution 1,000,000 times. The common prior is fixed to p0 = 0.5. By default, burn-in and thinning are set to num_sample/2 and 1, respectively. Therefore, in this case, the first 500,000 of the 1,000,000 sampled values are discarded as burn-in. Because thinning is equal to 1, no further draws are discarded. The final output post_sample then holds 500,000 draws for the aggregate and the model parameters rho, gamma, delta, and p0. Because p0 was fixed to 0.5, it is not sampled in this case, so all values in the final column of post_sample are equal to 0.5. The other quantities, however, show posterior variability and can be summarized with the posterior mean. The first column of post_sample represents the posterior sample of the oracle aggregator. The average of these values is called the revealed aggregator in Satopää (2021). The final line shows the 95% credible interval of the oracle aggregator.
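The burn-in and thinning arithmetic can be checked with a few lines of base R. This is only a sketch of the bookkeeping under the defaults stated above; the variable names are illustrative, and how braggR indexes the retained draws internally is an assumption.

```r
num_sample <- 10^6            # total number of draws requested
burnin     <- num_sample / 2  # stated default: discard the first half
thinning   <- 1               # stated default: keep every draw after burn-in

# Indices of the draws that survive burn-in and thinning
# (assumed bookkeeping; braggR's internal indexing may differ).
kept <- seq(from = burnin + 1, to = num_sample, by = thinning)
length(kept)
#> [1] 500000

# For comparison, keeping only every 10th draw after burn-in:
kept_thin10 <- seq(from = burnin + 1, to = num_sample, by = 10)
length(kept_thin10)
#> [1] 50000
```

A larger thinning value reduces the number of stored draws proportionally, which lowers memory use at the cost of a smaller posterior sample.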

# Aggregate based on a Beta(2,1) prior distribution on the common prior.
# Recall that Beta(1,1) corresponds to the uniform distribution.
# Beta(2,1) has mean alpha / (alpha + beta) = 2/3 and 
# variance alpha * beta / ((alpha+beta)^2*(alpha+beta+1)) = 1/18

# Sample the posterior distribution:
post_sample = sample_aggregator(p, alpha = 2, beta = 1, num_sample = 10^6, seed = 1)
# The posterior means of the oracle aggregator and the model parameters:
colMeans(post_sample)
#> aggregate       rho     gamma     delta        p0 
#> 0.1724935 0.5636953 0.6376554 0.9892552 0.6662238

This repeats the first illustration but, instead of fixing p0 to 0.5, places a Beta(2,1) prior distribution on the common prior, so p0 is now sampled along with the other parameters. As a result, the final column of post_sample shows posterior variability; here it averages to about 0.666, close to the prior mean of 2/3.
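The prior mean and variance quoted in the code comments can be verified in base R, independently of braggR. The sketch below evaluates the closed-form Beta moments and compares them with a Monte Carlo estimate from rbeta(); the variable names are illustrative.

```r
shape_alpha <- 2
shape_beta  <- 1

# Closed-form moments of a Beta(shape_alpha, shape_beta) distribution.
prior_mean <- shape_alpha / (shape_alpha + shape_beta)
prior_var  <- shape_alpha * shape_beta /
  ((shape_alpha + shape_beta)^2 * (shape_alpha + shape_beta + 1))
c(mean = prior_mean, variance = prior_var)  # 2/3 and 1/18

# Monte Carlo check using base R's Beta sampler.
set.seed(1)
draws <- rbeta(1e5, shape1 = shape_alpha, shape2 = shape_beta)
c(mean = mean(draws), variance = var(draws))
```

The simulated values should agree with the closed-form ones up to Monte Carlo error.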
