'a la Carte' on Text (ConText) Embedding Regression.
About
conText provides a fast, flexible and transparent framework to estimate context-specific word and short document embeddings using the 'a la carte' embeddings approach developed by Khodak et al. (2018) and evaluate hypotheses about covariate effects on embeddings using the regression framework developed by Rodriguez et al. (2021).
How to Install
install.packages("conText")
Datasets
To use conText you will need three objects:
- A (quanteda) corpus with the documents and corresponding document variables you want to evaluate.
- A set of (GloVe) pre-trained embeddings.
- A transformation matrix specific to the pre-trained embeddings.
conText includes sample objects for all three, but these are only meant to illustrate how the functions work. In this Dropbox folder we have included the raw versions of these objects, including the full Stanford GloVe 300-dimensional embeddings (labeled glove.rds) and the corresponding transformation matrix estimated by Khodak et al. (2018) (labeled khodakA.rds).
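To see how the pre-trained embeddings and the transformation matrix fit together, here is a minimal base R sketch of the 'a la carte' idea: average the pre-trained embeddings of the words surrounding a target term, then multiply that average by the transformation matrix. It does not use the conText API. The vocabulary, the random embeddings, the identity transformation matrix and all object names (`pre_trained`, `transform_mat`, `context_words`) are made up for illustration.

```r
# Toy illustration of the 'a la carte' computation (NOT the conText API).
# All objects below are hypothetical stand-ins for the real inputs.
set.seed(1)
dims  <- 4
vocab <- c("the", "court", "ruled", "on", "immigration", "policy")

# Stand-in for pre-trained (e.g. GloVe) embeddings: one row per word.
pre_trained <- matrix(
  rnorm(length(vocab) * dims),
  nrow = length(vocab),
  dimnames = list(vocab, NULL)
)

# Stand-in for the transformation matrix. An identity matrix is assumed here
# only to keep the example easy to check; a real one is estimated from data.
transform_mat <- diag(dims)

# Words observed around one instance of the target term "immigration".
context_words <- c("court", "ruled", "on", "policy")

# Step 1: average the pre-trained embeddings of the context words.
avg_context <- colMeans(pre_trained[context_words, , drop = FALSE])

# Step 2: apply the transformation matrix to the averaged context vector.
alc_embedding <- as.vector(transform_mat %*% avg_context)

print(alc_embedding)
```

With real data, `pre_trained` and `transform_mat` would be replaced by the GloVe embeddings and the transformation matrix described above, and the context words would come from the corpus.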
Quick Start Guides
Check out this Quick Start Guide to get going with conText.