Description

Data Sets for Psychometric Modeling.

Collection of data sets from various assessments that can be used to evaluate psychometric models. These data sets have been analyzed in the following papers that introduced new methodology as part of the application section: Jimenez, A., Balamuta, J. J., & Culpepper, S. A. (2023) <doi:10.1111/bmsp.12307>, Culpepper, S. A., & Balamuta, J. J. (2021) <doi:10.1080/00273171.2021.1985949>, Yinghan Chen et al. (2021) <doi:10.1007/s11336-021-09750-9>, Yinyin Chen et al. (2020) <doi:10.1007/s11336-019-09693-2>, Culpepper, S. A. (2019a) <doi:10.1007/s11336-019-09683-4>, Culpepper, S. A. (2019b) <doi:10.1007/s11336-018-9643-8>, Culpepper, S. A., & Chen, Y. (2019) <doi:10.3102/1076998618791306>, Culpepper, S. A., & Balamuta, J. J. (2017) <doi:10.1007/s11336-015-9484-7>, and Culpepper, S. A. (2015) <doi:10.3102/1076998615595403>.

README.md

edmdata

The goal of the edmdata R data package is to provide a set of assessment data sets for psychometric modeling.

Installation

The edmdata package is available on both CRAN and GitHub. The CRAN version is considered stable while the GitHub version is in a state of development and may break.

You can install the stable version of the edmdata package with:

install.packages("edmdata")

For the development version, you can install the edmdata package from GitHub with:

# install.packages("remotes")
remotes::install_github("tmsalab/edmdata")

Using data in the package

There are two ways to access the data contained within this package.

The first is to load the package itself and type the name of a data set. This approach takes advantage of R’s lazy loading mechanism, which avoids loading the data until it is used in an R session. For details on how lazy loading works, please see Section 1.17: Lazy Loading of the R Internals manual.

# Load the `edmdata` package
library("edmdata")

# See the first 6 observations (the default for `head()`) of the `items_revised_psvtr` dataset
head(items_revised_psvtr)

# View the help documentation for `items_revised_psvtr`
?items_revised_psvtr

The second approach is to use the data() command to load data on the fly without loading the package. After using data(), the data set will be available to use under the given name.

# Loading `items_revised_psvtr` without a `library(edmdata)` call
data("items_revised_psvtr", package = "edmdata")

# See the first 6 observations (the default for `head()`) of the `items_revised_psvtr` dataset
head(items_revised_psvtr)

# View the help documentation for `items_revised_psvtr`
?items_revised_psvtr
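
Both approaches assume the data set name is already known. As a minimal sketch, assuming the package is installed, base R's `data()` function can also list the data sets a package ships, and the `::` operator can fetch a single data set without attaching the package. The variable names `available` and `ecpe` are illustrative only and are not part of edmdata.

```r
# Only run if edmdata is installed; otherwise the block does nothing.
if (requireNamespace("edmdata", quietly = TRUE)) {
  # data(package = ...) returns a listing; the "Item" column holds the names.
  available <- data(package = "edmdata")$results[, "Item"]
  print(available)

  # Fetch one data set through the namespace, without a library() call.
  ecpe <- edmdata::items_ecpe
  print(dim(ecpe))
}
```

The `::` form is handy inside functions or scripts that should not attach the whole package to the search path.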

Data Sets Included

Examination for the Certificate of Proficiency in English (ECPE) (Templin & Bradshaw, 2014; Templin & Hoffman, 2013).
- items_ecpe: N = 2922 subject responses to J = 28 items.
- qmatrix_ecpe: J = 28 items and K = 3 traits.
- TMSA Papers: Culpepper & Chen (2019)
Fraction Addition and Subtraction (C. Tatsuoka, 2002; K. K. Tatsuoka, 1984).
- items_fractions: N = 536 subject responses to J = 20 items.
- qmatrix_fractions: J = 20 items and K = 8 traits.
- TMSA Papers: Yinghan Chen et al. (2021), Yinyin Chen et al. (2020), Culpepper (2019b), Culpepper & Chen (2019), Yinghan Chen et al. (2018)
Elementary Probability Theory (Heller & Wickelmaier, 2013).
- items_probability_part_one_full: N = 504 subject responses to J = 12 items.
- items_probability_part_one_reduced: N = 431 subject responses to J = 12 items.
- qmatrix_probability_part_one: J = 12 items and K = 4 traits.
- TMSA Papers: Yinghan Chen et al. (2021)
Revised PSVT:R (Culpepper & Balamuta, 2017; Yoon, 2011).
- items_revised_psvtr: N = 516 subject responses to J = 30 items.
- TMSA Papers: Culpepper & Balamuta (2017), Culpepper (2015)
Subset of Early Childhood Longitudinal Study, Kindergarten Class of 1998-1999’s Approaches to Learning (NCES, 2010).
- items_ordered_eclsk_atl: N = 13354 subject responses to J = 12 items.
- TMSA Papers: Culpepper (2019a)
Trends in International Mathematics and Science Study 2015 (TIMSS) Grade 8 Student Background Survey Item Responses (Mullis et al., 2016).
- items_ordered_timss15_background: N = 9672 subject responses to J = 16 items.
Calculus-based probability and statistics course homework problems (Jimenez et al., 2023).
- items_ordered_pswc_hw: N = 288 subject responses to J = 29 items.
Programme for International Student Assessment (PISA) 2012 U.S. Student Questionnaire Problem-Solving Vignettes (Culpepper & Balamuta, 2021).
- items_ordered_pisa12_us_vignette: N = 3075 subject responses to J = 12 items.
Programme for International Student Assessment (PISA) 2012 U.S. Math Assessment.
- items_pisa12_us_math: N = 4978 subject responses to J = 76 items.
Last Series of the Standard Progressive Matrices (SPM-LS) (Myszkowski & Storme, 2018; Raven, 1941; Robitzsch, 2020).
- items_spm_ls: N = 499 subject responses to J = 12 items.
Human Connectome Project’s Penn Progressive Matrices Fluid Intelligence Assessment.
- items_hcp_penn_matrix: N = 1201 subject responses to J = 24 items.
- items_hcp_penn_matrix_missing: N = 1201 subject responses with missing data indicators to J = 24 items.
Experimental Matrix Reasoning Test (OpenPsychometrics, 2012a).
- items_matrix_reasoning: N = 400 subject responses to J = 25 items.
- TMSA Papers: Yinyin Chen et al. (2020)
Taylor Manifest Anxiety Scale (OpenPsychometrics, 2012b; Taylor, 1953).
- items_taylor_manifest_anxiety_scale: N = 4468 subject responses to J = 50 items.
Narcissistic Personality Inventory (OpenPsychometrics, 2013; Raskin & Terry, 1988).
- items_narcissistic_personality_inventory: N = 11243 subject responses to J = 40 items.
Pre-generated identified Q matrices.
- qmatrix_oracle_k2_j12: 12 items and 2 traits.
- qmatrix_oracle_k3_j20: 20 items and 3 traits.
- qmatrix_oracle_k4_j20: 20 items and 4 traits.
- qmatrix_oracle_k5_j30: 30 items and 5 traits.
Pre-generated strategy sets.
- strategy_oracle_k3_j20_s2: 20 items, 3 traits, and 2 strategies.
- strategy_oracle_k3_j30_s2: 30 items, 3 traits, and 2 strategies.
- strategy_oracle_k3_j40_s2: 40 items, 3 traits, and 2 strategies.
- strategy_oracle_k3_j50_s2: 50 items, 3 traits, and 2 strategies.
- strategy_oracle_k4_j20_s2: 20 items, 4 traits, and 2 strategies.
- strategy_oracle_k4_j30_s2: 30 items, 4 traits, and 2 strategies.
- strategy_oracle_k4_j40_s2: 40 items, 4 traits, and 2 strategies.
- strategy_oracle_k4_j50_s2: 50 items, 4 traits, and 2 strategies.
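
In the listing above, an item response data set has N rows (subjects) and J columns (items), while its Q matrix has J rows (items) and K columns (traits). The following is a minimal sketch of a dimension check under that assumption. The helper `check_dims()` and the toy data are hypothetical and are not part of edmdata; the final step uses the ECPE data only if the package is installed.

```r
# Hypothetical helper: confirm that an item response matrix (N x J) and a
# Q matrix (J x K) describe the same number of items.
check_dims <- function(items, qmatrix) {
  list(
    n_subjects = nrow(items),
    n_items    = ncol(items),
    n_traits   = ncol(qmatrix),
    consistent = ncol(items) == nrow(qmatrix)
  )
}

# Toy example with made-up data: 5 subjects, 4 items, 2 traits.
set.seed(1)
toy_items   <- matrix(rbinom(5 * 4, size = 1, prob = 0.5), nrow = 5, ncol = 4)
toy_qmatrix <- matrix(c(1, 0,
                        0, 1,
                        1, 1,
                        1, 0), nrow = 4, ncol = 2, byrow = TRUE)
toy_check <- check_dims(toy_items, toy_qmatrix)
print(toy_check$consistent)

# The same check on the ECPE data, if edmdata is installed.
if (requireNamespace("edmdata", quietly = TRUE)) {
  print(check_dims(edmdata::items_ecpe, edmdata::qmatrix_ecpe))
}
```

Running a check like this before fitting a diagnostic model can catch a mismatched or transposed Q matrix early.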

Build Scripts

Want to see how each data set was imported? Check out the data-raw folder!

Authors

James Joseph Balamuta, Steven Andrew Culpepper, Jeffrey Douglas

Citing the `edmdata` package

To ensure future development of the package, please cite the edmdata package if it is used in an analysis or simulation study. Citation information for the package can be obtained in R with:

citation("edmdata")
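
For reference managers that expect BibTeX, the citation object can be converted with `toBibtex()` from the base `utils` package. This is a small sketch that assumes edmdata is installed; the variable name `pkg_citation` is illustrative only.

```r
# Only run if edmdata is installed; otherwise the block does nothing.
if (requireNamespace("edmdata", quietly = TRUE)) {
  # citation() returns a bibentry-based object describing the package.
  pkg_citation <- citation("edmdata")

  # Convert the citation to BibTeX and print it.
  print(toBibtex(pkg_citation))
}
```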

License

MIT

References

Chen, Yinghan, Culpepper, S. A., Chen, Y., & Douglas, J. (2018). Bayesian estimation of the DINA Q matrix. Psychometrika, 83(1), 89–108. https://doi.org/10.1007/s11336-017-9579-4

Chen, Yinyin, Culpepper, S. A., & Liang, F. (2020). A sparse latent class model for cognitive diagnosis. Psychometrika, 1–33. https://doi.org/10.1007/s11336-019-09693-2

Chen, Yinghan, Liu, Y., Culpepper, S. A., & Chen, Y. (2021). Inferring the number of attributes for the exploratory DINA model. Psychometrika, 86(1), 30–64. https://doi.org/10.1007/s11336-021-09750-9

Culpepper, S. A. (2014). If at first you don’t succeed, try, try again: Applications of sequential IRT models to cognitive assessments. Applied Psychological Measurement, 38(8), 632–644. https://doi.org/10.1177/0146621614536464

Culpepper, S. A. (2015). Bayesian estimation of the DINA model with Gibbs sampling. Journal of Educational and Behavioral Statistics, 40(5), 454–476. https://doi.org/10.3102/1076998615595403

Culpepper, S. A. (2019a). An exploratory diagnostic model for ordinal responses with binary attributes: Identifiability and estimation. Psychometrika, 84(4), 921–940. https://doi.org/10.1007/s11336-019-09683-4

Culpepper, S. A. (2019b). Estimating the cognitive diagnosis $Q$ matrix with expert knowledge: Application to the fraction-subtraction dataset. Psychometrika, 84(2), 333–357. https://doi.org/10.1007/s11336-018-9643-8

Culpepper, S. A., & Balamuta, J. J. (2017). A Hierarchical Model for Accuracy and Choice on Standardized Tests. Psychometrika, 82(3), 820–845. https://doi.org/10.1007/s11336-015-9484-7

Culpepper, S. A., & Balamuta, J. J. (2021). Inferring latent structure in polytomous data with a higher-order diagnostic model. Multivariate Behavioral Research, 1–19. https://doi.org/10.1080/00273171.2021.1985949

Culpepper, S. A., & Chen, Y. (2019). Development and application of an exploratory reduced reparameterized unified model. Journal of Educational and Behavioral Statistics, 44(1), 3–24. https://doi.org/10.3102/1076998618791306

Heller, J., & Wickelmaier, F. (2013). Minimum discrepancy estimation in probabilistic knowledge structures. Electronic Notes in Discrete Mathematics, 42, 49–56. https://doi.org/10.1016/j.endm.2013.05.145

Jimenez, A., Balamuta, J. J., & Culpepper, S. A. (2023). A sequential exploratory diagnostic model using a Pólya-gamma data augmentation strategy. British Journal of Mathematical and Statistical Psychology, 76(3), 513–538. https://doi.org/10.1111/bmsp.12307

Mullis, I. V. S., Martin, M. O., Goh, S., & Cotter, K. (Eds.). (2016). TIMSS 2015 encyclopedia: Education policy and curriculum in mathematics and science. Retrieved from Boston College, TIMSS & PIRLS International Study Center website: http://timssandpirls.bc.edu/timss2015/encyclopedia/

Myszkowski, N., & Storme, M. (2018). A snapshot of g? Binary and polytomous item-response theory investigations of the last series of the standard progressive matrices (SPM-LS). Intelligence, 68, 109–116. https://doi.org/10.1016/j.intell.2018.03.010

NCES. (2010). Early childhood longitudinal study, kindergarten class of 1998-99 (ECLS-k) kindergarten through fifth grade approaches to learning and self-description questionnaire (SDQ) items and public-use data files. https://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2010070

OpenPsychometrics. (2012a). Experimental matrix reasoning IQ test. https://openpsychometrics.org/_rawdata/IQ1.zip

OpenPsychometrics. (2012b). Taylor manifest anxiety scale. https://openpsychometrics.org/_rawdata/TMA.zip

OpenPsychometrics. (2013). Narcissistic personality inventory. https://openpsychometrics.org/_rawdata/NPI.zip

Raskin, R., & Terry, H. (1988). A principal-components analysis of the narcissistic personality inventory and further evidence of its construct validity. Journal of Personality and Social Psychology, 54(5), 890. https://doi.org/10.1037/0022-3514.54.5.890

Raven, J. C. (1941). Standardization of progressive matrices, 1938. British Journal of Medical Psychology, 19(1), 137–150. https://doi.org/10.1111/j.2044-8341.1941.tb00316.x

Robitzsch, A. (2020). Regularized latent class analysis for polytomous item responses: An application to SPM-LS data. Preprint. https://doi.org/10.20944/preprints202007.0269.v1

Tatsuoka, C. (2002). Data analytic methods for latent partially ordered classification models. Journal of the Royal Statistical Society: Series C (Applied Statistics), 51(3), 337–350. https://doi.org/10.1111/1467-9876.00272

Tatsuoka, K. K. (1984). Analysis of errors in fraction addition and subtraction problems. Final report. https://eric.ed.gov/?id=ED257665

Taylor, J. A. (1953). A personality scale of manifest anxiety. The Journal of Abnormal and Social Psychology, 48(2), 285. https://doi.org/10.1037/h0056264

Templin, J., & Bradshaw, L. (2014). Hierarchical diagnostic classification models: A family of models for estimating and testing attribute hierarchies. Psychometrika, 79(2), 317–339. https://doi.org/10.1007/s11336-013-9362-0

Templin, J., & Hoffman, L. (2013). Obtaining diagnostic classification model estimates using Mplus. Educational Measurement: Issues and Practice, 32(2), 37–50. https://doi.org/10.1111/emip.12010

Yoon, S. Y. (2011). Psychometric properties of the revised Purdue spatial visualization tests: Visualization of rotations (the Revised PSVT:R). Purdue University. https://docs.lib.purdue.edu/dissertations/AAI3480934/
