Description
Data Quality Checks and Statistical Assumption Testing for Agricultural Experiments.
Provides a comprehensive pipeline for data quality checks and statistical assumption diagnostics in agricultural experimental data. Functions cover outlier detection using the Interquartile Range (IQR) fence, Z-score, modified Z-score (Hampel identifier), Grubbs test and Dixon Q-test, with consensus flagging across methods; missing data pattern analysis and mechanism classification (Missing Completely At Random (MCAR), Missing At Random (MAR) and Missing Not At Random (MNAR)), with Little's test for MCAR; normality testing using Shapiro-Wilk, Anderson-Darling, Kolmogorov-Smirnov, Lilliefors, Pearson chi-square and Jarque-Bera tests; homogeneity of variance via Bartlett, Levene and Fligner-Killeen tests; independence of errors via Durbin-Watson, Breusch-Godfrey and Wald-Wolfowitz runs tests; experimental design validation for Completely Randomised Design (CRD), Randomised Complete Block Design (RCBD), Latin Square Design (LSD) and factorial designs; qualitative variable consistency checks; and automated HyperText Markup Language (HTML) report generation. Designed to align with the Findable, Accessible, Interoperable and Reusable (FAIR) data principles. Methods follow Gomez and Gomez (1984, ISBN:978-0471870920) and Montgomery (2017, ISBN:978-1119492443).
README.md
agriDQ
agriDQ provides a comprehensive pipeline for data quality checks and statistical assumption diagnostics in agricultural experimental data.
Installation
# From CRAN
install.packages("agriDQ")
Functions
| Category | Methods covered |
|---|---|
| Outlier detection | IQR fence, Z-score, modified Z-score (Hampel), Grubbs, Dixon Q-test |
| Missing data | Little's MCAR test, MAR/MNAR pattern analysis |
| Normality | Shapiro-Wilk, Anderson-Darling, Kolmogorov-Smirnov, Lilliefors, Pearson chi-square, Jarque-Bera |
| Homogeneity | Bartlett, Levene, Fligner-Killeen |
| Independence | Durbin-Watson, Breusch-Godfrey, Wald-Wolfowitz runs test |
| Design check | CRD, RCBD, LSD, factorial design validation |
| Qualitative | Consistency checks for categorical variables |
| Report | Automated HTML report generation |
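The consensus flagging idea in the outlier row can be illustrated with a minimal base R sketch. This is not agriDQ's implementation: `flag_outliers()` and its `min_votes` argument are hypothetical names, and the cut-offs (1.5 x IQR, |Z| > 3, |modified Z| > 3.5) are conventional defaults assumed here for illustration.

```r
# Hypothetical helper, NOT part of agriDQ: a value is a consensus outlier
# when at least `min_votes` of three rules flag it.
flag_outliers <- function(x, min_votes = 2) {
  # Rule 1: IQR fence (1.5 * IQR beyond the quartiles)
  q   <- unname(stats::quantile(x, c(0.25, 0.75), na.rm = TRUE))
  iqr <- q[2] - q[1]
  by_iqr <- x < q[1] - 1.5 * iqr | x > q[2] + 1.5 * iqr

  # Rule 2: classical Z-score, |z| > 3
  z    <- (x - mean(x, na.rm = TRUE)) / stats::sd(x, na.rm = TRUE)
  by_z <- abs(z) > 3

  # Rule 3: modified Z-score, 0.6745 * (x - median) / MAD, |value| > 3.5
  # (constant = 1 gives the raw MAD; a MAD of zero would need special handling)
  med     <- stats::median(x, na.rm = TRUE)
  mad_raw <- stats::mad(x, constant = 1, na.rm = TRUE)
  by_modz <- abs(0.6745 * (x - med) / mad_raw) > 3.5

  votes <- by_iqr + by_z + by_modz
  data.frame(value = x, iqr = by_iqr, z = by_z, mod_z = by_modz,
             votes = votes, consensus = votes >= min_votes)
}

# Made-up plot yields with one suspicious value in the last position
yield <- c(4.1, 4.3, 3.9, 4.0, 4.2, 4.4, 3.8, 4.1, 4.0, 9.5)
res <- flag_outliers(yield)
res[res$consensus, ]
```

In this made-up sample the last value is caught by the IQR fence and the modified Z-score but not by the classical Z-score, because the outlier itself inflates the standard deviation. Requiring agreement between rules rather than relying on one rule is the motivation for a consensus vote.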
Quick start
library(agriDQ)
# Load example data
data(agri_trial)
# Full pipeline
result <- run_agriDQ_pipeline(agri_trial,
                              response  = "yield",
                              treatment = "treatment",
                              block     = "block")
print(result)
# HTML report
generate_agriDQ_report(result, output_file = "report.html")
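For orientation, a few of the checks listed above can be reproduced by hand with base R alone. The sketch below does not show what `run_agriDQ_pipeline()` does internally: the `trial` data frame is simulated and hypothetical, it only borrows the column names used in the quick start, and only tests that ship with the `stats` package are used.

```r
set.seed(1)

# Hypothetical RCBD layout: 4 treatments x 5 blocks, one plot per cell
trial <- expand.grid(treatment = factor(paste0("T", 1:4)),
                     block     = factor(paste0("B", 1:5)))
trial$yield <- 4 + 0.3 * as.integer(trial$treatment) +
  rnorm(nrow(trial), sd = 0.4)

# Design check: in an RCBD every treatment occurs exactly once per block
layout  <- table(trial$treatment, trial$block)
is_rcbd <- all(layout == 1)

# Fit the RCBD model and extract residuals for the assumption tests
fit <- aov(yield ~ treatment + block, data = trial)
res <- residuals(fit)

normality   <- shapiro.test(res)                    # Shapiro-Wilk
homogeneity <- bartlett.test(res, trial$treatment)  # Bartlett
fligner     <- fligner.test(res, trial$treatment)   # Fligner-Killeen

# Collect the design flag and the p-values in one named vector
c(rcbd        = is_rcbd,
  shapiro_p   = normality$p.value,
  bartlett_p  = homogeneity$p.value,
  fligner_p   = fligner$p.value)
```

A small p-value from any of these tests suggests that the corresponding ANOVA assumption may not hold for the data at hand; the threshold to act on is a choice for the analyst.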
References
Gomez, K.A. and Gomez, A.A. (1984). Statistical Procedures for Agricultural Research, 2nd ed. Wiley. ISBN: 978-0471870920.
Montgomery, D.C. (2017). Design and Analysis of Experiments, 9th ed. Wiley. ISBN: 978-1119492443.
License
GPL (>= 3)