Description
Implementation of Frequent-Directions Algorithm for Efficient Matrix Sketching.
Implements the frequent-directions algorithm for efficient matrix sketching (Edo Liberty (2013) <doi:10.1145/2487575.2487623>).
README.md
frequentdirections
Implementation of the Frequent-Directions algorithm for efficient matrix sketching [E. Liberty, SIGKDD 2013]
Installation
# Not yet on CRAN
install.packages("frequentdirections")
# Or the development version from GitHub:
install.packages("devtools")
devtools::install_github("shinichi-takayanagi/frequentdirections")
Example
Download example data
Here, we use the USPS handwritten digits dataset as sample data. The following example assumes that you have saved the dataset in the /tmp directory.
Load data
The dataset contains 7291 training images and 2007 test images in h5 format. Each image is 16 x 16 grayscale pixels.
library("h5")
file <- h5file("/tmp/usps.h5")
x <- file["train/data"][]
y <- file["train/target"][]
str(x)
#> num [1:7291, 1:256] 0 0 0 0 0 0 0 0 0 0 ...
Plot example image
For example, the digit 8:
image(matrix(x[338,], nrow=16, byrow = FALSE))
Plot SVD
Plot the original data on the plane spanned by the first and second singular vectors.
x <- scale(x)
frequentdirections::plot_svd(x, y)
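The internals of plot_svd are not shown in this README, so the following is only a base-R sketch of what projecting onto the first two singular vectors can look like. The helper name project_top2 and the random demo data are hypothetical and are not part of the package. The optional basis argument is an assumption about how a sketch matrix could supply the singular vectors in place of the full data.

```r
# Hypothetical helper (not part of frequentdirections): project the rows of x
# onto the top two right singular vectors of `basis`.
project_top2 <- function(x, basis = x) {
  # nu = 0 skips the left singular vectors; nv = 2 keeps two right ones
  v <- svd(basis, nu = 0, nv = 2)$v
  x %*% v
}

# Small random stand-in for the USPS data, so the block runs on its own
set.seed(1)
x_demo <- scale(matrix(rnorm(100 * 10), nrow = 100))
y_demo <- sample(0:9, 100, replace = TRUE)

coords <- project_top2(x_demo)

# Draw the scatter plot only in an interactive session
if (interactive()) {
  plot(coords, col = y_demo + 1, pch = 19,
       xlab = "1st singular direction", ylab = "2nd singular direction")
}
```

Passing a much smaller matrix as basis, for example a sketch of x, would project the same rows onto the sketch's singular vectors instead, which is one way to compare a sketch against the full data.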
Matrix Sketching
l = 8 case
eps <- 10^(-8)
# 7291 x 256 -> 8 x 256 matrix
b <- frequentdirections::sketching(x, 8, eps)
frequentdirections::plot_svd(x, y, b)
l = 32 case
# 7291 x 256 -> 32 x 256 matrix
b <- frequentdirections::sketching(x, 32, eps)
frequentdirections::plot_svd(x, y, b)
l = 128 case
# 7291 x 256 -> 128 x 256 matrix
b <- frequentdirections::sketching(x, 128, eps)
frequentdirections::plot_svd(x, y, b)
This result is almost the same as the SVD projection of the original data.
In other words, the original 7291-row data is well approximated by a sketch with only 128
rows.
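To make the idea behind the sketch concrete, here is a minimal base-R version of the frequent-directions update as described in Liberty (2013). It is an illustrative sketch, not the package's implementation: the function name fd_sketch is hypothetical, it does not model the eps argument of sketching(), and it assumes the sketch size l is at most the number of columns.

```r
# Hypothetical, simplified frequent-directions sketch (after Liberty, 2013).
# a: n x m numeric matrix; l: number of sketch rows, assumed 2 <= l <= m.
fd_sketch <- function(a, l) {
  stopifnot(is.matrix(a), l >= 2, l <= ncol(a))
  b <- matrix(0, nrow = l, ncol = ncol(a))
  half <- ceiling(l / 2)
  next_zero <- 1                      # index of the first all-zero row of b
  for (i in seq_len(nrow(a))) {
    b[next_zero, ] <- a[i, ]          # insert the incoming row
    next_zero <- next_zero + 1
    if (next_zero > l) {              # no zero rows left: shrink
      s <- svd(b, nu = 0)
      delta <- s$d[half]^2
      shrunk <- sqrt(pmax(s$d^2 - delta, 0))
      b <- shrunk * t(s$v)            # same as diag(shrunk) %*% t(s$v)
      next_zero <- half               # rows half..l are now exactly zero
    }
  }
  b
}

# Small random demo, so the block does not depend on the USPS file
set.seed(42)
a_demo <- matrix(rnorm(200 * 20), nrow = 200)
b_demo <- fd_sketch(a_demo, 8)

# Spectral-norm error of the covariance approximation, and the bound
# 2 * ||A||_F^2 / l stated in the paper
err <- norm(crossprod(a_demo) - crossprod(b_demo), type = "2")
bound <- 2 * sum(a_demo^2) / 8
```

Comparing err with bound for several values of l gives a numeric counterpart to the plots above: a larger l tightens the guarantee, at the cost of a larger sketch.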