All Things Data and Springsteen.
spRingsteen
The spRingsteen package provides a number of dataframes describing the songs, albums, tours, and setlists of Bruce Springsteen's career. The data (collected from Brucebase) is provided in a tidy form which is easily analyzed in R. The scripts used to scrape the data in their entirety, alongside a SQLite representation of the data, may be viewed at a second repository, springsteen_db.
Installation
You can install the released version of spRingsteen from CRAN with:
install.packages("spRingsteen")
Alternatively, you can install the development version of spRingsteen from GitHub like so:
remotes::install_github("obrienjoey/spRingsteen")
Data refresh
While the spRingsteen CRAN version is updated every few months, the GitHub (development) version is updated daily. The update_data function bridges this gap by refreshing the installed package with the most recent data available in the GitHub version:
library(spRingsteen)
update_data()
Note: you must restart the R session for the updated data to become available.
Usage
Concerts
The package includes datasets covering the career of Bruce Springsteen. For example, the touring history of Springsteen and his numerous bands is stored in concerts:
library(spRingsteen)
library(dplyr)
concerts
#> # A tibble: 2,930 x 6
#> gig_key date location state city country
#> <chr> <date> <chr> <chr> <chr> <chr>
#> 1 /gig:1973-01-03-main-point-bryn-mawr-pa-early 1973-01-03 THE MAIN POINT~ PA <NA> USA
#> 2 /gig:1973-01-03-main-point-bryn-mawr-pa-late 1973-01-03 THE MAIN POINT~ PA <NA> USA
#> 3 /gig:1973-01-04-main-point-bryn-mawr-pa-early 1973-01-04 THE MAIN POINT~ PA <NA> USA
#> 4 /gig:1973-01-04-main-point-bryn-mawr-pa-late 1973-01-04 THE MAIN POINT~ PA <NA> USA
#> 5 /gig:1973-01-05-main-point-bryn-mawr-pa-early 1973-01-05 THE MAIN POINT~ PA <NA> USA
#> 6 /gig:1973-01-05-main-point-bryn-mawr-pa-late 1973-01-05 THE MAIN POINT~ PA <NA> USA
#> 7 /gig:1973-01-06-main-point-bryn-mawr-pa-early 1973-01-06 THE MAIN POINT~ PA <NA> USA
#> 8 /gig:1973-01-06-main-point-bryn-mawr-pa-late 1973-01-06 THE MAIN POINT~ PA <NA> USA
#> 9 /gig:1973-01-08-paul-s-mall-boston-ma-early 1973-01-08 PAUL'S MALL, B~ MA <NA> USA
#> 10 /gig:1973-01-08-paul-s-mall-boston-ma-late 1973-01-08 PAUL'S MALL, B~ MA <NA> USA
#> # ... with 2,920 more rows
# how many concerts have occurred in each country?
concerts %>%
  count(country, sort = TRUE)
#> # A tibble: 39 x 2
#> country n
#> <chr> <int>
#> 1 USA 2261
#> 2 Canada 96
#> 3 England 88
#> 4 Australia 56
#> 5 Germany 52
#> 6 Spain 51
#> 7 Italy 50
#> 8 France 43
#> 9 Sweden 37
#> 10 Ireland 26
#> # ... with 29 more rows
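The same pattern extends to other groupings. As a minimal sketch, assuming only the date column shown in the output above, the following counts concerts per calendar year. The names concerts_per_year and n_concerts are illustrative and not part of the package.

```r
library(spRingsteen)
library(dplyr)

# Derive the calendar year from the `date` column and count concerts per year.
# `concerts_per_year` and `n_concerts` are illustrative names, not package objects.
concerts_per_year <- concerts %>%
  mutate(year = as.integer(format(date, "%Y"))) %>%
  count(year, name = "n_concerts") %>%
  arrange(desc(n_concerts))

concerts_per_year
```

Base R's format() is used here rather than lubridate to avoid an extra dependency; lubridate::year(date) would work equally well.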
Setlists
The package also contains the setlists performed at these shows, which are stored in setlists.
setlists
#> # A tibble: 52,100 x 4
#> gig_key song_key song song_number
#> <chr> <chr> <chr> <int>
#> 1 /gig:1973-01-03-main-point-bryn-mawr-pa-early /song:it-s-hard-to-be-a-sai~ It's~ 1
#> 2 /gig:1973-01-03-main-point-bryn-mawr-pa-early /song:santa-ana Sant~ 2
#> 3 /gig:1973-01-03-main-point-bryn-mawr-pa-early /song:secret-to-the-blues Secr~ 3
#> 4 /gig:1973-01-03-main-point-bryn-mawr-pa-early /song:new-york-song New ~ 4
#> 5 /gig:1973-01-08-paul-s-mall-boston-ma-early /song:growin-up Grow~ 1
#> 6 /gig:1973-01-09-wbcn-studio-boston-ma /song:satin-doll Sati~ 1
#> 7 /gig:1973-01-09-wbcn-studio-boston-ma /song:bishop-danced Bish~ 2
#> 8 /gig:1973-01-09-wbcn-studio-boston-ma /song:wild-billy-s-circus-s~ Circ~ 3
#> 9 /gig:1973-01-09-wbcn-studio-boston-ma /song:song-for-orphans Song~ 4
#> 10 /gig:1973-01-09-wbcn-studio-boston-ma /song:does-this-bus-stop-at~ Does~ 5
#> # ... with 52,090 more rows
# what song has been played most by Springsteen?
setlists %>%
  count(song, sort = TRUE)
#> # A tibble: 994 x 2
#> song n
#> <chr> <int>
#> 1 Born To Run 1710
#> 2 Thunder Road 1440
#> 3 The Promised Land 1387
#> 4 Badlands 1195
#> 5 Tenth Avenue Freeze-Out 1107
#> 6 Dancing In The Dark 1050
#> 7 Born In The U.s.a. 1011
#> 8 The Rising 881
#> 9 Rosalita (Come Out Tonight) 812
#> 10 Hungry Heart 737
#> # ... with 984 more rows
# which song has most frequently opened a show?
setlists %>%
  filter(song_number == 1) %>%
  count(song, sort = TRUE) %>%
  slice(1)
#> # A tibble: 1 x 2
#> song n
#> <chr> <int>
#> 1 Growin' Up 272
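The opening-song query can be mirrored for show closers. The sketch below assumes that the highest song_number within a gig_key marks the last song performed, which the data shown here suggests but does not guarantee. It relies on slice_max(), available in dplyr 1.0.0 or later; the name closers is illustrative.

```r
library(spRingsteen)
library(dplyr)

# For each show keep the row with the highest song_number (assumed to be the
# closing song), then count how often each song fills that slot.
closers <- setlists %>%
  group_by(gig_key) %>%
  slice_max(song_number, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  count(song, sort = TRUE)

closers
```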
Songs
Further details of the songs themselves are available in songs, including the album on which each song appears and, in some cases, the full lyrics. This allows for text mining or sentiment analysis using a package like tidytext.
library(tidytext)
# what word appears most frequently in the Born In The U.S.A. album?
songs %>%
  filter(album == "Born In The U.S.A.") %>%
  select(title, lyrics) %>%
  unnest_tokens(word, lyrics) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words, by = 'word')
#> # A tibble: 513 x 2
#> word n
#> <chr> <int>
#> 1 la 158
#> 2 yeah 47
#> 3 alright 41
#> 4 sha 40
#> 5 glory 37
#> 6 days 35
#> 7 u.s.a 32
#> 8 born 30
#> 9 hoo 27
#> 10 baby 26
#> # ... with 503 more rows
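For the sentiment analysis mentioned above, one possible approach is to join the tokenized lyrics against a sentiment lexicon. The sketch below uses the Bing lexicon, which ships with tidytext and is returned by get_sentiments("bing"). It assumes the album and lyrics columns used in the previous example, and that songs without lyrics are stored as NA. The name album_sentiment is illustrative.

```r
library(spRingsteen)
library(dplyr)
library(tidytext)

# Tokenize the available lyrics, keep only words found in the Bing lexicon,
# and count positive and negative words per album.
# A few words may appear under both labels in the lexicon, so newer dplyr
# versions may warn about a many-to-many join.
album_sentiment <- songs %>%
  filter(!is.na(lyrics)) %>%
  unnest_tokens(word, lyrics) %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  count(album, sentiment, sort = TRUE)

album_sentiment
```

Raw word counts favour albums with more songs and longer lyrics, so comparing the share of positive words per album would be a fairer follow-up.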
Tours
Lastly, the tours table contains the tours associated with each concert.
tours %>%
  count(tour, sort = TRUE)
#> # A tibble: 24 x 2
#> tour n
#> <chr> <int>
#> 1 Non-tour Shows 575
#> 2 Springsteen On Broadway 268
#> 3 The River Tour 213
#> 4 The Wild, The Innocent & The E Street Shuffle Tour 197
#> 5 Born In The U.S.A. Tour 156
#> 6 Greetings From Asbury Park Tour 147
#> 7 Wrecking Ball Tour 134
#> 8 The Reunion Tour 132
#> 9 The Ghost Of Tom Joad Tour 128
#> 10 The Rising Tour 120
#> # ... with 14 more rows
Of course, the real advantage of this package lies in combining the different dataframes to infer useful information:
# what was the most played song on each tour?
setlists %>%
  left_join(tours, by = 'gig_key') %>%
  count(song, tour) %>%
  group_by(tour) %>%
  filter(n == max(n)) %>%
  arrange(desc(tour))
#> # A tibble: 95 x 3
#> # Groups: tour [25]
#> song tour n
#> <chr> <chr> <int>
#> 1 Death To My Hometown Wrecking Ball Tour 134
#> 2 Leap Of Faith World Tour 1992-93 103
#> 3 American Land Working On A Dream Tour 83
#> 4 Born To Run Working On A Dream Tour 83
#> 5 The Promised Land Vote For Change 22
#> 6 Adam Raised A Cain Tunnel Of Love Express Tour 67
#> 7 All That Heaven Will Allow Tunnel Of Love Express Tour 67
#> 8 Born In The U.s.a. Tunnel Of Love Express Tour 67
#> 9 Born To Run Tunnel Of Love Express Tour 67
#> 10 Brilliant Disguise Tunnel Of Love Express Tour 67
#> # ... with 85 more rows
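The shared gig_key column also links setlists to concerts. As a further sketch, the following estimates the average number of songs per show in each year. It assumes that gig_key identifies exactly one row in concerts, and that shows without a recorded setlist can be dropped, which is why an inner join is used. The names songs_per_show, n_songs, shows, and mean_songs are illustrative.

```r
library(spRingsteen)
library(dplyr)

# Count the songs in each setlist, attach the concert date via gig_key,
# and average the setlist length per calendar year.
songs_per_show <- setlists %>%
  count(gig_key, name = "n_songs") %>%
  inner_join(concerts, by = "gig_key") %>%
  mutate(year = as.integer(format(date, "%Y"))) %>%
  group_by(year) %>%
  summarise(shows = n(), mean_songs = mean(n_songs), .groups = "drop") %>%
  arrange(year)

songs_per_show
```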