First, we load the required packages. Note that we use devtools to download a developer version of gtrendsR.
if (!require('gtrendsR'))
devtools::install_github('PMassicotte/gtrendsR')
## Loading required package: gtrendsR
if(!require("pacman")) install.packages("pacman")
## Loading required package: pacman
pacman::p_load(gtrendsR,maps,ggplot2,lettercase,viridis,pals,scico,ggrepel)
Let’s first look at how the impact of hurricanes Katrina (August 2005) and Harvey (August 2017) are reflected in how Americans have used these names as google search items over time. We’ll also add change some of the plot settings by creating our own ggplot theme.
The gprop argument controls whether we want general web, news, image or youtube searches.
The time argument is set to “all” and will gather data between 2004 and the time the code is run.
Focusing on searches made in the us, we’ll set the geo argument to “US”.
NOTE: adding a line for Hurricane Irma used to work just fine, but it currently seems to mess up the output and only show straight lines. Using the plot function returns a ggplot object, which we can then customize. However, I was unable to suppress the default plot.
my_theme <- function() {
theme_bw() +
theme(panel.background = element_blank(),
plot.background = element_rect(fill = "seashell"),
panel.border = element_blank(), # facet border
strip.background = element_blank(), # facet title background
plot.margin = unit(c(.5, .5, .5, .5), "cm"),
panel.spacing = unit(3, "lines"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.background = element_blank(),
legend.key = element_blank(),
legend.title = element_blank())
}
hurricanes <- gtrends(c("katrina","harvey"),
time ="all",
gprop = "web",
geo = c("US"))
plot(hurricanes) +
my_theme() +
geom_line(size=1.5) +
scale_colour_viridis_d(option = "C", begin = .2, end = .7)
To understand what is actually measured on the y-axis, have a look here:
https://support.google.com/trends/answer/4365533?hl=en
The above plot shows clear spikes around the time when Katrina and Harvey hit the US. We could also plot more cyclical data.
cycles <- gtrends(c("spring break","vacation"),
time = "2008-01-01 2018-01-01",
gprop = "web",
geo = c("US"))
plot(cycles) +
my_theme() +
geom_line(size=1.5) +
scale_colour_viridis_d(option = "C", begin = .2, end = .7)
As you can see in the output below, the gtrends function actually returns a list of data frames with various kinds of data.
hurricanes <- gtrends(c("Katrina","Harvey","Irma"),
time = "all",
gprop = "web",
geo = c("US"))
for(df in hurricanes){
print(head(df))
}
## date hits keyword geo gprop category
## 1 2004-01-01 <1 Katrina US web 0
## 2 2004-02-01 <1 Katrina US web 0
## 3 2004-03-01 <1 Katrina US web 0
## 4 2004-04-01 1 Katrina US web 0
## 5 2004-05-01 <1 Katrina US web 0
## 6 2004-06-01 <1 Katrina US web 0
## NULL
## location hits keyword geo gprop
## 1 Louisiana 100 Katrina US web
## 2 Mississippi 72 Katrina US web
## 3 Virginia 48 Katrina US web
## 4 Alabama 35 Katrina US web
## 5 District of Columbia 35 Katrina US web
## 6 Texas 31 Katrina US web
## location hits keyword geo gprop
## 1 Roanoke-Lynchburg VA 100 Katrina US web
## 2 Biloxi-Gulfport MS 92 Katrina US web
## 3 New Orleans LA 84 Katrina US web
## 4 Baton Rouge LA 55 Katrina US web
## 5 Hattiesburg-Laurel MS 48 Katrina US web
## 6 Jackson MS 43 Katrina US web
## location hits keyword geo gprop
## 1 Brookneal 100 Katrina US web
## 2 New Orleans 53 Katrina US web
## 3 Leesburg 40 Katrina US web
## 4 Baton Rouge 35 Katrina US web
## 5 Dearing NA Katrina US web
## 6 Santa Clara 30 Katrina US web
## NULL
## subject related_queries value geo keyword category
## 1 100 top katrina hurricane US Katrina 0
## 2 99 top hurricane US Katrina 0
## 3 41 top katrina kaif US Katrina 0
## 4 15 top new orleans US Katrina 0
## 5 15 top new orleans katrina US Katrina 0
## 6 12 top katrina bowden US Katrina 0
Now, let’s compare the amount of interest in Hurricane Harvey for each US state.
harvey <- gtrends(c("Harvey"),
gprop = "web",
time = "2017-08-18 2017-08-25",
geo = c("US"))
harvey <- harvey$interest_by_region
harvey$region <- sapply(harvey$location,tolower) #change to lower case for merging with map data
statesMap <- map_data("state")
harveyMerged <- merge(statesMap,harvey,by="region")
On the fourth line in the code above, we fetch an empty map for plotting our geographical data. On the fifth line, we merge our google trends data with the underlying map data. Both data sets contain a column called “region”, which will be used to merge the data frames. Note that the region labels need to be identical. therefore, on line 3, we change the capitalized state names to lowercase. Now we can plot the data! But first we’ll modify our ggplot theme for plotting maps.
my_theme2 <- function() {
my_theme() +
theme(axis.title = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank())
}
ggplot() +
geom_polygon(data=harveyMerged,aes(x=long,y=lat,group=group,fill=log(hits)),colour="white") +
scale_fill_gradientn(colours = rev(scico(15, palette = "tokyo")[2:7])) +
my_theme2() +
ggtitle("Google search interest for Hurricane Harvey\nin each state from the week prior to landfall in the US")
Note that we have plotted the log-transformed hits variable. We then use the same procedure for plotting regional searches for Hurricane Irma, except we add a label for each state, change the colours and add the coord_fixed argument for a nicer looking map. Placing the state names is a bit tricky. I’ve used a simple solution that centers the names. However, some of the smaller eastern state names would overlap when using geom_text(). Luckily, geom_text_repel() from the ggrepel package takes care of this issue.
irma <- gtrends(c("irma"),
gprop = "web",
time = "2017-09-03 2017-09-10",
geo = c("US"))
irma <- irma$interest_by_region
statesMap <- map_data("state")
irma$region <- sapply(irma$location,tolower)
irmaMerged <- merge(statesMap ,irma,by="region")
regionLabels <- aggregate(cbind(long, lat) ~ region, data=irmaMerged,
FUN=function(x) mean(range(x)))
irmaMerged %>%
ggplot() +
geom_polygon(aes(x=long,y=lat,group=group,fill=log(hits)),colour="white") +
my_theme2() +
geom_text_repel(data=regionLabels, aes(long, lat, label = str_title_case(region)), size=3) +
coord_fixed(1.3) +
ggtitle("Google search interest for Hurricane Irma\nin each state from the week prior to landfall in the US") +
scale_fill_gradientn(colours = ocean.tempo(15)[2:10])
Lastly, we,’ll plot searches for Hurricane Katrina by state, but focusing on searches between its formation and dissipation.
katrina <- gtrends(c("katrina"),
gprop = "web",
time="2005-08-23 2005-08-31",
geo = c("US"))
katrina <- katrina$interest_by_region
statesMap <- map_data("state")
katrina$region <- sapply(katrina$location,tolower)
katrinaMerged <- merge(statesMap ,katrina,by="region")
regionLabels <- aggregate(cbind(long, lat) ~ region, data=katrinaMerged,
FUN=function(x) mean(range(x)))
katrinaMerged %>% ggplot() +
geom_polygon(aes(x=long,y=lat,group=group,fill=log(hits)),colour="white") +
scale_fill_continuous(low="ivory",high="midnightblue") +
guides(fill = "colorbar") +
geom_text_repel(data=regionLabels, aes(long, lat, label = str_title_case(region)), size=3) +
my_theme2() +
coord_fixed(1.3) +
ggtitle("Google search interest for Hurricane Katrina\nin each state between formation and dissipation")
BONUS: In which US states do people search for “guns” the most?
guns <- gtrends(c("guns"), gprop = "web", time="all", geo = c("US"))
guns <- guns$interest_by_region
statesMap <- map_data("state")
guns$region <- sapply(guns$location,tolower)
gunsMerged <- merge(statesMap,guns,by="region")
regionLabels <- aggregate(cbind(long, lat) ~ region, data=gunsMerged,
FUN=function(x) mean(range(x)))
gunsMerged %>% ggplot() +
geom_polygon(aes(x=long,y=lat,group=group,fill=log(hits)),colour="white") +
geom_text_repel(data=regionLabels, aes(long, lat, label = str_title_case(region)), size=3) +
my_theme2() +
coord_fixed(1.3) +
scale_fill_distiller(palette = "Reds") +
ggtitle("Google search interest for guns in each state")