#LOAD PACKAGES
library(tidyverse)
#install.packages("sf") - note some students are getting a pop-up when they install the sf package for the first time. Select the "no" option when it pops up in your console. 
library(sf)
#some students also need to install the rgeos package separately
#library(rgeos)
Maps with maps and sf
Before we get started, some context:
- R is fantastic for spatial analysis (not covered in this class… look for classes related to spatial statistics)
- R is great for interactive data visualization (via leaflet or shiny… more on this on Thursday)
- R is okay at spatial data visualization (creating maps).
  - There are many different packages in R for creating maps. I've found that different packages perform best for different maps. We will talk about a few different ones today.
- If you have a highly map-centric project, there is nothing wrong with working in ArcGIS or QGIS if you find the mapping tools in R insufficient. There are many recent improvements with new packages (like sp, rgdal and rgeos) which provide much of the functionality of GIS packages! Exciting! (not very beginner friendly - requires familiarity with GIS concepts)
 
Using the sf package
Vector data for maps are typically encoded using the "simple features" standard produced by the Open Geospatial Consortium. The sf package developed by Edzer Pebesma provides an excellent toolset for working with such data, and the geom_sf() and coord_sf() functions in ggplot2 are designed to work together with the sf package.
For our first example, we will be working with a dataset of North Carolina that is built in to the sf package.
demo(nc, ask = FALSE, echo = FALSE)
You should notice that the nc dataset is now saved in your R environment. This dataset contains information about Sudden Infant Death Syndrome (SIDS) for North Carolina counties, over two time periods (1974-78 and 1979-84). Let's take a look at that dataset.
Each row represents a county in North Carolina. This data frame contains the following columns:
- AREA: County polygon areas in degree units
- PERIMETER: County polygon perimeters in degree units
- CNTY_: Internal county ID
- NAME: County names
- FIPS: County ID
- FIPSNO: County ID
- CRESS_ID: Cressie papers ID
- BIR74: births, 1974-78
- SID74: SID deaths, 1974-78
- NWBIR74: non-white births, 1974-78
- BIR79: births, 1979-84
- SID79: SID deaths, 1979-84
- NWBIR79: non-white births, 1979-84
- geom: information needed to plot the map for each county
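To actually take that look, here is a minimal sketch of a few ways to inspect the data. It assumes the nc.gpkg file bundled with sf, read directly so that the block runs on its own; `nc_table` is a name made up for this example.

```r
library(sf)
library(dplyr)

# Read the North Carolina SIDS data that ships with the sf package
nc <- st_read(system.file("gpkg/nc.gpkg", package = "sf"), quiet = TRUE)

class(nc)    # an sf object is also a data frame
glimpse(nc)  # column names, types, and the first few values
st_crs(nc)   # coordinate reference system of the county polygons

# Drop the geometry column to work with the attributes as a plain table
nc_table <- st_drop_geometry(nc)
head(nc_table)
```

Dropping the geometry is handy when you only want to summarize the attribute columns and do not need the map.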
Let's begin by simply plotting the map using geom_sf(). Note that you don't need to specify the x- or y-axes; sf figures that out for you.
nc %>%
  ggplot() +
  geom_sf()
Let's pretty it up:
nc %>%
  ggplot() +
  geom_sf(col="black", fill="darkgrey") +
  theme_light() +
  ggtitle("North Carolina Counties")
Choropleth maps
A choropleth map is a type of thematic map where areas (such as countries, states, or regions) are shaded or colored based on data values. It's commonly used to visualize statistical information, such as population density, election results, or income levels, by using different shades or colors to represent varying data ranges.

Suppose we want to shade each of these counties based on the number of births in 1974-78.
map <- nc %>%
  ggplot() +
  geom_sf(aes(fill = BIR74), col = "black") +
  theme_light() +
  ggtitle("North Carolina, Births in 1974-78")
map
Color Palettes
Qualitative Color Palettes
| Best for… | Categories (unordered) |
| Examples | Species, Groups, Brands |
| RColorBrewer Palettes | "Set1", "Dark2", "Paired" |
| Example R Code | scale_fill_brewer(palette = "Set1") |
| wesanderson Palettes | "GrandBudapest1", "Darjeeling1", "Moonrise2" |
| Example R Code | scale_fill_manual(values = wes_palette("GrandBudapest1")) |


Sequential Color Palettes
| Best for… | Ordered, continuous data |
| Examples | Temperature, Population Density |
| RColorBrewer Palettes | "Blues", "Reds", "Greens" |
| Example R Code | scale_fill_distiller(palette = "Blues", direction = 1) for continuous values; scale_fill_brewer(palette = "Blues") for binned values |
| viridis Palettes | "viridis", "magma", "plasma", "cividis" |
| Example R Code | scale_fill_viridis_c(option = "magma") |
| Build your Own | scale_fill_gradientn(colours = c("yellow", "red")) |


Note: Be sure that higher values are encoded with the darkest colors!
Diverging Color Palettes
| Best for… | Data with a central midpoint |
| Examples | Election Results, Anomaly Detection |
| RColorBrewer Palettes | "RdBu", "Spectral" |
| Example R Code | scale_fill_brewer(palette = "RdBu") | 
| Build your Own | scale_fill_gradient2(low = "red", mid = "white", high = "blue") |
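A diverging palette only makes sense when the data have a meaningful midpoint. As a rough sketch, the change in births between the two periods has a natural midpoint at zero. `BIR_CHANGE`, `nc_change`, and `change_map` are names invented for this example, and the comparison is only illustrative because the two periods are not the same length.

```r
library(sf)
library(dplyr)
library(ggplot2)

nc <- st_read(system.file("gpkg/nc.gpkg", package = "sf"), quiet = TRUE)

# Hypothetical derived column: change in births between the two periods.
# Note: 1974-78 covers five years and 1979-84 covers six, so this is
# an illustration of the palette, not a careful comparison.
nc_change <- nc %>%
  mutate(BIR_CHANGE = BIR79 - BIR74)

change_map <- ggplot(nc_change) +
  geom_sf(aes(fill = BIR_CHANGE), col = "black") +
  # midpoint = 0 pins the neutral colour to "no change"
  scale_fill_gradient2(low = "red", mid = "white", high = "blue", midpoint = 0) +
  theme_light() +
  ggtitle("North Carolina, Change in Births, 1974-78 to 1979-84")
change_map
```

Counties with fewer births in the second period shade toward red, and counties with more births shade toward blue.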


✅ Match palette type to data type
✅ Choose colorblind-friendly palettes when designing for general audiences
✅ Limit colors to avoid overwhelming the reader - for categorical data, limit the number of distinct colors to 5-8 max (beyond that, consider grouping)
✅ Consider the meaning of colors in your audience's cultural context.
✅ If the data is skewed, consider using the scales package to log-scale.
🔴 Avoid: Using blue for land in maps
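For the skewed-data tip, one possible approach is to log-transform the fill scale itself rather than the data. This sketch assumes the births column from the nc data. The `trans` argument is called `transform` in newer ggplot2 releases, so adjust it if you see a deprecation message. `log_map` is a made-up name.

```r
library(sf)
library(ggplot2)

nc <- st_read(system.file("gpkg/nc.gpkg", package = "sf"), quiet = TRUE)

log_map <- ggplot(nc) +
  geom_sf(aes(fill = BIR74), col = "black") +
  scale_fill_viridis_c(
    option = "magma",
    direction = -1,
    trans = "log10",                # colours are assigned on the log10 scale
    labels = scales::label_comma()  # legend still shows the original counts
  ) +
  theme_light() +
  ggtitle("North Carolina, Births in 1974-78 (log scale)")
log_map
```

Because only the scale is transformed, the legend stays in the original units, which is usually easier to read than a legend of logged values.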
Customizing Choropleth maps
library(RColorBrewer)
map + 
  scale_fill_viridis_c(option = "magma", direction = -1) 
Adding labels with geom_sf_text()
map + 
  scale_fill_viridis_c(option = "magma", direction = -1)+ 
  geom_sf_text(aes(label = NAME), size = 1)
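Labeling all 100 counties forces a very small text size. A possible alternative, sketched here under the assumption that only the largest counties need names, is to pass a smaller data set to geom_sf_text(). `top_counties` and `labelled_map` are names made up for this example.

```r
library(sf)
library(ggplot2)

nc <- st_read(system.file("gpkg/nc.gpkg", package = "sf"), quiet = TRUE)

# Keep only the five counties with the most births in 1974-78
top_counties <- nc[order(nc$BIR74, decreasing = TRUE)[1:5], ]

labelled_map <- ggplot(nc) +
  geom_sf(aes(fill = BIR74), col = "black") +
  scale_fill_viridis_c(option = "magma", direction = -1) +
  # A separate data argument restricts the labels to the selected counties
  geom_sf_text(data = top_counties, aes(label = NAME), size = 3, colour = "white") +
  theme_light()
labelled_map
```

White text is used because these counties have the highest values and so get the darkest fill under the reversed magma palette.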
Since population density naturally drives most data trends, choropleth maps of raw counts frequently fail to provide any useful or surprising information.

🔴 Correlation doesn't imply causation! Just because two variables show similar patterns doesn't mean one causes the other.
✅ Use rates, percentages, or per capita values rather than absolute numbers. Example: Instead of showing total website users per state, show website users per 100,000 residents.
✅ Use location quotients or z-scores to highlight areas with unusually high or low values relative to expectations. Example: Show the percentage of a state's population that subscribes to Martha Stewart Living relative to the national average.
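Applied to the North Carolina data, these two tips could look something like the sketch below. It maps SIDS deaths per 1,000 births instead of raw death counts, and also computes a simple unweighted z-score across counties. `SID_RATE74`, `SID_Z74`, `nc_rates`, and `rate_map` are names invented for this example, and the z-score shown is just one simple way to standardize.

```r
library(sf)
library(dplyr)
library(ggplot2)

nc <- st_read(system.file("gpkg/nc.gpkg", package = "sf"), quiet = TRUE)

nc_rates <- nc %>%
  mutate(
    # SIDS deaths per 1,000 births, 1974-78
    SID_RATE74 = 1000 * SID74 / BIR74,
    # How many standard deviations each county's rate sits from the county mean
    SID_Z74 = (SID_RATE74 - mean(SID_RATE74)) / sd(SID_RATE74)
  )

rate_map <- ggplot(nc_rates) +
  geom_sf(aes(fill = SID_RATE74), col = "black") +
  scale_fill_viridis_c(option = "magma", direction = -1,
                       name = "SIDS deaths\nper 1,000 births") +
  theme_light() +
  ggtitle("North Carolina, SIDS Rate, 1974-78")
rate_map
```

Swapping `fill = SID_RATE74` for `fill = SID_Z74`, ideally with a diverging palette centered on zero, highlights the counties that are unusually high or low relative to the rest of the state.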
