In the previous statistical analysis, we conducted a proportion test to see whether the car accidents rates are the same in each borough. The results show that the proportions of car accidents are different across boroughs. Now we are still interested in the relationship between the location and the number of accidents, so we created a map to visualize to further explore this relationship.
In order to clearly see the difference in each borough, we first cleaned the data by removing rows with missing values for the borough, latitude, and longitude variables, aggregated the latitude and longitude and kept two decimal places.
map_data =
accidents1 %>%
filter(!is.na(borough)) %>%
filter(!is.na(latitude)) %>%
filter(!is.na(longitude)) %>%
mutate(
latitude = round(latitude, digits = 2),
longitude = round(longitude, digits = 2)) %>%
group_by(latitude, longitude) %>%
count(latitude, longitude, borough) %>%
arrange(desc(n)) %>%
mutate(text_label = str_c("</b>Lat: ", latitude, "°", "</b><br>Lng: ", longitude, "°", "</b><br>Borough: ", borough, "</b><br>Number_of_accidents: ", n))
pal = colorNumeric(
palette = "YlGnBu",
domain = map_data$n)
map_data %>%
leaflet() %>%
setView(lng = -74.00666, lat = 40.71643, zoom = 10) %>%
addTiles() %>%
leaflet::addLegend("bottomright",
pal = pal,
values = ~n,
title = "Number of Car Accidents</br>",
bins = 10,
opacity = 1,
labFormat = labelFormat(suffix = " cases")
) %>%
addCircleMarkers(
lng = ~longitude,
lat = ~latitude,
color = ~pal(n),
radius = 4,
popup = ~ text_label,
fillOpacity = 0.8,
stroke = FALSE,
opacity = 1)
According to the plot, the color scale indicated the number of car accidents at each location, with warmer colors (yellow) indicating a lower number of accidents and cooler colors (blue and green) indicating a higher number of accidents. From the mapping plot, we can see that the accidents occurred more in Bronx and Manhattan from January 2020 to August 2020, followed by Brooklyn. Fewer accidents occurred in Queens and Staten Island.
All of the information can be used to identify areas of the city with higher numbers of car accidents and develop strategies for reducing the number of accidents in those areas. For example, one potential strategy might be deploying more traffic police to areas with high numbers of accidents in order to maintain traffic order and reduce congestion.
We generated an interactive map to show the locations of car accidents in New York City in 2020 and the number of injuries resulting from each accident.
accidents_2020_map = accidents1 %>%
filter(!is.na(borough)) %>%
filter(!is.na(latitude)) %>%
filter(!is.na(longitude)) %>%
mutate(
total_injured = number_of_persons_injured + number_of_pedestrians_injured + number_of_cyclist_injured + number_of_motorist_injured,
textlab = paste0("Number of Injuries: ", total_injured, "\nAddress: ", on_street_name)
)
total_injuried_map = accidents_2020_map %>%
plot_ly(
lat = ~latitude,
lon = ~longitude,
type = "scattermapbox",
mode = "markers",
alpha = 0.5,
color = ~total_injured,
text = ~textlab,
colors = "YlOrRd")
total_injuried_map %>% layout(
mapbox = list(
style = 'carto-positron',
zoom = 9,
center = list(lon = -73.9, lat = 40.7)),
title = "<b> Total Number of Injuries </b>",
legend = list(title = list(text = "Borough of Accidents", size = 9),
orientation = "h",
font = list(size = 9)))
From the plot, we can tell that the maximum injury in one accident is 30 people. However, most accidents have less than 5 injuries. A few accidents have more than 5 injuries.
This information can be used to identify areas of the city with higher numbers of injuries and develop strategies for reducing the number of injuries in those areas. For example, one potential strategy might be increasing public awareness of road safety and encouraging drivers to be more careful in areas with high numbers of injuries.