R packages we need to reproduce the maps and graphs in this tutorial:

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ REQUIRED LIBRARIES @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
library(visNetwork) # allows graphs to be generated for desired network
library(sf)         # encodes spatial vector data
library(dplyr)      # a fast, consistent tool for working with data-frame-like objects
library(purrr)      # purrr enhances R's functional programming (FP) toolkit
library(raster)     # reading, manipulating, and analysing spatial raster data
library(leaflet)    # Leaflet is one of the most popular open-source JavaScript libraries for interactive maps. 
library(tmap)       # thematic maps; an alternative if you need something other than leaflet
library(revgeo)     # reverse geocoding... I am hoping you all know what geocoding is!
library(osmdata)    # downloading and using data from OpenStreetMap (OSM)
library(htmltools)  # Handy package for HTML generation and output.
library(htmlwidgets)# A framework for creating HTML widgets that render in various contexts 

In marketing, geomarketing (also called marketing geography) uses geographic information in the planning and implementation of marketing activities. It can be used in market analyses, target group localisation, sales territory planning, production of retail digital maps such as purchasing power and regional market data, as well as appraisals of real estate, locations, and sales networks for the industry and manufacturing sectors. Below are some of the questions that geomarketing can answer:

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ REQUIRED LIBRARIES @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
library(visNetwork)
 
# create data:
nodes <- data.frame(id = 1:4, color.background = c("red", "lightblue", "lightgreen", "orange"), group = LETTERS[1:4])
edges <- data.frame(from = c(1,1,1,2,2,2,3,3,3,4,4,4), to = c(2,3,4,1,3,4,1,2,4,1,2,3))

visNetwork(nodes, edges, width = "100%", main = "Simple geomarketing applications network") %>%
  visGroups(groupname = "A", color = "red") %>%
  visGroups(groupname = "B", color = "lightblue") %>%
  visGroups(groupname = "C", color = "lightgreen") %>%
  visGroups(groupname = "D", color = "orange") %>%
  visEdges(arrows = "to") %>% 
  visPhysics(solver = "forceAtlas2Based", 
            forceAtlas2Based = list(gravitationalConstant = -35)) %>%
  visLayout(randomSeed = 213) %>%
  visLegend()

In this presentation, I will simply demonstrate how we can access the AUSTRALIAN POPULATION GRID 2018 and apply it to see where we could open a new shop. Of course, setting up a new shop needs a lot of research, and this is just a simple demonstration. For this task, we assume that our client wants to start a chain of restaurants in Australia, where the stores should be placed in urban areas with as many potential customers as possible. We also assume that we have no prior information about Australian cities. Let's do this!

OK, first things first, and that is setting up a working directory:

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ SETTING UP WORKING DIRECTORY @@@@@@@@@@@@@@@@@@@@@
setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
wd <- getwd() 
# wd
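
Note that rstudioapi::getActiveDocumentContext() only works when this script is run from inside RStudio. If you run the code elsewhere, a minimal fallback is to set the path by hand (the path below is a placeholder, not part of the original script):

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ FALLBACK OUTSIDE RSTUDIO @@@@@@@@@@@@@@@@@@@@@@@@@
# setwd("~/projects/geomarketing") # placeholder path; adjust to your own machine
# wd <- getwd()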

From here to the end of this tutorial, we will carry out the following tasks:

  1. Download the Australian census data (input data) from the Australian Bureau of Statistics (ABS) © Commonwealth of Australia and use it to identify the spatial distribution of target groups (Step 1).

  2. Identify metropolitan areas with high population densities (Step 2).

  3. Assess whether the existing shops over/under-exploit the market potential by downloading detailed geographic data (from OpenStreetMap, with osmdata) for these areas (Step 3).

  4. Locate potential market areas by assessing and mapping the relative desirability of different locations through map algebra (Step 4).

  5. Rank the final scores (Step 5).

1 Download the Australian population grid

First, download, unzip, and read in the Australian Population Grid 2018 zip file (GeoTIFF format).

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ DOWNLOADING DATA FROM SERVER @@@@@@@@@@@@@@@@@@@@@
download.file("https://www.abs.gov.au/AUSSTATS/subscriber.nsf/log?openagent&australian%20population%20grid%202018%20in%20tiff%20format.zip&3218.0&Data%20Cubes&E89D43E67E233926CA2583C9000DF951&0&2017-18&27.03.2019&Latest", 
              destfile = "australian population grid 2018 in tiff format.zip", mode = "wb")
unzip("australian population grid 2018 in tiff format.zip") # unzip the files
census_au = raster::raster("apg18e_1_0_0.tif")
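
Before manipulating the raster, it is worth a quick sanity check of its size, resolution, and coordinate reference system. A minimal sketch (the exact output depends on the downloaded file):

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ INSPECTING THE RASTER @@@@@@@@@@@@@@@@@@@@@@@@@@@@
raster::ncell(census_au) # total number of grid cells
raster::res(census_au)   # cell size in map units (1 km x 1 km expected)
raster::crs(census_au)   # coordinate reference system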

census_au is the Australian Population Grid 2018 in GeoTIFF format, containing more than 30,344,100 grid cells across Australia. Once we have loaded our population raster into R, to speed up processing and for better visualisation, we replace 0-valued grid cells with NA.

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ DATA MANIPULATION TASK @@@@@@@@@@@@@@@@@@@@@
# Preparing the raster data for better visualisation
vis.ras = census_au
vis.ras[vis.ras <= 0] <- NA

pal <- colorNumeric(c("yellow","orange","red","red3"), values(vis.ras),
  na.color = "transparent")

leaflet() %>% addTiles() %>%
  addRasterImage(vis.ras, colors = pal, opacity = 0.8) %>%
  setView(144.96, -37.81, zoom = 5) %>%
  addLegend(pal = pal, values = values(vis.ras),
    title = "Population Density")

We then need to reclassify the raster data created above:

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ DATA RECLASSIFICATION @@@@@@@@@@@@@@@@@@@@@@@@@@@@
input_ras = vis.ras
rcl_pop = matrix(c(0,500,1, 500,2000,2, 2000,5000,3, 5000,8000,4, 
                   8000, Inf, 5), ncol = 3, byrow = TRUE)

input_ras = reclassify(x = input_ras, rcl = rcl_pop, right = NA)
reclass = stack(vis.ras, input_ras)
names(reclass) = c("org_pop", "rcl_pop")
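
As a quick sanity check of the reclassification, we can tabulate how many cells fall into each of the five density classes defined in rcl_pop above:

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ CHECKING THE RECLASSIFICATION @@@@@@@@@@@@@@@@@@@@
raster::freq(reclass$rcl_pop, useNA = "no") # cell counts per class 1-5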

2 Identify metropolitan areas with high population densities

Here, we identify metropolitan areas. Note that the input raster layers have a cell size of 1 km by 1 km, which is too small/detailed to delineate metropolitan areas. Thus, we first aggregate the input raster layers by a factor of 20, so that the cell size becomes 20 km by 20 km. Next, we keep only those cells with more than 200,000 people.

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ DATA AGGREGATION AND SUBSETTING @@@@@@@@@@@@@@@@@@
pop_agg = aggregate(reclass$org_pop, fact = 20, fun = sum)
pop_agg = pop_agg[pop_agg > 200000, drop = FALSE] # try what happens with drop = TRUE

# Plotting
pal <- colorNumeric(c("yellow","orange","red","red3"), values(pop_agg),
  na.color = "transparent")

leaflet() %>% addTiles() %>%
  addRasterImage(pop_agg, colors = pal, opacity = 0.8) %>%
  addLegend(pal = pal, values = values(pop_agg),
    title = "Population Density")

The map above shows that only seven metropolitan regions meet our population requirement. Each region consists of one or more raster cells. Let's join all cells belonging to the same region using raster::clump().

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ DATA AGGREGATION AND SUBSETTING @@@@@@@@@@@@@@@@@@
#Detect clumps (patches) of connected cells
polys = pop_agg %>% 
  clump() %>%
  rasterToPolygons() %>%
  st_as_sf()
## Loading required namespace: igraph
metros = polys %>%
  group_by(clumps) %>%
  summarize()
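
As a quick sanity check, metros should now contain one polygon per metropolitan region:

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ CHECKING THE METRO POLYGONS @@@@@@@@@@@@@@@@@@@@@@
nrow(metros)    # expected: 7 rows, one per region
st_area(metros) # approximate area of each region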

3 Assess whether the existing shops over/under-exploit the market potential

Still, the metropolitan names are missing in metros. To add the names, we reverse-geocode, which is a task similar to geocoding but in the opposite direction (input: coordinates -> output: addresses). This is done using the revgeo package, which provides access to the open-source Photon geocoder for OpenStreetMap as well as Google Maps and Bing. By default, it uses the Photon API. Please keep in mind that revgeo::revgeo() only accepts geographical coordinates (latitude/longitude); therefore, the first requirement is to bring the metropolitan polygons into an appropriate coordinate reference system.

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ REVERSING GEOCODES @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
# EPSG:4326 (WGS84)
metros_wgs = st_transform(metros, 4326)
coords = st_centroid(metros_wgs) %>%
  st_coordinates() %>%
  round(4)

# Now, let's reverse geocode with given coordinates.
metro_names = revgeo(longitude = coords[, 1], latitude = coords[, 2], 
                     output = "frame") # this returns a data frame
## [1] "Getting geocode data from Photon: http://photon.komoot.de/reverse?lon=153.0345&lat=-27.4862"
## [1] "Getting geocode data from Photon: http://photon.komoot.de/reverse?lon=153.3894&lat=-27.9767"
## [1] "Getting geocode data from Photon: http://photon.komoot.de/reverse?lon=115.8934&lat=-32.0094"
## [1] "Getting geocode data from Photon: http://photon.komoot.de/reverse?lon=138.6113&lat=-34.8709"
## [1] "Getting geocode data from Photon: http://photon.komoot.de/reverse?lon=151.0733&lat=-33.8454"
## [1] "Getting geocode data from Photon: http://photon.komoot.de/reverse?lon=149.1168&lat=-35.3054"
## [1] "Getting geocode data from Photon: http://photon.komoot.de/reverse?lon=145.0634&lat=-37.8645"
# Minor manual edits: correct city names that the geocoder returned incorrectly.
metro_names = dplyr::pull(metro_names, city) %>% 
  as.character() %>% 
  ifelse(. == "Adelaide", "Melbourne", .)

metro_names[[4]] = "Adelaide"
metro_names[[7]] = "Melbourne"
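
metro_names should now hold one city name per region; print it to confirm the manual fixes took effect:

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ CHECKING THE METRO NAMES @@@@@@@@@@@@@@@@@@@@@@@@@
metro_names # seven names, matching the coordinates logged above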

We then need to identify the competing stores as points of interest. The osmdata package provides easy-to-use access to OpenStreetMap (OSM) data.

  1. Instead of downloading shops for the whole of Australia, we restrict the query to the defined metropolitan areas, reducing computational load and providing shop locations only in areas of interest.

  2. The subsequent code chunk does this with getbb(), opq(), add_osm_feature(), and osmdata_sf(), as shown below:
#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ POINT OF INTEREST @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
shops = map(
  metro_names, 
  function(x) {
    message("Downloading shops of: ", x, "\n")
    
    # give the server a bit of time
    Sys.sleep(sample(seq(5, 10, 0.1), 1))
    
    # specify a query with a bounding box, a key, and a value
    # query = opq(x) %>%
    #   add_osm_feature(key = "shop")
    
    query = getbb(x, base_url = "https://nominatim.openstreetmap.org") %>% 
        opq() %>%
        add_osm_feature(key = "amenity", value = "restaurant")
    
    # Return an OSM query in sf format 
    points = osmdata_sf(query)
    
    # request the same data again if nothing has been downloaded 
    iter = 2
    while (nrow(points$osm_points) == 0 & iter > 0) {
      points = osmdata_sf(query)
      iter = iter - 1
    }
    # set the CRS and return the points
    points = st_set_crs(points$osm_points, 4326)
  }
)

Once the above API queries are finished, we need to check whether all seven metropolitan areas have at least some restaurants. Since these metros are the largest in the country and there are more than \(22,000\) restaurants in Australia, it is unlikely that any of them has none.

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ CHECK IF ALL SEVEN METROPOLITAN AREAS HAVE SHOPS @
# checking if we have downloaded restaurant for each metropolitan area
ind = map(shops, nrow) == 0
if (any(ind)) {
  message("There are/is still (a) metropolitan area/s without any features:\n",
          paste(metro_names[ind], collapse = ", "), "\nPlease fix it!")
}

Once we have made sure that all metros have some restaurants, we clean the results by keeping only the osm_id column (an sf object keeps its geometry column automatically, so two columns remain).

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ CLEANING SHOPS DATA @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
# select only specific columns
shops = map(shops, dplyr::select, osm_id)
# putting all list elements into a single data frame
shops = do.call(rbind, shops)

As the last step, we convert the spatial point object into a raster. The sf object, shops, is converted to a raster layer with the same parameters (dimensions, resolution, CRS) as the reclass object created in Step 1.

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ RASTERISING THE SHOPS DATA @@@@@@@@@@@@@@@@@@@@@@
# to match crs of shops and reclass
shops = st_transform(shops, proj4string(reclass)) 
# create poi raster
poi = rasterize(x = shops, y = reclass, field = "osm_id", fun = "count")
# preparing the cities that we have extracted the shops for
library(maps)
df.cities <- world.cities[world.cities$country.etc == "Australia" &
                            world.cities$name %in% metro_names,] 

# preparing the population raster for better visualisation in the next step
vis.ras[vis.ras <= 2000] <- NA
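
Before mapping, it can also help to check the range of restaurant counts per raster cell; a minimal sketch:

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ CHECKING THE POI RASTER @@@@@@@@@@@@@@@@@@@@@@@@@@
raster::cellStats(poi, stat = "max") # maximum number of restaurants in a single cell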

Next, we just need a function to interactively map population density and the number of existing restaurants side by side for a selected city. Here is how we can do it in plain R code:

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ SIDE BY SIDE MAPPING FUNCTION USING LEAFLET @@@@@
map.fun <- function(df.cities, pop.ras, shop.ras, i){
  pal <- colorNumeric(c("yellow","orange","red","red3"), values(pop.ras),
          na.color = "transparent")
  map1 <- leaflet() %>% addTiles() %>%
            addRasterImage(pop.ras, colors = pal, opacity = 0.6) %>%
            setView(df.cities$long[[i]], df.cities$lat[[i]], zoom = 11) %>%
            addLegend(pal = pal, values = values(pop.ras),
                      title = "Population")
  
  pal <- colorNumeric(c("yellow","orange","red","red3"), values(shop.ras),
            na.color = "transparent")
  map2 <- leaflet() %>% addTiles() %>%
            addRasterImage(shop.ras, colors = pal, opacity = 0.6) %>%
            setView(df.cities$long[[i]], df.cities$lat[[i]], zoom = 11) %>%
            addLegend(pal = pal, values = values(shop.ras),
                      title = "Resturants")
  
  tagList(
    tags$head(tags$script(
      type="text/javascript",
      src="https://cdn.rawgit.com/turban/Leaflet.Sync/master/L.Map.Sync.js"
    )),
    tags$div(style="display:inline;width:50%;float:left;",map1),
    tags$div(style="display:inline;width:50%;float:left;",map2),
    onStaticRenderComplete(
      '
  var leaf_widgets = Array.prototype.map.call(
    document.querySelectorAll(".leaflet"),
    function(ldiv){
      return HTMLWidgets.find("#" + ldiv.id);
    }
  );
  
  // make this easy since we know only two maps
  leaf_widgets[0].sync(leaf_widgets[1]);
  leaf_widgets[1].sync(leaf_widgets[0]);
  '
    )
  ) %>%
    browsable
}

ADELAIDE

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ PLOTTING ADELAIDE MAP @@@@@@@@@@@@@@@@@@@@@@@@
# ADELAIDE
map.fun(df.cities=df.cities, pop.ras=vis.ras, shop.ras=poi, i=1)
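
The same function renders any of the other metros by changing the row index i. For example, assuming world.cities keeps its alphabetical ordering, i = 2 should give Brisbane:

#@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ PLOTTING ANOTHER CITY @@@@@@@@@@@@@@@@@@@@@@@@@@@@
# map.fun(df.cities = df.cities, pop.ras = vis.ras, shop.ras = poi, i = 2)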