A Case Study using DuckDB, Polars and Folium
Jesus LM
Jan, 2025
Geospatial analysis involves the application of spatial concepts and techniques to data that has geographic coordinates. With the rise of big data and the increasing availability of geospatial information, the demand for effective geospatial analysis tools has grown significantly. Python, with its rich ecosystem of libraries, has emerged as a powerful and popular choice for geospatial data scientists.
# import libraries
import numpy as np
import pandas as pd
import polars as pl
import duckdb as db
import folium
from great_tables import GT, md
from warnings import filterwarnings
filterwarnings('ignore')
['url',
'address',
'name',
'online_order',
'book_table',
'rate',
'votes',
'phone',
'location',
'rest_type',
'dish_liked',
'cuisines',
'approx_cost(for two people)',
'reviews_list',
'menu_item',
'listed_in(type)',
'listed_in(city)']
[{'url': 0,
'address': 0,
'name': 0,
'online_order': 0,
'book_table': 0,
'rate': 7775,
'votes': 0,
'phone': 1208,
'location': 21,
'rest_type': 227,
'dish_liked': 28078,
'cuisines': 45,
'approx_cost(for two people)': 346,
'reviews_list': 0,
'menu_item': 0,
'listed_in(type)': 0,
'listed_in(city)': 0}]
Zomato Restaurants

| address | name | rate | votes | location | rest_type | dish_liked | cuisines |
|---|---|---|---|---|---|---|---|
| 942, 21st Main Road, 2nd Stage, Banashankari, Bangalore | Jalsa | 4.1/5 | 775 | Banashankari | Casual Dining | Pasta, Lunch Buffet, Masala Papad, Paneer Lajawab, Tomato Shorba, Dum Biryani, Sweet Corn Soup | North Indian, Mughlai, Chinese |
| 2nd Floor, 80 Feet Road, Near Big Bazaar, 6th Block, Kathriguppe, 3rd Stage, Banashankari, Bangalore | Spice Elephant | 4.1/5 | 787 | Banashankari | Casual Dining | Momos, Lunch Buffet, Chocolate Nirvana, Thai Green Curry, Paneer Tikka, Dum Biryani, Chicken Biryani | Chinese, North Indian, Thai |
| 1112, Next to KIMS Medical College, 17th Cross, 2nd Stage, Banashankari, Bangalore | San Churro Cafe | 3.8/5 | 918 | Banashankari | Cafe, Casual Dining | Churros, Cannelloni, Minestrone Soup, Hot Chocolate, Pink Sauce Pasta, Salsa, Veg Supreme Pizza | Cafe, Mexican, Italian |

Source: Shan Singh
Let's make every place name more specific by appending the city, state and country, so that the geocoder returns more accurate geographical coordinates.
[{'location': 'HSR, Bangalore, Karnataka, India'},
{'location': 'BTM, Bangalore, Karnataka, India'},
{'location': 'BTM, Bangalore, Karnataka, India'},
{'location': 'BTM, Bangalore, Karnataka, India'},
{'location': 'BTM, Bangalore, Karnataka, India'}]
Schema([('url', String),
('address', String),
('name', String),
('online_order', Boolean),
('book_table', Boolean),
('rate', String),
('votes', Int64),
('phone', String),
('location', String),
('rest_type', String),
('dish_liked', String),
('cuisines', String),
('approx_cost(for two people)', String),
('reviews_list', String),
('menu_item', String),
('listed_in(type)', String),
('listed_in(city)', String)])
First, we will extract latitudes and longitudes using the ‘location’ feature.
[{'name': 'Jakkur, Bangalore, Karnataka, India'},
{'name': 'Kalyan Nagar, Bangalore, Karnataka, India'},
{'name': 'RT Nagar, Bangalore, Karnataka, India'},
{'name': 'Koramangala 7th Block, Bangalore, Karnataka, India'},
{'name': 'Kaggadasapura, Bangalore, Karnataka, India'}]
# NOTE: geolocator is not defined in the code shown here; a geopy geocoder
# (for example Nominatim) is assumed
lat = []  # latitudes, one per location name
lon = []  # longitudes, one per location name
for name in pl.Series(rest_loc.select('name')):
    location = geolocator.geocode(name)
    if location is None:
        # no match found: store NaN so both lists stay aligned with the names
        lat.append(np.nan)
        lon.append(np.nan)
    else:
        lat.append(location.latitude)
        lon.append(location.longitude)

[13.0621474,
12.9846713,
12.981015523680384,
12.985098650000001,
12.9096941,
nan,
12.9067683,
12.938455602031697,
12.9176571,
12.9489339]
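The geocoder setup is not shown in the article, so the following is only a sketch of how it might look with geopy's `Nominatim` and `RateLimiter`. The helper names `make_geocoder` and `geocode_names`, the user agent string and the fake geocoder are all hypothetical; the fake one is there so the loop can be tried without network access.

```python
import math


def make_geocoder(user_agent="zomato-geo-demo", delay=1.0):
    """Build a rate-limited geopy geocoder (assumed setup, needs network access)."""
    from geopy.geocoders import Nominatim
    from geopy.extra.rate_limiter import RateLimiter

    geolocator = Nominatim(user_agent=user_agent)
    # The public Nominatim service asks for at most one request per second.
    return RateLimiter(geolocator.geocode, min_delay_seconds=delay)


def geocode_names(names, geocode):
    """Return parallel lat/lon lists, with NaN where no match is found."""
    lat, lon = [], []
    for name in names:
        location = geocode(name)
        if location is None:
            lat.append(math.nan)
            lon.append(math.nan)
        else:
            lat.append(location.latitude)
            lon.append(location.longitude)
    return lat, lon


# Offline demo with a stand-in geocoder (hypothetical, for illustration only).
class FakeLocation:
    latitude, longitude = 12.9163603, 77.604733


def fake_geocode(name):
    return FakeLocation() if name.startswith("BTM") else None


lat, lon = geocode_names(
    ["BTM, Bangalore, Karnataka, India", "Unknown place"], fake_geocode
)
print(lat, lon)

# With the real service (commented out because it calls the network):
# lat, lon = geocode_names(rest_loc["name"].to_list(), make_geocoder())
```

Passing the geocoding function in as an argument keeps the loop testable and makes it easy to swap in a different provider.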
Zomato Restaurants Coordinates

| name | lat | lon |
|---|---|---|
| Sahakara Nagar, Bangalore, Karnataka, India | 13.0621 | 77.5801 |
| Kaggadasapura, Bangalore, Karnataka, India | 12.9847 | 77.6791 |
| Infantry Road, Bangalore, Karnataka, India | 12.9810 | 77.6021 |
| CV Raman Nagar, Bangalore, Karnataka, India | 12.9851 | 77.6631 |
| JP Nagar, Bangalore, Karnataka, India | 12.9097 | 77.5866 |

Source: Shan Singh
We have now found the latitude and longitude of each location listed in the dataset using geopy. These coordinates are used to plot the maps. A few locations could not be geocoded and have NaN coordinates:
| name | lat | lon |
|---|---|---|
| str | f64 | f64 |
| "Sadashiv Nagar, Bangalore, Kar… | NaN | NaN |
| "Rammurthy Nagar, Bangalore, Ka… | NaN | NaN |
Zomato Restaurants Count

| name | count |
|---|---|
| BTM, Bangalore, Karnataka, India | 5124 |
| HSR, Bangalore, Karnataka, India | 2523 |
| Koramangala 5th Block, Bangalore, Karnataka, India | 2504 |
| JP Nagar, Bangalore, Karnataka, India | 2235 |
| Whitefield, Bangalore, Karnataka, India | 2144 |

Source: Shan Singh
These are the locations where most of the restaurants are located.
Let's create a heatmap of these results to make them easier to read.
To perform spatial analysis, we need the latitude and longitude of every location, so let's merge both dataframes to get the geographical coordinates.
Zomato Restaurants Count & Coordinates

| name | count | lat | lon |
|---|---|---|---|
| BTM, Bangalore, Karnataka, India | 5124 | 12.9163603 | 77.604733 |
| HSR, Bangalore, Karnataka, India | 2523 | 12.90056335 | 77.64947470503677 |
| Koramangala 5th Block, Bangalore, Karnataka, India | 2504 | 12.9348429 | 77.6189768 |
| JP Nagar, Bangalore, Karnataka, India | 2235 | 12.9096941 | 77.5866067 |
| Whitefield, Bangalore, Karnataka, India | 2144 | 12.9696365 | 77.7497448 |

Source: Shan Singh
To display these results as a heatmap, we first need to create a base map and then add the heatmap layer on top of it.
[Interactive Folium heatmap]
You can interact with the above map by zooming in or out.
The majority of the restaurants are located in the city centre area.
[Interactive Folium marker-cluster map]
You can interact with the above map by zooming in or out.
Plotting markers on the map
Folium provides the folium.Marker() class for plotting markers on a map.
Plotting markers is a two-step process: create the marker by passing the latitude and longitude of the location, along with an optional popup and tooltip, and then add it to the map.
You can interact with the above map by zooming in or out.
Rate field cleaning
To analyse where the restaurants with a high average rating are located, we first need to clean the ‘rate’ feature.
[{'rate': '-'}, {'rate': 'NEW'}]
14.999226245744351
[{'rating': 2.4000000953674316},
{'rating': 2.299999952316284},
{'rating': 0.0},
{'rating': 3.5},
{'rating': 4.099999904632568},
{'rating': 4.400000095367432},
{'rating': 4.699999809265137},
{'rating': 4.900000095367432},
{'rating': 2.700000047683716},
{'rating': 3.9000000953674316},
{'rating': 3.799999952316284},
{'rating': 3.4000000953674316},
{'rating': 3.0},
{'rating': 2.5999999046325684},
{'rating': 3.299999952316284},
{'rating': 4.199999809265137},
{'rating': 2.200000047683716},
{'rating': 4.0},
{'rating': 4.5},
{'rating': 2.5},
{'rating': 3.5999999046325684},
{'rating': 3.700000047683716},
{'rating': 2.0999999046325684},
{'rating': 4.800000190734863},
{'rating': 3.200000047683716},
{'rating': 2.799999952316284},
{'rating': 4.300000190734863},
{'rating': 2.9000000953674316},
{'rating': 2.0},
{'rating': 4.599999904632568},
{'rating': 3.0999999046325684},
{'rating': 1.7999999523162842}]
| name | rate | votes | location | dish_liked | rating |
|---|---|---|---|---|---|
| str | str | i64 | str | str | f32 |
| "Byg Brewski Brewing Company" | "4.9/5" | 16345 | "Sarjapur Road, Bangalore, Karn… | "Cocktails, Dahi Kebab, Rajma C… | 4.9 |
| "Byg Brewski Brewing Company" | "4.9/5" | 16345 | "Sarjapur Road, Bangalore, Karn… | "Cocktails, Dahi Kebab, Rajma C… | 4.9 |
| "Byg Brewski Brewing Company" | "4.9/5" | 16345 | "Sarjapur Road, Bangalore, Karn… | "Cocktails, Dahi Kebab, Rajma C… | 4.9 |
| "Belgian Waffle Factory" | "4.9/5" | 1746 | "Brigade Road, Bangalore, Karna… | "Coffee, Berryblast, Nachos, Ch… | 4.9 |
| "Belgian Waffle Factory" | "4.9/5" | 1746 | "Brigade Road, Bangalore, Karna… | "Coffee, Berryblast, Nachos, Ch… | 4.9 |
| name | avg_rating | count |
|---|---|---|
| str | f32 | u32 |
| "Brookefield, Bangalore, Karnat… | 3.374697 | 581 |
| "Thippasandra, Bangalore, Karna… | 3.095396 | 152 |
| "Electronic City, Bangalore, Ka… | 3.04191 | 964 |
| "Koramangala 1st Block, Bangalo… | 3.263946 | 965 |
| "Koramangala 3rd Block, Bangalo… | 3.978755 | 193 |
| … | … | … |
| "RT Nagar, Bangalore, Karnataka… | 3.278125 | 64 |
| "Jalahalli, Bangalore, Karnatak… | 3.486956 | 23 |
| "Commercial Street, Bangalore, … | 3.109709 | 309 |
| "Banaswadi, Bangalore, Karnatak… | 3.362927 | 499 |
| "Koramangala 5th Block, Bangalo… | 3.901511 | 2381 |
Let's keep only those locations whose count is at least 400.
| name | avg_rating | count |
|---|---|---|
| str | f32 | u32 |
| "Brookefield, Bangalore, Karnat… | 3.374697 | 581 |
| "Electronic City, Bangalore, Ka… | 3.04191 | 964 |
| "Koramangala 1st Block, Bangalo… | 3.263946 | 965 |
| "Bannerghatta Road, Bangalore, … | 3.271675 | 1324 |
| "HSR, Bangalore, Karnataka, Ind… | 3.484063 | 2128 |
| … | … | … |
| "Richmond Road, Bangalore, Karn… | 3.688013 | 634 |
| "Koramangala 7th Block, Bangalo… | 3.747846 | 1089 |
| "Frazer Town, Bangalore, Karnat… | 3.56488 | 578 |
| "Banaswadi, Bangalore, Karnatak… | 3.362927 | 499 |
| "Koramangala 5th Block, Bangalo… | 3.901511 | 2381 |
| name | lat | lon |
|---|---|---|
| str | f64 | f64 |
| "Sahakara Nagar, Bangalore, Kar… | 13.062147 | 77.580061 |
| "Kaggadasapura, Bangalore, Karn… | 12.984671 | 77.679091 |
| "Infantry Road, Bangalore, Karn… | 12.981016 | 77.602133 |
| "CV Raman Nagar, Bangalore, Kar… | 12.985099 | 77.663117 |
| "JP Nagar, Bangalore, Karnataka… | 12.909694 | 77.586607 |
| … | … | … |
| "Seshadripuram, Bangalore, Karn… | 12.993188 | 77.575342 |
| "Jakkur, Bangalore, Karnataka, … | 13.078474 | 77.606894 |
| "Bommanahalli, Bangalore, Karna… | 12.908945 | 77.623904 |
| "Kammanahalli, Bangalore, Karna… | 13.009346 | 77.637709 |
| "Nagawara, Bangalore, Karnataka… | 13.042279 | 77.624858 |
Let's merge both dataframes so that we get the coordinates as well.
| name | avg_rating | count | lat | lon |
|---|---|---|---|---|
| str | f32 | u32 | f64 | f64 |
| "JP Nagar, Bangalore, Karnataka… | 3.412929 | 1849 | 12.909694 | 77.586607 |
| "Koramangala 4th Block, Bangalo… | 3.814351 | 864 | 12.932778 | 77.629405 |
| "Whitefield, Bangalore, Karnata… | 3.384171 | 1693 | 12.969637 | 77.749745 |
| "Bannerghatta Road, Bangalore, … | 3.271675 | 1324 | 12.951856 | 77.604011 |
| "Jayanagar, Bangalore, Karnatak… | 3.61525 | 1718 | 12.939904 | 77.582638 |
| … | … | … | … | … |
| "Ulsoor, Bangalore, Karnataka, … | 3.541396 | 901 | 12.977879 | 77.62467 |
| "Frazer Town, Bangalore, Karnat… | 3.56488 | 578 | 12.998683 | 77.615525 |
| "Indiranagar, Bangalore, Karnat… | 3.652168 | 1936 | 12.996298 | 77.545278 |
| "Koramangala 6th Block, Bangalo… | 3.662465 | 1111 | 12.939025 | 77.623848 |
| "Kammanahalli, Bangalore, Karna… | 3.499809 | 525 | 13.009346 | 77.637709 |
[Interactive Folium heatmap]
You can interact with the above map by zooming in or out.
Python, with its powerful libraries and ease of use, has become an indispensable tool for geospatial analysis. By leveraging the capabilities of libraries like GeoPandas, Shapely, and folium, data scientists can effectively explore and analyze geospatial data, gain valuable insights, and make informed decisions.
In this article, we have given a brief overview of geospatial analysis in Python.
Jesus LM
Economist & Data Scientist