1. Overview
Overview
Welcome to the Build Snowflake Native App Quickstart focused on Geospatial Analysis!
Within this Quickstart we will build a snowflake native application leveraging powerful Snowflake native Geospatial analysis functionalities coupled with a SafeGraph POI listing from the Snowflake Marketplace.
Snowflake Native Applications provide developers a way to package applications for consumption by other Snowflake users. The Snowflake Marketplace is a central place for Snowflake users to discover and install Snowflake Native Applications.
The application you'll build will analyse orders data geospatially and provide powerful visualizations to the application user to understand their data better. Let's explore the application from the perspective of the application provider and an application consumer.
Prerequisites
Before beginning, please make sure you have completed the Introduction to Tasty Bytes Quickstart which provides a walkthrough on setting up a trial account and deploying the Tasty Bytes Foundation required to complete this Quickstart.
What You Will Learn
- How to Access the Snowflake Marketplace
- How to Acquire SafeGraph POI Data from the Snowflake Marketplace
- How to Create a View
- How to Create a Geography Point
- How to Calculate Distance between Points
- How to Collect Points
- How to Draw a Minimum Bounding Polygon and Calculate its Area
- How to conduct Geospatial Analysis with H3 (Hexagonal Hierarchical Geospatial Indexing System)
- Snowflake Native App Framework
- Snowflake Native App deployment
What You Will Build
- An Analytics Ready View Complete with First and Third Party Data
- An Understanding of How to Conduct Geospatial Analysis in Snowflake
- A Snowflake Native Application
2. Native App Architecture & Concepts
Snowflake Native Apps are a new way to build data intensive applications. Snowflake Native Apps benefit from running inside Snowflake and can be installed from the Snowflake Marketplace, similar to installing an app on a smart phone. Snowflake Native Apps can read and write data to a user's database (when given permission to do so). Snowflake Native Apps can even bring in new data to their users, providing new insights.
When discussing Snowflake Native Apps, there are two personas to keep in mind: Providers and Consumers.
Providers : Developers of the app. The developer or company publishing an app to the Snowflake Marketplace is an app provider. Consumer : Users of an app. When a user installs an app from the Snowflake Marketplace, they are a consumer of the app.
In real world, Provider of such Native Application can be Geospatial product experts and Consumers can be organizations like Retailers, Suppliers etc
This quickstart demonstrates the following flow and architecture.
Provider Leverages Snowflake Marketplace to acquire POI Data provisioned by Safegraph
Provider Builds a snowflake native application embedding Geospatial functionalities on consumer data integrated with the POI data acquired from Marketplace
Provider deploys the application. In real world, provider will deploy application to Snowflake Marketplace, as demo we will be deploying to test Snowflake account locally Consumer Ingests their 1st party Orders Data into Snowflake from operational systems like Salesforce. In this demo we will briefly see the ingestion performed using Fivetran and leverage the tasty bytes sample data already made available in your account Consumer runs the application with the Orders data and benefits from the visual insights delivered!
3. Safegraph POI Data from Marketplace
Overview
In our demo scenario, the native application provider is a Geospatial analysis expert building products for enterprises. One of their customers, Tasty Bytes operates Food Trucks in numerous cities and countries across the globe with each truck having the ability to choose two different selling locations per day.
One important item that Tasty Bytes Executives are interested in is to learn more about how these locations relate to each other as well as if there are any locations we currently serve that are potentially too far away from hot selling city centers. Unfortunately their first-party data does not give us the building blocks required to complete this sort of Geospatial analysis. Thankfully, the Snowflake Marketplace has great listings from SafeGraph that can assist us here.
The native app provider acquires POI data to develop an application that will integrate POI data with customer first party data and provides powerful Geospatial Analytics and Visualization
To begin, acquire the SafeGraph listing by following the steps below within Snowsight:
- In the bottom left corner, ensure you are operating as ACCOUNTADMIN
- In the left pane, navigate to ‘Data Products' (Cloud Icon) and select ‘Marketplace'
- In the search bar, enter: ‘SafeGraph: frostbyte'
- Select the ‘SafeGraph: frostbyte' listing and click ‘Get'
- Adjust Database name to:‘TB_SAFEGRAPH'
- Grant access to: ‘PUBLIC'
4. Tasty Bytes Orders Data
As part of the Introduction to Tasty Bytes Quickstart you would have already created the database TB_101, schemas, tables (with data loaded) and views
- Execute below query to create a large warehouse
create WAREHOUSE IDENTIFIER('"LARGE_WH"') COMMENT = '' WAREHOUSE_SIZE = 'Large' AUTO_RESUME = true AUTO_SUSPEND = 300 ENABLE_QUERY_ACCELERATION = false WAREHOUSE_TYPE = 'STANDARD' MIN_CLUSTER_COUNT = 1 MAX_CLUSTER_COUNT = 1 SCALING_POLICY = 'STANDARD'
- Execute below query to load orders data from view to table using the large warehouse created
USE warehouse LARGE_WH;
USE ROLE SYSADMIN;
USE database TB_101;
create table TB_101.HARMONIZED. ORDERS as (select * from TB_101.HARMONIZED.ORDERS_V);
ORDERS table is ready to be used in the native application!
5. Fivetran Ingestion
While we are using the sample orders data for the demo, in real world organizations like Tasty Bytes will be leveraging systems like Salesforce or their internal system to house orders data. Cutting edge tools like Fivetran are used to ingest data from the source systems into snowflake.
6. Download Native App Codebase
Please visit the following link to download the codebase for the native app and place it in a folder in your local machine. https://github.com/BlueCloud-Inc/snowflake-geospatial-native-app
Explore the code base to understand the files, hierarchy and configurations.
There src directory is used to store all of our various source code including stored procedures, user defined functions (UDFs), our streamlit application, and even our installation script setup.sql.
│ .gitignore
│ LICENSE
│ local_test_env.yml
│ native app steps.pptx
│ prepare_data.sh
│ README.md
│ snowflake.yml
│
├───app
│ ├───data
│ │ order_data.csv
│ │ shipping_data.csv
│ │ site_recovery_data.csv
│ │
│ └───src
│ │ manifest.yml
│ │ setup.sql
│ │
│ └───libraries
│ environment.yml
│ procs.py
│ streamlit.py
│ streamlit_bkp.py
│ udf.py
│
├───output
│ └───deploy
│ │ manifest.yml
│ │ setup.sql
│ │
│ └───libraries
│ environment.yml
│ procs.py
│ streamlit.py
│ streamlit_bkp.py
│ udf.py
│
└───scripts
setup-package-script.sql
7. Finding Locations Furthest Away from Top Selling Center Point
Below query helps to identify locations that are furthest away from Top Selling center point.App consumer like Tasty Byte might want to take these locations off the schedule. Within the query we use native functions like TO_GEOGRAPHY, ST_MAKEPOINT, ST_COLLECT, ST_CENTROID, ST_DISTANCE . This SQL is embedded in the streamlit.py script.
WITH _CENTER_POINT AS (
WITH _top_10_locations AS (
SELECT TOP 10
o.location_id,
ST_MAKEPOINT(o.longitude, o.latitude) AS geo_point,
SUM(o.price) AS total_sales_usd
FROM ORDERS_V o
WHERE primary_city = '{input_city_selection}'
GROUP BY o.location_id, o.latitude, o.longitude
ORDER BY total_sales_usd DESC
)
SELECT
ST_COLLECT(tl.geo_point) AS collect_points,
ST_CENTROID(collect_points) AS geometric_center_point
FROM _top_10_locations tl
), _paris_locations AS (
SELECT DISTINCT
location_id,
location_name,
ST_MAKEPOINT(longitude, latitude) AS geo_point,
latitude,
longitude,
SUM(price) AS total_sales_usd,
ARRAY_SIZE(ARRAY_UNIQUE_AGG(customer_id)) AS customer_loyalty_visitor_count
FROM ORDERS_V
WHERE primary_city = '{input_city_selection}'
GROUP BY location_id, location_name, latitude, longitude
)
SELECT TOP {input_number_of_locations}
location_id,
location_name,
ROUND(ST_DISTANCE(geo_point, TO_GEOGRAPHY(_CENTER_POINT.geometric_center_point))/1000,2) AS kilometer_from_top_selling_center,
longitude,
latitude,
_CENTER_POINT.geometric_center_point,
total_sales_usd,
customer_loyalty_visitor_count
FROM _paris_locations,_CENTER_POINT
ORDER BY kilometer_from_top_selling_center DESC
8. Geospatial Analysis with H3 (Hexagonal Hierarchical Geospatial Indexing System)
Overview
H3 is a way of dividing the Earth's surface into hexagonal shapes, organizing them into levels of resolution, and assigning unique codes to each hexagon for easy reference.
This system was created by Uber for tasks like mapping, location-based services, and market analysis. A lower resolution corresponds to larger hexagons covering broader areas, while a higher resolution means smaller hexagons representing more specific locations.
Leveraging, Snowflake H3 functionality, below query helps to generate H3 Codes and finding top selling Hexagons. App consumer like Tasty Byte might want to focus their sales and marketing in these top selling Hexagons. Within the query we use native functions like H3_LATLNG_TO_CELL, H3_LATLNG_TO_CELL_STRING. This SQL is embedded in the streamlit.py script.
WITH _top_50_locations AS (
SELECT TOP {input_number_of_locations}
location_id,
ARRAY_SIZE(ARRAY_UNIQUE_AGG(customer_id)) AS customer_loyalty_visitor_count,
H3_LATLNG_TO_CELL(latitude, longitude, 7) AS h3_integer_resolution_6,
H3_LATLNG_TO_CELL_STRING(latitude, longitude, 7) AS h3_hex_resolution_6,
SUM(price) AS total_sales_usd
FROM orders_v
WHERE primary_city = '{input_city_selection}'
GROUP BY ALL
ORDER BY total_sales_usd DESC
)
SELECT
h3_hex_resolution_6,
COUNT(DISTINCT location_id) AS number_of_top_50_locations,
SUM(customer_loyalty_visitor_count) AS customer_loyalty_visitor_count,
SUM(total_sales_usd) AS total_sales_usd
FROM _top_50_locations
GROUP BY ALL
ORDER BY total_sales_usd DESC
9. Share the Provider POI Data
In order for the POI data to be available to the application consumer, as app provider we need to share it in the application package via reference usage.
The following steps are performed by the setup-package-script.sql file, which is automatically run whenever we deploy the application:
- Creates a schema in the application package that will be used for sharing the shipping data
- Create a view within that schema
- Grants usage on the schema to the application package
- Grants reference usage on the database holding the provider POI data to the application package
- Grants SELECT privileges on the view to the application package, meaning the app will be able to SELECT on the view once it is installed
-- ################################################################
-- Create SHARED_CONTENT_SCHEMA to share in the application package
-- ################################################################
USE {{ package_name }};
create schema if not exists shared_content_schema;
use schema shared_content_schema;
create or replace view FROSTBYTE_TB_SAFEGRAPH_S as select * from NATIVE_APP_QUICKSTART_DB.NATIVE_APP_QUICKSTART_SCHEMA.FROSTBYTE_TB_SAFEGRAPH_S;
grant usage on schema shared_content_schema to share in application package {{ package_name }};
grant reference_usage on database NATIVE_APP_QUICKSTART_DB to share in application package {{ package_name }};
grant select on view FROSTBYTE_TB_SAFEGRAPH_S to share in application package {{ package_name }}
10. Manifest.yml
The manifest.yml file is an important aspect of a Snowflake Native App. This file defines some metadata about the app, configuration options, and provides references to different artifacts of the application.
Let's take a look at the one provided in the GitHub repository:
#version identifier
manifest_version: 1
version:
name: V4
label: Version one
comment: The first version of the application
#artifacts that are distributed from this version of the package
artifacts:
setup_script: setup.sql
default_streamlit: app_instance_schema.streamlit
extension_code: true
#runtime configuration for this version
configuration:
log_level: debug
trace_level: off
references:
- order_table:
label: "Orders Table"
description: "Select table"
privileges:
- SELECT
object_type: Table
multi_valued: false
register_callback: app_instance_schema.update_reference
11. Installation Script
The installation script setup.sql defines all Snowflake objects used within the application. This script runs every time a user installs the application into their environment.
-- ==========================================
-- This script runs when the app is installed
-- ==========================================
-- Create Application Role and Schema
create application role if not exists app_instance_role;
create or alter versioned schema app_instance_schema;
-- Share data
create or replace view app_instance_schema.FROSTBYTE_TB_SAFEGRAPH_S as select * from shared_content_schema.FROSTBYTE_TB_SAFEGRAPH_S;
-- Create Streamlit app
create or replace streamlit app_instance_schema.streamlit from '/libraries' main_file='streamlit.py';
create or replace procedure app_instance_schema.update_reference(ref_name string, operation string, ref_or_alias string)
returns string
language sql
as $$
begin
case (operation)
when 'ADD' then
select system$set_reference(:ref_name, :ref_or_alias);
when 'REMOVE' then
select system$remove_reference(:ref_name, :ref_or_alias);
when 'CLEAR' then
select system$remove_all_references();
else
return 'Unknown operation: ' || operation;
end case;
return 'Success';
end;
$$;
-- Grant usage and permissions on objects
grant usage on schema app_instance_schema to application role app_instance_role;
grant SELECT on view app_instance_schema.FROSTBYTE_TB_SAFEGRAPH_S to application role app_instance_role;
grant usage on streamlit app_instance_schema.streamlit to application role app_instance_role;
grant usage on procedure app_instance_schema.update_reference(string, string, string) to application role app_instance_role;
12. Test the App
local_test_env.yml
This file is a conda environment file that contains the dependencies allowing you to run the tests. To activate the testing environment:
conda env update -f local_test_env.yml
conda activate geo-test
To deactivate it:
conda deactivate
13. Snowflake CLI
Snowflake CLI's open-source nature means that developers can leverage the community's collective knowledge and contributions to improve and enhance the tool. By using Snowflake CLI, developers can expect a streamlined, efficient experience that empowers them to work with Snowflake in new and innovative ways. Snowflake CLI is a powerful and flexible tool that helps developers streamline their workflow and optimize their Snowflake experience.
We will leverage Snowflake CLI to connect to the snowflake account and install the native application
>snow connection add
Name for this connection: <connection_name>
Snowflake account name: <account_name>
Snowflake username: <userid>
Snowflake password [optional]: <password.
Role for the connection [optional]:
Warehouse for the connection [optional]:
Database for the connection [optional]:
Schema for the connection [optional]:
Connection host [optional]:
Connection port [optional]:
Snowflake region [optional]:
Authentication method [optional]:
Path to private key file [optional]:
Path to token file [optional]:
>snow connection set-default <connection_name>
14. Install Application
With the environment created, we can now create the application package for the app. You'll run a command that creates this package and does a few key things:
- Creates the application package for the native app
- Marks that the native app makes use of an external database in the provider's account
- Creates views and grants the setup script access to the views
- For more details, see the comments in the snowflake.yml file.
Deploy the application package
Open a new terminal in the root of the repository and execute the following command:
snow app run
This command will upload source files, create the application package, and install the application object automatically. When you run it again, it will perform the minimum steps necessary to ensure the application is up-to-date with your local copy.
Snowflake CLI project is configured using snowflake.yml file.
15. Run the Streamlit Application
In Snowsight, navigate to Data products > Apps
Click on the app to launch it and give it a few seconds to warm up.
When running the app for the first time, you'll be prompted to do some first-time setup by granting the app access to certain tables (i.e., create object-level bindings). The bindings link references defined in the manifest file to corresponding objects in the Snowflake account. These bindings ensure that the application can run as intended.
Upon running the application, you will see this:
16. Conclusion & Next Steps
Congratulations, you have now developed Snowflake Native Application for Geospatial Analysis! As next steps and to learn more, checkout additional documentation at docs.snowflake.com and demos of other Snowflake Native Apps at developers.snowflake.com/solutions.
Additional resources
Snowflake Native App Developer ToolkitNative Apps Examples
What we've covered
Acquire data from marketplace Prepare data to be included in your application. Create an application package that contains the data and geospatial based business logic of your application. Share data with an application package. Add business logic to an application package. View and test the application in Snowsight.