Simple to get started
Free Starter Package
You don’t have to license expensive software to get started. You can download and start working with the SQL statements that we offer and can build your companies’ identity graph today. This way you will be able to identity data foundation issues before you license expensive software. Implementing MVP Use-Cases yourself will also help you to show the business value to your organization and ensure future investments to improve customer experience.
The Starter Package is helping you to get started, as simple as it can be. The SQL statements that are provided are helping you to execute the five steps below and show the basic principles of the ID modelling workflow.
Short Description
Starter Package
Included Components
SQL Statements: Create Input Data
SQL Statements: Create List of Nodes & Edges
SQL Statements: Create Dictionary
SQL Statements: Integer Encoding
SQL Statements: Edges Main Table
SQL Statements: Complex Deduplication
SQL Statement: Create Nested Results Table
Missing Components
SQL Statement: Create Flat Results Table
SQL Statements: Create separate CRM & Cookie Graph
SQL Statements: CRM Profile & Device Connections
SQL Statements: Data Quality Reporting
SQL Statements: User ID Profile Reporting
Necessary skills
The overview of the contents
Contents of the Starter Package
The Identity Graph Starter Package provides SQL scripts for creating a demo Identity Graph in BigQuery. It guides users through initializing tables, loading demo data, and cleaning inputs to support CRM data modeling. The process culminates with graph modeling, anomaly detection, deduplication, and generating a final nested results table for comprehensive ID analysis.
Step 1 – Initialize Environment
Step 1.1 – Update Column Descriptions
- Create or replace the stored procedure to update column descriptions dynamically for any specified table
- The procedure attempts to update the column description
Step 1.2 – Initialize List of CRM Edges
- Create or replace the list_of_crm_edges table
- Update the column descriptions for the list_of_crm_edges table
Step 1.3 – Initialize ID Dictionary
- Create or replace the id_dictionary table
- Update the column descriptions for the id_dictionary table
Step 1.4 – Initialize Integer List of CRM Edges
- Create or replace the list_of_int_crm_edges table
- Update the column descriptions for the list_of_int_crm_edges table
Step 1.5 – Initialize Edges Main Table CRM
- Create or replace the edges_main_table_crm table
- Update the column descriptions for the edges_main_table_crm table
Step 1.6 – Initialize Iteration Log
- Create or replace the iteration_log_crm table
- Update the column descriptions for the iteration_log_crm table
Step 2 – Load Demo Data
Step 2.1 – Load CRM Edges
- Insert demo data into list_of_crm_edges
Step 3 – Clean Input Data
Step 3.1 – Clean Input Data
- Update ID type descriptions in list_of_crm_edges
- Replace whitespace with underscore (_) and convert uppercase letters to lowercase in type_1 and type_2
Step 4 – Graph Modelling
Step 4.1 – Update ID Dictionary
- Merge new IDs in the list_of_crm_edges into the id_dictionary
Step 4.2 – Integer Encoding of IDs
- Use the id_dictionary to integer encode the list_of_crm_edges
Step 4.3 – Set Edges Table
- Merge new edges from the list_of_int_crm_edges into the edges_main_table_crm
Step 4.4 – Detect Anomalies
- Update the count of edges that include the more frequently observed ID of the edge
- Update the z_score values for each edge and mark edges as “Normal” or “Outlier”
- Reset the group_id to remove the edge that was identified as “Outlier” from modelling
Step 4.5 – Update Edges Table
- Initialize the group ID for edges without a group_id that are marked as “Normal”
- Duplicate edges and switch int_1 and int_2 as final data preparation before the Complex Deduplication
Step 4.6 – Complex Deduplication
- Run the Complex Deduplication in iterations until no more group_id updates can be found
Step 5 – Save Results
Step 5.1 – Save Nested Results Table
- Define a list of available ID types logged in the id_dictionary
- Define the dynamic SQL to iterate each ID type logged in the id_dictionary
- Execute the dynamic SQL to build the final nested results table that includes all relevant ID types
Reviews
There are no reviews yet.