Data Ingestion (DI) Basics for Analytics Teams
Impact_Analytics
Module 1: Introduction to Data Ingestion - Duration: 1/2 Hour
Chapter 1: Pipeline concepts
- Overview of data-sharing options for clients: GCS, SFTP, MySQL, Snowflake, etc.
- Time-based and event-based triggers for data ingestion
- Warning and failure alerts
- Understanding Airflow
- Incremental vs Full-Replace
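
To make the incremental vs full-replace distinction concrete, here is a minimal in-memory sketch. It is not the course's pipeline code: the function names, the row shape, and the choice to treat "incremental" as an upsert by key (rather than a pure append) are all assumptions made for illustration.

```python
from typing import Dict, List

Row = Dict[str, object]


def full_replace(target: List[Row], batch: List[Row]) -> List[Row]:
    """Full-replace load: the old target is discarded, only the batch remains."""
    return list(batch)


def incremental(target: List[Row], batch: List[Row], key: str) -> List[Row]:
    """Incremental load (assumed here to be an upsert by key):
    rows with a known key are overwritten, new keys are added."""
    merged = {row[key]: row for row in target}
    for row in batch:
        merged[row[key]] = row
    return list(merged.values())


# Hypothetical data: 'id' and 'qty' are illustrative column names.
existing = [{"id": 1, "qty": 5}, {"id": 2, "qty": 7}]
new_batch = [{"id": 2, "qty": 9}, {"id": 3, "qty": 4}]

replaced = full_replace(existing, new_batch)       # id 1 is gone
merged = incremental(existing, new_batch, "id")    # id 1 kept, id 2 updated
```

With full-replace, rows missing from the latest batch disappear from the target; with the incremental variant they are kept, which is why the choice usually depends on whether the client sends complete snapshots or only changes.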
Module 2: Prerequisites for Implementation - Duration: 1/2 Hour
Chapter 1: Infrastructure set-up
- Configuring the VM
- Configuring BigQuery (BQ) and Postgres
- Configuring Airflow
- Setting up the Self-help UI
Module 3: Sourcing Process - Duration: 1 Hour
Chapter 1: Environment Configuration and .env Files
- Creating and configuring .env files for sourcing
- Explanation of necessary environment variables for sourcing
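
As a rough illustration of what checking a .env file involves, the sketch below parses KEY=VALUE lines and reports required variables that are absent or empty. The variable names (SOURCE_TYPE, SOURCE_BUCKET, TARGET_DATASET, SCHEDULE) are hypothetical placeholders, not the actual variables used by the sourcing process, and the parser handles only the simplest .env syntax.

```python
from typing import Dict, List


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and '#' comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str], required: List[str]) -> List[str]:
    """Return required keys that are absent or set to an empty string."""
    return [key for key in required if not values.get(key)]


# Hypothetical sample file contents; none of these names come from the course.
SAMPLE = """
# sourcing settings (illustrative only)
SOURCE_TYPE=gcs
SOURCE_BUCKET="client-landing"
TARGET_DATASET=
"""

config = parse_env(SAMPLE)
missing = missing_keys(
    config, ["SOURCE_TYPE", "SOURCE_BUCKET", "TARGET_DATASET", "SCHEDULE"]
)
```

Running a check like this before a pipeline starts turns a missing variable into an immediate, readable error instead of a failure partway through a run.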
Chapter 2: Extraction & Intermediate Process
- One-time historical data transfer
- Periodic data transfer
- Understanding the sourcing mapping table: creating and managing it
- Understanding the raw tables
- Understanding the sourcing logic: periodic vs historical
- Creation, validation, and sign-off of sourcing queries
- Understanding the intermediate tables
Chapter 3: Common Issues and FAQs
- Addressing common problems and solutions
- Frequently asked questions about the sourcing process
Module 4: Data Ingestion Process - Master Tables - Duration: 1 Hour
Chapter 1: Environment Configuration and .env Files
- Creating and configuring .env files for ingestion
- Explanation of necessary environment variables for ingestion
Chapter 2: Ingestion Process
- Understanding the generic master mapping table
- Understanding the generic schema mapping table
- Understanding the concept of the QC bot
- Understanding product attributes and hierarchy
- Understanding store attributes and hierarchy
Chapter 3: Common Issues and FAQs
- Addressing common problems and solutions
- Frequently asked questions about the master tables in the ingestion process
Module 5: Data Ingestion Process - Derived Tables - Duration: 1 Hour
Chapter 1: Derived Tables Process
- Understanding the derived table mapping
- Understanding the stored procedures
- Understanding the derived table queries JSON
- Understanding the derived table Python scripts
- Understanding the concept of the QC bot
Chapter 2: Common Issues and FAQs
- Addressing common problems and solutions
- Frequently asked questions about the derived tables in the ingestion process
Module 6: Data Ingestion Process - Modeling Tables - Duration: 1 Hour
Chapter 1: ADA Modeling Tables Process
- Understanding the use of the FMD table: feature metadata and ADA queries
- Understanding the ADA master tables
- Understanding the master input imputed table
- Understanding the store clustering and lost sales imputation tables
- Understanding the final modeling table (FMT)
- Understanding the concept of the QC bot
Chapter 2: Common Issues and FAQs
- Addressing common problems and solutions
- Frequently asked questions about the ADA modeling tables in the ingestion process