
DI basics for Analytics teams


Impact_Analytics

Module 1: Introduction to Data Ingestion - Duration: 1/2 Hour

Chapter 1: Pipeline concepts

          1. Overview of the options clients use to share data: GCS, SFTP, MySQL, Snowflake, etc.
          2. Time-based and event-based triggers for data ingestion
          3. Warning and failure alerts
          4. Understanding Airflow
          5. Incremental vs. full-replace

Module 2: Prerequisites for Implementation - Duration: 1/2 Hour

Chapter 1: Infrastructure set-up

          1. Configuring the VM
          2. Configuring BigQuery (BQ) and Postgres
          3. Configuring Airflow
          4. Setting up the Self-help UI

Module 3: Sourcing Process - Duration: 1 Hour

Chapter 1: Environment Configuration and .env Files

          1. Creating and configuring .env files for sourcing
          2. Explanation of necessary environment variables for sourcing

Chapter 2: Extraction & Intermediate Process

          1. One-time historical data transfer
          2. Periodic data transfer
          3. Understanding the mapping table for sourcing: creating and managing it
          4. Understanding the raw tables
          5. Understanding the sourcing logic: periodic vs. historical
          6. Creation, validation, and sign-off of sourcing queries
          7. Understanding the intermediate tables

Chapter 3: Common Issues and FAQs

          1. Addressing common problems and solutions
          2. Frequently asked questions about the sourcing process

Module 4: Data Ingestion Process - Master Tables - Duration: 1 Hour

Chapter 1: Environment Configuration and .env Files

          1. Creating and configuring .env files for ingestion
          2. Explanation of necessary environment variables for ingestion

Chapter 2: Ingestion Process

          1. Understanding the generic master mapping table
          2. Understanding the generic schema mapping table
          3. Understanding the concept of the QC bot
          4. Understanding product attributes and hierarchy
          5. Understanding store attributes and hierarchy

Chapter 3: Common Issues and FAQs

          1. Addressing common problems and solutions
          2. Frequently asked questions about the master tables in the ingestion process

Module 5: Data Ingestion Process - Derived Tables - Duration: 1 Hour

Chapter 1: Derived Tables Process

          1. Understanding the derived table mapping
          2. Understanding the stored procedures
          3. Understanding the derived table queries JSON
          4. Understanding the derived table Python scripts
          5. Understanding the concept of the QC bot

Chapter 2: Common Issues and FAQs

          1. Addressing common problems and solutions
          2. Frequently asked questions about the derived tables in the ingestion process

Module 6: Data Ingestion Process - Modeling Tables - Duration: 1 Hour

Chapter 1: ADA Modeling Tables Process

          1. Understanding the use of the FMD table: feature metadata and ADA queries
          2. Understanding the ADA master tables
          3. Understanding the master input imputed table
          4. Understanding the store clustering and lost sales imputation tables
          5. Understanding the final modeling table (FMT)
          6. Understanding the concept of the QC bot

Chapter 2: Common Issues and FAQs

          1. Addressing common problems and solutions
          2. Frequently asked questions about the ADA modeling tables in the ingestion process