Skip to content

denisecase/smart-sales-olap

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

smart-sales-olap

This project illustrates creating a multidemensional data store from which we can query to illustrate the concept of dimensions and metrics.

Cubing concepts (such as slicing, dicing, and drilldowns) are still widely used, although pre-computation of cubes may not be required anymore. Snowflake, Power Bi, Tableau and more can compute as needed using the most up-to-date information sources.

IMPORTANT: Align OLAP Scripts with Your DW Schema

Ensure that the OLAP scripts you run are compatible with the schema of your data warehouse. This example assumes a specific schema (detailed below).

Data Warehouse Schema and Example Data

Dimension Table: customer

  • Contains information about customers.
  • Columns:
    • customer_id: Unique identifier for each customer.
    • name: Name of the customer.
    • region: Customer's region (e.g., North, East, West, South).
    • join_date: Date the customer joined.

Example Rows:

customer_id,name,region,join_date
1001,William White,East,2021-11-11
1002,Wylie Coyote,East,2023-02-14
1003,Dan Brown,West,2023-10-19

Dimension Table: product

  • Contains information about products sold.
  • Columns:
    • product_id: Unique identifier for each product.
    • name: Name of the product.
    • category: Product category (e.g., Electronics, Clothing).
    • unit_price_usd: Price of a single unit (in USD).

Example Rows:

product_id,name,category,unit_price_usd
101,laptop,Electronics,793.12
102,hoodie,Clothing,39.10
103,cable,Electronics,22.76

Fact Table: sale

  • Contains transactional data for each sale.
  • Columns:
    • sale_id: Unique identifier for each transaction.
    • customer_id: ID of the customer who made the purchase.
    • product_id: ID of the product sold.
    • store_id: ID of the store (additional data example)
    • campaign_id: ID of the active marketing campaign (addl data)
    • sale_date: Date of the sale.
    • sale_amount_usd: Total revenue generated by the transaction (in USD).

Example Rows:

sale_id,customer_id,product_id,sale_date,sale_amount_usd
550,1001,101,2024-01-06,6344.96
551,1002,102,2024-01-06,312.80
552,1003,103,2024-01-16,431.00

Output: Multidimensional Table (CSV file)

This example outputs a multidimensional data set with the following column names (yours will differ).

DayOfWeek,product_id,customer_id,sale_amount_usd_sum,sale_amount_usd_mean,sale_id_count,sale_ids
Friday,101,1001,6344.96,6344.96,1,[582]
Friday,102,1009,312.8,312.8,1,[583]
Friday,104,1008,431.0,431.0,1,[593]

Manage Local .venv

In a VS Code terminal:

  1. Create & activate .venv
  2. Install dependencies with requirements.txt

Mac/Linux

python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install --upgrade -r requirements.txt

Windows

py -m venv .venv
.\.venv\Scripts\activate
py -m pip install --upgrade -r requirements.txt

Run Scripts

olap_cubing.py

This script:

  • Connects to the DW.
  • Aggregates data into an OLAP cube based on specified dimensions and metrics.
  • Saves the multidimensional dataset (cube) to an intermediate CSV file.

olap_goal_sales_by_day.py

This script:

  • Loads the precomputed OLAP cube.
  • Analyzes sales data to identify patterns, such as total sales by day of the week.
  • Outputs actionable insights and visualizations.

In a VS Code terminal:

  1. Activate .venv
  2. Run the script

Mac/Linux

source .venv/bin/activate
python3 scripts/olap_cubing.py
python3 scripts/olap_goal_sales_by_day.py

Windows

.\.venv\Scripts\activate
py scripts\olap_cubing.py
py scripts\olap_goal_sales_by_day.py

Goal

Goal: Sales - Low Revenue - by Day of Week

Results

Which day of the week has the least sales: Friday

Sales by Day of Week

Goal

Goal: Sales by Product - by Day of Week

Sales by Product by Day of Week

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages