# Data Mart

## What a Data Mart Is

A Data Mart is a database optimized for reading and analysis rather than for recording transactions. While the OLTP schema is normalized to avoid redundancy, the Data Mart is deliberately denormalized into a **star schema** (a central fact table surrounded by dimension tables) so that analytical queries are fast and simple to write.

In a star schema:

- **Fact tables** hold measurable events with numeric metrics (prize money, medal count, points)
- **Dimension tables** hold descriptive context that you slice and filter by (game type, country, organization region)

The Data Mart is stored in the **Oracle university lab schema** and populated by Apache NiFi reading from the MySQL OLTP. The DDL is in `sql/datamart_schema.sql`.

---

## Dimensions

### DIM_DATE

A standard calendar dimension covering every date in the EWC 2025 event window (July 8 – August 24, 2025). Using a dedicated date dimension allows Power BI to filter by week, group by month, or compare by quarter with no extra calculation.

| Column | Example |
|---|---|
| date_key | 20250708 (YYYYMMDD integer) |
| full_date | 2025-07-08 |
| year | 2025 |
| quarter | 3 |
| month / month_name | 7 / July |
| week_number | 28 |
| day_of_month / day_name | 8 / Tuesday |

---

### DIM_GAME

Describes each of the 25 game titles. Enables slicing facts by genre (MOBA vs FPS vs Battle Royale) and by platform (PC vs Mobile vs Console).

| Column | Example |
|---|---|
| name | Counter-Strike 2 |
| game_type | FPS |
| platform | PC |

---

### DIM_COUNTRY

Countries with their geographic region. Intentionally kept lean: the medal counts that live in the OLTP `country` table are not carried into this dimension because they are derived facts, not descriptive attributes.

| Column | Example |
|---|---|
| name | South Korea |
| region | Asia |

---

### DIM_ORGANIZATION

All esports clubs and teams.
Includes partner metadata to enable analysis by partner tier (Current partner vs non-partner) and social reach.

| Column | Example |
|---|---|
| name | Team Falcons |
| region | Middle East |
| country | Saudi Arabia |
| club_partner_status | Current |
| founded_year | 2017 |
| social_media_followers_m | 4.0 |

---

### DIM_MEDAL

A simple three-row table representing the medal types. Includes `medal_rank` (1/2/3) so reports can sort Gold → Silver → Bronze correctly without relying on alphabetical ordering.

| medal_type | medal_rank |
|---|---|
| Gold | 1 |
| Silver | 2 |
| Bronze | 3 |

---

## Fact Tables

### FACT_TOURNAMENT

**Grain:** one row per tournament (27 rows).

This is the primary financial fact table. It answers questions about prize money distribution across games, genres, platforms, and time.

| Column | Type | Description |
|---|---|---|
| game_key | FK → DIM_GAME | What game |
| start_date_key | FK → DIM_DATE | When it started |
| end_date_key | FK → DIM_DATE | When it ended |
| winner_org_key | FK → DIM_ORGANIZATION | Winning organization (NULL for individual-winner events) |
| event_name | text | Degenerate dimension |
| gender | text | Open / Men / Women |
| **prize_pool_usd** | measure | Total prize pool in USD |
| **num_participants** | measure | Number of competing teams/players |
| **duration_days** | measure | Tournament length in days |
| **has_club_points** | measure | 1 if tournament awarded Club Championship points |

**Example questions this enables:**

- What was the total prize money awarded to MOBA tournaments vs FPS tournaments?
- Which platform (PC or Mobile) had higher average prize pools?
- How did prize pools vary week by week across the roughly seven-week event?

---

### FACT_MEDAL_AWARD

**Grain:** one row per player-medal (257 rows).

This fact table captures individual competitive performance. Each medalist player contributes one row with a `medal_count` of 1 and a `medal_points` of 3/2/1.
Both columns are additive, so you can SUM them freely to get team medal totals, country medal totals, etc.

| Column | Type | Description |
|---|---|---|
| game_key | FK → DIM_GAME | Game the medal was won in |
| medal_key | FK → DIM_MEDAL | Gold / Silver / Bronze |
| country_key | FK → DIM_COUNTRY | Player's nationality |
| org_key | FK → DIM_ORGANIZATION | Player's team |
| date_key | FK → DIM_DATE | Tournament start date |
| player_name | text | Degenerate dimension |
| **medal_count** | measure | Always 1; additive for totals |
| **medal_points** | measure | Gold=3, Silver=2, Bronze=1 |

**Example questions this enables:**

- Which country won the most medals overall? By region?
- Which game genre produced the most medals for Asian countries?
- Which organization accumulated the most medal points across all events?
- Did South Korea dominate PC games while Southeast Asia dominated mobile games?

---

### FACT_CLUB_STANDING

**Grain:** one row per club in the Club Championship (24 rows).

This is a snapshot: it represents the final standings at the end of EWC 2025.

| Column | Type | Description |
|---|---|---|
| org_key | FK → DIM_ORGANIZATION | The club |
| **final_rank** | measure | Final position (1 = best) |
| **total_points** | measure | Total Club Championship points earned |
| **prize_money_usd** | measure | Prize money from Club Championship |
| **tournament_wins** | measure | Number of tournaments the club won |
| **top_8_finishes** | measure | Total top-8 tournament finishes |
| **eligible_to_win** | measure | 1 if the club was eligible for the grand prize |

**Example questions this enables:**

- How does prize money correlate with tournament wins vs breadth of top-8 finishes?
- Do Middle Eastern clubs outperform European clubs in the Club Championship?
- What is the average total_points for Current club partners vs non-partners?
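
As a rough sketch of the last question above, the snippet below joins `FACT_CLUB_STANDING` to `DIM_ORGANIZATION` and averages `total_points` per partner tier. SQLite (through Python's built-in `sqlite3`) stands in for the Oracle schema so the example is self-contained. The column types, surrogate key values, club names, point totals, and the choice to store non-partners as NULL are all assumptions made for illustration; none of the rows are EWC 2025 results.

```python
import sqlite3

# SQLite is a stand-in for the Oracle lab schema. Table and column names
# follow the tables above; types, keys, and every row are placeholders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_organization (
    org_key             INTEGER PRIMARY KEY,
    name                TEXT,
    club_partner_status TEXT      -- assumption: NULL means non-partner
);
CREATE TABLE fact_club_standing (
    org_key         INTEGER REFERENCES dim_organization(org_key),
    final_rank      INTEGER,
    total_points    INTEGER,
    tournament_wins INTEGER
);
INSERT INTO dim_organization VALUES
    (1, 'Club A', 'Current'),
    (2, 'Club B', 'Current'),
    (3, 'Club C', NULL);
INSERT INTO fact_club_standing VALUES
    (1, 1, 1000, 2),
    (2, 2,  600, 1),
    (3, 3,  500, 0);
""")

# Any status other than 'Current' (including NULL) falls into 'Non-partner'.
rows = conn.execute("""
    SELECT CASE WHEN o.club_partner_status = 'Current'
                THEN 'Current' ELSE 'Non-partner' END AS partner_status,
           COUNT(*)            AS clubs,
           AVG(f.total_points) AS avg_points
    FROM fact_club_standing f
    JOIN dim_organization o ON o.org_key = f.org_key
    GROUP BY CASE WHEN o.club_partner_status = 'Current'
                  THEN 'Current' ELSE 'Non-partner' END
    ORDER BY partner_status
""").fetchall()

for status, clubs, avg_points in rows:
    print(status, clubs, avg_points)
```

Because the snapshot has one row per club, a plain join to the dimension does not duplicate any fact rows, so `AVG` and `COUNT` can be applied directly.
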

---

## Star Schema Diagram

```
                        DIM_DATE
                      ┌──────────┐
                      │ date_key │
                      └────┬─────┘
                           │ start/end
                           │
DIM_GAME ────────── FACT_TOURNAMENT ────────── DIM_ORGANIZATION
(game_key)          (prize_pool_usd            (org_key)
                     num_participants
                     duration_days
                     has_club_points)

DIM_COUNTRY      ──┐
DIM_ORGANIZATION ──┤
DIM_MEDAL        ──┼── FACT_MEDAL_AWARD ──── DIM_GAME
DIM_DATE         ──┘   (medal_count          (game_key)
                        medal_points)

DIM_ORGANIZATION ── FACT_CLUB_STANDING
                    (total_points
                     prize_money_usd
                     tournament_wins
                     top_8_finishes)
```

---

## Why Three Fact Tables

A single fact table would require choosing one grain, which would make some analyses awkward or impossible.

- `FACT_TOURNAMENT` is at tournament grain, so you cannot get per-player medal counts from it.
- `FACT_MEDAL_AWARD` is at player-medal grain, so you cannot get prize pool totals from it without denormalizing tournament data into it.
- `FACT_CLUB_STANDING` captures a snapshot that has no natural place in the other two tables.

Keeping them separate means each fact table has a clean, single grain. Power BI can build relationships between them through the shared dimensions.
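
One way to picture combining fact tables through a shared dimension is to aggregate each fact to the dimension's grain first and only then join the results. The sketch below does this for `FACT_MEDAL_AWARD` and `FACT_CLUB_STANDING` via `DIM_ORGANIZATION`, and contrasts it with a direct fact-to-fact join that repeats snapshot rows. It again uses SQLite through Python's `sqlite3` as a stand-in for Oracle; the trimmed column lists, key values, names, and all numbers are hypothetical placeholders, not EWC 2025 data.

```python
import sqlite3

# SQLite stand-in with only the columns needed here; all rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_organization (org_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_medal_award (
    org_key      INTEGER,
    player_name  TEXT,
    medal_count  INTEGER,
    medal_points INTEGER
);
CREATE TABLE fact_club_standing (org_key INTEGER, total_points INTEGER);
INSERT INTO dim_organization VALUES (1, 'Club A'), (2, 'Club B');
INSERT INTO fact_medal_award VALUES
    (1, 'Player 1', 1, 3),
    (1, 'Player 2', 1, 3),
    (2, 'Player 3', 1, 2);
INSERT INTO fact_club_standing VALUES (1, 1000), (2, 600);
""")

# Roll the player-medal fact up to organization grain, then join it to the
# club snapshot through the shared dimension.
drill_across = conn.execute("""
    WITH medals AS (
        SELECT org_key,
               SUM(medal_count)  AS medals,
               SUM(medal_points) AS medal_points
        FROM fact_medal_award
        GROUP BY org_key
    )
    SELECT o.name,
           s.total_points,
           COALESCE(m.medals, 0),
           COALESCE(m.medal_points, 0)
    FROM dim_organization o
    JOIN fact_club_standing s ON s.org_key = o.org_key
    LEFT JOIN medals m ON m.org_key = o.org_key
    ORDER BY o.name
""").fetchall()

# Joining the two facts directly repeats each club row once per medal row,
# which inflates the sum of total_points.
naive_total = conn.execute("""
    SELECT SUM(s.total_points)
    FROM fact_club_standing s
    JOIN fact_medal_award m ON m.org_key = s.org_key
""").fetchone()[0]

true_total = conn.execute(
    "SELECT SUM(total_points) FROM fact_club_standing"
).fetchone()[0]

print(drill_across)
print(naive_total, true_total)
```

With these placeholder rows, the direct join counts Club A's points twice because the club has two medal rows, while the aggregate-then-join version keeps one row per club.
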