IPZ_1/docs/07_conclusion.md
2026-05-17 17:17:04 +02:00


Conclusion

What Was Built

This project delivers a complete data warehousing pipeline from raw CSV files to an analytics-ready Data Mart:

| Deliverable | Details |
|---|---|
| OLTP schema | 14 tables, 3NF, MySQL 8.4 |
| Seed script | .NET 10 single-file C# script, loads ~700 rows across all tables |
| Docker setup | One-command start/stop for MySQL, Linux and Windows |
| Data Mart schema | 3 fact tables, 5 dimension tables, star schema, Oracle DDL |
| NiFi ETL | 8 pipelines, extract SQL + load SQL, documented step-by-step |
| Documentation | This docs folder |

Analytical Potential

The Data Mart enables a wide range of OLAP analyses. Below are the most interesting ones, mapped to the fact and dimension tables that support them.

Prize Money Distribution

"Where did the $100M go?"

Using FACT_TOURNAMENT sliced by DIM_GAME:

  • Total prize pool by game genre (MOBA, FPS, Battle Royale, etc.)
  • Average prize pool per participant by platform (PC vs Mobile)
  • Prize concentration — what percentage of total prize money went to the top 5 tournaments
  • Relationship between tournament duration and prize pool size
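The prize concentration metric is simple arithmetic, and a minimal Python sketch may make it concrete. The function name `top_n_share` and the sample prize pools are hypothetical placeholders, not values from the dataset; in practice the inputs would come from the prize pool measure in FACT_TOURNAMENT.

```python
def top_n_share(prize_pools, n=5):
    """Return the fraction of total prize money held by the n largest pools."""
    total = sum(prize_pools)
    if total == 0:
        return 0.0  # avoid dividing by zero when no prize money is recorded
    top = sorted(prize_pools, reverse=True)[:n]
    return sum(top) / total


# Hypothetical prize pools in USD (illustrative only, not EWC 2025 figures).
sample_pools = [500, 300, 100, 50, 30, 20]
share = top_n_share(sample_pools, n=2)
print(f"Top 2 share: {share:.1%}")
```

With fewer than `n` tournaments the slice simply takes everything, so the share is 1.0 rather than an error.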

Country and Regional Dominance

"Which part of the world ruled EWC 2025?"

Using FACT_MEDAL_AWARD sliced by DIM_COUNTRY and DIM_GAME:

  • Medal tally by country (gold/silver/bronze breakdown)
  • Medal points by region (Asia vs Europe vs North America vs Middle East)
  • Genre specialization — do Asian countries dominate MOBAs? Does Europe lead in FPS?
  • Countries with the most medals per player (efficiency metric: medal_count / total_players)
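The efficiency metric above could be sketched as follows. This is an illustration under assumptions: the row layout (dicts keyed by `country`, `medal_count`, `total_players`) and the helper names are hypothetical, and the sample rows are made up rather than taken from the Data Mart.

```python
def medals_per_player(medal_count, total_players):
    """Medals won per player; None when a country has no players."""
    if total_players == 0:
        return None
    return medal_count / total_players


def rank_by_efficiency(rows):
    """Return (country, ratio) pairs sorted from most to least efficient."""
    scored = []
    for row in rows:
        ratio = medals_per_player(row["medal_count"], row["total_players"])
        if ratio is not None:  # skip countries where the ratio is undefined
            scored.append((row["country"], ratio))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical rows (illustrative only).
sample_rows = [
    {"country": "A", "medal_count": 6, "total_players": 30},
    {"country": "B", "medal_count": 4, "total_players": 10},
    {"country": "C", "medal_count": 0, "total_players": 0},
]
for country, ratio in rank_by_efficiency(sample_rows):
    print(country, round(ratio, 2))
```

A ratio like this tends to favour countries with very few players, so a minimum player threshold may be worth considering before ranking.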

Club Championship Performance

"Which clubs built the best all-around teams?"

Using FACT_CLUB_STANDING sliced by DIM_ORGANIZATION:

  • Points vs prize money correlation — are points a good predictor of earnings?
  • Tournament breadth vs depth — do clubs that win fewer tournaments but make more top-8 finishes rank higher?
  • Performance by region — Middle Eastern clubs (Team Falcons, Twisted Minds) vs European clubs
  • Club partner ROI — do Current club partners finish higher than non-partners on average?
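One way to approach the points vs prize money question is a Pearson correlation coefficient over the club standings. The sketch below is a plain-Python illustration, not the project's method; `pearson_r` is a hypothetical helper and the sample values are invented. The real inputs would be total_points and prize_money_usd from FACT_CLUB_STANDING.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of co-deviations and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical club standings (illustrative only).
points = [1000, 800, 600, 400]
prize_money = [50, 45, 20, 10]
print(round(pearson_r(points, prize_money), 3))
```

A value near 1 would suggest points track earnings closely. With only a handful of clubs the coefficient is sensitive to single outliers, so it is best read alongside a scatter plot.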

Event Timeline Analysis

"How did the event unfold week by week?"

Using FACT_TOURNAMENT sliced by DIM_DATE:

  • Prize money at stake each week
  • Which weeks had the most high-value tournaments running simultaneously
  • Tournament density (how many events overlapped)
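Tournament density can be computed with a sweep over start and end dates. The following is a minimal sketch assuming each tournament is an inclusive (start, end) date pair; `max_concurrent` is a hypothetical helper and the dates are invented for illustration, not taken from the EWC 2025 schedule.

```python
from datetime import date


def max_concurrent(tournaments):
    """Return the highest number of tournaments running on the same day.

    Each tournament is a (start_date, end_date) pair, both inclusive.
    """
    events = []
    for start, end in tournaments:
        events.append((start.toordinal(), 1))      # tournament opens
        events.append((end.toordinal() + 1, -1))   # closes the day after its last day
    events.sort()  # on the same day, closings (-1) sort before openings (+1)
    running = peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak


# Hypothetical schedule (illustrative only).
schedule = [
    (date(2025, 7, 8), date(2025, 7, 13)),
    (date(2025, 7, 10), date(2025, 7, 12)),
    (date(2025, 7, 12), date(2025, 7, 15)),
]
print(max_concurrent(schedule))
```

Because end dates are treated as inclusive, a tournament that ends on the day another begins counts as overlapping, while back-to-back tournaments on consecutive days do not.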

Suggested Power BI Reports

Report 1 — Prize & Tournament Analysis

A financial overview dashboard with:

  • KPI cards: Total prize pool, number of tournaments, average prize per tournament
  • Bar chart: Prize pool by game_type (filter: platform)
  • Treemap: Prize pool breakdown by individual game
  • Line chart: Cumulative prize money awarded by week (using DIM_DATE.week_number)
  • Scatter plot: Prize pool vs num_participants (are bigger tournaments better funded?)

Slicers: Platform, Gender, Club Championship Points (Yes/No)
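The cumulative line chart in this report rests on a running total over weeks. In Power BI this would normally be a measure; the Python sketch below only illustrates the arithmetic. The function name `cumulative_by_week` and the weekly amounts are hypothetical, and the week keys are assumed to correspond to DIM_DATE.week_number.

```python
from itertools import accumulate


def cumulative_by_week(weekly_prize):
    """Map {week_number: prize awarded that week} to ordered (week, running total) pairs."""
    weeks = sorted(weekly_prize)
    totals = accumulate(weekly_prize[week] for week in weeks)
    return list(zip(weeks, totals))


# Hypothetical weekly prize totals in USD (illustrative only).
sample_weeks = {3: 20, 1: 10, 2: 5}
for week, running_total in cumulative_by_week(sample_weeks):
    print(week, running_total)
```

Sorting the week numbers first keeps the running total correct even when the input arrives out of order.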

Report 2 — Performance & Medal Analysis

A competitive performance dashboard with:

  • Map visual: Medal points by country (filled map using country name)
  • Stacked bar chart: Gold/Silver/Bronze medals by region
  • Matrix: Organizations × Game genres (medal_count as values — shows which orgs are specialists vs all-rounders)
  • Bar chart: Top 10 organizations by total medal_points
  • Table: Club Championship standings with conditional formatting on total_points and prize_money_usd

Slicers: Region, Game Type, Medal Type


Limitations

Dataset size: With only 27 tournaments and 257 medalists, the dataset is small by data warehousing standards. The analyses remain valid, but a production data mart would hold years of historical data to support trend analysis.

Player earnings: The prize_earned_usd column in the player roster is zero for all players. Individual prize splits were not publicly available when the dataset was compiled, so per-player financial analysis is not possible.

Individual vs team events: Games like Chess, StarCraft II, and the fighting games are individual competitions. Their medal and match data is structured the same way as for team events, but the "organization" in those cases is the player's sponsoring team rather than a competing unit. Power BI visuals should label this distinction clearly.

Static snapshot: This is a point-in-time dataset for EWC 2025 that reflects the final state of the event. The Data Mart has no slowly changing dimension (SCD) logic or historical tracking.