Files
IPZ_1/docs/01-overview.md
2026-05-17 21:27:42 +02:00

4.9 KiB
Raw Permalink Blame History

Hotel Reservations — Data Warehouse Project

Project Summary

This project implements a complete Data Warehousing pipeline for a hotel reservation system, covering all standard DW layers:

MySQL OLTP  ──►  Apache NiFi ETL  ──►  Oracle Data Mart  ──►  Power BI Reports
(source)          (transform)           (analytical store)      (OLAP queries)

The system is built around the A.24 Hotel Reservations domain from the course specification. The OLTP database was populated with ~635,000 synthetically generated rows covering 200 hotels, 100,000 guests, 500,000 bookings, and 531,000 room bookings across a 4-year period (20222025).


Business Context

A hotel chain needs to answer questions like:

  • Which countries generate the most revenue per quarter?
  • How does occupancy differ between peak and off-peak seasons?
  • What is the revenue contribution of 5-star vs 3-star hotels?
  • How has a hotel's revenue changed after upgrading its star rating?

These questions require historical, multi-dimensional analysis that a normalized OLTP database cannot serve efficiently. The data mart provides pre-modelled, denormalized data optimized for analytical queries.


Architecture

┌─────────────────────────────────────────────────────────┐
│  SOURCE LAYER                                           │
│  MySQL 8.4  (Docker/Podman, port 13306)                 │
│  Database: hotel_reservations                           │
│  13 normalized tables, ~635K rows                       │
└───────────────────────┬─────────────────────────────────┘
                        │ JDBC (MySqlConnector)
                        ▼
┌─────────────────────────────────────────────────────────┐
│  ETL LAYER                                              │
│  Apache NiFi                                            │
│  5 Process Groups: Date Dim / Static Dims /             │
│  SCD2 Hotel / SCD1 Guest / Incremental Fact             │
└───────────────────────┬─────────────────────────────────┘
                        │ JDBC (Oracle JDBC)
                        ▼
┌─────────────────────────────────────────────────────────┐
│  DATA MART LAYER                                        │
│  Oracle (university lab schema)                         │
│  Star schema: 6 dimensions + 1 fact table               │
│  SCD Type 2 on DIM_HOTEL                                │
└───────────────────────┬─────────────────────────────────┘
                        │
                        ▼
┌─────────────────────────────────────────────────────────┐
│  PRESENTATION LAYER                                     │
│  Power BI Desktop                                       │
│  OLAP reports via DirectQuery / Import                  │
└─────────────────────────────────────────────────────────┘

Technology Stack

Component Technology Version
OLTP Database MySQL 8.4
Container runtime Docker / Podman
Data generator C# (.NET) 10
ETL tool Apache NiFi
Data Mart Oracle RDBMS university lab
Reporting Power BI Desktop

Repository Structure

IPZ_1/
├── docker/
│   ├── start.sh          # Start MySQL container (Linux/macOS)
│   ├── stop.sh           # Stop MySQL container
│   ├── start.ps1         # Start MySQL container (Windows)
│   └── stop.ps1          # Stop MySQL container
├── sql/
│   ├── schema.sql        # MySQL OLTP DDL
│   └── datamart_schema.sql  # Oracle Data Mart DDL
├── generator/
│   └── generate.cs       # .NET 10 data generator script
└── docs/
    ├── 01-overview.md    # This file
    ├── 02-oltp.md        # OLTP database design
    ├── 03-datamart.md    # Data mart design
    ├── 04-setup.md       # Setup and run guide
    └── nifi-flow.md      # NiFi ETL flow reference