This commit is contained in:
2026-05-17 21:27:42 +02:00
parent 718407d709
commit 571c749e25
5 changed files with 875 additions and 0 deletions

258
docs/02-oltp.md Normal file
View File

@@ -0,0 +1,258 @@
# OLTP Database — Design & Details
## Overview
The OLTP (Online Transaction Processing) database models a **hotel reservation system** using a fully normalized relational schema in **MySQL 8.4**. It follows 3NF and enforces referential integrity via foreign keys.
- **Database:** `hotel_reservations`
- **Character set:** `utf8mb4` / `utf8mb4_unicode_ci`
- **Tables:** 13
- **Total rows:** ~635,000
---
## Entity-Relationship Model
The schema covers five entity groups:
```
hotel_chain ──┐
country ───────┼──► hotel ──► hotel_room ──► room_booking ──► booking ──► guest
star_rating ──┘ │
└──► country
hotel_characteristic ◄──► hotel (M:N via hotel_hotel_characteristic)
room_type ◄──── hotel_room
room_type ◄──┐
rate_period ◄─┴── period_room_rate (price per room type per season)
```
---
## Table Descriptions
### Reference / Lookup Tables
#### `hotel_chain`
International hotel chains (Hilton, Marriott, Accor, etc.).
| Column | Type | Description |
|--------|------|-------------|
| `hotel_chain_id` | INT UNSIGNED PK | Surrogate key |
| `code` | VARCHAR(10) UNIQUE | Short code (e.g. `HLT`) |
| `name` | VARCHAR(100) | Full name |
**Rows:** 10
---
#### `country`
Countries from which guests come and where hotels are located.
| Column | Type | Description |
|--------|------|-------------|
| `country_id` | INT UNSIGNED PK | Surrogate key |
| `code` | CHAR(2) UNIQUE | ISO 3166-1 alpha-2 (e.g. `GB`) |
| `name` | VARCHAR(100) | Country name |
| `currency` | VARCHAR(10) | ISO currency code (e.g. `EUR`) |
**Rows:** 40 (Europe, Americas, Asia, Africa, Oceania)
---
#### `star_rating`
Hotel classification from 1★ to 5★.
| Column | Type | Description |
|--------|------|-------------|
| `star_rating_id` | INT UNSIGNED PK | Surrogate key |
| `code` | TINYINT UNIQUE | 15 |
| `description` | VARCHAR(20) | e.g. `3 Star` |
**Rows:** 5
---
#### `hotel_characteristic`
Amenities and features a hotel may offer.
| Column | Type | Description |
|--------|------|-------------|
| `characteristic_id` | INT UNSIGNED PK | Surrogate key |
| `code` | VARCHAR(20) UNIQUE | e.g. `POOL`, `SPA`, `WIFI` |
| `description` | VARCHAR(100) | Human-readable label |
**Rows:** 12 (WiFi, Pool, Gym, Spa, Restaurant, Bar, Parking, Valet, Conference, Shuttle, Room Service, Pet Friendly)
---
#### `room_type`
Types of rooms a hotel can offer, with a standard (base) rate.
| Column | Type | Description |
|--------|------|-------------|
| `room_type_id` | INT UNSIGNED PK | Surrogate key |
| `code` | VARCHAR(20) UNIQUE | e.g. `SINGLE`, `SUITE` |
| `description` | VARCHAR(100) | e.g. `Junior Suite` |
| `standard_rate` | DECIMAL(10,2) | Base nightly rate (EUR) |
| `smoking_yn` | BOOLEAN | Smoking allowed flag |
**Rows:** 7 (Single €80, Double €120, Twin €115, Deluxe €180, Suite €280, Executive €450, Family €200)
---
#### `rate_period`
Seasonal pricing periods. Each period maps to a month range and applies a rate multiplier.
| Column | Type | Description |
|--------|------|-------------|
| `rate_period_id` | INT UNSIGNED PK | Surrogate key |
| `code` | VARCHAR(20) UNIQUE | e.g. `PEAK`, `WINTER` |
| `description` | VARCHAR(50) | Human-readable label |
| `month_from` | TINYINT | Start month (112) |
| `month_to` | TINYINT | End month (112) |
**Rows:** 4
| Code | Period | Months | Multiplier |
|------|--------|--------|-----------|
| PEAK | Peak Season | JunAug | ×1.5 |
| HIGH | High Season | MarMay | ×1.2 |
| AUTUMN | Autumn Season | SepNov | ×1.1 |
| WINTER | Winter Season | DecFeb | ×0.9 |
---
### Junction Tables
#### `period_room_rate`
The effective nightly rate for each (room_type, rate_period) combination.
Rate = `standard_rate × season_multiplier`.
| Column | Type | Description |
|--------|------|-------------|
| `room_type_id` | INT UNSIGNED PK/FK | |
| `rate_period_id` | INT UNSIGNED PK/FK | |
| `rate` | DECIMAL(10,2) | Effective nightly rate |
**Rows:** 28 (7 room types × 4 seasons)
---
#### `hotel_hotel_characteristic`
M:N junction between hotels and their amenities.
| Column | Type |
|--------|------|
| `hotel_id` | INT UNSIGNED PK/FK |
| `characteristic_id` | INT UNSIGNED PK/FK |
**Rows:** ~1,415
---
### Core Entity Tables
#### `hotel`
Individual hotel properties.
| Column | Type | Description |
|--------|------|-------------|
| `hotel_id` | INT UNSIGNED PK | |
| `hotel_chain_id` | INT UNSIGNED FK | NULL for independent hotels |
| `country_id` | INT UNSIGNED FK | |
| `star_rating_id` | INT UNSIGNED FK | |
| `code` | VARCHAR(20) UNIQUE | e.g. `HTL0001` |
| `name` | VARCHAR(150) | |
| `address` | VARCHAR(200) | |
| `postcode` | VARCHAR(20) | |
| `city` | VARCHAR(100) | |
| `url` | VARCHAR(200) | |
**Rows:** 200 (50 cities, star distribution: 5% 1★, 10% 2★, 40% 3★, 30% 4★, 15% 5★)
---
#### `hotel_room`
Individual rooms within each hotel.
| Column | Type | Description |
|--------|------|-------------|
| `room_id` | INT UNSIGNED PK | |
| `hotel_id` | INT UNSIGNED FK | |
| `room_type_id` | INT UNSIGNED FK | |
| `room_number` | VARCHAR(10) | Format: `{floor}{number}`, e.g. `101` |
| `floor` | TINYINT UNSIGNED | |
**Rows:** 5,334 (560 rooms per hotel depending on star rating)
---
#### `guest`
Hotel guests.
| Column | Type | Description |
|--------|------|-------------|
| `guest_id` | INT UNSIGNED PK | |
| `country_id` | INT UNSIGNED FK | Guest's home country |
| `name` | VARCHAR(150) | Full name |
| `email` | VARCHAR(150) | Unique synthetic email |
| `address` | VARCHAR(200) | |
| `city` | VARCHAR(100) | |
**Rows:** 100,000
---
#### `booking`
A reservation made by a guest at a hotel. One booking can cover multiple rooms.
| Column | Type | Description |
|--------|------|-------------|
| `booking_id` | INT UNSIGNED PK | |
| `guest_id` | INT UNSIGNED FK | |
| `hotel_id` | INT UNSIGNED FK | |
| `date_from` | DATE | Check-in |
| `date_to` | DATE | Check-out |
| `status` | ENUM | `confirmed`, `cancelled`, `completed`, `no_show` |
| `created_at` | DATETIME | When booking was made |
**Rows:** 500,000
**Status distribution:** 80% completed, 10% confirmed, 7% cancelled, 3% no_show
**Date range:** 2022-01-01 2025-12-31
**Seasonal distribution:** JuneAugust heaviest (peak), DecemberFebruary lightest
---
#### `room_booking`
A specific room assigned within a booking. Stores the rate **as it was at booking time** (snapshot), independent of any future rate changes.
| Column | Type | Description |
|--------|------|-------------|
| `room_booking_id` | INT UNSIGNED PK | |
| `booking_id` | INT UNSIGNED FK | |
| `room_id` | INT UNSIGNED FK | |
| `date_from` | DATE | |
| `date_to` | DATE | |
| `nightly_rate` | DECIMAL(10,2) | Rate at time of booking |
| `total_amount` | DECIMAL(10,2) | `nightly_rate × nights` |
**Rows:** 531,382
**Room count per booking:** 90% single room, 8% two rooms, 2% three rooms
---
## Data Generation
The database was populated using a **single-file C# script** (`generator/generate.cs`) running on .NET 10, using `MySqlConnector` as the only dependency.
Key generation decisions:
- **Seasonal booking distribution** via rejection sampling — months JunAug are ~2.7× more likely than JanFeb
- **Rate snapshot** — each `room_booking.nightly_rate` is looked up from `period_room_rate` at insert time and stored, not re-computed later
- **Realistic stay lengths** — 30% one night, 25% two nights, 20% three nights, tapering off to 14-night stays
- **Cancelled/no-show bookings** partially skip room assignment (60% of cancellations have no room_booking)
```bash
# Run generator
dotnet run generator/generate.cs
```