Risk Methodology & Model Validation (v0.1)
This document provides a comprehensive overview of the FloodWatch Ghana risk model, the district risk leaderboard, engineering history, and a full quantitative and qualitative validation of both model versions against the May 18, 2025 Greater Accra flood event.
1. The Risk Model (v0.1)
FloodWatch Ghana v0.1 is a structural baseline model. It identifies areas chronically prone to flooding based on their physical and environmental characteristics — terrain, drainage, land cover, and rainfall patterns.
1.1 Weighted Composite Formula
Each 30m pixel is assigned a risk score from 0 (Low) to 1 (High) using a weighted combination of five input layers:
| Component | Weight | Direction | Source | Rationale |
|---|---|---|---|---|
| Elevation | 30% | Inverted | NASA SRTM (30m) | Low-lying areas are natural catchments for surface runoff. |
| Precipitation | 25% | Normal | GPM IMERG Final Run (0.1°) | Actual observed monthly rainfall drives runoff volume. |
| Terrain Slope | 20% | Inverted | Derived from SRTM | Flat terrain cannot drain quickly and pools surface water. |
| Imperviousness | 15% | Normal | ESA WorldCover (10m) | Paved and urban surfaces prevent infiltration into soil. |
| Water Proximity | 10% | Inverted | OpenStreetMap | Proximity to rivers and drainage channels increases inundation risk. |
Formula:
Risk = 0.30×(1−DEM_norm) + 0.25×Rain_norm + 0.20×(1−Slope_norm) + 0.15×Imperv_norm + 0.10×(1−Water_norm)
All input layers are min-max normalised to [0, 1] before compositing. In the recalculated model, the composite is then reclassified using percentile-based stretching (p25 and p75 breakpoints, see 4.3) to spread risk scores across the full [0, 1] range and reduce compression in the middle. The original run applied no percentile reclassification.
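As a minimal sketch of the compositing step, the snippet below min-max normalises each layer and applies the weights from the table in 1.1. It uses plain Python lists in place of rasters. The function names and the three sample pixels are illustrative assumptions, not taken from the pipeline.

```python
def min_max(values):
    """Rescale a list of numbers to [0, 1]; a constant layer maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def composite_risk(dem, rain, slope, imperv, water_dist):
    """Weighted composite per pixel, using the weights listed in section 1.1."""
    dem_n, rain_n, slope_n = min_max(dem), min_max(rain), min_max(slope)
    imperv_n, water_n = min_max(imperv), min_max(water_dist)
    return [
        0.30 * (1 - d) + 0.25 * r + 0.20 * (1 - s) + 0.15 * i + 0.10 * (1 - w)
        for d, r, s, i, w in zip(dem_n, rain_n, slope_n, imperv_n, water_n)
    ]


# Three made-up pixels: low/flat/paved near water, intermediate, high/steep/rural.
scores = composite_risk(
    dem=[5.0, 40.0, 120.0],            # elevation, metres
    rain=[210.0, 198.0, 150.0],        # rainfall, mm/month
    slope=[0.5, 3.0, 12.0],            # slope, degrees
    imperv=[0.9, 0.4, 0.0],            # impervious fraction
    water_dist=[50.0, 800.0, 3000.0],  # distance to nearest channel, metres
)
print([round(s, 3) for s in scores])
```

Because every layer is normalised against its own minimum and maximum, the first pixel (worst on all five layers) scores close to 1 and the last (best on all five) scores 0. Real rasters rarely line up this neatly.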
1.2 Rainfall Data Source — Why GPM IMERG over ERA5/CHIRPS
The rainfall layer is the most operationally significant input to update. Two approaches have been used across model versions: climatological means (CHIRPS, ERA5-Land) and observed monthly rainfall (GPM IMERG).
| Source | Type | Latency | Accuracy | Used in |
|---|---|---|---|---|
| CHIRPS v2.0 | Climatological mean | Days | Moderate | v0.1 original |
| ERA5-Land | Reanalysis mean | Days | Good | v0.1 original fallback |
| GPM IMERG Final Run | Actual monthly observed | ~3.5 months | Best (gauge-corrected) | v0.1 recalculated |
| GPM IMERG Late Run | Near real-time | ~12 hours | Good | v0.1 recalculated fallback |
As used in the original pipeline, CHIRPS and ERA5-Land return the same climatological average for June regardless of the year, so June 2019 and June 2024 produce identical values. GPM IMERG returns the precipitation actually measured in that specific month, making the model responsive to real rainfall conditions. The v0.1 recalculated model uses GPM IMERG Final Run for June 2024 (198 mm/month mean over Greater Accra).
2. District Risk Leaderboard
2.1 Original Model (CHIRPS/ERA5 rainfall, no percentile reclassification)
Mean risk scores computed per district from the original pipeline run.
| Rank | District | Mean Risk | Max Risk | Flooded May 2025 |
|---|---|---|---|---|
| 1 | Ablekuma West | 0.8398 | 0.9928 | No |
| 2 | Weija Gbawe | 0.8123 | 0.9957 | Yes |
| 3 | Ga Central | 0.7258 | 0.9796 | No |
| 4 | Accra Metropolis | 0.7170 | 0.8681 | Yes |
| 5 | Ga West | 0.7063 | 0.9091 | No |
| 6 | Ga South | 0.7035 | 0.9168 | No |
| 7 | Ablekuma North | 0.6922 | 0.9129 | No |
| 8 | Ablekuma Central | 0.6876 | 0.9673 | No |
| 9 | Ayawaso East | 0.6741 | 0.8010 | No |
| 10 | Korle-Klottey | 0.6708 | 0.8220 | No |
| 11 | La-Dade-Kotopon | 0.6665 | 0.8261 | No |
| 12 | Ayawaso North | 0.6428 | 0.7923 | No |
| 13 | Okaikwei North | 0.6216 | 0.7901 | No |
| 14 | Ayawaso Central | 0.6024 | 0.7815 | No |
| 15 | Krowor | 0.5994 | 0.7714 | No |
| 16 | Ledzokuku | 0.5633 | 0.7841 | No |
| 17 | Ayawaso West | 0.5555 | 0.7763 | No |
| 18 | Ga East | 0.5529 | 0.8011 | Yes |
| 19 | Ga North | 0.5399 | 0.8156 | No |
| 20 | Tema | 0.5215 | 0.7691 | Yes |
| 21 | Tema West | 0.4887 | 0.7598 | Yes |
| 22 | Ningo-Prampram | 0.4646 | 1.0000 | No |
| 23 | Ada East | 0.4635 | 0.8083 | No |
| 24 | La-Nkwantanang-Madina | 0.4619 | 0.7043 | Yes |
| 25 | Adenta | 0.4610 | 0.7331 | Yes |
| 26 | Kpone-Katamanso | 0.4507 | 0.7605 | No |
| 27 | Ada West | 0.4376 | 0.7787 | No |
| 28 | Shai Osudoku | 0.4223 | 0.9061 | No |
| 29 | Ashaiman | 0.3654 | 0.6527 | No |
2.2 Recalculated Model (GPM IMERG rainfall + percentile reclassification, April 2026)
| Rank | District | Mean Risk | Flooded May 2025 |
|---|---|---|---|
| 1 | Ayawaso North | 0.9301 | No |
| 2 | Ledzokuku | 0.9204 | No |
| 3 | Krowor | 0.9181 | No |
| 4 | Tema West | 0.8956 | Yes |
| 5 | Ashaiman | 0.8941 | No |
| 6 | La-Dade-Kotopon | 0.8910 | No |
| 7 | Ga Central | 0.8814 | No |
| 8 | Ablekuma North | 0.8734 | No |
| 9 | Ayawaso West | 0.8686 | No |
| 10 | Ayawaso East | 0.8682 | No |
| 11 | Okaikwei North | 0.8570 | No |
| 12 | Ayawaso Central | 0.8525 | No |
| 13 | Korle-Klottey | 0.8525 | No |
| 14 | Adenta | 0.8408 | Yes |
| 15 | Tema | 0.8357 | Yes |
| 16 | Ga West | 0.8160 | No |
| 17 | Ablekuma Central | 0.8157 | No |
| 18 | Ga East | 0.8114 | Yes |
| 19 | Accra Metropolis | 0.7847 | Yes |
| 20 | Weija Gbawe | 0.7570 | Yes |
| 21 | La-Nkwantanang-Madina | 0.7531 | Yes |
| 22 | Ga South | 0.7523 | No |
| 23 | Ga North | 0.7507 | No |
| 24 | Kpone-Katamanso | 0.7430 | No |
| 25 | Ablekuma West | 0.7418 | No |
| 26 | Ningo-Prampram | 0.4870 | No |
| 27 | Ada East | 0.4747 | No |
| 28 | Shai Osudoku | 0.4541 | No |
| 29 | Ada West | 0.3971 | No |
3. Validation — May 18, 2025 Flood Event
3.1 Event Summary
On May 18, 2025, Greater Accra experienced a severe flash flooding event following approximately 132 mm of rainfall in a short period, roughly the equivalent of a full month's rain in a single day. The event caused widespread flooding across multiple districts. Reported flooded districts (sourced from The Watchers, GDACS, and Copernicus EMS):
Flooded (7 of 29 districts): Weija Gbawe · Accra Metropolis · Ga East · Tema · Tema West · La-Nkwantanang-Madina · Adenta
Not flooded (22 districts): All remaining districts.
3.2 Quantitative Metrics — Model Comparison
Mean Risk Score by Flood Status
| Metric | Original Model | Recalculated Model | Verdict |
|---|---|---|---|
| Mean risk — flooded districts | 0.5736 | 0.8112 | Recalc higher ✓ |
| Mean risk — non-flooded districts | 0.5953 | 0.7745 | — |
| Difference (flooded − non-flooded) | −0.0217 | +0.0367 | Recalc correct direction ✓ |
| % flooded districts flagged High Risk (≥0.70) | 28.6% (2/7) | 100% (7/7) | Recalc better ✓ |
| % non-flooded districts flagged High Risk (≥0.70) | 18.2% (4/22) | 81.8% (18/22) | Original has fewer false alarms |
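The first three rows of this table can be recomputed from the leaderboard in 2.2 and the flooded-district list in 3.1. The sketch below does that for the recalculated model. The variable names are illustrative; the scores are copied from the leaderboard table.

```python
from statistics import mean

FLOODED = {
    "Weija Gbawe", "Accra Metropolis", "Ga East", "Tema",
    "Tema West", "La-Nkwantanang-Madina", "Adenta",
}

# District mean risk, recalculated model (section 2.2).
RECALC_MEAN_RISK = {
    "Ayawaso North": 0.9301, "Ledzokuku": 0.9204, "Krowor": 0.9181,
    "Tema West": 0.8956, "Ashaiman": 0.8941, "La-Dade-Kotopon": 0.8910,
    "Ga Central": 0.8814, "Ablekuma North": 0.8734, "Ayawaso West": 0.8686,
    "Ayawaso East": 0.8682, "Okaikwei North": 0.8570, "Ayawaso Central": 0.8525,
    "Korle-Klottey": 0.8525, "Adenta": 0.8408, "Tema": 0.8357,
    "Ga West": 0.8160, "Ablekuma Central": 0.8157, "Ga East": 0.8114,
    "Accra Metropolis": 0.7847, "Weija Gbawe": 0.7570,
    "La-Nkwantanang-Madina": 0.7531, "Ga South": 0.7523, "Ga North": 0.7507,
    "Kpone-Katamanso": 0.7430, "Ablekuma West": 0.7418,
    "Ningo-Prampram": 0.4870, "Ada East": 0.4747, "Shai Osudoku": 0.4541,
    "Ada West": 0.3971,
}

flooded = [v for d, v in RECALC_MEAN_RISK.items() if d in FLOODED]
not_flooded = [v for d, v in RECALC_MEAN_RISK.items() if d not in FLOODED]
gap = mean(flooded) - mean(not_flooded)

print(f"flooded={mean(flooded):.4f} "
      f"non-flooded={mean(not_flooded):.4f} gap={gap:+.4f}")
```

Run as written, this should reproduce the recalculated column above (0.8112, 0.7745, +0.0367). Swapping in the scores from 2.1 gives the original column.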
Confusion Matrix at 0.70 Threshold
| Metric | Original Model | Recalculated Model |
|---|---|---|
| True Positives (flooded, flagged high) | 2 | 7 |
| False Positives (not flooded, flagged high) | 4 | 18 |
| True Negatives (not flooded, flagged low) | 18 | 4 |
| False Negatives (flooded, missed) | 5 | 0 |
| Precision | 0.33 | 0.28 |
| Recall | 0.29 | 1.00 |
| F1 Score | 0.31 | 0.44 |
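Precision, recall, and F1 follow directly from the counts in the table. A short sketch of the arithmetic (the function name is an illustrative choice):

```python
def classification_metrics(tp, fp, fn):
    """Return (precision, recall, f1) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts taken from the confusion matrix at the 0.70 threshold.
original = classification_metrics(tp=2, fp=4, fn=5)
recalculated = classification_metrics(tp=7, fp=18, fn=0)

for label, metrics in (("original", original), ("recalculated", recalculated)):
    print(label, [round(m, 2) for m in metrics])
```

For the original model this gives 2/6, 2/7, and 4/13; for the recalculated model 7/25, 7/7, and 14/32, which round to the values in the table.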
3.3 Qualitative Assessment
Original Model
The original model correctly placed two of the most historically flood-prone districts — Weija Gbawe (rank 2) and Accra Metropolis (rank 4) — in its top tier. These are well-known chronic flood zones in Greater Accra and their high ranking reflects genuine structural risk (low elevation, dense impervious surfaces, proximity to the Odaw River and Korle Lagoon drainage system).
However, the model missed five flooded districts entirely at the 0.70 threshold:
- Ga East (rank 18), Tema (rank 20), Tema West (rank 21) — ranked mid-table, well below the high-risk cutoff
- La-Nkwantanang-Madina (rank 24), Adenta (rank 25) — ranked near the bottom
This is the model's most significant qualitative failure. Adenta and La-Nkwantanang-Madina are peri-urban, inland districts that were overwhelmed by the volume of the May 2025 event. Their structural characteristics (moderate slope, mixed land cover) do not mark them as chronic flood zones, but a 132 mm single-day rainfall event overloaded their drainage regardless. The original model, built on climatological rainfall averages, had no mechanism to capture this.
The mean risk of flooded districts (0.574) was actually lower than that of non-flooded districts (0.595), so the model ranked flooded areas as marginally safer on average. This is a fundamental failure of direction.
Recalculated Model
The recalculated model shows a meaningful improvement. With GPM IMERG actual rainfall (June 2024, 198 mm/month mean) and percentile reclassification applied:
- All 7 flooded districts score above 0.70 — recall is perfect (1.00)
- The mean risk of flooded districts (0.811) now correctly exceeds that of non-flooded districts (0.775)
- Tema West rises to rank 4, reflecting its genuine vulnerability to both structural factors and rainfall exposure
- Adenta (rank 14, up from 25) and Tema (rank 15, up from 20) move to the middle of the ranking, better reflecting their susceptibility to high-rainfall events
The main weakness of the recalculated model is low precision (0.28): 18 of 22 non-flooded districts also score above 0.70. The score distribution is compressed into a narrow high band (25 of 29 districts fall between 0.74 and 0.93), making it difficult to discriminate flooded from non-flooded districts using the district mean alone. The bottom four districts (Ningo-Prampram, Ada East, Shai Osudoku, Ada West) are correctly identified as low risk; these are predominantly rural and coastal areas with very different terrain and land cover.
3.4 Overall Verdict — Which Model Performs Better?
The recalculated model is the stronger performer.
| Criterion | Original | Recalculated | Winner |
|---|---|---|---|
| Direction of risk signal | Wrong (flooded < non-flooded) | Correct (flooded > non-flooded) | Recalculated |
| Recall — flooded districts caught | 0.29 | 1.00 | Recalculated |
| F1 Score | 0.31 | 0.44 | Recalculated |
| Precision | 0.33 | 0.28 | Original (marginally) |
| Qualitative alignment (known flood zones) | Partial (2/7) | Strong (7/7) | Recalculated |
| Score discrimination across districts | Better spread | Compressed mid-high | Original |
| Rainfall data quality | Climatological average | Actual observed | Recalculated |
The recalculated model wins on every criterion except precision and score discrimination. Its zero false negatives on this event matter most for a flood risk application, where missing a flooded district is a worse failure than over-flagging a safe one. The original model's apparent precision advantage is misleading: it flagged only six districts as high risk and scored most others as moderate, so it also missed five of the seven districts that actually flooded.
Low precision (0.28 and 0.33) is a known structural limitation of both models. A static weighted composite applied at the district mean level will always struggle to separate flash-flood-driven events from structural risk: the May 2025 event was an extreme single-day episode, while the model represents chronic susceptibility. Improving precision requires dynamic, event-driven inputs.
4. Engineering History & Bug Resolutions
4.1 The "Global Average" Bug (0.508)
During early development, every district incorrectly displayed a uniform Mean Risk Score of 0.508.
- Cause: The frontend was sending undefined bounding boxes to the TiTiler API, which defaulted to computing the global average of the entire raster.
- Resolution: Shifted from dynamic runtime calculation to static pre-calculated statistics. Zonal statistics (mean, max, median, std, histogram) are now baked into the GeoJSON district properties via `scripts/precalculate_stats.py` at pipeline time. Each district therefore shows its own statistics and loads instantly, with no API dependency at render time.
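The pre-calculation idea can be sketched as follows. This is not the contents of `scripts/precalculate_stats.py`, which is not shown here. It is a simplified stand-in that assumes each pixel has already been labelled with a district name, and that each GeoJSON feature stores that name under a hypothetical `district` property.

```python
from collections import defaultdict
from statistics import mean, median, pstdev


def zonal_stats(district_ids, scores, bins=5):
    """Group pixel scores by district and summarise each group."""
    groups = defaultdict(list)
    for district, score in zip(district_ids, scores):
        groups[district].append(score)

    stats = {}
    for district, values in groups.items():
        histogram = [0] * bins
        for v in values:
            # Scores lie in [0, 1]; clamp so that 1.0 falls in the last bin.
            histogram[min(int(v * bins), bins - 1)] += 1
        stats[district] = {
            "mean": mean(values),
            "max": max(values),
            "median": median(values),
            "std": pstdev(values),
            "histogram": histogram,
        }
    return stats


def attach_stats(feature_collection, stats, name_key="district"):
    """Copy each district's statistics into its GeoJSON feature properties."""
    for feature in feature_collection["features"]:
        props = feature["properties"]
        props.update(stats.get(props[name_key], {}))
    return feature_collection


# Toy input: five pixels across two made-up districts "A" and "B".
stats = zonal_stats(["A", "A", "B", "B", "B"], [0.2, 0.4, 0.9, 1.0, 0.8])
collection = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"district": "A"}, "geometry": None},
        {"type": "Feature", "properties": {"district": "B"}, "geometry": None},
    ],
}
attach_stats(collection, stats)
```

Once the statistics sit in the feature properties, the frontend reads them directly and never asks the tile server for an average, so the undefined-bounding-box failure cannot recur.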
4.2 Rainfall Source Upgrade (CHIRPS → GPM IMERG)
The original model ingested rainfall from CHIRPS v2.0 or ERA5-Land, both used as climatological means that return the same historical average regardless of the actual year processed. This meant the model could not respond to unusually wet or dry months.
The pipeline was upgraded to use NASA GPM IMERG as the primary source, with a 4-tier fallback chain:
GPM IMERG Final Run → GPM IMERG Late Run → ERA5-Land → CHIRPS v2.0
GPM IMERG Final Run is bias-corrected against ground rain gauges and available with approximately 3.5 months latency. For June 2024, the observed mean rainfall over Greater Accra was 198 mm/month (range: 126–300 mm/month across the region), whereas the climatological average is the same for every year.
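One way to express the 4-tier fallback chain is an ordered list of loaders that is walked until one succeeds. The sketch below assumes that shape. The loader functions are hypothetical stubs that stand in for real data access, and the returned strings are placeholders, not rainfall data.

```python
def first_available(loaders, year, month):
    """Try each (name, loader) pair in priority order; return the first hit."""
    errors = []
    for name, loader in loaders:
        try:
            data = loader(year, month)
        except Exception as exc:  # a real pipeline would catch narrower errors
            errors.append(f"{name}: {exc}")
            continue
        if data is not None:
            return name, data
        errors.append(f"{name}: no data")
    raise RuntimeError("all rainfall sources failed: " + "; ".join(errors))


# Hypothetical stubs: the Final Run is not yet published for the requested
# month, so the chain should fall through to the Late Run.
def load_imerg_final(year, month):
    raise LookupError("not yet published")


def load_imerg_late(year, month):
    return f"late-run placeholder for {year}-{month:02d}"


def load_era5_land(year, month):
    return "era5-land placeholder"


def load_chirps(year, month):
    return "chirps placeholder"


SOURCES = [
    ("GPM IMERG Final Run", load_imerg_final),
    ("GPM IMERG Late Run", load_imerg_late),
    ("ERA5-Land", load_era5_land),
    ("CHIRPS v2.0", load_chirps),
]

source_name, rainfall = first_available(SOURCES, 2024, 6)
print(source_name)
```

Returning the source name alongside the data lets the pipeline record which tier was actually used, which matters because the tiers differ in latency and accuracy.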
4.3 Percentile Reclassification
The original pipeline applied min-max normalisation directly to the composite score, which resulted in compressed mid-range scores across most districts. The recalculated model adds a percentile-based reclassification step using the 25th and 75th percentile breakpoints of the pixel-level risk distribution:
- score < p25 → mapped to [0.00, 0.33] (low tier)
- p25 ≤ score < p75 → mapped to [0.33, 0.67] (moderate tier)
- score ≥ p75 → mapped to [0.67, 1.00] (high tier)
This makes fuller use of the output range and sharpens the separation between low, moderate, and high risk areas at the pixel level, though the means of most urban districts remain compressed in the 0.70+ band.
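The three-tier mapping can be sketched as a piecewise-linear stretch. The text above gives only the tier ranges, so linear interpolation inside each tier is an assumption here, as are the function name and the choice of `statistics.quantiles` with the inclusive method to obtain p25 and p75.

```python
import statistics


def reclassify(scores):
    """Piecewise-linear stretch with breakpoints at the 25th/75th percentiles."""
    lo, hi = min(scores), max(scores)
    p25, _, p75 = statistics.quantiles(scores, n=4, method="inclusive")

    def stretch(value, src_lo, src_hi, dst_lo, dst_hi):
        # Linear interpolation inside one tier (assumed behaviour).
        if src_hi == src_lo:
            return dst_lo
        return dst_lo + (value - src_lo) / (src_hi - src_lo) * (dst_hi - dst_lo)

    out = []
    for s in scores:
        if s < p25:
            out.append(stretch(s, lo, p25, 0.00, 0.33))   # low tier
        elif s < p75:
            out.append(stretch(s, p25, p75, 0.33, 0.67))  # moderate tier
        else:
            out.append(stretch(s, p75, hi, 0.67, 1.00))   # high tier
    return out


# Made-up composite scores bunched in the middle of the range.
raw = [0.40, 0.45, 0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80]
stretched = reclassify(raw)
print([round(s, 3) for s in stretched])
```

With this input the raw range of 0.40 to 0.80 is spread over the full 0 to 1 range while the ordering of pixels is preserved.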
4.4 COG Pipeline & Tile Serving
All risk outputs are served as Cloud-Optimised GeoTIFFs (COG) from Google Cloud Storage, rendered via TiTiler. Previous versions encountered issues with:
- Pixel bleeding at district edges: resolved by removing a boundary buffer that was clipping edge pixels
- `nodata=nan` tile blanking: resolved by removing the `&nodata=nan` TiTiler parameter
- COG version cache: managed via `?v=N` query string versioning on the COG URL
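The query string versioning can be sketched with the standard library. The bucket and file names below are hypothetical. TiTiler receives the COG location through its `url` query parameter, so the versioned URL is percent-encoded when it is passed on; the tile endpoint path itself depends on the TiTiler version and is left out.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def versioned_cog_url(cog_url, version):
    """Set (or replace) the v=N cache-busting parameter on a COG URL."""
    parts = urlsplit(cog_url)
    query = dict(parse_qsl(parts.query))
    query["v"] = str(version)
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical object path; bump the version whenever the COG is re-exported.
cog = versioned_cog_url(
    "https://storage.googleapis.com/example-bucket/risk_v01.tif", 3
)

# Query string to append to a TiTiler request; note that no nodata=nan is set.
tile_query = urlencode({"url": cog})
print(cog)
print(tile_query)
```

Because the version is part of the URL, any cache keyed on that URL treats a new export as a new resource, so stale tiles are not served after a pipeline rerun.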
5. Limitations & Roadmap
5.1 Current Limitations
- Static structural model: Cannot capture event-specific dynamics. A 132 mm single-day rainfall will overwhelm peri-urban districts regardless of their chronic risk score.
- District-level aggregation: Mean risk at the district level conceals localised hotspots. High-risk pixels within a nominally moderate district are invisible in the leaderboard.
- Rainfall temporal mismatch: The June 2024 GPM data does not correspond to the May 2025 validation event. A proper temporal validation would require running the model with May 2025 GPM data specifically.
- No drainage infrastructure data: The model has no representation of storm drain capacity, culvert blockages, or drainage network connectivity — a major driver of urban flash flooding in Accra.
- Score compression: The percentile reclassification improves pixel-level spread but most urban districts still cluster in the 0.74–0.93 band at the mean level, limiting district-level discrimination.
5.2 Future Roadmap (v1.1)
| Feature | Description | Impact |
|---|---|---|
| Dynamic risk layer | Real-time GPM IMERG rainfall thresholds triggering risk score adjustments on the day of an event | High — addresses the core precision gap |
| Sentinel-1 SAR validation | Flood extent mapping via Google Earth Engine for quantitative spatial accuracy metrics beyond district means | High — enables pixel-level validation |
| Drainage infrastructure layer | OSM and NADMO drainage network data as an additional composite input | Medium |
| Property-level API | Dynamic backend for individual parcel risk queries | Medium |
| Temporal validation | Rerun model with May 2025 GPM data to validate under event-matched rainfall | Medium |