diff --git a/documentation/PROJECT__AE_Lookups_fixes_and_docs_update.md b/documentation/PROJECT__AE_Lookups_fixes_and_docs_update.md new file mode 100644 index 0000000..853c099 --- /dev/null +++ b/documentation/PROJECT__AE_Lookups_fixes_and_docs_update.md @@ -0,0 +1,144 @@ +# Project: V3 Lookup Bug Fix — Timezone Group Data + PARTITION BY Revert + +> **Status:** 🔧 Action Required +> **Date:** 2026-03-23 +> **Related doc:** `PROJECT__V3_UNIFORM_LOOKUP_SYSTEM.md` +> **Reported by:** Frontend Agent (Scott Idem / One Sky IT) + +--- + +## 1. Summary + +Two bugs were discovered in the V3 Uniform Lookup System during IDAA Recovery Meetings +timezone dropdown testing. They stem from a single root cause: the `lu_v3_time_zone` +table was seeded with regional `group` values (`"United States"`, `"Europe"`) instead of +individual timezone names — contrary to the design specified in Phase 2 of the lookup +architecture doc, which explicitly states `lu_v3_time_zone (Group: name)`. + +An attempted fix changed `PARTITION BY group` to `PARTITION BY name` in +`get_lookup_list_v3()`. This unintentionally broke country deduplication, which depends +on `PARTITION BY group` being correct (country group = `alpha_2_code`, e.g. `"US"`). + +--- + +## 2. Root Cause + +### 2.1 Timezone `group` values were set to regional names instead of timezone names + +The `lu_v3_time_zone` table has two groups where multiple records share a single group value: + +| `group` value | Count | Example records | +|----------------|-------|-----------------| +| `United States` | 13 | US/Alaska, US/Arizona, US/Central, US/East-Indiana, US/Eastern, US/Hawaii, US/Indiana-Starke, US/Michigan, US/Mountain, US/Pacific, US/Pacific-New, US/Samoa, US/Aleutian | +| `Europe` | 63 | Europe/London, Europe/Paris, Europe/Prague, Europe/Rome, ... (all Europe/* zones) | + +All other timezone records already have `group = name` (e.g., `Canada/Eastern` has +`group = "Canada/Eastern"`). The US and Europe records were loaded incorrectly. + +**Effect:** `PARTITION BY group` collapsed all 13 US/* records into a single winner and +all 63 Europe/* records into a single winner. Only ~7 distinct US timezones and 1 Europe +timezone appeared in the dropdown instead of all 76. + +### 2.2 Attempted fix broke country lookup deduplication + +Changing `PARTITION BY group` → `PARTITION BY name` in `get_lookup_list_v3()` fixed the +timezone collapse but broke `lu_v3_country`. + +`lu_v3_country` has (at minimum) two records for `alpha_2_code = "US"`: +- `id=240`: global default (`account_id=NULL`), `group="US"` +- `id=251`: account-specific (`account_id=1`), `group="US"` + +With `PARTITION BY group`, both records share `group="US"` and are correctly deduped — +the account-specific record wins per the override hierarchy. With `PARTITION BY name`, +if the two records have different `name` values they are treated as separate identities +and both survive, resulting in duplicate `alpha_2_code="US"` entries in the API response. + +The frontend's `{#each lu_country_list as country (country.alpha_2_code)}` then throws: +> `Svelte error: each_key_duplicate — Keyed each block has duplicate key 'US'` + +The same risk applies to `lu_v3_country_subdivision`. + +--- + +## 3. Correct Fix (Two Steps) + +### Step 1 — Revert `app/methods/lookup_methods.py` + +Change `PARTITION BY name` back to `PARTITION BY group`: + +```python +# lookup_methods.py — get_lookup_list_v3() +ROW_NUMBER() OVER ( + PARTITION BY `group` # <-- revert to this + ORDER BY + (for_type = :for_type AND for_id = :for_id) DESC, + (account_id = :account_id) DESC, + created_on DESC +) as rank_priority +``` + +This restores correct behavior for all three active V3 lookup types +(`country`, `country_subdivision`, `time_zone`). + +### Step 2 — Fix the `lu_v3_time_zone` data + +Set `group = name` for all records where the group is a regional label rather than the +timezone's own name. Run once against the database: + +```sql +UPDATE lu_v3_time_zone +SET `group` = `name` +WHERE `group` IN ('United States', 'Europe'); +``` + +**Verification:** +```sql +-- Should return 0 rows after the fix +SELECT `group`, COUNT(*) as cnt +FROM lu_v3_time_zone +GROUP BY `group` +HAVING cnt > 1; +``` + +--- + +## 4. Why PARTITION BY `group` Is Correct + +As documented in `PROJECT__V3_UNIFORM_LOOKUP_SYSTEM.md` (Section 2.1): + +> `group`: The primary business key/cluster key. *Note: Must be populated for hierarchy to work.* + +The `group` field IS the deduplication identity. Each lookup type uses a different natural +key for `group`: + +| Lookup type | `group` field | Example | +|---|---|---| +| `country` | `alpha_2_code` | `"US"`, `"CA"`, `"GB"` | +| `country_subdivision` | `code` | `"US-NY"`, `"CA-ON"` | +| `time_zone` | `name` (= the IANA timezone identifier) | `"US/Eastern"`, `"Europe/London"` | + +For `time_zone`, `group` and `name` are intended to be the same value — each timezone +is its own identity. There is no meaningful concept of "override all US timezones as a +group." Each one is individually addressable. + +--- + +## 5. Regression Tests to Add / Update + +- `test_timezone_us_dedup()` — assert all 13 US/* priority zones are present individually +- `test_timezone_europe_dedup()` — assert all Europe/* priority zones present individually +- `test_country_us_dedup()` — assert only one `alpha_2_code="US"` record returned; + account-specific override wins over global default +- General: `GET /v3/lookup/time_zone/list?only_priority=true` should return exactly 72 + records (the current count of priority=1 enabled timezones) + +--- + +## 6. What Was NOT Changed (and Should Not Be) + +- The endpoint signature for `GET /v3/lookup/{lu_type}/list` — it does not and should + not expose `limit`, `offset`, or `order_by_li` query params. The frontend sends these + but they are correctly ignored. The sort order is hardcoded and correct: + `ORDER BY COALESCE(priority, 0) DESC, COALESCE(sort, 0) DESC, name ASC` +- Country and country_subdivision data — no changes needed to those tables +- Frontend code — no backend-side changes are needed on the frontend for this fix