fix(download): strict ID typing — remove cross-resolution from hosted_file download

event_file download now resolves event_file_id → hosted_file_id explicitly before
delegating, rather than relying on a cross-resolution fallback inside the hosted_file
endpoint. The hosted_file download endpoint now only accepts hosted_file IDs.

Cross-resolution was added reactively (ea117bf) to patch incorrect frontend ID usage
and was never a deliberate design decision. With no per-record account ownership check
on the download path, the implicit ID aliasing was an unauditable gap.

- download_event_file_action: resolves event_file → hosted_file via Redis + SQL before
  delegating; 404s explicitly if chain is broken
- download_file_action: strict hosted_file ID only; cross-resolution fallback removed
- Also fixes ?key= not being forwarded (was missing from event_file endpoint signature)
- TODO: per-record account ownership check (P1), archive_content download endpoint (P2)
- Docs: breaking change note added to frontend guide (remove ~2026-06-24)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
Scott Idem
2026-06-10 14:46:47 -04:00
parent 2429a1f731
commit 3806613427
4 changed files with 25 additions and 25 deletions

View File

@@ -281,20 +281,34 @@ async def download_event_file_action(
response: Response,
event_file_id: str = Path(min_length=11, max_length=22),
filename: Optional[str] = Query(None, min_length=4, max_length=255),
key: Optional[str] = Query(None),
site_key: Optional[str] = Query(None),
range: Optional[str] = Header(None),
account: AccountContext = Depends(get_account_context_optional),
delay: DelayParams = Depends(),
):
"""
Semantic alias for hosted_file download with Event-specific context.
Download the underlying file for an event_file record.
Resolves event_file_id → hosted_file_id explicitly before delegating.
"""
# Simply delegate to the universal hosted_file download logic
ef_int_id = redis_lookup_id_random(record_id_random=event_file_id, table_name='event_file')
if not ef_int_id:
raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Event file record not found.")
ef_rec = sql_select(sql="SELECT hosted_file_id FROM event_file WHERE id = :id", data={'id': ef_int_id})
if not ef_rec or not ef_rec.get('hosted_file_id'):
raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Event file has no associated hosted file.")
hosted_file_id_random = get_id_random(record_id=ef_rec['hosted_file_id'], table_name='hosted_file')
if not hosted_file_id_random:
raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Hosted file record not found.")
from app.routers.api_v3_actions_hosted_file import download_file_action
return await download_file_action(
response=response,
hosted_file_id=event_file_id, # The universal downloader now resolves this!
hosted_file_id=hosted_file_id_random,
filename=filename,
key=key,
site_key=site_key,
range=range,
account=account,

View File

@@ -227,28 +227,11 @@ async def download_file_action(
if not is_authorized:
raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Authentication required or invalid access key.")
# 2. Resolve File Record
# ID Vision: Attempt to resolve the ID.
# 🛑 REMINDER: If adding a new specialized 'container' object (like event_person_profile),
# ensure the lookup logic is mirrored here to allow direct downloads via container ID.
# If not found in hosted_file, check if it's an event_file or archive_content ID that we can resolve.
# 2. Resolve File Record — strict: only accepts hosted_file IDs.
# Container objects (event_file, archive_content) must resolve to a hosted_file_id
# in their own dedicated download endpoints before calling this function.
resolved_id = redis_lookup_id_random(record_id_random=hosted_file_id, table_name='hosted_file')
if not resolved_id:
log.info(f"ID {hosted_file_id} not found in hosted_file. Checking container tables...")
# A. Check event_file
if ef_id := redis_lookup_id_random(record_id_random=hosted_file_id, table_name='event_file'):
if ef_rec := sql_select(sql="SELECT hosted_file_id FROM event_file WHERE id = :id", data={'id': ef_id}):
resolved_id = ef_rec.get('hosted_file_id')
log.info(f"Resolved event_file {hosted_file_id} to hosted_file {resolved_id}")
# B. Check archive_content
if not resolved_id:
if ac_id := redis_lookup_id_random(record_id_random=hosted_file_id, table_name='archive_content'):
if ac_rec := sql_select(sql="SELECT hosted_file_id FROM archive_content WHERE id = :id", data={'id': ac_id}):
resolved_id = ac_rec.get('hosted_file_id')
log.info(f"Resolved archive_content {hosted_file_id} to hosted_file {resolved_id}")
if not resolved_id:
raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Hosted file record not found.")

View File

@@ -286,7 +286,7 @@ When seeding new lookup data (e.g., adding timezones in bulk):
## 5. Event File Data Retrieval (Hosted Files)
Every Event File (`event_file`) **must** have a linked Hosted File (`hosted_file`). The Hosted File itself is a metadata record for binary content (files), which is accessed via separate Action endpoints (e.g., `/v3/action/hosted_file/download`). This API endpoint provides metadata about the associated hosted file. To retrieve this additional metadata:
Every Event File (`event_file`) **must** have a linked Hosted File (`hosted_file`). The Hosted File is a metadata record for binary content, accessed via dedicated Action endpoints. To download an event file use `/v3/action/event_file/{event_file_id}/download` — not the hosted_file endpoint directly (each endpoint only accepts its own ID type). To retrieve hosted file metadata alongside an event file record:
* **Endpoint:** `GET /v3/crud/event_file/{event_file_id}`
* **Query Parameter:** Add `inc_hosted_file=true`
@@ -308,7 +308,8 @@ Event files can be downloaded without standard auth headers using one of two byp
- **Method:** `GET`
- **Path:** `/v3/action/event_file/{event_file_id}/download`
The endpoint also accepts `hosted_file_id` or `archive_content_id` directly — it resolves the chain automatically.
> [!WARNING]
> **Breaking change (2026-06-10):** This endpoint now requires an `event_file_id`. Previously it accepted `hosted_file_id` or `archive_content_id` and resolved the chain automatically — that cross-resolution has been removed. Pass the correct ID type for the endpoint you are calling. If you were routing downloads through `/v3/action/hosted_file/{hosted_file_id}/download` as a workaround, switch to this endpoint using `event_file_id`. *(Remove this note after ~2026-06-24.)*
#### Auth bypass options

View File

@@ -23,6 +23,8 @@
- [x] **[P4] Expose `pool_size` / `max_overflow` as env vars** — `create_ae_engine()` calls `settings.DB.get('pool_size', 10)` but `settings.DB` property doesn't include those keys, so they're always hardcoded 10/20. Add `AE_DB_POOL_SIZE` / `AE_DB_POOL_MAX_OVERFLOW` to `config.py`.
## 📋 Feature Tasks
- [ ] **[P1] Download endpoint account ownership check (June 2026):** `download_file_action` (hosted_file) has no per-record account isolation — any authenticated caller can download any file if they know the ID. After passing the auth gate (`account.auth_method != 'guest'`), the endpoint loads and serves the file with no check that `hosted_file.account_id` matches `account.account_id`. Fix: after loading `hosted_file_obj`, compare `hosted_file_obj.account_id` to `account.account_id`; reject with 403 unless `account.super` or `auth_method == 'bypass'`. Apply the same check in `download_event_file_action` after the `event_file` record is resolved (the event_file's `account_id` is the authoritative ownership gate for that path). Note: `?key=` and `?site_key=` bypass paths are intentionally loose and should remain unchanged for external link support.
- [ ] **[P2] `archive_content` download endpoint (June 2026):** No dedicated download endpoint exists for `archive_content`. Previously, callers could pass an `archive_content_id` to the `hosted_file` download endpoint and cross-resolution would handle it — this was removed as part of the Option B strict-ID-typing change. Needs a new `GET /v3/action/archive_content/{archive_content_id}/download` endpoint that resolves `archive_content_id → hosted_file_id` explicitly (same pattern as `download_event_file_action`).
- [x] **`order_by_li` view-join ambiguity fix (June 2026):** Using view-only join columns (e.g. `event_presenter_family_name` from `v_event_file`) in `order_by_li` caused MariaDB error "Unknown column 'account_id' in WHERE" (HTTP 400). Root cause: `filter_order_by` validated columns against the view — which passes for join-derived fields — and `sql_and_qry_part` generates an unqualified `account_id =` clause that becomes ambiguous when MariaDB expands the view's JOIN inline. Fix: `filter_order_by` now accepts `raw_table_name` and validates ORDER BY columns against the physical table only. Join-only view columns are silently stripped. Updated all three call sites in `api_crud_v3.py` (×2) and `api_crud_v3_nested.py` (×2). **Follow-up (lower priority):** qualify `account_id` in `sql_and_qry_part` to fix the root ambiguity for any future JOIN-capable views.
- [x] **Core Isolation:** Harden `apply_forced_account_filter` to Fail-Closed.
- [x] **IDAA Baseline:** Remove `public_read` from Event, CMS, and Archive objects.