V3 Migration Phase 1: Stabilize Hosted File models, IDs, and whitelisting. Added comprehensive verification tests.

This commit is contained in:
Scott Idem
2026-01-22 18:30:34 -05:00
parent df0ce7f910
commit 1837b442cf
2 changed files with 106 additions and 29 deletions


@@ -1,7 +1,7 @@
import datetime, pytz
from typing import Dict, List, Optional, Set, Union
from pydantic import BaseModel, EmailStr, Field, Json, PrivateAttr, ValidationError, validator
from typing import Dict, List, Optional, Set, Union, ClassVar
from pydantic import BaseModel, EmailStr, Field, Json, PrivateAttr, ValidationError, validator, root_validator
from app.db_sql import redis_lookup_id_random
from app.lib_general import log, logging
@@ -14,22 +14,13 @@ class Hosted_File_Link_Base(BaseModel):
log.setLevel(logging.WARNING) # DEBUG, INFO, WARNING, ERROR, EXCEPTION, CRITICAL
log.debug(locals())
# id_random: Optional[str] = Field(
# **base_fields['hosted_file_link_id_random'],
# alias = 'hosted_file_link_id_random',
# )
id: Optional[int] = Field(
#alias = 'hosted_file_link_id'
)
account_id_random: Optional[str]
account_id: Optional[int]
id: Optional[Union[int, str]] = Field(None)
account_id: Optional[Union[int, str]] = Field(None, **base_fields['account_id_random'])
hosted_file_id_random: Optional[str]
hosted_file_id: Optional[int]
hosted_file_id: Optional[Union[int, str]] = Field(None, **base_fields['hosted_file_id_random'])
link_to_type: Optional[str] # Should this be renamed to "link_to_obj_type" for clarity?
link_to_id_random: Optional[str] # Should this be renamed to "link_to_obj_id_random" for clarity?
link_to_id: Optional[int] # Should this be renamed to "link_to_obj_id" for clarity?
link_to_id: Optional[Union[int, str]] = Field(None) # Random string or integer
# notes: Optional[str]
created_on: Optional[datetime.datetime] = None
@@ -40,21 +31,33 @@ class Hosted_File_Link_Base(BaseModel):
_processed_at: datetime.datetime = PrivateAttr(default_factory=datetime.datetime.now)
@validator('account_id', always=True)
def account_id_lookup(cls, v, values, **kwargs):
if isinstance(v, int) and v > 0: return v
elif id_random := values.get('account_id_random'):
return redis_lookup_id_random(record_id_random=id_random, table_name='account')
return None
@root_validator(pre=True)
def map_v3_ids(cls, values):
"""
Vision Transformer:
Map DB keys to clean API keys and strip internal integers.
"""
# 1. Map account_id
if a_rid := values.get('account_id_random'):
if not isinstance(values.get('account_id'), int):
values['account_id'] = a_rid
@validator('link_to_id', always=True)
def link_to_id_lookup(cls, v, values, **kwargs):
log.setLevel(logging.WARNING)
log.debug(locals())
# 2. Map hosted_file_id
if f_rid := values.get('hosted_file_id_random'):
if not isinstance(values.get('hosted_file_id'), int):
values['hosted_file_id'] = f_rid
if values['link_to_id_random'] and values['link_to_type']:
return redis_lookup_id_random(record_id_random=values['link_to_id_random'], table_name=values['link_to_type'])
return None
# 3. Map link_to_id
if l_rid := values.get('link_to_id_random'):
if not isinstance(values.get('link_to_id'), int):
values['link_to_id'] = l_rid
return values
# Fields that are part of the model (for reading) but should not be saved to the DB table
fields_to_exclude_from_db: ClassVar[list] = [
'link_to'
]
class Config:
underscore_attrs_are_private = True


@@ -0,0 +1,74 @@
# Aether V3: Hosted File System Migration Plan
## 1. Overview
The goal of this project is to migrate the existing `hosted_file` and `hosted_file_link` logic into the **CRUD V3 Architecture**. This involves splitting the system into a **Standard Record Layer** (metadata) and a **Specialized Action Layer** (binary handling).
## 2. Core Requirements
- **Relational Integrity:** Fully utilize `hosted_file_link` for all object associations.
- **Deduplication:** Automatic filesystem and DB hash-checks before creating new records.
- **Cleanup:** Intelligent "Orphan" removal logic via `rm_orphan` flag.
- **Flexible Auth:** Support uploads/downloads with JWT, without JWT (Guest), and via URL-key fallback (bypass API Key requirement).
- **Binary Support:** High-performance streaming, byte-range seeking, and multi-file POST handling.
- **Developer Experience (DX):** Integrated `delay_ms` simulation and extension whitelisting.
---
## 3. Implementation Phases (Bite-Sized Chunks)
### Phase 1: V3 Metadata Baseline
*Status: Ready to start*
- Whitelist `hosted_file` and `hosted_file_link` in `obj_type_kv_li`.
- Verify standard V3 Search works for files (filtering by account, hash, etc.).
- Enable `PATCH /v3/crud/hosted_file/{id}` for metadata updates (title, description).
- Implement soft delete (hide) using standard `DELETE ...?method=hide`.
### Phase 2: V3 Action Router Scaffolding
- Create `app/routers/api_v3_actions_hosted_file.py`.
- Implement `delay_ms` middleware/logic for action routes.
- Implement specialized Extension Validator.
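
The two Phase 2 helpers could look roughly like the sketch below. It is a minimal illustration only: the whitelist contents, the `MAX_DELAY_MS` cap, and the function names `is_extension_allowed`, `clamp_delay_ms`, and `simulate_delay` are assumptions made for this example and are not defined anywhere in this plan.

```python
import asyncio
from pathlib import PurePosixPath

# Hypothetical whitelist; the real list of allowed extensions is not specified here.
ALLOWED_EXTENSIONS = frozenset({".pdf", ".png", ".jpg", ".txt"})

# Assumed upper bound so a client cannot stall a worker indefinitely.
MAX_DELAY_MS = 5000


def is_extension_allowed(filename: str, allowed=ALLOWED_EXTENSIONS) -> bool:
    """Return True if the final suffix of the filename is whitelisted (case-insensitive)."""
    suffix = PurePosixPath(filename).suffix.lower()
    return suffix in allowed


def clamp_delay_ms(delay_ms: int, max_ms: int = MAX_DELAY_MS) -> int:
    """Clamp a requested delay into the range [0, max_ms]."""
    return max(0, min(int(delay_ms), max_ms))


async def simulate_delay(delay_ms: int) -> None:
    """Sleep for the clamped delay; intended to be awaited at the top of an action route."""
    await asyncio.sleep(clamp_delay_ms(delay_ms) / 1000)
```

Only the last suffix is checked, so `archive.tar.gz` is judged by `.gz`. Whether multi-part suffixes need special handling is an open design question for the real validator.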
### Phase 3: Enhanced Binary Actions
- **Download Action:** Port the streamer logic to `/v3/action/hosted_file/{id}/download`.
  - Add URL-param fallback for API Key/Auth bypass.
- **Upload Action:** Implement `/v3/action/hosted_file/upload`.
  - Support both single and `List[UploadFile]`.
  - Implement the Hash-Lookup-Before-Write logic.
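
The Hash-Lookup-Before-Write step can be sketched as follows. This is an illustration under stated assumptions: SHA-256 is assumed as the hash (the plan does not name one), a plain dict stands in for the database hash index, and `sha256_of_stream` and `store_if_new` are hypothetical names.

```python
import hashlib
import io
from typing import BinaryIO, Dict, Tuple


def sha256_of_stream(stream: BinaryIO, chunk_size: int = 65536) -> str:
    """Hash a stream in fixed-size chunks so large uploads are never fully in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def store_if_new(stream: BinaryIO, index: Dict[str, int], next_id: int) -> Tuple[int, bool]:
    """Return (file_id, created).

    `index` is an in-memory stand-in for a hash -> hosted_file id lookup.
    If the hash is already known, the existing id is reused and nothing is written.
    """
    file_hash = sha256_of_stream(stream)
    existing_id = index.get(file_hash)
    if existing_id is not None:
        return existing_id, False
    index[file_hash] = next_id
    return next_id, True


# Usage: a second upload with identical bytes reuses the first record.
hash_index: Dict[str, int] = {}
first = store_if_new(io.BytesIO(b"same bytes"), hash_index, next_id=1)
second = store_if_new(io.BytesIO(b"same bytes"), hash_index, next_id=2)
```

In a real upload path the duplicate case would still create a new `hosted_file_link` pointing at the existing record; only the binary write and the `hosted_file` insert are skipped.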
### Phase 4: Relational Cleanup & Linking
- **Relational Delete Logic:**
  - Implement `DELETE /v3/action/hosted_file/{id}`.
  - Support `method` parameter: `hide`, `disable`, `delete` (hard).
  - **Orphan Check:** Logic to count remaining links; if `rm_orphan=True` and count is 0, physically remove file and parent record.
- **Fake Delete (Test Mode):**
  - Specialized mode for testing frontend workflows without data loss.
  - Logic:
    1. Verify `hosted_file` record existence.
    2. Verify physical file existence on server.
    3. Verify `hosted_file_link` existence.
    4. Return 200 OK success response *without* executing the actual SQL DELETE or `os.unlink`.
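
The orphan check and the test-mode delete described above can be sketched together. Everything here is a stand-in: `FileStore` is a hypothetical in-memory substitute for the `hosted_file` table, the `hosted_file_link` table, and the filesystem, and `delete_link` with its `fake` flag is an assumed shape for the action, not the planned signature.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class FileStore:
    """In-memory stand-in for the two tables and the disk (illustration only)."""
    files: Dict[int, str] = field(default_factory=dict)  # hosted_file id -> path
    links: Dict[int, int] = field(default_factory=dict)  # link id -> hosted_file id
    disk: Set[str] = field(default_factory=set)          # paths present on disk


def delete_link(store: FileStore, link_id: int, rm_orphan: bool = False, fake: bool = False) -> dict:
    """Remove one link; optionally remove the file when no links remain.

    With fake=True every existence check still runs, but nothing is mutated.
    """
    file_id = store.links.get(link_id)
    if file_id is None:
        return {"ok": False, "reason": "link not found"}
    path = store.files.get(file_id)
    if path is None:
        return {"ok": False, "reason": "file record not found"}
    if path not in store.disk:
        return {"ok": False, "reason": "file missing on disk"}

    # Count the links that would remain after this one is removed.
    remaining = sum(1 for fid in store.links.values() if fid == file_id) - 1
    is_orphan = rm_orphan and remaining == 0

    if not fake:
        del store.links[link_id]
        if is_orphan:
            store.disk.discard(path)  # stands in for os.unlink
            del store.files[file_id]  # stands in for the SQL DELETE

    return {"ok": True, "remaining_links": remaining, "file_removed": is_orphan and not fake}
```

Running the checks before branching on `fake` keeps the test mode faithful: a frontend sees the same failures it would see from a real delete, and only the destructive step is skipped.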
---
## 4. Technical Architecture
### Standard CRUD Routes (Metadata)
| Method | Endpoint | Description |
| :--- | :--- | :--- |
| `POST` | `/v3/crud/hosted_file/search` | Find files by hash, name, or account. |
| `PATCH` | `/v3/crud/hosted_file/{id}` | Update title, description, or notes. |
| `DELETE` | `/v3/crud/hosted_file/{id}` | Soft-delete (Hide) the file record. |
### Specialized Action Routes (Binary)
| Method | Endpoint | Description |
| :--- | :--- | :--- |
| `POST` | `/v3/action/hosted_file/upload` | Upload 1+ files; handles deduplication. |
| `GET` | `/v3/action/hosted_file/{id}/download` | Stream binary data; supports range seeking. |
| `DELETE` | `/v3/action/hosted_file/{id}` | Removes link; optionally deletes orphan file. |
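
Range seeking on the download route depends on parsing the HTTP `Range` header. The sketch below handles a single `bytes=` range only; `parse_range` is a hypothetical helper, and returning `None` is assumed to mean "unsatisfiable or unsupported", which the route would then map to a 416 response or a full-body response as it sees fit.

```python
import re
from typing import Optional, Tuple

# Matches a single byte range such as "bytes=0-99", "bytes=500-" or "bytes=-200".
_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def parse_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Return an inclusive (start, end) byte range, or None if it cannot be satisfied."""
    match = _RANGE_RE.match(header.strip())
    if not match or size <= 0:
        return None
    start_s, end_s = match.groups()
    if not start_s and not end_s:
        return None
    if not start_s:
        # Suffix form: the last N bytes of the file.
        length = int(end_s)
        if length == 0:
            return None
        return max(size - length, 0), size - 1
    start = int(start_s)
    if start >= size:
        return None
    # An end past the last byte is clamped to the end of the file.
    end = min(int(end_s), size - 1) if end_s else size - 1
    if end < start:
        return None
    return start, end
```

For a 1000-byte file, `parse_range("bytes=500-", 1000)` yields the inclusive range from byte 500 to byte 999. The streamer would then send 500 bytes with a `Content-Range: bytes 500-999/1000` header.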
---
## 5. Testing & Verification Strategy
For every chunk, we will create/update:
1. **Logic Test:** Unit test for the internal method (e.g., `lookup_file_hash`).
2. **E2E Test:** Live network test against the dev API to verify real record creation and file persistence.
3. **Security Test:** Verification of the "Bypass" modes (Site Key / URL Key).