Documentation Standardisation & Unit Test Stabilization

- Overhauled README.md to serve as a unified system index and WIP tracker.
- Standardized documentation filenames (ARCH__, GUIDE__, PLAN__) for better discoverability.
- Archived completed project plans and scopes.
- Fixed regressions in unit tests (errors, models, email) caused by V3 architectural updates.
- Ensured unit tests remain non-destructive and environment-independent via mocking.
This commit is contained in:
Scott Idem
2026-01-28 12:15:01 -05:00
parent 3eaf176b05
commit 860cf80a4e
17 changed files with 118 additions and 28 deletions

View File

@@ -0,0 +1,255 @@
# Common Parameters Refactoring Proposal for Aether API V3
## 1. Problem Statement
Currently, the `common_route_params` dependency in `app/lib_general.py` bundles many disparate concerns:
- Account authentication/authorization context (`x_account_id`).
- Pagination (`limit`, `offset`).
- Status filtering (`enabled`, `hidden`).
- Pydantic serialization options (`by_alias`, `exclude_unset`).
- FastAPI internal (`response` object).
- Request delay (`delay_ms`, `X-Delay-ms`).
While functional, this monolithic dependency leads to:
- **Lack of Granularity:** Every route that uses `common_route_params` inherits all its parameters, even if it only needs a subset.
- **Reduced Clarity:** The purpose of each parameter is obscured within a single, large object.
- **Limited Flexibility:** Difficult to customize or override specific common parameters without affecting others.
- **Suboptimal OpenAPI Docs:** FastAPI generates a single, large set of parameters for `commons`, which can be less precise in the Swagger UI.
## 2. Proposed Solution: Granular Dependencies
The proposal is to break down `common_route_params` into smaller, more focused FastAPI dependency injection functions and Pydantic models. This leverages FastAPI's powerful dependency injection system to achieve clarity, flexibility, and better OpenAPI documentation.
All new models and dependency functions will reside in `app/lib_general.py`.
### A. `AccountContext` (Pydantic Model & Dependency Function)
* **Purpose:** Encapsulates the resolved account context, including the integer `account_id` and the random `account_id_random` string. It also handles the authentication/authorization logic for `x_account_id`.
* **Dependency Function:** `get_account_context()`
**Proposed `app/lib_general.py` additions:**
```python
# --- Pydantic Model for Account Context ---
class AccountContext(BaseModel):
account_id: int
account_id_random: str
# Add any other account-specific context needed here
# --- Dependency Function for Account Context ---
def get_account_context(
x_account_id: Optional[str] = Header(None, min_length=11, max_length=22),
x_no_account_id: Optional[str] = Header(None, min_length=11, max_length=22),
x_no_account_id_token: Optional[str] = Query(None, min_length=11, max_length=22), # Changed from Header to Query for easier testing
) -> AccountContext:
"""
Resolves the account context from headers/query parameters.
Raises HTTPException 403 if no valid account is found.
"""
log.setLevel(logging.DEBUG) # Adjust as needed
log.debug(locals())
resolved_account_id = None
resolved_account_id_random = None
if x_account_id:
resolved_account_id_random = x_account_id
if resolved_account_id := redis_lookup_id_random(table_name='account', record_id_random=x_account_id):
log.info(f'Found account from x-account-id: {resolved_account_id}')
else:
log.warning(f'Invalid x-account-id provided: {x_account_id}')
raise HTTPException(status_code=403, detail='Invalid X-Account-ID.')
elif x_no_account_id and len(x_no_account_id) >= 11:
# User specified no account ID but might have a special bypass token/logic
# USER INPUT REQUIRED: Clarify logic for x_no_account_id vs x_no_account_id_token
# For now, treat as no specific account and potentially allow access based on other factors
log.info(f'X-No-Account-ID header found: {x_no_account_id}. Proceeding without specific account context.')
resolved_account_id = 0 # Or some other indicator for "no specific account"
resolved_account_id_random = '--- NO ACCOUNT ---'
elif x_no_account_id_token and len(x_no_account_id_token) >= 11:
# USER INPUT REQUIRED: How should x_no_account_id_token be validated?
# Current logic in common_route_params looks up account based on this token.
resolved_account_id_random = x_no_account_id_token
if resolved_account_id := redis_lookup_id_random(table_name='account', record_id_random=x_no_account_id_token):
log.info(f'Found account from x_no_account_id_token: {resolved_account_id}')
else:
log.warning(f'Invalid x_no_account_id_token provided: {x_no_account_id_token}')
raise HTTPException(status_code=403, detail='Invalid X-No-Account-ID-Token.')
else:
log.warning('No valid account context provided via X-Account-ID or X-No-Account-ID-Token.')
raise HTTPException(status_code=403, detail='Account context required. Please provide X-Account-ID or X-No-Account-ID-Token.')
return AccountContext(account_id=resolved_account_id, account_id_random=resolved_account_id_random)
```
---
### B. `PaginationParams` (Pydantic Model & Dependency Function)
* **Purpose:** Encapsulates pagination parameters (`limit`, `offset`).
* **Dependency Function:** `get_pagination_params()`
**Proposed `app/lib_general.py` additions:**
```python
# --- Pydantic Model for Pagination ---
class PaginationParams(BaseModel):
limit: int = 100 # Default limit
offset: int = 0
# --- Dependency Function for Pagination ---
def get_pagination_params(
limit: int = Query(100, ge=0, description="Maximum number of items to return"),
offset: int = Query(0, ge=0, description="Number of items to skip (for pagination)"),
) -> PaginationParams:
return PaginationParams(limit=limit, offset=offset)
```
---
### C. `StatusFilterParams` (Pydantic Model & Dependency Function)
* **Purpose:** Encapsulates status filtering parameters (`enabled`, `hidden`).
* **Dependency Function:** `get_status_filter_params()`
**Proposed `app/lib_general.py` additions:**
```python
# --- Pydantic Model for Status Filtering ---
class StatusFilterParams(BaseModel):
enabled: str = 'enabled' # 'enabled', 'disabled', 'all'
hidden: str = 'not_hidden' # 'hidden', 'not_hidden', 'all'
# --- Dependency Function for Status Filtering ---
def get_status_filter_params(
enabled: str = Query('enabled', description="Filter by object enabled status ('enabled', 'disabled', 'all')"),
hidden: str = Query('not_hidden', description="Filter by object hidden status ('hidden', 'not_hidden', 'all')"),
) -> StatusFilterParams:
# USER INPUT REQUIRED: Should we validate 'enabled' and 'hidden' values here
# (e.g., raise HTTP 400 if not 'enabled', 'disabled', 'all')?
return StatusFilterParams(enabled=enabled, hidden=hidden)
```
---
### D. `SerializationParams` (Pydantic Model & Dependency Function)
* **Purpose:** Encapsulates Pydantic serialization options (`by_alias`, `exclude_unset`).
* **Dependency Function:** `get_serialization_params()`
**Proposed `app/lib_general.py` additions:**
```python
# --- Pydantic Model for Serialization Options ---
class SerializationParams(BaseModel):
by_alias: bool = True
exclude_unset: bool = False
# --- Dependency Function for Serialization Options ---
def get_serialization_params(
by_alias: bool = Query(True, description="Whether to use field aliases for serialization"),
exclude_unset: bool = Query(False, description="Whether to exclude unset fields from the response"),
) -> SerializationParams:
return SerializationParams(by_alias=by_alias, exclude_unset=exclude_unset)
```
---
### E. `DelayParams` (Pydantic Model & Dependency Function)
* **Purpose:** Encapsulates the calculated `sleep_time` based on header/query delay parameters.
* **Dependency Function:** `get_delay_params()`
**Proposed `app/lib_general.py` additions:**
```python
# --- Pydantic Model for Delay ---
class DelayParams(BaseModel):
sleep_time_ms: int = 0 # Raw delay value in ms
sleep_time_s: float = 0.0 # Converted to seconds for time.sleep()
# --- Dependency Function for Delay ---
def get_delay_params(
x_delay_ms: Optional[int] = Header(0, alias='X-Delay-ms', description="Delay response for X milliseconds (header)"),
delay_ms: Optional[int] = Query(0, description="Delay response for X milliseconds (query parameter)"),
) -> DelayParams:
calculated_delay_ms = max(x_delay_ms or 0, delay_ms or 0)
return DelayParams(sleep_time_ms=calculated_delay_ms, sleep_time_s=calculated_delay_ms / 1000.0)
```
---
### F. Refactoring `common_route_params` and `common_route_params_no_account_id`
The existing `common_route_params` and `common_route_params_no_account_id` functions, along with their associated Pydantic models, would be deprecated or removed, as their functionality is now covered by these new granular dependencies.
## 3. How this affects route functions in `api_crud_v3.py`
Route functions will now explicitly declare only the dependencies they need. This significantly cleans up the function signatures and makes their requirements explicit.
**Example `get_obj_li` function signature update:**
```python
# Old:
# async def get_obj_li(
# obj_type_l1: str,
# for_obj_type: Optional[str] = None,
# for_obj_id: Optional[str] = None,
# hidden: str = 'not_hidden',
# order_by_li: Optional[str] = None,
# jp: Optional[Union[str, None]] = None,
# x_delay_ms: Optional[int] = Header(0, alias='X-Delay-ms'),
# delay_ms: Optional[int] = Query(0),
# commons: Common_Route_Params = Depends(common_route_params),
# ):
# New (using new dependencies):
from fastapi import Depends, Header, Query, HTTPException
from pydantic import BaseModel
from typing import Optional, Union # ... other typing imports (existing)
import time # (existing)
# ... (assuming new AccountContext, PaginationParams, etc. and their get_X_params functions are defined in lib_general)
@router.get('/{obj_type_l1}/', response_model=Resp_Body_Base)
async def get_obj_li(
obj_type_l1: str,
account_context: AccountContext = Depends(get_account_context),
pagination: PaginationParams = Depends(get_pagination_params),
status_filter: StatusFilterParams = Depends(get_status_filter_params),
serialization: SerializationParams = Depends(get_serialization_params),
delay_params: DelayParams = Depends(get_delay_params),
for_obj_type: Optional[str] = Query(None), # Still needed for parent filtering
for_obj_id: Optional[str] = Query(None), # Still needed for parent filtering
order_by_li: Optional[str] = None,
jp: Optional[Union[str, None]] = None,
# The 'response: Response' object will be handled by FastAPI directly, or can be passed through a dependency if really needed.
):
# Apply delay
if delay_params.sleep_time_s > 0:
time.sleep(delay_params.sleep_time_s)
# Use parameters:
# account_id = account_context.account_id
# account_id_random = account_context.account_id_random
# limit = pagination.limit
# offset = pagination.offset
# enabled_status = status_filter.enabled
# hidden_status = status_filter.hidden
# by_alias_option = serialization.by_alias
# exclude_unset_option = serialization.exclude_unset
# ... rest of your logic, now using these granular parameters ...
```
## 4. User Input Required
Please review this proposal. Specifically, I need your input on:
1. **Approval of the Overall Structure:** Do you agree with breaking down `common_route_params` into these more granular dependencies?
2. **`AccountContext` Logic:**
* How should `x_no_account_id` be handled precisely? Is it just a flag to bypass `x_account_id` validation, or does it imply a specific "anonymous" account context?
* How should `x_no_account_id_token` be validated? The current code looks up an account based on it, but is there a specific token validation mechanism beyond just lookup? Should it have a default value for unauthenticated use?
3. **`StatusFilterParams` Validation:** Should the `enabled` and `hidden` query parameters (e.g., `'enabled'`, `'disabled'`, `'all'`, `'hidden'`, `'not_hidden'`, `'all'`) be strictly validated within `get_status_filter_params` (e.g., raising `HTTPException 400` if an invalid string is provided)?
Your feedback on these points is crucial before proceeding with implementation.

View File

@@ -0,0 +1,74 @@
# Aether V3: Hosted File System Migration Plan
## 1. Overview
The goal of this project is to migrate the existing `hosted_file` and `hosted_file_link` logic into the **CRUD V3 Architecture**. This involves splitting the system into a **Standard Record Layer** (metadata) and a **Specialized Action Layer** (binary handling).
## 2. Core Requirements
- **Relational Integrity:** Fully utilize `hosted_file_link` for all object associations.
- **Deduplication:** Automatic filesystem and DB hash-checks before creating new records.
- **Cleanup:** Intelligent "Orphan" removal logic via `rm_orphan` flag.
- **Flexible Auth:** Support uploads/downloads with JWT, without JWT (Guest), and via URL-key fallback (bypass API Key requirement).
- **Binary Support:** High-performance streaming, byte-range seeking, and multi-file POST handling.
- **Developer DX:** Integrated `delay_ms` simulation and extension whitelisting.
---
## 3. Implementation Phases (Bite-Sized Chunks)
### Phase 1: V3 Metadata Baseline
*Status: Ready to start*
- Whitelist `hosted_file` and `hosted_file_link` in `obj_type_kv_li`.
- Verify standard V3 Search works for files (filtering by account, hash, etc.).
- Enable `PATCH /v3/crud/hosted_file/{id}` for metadata updates (title, description).
- Implement "Fake Delete" using standard `DELETE ...?method=hide`.
### Phase 2: V3 Action Router Scaffolding
- Create `app/routers/api_v3_actions_hosted_file.py`.
- Implement `delay_ms` middleware/logic for action routes.
- Implement specialized Extension Validator.
### Phase 3: Enhanced Binary Actions
- **Download Action:** Port the streamer logic to `/v3/action/hosted_file/{id}/download`.
- Add URL-param fallback for API Key/Auth bypass.
- **Upload Action:** Implement `/v3/action/hosted_file/upload`.
- Support both single and `List[UploadFile]`.
- Implement the Hash-Lookup-Before-Write logic.
### Phase 4: Relational Cleanup & Linking
- **Relational Delete Logic:**
- Implement `DELETE /v3/action/hosted_file/{id}`.
- Support `method` parameter: `hide`, `disable`, `delete` (hard).
- **Orphan Check:** Logic to count remaining links; if `rm_orphan=True` and count is 0, physically remove file and parent record.
- **Fake Delete (Test Mode):**
- Specialized mode for testing frontend workflows without data loss.
- Logic:
1. Verify `hosted_file` record existence.
2. Verify physical file existence on server.
3. Verify `hosted_file_link` existence.
4. Return 200 OK success response *without* executing the actual SQL DELETE or `os.unlink`.
---
## 4. Technical Architecture
### Standard CRUD Routes (Metadata)
| Method | Endpoint | Description |
| :--- | :--- | :--- |
| `POST` | `/v3/crud/hosted_file/search` | Find files by hash, name, or account. |
| `PATCH` | `/v3/crud/hosted_file/{id}` | Update title, description, or notes. |
| `DELETE` | `/v3/crud/hosted_file/{id}` | Soft-delete (Hide) the file record. |
### Specialized Action Routes (Binary)
| Method | Endpoint | Description |
| :--- | :--- | :--- |
| `POST` | `/v3/action/hosted_file/upload` | Upload 1+ files; handles deduplication. |
| `GET` | `/v3/action/hosted_file/{id}/download` | Stream binary data; supports range seeking. |
| `DELETE` | `/v3/action/hosted_file/{id}` | Removes link; optionally deletes orphan file. |
---
## 5. Testing & Verification Strategy
For every chunk, we will create/update:
1. **Logic Test:** Unit test for the internal method (e.g., `lookup_file_hash`).
2. **E2E Test:** Live network test against the dev API to verify real record creation and file persistence.
3. **Security Test:** Verification of the "Bypass" modes (Site Key / URL Key).

View File

@@ -0,0 +1,48 @@
# Aether V3 Schema Discovery Plan
## Objective
Provide an endpoint that returns the structure of an Aether object type, combining database metadata with Pydantic model definitions. This helps the frontend adapt to schema changes and understand field types/aliases without hardcoding.
## Proposed Endpoint
`GET /v3/crud/{obj_type}/schema`
### Query Parameters
- `view`: (Optional) `default`, `enriched`, `detail`. Determines which database table/view and Pydantic model to inspect.
- `variant`: (Optional) `in`, `out`, `base`. Selects the specific Pydantic model variant (e.g., `mdl_in` for write operations).
## Implementation Details
### 1. Database Metadata
- Execute `DESCRIBE {table_name}` or `SHOW COLUMNS FROM {table_name}`.
- Extract: `Field`, `Type`, `Null`, `Key`, `Default`, `Extra`.
### 2. Pydantic Metadata
- Inspect the model's `__fields__`.
- Map Pydantic field names to database aliases.
- Extract: `type`, `required`, `default`, `description` (from Field extra), `validators`.
### 3. Response Structure
```json
{
"object_type": "account",
"database": {
"table_name": "account",
"columns": [...]
},
"model": {
"model_name": "Account_Base",
"fields": {
"account_id_random": {
"alias": "id_random",
"type": "string",
"required": false
},
...
}
}
}
```
## Security
- Requires standard V3 authentication.
- Only returns schema for objects defined in `obj_type_kv_li`.

View File

@@ -0,0 +1,81 @@
# Scope: Automated Field Management Tool (MCP)
**Task ID:** 153357623
**Type:** MCP Tool / Dev Utility
**Target System:** Aether API V3 (FastAPI + MariaDB)
## 1. Objective
To create a safe, automated "Vertical Slice" tool that handles the repetitive and error-prone process of adding or modifying fields in the Aether backend.
**Current Manual Process (Pain Points):**
1. `ALTER TABLE` in MariaDB (risk of locking or syntax errors).
2. Update/Recreate `v_[object]` SQL View (easy to forget, breaks Triple ID).
3. Update Pydantic Models in `app/models/` (tedious, risk of schema drift).
4. Update `searchable_fields` in `app/object_definitions/` (often forgotten).
## 2. Proposed Solution: `ae_field_manage`
A Python script in the MCP toolset (`agents_sync/scripts/ae_field_manage.py`) that orchestrates these steps atomically.
### Usage Example
```bash
# CLI Usage
python3 ae_field_manage.py add person nickname --type string --length 50 --nullable
# MCP Tool Usage
{
"tool": "ae_field_manage",
"action": "add",
"object": "person",
"field": "nickname",
"params": { "type": "string", "length": 50 }
}
```
## 3. Implementation Steps
### Phase 1: Analysis & Safety (Dry Run)
* [ ] **Input Validation:** Ensure field name is snake_case and valid.
* [ ] **Schema Check:** Check if field already exists in DB or Pydantic.
* [ ] **Impact Analysis:** Identify which files will be modified (e.g., `app/models/person_models.py`).
### Phase 2: Database Operations
* [ ] **Backup:** (Optional) Trigger a quick table dump?
* [ ] **DDL Execution:** Run `ALTER TABLE {table} ADD COLUMN {field} {type} ...`.
* [ ] **View Recreation:**
* Fetch current view definition OR construct standard V3 view.
* Ensure Triple ID pattern is preserved (`id`, `{type}_id`, `{type}_id_random`).
* Execute `CREATE OR REPLACE VIEW`.
### Phase 3: Codebase Modification (AST/Regex)
* [ ] **Pydantic Model Update:**
* Locate `class {Object}_Base(BaseModel):`.
* Inject field definition: `nickname: Optional[str] = Field(None, max_length=50)`.
* Use AST (Abstract Syntax Tree) parsing for safety, or careful Regex if AST is too complex for simple insertions.
* [ ] **Search Registry Update:**
* If flag `--searchable` is set, add to `searchable_fields` list in `app/object_definitions/core.py`.
### Phase 4: Verification
* [ ] **Integrity Check:** Run `ae_check_integrity` to verify View health.
* [ ] **Schema Sync:** Trigger `ae_schema_sync` to update Frontend TS interfaces (if applicable).
## 4. Technical Constraints
* **Database:** MariaDB 10.11.
* **Language:** Python 3.10+.
* **ORM:** SQLAlchemy (for type mapping), but raw SQL for DDL.
* **File Paths:** Must respect `~/OSIT_dev/` structure.
## 5. Risks & Mitigation
* **Risk:** Corrupting `person_models.py`.
* *Mitigation:* Create a `.bak` copy before writing.
* **Risk:** Locking the DB table.
* *Mitigation:* Verify no active heavy transactions (low priority for dev env).
* **Risk:** View definition divergence.
* *Mitigation:* The tool should ideally *read* the existing view logic or use a standard template.
## 6. Definition of Done
* Can add a string/int/bool field to `person` via one command.
* Database column exists.
* View `v_person` includes the column.
* `Person_Base` model includes the field.
* `ae_obj_info person` reflects the change.