Security: Implement modern JWT authentication for V3 CRUD and Search; update documentation and to-do list.

This commit is contained in:
Scott Idem
2026-01-02 20:26:44 -05:00
parent 53d252b23d
commit f865b1cfb7
4 changed files with 164 additions and 115 deletions

View File

@@ -18,78 +18,44 @@ I am an interactive CLI agent assisting with software engineering tasks for One
- **System Name:** Aether (AE). - **System Name:** Aether (AE).
- **Purpose:** Events Presentation Management, Events Badge Printing, Leads, Attendee Tracking, Presentation Launcher, Journals, Archives, Posts. - **Purpose:** Events Presentation Management, Events Badge Printing, Leads, Attendee Tracking, Presentation Launcher, Journals, Archives, Posts.
- **Started:** Mid-2018. - **Started:** Mid-2018.
- **Frontend History:** Python Flask -> Svelte (current: SvelteKit). This explains legacy API calls. - **Frontend History:** Python Flask -> Svelte (current: SvelteKit).
- **Current API Version (FastAPI):** Roughly v2.5. - **Current API Version (FastAPI):** v4.9.0.
- **Target API Version:** v3. - **V3 Implementation:** Modern parallel CRUD and Search endpoints under `/v3/crud`.
### API Versioning & Strategy ### API Versioning & Strategy
- `/crud` (v1): Legacy, still used by some older frontend parts. Defined in `app/routers/api_crud.py`. **Remains untouched.** - `/crud` (v1): Legacy, still used by older frontend parts.
- `/v2/crud` (v2.5): Modern, preferred, and mostly functional endpoint. Defined in `app/routers/api_crud_v2.py`. **Remains untouched.** - `/v2/crud` (v2.5): Modern, preferred, and mostly functional endpoint.
- `/v3/crud`: The goal of this project phase. A new, parallel implementation with a refined structure. **Will run alongside v1 and v2.** - `/v3/crud`: The goal of this project phase. A new, parallel implementation with a refined structure and advanced search. **Runs alongside v1 and v2.**
### V3 Architectural Goals ### Technical Learnings
- **Startup Errors & Logging:** The "worker failed to boot" error is often an import-time error or a logging configuration failure.
- **Root Cause:** If `logging.config.dictConfig` fails (e.g., due to missing `/logs` directories in Docker), the entire application crashes.
- **Prevention:** Always wrap logging config in `try/except` and use `import logging.config` explicitly.
- **Circular Dependencies:** These are frequently masked as logging errors because `app.log` is imported very early in most files. Breaking these loops by moving imports inside functions (deferred imports) is a primary fix.
- **V3 API Dependencies:** Standardized `Response` injection should use plain type hints (e.g., `response: Response`) to avoid router initialization failures.
- **Pydantic Compatibility:** The current environment uses Pydantic v1.10. Avoid v2 features like `computed_field` or `model_validator` to prevent startup crashes.
- **Nested URL Structure:** Enforce hierarchical relationships (e.g., `/v3/crud/{parent_type}/{parent_id}/{child_type}/...`). ### V3 Architectural Progress (Jan 2026)
- **Dedicated Router:** All v3 functionality will reside in `app/routers/api_crud_v3.py`.
- **Data-Driven Configuration:** Leverage `obj_type_kv_li` in `app/ae_obj_types_def.py` for object definitions (tables, models).
## Session Learnings & Progress (December 3, 2025) - **Modular Object Definitions:** Monolithic `ae_obj_types_def.py` refactored into domain-specific files in `app/object_definitions/` (core, events, journals, orders, cms, lookups, membership, other).
- **Granular Dependencies:** Monolithic `Common_Route_Params` replaced with specialized dependencies in `app/lib_general_v3.py` (AccountContext, Pagination, StatusFilter, Serialization, Delay).
- **Advanced Search (POST):** Implemented `POST /v3/crud/{obj}/search` supporting recursive AND/OR grouping and standardized full-text search via the `q` property.
- **Security Hardening:** Implemented a 5-level recursion depth limit and a field allowlist (`searchable_fields`) for the Search API.
- **Non-blocking Concurrency:** Standardized on `asyncio.sleep()` for delay simulation to prevent Gunicorn worker hangs.
### Strategy Shift & V3 Development ## Session Learnings & Progress (Jan 2-3, 2026)
* **Initial Plan Shift:** The strategy shifted from *migrating/replacing* existing v1/v2 routes to *building a new, parallel v3 implementation* from scratch. All existing v1/v2 routes will remain untouched. - **Logging Robustness:** All core modules and routers now use module-level loggers (`logging.getLogger(__name__)`). `app/log.py` includes robust `dictConfig` initialization with error handling.
* **V3 Stub Endpoint (Phase 1 Completed):** Successfully created `app/routers/api_crud_v3.py` and mounted it at `/v3/crud` in `app/main.py` with a working `/health` endpoint. - **Backward Compatibility:** Hybrid object definitions ensure that `/v2/crud` continues to work by including both modern (`tbl`, `mdl`) and legacy (`table_name`, `base_name`) keys.
- **FastAPI Best Practices:** Standardized `Response` injection via `response: Response` type hints instead of `Depends(Response)`.
- **Documentation:** Created `V3_FRONTEND_API_GUIDE.md` for Svelte/TypeScript integration and `V3_CRUD_ARCHITECTURE_AND_LEARNINGS.md` for backend maintenance.
### V3 CRUD Proof-of-Concept (Journal & Journal Entry - Phase 2 & 3 Completed) ## Current To-Do List
Implemented the full CRUD functionality for `journal` (top-level) and `journal_entry` (nested child object), demonstrating the v3 nested URL structure and its underlying logic: 1. **Docker Environment Insight Improvements (Priority: High)**: Implement methods/endpoints to give the agent more insight into the actual Docker runtime environment (environment variables, container status, log accessibility) to better diagnose recurring startup and configuration issues.
* **Top-Level Journal CRUD:** 2. **Security - Field Allowlists (Priority: High)**: Finish populating `searchable_fields` for all remaining object definitions.
* `GET /v3/crud/journal/{journal_id}` 3. **Security - Authentication (Priority: High)**: Continue refining and enforcing JWT-based authentication across all V3 endpoints.
* `GET /v3/crud/journal/` (list, with filtering via `for_obj_type`, `for_obj_id`, and `jp`) 4. **Specialized Endpoints (Priority: Medium)**: Identify and plan the modernization of custom logic (e.g., importing, websockets) to match V3 patterns.
* `POST /v3/crud/journal/` 5. **Directory Cleanup (Priority: Low)**: Long-term plan to archive old projects and standardize directory naming in `OSIT_dev`.
* `PATCH /v3/crud/journal/{journal_id}` 6. **Unused Route Cleanup**: Successfully commented out `cont_edu_cert` routers in `main.py`.
* `DELETE /v3/crud/journal/{journal_id}`
* **Nested Journal Entry CRUD:**
* `GET /v3/crud/journal/{journal_id}/journal_entry/` (list)
* `POST /v3/crud/journal/{journal_id}/journal_entry/`
* `GET /v3/crud/journal/{journal_id}/journal_entry/{entry_id}`
* `PATCH /v3/crud/journal/{journal_id}/journal_entry/{entry_id}`
* `DELETE /v3/crud/journal/{journal_id}/journal_entry/{entry_id}`
### V3 CRUD Refinements
* **Soft-Delete Functionality:** `DELETE` endpoints now support a `method` query parameter (`delete`, `disable`, `hide`) for soft-deleting (setting `enable=False` or `hide=True`) or hard-deleting records, mirroring v2 behavior.
* **Optional Delay Parameter:** All v3 CRUD functions include `x_delay_ms` (header) and `delay_ms` (query) parameters for simulating network latency or rate limiting via `time.sleep()`.
### Current Task: Common Parameters Refactoring (Reverted)
* **Goal:** Refactor the current monolithic `commons: Common_Route_Params` dependency into smaller, more granular FastAPI dependencies.
* **Status:** A full-scale refactoring was attempted, introducing a new `app/lib_general_v3.py` file and modifying `api_crud_v3.py`, `main.py`, and `models/response_models.py`. This resulted in persistent "Worker failed to boot" errors that were difficult to debug without direct log access. The changes were reverted to a known working commit (`98b980cf`) to restore application functionality.
### Operational Learnings
* **File Location:** The `GEMINI.md` file must always be located in the project root directory.
* **Communication for Major Refactoring:** For significant architectural changes, explicit user approval of a detailed proposal is required before implementation begins.
* **Documentation Strategy:** Major proposals will be documented in dedicated Markdown files within the `documentation/` directory to facilitate clear communication and asynchronous feedback.
* **Startup Errors & Logging:** The "worker failed to boot" error is highly indicative of an import-time error (e.g., circular dependency, missing module, or misconfiguration). A common cause is the logging configuration in `app/log.py` failing due to an uninitialized setting or file path issue.
* **Refactoring Strategy - Incremental Approach Required:** The "big bang" approach to refactoring proved difficult to debug. A more incremental strategy is required for the next attempt:
1. **Isolate Logging:** First, refactor all modules (e.g., `main.py`, `models/response_models.py`) to instantiate their own module-level logger (`logger = logging.getLogger(__name__)`) instead of importing a global `log` instance from `app.log`. This will break potential circular dependencies related to logging.
2. **Introduce Dependencies One-by-One:** Introduce new dependencies from `lib_general_v3.py` one at a time.
3. **Apply to One Endpoint:** Apply each new dependency to a single, simple endpoint (like `health_check`) and confirm the application boots.
4. **Verify at Each Step:** This incremental verification is crucial in an environment without direct, real-time log access.
* **Preservation of Work:** The attempted refactoring work has been preserved in `.snapshot` files for future reference.
---
## Learnings from previous session (December 2, 2025):
* **Docker Environment Challenges:** Debugging issues in a Dockerized FastAPI environment when running locally (outside the container's execution context) is significantly more challenging due to environment mismatches and symlinked executables. Direct `uvicorn` execution for debugging is not viable in this setup. This necessitates an approach that can either:
* Execute Python code snippets directly (e.g., for import validation).
* Rely on external tools (like `curl` or `requests` from another script) to interact with the Dockerized API for runtime testing.
* Assume that the Docker container provides the authoritative runtime environment, and local checks are primarily for static analysis (syntax, imports).
* **Need for Incremental Verification:** Given the complexity of the project and the debugging constraints, future changes must be exceptionally small, incremental, and verified through a robust testing strategy that can be executed, ideally, within the Docker environment or through isolated Python scripts.
* **Pydantic `BaseModel` Import:** Simple Pydantic `BaseModel` imports can be forgotten, leading to `NameError`. This highlights the need for automated linting or a minimal test harness that can quickly validate new model definitions in isolation.
* **Legacy Code vs. New Code:** When encountering errors, it's crucial to distinguish whether the error is in the new code being introduced or in existing legacy code that might have subtle interactions. The `422 Unprocessable Entity` errors were occurring in the legacy `/crud/` endpoints. This indicates that while our new factory code *itself* didn't cause those specific runtime errors, interactions or an underlying pre-existing issue became apparent when the application was loading.
* **`parent_table_name` for Nested CRUD:** The implementation detail of passing `parent_table_name` (or similar context) to child CRUD operations is essential for correctly linking nested resources in the database layer. The router factory's child object creation needs to pass this context explicitly.
* **API Endpoint Naming Convention:** The user prefers singular nouns for API endpoints (e.g., `/journal`, `/journal_entry`) over plural forms (e.g., `/journals`, `/journal_entries`). This convention will be followed for all new router creations.

View File

@@ -7,6 +7,7 @@ that are relevant to the v3 API, while removing unused or outdated functionaliti
# Standard library imports # Standard library imports
import time import time
import logging import logging
import jwt
from typing import ( from typing import (
Any, Any,
Dict, Dict,
@@ -34,64 +35,120 @@ from pydantic import (
# Internal imports (from this project) # Internal imports (from this project)
from app.config import settings from app.config import settings
from app.db_sql import redis_lookup_id_random
from app.log import get_logger
logger = get_logger(__name__) logger = logging.getLogger(__name__)
# --- Pydantic Model for Account Context --- def decode_jwt(
class AccountContext(BaseModel): secret_key: str,
account_id: Optional[int] token: str,
account_id_random: Optional[str] ) -> dict:
# --- Dependency Function for Account Context ---
def get_account_context(
x_account_id: Optional[str] = Header(None, min_length=11, max_length=22),
x_no_account_id: Optional[str] = Header(None, min_length=3, max_length=100), # Assuming 'bypass' or similar string
x_no_account_id_token: Optional[str] = Query(None, min_length=11, max_length=22),
) -> AccountContext:
""" """
Resolves the account context from headers/query parameters with defined precedence. Decodes and validates a JWT token.
Precedence: x_account_id (header) > x_no_account_id_token (query) > x_no_account_id (header flag) Ported from lib_general.py to break circular dependencies.
Raises HTTPException 403 if no valid account is found and no bypass is indicated.
""" """
logger.setLevel(logging.DEBUG) # Adjust as needed algorithm = 'HS256'
logger.debug(locals()) try:
decoded_token = jwt.decode(token, secret_key, algorithms=[algorithm])
if decoded_token['eat'] >= time.time():
return decoded_token
else:
return False
except Exception:
return None
resolved_account_id = None
resolved_account_id_random = None
# --- Pydantic Model for Authentication Context ---
class AuthContext(BaseModel):
account_id: Optional[int] = None
account_id_random: Optional[str] = None
user_id: Optional[int] = None
person_id: Optional[int] = None
auth_method: str = 'none' # 'jwt_header', 'jwt_query', 'legacy_header', 'bypass'
# Alias for backward compatibility with initial V3 implementation
AccountContext = AuthContext
# --- Dependency Function for V3 Authentication ---
def get_v3_auth_context(
request: Request,
authorization: Optional[str] = Header(None, description="Bearer <jwt_token>"),
jwt_query: Optional[str] = Query(None, alias="jwt", description="JWT token for URL-based auth (e.g., file downloads)"),
x_account_id: Optional[str] = Header(None, min_length=11, max_length=22, description="Legacy X-Account-ID header"),
x_no_account_id: Optional[str] = Header(None, min_length=3, max_length=100, description="Bypass account context header"),
) -> AuthContext:
"""
Standardized V3 Authentication Dependency.
Supports JWT in Authorization header (Bearer) OR 'jwt' query parameter.
Falls back to legacy headers for backward compatibility.
"""
# Defer import to break circular dependency
from app.db_sql import redis_lookup_id_random
# 1. Check for JWT (Header preferred, then Query for downloads)
token = None
method = 'none'
if authorization and authorization.startswith("Bearer "):
token = authorization.split(" ")[1]
method = 'jwt_header'
elif jwt_query:
token = jwt_query
method = 'jwt_query'
if token:
payload = decode_jwt(settings.JWT_KEY, token)
if payload:
logger.info(f"JWT Validated ({method}). User: {payload.get('user_id')}, Account: {payload.get('account_id')}")
return AuthContext(
account_id=payload.get('account_id'),
account_id_random=payload.get('public_key'), # existing sign_jwt uses public_key for id_random
user_id=payload.get('user_id'),
person_id=payload.get('person_id'),
auth_method=method
)
else:
logger.warning(f"Invalid or expired JWT provided via {method}")
raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Invalid or expired authentication token.")
# 2. Legacy / Testing Fallback: x_account_id
if x_account_id: if x_account_id:
# Primary check: x_account_id header
resolved_account_id_random = x_account_id
if looked_up_id := redis_lookup_id_random(table_name='account', record_id_random=x_account_id): if looked_up_id := redis_lookup_id_random(table_name='account', record_id_random=x_account_id):
resolved_account_id = looked_up_id logger.info(f"Authenticated via legacy header: {looked_up_id}")
logger.info(f'Found account from x_account_id header: {resolved_account_id}') return AuthContext(
account_id=looked_up_id,
account_id_random=x_account_id,
auth_method='legacy_header'
)
else: else:
logger.warning(f'Invalid x_account_id header provided: {x_account_id}') raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Invalid X-Account-ID header.")
raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail='Invalid X-Account-ID header.')
elif x_no_account_id_token:
# Secondary check: x_no_account_id_token query parameter
resolved_account_id_random = x_no_account_id_token
if looked_up_id := redis_lookup_id_random(table_name='account', record_id_random=x_no_account_id_token):
resolved_account_id = looked_up_id
logger.info(f'Found account from x_no_account_id_token query: {resolved_account_id}')
else:
logger.warning(f'Invalid x_no_account_id_token query provided: {x_no_account_id_token}')
raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail='Invalid X-No-Account-ID-Token query parameter.')
elif x_no_account_id:
# Tertiary check: x_no_account_id header for bypass
# For now, just presence indicates bypass. Can add a specific value check later if needed.
logger.info(f'X-No-Account-ID header found: {x_no_account_id}. Proceeding without specific account context.')
resolved_account_id = None # Explicitly None for "no specific account"
resolved_account_id_random = '--- NO ACCOUNT ---'
else:
logger.warning('No valid account context provided via X-Account-ID, X-No-Account-ID-Token, or X-No-Account-ID.')
raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail='Account context required. Please provide X-Account-ID, X-No-Account-ID-Token, or X-No-Account-ID.')
return AccountContext(account_id=resolved_account_id, account_id_random=resolved_account_id_random) # 3. Bypass Fallback
if x_no_account_id:
logger.info("Authentication bypassed via X-No-Account-ID")
return AuthContext(
account_id_random='--- NO ACCOUNT ---',
auth_method='bypass'
)
# 4. No Auth Found
logger.warning("No authentication provided for V3 endpoint.")
raise HTTPException(
status_code=status.HTTP_403_FORBIDDEN,
detail="Authentication required. Provide Authorization header or 'jwt' query parameter."
)
# --- Legacy wrapper to avoid breaking current V3 code ---
def get_account_context(
auth: AuthContext = Depends(get_v3_auth_context)
) -> AuthContext:
"""
Alias for the new auth dependency to maintain compatibility
with existing V3 routes.
"""
return auth
# --- Pydantic Model for Pagination --- # --- Pydantic Model for Pagination ---

View File

@@ -3,8 +3,6 @@ import datetime, pytz
from typing import Any, Dict, List, Optional, Set, Union from typing import Any, Dict, List, Optional, Set, Union
from pydantic import BaseModel, EmailStr, Field, Json, PrivateAttr, ValidationError, validator from pydantic import BaseModel, EmailStr, Field, Json, PrivateAttr, ValidationError, validator
from app.db_sql import redis_lookup_id_random
import logging import logging
log = logging.getLogger(__name__) log = logging.getLogger(__name__)

View File

@@ -75,7 +75,35 @@ Use the `q` property in your search body for a general keyword search across ind
--- ---
## 3. Best Practices for V3 ## 4. Authentication in V3
V3 supports multiple authentication methods. The backend resolves these automatically.
### A. Standard Requests (Header)
For most API calls, use the standard Bearer token in the `Authorization` header.
```ts
// Example: Setting the JWT in headers
headers: {
"Authorization": `Bearer ${user_jwt_token}`
}
```
### B. Secure File Downloads (URL Parameter)
**Crucial for `hosted_file` and `event_file`**: To allow browsers to download files without complex header modifications, you can pass the JWT directly in the URL.
```ts
// Example: Creating a secure download link
// GET /v3/crud/hosted_file/{id}/?jwt={token}
const downloadUrl = `${BASE_URL}/hosted_file/${fileId}/?jwt=${jwtToken}`;
```
### C. Legacy Fallback (X-Account-ID)
For development and backward compatibility, the `X-Account-ID` header is still supported but should be phased out in favor of JWT.
---
## 5. Best Practices for V3
1. **Use `view` for Rich Data**: Instead of manually joining data in separate calls, use `?view=enriched` or `?view=detail` if defined in the backend. 1. **Use `view` for Rich Data**: Instead of manually joining data in separate calls, use `?view=enriched` or `?view=detail` if defined in the backend.
2. **Hybrid Search**: Use query parameters for simple toggles (enabled/hidden) and the POST body for complex logic. 2. **Hybrid Search**: Use query parameters for simple toggles (enabled/hidden) and the POST body for complex logic.