OSIT-AE-API-FastAPI/documentation/V3_CRUD_ARCHITECTURE_AND_LEARNINGS.md
Scott Idem 6d13b952c4 Implement V3 API security hardening and multi-tenant data isolation
- Enhanced AuthContext with role-aware fields (administrator, manager, super).
- Implemented deferred database lookups for user roles in get_v3_auth_context.
- Added global account isolation in api_crud_v3.py using check_account_access and apply_forced_account_filter.
- Hardened all V3 CRUD endpoints (GET, POST, PATCH, DELETE) and nested routes with ownership verification.
- Enforced forced account filtering at the SQL level for Listing and Searching.
- Updated documentation with details on the new security and data isolation architecture.
2026-01-07 13:34:38 -05:00

Aether API V3 CRUD: Architecture and Learnings

This document summarizes the development of the V3 CRUD API, the architectural choices made, and the lessons learned during the process.

1. V3 CRUD Architecture

The V3 CRUD API (/v3/crud/) is designed to run in parallel with legacy V1 and V2 endpoints. It introduces a hierarchical, nested URL structure and leverages modern FastAPI features for better maintainability and performance.

Key Features:

  • Nested URL Structure: Enforces parent-child relationships (e.g., /v3/crud/site/{site_id}/site_domain/).
  • Granular Dependencies: Instead of a monolithic common parameters object, V3 uses specialized, reusable dependencies from app/lib_general_v3.py:
    • AccountContext: Resolves account ID with clear precedence (Header > Query Token > Bypass Header).
    • PaginationParams: Standardizes limit and offset.
    • StatusFilterParams: Handles enabled and hidden status filtering.
    • SerializationParams: Controls Pydantic serialization options (by_alias, exclude_unset).
    • DelayParams: Facilitates optional latency simulation for testing.
  • Non-blocking Delay: Uses await asyncio.sleep() to simulate network latency without blocking the Gunicorn worker's event loop.
  • Data-Driven Configuration: Uses the modern format in app/ae_obj_types_def.py to map objects to tables and models.
  • Advanced Search (POST): Supports complex, nested filtering via POST /v3/crud/{obj_type}/search.
    • Recursive AND/OR logic.
    • Full operator support: eq, ne, gt, gte, lt, lte, in, is_null, is_not_null, like, contains, startswith, endswith.
    • Safe parameterization using unique generated names (e.g., :sp_1) to prevent collisions.
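The recursive AND/OR search with uniquely named bind parameters can be sketched roughly as follows. This is a minimal illustration, not the code in api_crud_v3.py: the filter shape (`and`/`or` groups and `field`/`op`/`value` leaves), the `build_where` name, and the `allowed_fields` whitelist are all assumptions made for the example.

```python
from itertools import count

# Operators that map directly onto a binary SQL comparison.
_BINARY_OPS = {"eq": "=", "ne": "<>", "gt": ">", "gte": ">=", "lt": "<", "lte": "<="}
# LIKE-style operators and how the bound value is wrapped in wildcards.
_LIKE_OPS = {"like": "{}", "contains": "%{}%", "startswith": "{}%", "endswith": "%{}"}


def build_where(node, allowed_fields, params=None, counter=None):
    """Recursively turn a filter tree into (sql_fragment, bind_params).

    Hypothetical helper: the real V3 implementation may differ in shape and naming.
    """
    params = {} if params is None else params
    counter = count(1) if counter is None else counter

    # Group node: {"and": [...]} or {"or": [...]}; children share one counter,
    # so every generated parameter name is unique across the whole tree.
    for joiner in ("and", "or"):
        if joiner in node:
            if not node[joiner]:
                raise ValueError(f"empty {joiner!r} group")
            parts = [build_where(child, allowed_fields, params, counter)[0]
                     for child in node[joiner]]
            return "(" + f" {joiner.upper()} ".join(parts) + ")", params

    field, op = node["field"], node["op"]
    # Column names cannot be bound as parameters, so they are whitelisted instead.
    if field not in allowed_fields:
        raise ValueError(f"unknown field: {field}")

    if op == "is_null":
        return f"{field} IS NULL", params
    if op == "is_not_null":
        return f"{field} IS NOT NULL", params
    if op == "in":
        if not node["value"]:
            raise ValueError("'in' requires at least one value")
        names = []
        for item in node["value"]:
            name = f"sp_{next(counter)}"
            params[name] = item
            names.append(f":{name}")
        return f"{field} IN ({', '.join(names)})", params
    if op in _BINARY_OPS:
        name = f"sp_{next(counter)}"
        params[name] = node["value"]
        return f"{field} {_BINARY_OPS[op]} :{name}", params
    if op in _LIKE_OPS:
        name = f"sp_{next(counter)}"
        params[name] = _LIKE_OPS[op].format(node["value"])
        return f"{field} LIKE :{name}", params
    raise ValueError(f"unsupported operator: {op}")
```

Because only the generated `:sp_N` placeholders reach the SQL text, user-supplied values never need to be interpolated into the query string.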

2. Backward Compatibility Strategy

To ensure that the introduction of V3 doesn't break legacy V1 and V2 endpoints:

  • Parallel Routes: All V3 logic is isolated in app/routers/api_crud_v3.py.
  • Hybrid Configuration: The obj_type_kv_li dictionary in app/ae_obj_types_def.py has been updated to include both modern keys (e.g., tbl, mdl) and legacy keys (e.g., table_name, base_name). This allows V2 endpoints to continue functioning normally while V3 endpoints use the refined structure.
  • Stable Core Imports: Core modules like app/lib_general.py and app/models/response_models.py have been refactored to maintain their existing exports (log, logging, mk_resp) while internally adopting more robust module-level logging.
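A hybrid entry might look roughly like the sketch below. The key names (tbl, mdl, table_name, base_name) come from this document; the values are placeholders, and the `missing_keys` helper is a hypothetical consistency check rather than part of the codebase.

```python
MODERN_KEYS = ("tbl", "mdl")                # read by V3
LEGACY_KEYS = ("table_name", "base_name")   # read by V2

# Placeholder values: the real entries in app/ae_obj_types_def.py are not shown here.
obj_type_kv_li = {
    "site_domain": {
        "tbl": "site_domain",
        "mdl": "SiteDomain",
        "table_name": "site_domain",
        "base_name": "SiteDomain",
    },
}


def missing_keys(registry):
    """Return {obj_type: [absent keys]} for entries lacking modern or legacy keys."""
    problems = {}
    for obj_type, entry in registry.items():
        absent = [key for key in MODERN_KEYS + LEGACY_KEYS if key not in entry]
        if absent:
            problems[obj_type] = absent
    return problems
```

Running a check like this in a unit test would flag an object type that was migrated to the modern keys but lost the legacy ones that V2 still reads.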

3. Learnings and Best Practices

Logging and Startup Stability

  • Isolate Loggers: Modules should instantiate their own module-level loggers (logging.getLogger(__name__)) rather than importing a global instance. This breaks circular dependencies and improves traceability.
  • Robust Configuration: Logging configuration (dictConfig) should be wrapped in try...except to prevent application crashes due to environment issues (e.g., missing log directories in Docker).
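  • Explicit Imports: Always run import logging.config before calling logging.config.dictConfig. A bare import logging does not load the config submodule, so the call can fail with an AttributeError unless another module happened to import it first.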
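The three logging points above can be combined in a short sketch. The `configure_logging` name and the fallback to `basicConfig` are assumptions for illustration; the project's actual startup code may handle failures differently.

```python
import logging
import logging.config  # explicit import: `import logging` alone does not load this submodule

# Module-level logger instead of a shared global instance.
logger = logging.getLogger(__name__)


def configure_logging(config: dict) -> bool:
    """Apply dictConfig; on failure fall back to basicConfig instead of crashing.

    Hypothetical helper. Returns True when the supplied config was applied.
    """
    try:
        logging.config.dictConfig(config)
        return True
    except Exception as exc:  # e.g. a handler pointing at a missing log directory
        logging.basicConfig(level=logging.INFO)
        logger.warning("Logging config failed, falling back to basicConfig: %s", exc)
        return False
```

With this shape, a container that lacks the log directory still starts and logs to the console rather than failing at import time.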

FastAPI and Pydantic

  • Dependency Injection: Use response: Response as a type hint for standard injection. Avoid Depends(Response) as it is not a valid dependency provider and can cause router initialization failures.
  • Python Parameter Order: In function signatures, parameters without a default (like response: Response) must precede parameters with a default value, and = Depends(...) counts as a default.
  • Async Concurrency: Use await asyncio.sleep() instead of time.sleep() in async endpoints. A blocking call stalls every request sharing that worker's event loop, which in a high-concurrency environment leads to worker timeouts and 502 Bad Gateway errors.
  • Pydantic Compatibility: Ensure all new models and utility functions remain compatible with Pydantic v1.10 until a full project-wide migration to v2 is planned. Avoid V2-only features like computed_field or model_validator.
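The async concurrency point can be demonstrated with the standard library alone. The function names below are illustrative and not taken from the codebase; the sketch runs several simulated requests on one event loop and times them.

```python
import asyncio
import time


async def simulated_delay(seconds: float) -> None:
    """Yields to the event loop while waiting, so other requests keep running."""
    await asyncio.sleep(seconds)


async def blocking_delay(seconds: float) -> None:
    """Anti-pattern: time.sleep holds the event loop for the whole duration."""
    time.sleep(seconds)


async def timed(delay_func, n: int, seconds: float) -> float:
    """Run n concurrent 'requests' and return the total wall-clock time."""
    start = time.perf_counter()
    await asyncio.gather(*(delay_func(seconds) for _ in range(n)))
    return time.perf_counter() - start
```

Calling `asyncio.run(timed(simulated_delay, 5, 0.05))` should take roughly one delay period, while the `blocking_delay` variant takes about five, because each blocking call runs to completion before the next coroutine can start.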

4. Current Migration Status

The following objects have been migrated to the modern V3 configuration and are supported by the V3 API:

  • journal, journal_entry
  • site, site_domain
  • account, account_cfg
  • address, contact
  • order, order_line
  • organization, page
  • data_store, activity_log
  • archive, archive_content
  • hosted_file, post, post_comment
  • person, user
  • lu_country, lu_country_subdivision, lu_time_zone

5. Security and Data Isolation

The V3 security model, implemented in January 2026, enforces strict data isolation and role-aware authorization.

Role-Aware Authorization (AuthContext)

  • Deferred User Lookup: When a JWT is provided, the API performs a deferred database lookup via load_user_obj to populate administrator, manager, and super flags.
  • Role Scoping: These flags are used to bypass account isolation for support and system maintenance tasks.
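A rough sketch of a role-aware context follows. The flag names (administrator, manager, super) come from this document; the claim keys, the `build_auth_context` function, and the loader signature are assumptions, and "deferred" is read here as "the database is only queried when a JWT was actually presented".

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AuthContext:
    account_id: Optional[str]
    user_id: Optional[str] = None
    administrator: bool = False
    manager: bool = False
    super: bool = False  # field name follows the document; it shadows the builtin only inside __init__


def build_auth_context(claims: Optional[dict],
                       load_user: Callable[[str], dict]) -> AuthContext:
    """Hypothetical stand-in for get_v3_auth_context.

    `claims` is the decoded JWT payload (or None); `load_user` plays the role of
    load_user_obj and is assumed to return a mapping containing the role flags.
    """
    if not claims:
        # No token: skip the database lookup entirely.
        return AuthContext(account_id=None)
    user = load_user(claims["user_id"])
    return AuthContext(
        account_id=claims.get("account_id"),
        user_id=claims["user_id"],
        administrator=bool(user.get("administrator")),
        manager=bool(user.get("manager")),
        super=bool(user.get("super")),
    )
```

Passing the loader in as a callable keeps the sketch testable without a database.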

Multi-Tenant Isolation

  • Forced Account Filtering: For all non-super users, the API automatically injects an account_id filter into every list and search query. This is enforced at the SQL level via apply_forced_account_filter.
  • Post-Retrieval Verification: Single object retrievals (GET), updates (PATCH), and deletions (DELETE) include a secondary ownership check (check_account_access). If a user attempts to access an ID belonging to another account, they receive a 403 Forbidden response.
  • Hierarchical Verification: Child/Nested endpoints verify the ownership of the parent object before allowing operations on children, preventing cross-tenant "sideways" traversal.
  • Creation Guard: When creating records (POST), the user's account_id is automatically forced onto the new record, preventing a user from creating data for another account.
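The isolation rules above can be sketched as three small helpers. The names apply_forced_account_filter and check_account_access appear in this document, but the signatures below, the `Forbidden` exception, the `force_account_on_create` helper, and the choice to let super users pass through unchanged on creation are all assumptions.

```python
class Forbidden(Exception):
    """Stand-in for an HTTP 403 response (hypothetical; the real error type is not shown)."""
    status_code = 403


def apply_forced_account_filter(where_clauses, params, account_id, is_super):
    """Return copies of the WHERE clauses and bind params with the tenant filter appended."""
    if is_super:
        return list(where_clauses), dict(params)
    return (
        list(where_clauses) + ["account_id = :forced_account_id"],
        {**params, "forced_account_id": account_id},
    )


def check_account_access(record, account_id, is_super):
    """Secondary ownership check for single-object GET, PATCH, and DELETE."""
    if not is_super and record.get("account_id") != account_id:
        raise Forbidden("record belongs to another account")


def force_account_on_create(payload, account_id, is_super):
    """Overwrite any client-supplied account_id on POST for non-super users."""
    if is_super:
        return dict(payload)
    return {**payload, "account_id": account_id}
```

For a nested route, the same `check_account_access` call would run against the parent record first, so a child is never reachable through a parent the caller does not own.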

Bypass and Utility Access

  • X-No-Account-ID: A specialized header used by development utilities and administrative scripts to bypass standard account isolation. This header grants super permissions and should only be used in trusted internal environments.
  • JWT Query Parameter: Supported for direct file downloads and sharing links where custom headers cannot be provided.
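Combined with the AccountContext precedence described earlier (Header > Query Token > Bypass Header), account resolution might look like the sketch below. Only X-No-Account-ID is named in this document; the `x-account-id` header, the `jwt` query parameter name, the `account_id` claim, and the `resolve_account` function are hypothetical.

```python
def resolve_account(headers, query, decode_token):
    """Return (account_id, is_super) following Header > Query Token > Bypass Header.

    `headers` is assumed to be a dict with lower-cased keys, and `decode_token`
    a callable that verifies a JWT and returns its claims.
    """
    # 1. An explicit account header wins.
    account_id = headers.get("x-account-id")
    if account_id:
        return account_id, False

    # 2. A JWT in the query string, for downloads and sharing links
    #    where custom headers cannot be sent.
    token = query.get("jwt")
    if token:
        return decode_token(token)["account_id"], False

    # 3. The bypass header is consulted last and grants super permissions,
    #    so it should only be honored in trusted internal environments.
    if "x-no-account-id" in headers:
        return None, True

    raise PermissionError("no account context supplied")
```

Checking the bypass header last means a request that also carries a normal account header stays scoped to that account rather than being silently elevated.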