Docs: Update Unified Agent Architecture and Platform Roadmap.
66
GEMINI.md
@@ -2,15 +2,9 @@

 ## My Role and Operating Principles

-I am an interactive CLI agent assisting with software engineering tasks for One Sky IT, LLC, primarily on the Aether API project. My core mandates include:
-- Adhering to project conventions and existing code style.
-- **Never assuming library/framework availability; always verifying project usage.**
-- Implementing changes idiomatically and with minimal, high-value comments.
-- Being proactive, including adding tests for new features/fixes.
-- **Confirming ambiguity or actions beyond clear scope with the user.**
-- Prioritizing user control and project conventions.
-- **Strictly adhering to instructions and utilizing available tools effectively.**
-- **Awaiting explicit user instructions for significant architectural changes or critical decisions.**
+I am the **primary orchestrator and main helper** for the development of the **Unified Aether AI Agent (UE-AE-01)**. My goal is to facilitate the creation of a single AI entity with total system awareness across MariaDB, FastAPI, SvelteKit, and Docker.
+
+---

 ## Project Context - Aether API (FastAPI)

@@ -31,59 +25,41 @@ I am an interactive CLI agent assisting with software engineering tasks for One
 ### Technical Learnings
 - **Startup Errors & Logging:** The "worker failed to boot" error is often an import-time error or a logging configuration failure.
 - **Root Cause:** If `logging.config.dictConfig` fails (e.g., due to missing `/logs` directories in Docker), the entire application crashes.
-- **Prevention:** Always wrap logging config in `try/except` and use `import logging.config` explicitly.
-- **Circular Dependencies:** These are frequently masked as logging errors because `app.log` is imported very early in most files. Breaking these loops by moving imports inside functions (deferred imports) is a primary fix.
-- **Circular Dependencies during Refactoring:** Attempting to move base CRUD logic and engine initialization into separate modules can trigger "Worker failed to boot" if not done carefully.
-- **Issue:** Moving `db` and `engine` to a separate file like `db_connection.py` often creates circular loops with `db_sql.py` or `log.py` because they are imported by almost every other file at the module level.
-- **Resolution:** A "Facade Pattern" was used for `db_sql.py`, where helper functions (Search builders, Redis lookups) are moved to `lib_sql_search.py` and `lib_redis_helpers.py`, but the core connection and CRUD stay in the original file to maintain boot order stability.
+- **Circular Dependencies during Refactoring:** Even deferred imports can trigger boot failures during FastAPI's introspection phase if the module structure is fragile. "Isolation Mode" (local definitions in routers) is a confirmed temporary fix.
 - **V3 API Dependencies:** Standardized `Response` injection should use plain type hints (e.g., `response: Response`) to avoid router initialization failures.
-- **Pydantic Compatibility:** The current environment uses Pydantic v1.10. Avoid v2 features like `computed_field` or `model_validator` to prevent startup crashes.
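
The "Root Cause" and "Prevention" notes above describe guarding `logging.config.dictConfig` so that a bad logging setup cannot stop the worker from booting. A minimal sketch of that guard follows; the function name `configure_logging` and the console fallback are assumptions for illustration, not code from the Aether API.

```python
import logging
import logging.config  # imported explicitly, as the note above recommends


def configure_logging(config: dict) -> bool:
    """Apply a dictConfig; fall back to console logging if it fails.

    Returns True when the requested configuration was applied.
    (Hypothetical helper, not part of the project.)
    """
    try:
        logging.config.dictConfig(config)
        return True
    except (ValueError, TypeError, AttributeError, ImportError, OSError) as exc:
        # dictConfig reports handler failures (e.g. a missing log
        # directory) as ValueError; keep the process alive regardless.
        logging.basicConfig(level=logging.INFO, force=True)
        logging.getLogger(__name__).warning(
            "Logging config failed, using console fallback: %s", exc
        )
        return False
```

With this shape, a missing `/logs` directory degrades to console output instead of an import-time crash, and the caller can still check the return value to decide whether to report the problem.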

 ### V3 Architectural Progress (Jan 2026)

 - **Modular Object Definitions:** Monolithic `ae_obj_types_def.py` refactored into domain-specific files in `app/object_definitions/`.
-- **Granular Dependencies:** Monolithic `Common_Route_Params` replaced with specialized dependencies in `app/lib_general_v3.py` (AccountContext, Pagination, StatusFilter, Serialization, Delay).
 - **Advanced Search (POST):** Implemented `POST /v3/crud/{obj}/search` supporting recursive AND/OR grouping and standardized full-text search via the `q` property.
 - **Security Hardening:** Implemented a 5-level recursion depth limit and a field allowlist (`searchable_fields`) for the Search API.
-- **Non-blocking Concurrency:** Standardized on `asyncio.sleep()` for delay simulation to prevent Gunicorn worker hangs.
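
The recursive AND/OR grouping, the 5-level depth limit, and the `searchable_fields` allowlist mentioned above can be illustrated with a small validator. This is only a sketch: the payload shape (`"and"` / `"or"` keys holding lists, leaves with a `"field"` key) and the name `validate_filter` are assumptions, since the source does not show the actual request schema.

```python
from typing import Any

MAX_DEPTH = 5  # mirrors the 5-level recursion limit described above


def validate_filter(node: dict[str, Any], searchable_fields: set[str], depth: int = 1) -> None:
    """Raise ValueError if the filter tree is too deep or uses a field
    outside the allowlist. (Hypothetical payload shape.)"""
    if depth > MAX_DEPTH:
        raise ValueError(f"filter nesting exceeds {MAX_DEPTH} levels")
    for key in ("and", "or"):
        if key in node:
            children = node[key]
            if not isinstance(children, list):
                raise ValueError(f"'{key}' must hold a list of filters")
            # Each nested group or leaf counts as one more level.
            for child in children:
                validate_filter(child, searchable_fields, depth + 1)
            return
    # Leaf condition: only allowlisted fields may be searched.
    field = node.get("field")
    if field not in searchable_fields:
        raise ValueError(f"field {field!r} is not searchable")
```

Rejecting the request before any SQL is built keeps both limits in one place, so a search builder downstream only ever sees trees that are shallow and restricted to known columns.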

 ## Session Learnings & Progress (Jan 2-7, 2026)

 ### V3 API Security Hardening (Jan 7, 2026) - MILESTONE
 - **Mandatory JWT Authentication**: Successfully implemented strict multi-tenant isolation across all V3 CRUD and Search endpoints.
-- All requests (except context resolution) now require a valid JWT `Authorization: Bearer <token>` or `?jwt=<token>`.
 - **Account Isolation**: results are automatically filtered by `account_id` from the JWT.
-- **Documentation**: Updated `V3_FRONTEND_API_GUIDE.md` with explicit instructions and security requirements for the frontend agent.
+- **Bootstrap Paradox Exception**: `site_domain` search is explicitly allowed for unauthenticated guests to unblock site context resolution.
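
The authentication rules above (token from `Authorization: Bearer <token>` or `?jwt=<token>`, with `site_domain` search open to guests) can be sketched as two small helpers. The names `extract_token`, `may_search`, and `GUEST_SEARCHABLE` are hypothetical, the headers are modelled as a plain case-sensitive mapping, and no signature verification is shown; the real dependency in the API may look quite different.

```python
from typing import Mapping, Optional

# Object types a guest may search without a JWT (assumed from the
# "Bootstrap Paradox Exception" note above).
GUEST_SEARCHABLE = {"site_domain"}


def extract_token(headers: Mapping[str, str], query: Mapping[str, str]) -> Optional[str]:
    """Return the raw JWT from the Authorization header or ?jwt=, if any."""
    scheme, _, value = headers.get("Authorization", "").partition(" ")
    if scheme.lower() == "bearer" and value.strip():
        return value.strip()
    # Fall back to the query-string form; empty strings count as absent.
    return query.get("jwt") or None


def may_search(obj: str, token: Optional[str]) -> bool:
    """Allow a search when a token is present or the object is guest-searchable."""
    return token is not None or obj in GUEST_SEARCHABLE
```

A caller would still need to verify the token and read `account_id` from it before filtering results; this sketch only covers where the token comes from and when its absence is tolerated.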

-### Agent Bridge & Docker Integration
-- **Agent Bridge Implementation**: Developed `app/routers/agent_bridge.py` for environment diagnostics.
-- **MCP Docker Explorer**: Attempted to run `mcp_docker_explorer.py`, but failed with `ModuleNotFoundError: No module named 'mcp'`.
-- **Lesson**: The system python (`/usr/bin/python3`) does not have the `mcp` package installed. We must use the specific virtual environment `env_mcp` (e.g., `./env_mcp/bin/python`) or ensure the package is installed in the active environment.
+### Unified Agent Architecture
+- **Refined Specification**: Incorporated feedback from the Frontend Svelte agent. The Unified Agent will handle **Automated Schema Synchronization**, **Log Stream Aggregation**, and **Automated Lifecycle Management**.

-### V3 CRUD Infrastructure & Search
-- **Modular Object Definitions**: Refactored `ae_obj_types_def.py` into modular domain files in `app/object_definitions/`.
-- **Advanced Search Fixes**:
-- Resolved account listing and search issues by implementing `get_supported_filters` in `api_crud_v3.py`.
-- Improved standardized full-text search (`q` parameter) with fallback logic for missing columns.
-- **Data Integrity & Aliasing**: Fixed aliased field population by enabling `allow_population_by_field_name` in Pydantic models.
-
-### Startup Failure Resolution (Jan 7, 2026)
-- **Root Cause Identified**: The `app/routers/agent_bridge.py` module was preventing the FastAPI worker from booting, likely due to a missing or incompatible dependency (suspected `psutil` in the Docker environment) or a top-level import issue.
-- **Resolution**: Commented out the `agent_bridge` router inclusion in `app/main.py`.
-- **Status**: The API server has successfully started.
-- **Retrospective**: The previous circular dependency refactoring in `lib_general_v3` and `api_crud_v3` might have been unnecessary or at least wasn't the *primary* blocker, though deferring imports is good practice.
+### Infrastructure & Progress
+- [x] **Modularize `lib_general.py`**: Successfully extracted Email, Export, JWT, and Hash functions into specialized modules (`lib_email.py`, `lib_export.py`, `lib_jwt.py`, `lib_hash.py`).

 ## Current To-Do List

-1. **Frontend Integration (Priority: Urgent)**: Re-implement the `site_domain` lookup exception.
-- *Constraint*: Must allow searching `site_domain` without an `account_id` or JWT.
-- *Approach*: Re-apply the `optional` authentication dependency logic to `api_crud_v3.py` and `lib_general_v3.py`, now that the server is stable.
-2. **Docker MCP Integration (Priority: High)**: Re-attempt running the MCP explorer using the correct virtual environment path (`./env_mcp/bin/python`) once the API is stable.
-3. **Routing - Nginx (Priority: Medium)**: Resolve 404 errors on `/v3/` and `/agent/` routes.
-4. **Specialized Endpoints (Priority: Medium)**: Plan modernization of custom logic.
-5. **Agent Bridge Repair (Priority: Low)**: Investigate why `agent_bridge.py` crashes the server (check `psutil` availability).
+### 1. High Priority & Urgent
+- [ ] **Initialize `aether_platform` Project** (Priority: High): Create the root directory at `/home/scott/OSIT_dev/aether_platform/` and establish the initial meta-structure.
+- [ ] **Unified Agent Architecture Document** (Priority: High): Refine and synchronize the final spec (Draft Done).
+- [ ] **Permanent Dependency Fix** (Priority: Urgent): Migrate `AccountContext` and Auth logic to a dedicated module.
+
+### 2. Infrastructure & Environment
+- [ ] **Docker MCP Integration**: Re-attempt diagnostics using the correct python path (`./env_mcp/bin/python`).
+- [ ] **Agent Bridge Repair**: Resolve the `psutil` or syntax issues in `app/routers/agent_bridge.py`.
+- [ ] **Nginx Configuration**: Resolve 404 errors on Port 8888 routes.

 ### Workflow & Collaboration
-- **`GEMINI.md` Strategy:** The user is creating `GEMINI.md` files in key project directories. Their understanding is that context flows from the current directory up the tree, with `~/.gemini/GEMINI.md` serving as a global catch-all for general memories.
-- **Agents Sync (rsync):** Shared documentation, notes, and architectural updates are pushed to the `agents_sync` directory using `rsync`. This allows real-time coordination between different specialized agents (e.g., FastAPI backend and Svelte frontend agents).
-- **Home Server:** The user self-hosts a Proxmox server for services like Nextcloud.
+- **`GEMINI.md` Strategy:** Context flows up the tree.
+- **Agents Sync (rsync):** Shared documentation and notifications pushed to `~/agents_sync/`.
+- **Home Server:** Remote proxy at `https://dev-api.oneskyit.com`.
+- [x] **Establish Symbolic Links**: Linked API, App, and Env into aether_platform.
101
documentation/UNIFIED_AGENT_ARCH.md
Normal file
@@ -0,0 +1,101 @@
# Specification: Unified Aether AI Agent (UE-AE-01)

## 1. Vision & Purpose
The **Unified Aether AI Agent** is a single, cohesive AI entity designed to eliminate the friction of multi-agent coordination. It possesses "Total System Awareness," allowing it to understand how a change in the database schema on a remote server impacts the FastAPI backend, the Nginx proxy, and the SvelteKit frontend simultaneously.

---

## 2. System Architecture & Operational Domains

### A. Data Layer (MariaDB)
* **Location:** Separate Virtual Server (Remote VM).
* **Role:** Master data storage.
* **Agent Access Requirements:**
    * Remote SQL execution capabilities.
    * SSH access for database maintenance and schema inspection.
    * Knowledge of cross-server connection strings and security groups.

### B. Caching & Messaging Layer (Redis)
* **Location:** Docker Container (Main Workstation).
* **Role:** Session management, ID resolution (Random ID mapping), and real-time messaging.
* **Agent Access Requirements:**
    * Ability to execute `redis-cli` commands via Docker.
    * Direct inspection of key-value pairs for troubleshooting.

### C. API Backend Layer (FastAPI / Python)
* **Location:** Docker Container (Main Workstation).
* **Role:** Business logic, CRUD V3 implementation, JWT authentication, and multi-tenant isolation.
* **Agent Access Requirements:**
    * Full filesystem access to `osit-api-fastapi/`.
    * Ability to manage Python environments and dependencies.
    * Docker container management (logs, restarts, shell execution).

### D. Frontend Layer (SvelteKit / TypeScript)
* **Location:** Local Filesystem (Main Workstation).
* **Role:** User Interface, API consumption, and client-side state management.
* **Agent Access Requirements:**
    * Full filesystem access to SvelteKit project directories.
    * Ability to execute build tools (npm, vite) and linting (eslint, prettier).
    * Browser automation for E2E testing (Playwright).

### E. Routing Layer (Nginx)
* **Location:** Host System or Docker.
* **Role:** SSL termination and reverse proxying for the API and Frontend.
* **Agent Access Requirements:**
    * Ability to modify and reload Nginx configuration files.
    * Diagnostic access to Nginx access/error logs.

### F. Storage Layer (Syncthing / Hosted Files)
* **Location:** `/home/scott/OSIT/hosted_files/` (Main Workstation) and synchronized Remote Servers.
* **Role:** Extremely important persistent storage for files served via the API (e.g., `hosted_file`, `event_file`).
* **Synchronization:** Managed via **Syncthing** (similar to the `agents_sync` directory), ensuring real-time mirroring across the Aether ecosystem.
* **Agent Access Requirements:**
    * Full filesystem access to the local hosted files directory.
    * Ability to verify synchronization status and resolve conflicts.
    * Understanding of the relationship between file metadata in MariaDB and physical assets in this directory.

### G. Workstation Development Environment
* **Base Path:** `/home/scott/OSIT_dev/`
* **Project Repositories:**
    * `aether_container_env/`: Docker Compose and environment configuration.
    * `aether_api_fastapi/`: The Python/FastAPI backend source.
    * `ae_app_svelte_tailwind_skeleton/`: The SvelteKit/TypeScript frontend source.
* **Network & Proxy Path:**
    * Docker containers on the workstation are proxied via **Nginx on a separate Home Server** (Proxmox VM hosting Home Assistant, Jellyfin, Jitsi, etc.).
    * **External Access URL:** `https://dev-api.oneskyit.com`
* **Agent Access Requirements:**
    * Total awareness of the inter-connected paths between these three main directories.
    * Knowledge of the home server's proxy logic to debug external connectivity vs. internal container health.

---

## 3. Communication & Context Strategy

### A. Integrated Global Memory
The Unified Agent will move away from separate `GEMINI.md` files in favor of a **Global System Context**. This context tracks:
1. **Service Map:** Mapping of ports, paths (e.g., `/v3/crud/`), and container IDs.
2. **Dependency Graph:** Visualizing how modules across different repositories interact.
3. **Boot Order Logic:** Understanding the fragile initialization requirements of the stack.

### B. Agent Sync Orchestration
The agent acts as the primary orchestrator for the `~/agents_sync/` directory:
- **Log Aggregation:** Pulling logs from MariaDB, FastAPI, and Nginx into a central diagnostic stream.
- **Inbound Messaging:** Processing user instructions from the `inbox/` and updating "System Health" status files.

---

## 4. Key Capabilities

1. **Cross-Stack Debugging:** Tracing a "500 Internal Server Error" from a Svelte fetch call, through the Nginx proxy, into the FastAPI logic, and finally identifying the missing column in the remote MariaDB table.
2. **Automated Schema Synchronization:** Reading Pydantic models and MariaDB table schemas to automatically generate and update TypeScript interfaces and `.editable_fields.ts` definitions in the Svelte project (`src/lib/ae_core/`).
3. **Log Stream Aggregation:** Simultaneous monitoring of Svelte console output, Nginx access/error logs, and FastAPI container logs to provide instant root-cause identification for cross-stack failures.
4. **Automated Lifecycle Management:** Orchestrating the "Change-Restart-Verify" loop. The agent should automatically trigger targeted Docker container restarts whenever backend code is modified to ensure the frontend is always interacting with the latest logic.
5. **Environment-Aware Refactoring:** Safely breaking up monolithic files (like `lib_general.py`) while knowing exactly which services are impacted and verifying them across the full stack.
6. **Automated Full-Stack Verification:** Writing a backend migration and a frontend UI component in a single turn, then verifying the integration with an automated test suite.
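
As an illustration of capability 2 (Automated Schema Synchronization), the sketch below turns a field-to-type mapping into a TypeScript interface. It is deliberately simplified and rests on assumptions: a plain `dict` stands in for a Pydantic model's fields, and the names `PY_TO_TS` and `to_ts_interface` are hypothetical, not part of the specification.

```python
# Assumed mapping from Python scalar types to TypeScript types.
PY_TO_TS = {str: "string", int: "number", float: "number", bool: "boolean"}


def to_ts_interface(name: str, fields: dict[str, type]) -> str:
    """Render a TypeScript interface from a field-name -> Python-type map.

    Types without a known mapping are emitted as `unknown` rather than
    guessed. (Hypothetical helper for illustration.)
    """
    lines = [f"export interface {name} {{"]
    for field, py_type in fields.items():
        ts_type = PY_TO_TS.get(py_type, "unknown")
        lines.append(f"  {field}: {ts_type};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_ts_interface("Account", {"account_id": int, "name": str}))
```

A real implementation would also need to handle optional fields, nested models, and aliases, and would read the field map from the models and MariaDB schema rather than a hand-written dictionary.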

---

## 5. Security & Safety
- **Credential Isolation:** Secrets and API keys remain in `.env` files; the agent only manages the logic to use them.
- **Incremental Deployment:** Changes are applied service-by-service with health checks at every stage.
- **Sandboxing Awareness:** The agent operates with the knowledge that it is running directly on the user's workstation and remote infrastructure.