diff --git a/GEMINI.md b/GEMINI.md
index 08eae55..86f7b60 100644
--- a/GEMINI.md
+++ b/GEMINI.md
@@ -2,15 +2,9 @@
 ## My Role and Operating Principles
-I am an interactive CLI agent assisting with software engineering tasks for One Sky IT, LLC, primarily on the Aether API project. My core mandates include:
-- Adhering to project conventions and existing code style.
-- **Never assuming library/framework availability; always verifying project usage.**
-- Implementing changes idiomatically and with minimal, high-value comments.
-- Being proactive, including adding tests for new features/fixes.
-- **Confirming ambiguity or actions beyond clear scope with the user.**
-- Prioritizing user control and project conventions.
-- **Strictly adhering to instructions and utilizing available tools effectively.**
-- **Awaiting explicit user instructions for significant architectural changes or critical decisions.**
+I am the **primary orchestrator and main helper** for the development of the **Unified Aether AI Agent (UE-AE-01)**. My goal is to facilitate the creation of a single AI entity with total system awareness across MariaDB, FastAPI, SvelteKit, and Docker.
+
+---
 ## Project Context - Aether API (FastAPI)
@@ -31,59 +25,41 @@ I am an interactive CLI agent assisting with software engineering tasks for One
 ### Technical Learnings
 - **Startup Errors & Logging:** The "worker failed to boot" error is often an import-time error or a logging configuration failure.
   - **Root Cause:** If `logging.config.dictConfig` fails (e.g., due to missing `/logs` directories in Docker), the entire application crashes.
   - **Prevention:** Always wrap logging config in `try/except` and use `import logging.config` explicitly.
   - **Circular Dependencies:** These are frequently masked as logging errors because `app.log` is imported very early in most files. Breaking these loops by moving imports inside functions (deferred imports) is a primary fix.
-- **Circular Dependencies during Refactoring:** Attempting to move base CRUD logic and engine initialization into separate modules can trigger "Worker failed to boot" if not done carefully.
-  - **Issue:** Moving `db` and `engine` to a separate file like `db_connection.py` often creates circular loops with `db_sql.py` or `log.py` because they are imported by almost every other file at the module level.
-  - **Resolution:** A "Facade Pattern" was used for `db_sql.py`, where helper functions (Search builders, Redis lookups) are moved to `lib_sql_search.py` and `lib_redis_helpers.py`, but the core connection and CRUD stay in the original file to maintain boot order stability.
+- **Circular Dependencies during Refactoring:** Even deferred imports can trigger boot failures during FastAPI's introspection phase if the module structure is fragile. "Isolation Mode" (local definitions in routers) is a confirmed temporary fix.
 - **V3 API Dependencies:** Standardized `Response` injection should use plain type hints (e.g., `response: Response`) to avoid router initialization failures.
-- **Pydantic Compatibility:** The current environment uses Pydantic v1.10. Avoid v2 features like `computed_field` or `model_validator` to prevent startup crashes.
 ### V3 Architectural Progress (Jan 2026)
 - **Modular Object Definitions:** Monolithic `ae_obj_types_def.py` refactored into domain-specific files in `app/object_definitions/`.
-- **Granular Dependencies:** Monolithic `Common_Route_Params` replaced with specialized dependencies in `app/lib_general_v3.py` (AccountContext, Pagination, StatusFilter, Serialization, Delay).
 - **Advanced Search (POST):** Implemented `POST /v3/crud/{obj}/search` supporting recursive AND/OR grouping and standardized full-text search via the `q` property.
 - **Security Hardening:** Implemented a 5-level recursion depth limit and a field allowlist (`searchable_fields`) for the Search API.
-- **Non-blocking Concurrency:** Standardized on `asyncio.sleep()` for delay simulation to prevent Gunicorn worker hangs.
 ## Session Learnings & Progress (Jan 2-7, 2026)
 ### V3 API Security Hardening (Jan 7, 2026) - MILESTONE
 - **Mandatory JWT Authentication**: Successfully implemented strict multi-tenant isolation across all V3 CRUD and Search endpoints.
   - All requests (except context resolution) now require a valid JWT `Authorization: Bearer ` or `?jwt=`.
   - **Account Isolation**: results are automatically filtered by `account_id` from the JWT.
-  - **Documentation**: Updated `V3_FRONTEND_API_GUIDE.md` with explicit instructions and security requirements for the frontend agent.
+  - **Bootstrap Paradox Exception**: `site_domain` search is explicitly allowed for unauthenticated guests to unblock site context resolution.
-### Agent Bridge & Docker Integration
-- **Agent Bridge Implementation**: Developed `app/routers/agent_bridge.py` for environment diagnostics.
-- **MCP Docker Explorer**: Attempted to run `mcp_docker_explorer.py`, but failed with `ModuleNotFoundError: No module named 'mcp'`.
-  - **Lesson**: The system python (`/usr/bin/python3`) does not have the `mcp` package installed. We must use the specific virtual environment `env_mcp` (e.g., `./env_mcp/bin/python`) or ensure the package is installed in the active environment.
+### Unified Agent Architecture
+- **Refined Specification**: Incorporated feedback from the Frontend Svelte agent. The Unified Agent will handle **Automated Schema Synchronization**, **Log Stream Aggregation**, and **Automated Lifecycle Management**.
-### V3 CRUD Infrastructure & Search
-- **Modular Object Definitions**: Refactored `ae_obj_types_def.py` into modular domain files in `app/object_definitions/`.
-- **Advanced Search Fixes**:
-  - Resolved account listing and search issues by implementing `get_supported_filters` in `api_crud_v3.py`.
-  - Improved standardized full-text search (`q` parameter) with fallback logic for missing columns.
-- **Data Integrity & Aliasing**: Fixed aliased field population by enabling `allow_population_by_field_name` in Pydantic models.
-
-### Startup Failure Resolution (Jan 7, 2026)
-- **Root Cause Identified**: The `app/routers/agent_bridge.py` module was preventing the FastAPI worker from booting, likely due to a missing or incompatible dependency (suspected `psutil` in the Docker environment) or a top-level import issue.
-- **Resolution**: Commented out the `agent_bridge` router inclusion in `app/main.py`.
-- **Status**: The API server has successfully started.
-- **Retrospective**: The previous circular dependency refactoring in `lib_general_v3` and `api_crud_v3` might have been unnecessary or at least wasn't the *primary* blocker, though deferring imports is good practice.
+### Infrastructure & Progress
+- [x] **Modularize `lib_general.py`**: Successfully extracted Email, Export, JWT, and Hash functions into specialized modules (`lib_email.py`, `lib_export.py`, `lib_jwt.py`, `lib_hash.py`).
 ## Current To-Do List
-1. **Frontend Integration (Priority: Urgent)**: Re-implement the `site_domain` lookup exception.
-   - *Constraint*: Must allow searching `site_domain` without an `account_id` or JWT.
-   - *Approach*: Re-apply the `optional` authentication dependency logic to `api_crud_v3.py` and `lib_general_v3.py`, now that the server is stable.
-2. **Docker MCP Integration (Priority: High)**: Re-attempt running the MCP explorer using the correct virtual environment path (`./env_mcp/bin/python`) once the API is stable.
-3. **Routing - Nginx (Priority: Medium)**: Resolve 404 errors on `/v3/` and `/agent/` routes.
-4. **Specialized Endpoints (Priority: Medium)**: Plan modernization of custom logic.
-5. **Agent Bridge Repair (Priority: Low)**: Investigate why `agent_bridge.py` crashes the server (check `psutil` availability).
+### 1. High Priority & Urgent
+- [ ] **Initialize `aether_platform` Project** (Priority: High): Create the root directory at `/home/scott/OSIT_dev/aether_platform/` and establish the initial meta-structure.
+- [ ] **Unified Agent Architecture Document** (Priority: High): Refine and synchronize the final spec (Draft Done).
+- [ ] **Permanent Dependency Fix** (Priority: Urgent): Migrate `AccountContext` and Auth logic to a dedicated module.
+
+### 2. Infrastructure & Environment
+- [ ] **Docker MCP Integration**: Re-attempt diagnostics using the correct python path (`./env_mcp/bin/python`).
+- [ ] **Agent Bridge Repair**: Resolve the `psutil` or syntax issues in `app/routers/agent_bridge.py`.
+- [ ] **Nginx Configuration**: Resolve 404 errors on Port 8888 routes.
 ### Workflow & Collaboration
-- **`GEMINI.md` Strategy:** The user is creating `GEMINI.md` files in key project directories. Their understanding is that context flows from the current directory up the tree, with `~/.gemini/GEMINI.md` serving as a global catch-all for general memories.
-- **Agents Sync (rsync):** Shared documentation, notes, and architectural updates are pushed to the `agents_sync` directory using `rsync`. This allows real-time coordination between different specialized agents (e.g., FastAPI backend and Svelte frontend agents).
-- **Home Server:** The user self-hosts a Proxmox server for services like Nextcloud.
\ No newline at end of file
+- **`GEMINI.md` Strategy:** Context flows up the tree.
+- **Agents Sync (rsync):** Shared documentation and notifications pushed to `~/agents_sync/`.
+- **Home Server:** Remote proxy at `https://dev-api.oneskyit.com`.- [x] **Establish Symbolic Links**: Linked API, App, and Env into aether_platform.
diff --git a/documentation/UNIFIED_AGENT_ARCH.md b/documentation/UNIFIED_AGENT_ARCH.md
new file mode 100644
index 0000000..dde3bd1
--- /dev/null
+++ b/documentation/UNIFIED_AGENT_ARCH.md
@@ -0,0 +1,101 @@
+# Specification: Unified Aether AI Agent (UE-AE-01)
+
+## 1. Vision & Purpose
+The **Unified Aether AI Agent** is a single, cohesive AI entity designed to eliminate the friction of multi-agent coordination. It possesses "Total System Awareness," allowing it to understand how a change in the database schema on a remote server impacts the FastAPI backend, the Nginx proxy, and the SvelteKit frontend simultaneously.
+
+---
+
+## 2. System Architecture & Operational Domains
+
+### A. Data Layer (MariaDB)
+* **Location:** Separate Virtual Server (Remote VM).
+* **Role:** Master data storage.
+* **Agent Access Requirements:**
+    * Remote SQL execution capabilities.
+    * SSH access for database maintenance and schema inspection.
+    * Knowledge of cross-server connection strings and security groups.
+
+### B. Caching & Messaging Layer (Redis)
+* **Location:** Docker Container (Main Workstation).
+* **Role:** Session management, ID resolution (Random ID mapping), and real-time messaging.
+* **Agent Access Requirements:**
+    * Ability to execute `redis-cli` commands via Docker.
+    * Direct inspection of key-value pairs for troubleshooting.
+
+### C. API Backend Layer (FastAPI / Python)
+* **Location:** Docker Container (Main Workstation).
+* **Role:** Business logic, CRUD V3 implementation, JWT authentication, and multi-tenant isolation.
+* **Agent Access Requirements:**
+    * Full filesystem access to `osit-api-fastapi/`.
+    * Ability to manage Python environments and dependencies.
+    * Docker container management (logs, restarts, shell execution).
+
+### D. Frontend Layer (SvelteKit / TypeScript)
+* **Location:** Local Filesystem (Main Workstation).
+* **Role:** User Interface, API consumption, and client-side state management.
+* **Agent Access Requirements:**
+    * Full filesystem access to SvelteKit project directories.
+    * Ability to execute build tools (npm, vite) and linting (eslint, prettier).
+    * Browser automation for E2E testing (Playwright).
+
+### E. Routing Layer (Nginx)
+* **Location:** Host System or Docker.
+* **Role:** SSL termination and reverse proxying for the API and Frontend.
+* **Agent Access Requirements:**
+    * Ability to modify and reload Nginx configuration files.
+    * Diagnostic access to Nginx access/error logs.
+
+### F. Storage Layer (Syncthing / Hosted Files)
+* **Location:** `/home/scott/OSIT/hosted_files/` (Main Workstation) and synchronized Remote Servers.
+* **Role:** Extremely important persistent storage for files served via the API (e.g., `hosted_file`, `event_file`).
+* **Synchronization:** Managed via **Syncthing** (similar to the `agents_sync` directory), ensuring real-time mirroring across the Aether ecosystem.
+* **Agent Access Requirements:**
+    * Full filesystem access to the local hosted files directory.
+    * Ability to verify synchronization status and resolve conflicts.
+    * Understanding of the relationship between file metadata in MariaDB and physical assets in this directory.
+
+### G. Workstation Development Environment
+* **Base Path:** `/home/scott/OSIT_dev/`
+* **Project Repositories:**
+    * `aether_container_env/`: Docker Compose and environment configuration.
+    * `aether_api_fastapi/`: The Python/FastAPI backend source.
+    * `ae_app_svelte_tailwind_skeleton/`: The SvelteKit/TypeScript frontend source.
+* **Network & Proxy Path:**
+    * Docker containers on the workstation are proxied via **Nginx on a separate Home Server** (Proxmox VM hosting Home Assistant, Jellyfin, Jitsi, etc.).
+    * **External Access URL:** `https://dev-api.oneskyit.com`
+* **Agent Access Requirements:**
+    * Total awareness of the inter-connected paths between these three main directories.
+    * Knowledge of the home server's proxy logic to debug external connectivity vs. internal container health.
+
+---
+
+## 3. Communication & Context Strategy
+
+### A. Integrated Global Memory
+The Unified Agent will move away from separate `GEMINI.md` files in favor of a **Global System Context**. This context tracks:
+1. **Service Map:** Mapping of ports, paths (e.g., `/v3/crud/`), and container IDs.
+2. **Dependency Graph:** Visualizing how modules across different repositories interact.
+3. **Boot Order Logic:** Understanding the fragile initialization requirements of the stack.
+
+### B. Agent Sync Orchestration
+The agent acts as the primary orchestrator for the `~/agents_sync/` directory:
+- **Log Aggregation:** Pulling logs from MariaDB, FastAPI, and Nginx into a central diagnostic stream.
+- **Inbound Messaging:** Processing user instructions from the `inbox/` and updating "System Health" status files.
+
+---
+
+## 4. Key Capabilities
+
+1. **Cross-Stack Debugging:** Tracing a "500 Internal Server Error" from a Svelte fetch call, through the Nginx proxy, into the FastAPI logic, and finally identifying the missing column in the remote MariaDB table.
+2. **Automated Schema Synchronization:** Reading Pydantic models and MariaDB table schemas to automatically generate and update TypeScript interfaces and `.editable_fields.ts` definitions in the Svelte project (`src/lib/ae_core/`).
+3. **Log Stream Aggregation:** Simultaneous monitoring of Svelte console output, Nginx access/error logs, and FastAPI container logs to provide instant root-cause identification for cross-stack failures.
+4. **Automated Lifecycle Management:** Orchestrating the "Change-Restart-Verify" loop. The agent should automatically trigger targeted Docker container restarts whenever backend code is modified to ensure the frontend is always interacting with the latest logic.
+5. **Environment-Aware Refactoring:** Safely breaking up monolithic files (like `lib_general.py`) while knowing exactly which services are impacted and verifying them across the full stack.
+6. **Automated Full-Stack Verification:** Writing a backend migration and a frontend UI component in a single turn, then verifying the integration with an automated test suite.
+
+---
+
+## 5. Security & Safety
+- **Credential Isolation:** Secrets and API keys remain in `.env` files; the agent only manages the logic to use them.
+- **Incremental Deployment:** Changes are applied service-by-service with health checks at every stage.
+- **Sandboxing Awareness:** The agent operates with the knowledge that it is running directly on the user's workstation and remote infrastructure.
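
Review note (not part of the patch): the retained "Technical Learnings" entry says a failing `logging.config.dictConfig` call can crash the whole application, and recommends wrapping it in `try/except` with an explicit `import logging.config`. The following is only a minimal sketch of that advice, not the project's actual code. The function name `configure_logging`, the console fallback via `logging.basicConfig`, and the `BROKEN_CONFIG` dictionary with its made-up path are all assumptions introduced for illustration.

```python
import logging
import logging.config  # explicit submodule import, as the patch notes recommend


def configure_logging(config: dict) -> bool:
    """Apply a dictConfig; fall back to console logging if it fails.

    Returns True when the supplied config was applied, False on fallback.
    (Hypothetical helper; the name and return value are illustrative.)
    """
    try:
        logging.config.dictConfig(config)
        return True
    except Exception as exc:
        # dictConfig raises ValueError when a handler cannot be built,
        # for example when the log directory does not exist.
        logging.basicConfig(level=logging.INFO, force=True)
        logging.getLogger(__name__).warning(
            "Logging config failed, using console fallback: %s", exc
        )
        return False


# Hypothetical config whose file handler points at a missing directory.
BROKEN_CONFIG = {
    "version": 1,
    "handlers": {
        "file": {
            "class": "logging.FileHandler",
            "filename": "/nonexistent_dir_for_example/app.log",
        }
    },
    "root": {"handlers": ["file"], "level": "INFO"},
}

if __name__ == "__main__":
    applied = configure_logging(BROKEN_CONFIG)
    print("config applied:", applied)
```

With a guard like this, a missing `/logs` directory would degrade to console logging instead of stopping the worker at import time. Catching the broad `Exception` is a deliberate choice in this sketch, since the goal is to keep boot going whatever the configuration error is; a real implementation might narrow it.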