Cortex-Inara/documentation/ARCH__AE_INTEGRATION.md
Scott Idem 98546abe21 docs: update ARCH__AE_INTEGRATION with verified API behavior
- query_string required for and/or filters to apply; use "%" as wildcard
- Total count is in meta.data_list_count, not top-level
- id_random is None in responses; Vision ID convention uses {obj_type}_id
- tags comes back as string on read, not list — normalize before joining
- Replace stale "Planned: Search Improvements" with current signature + notes
- Clarify date_to boundary (lte midnight, use next day to include full day)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 21:17:19 -04:00

# Aether Platform Integration — Cortex Tool Layer
> Last updated: 2026-04-30
> Status: Journal toolset complete — broader AE integration planned
This doc covers how Cortex/Inara integrates with the Aether Platform API, what's
implemented, what the data model looks like, and what's planned next.
---
## Overview
Cortex connects to the Aether Platform V3 API to give the orchestrator read/write
access to the user's knowledge base (Journals) and task data. Auth uses the same
`x-aether-api-key` + `x-account-id` headers as every other Aether client.
Config lives in `.env`:
```
AE_API_URL=https://dev-api.oneskyit.com
AE_API_KEY=...
AE_ACCOUNT_ID=...
AE_API_TIMEOUT=15
```
Tool implementation: `cortex/tools/ae_knowledge.py`
Tool registrations: `cortex/tools/__init__.py`
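
As a minimal sketch of how these settings could be turned into request headers: the helper name `ae_config` and the returned dict shape are hypothetical and are not taken from `ae_knowledge.py`; only the environment variable names and header names come from this doc.

```python
import os


def ae_config(env=os.environ):
    """Read the AE_* settings and build the auth headers (hypothetical helper)."""
    return {
        # Strip a trailing slash so paths can be appended with a leading "/".
        "base_url": env["AE_API_URL"].rstrip("/"),
        # Fall back to the 15 second value shown in the sample .env.
        "timeout": float(env.get("AE_API_TIMEOUT", "15")),
        "headers": {
            "x-aether-api-key": env["AE_API_KEY"],
            "x-account-id": env["AE_ACCOUNT_ID"],
        },
    }
```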
---
## V3 Search Engine
### Endpoint
```
POST /v3/crud/{obj_type}/search
```
For nested objects (journal_entry scoped to a journal):
```
POST /v3/crud/journal_entry/search
?for_obj_type=journal&for_obj_id={journal_id}
```
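
The two endpoint forms above can be assembled with a small helper. This is a sketch only; `search_url` is a hypothetical name, and it assumes the scoping parameters are plain query-string parameters as shown.

```python
from urllib.parse import urlencode


def search_url(base_url, obj_type, for_obj_type="", for_obj_id=""):
    """Build the V3 search URL, optionally scoped to a parent object."""
    url = f"{base_url.rstrip('/')}/v3/crud/{obj_type}/search"
    if for_obj_type and for_obj_id:
        # Nested search, e.g. journal_entry scoped to one journal.
        url += "?" + urlencode({"for_obj_type": for_obj_type, "for_obj_id": for_obj_id})
    return url
```

For example, `search_url(base, "journal_entry", "journal", journal_id)` yields the scoped form, while `search_url(base, "journal")` yields the plain one.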
### Search body
```json
{
  "query_string": "fulltext search term",
  "and": [
    { "field": "tags", "op": "icontains", "value": "networking" },
    { "field": "created_on", "op": "gte", "value": "2026-01-01" }
  ],
  "or": [...],
  "page_size": 20,
  "page": 1,
  "order_by": "-updated_on"
}
```
**`query_string` vs `and` filters on `default_qry_str`:**
- `query_string` → triggers `MATCH(default_qry_str) AGAINST(... IN BOOLEAN MODE)` — uses the
FULLTEXT index. Faster and supports boolean operators (`+word`, `-word`, `"phrase"`).
- `and` with `icontains` on `default_qry_str` → plain `LIKE '%term%'`. Slower, no index.
**Important:** `query_string` must be present for `and`/`or` filters to apply. When using
filters without a keyword query, pass `query_string: "%"` as a wildcard to activate the
filter path without restricting by keyword.
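
The wildcard rule is easy to forget, so it may be worth centralizing. The sketch below uses a hypothetical `build_search_body` helper (not the actual code in `ae_knowledge.py`) that falls back to `"%"` only when filters are present and no keyword was given.

```python
def build_search_body(query="", and_filters=None, or_filters=None,
                      page=1, page_size=20, order_by="-updated_on"):
    """Assemble a search body, adding the "%" wildcard when only filters are used."""
    keyword = query.strip()
    if not keyword and (and_filters or or_filters):
        # Without query_string the and/or filters are not applied.
        keyword = "%"
    body = {"page": page, "page_size": page_size, "order_by": order_by}
    if keyword:
        body["query_string"] = keyword
    if and_filters:
        body["and"] = list(and_filters)
    if or_filters:
        body["or"] = list(or_filters)
    return body
```

Calling it with only a tag filter produces a body whose `query_string` is `"%"`; calling it with a keyword passes that keyword through unchanged.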
### Supported operators
| Operator | SQL | Notes |
|---|---|---|
| `eq` | `=` | exact match |
| `ne` | `!=` | not equal |
| `gt` / `gte` | `>` / `>=` | numeric, dates |
| `lt` / `lte` | `<` / `<=` | numeric, dates |
| `contains` / `icontains` | `LIKE '%v%'` | substring; both case-insensitive on MariaDB |
| `startswith` / `istartswith` | `LIKE 'v%'` | |
| `endswith` / `iendswith` | `LIKE '%v'` | |
| `like` | `LIKE` | raw LIKE pattern |
| `in` | `IN (...)` | value is a list |
| `is_null` / `is_not_null` | `IS NULL` / `IS NOT NULL` | no value needed |
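
A small constructor can catch unsupported operators before a request is sent. This is an illustrative sketch: `make_filter` is a hypothetical helper, and omitting the `value` key for `is_null` / `is_not_null` is an assumption based on the "no value needed" note, not verified wire behavior.

```python
SUPPORTED_OPS = {
    "eq", "ne", "gt", "gte", "lt", "lte",
    "contains", "icontains", "startswith", "istartswith",
    "endswith", "iendswith", "like", "in", "is_null", "is_not_null",
}
NO_VALUE_OPS = {"is_null", "is_not_null"}


def make_filter(field, op, value=None):
    """Return one filter dict for the "and" / "or" lists, validating the operator."""
    if op not in SUPPORTED_OPS:
        raise ValueError(f"unsupported operator: {op}")
    if op in NO_VALUE_OPS:
        # Assumption: the value key can simply be left out for null checks.
        return {"field": field, "op": op}
    if op == "in":
        if not isinstance(value, (list, tuple)):
            raise ValueError("'in' expects a list of values")
        value = list(value)
    return {"field": field, "op": op, "value": value}
```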
### Sorting
`order_by` accepts any indexed field name. Prefix with `-` for descending:
- `-updated_on` (default for listing)
- `-created_on`
- `name`
- `-priority`
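
One way to map the tool-level `sort_by` / `sort_order` arguments onto `order_by` is sketched below. The mapping table and the function name are assumptions for illustration; only the field names and the `-` prefix convention come from this doc.

```python
# Hypothetical mapping from tool-level sort keys to indexed field names.
SORT_FIELDS = {
    "updated": "updated_on",
    "created": "created_on",
    "name": "name",
    "priority": "priority",
}


def order_by(sort_by="updated", sort_order="desc"):
    """Translate sort_by / sort_order into the order_by string."""
    field = SORT_FIELDS.get(sort_by)
    if field is None:
        raise ValueError(f"unknown sort_by: {sort_by}")
    if sort_order not in ("asc", "desc"):
        raise ValueError(f"unknown sort_order: {sort_order}")
    # A leading "-" requests descending order.
    return f"-{field}" if sort_order == "desc" else field
```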
### Pagination
`page_size` (default 10, max ~100) + `page` (1-based).
Total count is in `response["meta"]["data_list_count"]` — not a top-level key.
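
A defensive way to read the total and derive a page count might look like the following. `page_info` is a hypothetical helper, and treating a missing `meta` block as zero results is an assumption.

```python
import math


def page_info(response, page_size):
    """Return (total, pages) from a parsed search response."""
    meta = response.get("meta") or {}
    # The count lives under meta, not at the top level of the response.
    total = int(meta.get("data_list_count") or 0)
    pages = math.ceil(total / page_size) if page_size > 0 else 0
    return total, pages
```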
---
## journal_entry Schema
Full table schema from `ae_describe journal_entry --detailed`:
| Field | Type | Indexed | Notes |
|---|---|---|---|
| `id_random` | varchar(22) | UNI | DB public ID field — but API responses return this as `journal_entry_id` (the Vision ID convention: `{obj_type}_id`). `id_random` key is `None` in responses. |
| `journal_id` | int | MUL | FK — use `for_obj_id` param in search |
| `name` | varchar(250) | MUL | Entry title |
| `short_name` | varchar(25) | | |
| `summary` | text | | Short summary (1-2 sentences) |
| `content` | text | | Full markdown content |
| `content_html` | text | | HTML version |
| `content_json` | longtext | | Structured content (editor format) |
| `content_encrypted` | longtext | | Optional encrypted content |
| `tags` | varchar(255) | MUL | Comma-separated string — filter with `icontains` |
| `type` / `type_code` | varchar | | Classification: type |
| `topic` / `topic_code` | varchar | | Classification: topic |
| `activity` / `activity_code` | varchar | | Classification: activity |
| `category_code` | varchar(25) | | Classification: category |
| `code` | varchar(20) | | Short entry code |
| `start_datetime` | datetime | MUL | Optional event start |
| `end_datetime` | datetime | | Optional event end |
| `seconds` / `hours` | int/decimal | | Duration |
| `priority` | tinyint | MUL | 1=low → 5=high |
| `status` | int | MUL | Status code (domain-specific) |
| `private` / `public` / `personal` / `professional` | tinyint | MUL | Visibility flags |
| `billable` | tinyint | | Billing flag |
| `enable` | tinyint NOT NULL | MUL | Soft-delete flag (default 1) |
| `hide` | tinyint | MUL | UI hide flag |
| `archive` | tinyint | MUL | Archived flag |
| `default_qry_str` | text | FULLTEXT | Auto-generated search target (name + content) |
| `data_json` | longtext | | Arbitrary structured data |
| `notes` | text | | Internal notes |
| `created_on` | timestamp NOT NULL | MUL | Auto-set on create |
| `updated_on` | timestamp | MUL | Auto-updated on change |
### journal Schema (top-level)
| Field | Type | Notes |
|---|---|---|
| `id_random` | varchar(22) | DB field — returned in API as `journal_id` |
| `name` | varchar(250) | Journal name |
| `summary` / `description` | text | |
| `type_code` | varchar(25) | Journal type |
| `enable` | tinyint | |
| `created_on` / `updated_on` | timestamp | |
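
Since `id_random` is `None` in responses, client code should read the public ID from the `{obj_type}_id` key instead. A minimal sketch, with `public_id` as a hypothetical helper name:

```python
def public_id(record, obj_type):
    """Return the public ID of an API record using the {obj_type}_id convention."""
    # e.g. obj_type="journal_entry" reads record["journal_entry_id"].
    return record.get(f"{obj_type}_id")
```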
---
## Current Tool Inventory
| Tool | Status | Notes |
|---|---|---|
| `ae_journal_list` | ✅ | Lists journals with id + name |
| `ae_journal_search` | ✅ | Fulltext + tag/date/type/status/priority filters; paginated |
| `ae_journal_entry_read` | ✅ | Full content by entry_id; configurable truncation |
| `ae_journal_entries_list` | ✅ | Browse a journal newest-first; paginated |
| `ae_journal_entry_create` | ✅ | Create with title, content, tags, summary |
| `ae_journal_entry_update` | ✅ | Patch any fields (title, content, tags, summary, enable) |
| `ae_journal_entry_disable` | ✅ | Soft-delete (enable=false) |
| `ae_journal_entry_append` | ✅ | Timestamped append to bottom |
| `ae_journal_entry_prepend` | ✅ | Timestamped prepend to top |
| `ae_task_list` | ✅ | agents_sync Kanban (admin only) |
---
## ae_journal_search — Current Signature
All filters are optional and combine with AND. At least one should be provided.
```python
ae_journal_search(
    query: str = "",           # fulltext via query_string (MATCH/AGAINST)
    journal_id: str = "",      # scope to a specific journal
    tags: str = "",            # icontains on tags field
    type_code: str = "",       # eq on type_code
    topic_code: str = "",      # eq on topic_code
    date_from: str = "",       # created_on gte (YYYY-MM-DD)
    date_to: str = "",         # created_on lte (YYYY-MM-DD, compared at 00:00:00; pass the next day to include the full day)
    sort_by: str = "updated",  # updated | created | name | priority
    sort_order: str = "desc",
    status: int | None = None,
    priority: int | None = None,
    max_results: int = 10,
    page: int = 1,
)
```
**date_to boundary note:** `date_to='2026-01-17'` means `<= 2026-01-17 00:00:00`, which
excludes entries created later that day. Use `date_to='2026-01-18'` to include all of Jan 17.
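
Callers who want an inclusive end date can shift it by one day before passing it in. The helper below is a sketch with a hypothetical name; note that because the comparison is `lte`, an entry stamped exactly at midnight of the following day would also match.

```python
from datetime import date, timedelta


def inclusive_date_to(day: str) -> str:
    """Return the next calendar day so that all of `day` is included."""
    return (date.fromisoformat(day) + timedelta(days=1)).isoformat()
```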
---
## Planned: Broader AE Platform Integration
### Phase 1 — Journal Toolset (current)
Complete read/write/search for Journals and Journal Entries.
### Phase 2 — Tasks & Projects
- `ae_task_create` / `ae_task_update` / `ae_task_complete` on Aether tasks (not just agents_sync Kanban)
- Read project/task hierarchy
### Phase 3 — Knowledge Import Pipeline
- Script to walk markdown dirs, chunk by H2, create Journal entries
- Dedup via search-before-create pattern
- Tag and classify entries automatically via orchestrator
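
The chunk-by-H2 step could look roughly like this. It is only a sketch of one possible approach, not the planned script: `chunk_by_h2` is a hypothetical name, text before the first H2 is dropped, and `##` lines inside fenced code blocks are not treated specially.

```python
import re

H2_PATTERN = re.compile(r"^##\s+(.*\S)\s*$")


def chunk_by_h2(markdown):
    """Split markdown into (title, body) pairs, one per H2 section."""
    chunks = []
    title, lines = None, []
    for line in markdown.splitlines():
        match = H2_PATTERN.match(line)
        if match:
            if title is not None:
                chunks.append((title, "\n".join(lines).strip()))
            title, lines = match.group(1), []
        elif title is not None:
            # H3 and deeper headings stay inside the current chunk.
            lines.append(line)
    if title is not None:
        chunks.append((title, "\n".join(lines).strip()))
    return chunks
```

Each pair could then feed a search-before-create check keyed on the title before calling `ae_journal_entry_create`.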
### Phase 4 — People & Contacts
- Read contact records (person, organization)
- Link journal entries to contacts
### Phase 5 — Calendar / Events
- `start_datetime` / `end_datetime` already on journal_entry
- Could expose time-scoped journal queries as a calendar view
---
## Notes on `tags` field
`tags` is stored as a raw comma-separated varchar(255), not a JSON array.
The API accepts a Python list on write (the `tags` PATCH key takes a list and the backend joins it).
On read, it comes back as a **string** (e.g. `"shelterluv, api"`), not a list — normalize before
displaying: `[t.strip() for t in tags_str.split(",") if t.strip()]`.
For filtering: use `icontains` on `tags` inside the `"and"` list, e.g.:
`{"field": "tags", "op": "icontains", "value": "networking"}`.
Because `icontains` is a substring match, a tag search for "net" matches both "networking" and "subnet".
This is acceptable for now; true per-tag filtering would require a tags junction table.
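
Until exact matching exists server-side, results can be post-filtered on the client. The sketch below assumes each entry is a dict whose `tags` value is the comma-separated string described above; `parse_tags` and `has_tag` are hypothetical helper names.

```python
def parse_tags(tags_str):
    """Normalize the comma-separated tags string into a clean list."""
    return [t.strip() for t in (tags_str or "").split(",") if t.strip()]


def has_tag(entry, tag):
    """Exact, case-insensitive tag match to weed out substring false positives."""
    wanted = tag.strip().lower()
    return wanted in (t.lower() for t in parse_tags(entry.get("tags")))
```

With this, an entry tagged `"networking, subnet"` no longer matches a search for `net`, but still matches `subnet`.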
## Notes on `default_qry_str`
Auto-populated by the backend from `name` + content fields. Do not write to it directly.
FULLTEXT index supports boolean mode: `+required -excluded "exact phrase"`.
The `query_string` key in the search body triggers this path automatically.
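
A boolean-mode query string can be composed from its parts as sketched below. `boolean_query` is a hypothetical helper; it does no escaping, so terms containing quotes or operator characters would need extra care.

```python
def boolean_query(required=(), excluded=(), phrases=()):
    """Compose a boolean-mode query_string from required, excluded, and phrase terms."""
    parts = [f"+{word}" for word in required]
    parts += [f"-{word}" for word in excluded]
    parts += [f'"{phrase}"' for phrase in phrases]
    return " ".join(parts)
```

For instance, requiring `vlan`, excluding `draft`, and adding the phrase `site survey` gives `+vlan -draft "site survey"`.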