agent-docs
Documentation engine and Astro site for AGENTS
agent-docs is the documentation engine and KADI agent that powers the AGENTS project's documentation site and developer tooling. It (1) crawls configured repositories, (2) collects and normalizes markdown into a docs/ site built with Astro, (3) provides indexing and semantic/graph search via graph-index and ArcadeDB, and (4) exposes operational tools over the KADI broker for status, sync, pipeline orchestration, README lint/generation, and search. It runs as a KADI agent (entrypoint dist/agent.js) and integrates with secret vaults and remote brokers.
Architecture
- Core runtime: a KADI client (agent) registers a set of tools (registerAllTools) that can be invoked locally or remotely via the broker (remote broker defined in agent.json and config.toml).
- Sync stage: registerSyncTool crawls configured repos and copies/aggregates markdown into docs/ for Astro site building.
- Pipeline stage: registerPipelineTool orchestrates an asynchronous background task (startTask) that:
  - collects pages from repos,
  - chunks and embeds content,
  - calls ability-graph's graph-index to create embeddings/graph edges,
  - writes index entries to ArcadeDB (default collection agents-docs).
- Search stage: registerSearchTools exposes agents-docs-search, agents-docs-page and agents-docs-reindex. These delegate to ability-docs-memory if present (docsMemoryAbility.invoke), or fall back to client.invokeRemote to a memory service.
- README management: registerReadmeLintTool and registerReadmeGenerateTool validate and synthesize README.md files from templates and agent.json metadata.
- Observability / tasks: registerStatusTool and registerTaskStatusTool report state, repo health, and task progress.
- Secrets and environment: uses vault-delivered secrets (model-manager, arcadedb) and environment variables (e.g., ARCADE_HOST/PORT) for production deployment.
- How it fits: agent-docs provides discoverability and indexing services consumed by UIs, other agents, and human operators via KADI broker RPC tools.
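The pipeline stage described above indexes many chunks, and src/tools/pipeline.ts defines an INDEX_CONCURRENCY constant of 5. How that constant is applied is not shown in this document, so the following is only a minimal sketch of one common way to cap in-flight work. The names mapWithConcurrency and indexChunk are hypothetical, and indexChunk is a stand-in for whatever call hands a chunk to graph-index.

```typescript
const INDEX_CONCURRENCY = 5; // same value as the constant in src/tools/pipeline.ts

// Runs `worker` over `items` with at most `limit` calls in flight.
// Results are returned in input order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array<R>(items.length);
  let next = 0;

  // Each lane repeatedly claims the next unprocessed index until none remain.
  const runLane = async (): Promise<void> => {
    while (next < items.length) {
      const i = next++; // claimed synchronously, before any await
      results[i] = await worker(items[i], i);
    }
  };

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => runLane()));
  return results;
}

// HYPOTHETICAL stand-in for sending one chunk to graph-index.
async function indexChunk(text: string): Promise<number> {
  return text.length;
}

// Example call: index two chunks with the pipeline's concurrency limit.
const indexed = mapWithConcurrency(['first chunk', 'second chunk'], INDEX_CONCURRENCY, (t) => indexChunk(t));
```

Raising the limit shortens wall-clock time only as long as the embedding backend and ArcadeDB can absorb the extra parallel requests.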
Data flow (high-level):
- Operator triggers agents-docs-sync or agents-docs-pipeline.
- Sync collects markdown → docs/ (site build) and yields page list.
- Pipeline chunks pages → asks ability-graph to index embeddings/graph → writes edges/records into ArcadeDB.
- agents-docs-search queries memory/index (ability-docs-memory or remote docs-search) and returns ranked results.
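To make the "pipeline chunks pages" step concrete, here is a minimal sketch of paragraph-aware chunking under a token budget. It is not the agent's actual chunker, which this document does not show. It assumes tokens can be approximated by whitespace-separated words, and chunkPage and PageChunk are hypothetical names. Only the budget of 500 comes from the source (MAX_TOKENS_PER_CHUNK).

```typescript
const MAX_TOKENS_PER_CHUNK = 500; // same value as the constant in src/tools/pipeline.ts

interface PageChunk {
  slug: string;  // slug of the page the chunk came from
  index: number; // position of the chunk within the page
  text: string;
}

// ASSUMPTION: one whitespace-separated word counts as one token.
function chunkPage(
  slug: string,
  content: string,
  maxTokens: number = MAX_TOKENS_PER_CHUNK,
): PageChunk[] {
  const chunks: PageChunk[] = [];
  let current: string[] = [];
  let count = 0;

  // Emit the paragraphs accumulated so far as one chunk.
  const flush = (): void => {
    if (current.length === 0) return;
    chunks.push({ slug, index: chunks.length, text: current.join('\n\n') });
    current = [];
    count = 0;
  };

  // Split on blank lines so paragraphs stay intact where possible.
  for (const paragraph of content.split(/\n\s*\n/)) {
    const trimmed = paragraph.trim();
    if (!trimmed) continue;
    const words = trimmed.split(/\s+/);

    if (words.length > maxTokens) {
      // Oversized paragraph: emit it in fixed-size word windows.
      flush();
      for (let i = 0; i < words.length; i += maxTokens) {
        chunks.push({ slug, index: chunks.length, text: words.slice(i, i + maxTokens).join(' ') });
      }
      continue;
    }

    if (count + words.length > maxTokens) flush();
    current.push(trimmed);
    count += words.length;
  }
  flush();
  return chunks;
}
```

Keeping the slug on every chunk lets a search hit be traced back to its page, which is what agents-docs-page then fetches in full.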
Tools / API
Below is a catalog of the tools this agent registers (see src/tools/index.ts and individual tool registrations).
| Tool name | Description | Key input params |
|---|---|---|
| agents-docs-status | Show documentation system status: configured repos, sync state, build health. | none |
| agents-docs-config | Read and describe the current agent-docs configuration. | none |
| agents-docs-sync | Crawl configured repos and collect documentation files into docs/. | repos?: string[], dryRun?: boolean |
| agents-docs-pipeline | Full pipeline: sync → chunk+embed → graph-index → create edges. Runs as background task (returns taskId). | repos?: string[], skipIndex?: boolean, collection?: string, database?: string |
| agents-docs-search | Hybrid search (semantic+keyword+graph+structural). Delegates to docsMemoryAbility or remote. | query: string, collection?: string, limit?: number |
| agents-docs-page | Fetch a single documentation page by slug. Delegates to docsMemoryAbility or remote. | slug: string, collection?: string |
| agents-docs-reindex | Trigger a full reindex of documentation into ArcadeDB (use pipeline for full flow). | collection?: string |
| agents-docs-readme-lint | Validate README.md files in each repo against templates. | repos?: string[] |
| agents-docs-readme-generate | Generate or update README.md files using templates + LLM enhancement. | (tool-specific inputs; templates used from /templates) |
| agents-docs-task-status | Query background task status (taskId) | taskId: string |
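
Because agents-docs-pipeline returns a taskId instead of a final result, a caller typically starts it and then polls agents-docs-task-status. The sketch below illustrates that loop from a consuming agent's side. It relies on several assumptions not confirmed by this document: that the caller reaches the tools through client.invokeRemote(toolName, input), that the pipeline result exposes the id as taskId, and that the status payload carries a status field with the values running, completed, or failed. RemoteClient, TaskState, and runPipelineAndWait are hypothetical names.

```typescript
// Minimal client surface assumed for this sketch.
interface RemoteClient {
  invokeRemote(tool: string, input: Record<string, unknown>): Promise<any>;
}

// ASSUMPTION: shape of the task-status payload; the real payload may differ.
type TaskState = { status: 'running' | 'completed' | 'failed'; [key: string]: unknown };

async function runPipelineAndWait(
  client: RemoteClient,
  repos: string[] | undefined,
  pollMs = 2000,
  maxPolls = 150,
): Promise<TaskState> {
  // Start the background task; omit `repos` to process all configured repos.
  const started = await client.invokeRemote('agents-docs-pipeline', repos ? { repos } : {});
  if (typeof started?.taskId !== 'string') {
    throw new Error('Pipeline did not return a taskId');
  }
  const taskId: string = started.taskId;

  // Poll until the task leaves the running state or the poll budget is spent.
  for (let attempt = 0; attempt < maxPolls; attempt++) {
    const state: TaskState = await client.invokeRemote('agents-docs-task-status', { taskId });
    if (state.status !== 'running') return state;
    await new Promise<void>((resolve) => setTimeout(resolve, pollMs));
  }
  throw new Error(`Task ${taskId} still running after ${maxPolls} polls`);
}
```

The poll budget keeps a caller from waiting forever if the agent restarts and loses the task.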
Notes on delegation:
- agents-docs-search and agents-docs-page call docsMemoryAbility.invoke('docs-search' / 'docs-page', …) when a docs memory ability is provided; otherwise they attempt client.invokeRemote to a remote memory service.
To inspect how tools are registered, see src/tools/index.ts, which calls each registerXTool function.
Configuration
Configuration values come from config.toml at repo root and from environment/secrets delivered per deployment.
Key config.toml fields (exact keys):
- [agent]
  - ID = "agent-docs"
  - VERSION = "0.1.3"
- [logging]
  - LEVEL = "info"
- [broker.remote]
  - URL = "wss://broker.dadavidtseng.com/kadi"
  - NETWORKS = ["global"]
- [secrets]
  - VAULTS = ["model-manager", "arcadedb"]
  - KEYS = ["MODEL_MANAGER_API_KEY", "MODEL_MANAGER_BASE_URL", "ARCADE_USERNAME", "ARCADE_PASSWORD"]
- [arcadedb]
  - HOST = "arcadedb.dadavidtseng.com"
  - PORT = 443
  - USERNAME = "root"
  - DATABASE = "agents_logs"
agent.json highlights (deployment & runtime):
- "abilities": { "secret-ability": "", "ability-log": "" } (the agent declares that it uses the secret and logging abilities).
- "brokers": { "remote": "wss://broker.dadavidtseng.com/kadi" } (the remote broker to connect to).
- "networks": ["global"] (network scopes).
- Deploy block (akash-mainnet) expects secret vaults:
  - model-manager: requires MODEL_MANAGER_API_KEY, MODEL_MANAGER_BASE_URL
  - arcadedb: requires ARCADE_USERNAME, ARCADE_PASSWORD
  - secrets.delivery: "broker" (secrets are expected to be delivered via the broker, using kadi secret receive …)
Environment variables in deployment/services:
- NODE_ENV=production
- ARCADE_HOST and ARCADE_PORT (example values set in agent.json deploy/service)
Secrets vault usage:
- During build/deploy the agent expects KADI secret delivery for the "model-manager" and "arcadedb" vaults. The runtime command in deployment runs kadi secret receive --vault model-manager --vault arcadedb before starting.
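Since the agent depends on four delivered secrets, a startup guard that fails fast when one is missing can make misconfigured deployments easier to diagnose. The following is a small sketch of such a guard, not code from agent-docs. The key names come from [secrets].KEYS in config.toml, while missingSecrets is a hypothetical helper and reading the values from an environment-style map is an assumption.

```typescript
// Key names taken from [secrets].KEYS in config.toml.
const REQUIRED_SECRET_KEYS = [
  'MODEL_MANAGER_API_KEY',
  'MODEL_MANAGER_BASE_URL',
  'ARCADE_USERNAME',
  'ARCADE_PASSWORD',
] as const;

// Returns the names of required keys that are unset or blank in `env`.
// ASSUMPTION: delivered secrets are visible as environment-style key/value pairs.
function missingSecrets(env: Record<string, string | undefined>): string[] {
  return REQUIRED_SECRET_KEYS.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === '';
  });
}

// Example: two keys are present, one is blank, one is absent.
const missing = missingSecrets({
  MODEL_MANAGER_API_KEY: 'example-key',
  MODEL_MANAGER_BASE_URL: ' ',
  ARCADE_USERNAME: 'root',
});
// `missing` is ['MODEL_MANAGER_BASE_URL', 'ARCADE_PASSWORD']
```

In a Node process the same helper could be called with process.env before the client connects, throwing an error that lists the missing names.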
Code Examples
Below are representative code excerpts copied from the source showing registration patterns and delegations.
registerConfigTool (reads and summarizes config):
```typescript
import { z } from '@kadi.build/core';
import type { DocsConfig } from '../config/types.js';

export function registerConfigTool(client: any, config: DocsConfig): void {
  client.registerTool(
    {
      name: 'agents-docs-config',
      description: 'Read and describe the current agent-docs configuration. Shows site settings, repo list, and agent config.',
      input: z.object({}),
    },
    async () => {
      return {
        success: true,
        config: {
          site: config.site,
          repoCount: Object.keys(config.repos).length,
          repos: Object.fromEntries(
            Object.entries(config.repos).map(([name, repo]) => [
              name,
              { type: repo.type, description: repo.description, crawl: repo.crawl },
            ]),
          ),
          output: config.output,
          agent: {
            name: config.agent.name,
            version: config.agent.version,
            brokers: Object.keys(config.agent.brokers),
            networks: config.agent.networks,
          },
        },
      };
    },
  );
}
```
Tool registry (registers all tools on startup):
```typescript
import type { DocsConfig } from '../config/types.js';
import { registerStatusTool } from './status.js';
import { registerConfigTool } from './config-tool.js';
import { registerSyncTool } from './sync.js';
import { registerPipelineTool } from './pipeline.js';
import { registerSearchTools } from './search-tools.js';
import { registerReadmeLintTool } from './readme-lint.js';
import { registerReadmeGenerateTool } from './readme-generate.js';
import { registerTaskStatusTool } from './task-status.js';

export function registerAllTools(
  client: any,
  config: DocsConfig,
  secrets?: any,
  apiKey?: string,
  docsMemoryAbility?: any,
): void {
  registerStatusTool(client, config);
  registerConfigTool(client, config);
  registerSyncTool(client, config);
  registerPipelineTool(client, config);
  registerSearchTools(client, config, apiKey, docsMemoryAbility);
  registerReadmeLintTool(client, config);
  registerReadmeGenerateTool(client, config);
  registerTaskStatusTool(client);

  console.error('[tools] registered all agents-docs tools');
}
```
Pipeline (starts a background task and collects pages):
```typescript
import * as fs from 'fs';
import * as path from 'path';
import { fileURLToPath } from 'url';
import { z } from '@kadi.build/core';
import type { DocsConfig } from '../config/types.js';
import { startTask } from '../utils/tasks.js';

const __dirname = path.dirname(fileURLToPath(import.meta.url));
const PROJECT_ROOT = path.resolve(__dirname, '../..');

const DEFAULT_COLLECTION = 'agents-docs';
const DEFAULT_DATABASE = 'kadi';
const MAX_TOKENS_PER_CHUNK = 500;
const INDEX_CONCURRENCY = 5;

export function registerPipelineTool(
  client: any,
  config: DocsConfig,
): void {
  client.registerTool(
    {
      name: 'agents-docs-pipeline',
      description:
        'Full documentation pipeline: sync repos → collect markdown → chunk+embed via graph-index → create edges. ' +
        'Runs as a background task and returns a taskId for polling.',
      input: z.object({
        repos: z.array(z.string()).optional()
          .describe('Specific repos to process (default: all)'),
        skipIndex: z.boolean().optional()
          .describe('Skip graph indexing (default: false)'),
        collection: z.string().optional()
          .describe('Target collection name (default: agents-docs)'),
        database: z.string().optional()
          .describe('Target database (default: kadi)'),
      }),
    },
    async (input: { repos?: string[]; skipIndex?: boolean; collection?: string; database?: string }) => {
      const taskId = startTask(async () => {
        const startTime = Date.now();
        const collection = input.collection ?? DEFAULT_COLLECTION;
        const database = input.database ?? DEFAULT_DATABASE;

        // Step 1: Collect pages from repos
        console.error('[pipeline] Step 1: Collecting pages…');
        const reposToSync = input.repos
          ? Object.entries(config.repos).filter(([name]) => input.repos!.includes(name))
          : Object.entries(config.repos);

        const pages: Array<{
          title: string;
          slug: string;
          pageUrl: string;
          source: string;
          content: string;
        }> = [];

        for (const [name, repo] of reposToSync) {
          const repoPath = path.resolve(PROJECT_ROOT, repo.path);
          if (!fs.existsSync(repoPath)) continue;

          const crawlPatterns = (repo as any).indexCrawl ?? repo.crawl;
          const mdFiles = collectMarkdownFiles(repoPath, crawlPatterns);
          for (const file of mdFiles) {
            const content = fs.readFileSync(file, 'utf-8');
            const relativePath = path.relative(repoPath, file);
            const slug = `${name}/${relativePath.replace(/\.md$/, '').replace(/\\/g, '/')}`;
            const title = extractTitle(content) ?? `${name}/${relativePath}`;
            // (excerpt ends here; the rest of the function is not shown)
```
Search tool (delegates to docs memory ability or remote):
```typescript
import { z } from '@kadi.build/core';
import type { DocsConfig } from '../config/types.js';

export function registerSearchTools(
  client: any,
  config: DocsConfig,
  apiKey?: string,
  docsMemoryAbility?: any,
): void {
  // agents-docs-search: 4-signal hybrid search
  client.registerTool(
    {
      name: 'agents-docs-search',
      description:
        'Search AGENTS documentation using 4-signal hybrid recall (semantic + keyword + graph + structural). ' +
        'Returns ranked results with content snippets and source attribution.',
      input: z.object({
        query: z.string().describe('Search query'),
        collection: z.string().optional().describe('Collection to search (default: agents-docs)'),
        limit: z.number().optional().describe('Max results (default: 10)'),
      }),
    },
    async (input: { query: string; collection?: string; limit?: number }) => {
      const collection = input.collection ?? 'agents-docs';
      const limit = input.limit ?? 10;

      if (docsMemoryAbility) {
        return docsMemoryAbility.invoke('docs-search', { query: input.query, collection, limit });
      }

      try {
        return await client.invokeRemote('docs-search', { query: input.query, collection, limit });
      } catch (err: any) {
        return { success: false, error: `Search unavailable: ${err?.message ?? err}` };
      }
    },
  );

  // agents-docs-page: fetch a single page by slug
  client.registerTool(
    {
      name: 'agents-docs-page',
      description: 'Fetch a single documentation page by slug. Returns full content and metadata.',
      input: z.object({
        slug: z.string().describe('Document slug (e.g., "agent-worker/README")'),
        collection: z.string().optional().describe('Collection (default: agents-docs)'),
      }),
    },
    async (input: { slug: string; collection?: string }) => {
      const collection = input.collection ?? 'agents-docs';

      if (docsMemoryAbility) {
        return docsMemoryAbility.invoke('docs-page', { slug: input.slug, collection });
      }

      try {
        return await client.invokeRemote('docs-page', { slug: input.slug, collection });
      } catch (err: any) {
        return { success: false, error: `Page fetch unavailable: ${err?.message ?? err}` };
      }
    },
  );
}
```
Sync tool header (shows crawl/copy behavior):
```typescript
import * as fs from 'fs';
import * as path from 'path';
import { fileURLToPath } from 'url';
import { z } from '@kadi.build/core';
import type { DocsConfig } from '../config/types.js';

const __dirname = path.dirname(fileURLToPath(import.meta.url));
const PROJECT_ROOT = path.resolve(__dirname, '../..');
const DOCS_DIR = path.join(PROJECT_ROOT, 'docs');

export function registerSyncTool(client: any, config: DocsConfig): void {
  client.registerTool(
    {
      name: 'agents-docs-sync',
      description:
        'Crawl all configured repos and collect documentation files into the docs/ directory. ' +
        'Reads markdown files according to each repo\'s crawl patterns.',
      input: z.object({
        repos: z.array(z.string()).optional()
          .describe('Specific repos to sync (default: all)'),
        dryRun: z.boolean().optional()
          .describe('List files that would be synced without copying (default: false)'),
      }),
    },
    async (input: { repos?: string[]; dryRun?: boolean }) => {
      const startTime =
      // (excerpt ends here; the rest of the function is not shown)
```
README generation helper (parsing README sections used by the generator):
```typescript
interface ReadmeSection {
  heading: string;
  content: string;
  line: number;
}

function parseReadme(content: string): ReadmeSection[] {
  const lines = content.split('\n');
  const sections: ReadmeSection[] = [];
  let currentHeading = '';
  let currentContent: string[] = [];
  let currentLine = 0;

  for (let i = 0; i < lines.length; i++) {
    const match = lines[i].match(/^##\s+(.+)/);
    if (match) {
      if (currentHeading) {
        sections.push({ heading: currentHeading, content: currentContent.join('\n').trim(), line: currentLine });
      }
      currentHeading = match[1].trim();
      currentContent = [];
      currentLine = i;
    } else if (currentHeading) {
      currentContent.push(lines[i]);
    }
  }
  if (currentHeading) {
    sections.push({ heading: currentHeading, content: currentContent.join('\n').trim(), line: currentLine });
  }
  return sections;
}
```
Dependencies
Abilities declared in agent.json:
- secret-ability (required) — used to receive vault secrets at runtime (kadi secret receive).
- ability-log (required) — structured logging ability.
Runtime / library dependencies (from source and build scripts):
- @kadi.build/core — Zod wrappers and client typings used by tools.
- node: standard Node APIs (fs, path, url).
- Astro — site generation (used in build and dev scripts).
- tsx / TypeScript — dev/run tooling.
- kadi, kadi-secret — build scripts call kadi install and kadi-secret.
- ability-graph: the pipeline calls its graph-index ability directly for chunking and embedding.
- ability-docs-memory — search tools may delegate to a docs-memory ability (native or broker).
- ArcadeDB — external database for storing index/graph (configured in [arcadedb]).
What depends on agent-docs:
- Any UI or agent that needs documentation search or page fetch (e.g., web frontend, chatbots, other agents) can call agents-docs-search / agents-docs-page via broker.
- Tools that present site status or auto-generate READMEs rely on agents-docs’ repo crawling and metadata.
Developer notes / tips
- Tool registration pattern: all tools are registered with KadiClient.registerTool; follow the z.object input schema for validation.
- To run locally (development): the scripts include dev (Astro), dev:agent (agent run), and start (broker mode). See the scripts defined in agent.json.
- Secrets in local development: the deploy block expects secrets via broker; for local testing provide ARCADE_* and MODEL_MANAGER_* env vars or mock the secret-ability.
- Pipeline concurrency and chunking constants (MAX_TOKENS_PER_CHUNK, INDEX_CONCURRENCY) are defined in src/tools/pipeline.ts — tune them when changing chunk/embedding behavior.
- Template directory for README generation is templates/ at repo root (TEMPLATE_DIR constant in readme-generate.ts).
- When adding new tools, register them in src/tools/index.ts so registerAllTools includes them.
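As an illustration of the registration pattern in the notes above, the sketch below adds one tool and exercises it against an in-memory client. It is deliberately self-contained, so it does not import @kadi.build/core. ToolClient, FakeClient, and registerPingTool are hypothetical stand-ins, the agents-docs-ping tool does not exist in agent-docs, and a real tool would pass a z.object(...) schema as input instead of the placeholder used here.

```typescript
// Minimal stand-ins for this sketch; the real client comes from @kadi.build/core.
interface ToolDefinition {
  name: string;
  description: string;
  input: unknown; // a z.object(...) schema in the real codebase
}
type ToolHandler = (input: any) => Promise<unknown>;

interface ToolClient {
  registerTool(definition: ToolDefinition, handler: ToolHandler): void;
}

// HYPOTHETICAL tool used only to show the (definition, handler) shape.
function registerPingTool(client: ToolClient): void {
  client.registerTool(
    {
      name: 'agents-docs-ping',
      description: 'Report that the agent is reachable.',
      input: {}, // placeholder for z.object({})
    },
    async () => ({ success: true, timestamp: new Date().toISOString() }),
  );
}

// In-memory client that records handlers by tool name.
class FakeClient implements ToolClient {
  readonly tools = new Map<string, ToolHandler>();

  registerTool(definition: ToolDefinition, handler: ToolHandler): void {
    if (this.tools.has(definition.name)) {
      throw new Error(`Tool already registered: ${definition.name}`);
    }
    this.tools.set(definition.name, handler);
  }
}

const fake = new FakeClient();
registerPingTool(fake);
```

In the actual repository the equivalent step would be a call to the new register function from registerAllTools in src/tools/index.ts, next to the existing registrations.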
If you need help locating a specific tool implementation, search src/tools for register