agent-docs

Documentation engine and Astro site for AGENTS

agent-docs is the documentation engine and KADI agent behind the AGENTS project’s documentation site and developer tooling. It (1) crawls configured repositories, (2) collects and normalizes markdown into an Astro site under docs/, (3) provides indexing and semantic/graph search through graph-index and ArcadeDB, and (4) exposes operational tools over the KADI broker for status, sync, pipeline orchestration, README lint/generation, and search. It runs as a KADI agent (entrypoint dist/agent.js) and integrates with secret vaults and remote brokers.

  • Core runtime: a KADI client (agent) registers a set of tools (registerAllTools) that can be invoked locally or remotely via the broker (remote broker defined in agent.json and config.toml).
  • Sync stage: registerSyncTool crawls configured repos and copies/aggregates markdown into docs/ for Astro site building.
  • Pipeline stage: registerPipelineTool orchestrates an asynchronous background task (startTask) that:
    • collects pages from repos,
    • chunks and embeds content,
    • calls ability-graph’s graph-index to create embeddings/graph edges,
    • writes index entries to ArcadeDB (default collection agents-docs).
  • Search stage: registerSearchTools exposes agents-docs-search, agents-docs-page and agents-docs-reindex. These delegate to ability-docs-memory if present (docsMemoryAbility.invoke), or fall back to client.invokeRemote to a memory service.
  • README management: registerReadmeLintTool and registerReadmeGenerateTool validate and synthesize README.md files from templates and agent.json metadata.
  • Observability / tasks: registerStatusTool and registerTaskStatusTool report state, repo health, and task progress.
  • Secrets and environment: uses vault-delivered secrets (model-manager, arcadedb) and environment variables (e.g., ARCADE_HOST/PORT) for production deployment.
  • How it fits: agent-docs provides discoverability and indexing services consumed by UIs, other agents, and human operators via KADI broker RPC tools.
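The pipeline and task-status tools rely on a "start now, poll later" pattern. The following is a minimal sketch of that pattern, not the actual src/utils/tasks.ts: the names startTaskSketch, getTaskSketch, and TaskState are hypothetical, and the real implementation may track different fields.

```typescript
// Hypothetical in-memory task registry illustrating the background-task pattern.
// The real startTask may store different state or persist it elsewhere.
type TaskState =
  | { status: 'running'; startedAt: number }
  | { status: 'completed'; startedAt: number; result: unknown }
  | { status: 'failed'; startedAt: number; error: string };

const tasks = new Map<string, TaskState>();
let nextTaskNumber = 0;

// Starts `work` without awaiting it and returns an id the caller can poll.
function startTaskSketch(work: () => Promise<unknown>): string {
  const taskId = `task-${++nextTaskNumber}`;
  const startedAt = Date.now();
  tasks.set(taskId, { status: 'running', startedAt });
  work().then(
    (result) => {
      tasks.set(taskId, { status: 'completed', startedAt, result });
    },
    (err: unknown) => {
      const error = err instanceof Error ? err.message : String(err);
      tasks.set(taskId, { status: 'failed', startedAt, error });
    },
  );
  return taskId;
}

// Returns the current state, or undefined for an unknown id.
function getTaskSketch(taskId: string): TaskState | undefined {
  return tasks.get(taskId);
}
```

A tool handler built this way can return the taskId right away, and a status tool only needs to look the id up in the map.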

Data flow (high-level):

  1. An operator triggers agents-docs-sync or runs the pipeline (agents-docs-pipeline).
  2. Sync collects markdown → docs/ (site build) and yields page list.
  3. Pipeline chunks pages → asks ability-graph to index embeddings/graph → writes edges/records into ArcadeDB.
  4. agents-docs-search queries memory/index (ability-docs-memory or remote docs-search) and returns ranked results.
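Step 3 of the flow can be pictured as "split each page into bounded chunks, then index the chunks a few at a time". The sketch below illustrates that idea under two loud assumptions: one whitespace-separated word is treated as one token, and the helper names chunkByWords and mapWithLimit are hypothetical. In the agent itself, chunking and embedding go through graph-index, whose behavior is not shown in this document.

```typescript
const MAX_TOKENS_PER_CHUNK = 500; // same constant names as src/tools/pipeline.ts
const INDEX_CONCURRENCY = 5;

// Assumption: one word counts as one token. A real tokenizer would differ.
function chunkByWords(content: string, maxTokens: number = MAX_TOKENS_PER_CHUNK): string[] {
  if (maxTokens < 1) throw new RangeError('maxTokens must be at least 1');
  const words = content.split(/\s+/).filter((word) => word.length > 0);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += maxTokens) {
    chunks.push(words.slice(i, i + maxTokens).join(' '));
  }
  return chunks;
}

// Runs `worker` over `items` with at most `limit` calls in flight,
// returning results in the same order as the inputs.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function run(): Promise<void> {
    while (next < items.length) {
      const index = next++; // claimed synchronously, so runners never share an index
      results[index] = await worker(items[index]);
    }
  }
  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => run()));
  return results;
}
```

With these two pieces, indexing a page would look like mapWithLimit(chunkByWords(page.content), INDEX_CONCURRENCY, indexChunk), where indexChunk stands in for whatever call reaches graph-index.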

Below is a catalog of the tools this agent registers (see src/tools/index.ts and individual tool registrations).

| Tool name | Description | Key input params |
| --- | --- | --- |
| agents-docs-status | Show documentation system status: configured repos, sync state, build health. | none |
| agents-docs-config | Read and describe the current agent-docs configuration. | none |
| agents-docs-sync | Crawl configured repos and collect documentation files into docs/. | repos?: string[], dryRun?: boolean |
| agents-docs-pipeline | Full pipeline: sync → chunk+embed → graph-index → create edges. Runs as background task (returns taskId). | repos?: string[], skipIndex?: boolean, collection?: string, database?: string |
| agents-docs-search | Hybrid search (semantic+keyword+graph+structural). Delegates to docsMemoryAbility or remote. | query: string, collection?: string, limit?: number |
| agents-docs-page | Fetch a single documentation page by slug. Delegates to docsMemoryAbility or remote. | slug: string, collection?: string |
| agents-docs-reindex | Trigger a full reindex of documentation into ArcadeDB (use pipeline for full flow). | collection?: string |
| agents-docs-readme-lint | Validate README.md files in each repo against templates. | repos?: string[] |
| agents-docs-readme-generate | Generate or update README.md files using templates + LLM enhancement. | (tool-specific inputs; templates used from /templates) |
| agents-docs-task-status | Query background task status (taskId). | taskId: string |
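For callers, the catalog can be captured as a type map so that tool names and inputs are checked at compile time. This is a sketch only: ToolInputs, Invoke, and typedCaller are hypothetical names, the transport is whatever function the caller already uses to reach the broker, and agents-docs-readme-generate is left out because its inputs are not specified here.

```typescript
// Input shapes copied from the catalog; the type names are illustrative.
interface ToolInputs {
  'agents-docs-status': Record<string, never>;
  'agents-docs-config': Record<string, never>;
  'agents-docs-sync': { repos?: string[]; dryRun?: boolean };
  'agents-docs-pipeline': {
    repos?: string[];
    skipIndex?: boolean;
    collection?: string;
    database?: string;
  };
  'agents-docs-search': { query: string; collection?: string; limit?: number };
  'agents-docs-page': { slug: string; collection?: string };
  'agents-docs-reindex': { collection?: string };
  'agents-docs-readme-lint': { repos?: string[] };
  'agents-docs-task-status': { taskId: string };
}

// Hypothetical transport: any function that sends (toolName, input) somewhere.
type Invoke = (tool: string, input: unknown) => Promise<unknown>;

// Wraps a transport so that each tool name only accepts its own input shape.
function typedCaller(invoke: Invoke) {
  return <K extends keyof ToolInputs>(tool: K, input: ToolInputs[K]): Promise<unknown> =>
    invoke(tool, input);
}
```

A caller would write typedCaller(send)('agents-docs-search', { query: 'broker', limit: 5 }), and a misspelled tool name or a missing query would fail type checking.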

Notes on delegation:

  • agents-docs-search and agents-docs-page call docsMemoryAbility.invoke('docs-search', …) and docsMemoryAbility.invoke('docs-page', …) respectively when a docs memory ability is provided; otherwise they fall back to client.invokeRemote against a remote memory service.
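The delegation rule can be factored into one helper and exercised with stubs. The sketch below assumes only the two call shapes named in the note (invoke and invokeRemote); the interfaces AbilityLike and RemoteClientLike and the function delegateSketch are hypothetical and exist just to keep the example free of dependencies.

```typescript
// Minimal structural types for the two delegation targets (assumed shapes).
interface AbilityLike {
  invoke(tool: string, input: unknown): Promise<unknown>;
}
interface RemoteClientLike {
  invokeRemote(tool: string, input: unknown): Promise<unknown>;
}

// Prefers the local ability; otherwise tries the remote service and
// converts a remote failure into a { success: false } result.
async function delegateSketch(
  tool: 'docs-search' | 'docs-page',
  input: Record<string, unknown>,
  label: string,
  client: RemoteClientLike,
  ability?: AbilityLike,
): Promise<unknown> {
  if (ability) {
    // Errors from the ability are not caught here and reach the caller.
    return ability.invoke(tool, input);
  }
  try {
    return await client.invokeRemote(tool, input);
  } catch (err: unknown) {
    const message = err instanceof Error ? err.message : String(err);
    return { success: false, error: `${label} unavailable: ${message}` };
  }
}
```

One consequence worth noting: only the remote path is wrapped in try/catch, so a failing local ability surfaces as a thrown error rather than a { success: false } payload.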

If you need to inspect code that registers tools, see src/tools/index.ts which calls each registerXTool function.

Configuration values come from config.toml at repo root and from environment/secrets delivered per deployment.

Key config.toml fields (exact keys):

  • [agent]
    • ID = "agent-docs"
    • VERSION = "0.1.3"
  • [logging]
    • LEVEL = "info"
  • [broker.remote]
    • URL = "wss://broker.dadavidtseng.com/kadi"
    • NETWORKS = ["global"]
  • [secrets]
    • VAULTS = ["model-manager", "arcadedb"]
    • KEYS = ["MODEL_MANAGER_API_KEY", "MODEL_MANAGER_BASE_URL", "ARCADE_USERNAME", "ARCADE_PASSWORD"]
  • [arcadedb]
    • HOST = "arcadedb.dadavidtseng.com"
    • PORT = 443
    • USERNAME = "root"
    • DATABASE = "agents_logs"
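Once config.toml is parsed, the fields listed here map naturally onto a typed object. The following sketch shows one way to type and sanity-check it; the interface AgentDocsToml and the function configProblems are hypothetical, the checks are illustrative rather than the agent's own validation, and how the file is actually loaded is not covered in this document.

```typescript
// Shape of the parsed config.toml, using the exact keys listed above.
interface AgentDocsToml {
  agent: { ID: string; VERSION: string };
  logging: { LEVEL: string };
  broker: { remote: { URL: string; NETWORKS: string[] } };
  secrets: { VAULTS: string[]; KEYS: string[] };
  arcadedb: { HOST: string; PORT: number; USERNAME: string; DATABASE: string };
}

// The documented values, written as the object a TOML parser would produce.
const exampleConfig: AgentDocsToml = {
  agent: { ID: 'agent-docs', VERSION: '0.1.3' },
  logging: { LEVEL: 'info' },
  broker: { remote: { URL: 'wss://broker.dadavidtseng.com/kadi', NETWORKS: ['global'] } },
  secrets: {
    VAULTS: ['model-manager', 'arcadedb'],
    KEYS: ['MODEL_MANAGER_API_KEY', 'MODEL_MANAGER_BASE_URL', 'ARCADE_USERNAME', 'ARCADE_PASSWORD'],
  },
  arcadedb: { HOST: 'arcadedb.dadavidtseng.com', PORT: 443, USERNAME: 'root', DATABASE: 'agents_logs' },
};

// Illustrative checks only; returns a human-readable list of problems.
function configProblems(cfg: AgentDocsToml): string[] {
  const problems: string[] = [];
  if (!/^wss?:\/\//.test(cfg.broker.remote.URL)) {
    problems.push('broker.remote.URL must start with ws:// or wss://');
  }
  const port = cfg.arcadedb.PORT;
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    problems.push('arcadedb.PORT must be an integer between 1 and 65535');
  }
  if (cfg.arcadedb.HOST.trim() === '') {
    problems.push('arcadedb.HOST must not be empty');
  }
  if (cfg.secrets.VAULTS.length === 0) {
    problems.push('secrets.VAULTS must list at least one vault');
  }
  return problems;
}
```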

agent.json highlights (deployment & runtime):

  • "abilities": { "secret-ability": "", "ability-log": "" }: the agent declares that it uses the secret and logging abilities.
  • "brokers": { "remote": "wss://broker.dadavidtseng.com/kadi" }: the remote broker to connect to.
  • "networks": ["global"]: network scopes.
  • Deploy block (akash-mainnet) expects secret vaults:
    • model-manager: required MODEL_MANAGER_API_KEY, MODEL_MANAGER_BASE_URL
    • arcadedb: required ARCADE_USERNAME, ARCADE_PASSWORD
    • secrets.delivery: "broker": secrets are expected to be delivered via the broker (kadi secret receive …)

Environment variables in deployment/services:

  • NODE_ENV=production
  • ARCADE_HOST and ARCADE_PORT (example values set in agent.json deploy/service)

Secrets vault usage:

  • During build/deploy, the agent expects KADI secret delivery for the "model-manager" and "arcadedb" vaults. The runtime command in the deployment runs kadi secret receive --vault model-manager --vault arcadedb before starting the agent.
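A startup check along these lines can make missing secrets obvious before any tool runs. This is a sketch under assumptions: the helper names missingSecrets and resolveArcadeEndpoint are hypothetical, and the rule that ARCADE_HOST/ARCADE_PORT override the config.toml values is an assumption rather than documented behavior. The environment is passed in as a plain object so the functions are easy to test.

```typescript
// Secret keys named in config.toml and the deploy block.
const REQUIRED_SECRET_KEYS = [
  'MODEL_MANAGER_API_KEY',
  'MODEL_MANAGER_BASE_URL',
  'ARCADE_USERNAME',
  'ARCADE_PASSWORD',
] as const;

type Env = Record<string, string | undefined>;

// Lists required keys that are absent or blank in the given environment.
function missingSecrets(env: Env): string[] {
  return REQUIRED_SECRET_KEYS.filter((key) => (env[key] ?? '').trim() === '');
}

// Assumption: environment variables take precedence over config.toml values.
function resolveArcadeEndpoint(
  env: Env,
  fallback: { host: string; port: number },
): { host: string; port: number } {
  const host = (env.ARCADE_HOST ?? '').trim() || fallback.host;
  const parsed = Number(env.ARCADE_PORT);
  const port = Number.isInteger(parsed) && parsed > 0 ? parsed : fallback.port;
  return { host, port };
}
```

For local testing, passing process.env as the env argument and failing fast when missingSecrets returns a non-empty list would surface a misconfigured shell immediately.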

Below are representative code excerpts copied from the source showing registration patterns and delegations.

registerConfigTool (reads and summarizes config):

import { z } from '@kadi.build/core';
import type { DocsConfig } from '../config/types.js';

export function registerConfigTool(client: any, config: DocsConfig): void {
  client.registerTool(
    {
      name: 'agents-docs-config',
      description: 'Read and describe the current agent-docs configuration. Shows site settings, repo list, and agent config.',
      input: z.object({}),
    },
    async () => {
      return {
        success: true,
        config: {
          site: config.site,
          repoCount: Object.keys(config.repos).length,
          repos: Object.fromEntries(
            Object.entries(config.repos).map(([name, repo]) => [
              name,
              { type: repo.type, description: repo.description, crawl: repo.crawl },
            ]),
          ),
          output: config.output,
          agent: {
            name: config.agent.name,
            version: config.agent.version,
            brokers: Object.keys(config.agent.brokers),
            networks: config.agent.networks,
          },
        },
      };
    },
  );
}

Tool registry (registers all tools on startup):

import type { DocsConfig } from '../config/types.js';
import { registerStatusTool } from './status.js';
import { registerConfigTool } from './config-tool.js';
import { registerSyncTool } from './sync.js';
import { registerPipelineTool } from './pipeline.js';
import { registerSearchTools } from './search-tools.js';
import { registerReadmeLintTool } from './readme-lint.js';
import { registerReadmeGenerateTool } from './readme-generate.js';
import { registerTaskStatusTool } from './task-status.js';

export function registerAllTools(
  client: any,
  config: DocsConfig,
  secrets?: any,
  apiKey?: string,
  docsMemoryAbility?: any,
): void {
  registerStatusTool(client, config);
  registerConfigTool(client, config);
  registerSyncTool(client, config);
  registerPipelineTool(client, config);
  registerSearchTools(client, config, apiKey, docsMemoryAbility);
  registerReadmeLintTool(client, config);
  registerReadmeGenerateTool(client, config);
  registerTaskStatusTool(client);
  console.error('[tools] registered all agents-docs tools');
}

Pipeline (starts a background task and collects pages):

import * as fs from 'fs';
import * as path from 'path';
import { fileURLToPath } from 'url';
import { z } from '@kadi.build/core';
import type { DocsConfig } from '../config/types.js';
import { startTask } from '../utils/tasks.js';

const __dirname = path.dirname(fileURLToPath(import.meta.url));
const PROJECT_ROOT = path.resolve(__dirname, '../..');
const DEFAULT_COLLECTION = 'agents-docs';
const DEFAULT_DATABASE = 'kadi';
const MAX_TOKENS_PER_CHUNK = 500;
const INDEX_CONCURRENCY = 5;

export function registerPipelineTool(
  client: any,
  config: DocsConfig,
): void {
  client.registerTool(
    {
      name: 'agents-docs-pipeline',
      description:
        'Full documentation pipeline: sync repos → collect markdown → chunk+embed via graph-index → create edges. ' +
        'Runs as a background task and returns a taskId for polling.',
      input: z.object({
        repos: z.array(z.string()).optional()
          .describe('Specific repos to process (default: all)'),
        skipIndex: z.boolean().optional()
          .describe('Skip graph indexing (default: false)'),
        collection: z.string().optional()
          .describe('Target collection name (default: agents-docs)'),
        database: z.string().optional()
          .describe('Target database (default: kadi)'),
      }),
    },
    async (input: { repos?: string[]; skipIndex?: boolean; collection?: string; database?: string }) => {
      const taskId = startTask(async () => {
        const startTime = Date.now();
        const collection = input.collection ?? DEFAULT_COLLECTION;
        const database = input.database ?? DEFAULT_DATABASE;
        // Step 1: Collect pages from repos
        console.error('[pipeline] Step 1: Collecting pages…');
        const reposToSync = input.repos
          ? Object.entries(config.repos).filter(([name]) => input.repos!.includes(name))
          : Object.entries(config.repos);
        const pages: Array<{
          title: string;
          slug: string;
          pageUrl: string;
          source: string;
          content: string;
        }> = [];
        for (const [name, repo] of reposToSync) {
          const repoPath = path.resolve(PROJECT_ROOT, repo.path);
          if (!fs.existsSync(repoPath)) continue;
          const crawlPatterns = (repo as any).indexCrawl ?? repo.crawl;
          const mdFiles = collectMarkdownFiles(repoPath, crawlPatterns);
          for (const file of mdFiles) {
            const content = fs.readFileSync(file, 'utf-8');
            const relativePath = path.relative(repoPath, file);
            const slug = `${name}/${relativePath.replace(/\.md$/, '').replace(/\\/g, '/')}`;
            const title = extractTitle(content) ?? `${name}/${relativePath}`;
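The pipeline excerpt calls extractTitle and builds a slug inline, but the helper's body is not part of the excerpt. The sketch below shows one plausible reading: toSlug repeats the slug expression from the excerpt, while extractTitleSketch is a hypothetical stand-in that takes the first level-1 markdown heading. The real extractTitle may use front matter or other rules.

```typescript
// Hypothetical stand-in for extractTitle: returns the text of the first
// "# Heading" line, or undefined when the page has none.
// Limitation: a "# ..." line inside a code block would also match.
function extractTitleSketch(content: string): string | undefined {
  const match = content.match(/^#\s+(.+)$/m);
  return match ? match[1].trim() : undefined;
}

// Same expression as in the pipeline excerpt: drop the .md extension and
// normalize Windows path separators to forward slashes.
function toSlug(repoName: string, relativePath: string): string {
  return `${repoName}/${relativePath.replace(/\.md$/, '').replace(/\\/g, '/')}`;
}

// Mirrors the excerpt's fallback: use the repo-relative path when no title is found.
function pageTitle(repoName: string, relativePath: string, content: string): string {
  return extractTitleSketch(content) ?? `${repoName}/${relativePath}`;
}
```

Under this reading, a README.md in the agent-worker repo gets the slug agent-worker/README, which is the form agents-docs-page expects as its slug input.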

Search tool (delegates to docs memory ability or remote):

import { z } from '@kadi.build/core';
import type { DocsConfig } from '../config/types.js';

export function registerSearchTools(
  client: any,
  config: DocsConfig,
  apiKey?: string,
  docsMemoryAbility?: any,
): void {
  // agents-docs-search — 4-signal hybrid search
  client.registerTool(
    {
      name: 'agents-docs-search',
      description:
        'Search AGENTS documentation using 4-signal hybrid recall (semantic + keyword + graph + structural). ' +
        'Returns ranked results with content snippets and source attribution.',
      input: z.object({
        query: z.string().describe('Search query'),
        collection: z.string().optional().describe('Collection to search (default: agents-docs)'),
        limit: z.number().optional().describe('Max results (default: 10)'),
      }),
    },
    async (input: { query: string; collection?: string; limit?: number }) => {
      const collection = input.collection ?? 'agents-docs';
      const limit = input.limit ?? 10;
      if (docsMemoryAbility) {
        return docsMemoryAbility.invoke('docs-search', { query: input.query, collection, limit });
      }
      try {
        return await client.invokeRemote('docs-search', { query: input.query, collection, limit });
      } catch (err: any) {
        return { success: false, error: `Search unavailable: ${err?.message ?? err}` };
      }
    },
  );
  // agents-docs-page — Fetch a single page by slug
  client.registerTool(
    {
      name: 'agents-docs-page',
      description: 'Fetch a single documentation page by slug. Returns full content and metadata.',
      input: z.object({
        slug: z.string().describe('Document slug (e.g., "agent-worker/README")'),
        collection: z.string().optional().describe('Collection (default: agents-docs)'),
      }),
    },
    async (input: { slug: string; collection?: string }) => {
      const collection = input.collection ?? 'agents-docs';
      if (docsMemoryAbility) {
        return docsMemoryAbility.invoke('docs-page', { slug: input.slug, collection });
      }
      try {
        return await client.invokeRemote('docs-page', { slug: input.slug, collection });
      } catch (err: any) {
        return { success: false, error: `Page fetch unavailable: ${err?.message ?? err}` };
      }
    },
  );
}

Sync tool header (shows crawl/copy behavior):

import * as fs from 'fs';
import * as path from 'path';
import { fileURLToPath } from 'url';
import { z } from '@kadi.build/core';
import type { DocsConfig } from '../config/types.js';

const __dirname = path.dirname(fileURLToPath(import.meta.url));
const PROJECT_ROOT = path.resolve(__dirname, '../..');
const DOCS_DIR = path.join(PROJECT_ROOT, 'docs');

export function registerSyncTool(client: any, config: DocsConfig): void {
  client.registerTool(
    {
      name: 'agents-docs-sync',
      description:
        'Crawl all configured repos and collect documentation files into the docs/ directory. ' +
        'Reads markdown files according to each repo\'s crawl patterns.',
      input: z.object({
        repos: z.array(z.string()).optional()
          .describe('Specific repos to sync (default: all)'),
        dryRun: z.boolean().optional()
          .describe('List files that would be synced without copying (default: false)'),
      }),
    },
    async (input: { repos?: string[]; dryRun?: boolean }) => {
      const startTime =
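Because the sync excerpt stops before the handler body, the following sketch illustrates what a dryRun-aware copy step could look like. It is not the agent's implementation: listMarkdown and syncRepoSketch are hypothetical names, the destination layout docs/<repo>/<relative path> is an assumption, and the sketch takes every .md file instead of honoring each repo's crawl patterns.

```typescript
import * as fs from 'fs';
import * as path from 'path';

// Recursively lists .md files under `root`, skipping node_modules and dot-directories.
// Assumption: the real tool filters by crawl patterns instead.
function listMarkdown(root: string): string[] {
  const found: string[] = [];
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    const full = path.join(root, entry.name);
    if (entry.isDirectory()) {
      if (entry.name === 'node_modules' || entry.name.startsWith('.')) continue;
      found.push(...listMarkdown(full));
    } else if (entry.isFile() && entry.name.endsWith('.md')) {
      found.push(full);
    }
  }
  return found.sort();
}

// With dryRun, only reports what would be copied; otherwise copies each file
// to <docsDir>/<repoName>/<relative path>, creating directories as needed.
function syncRepoSketch(
  repoName: string,
  repoPath: string,
  docsDir: string,
  dryRun: boolean = false,
): { files: string[]; copied: number } {
  const files = listMarkdown(repoPath).map((file) => path.relative(repoPath, file));
  if (dryRun) return { files, copied: 0 };
  for (const relative of files) {
    const destination = path.join(docsDir, repoName, relative);
    fs.mkdirSync(path.dirname(destination), { recursive: true });
    fs.copyFileSync(path.join(repoPath, relative), destination);
  }
  return { files, copied: files.length };
}
```

Keeping the file listing separate from the copy loop is what makes the dry run cheap: both modes share the same list and differ only in whether the loop runs.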

README generation helper (parsing README sections used by the generator):

interface ReadmeSection {
  heading: string;
  content: string;
  line: number;
}

function parseReadme(content: string): ReadmeSection[] {
  const lines = content.split('\n');
  const sections: ReadmeSection[] = [];
  let currentHeading = '';
  let currentContent: string[] = [];
  let currentLine = 0;
  for (let i = 0; i < lines.length; i++) {
    const match = lines[i].match(/^##\s+(.+)/);
    if (match) {
      if (currentHeading) {
        sections.push({ heading: currentHeading, content: currentContent.join('\n').trim(), line: currentLine });
      }
      currentHeading = match[1].trim();
      currentContent = [];
      currentLine = i;
    } else if (currentHeading) {
      currentContent.push(lines[i]);
    }
  }
  if (currentHeading) {
    sections.push({ heading: currentHeading, content: currentContent.join('\n').trim(), line: currentLine });
  }
  return sections;
}
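The sections returned by parseReadme lend themselves to a simple lint pass. The sketch below is an illustration only: lintSections is a hypothetical helper, the required section names are made up for the example, and the actual rules applied by agents-docs-readme-lint come from the templates, which are not reproduced here.

```typescript
// Same shape as the ReadmeSection produced by parseReadme.
interface ReadmeSection {
  heading: string;
  content: string;
  line: number;
}

// Reports required headings that are absent (case-insensitive match)
// and headings that are present but have no body text.
function lintSections(
  sections: ReadmeSection[],
  required: string[],
): { missing: string[]; empty: string[] } {
  const present = new Set(sections.map((section) => section.heading.toLowerCase()));
  const missing = required.filter((name) => !present.has(name.toLowerCase()));
  const empty = sections
    .filter((section) => section.content.length === 0)
    .map((section) => section.heading);
  return { missing, empty };
}
```

Note that parseReadme only splits on level-2 headings and records zero-based line indexes, so any text before the first "## " heading never appears in a section and cannot be linted this way.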

Abilities declared in agent.json:

  • secret-ability (required) — used to receive vault secrets at runtime (kadi secret receive).
  • ability-log (required) — structured logging ability.

Runtime / library dependencies (from source and build scripts):

  • @kadi.build/core: Zod wrappers and client typings used by tools.
  • node: standard Node APIs (fs, path, url).
  • Astro: site generation (used in build and dev scripts).
  • tsx / TypeScript: dev/run tooling.
  • kadi, kadi-secret: build scripts call kadi install and kadi-secret.
  • ability-graph: the pipeline calls its graph-index ability directly for chunking and embedding.
  • ability-docs-memory: search tools may delegate to a docs-memory ability (native or broker).
  • ArcadeDB: external database for storing the index/graph (configured in [arcadedb]).

What depends on agent-docs:

  • Any UI or agent that needs documentation search or page fetch (e.g., web frontend, chatbots, other agents) can call agents-docs-search / agents-docs-page via the broker.
  • Tools that present site status or auto-generate READMEs rely on the repo crawling and metadata that agent-docs provides.

Development notes:

  • Tool registration pattern: all tools are registered with KadiClient.registerTool; follow the z.object input schema for validation.
  • To run locally (development): scripts include dev (Astro), dev:agent (agent run), start (broker mode). See package.json scripts in agent.json.
  • Secrets in local development: the deploy block expects secrets via the broker; for local testing, provide ARCADE_* and MODEL_MANAGER_* env vars or mock the secret-ability.
  • Pipeline concurrency and chunking constants (MAX_TOKENS_PER_CHUNK, INDEX_CONCURRENCY) are defined in src/tools/pipeline.ts; tune them when changing chunk/embedding behavior.
  • Template directory for README generation is templates/ at repo root (TEMPLATE_DIR constant in readme-generate.ts).
  • When adding new tools, register them in src/tools/index.ts so that registerAllTools includes them.
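To make the registration pattern concrete, here is a dependency-free sketch of adding a tool and exercising it against a mock client. Everything in it is hypothetical: the tool name agents-docs-repo-count does not exist in the agent, MockClient only imitates the registerTool(spec, handler) call shape seen in the excerpts, and the input field is a placeholder where the real code would pass z.object({ … }) from @kadi.build/core.

```typescript
type ToolHandler = (input: any) => Promise<unknown>;

interface ToolSpec {
  name: string;
  description: string;
  input: unknown; // the agent passes a z.object schema here
}

// Test double that records registrations and lets a test invoke them by name.
class MockClient {
  readonly tools = new Map<string, { spec: ToolSpec; handler: ToolHandler }>();

  registerTool(spec: ToolSpec, handler: ToolHandler): void {
    this.tools.set(spec.name, { spec, handler });
  }

  async call(name: string, input: unknown): Promise<unknown> {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`Unknown tool: ${name}`);
    return tool.handler(input);
  }
}

// Hypothetical new tool, following the same shape as registerConfigTool.
function registerRepoCountTool(
  client: { registerTool(spec: ToolSpec, handler: ToolHandler): void },
  config: { repos: Record<string, unknown> },
): void {
  client.registerTool(
    {
      name: 'agents-docs-repo-count',
      description: 'Report how many repos are configured (illustrative only).',
      input: {}, // placeholder for z.object({})
    },
    async () => ({ success: true, repoCount: Object.keys(config.repos).length }),
  );
}
```

In the agent, the remaining step would be a registerRepoCountTool(client, config) line inside registerAllTools, next to the existing register calls.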

If you need help locating a specific tool implementation, search src/tools for registerTool and match the tool name in its client.registerTool call.