
Architecture

Two-tier architecture separating AI-assisted research from interactive presentation.

1. Data Collection (Knowledge Base)

Research Methodology

AI-Powered Research Pipeline

The research process combines multiple AI platforms with desktop tool integration to ensure comprehensive, accurate tool evaluations:

1. Deep Research via Frontier Models

  • Leverage reasoning capabilities of major frontier models (Claude, Gemini, OpenAI)
  • Synthesize insights across multiple model perspectives to reduce bias
  • Identify emerging trends and patterns across the agentic tools landscape
  • Provide competitive context for evaluation criteria

2. Database Management via MCP (Model Context Protocol)

Desktop Tool Integration

Model Context Protocol (MCP) enables Claude Desktop and ChatGPT Desktop to directly access and modify the tool knowledge base through conversational interfaces. This integration removes the traditional barrier between AI research and data entry.

  • Direct database access through AI conversations in desktop applications
  • Natural language commands create/update tool records without manual forms
  • Claude Desktop + ChatGPT Desktop: Flexible research workflows across platforms
  • Example: "Add a new tool called Windsurf with AI Autonomy: 4, Contextual Understanding: 5..." updates the knowledge base directly
  • Conversational iteration allows rapid refinement of tool profiles
  • Zero-friction workflow: Research insights flow directly into structured data
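Commands like the Windsurf example above ultimately produce structured records in the knowledge base. A minimal sketch of what an MCP "add tool" handler might generate; the dimension names follow the five-dimension framework, but the function name, field names, and frontmatter layout are illustrative assumptions, not the project's actual code:

```typescript
// Illustrative only: the record shape an MCP "add tool" handler might emit.
// Dimension keys mirror the five evaluation dimensions; everything else is assumed.
type DimensionScores = {
  aiAutonomy: number;
  collaboration: number;
  contextualUnderstanding: number;
  governance: number;
  userInterface: number;
};

// Validate scores and render the record as YAML frontmatter + markdown,
// roughly mirroring the knowledge/tools/*.md source-of-truth format.
function buildToolRecord(name: string, scores: DimensionScores): string {
  for (const [dim, value] of Object.entries(scores)) {
    if (!Number.isInteger(value) || value < 1 || value > 5) {
      throw new Error(`Score for ${dim} must be an integer 1-5, got ${value}`);
    }
  }
  const yaml = Object.entries(scores)
    .map(([dim, value]) => `  ${dim}: ${value}`)
    .join("\n");
  return `---\nname: ${name}\nscores:\n${yaml}\n---\n\n# ${name}\n`;
}
```

Under this sketch, "Add a new tool called Windsurf with AI Autonomy: 4, Contextual Understanding: 5..." reduces to one `buildToolRecord("Windsurf", { ... })` call on the server side.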

MCP Advantage

Traditional workflow: Research → Copy notes → Open browser → Navigate to knowledge base → Find file → Edit YAML + markdown → Commit

MCP workflow: Research → "Add to database: [tool info]" → Done

3. Validation & Roadmap

Current State: AI-Assisted Human Review

  • Manual review process augmented by AI reasoning models
  • Cross-reference claims across multiple AI platforms for accuracy
  • Ensure consistency in how dimensions are scored across similar tools

North Star: Always-On Evaluation

We are moving toward a triggered, always-on evaluation dashboard. As the AI landscape advances, our goal is to automate the monitoring of tool updates, GitHub activity, and community sentiment to trigger re-evaluation events automatically.
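A triggered re-evaluation could be as simple as a staleness check over the signals mentioned above. A hypothetical sketch; the signal names, 90-day threshold, and issue-count heuristic are all assumptions, not the project's actual policy:

```typescript
// Hypothetical staleness check: decide whether a tool needs re-evaluation
// based on new upstream releases, GitHub activity, or elapsed time.
interface EvalSignals {
  lastEvaluated: Date;      // when the tool was last scored
  lastRelease: Date;        // most recent upstream release
  openIssuesDelta: number;  // change in open GitHub issues since last check
}

const MAX_AGE_DAYS = 90;        // assumed threshold
const ACTIVITY_SPIKE = 50;      // assumed issue-delta trigger

function needsReevaluation(s: EvalSignals, now: Date = new Date()): boolean {
  const ageDays = (now.getTime() - s.lastEvaluated.getTime()) / 86_400_000;
  const releasedSinceEval = s.lastRelease > s.lastEvaluated;
  const activitySpike = Math.abs(s.openIssuesDelta) > ACTIVITY_SPIKE;
  return ageDays > MAX_AGE_DAYS || releasedSinceEval || activitySpike;
}
```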

Research Workflow Example

1. Frontier Model Research: "Analyze agentic IDE assistants released in Q4 2024"
   → Synthesize insights from Claude, Gemini, and OpenAI
   → Identify key differentiators and market positioning

2. Claude Code (with Radar MCP): "Create entries for Windsurf, Cursor, and Copilot"
   → Conversationally populates knowledge base with structured data
   → MCP + evaluate skill handle all data operations automatically

3. AI-Assisted Validation: "Review the Windsurf entry against latest release notes"
   → Human expert verifies facts using AI research tools
   → Changes committed to git with full history

4. Scoring: Apply evaluation framework based on research findings
   → Dimension scores informed by capability analysis

Evaluation Framework

Tools are scored across five dimensions (AI Autonomy, Collaboration, Contextual Understanding, Governance, User Interface) using a formula-driven approach with confidence multipliers.
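"Formula-driven approach with confidence multipliers" suggests something like a confidence-weighted mean over the five dimension scores. A sketch under assumed semantics; the actual formula is defined in the full methodology, and this is only one plausible reading:

```typescript
// Assumed scoring model: each dimension has a raw 1-5 score and a
// confidence multiplier in (0, 1]; the overall score is the
// confidence-weighted mean. The real formula may differ.
interface DimensionScore {
  dimension: string;   // e.g. "AI Autonomy"
  raw: number;         // 1-5
  confidence: number;  // 0 < confidence <= 1
}

function overallScore(scores: DimensionScore[]): number {
  const totalWeight = scores.reduce((sum, s) => sum + s.confidence, 0);
  if (totalWeight === 0) throw new Error("no confidence weight");
  const weighted = scores.reduce((sum, s) => sum + s.raw * s.confidence, 0);
  return weighted / totalWeight;
}
```

With this model, a low-confidence dimension pulls less weight: a 4/5 at full confidence plus a 2/5 at half confidence averages to 10/3 rather than 3.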

See full methodology

2. Presentation Layer (Next.js Application)

Tech Stack

  • Framework: Next.js 15.5.12 + React 19.0.0 + TypeScript 5.9.3
  • Deployment: Vercel (serverless, global CDN)
  • Visualization: Nivo 0.99.0 (D3-based radar charts)
  • Styling: Tailwind CSS 3.4.19
  • Data Fetching: SWR 2.4.0 (client-side caching)
  • Validation: Zod 4.3.6 (runtime type safety)

Key Features

Data Pipeline

  • Build-time generation: Scripts read tool markdown files at build time
  • Production: Serves pre-generated static snapshots for fast, reliable responses
  • Development: Same snapshot-based data as production
  • Version-controlled: Full git history on every score change
  • Validation: Zod schema ensures type safety across all data sources
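The build-time step boils down to parsing frontmatter out of each tool file and collecting the results into one JSON snapshot. A minimal frontmatter split sketched in TypeScript; the real script (scripts/generate-static-data.js) presumably uses a proper YAML parser rather than this flat key/value handling:

```typescript
// Simplified frontmatter split: extracts the YAML block between the
// leading "---" fences and returns flat key/value pairs plus the body.
// Real YAML (nesting, lists) needs a proper parser; this is a sketch.
function parseFrontmatter(md: string): { meta: Record<string, string>; body: string } {
  const match = md.match(/^---\n([\s\S]*?)\n---\n?([\s\S]*)$/);
  if (!match) return { meta: {}, body: md };
  const meta: Record<string, string> = {};
  for (const line of match[1].split("\n")) {
    const idx = line.indexOf(":");
    if (idx > 0) meta[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return { meta, body: match[2] };
}
```

The build script would apply something like this over knowledge/tools/*.md and write the combined result to src/data/tools-snapshot.json.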

Interactive Radar (/radar)

  • Compare up to 5 tools (or all tools in category filter)
  • Tool logos via favicon API with smart fallbacks
  • Smart collision detection with auto-stacking
  • PNG export functionality via overlay button
  • Dynamic dimension filtering (minimum 3 required)
  • URL-based presets for shareable comparisons
  • Historical snapshots for time-based analysis
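URL-based presets amount to round-tripping a tool selection through query parameters. A sketch of that round trip; the `tools` parameter name and comma separator are assumptions about the implementation:

```typescript
// Assumed preset format: /radar?tools=cursor,windsurf,copilot
// Encodes a selection into a shareable URL and decodes it back,
// enforcing the five-tool comparison cap from the UI.
const MAX_TOOLS = 5;

function encodePreset(toolIds: string[]): string {
  const params = new URLSearchParams();
  params.set("tools", toolIds.slice(0, MAX_TOOLS).join(","));
  return `/radar?${params.toString()}`;
}

function decodePreset(url: string): string[] {
  const query = url.split("?")[1] ?? "";
  const raw = new URLSearchParams(query).get("tools") ?? "";
  return raw.split(",").filter(Boolean).slice(0, MAX_TOOLS);
}
```

Keeping the whole selection in the URL makes comparisons shareable and bookmarkable with no server-side state.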

All Tools View (/tools)

  • Comprehensive listing grouped by category
  • Smart score display (shows both scores when they differ)
  • Color-coded evaluation status badges
  • Consistent card layout with tool details

User Interface

  • Unified drawer (tools, filters, dimensions)
  • Category-grouped selection with bulk actions
  • Real-time filtering (category, status, recency)
  • Dynamic dimension visibility controls
  • Status badges with color mapping (8 evaluation states)
  • "About Scores" documentation explaining scoring methodology

AI Chat Assistant (/chat)

  • Natural language queries about tools and recommendations
  • Context-aware responses using radar data
  • Conversational interface for tool discovery
  • Markdown-formatted responses with tool links
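"Context-aware responses" typically means injecting the radar data into the model's prompt. A hedged sketch of such a context builder; the record fields, sorting, and output format are assumptions, not the assistant's actual implementation:

```typescript
// Hypothetical context builder: summarize tool records into a compact
// markdown block to prepend to the chat model's system prompt.
interface ToolSummary {
  name: string;
  category: string;
  overall: number; // assumed 0-5 overall score
  url: string;
}

function buildChatContext(tools: ToolSummary[]): string {
  const lines = [...tools]
    .sort((a, b) => b.overall - a.overall)
    .map((t) => `- [${t.name}](${t.url}) (${t.category}): ${t.overall.toFixed(1)}/5`);
  return ["Known tools, best first:", ...lines].join("\n");
}
```

Emitting markdown links here is what lets the chat responses include tool links directly, as noted above.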

Tool Submission (/submit)

  • Community-driven tool suggestions
  • Form captures tool name, URL, and quick take
  • Submissions flow to GitHub Issues backlog for review
  • Transparent backlog visibility for submitted tools

Insights Dashboard (/insights)

  • Strategic analysis of tool landscape
  • Category breakdowns and trend analysis
  • Task-based tool recommendations
  • Adoption guidance for teams

Tool Details (/tools/detail/[id])

  • Individual tool deep-dives with full scoring breakdown
  • Dimension-by-dimension analysis visualization
  • Quick links to product, docs, and company sites
  • Related tools suggestions

Data Flow

AI Research Layer
├── Frontier Models (Claude/Gemini/OpenAI) → Deep research & reasoning
├── Claude Code (/evaluate skill) → Automated tool evaluations
└── Human Expert Loop → Strategic validation & scoring
        ↓
knowledge/tools/*.md (Source of Truth)
        ↓
        └─→ Build-time snapshot generation
                   (scripts/generate-static-data.js)
                   ↓
            ├── Static JSON (src/data/tools-snapshot.json)
            │
            └── Historical Snapshots (src/data/snapshots/*.json)
                   ↓
            Next.js API (/api/tools, /api/history)
        ↓
SWR Cache (Client)
        ↓
React UI (Radar + Tools + Chat + Insights)
        ↓
        └─→ Slack Notifications (release updates, new tools)

Community Input
├── Tool Submissions (/submit) → GitHub Issues Backlog → Review Queue
└── AI Chat (/chat) → Context-aware tool recommendations

Key Benefits

Research

  • Velocity: AI-assisted research + MCP desktop integration dramatically accelerates data collection
  • Accuracy: Multi-platform validation ensures quality
  • Scalability: New tools added quickly through AI workflow

Technology

  • Type Safety: TypeScript + Zod prevent runtime errors
  • Performance: Build-time snapshots eliminate API latency in production, SWR minimizes client requests
  • Reliability: Static snapshots ensure consistent data even without external service dependencies
  • Maintainability: No database/servers to manage
  • Cost: Pay-per-use serverless model, reduced API calls in production

Operations

  • Zero-downtime deploys: Vercel auto-deploy on git push with fresh data from knowledge base on every build
  • Team collaboration: All edits tracked in git history
  • Audit ready: Complete edit history maintained in git
  • Fast iteration: Changes flow to production on next deploy (automatic snapshot refresh)
  • Developer experience: snapshot-based data in development mirrors production behavior, so local runs match deploys

Tech Stack Summary

| Layer         | Technology              | Version         | Purpose                                 |
| ------------- | ----------------------- | --------------- | --------------------------------------- |
| Research      | Tavily, Frontier Models | -               | AI-assisted data collection             |
| Data          | Markdown + YAML Files   | 4.0.3 + 4.1.1   | Source of truth with version control    |
| Build         | Node.js scripts         | -               | Static snapshot generation (prebuild)   |
| Storage       | Static JSON             | -               | Production data cache (tools-snapshot.json) |
| API           | Next.js Server Routes   | 15.5.12         | Transform and validate                  |
| Frontend      | React + TypeScript      | 19.0.0 + 5.9.3  | Interactive UI                          |
| Visualization | Nivo                    | 0.99.0          | D3-based radar charts                   |
| Styling       | Tailwind CSS            | 3.4.19          | Utility-first CSS                       |
| Deploy        | Vercel                  | -               | Serverless hosting + CDN                |
| Validation    | Zod                     | 4.3.6           | Runtime type safety                     |
| Cache         | SWR                     | 2.4.0           | Client-side data management             |
| Export        | html-to-image           | 1.11.13         | PNG chart export                        |
| Testing       | Playwright              | 1.58.2          | E2E browser testing                     |

Architecture Philosophy

Combine AI-assisted research with desktop tool integration (MCP) for rapid, accurate data collection, paired with modern web infrastructure for robust, scalable delivery.