DarkOps

AI governance platform with custom RAG and domain-aware agent specialization

What is DarkOps?

Custom RAG system for enterprise AI operations

DarkOps is an AI governance platform with a custom retrieval-augmented generation (RAG) system. Built before RAG was mainstream, it solves the "context bleeding" problem: an AI gets confused when PostgreSQL context leaks into Redis tasks, or when security patterns leak into performance optimization work.

The platform uses a 10-layer context assembly architecture where each layer serves a specific purpose: base rules, domain knowledge, task frameworks, agent personality, invasiveness permissions, caution levels, specialized expertise, function patterns, field notes, and conversation history.

Built Using DIOS2

DarkOps was developed using the DIOS2 multi-agent system. Major implementations were orchestrated by specialized agents (Executives → Technical Lead → Intel Agent → Field Agents) with comprehensive tracking and verification at each stage.

Mission Analyst Meta-Cognitive Loop

AI that analyzes its own performance and suggests improvements

After each conversation, a specialized mission analyst agent reviews the interaction, maps user feedback to specific context layer gaps, and generates exact recommendations for improvement. This is self-improving AI without RLHF.

  • Layer-Specific Diagnosis: "Layer 3: Create Script lacks error handling at line 45" (not generic "improve prompts")
  • Tool Execution Analysis: Detects failed commands, maps to context instruction errors
  • Fast Feedback Loop: 30 seconds after conversation (vs weeks for A/B testing)
  • Cost-Effective: $0.02 per analysis (vs $10,000+ for RLHF fine-tuning)

Self-Improving Intelligence

The meta-cognitive loop enables the system to identify its own blind spots and suggest fixes. Mission analyst recommendations become context updates, improving all future conversations. This is continuous improvement at scale - 1,000 conversations/day × $0.02 = $20/day vs hiring prompt engineers at $100/hr.
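
The cost comparison above is plain arithmetic, and a short sketch makes it explicit. The per-analysis cost, conversation volume, and hourly rate come from the text; the function names are hypothetical and only illustrate the calculation.

```python
from decimal import Decimal

COST_PER_ANALYSIS = Decimal("0.02")      # from the text
CONVERSATIONS_PER_DAY = 1000             # from the text
ENGINEER_RATE_PER_HOUR = Decimal("100")  # from the text


def daily_analysis_cost(conversations: int = CONVERSATIONS_PER_DAY) -> Decimal:
    """Total mission analyst cost for one day of conversations."""
    return COST_PER_ANALYSIS * conversations


def engineer_minutes_equivalent(cost: Decimal) -> Decimal:
    """Minutes of prompt-engineer time the same amount would buy."""
    return cost / ENGINEER_RATE_PER_HOUR * 60


if __name__ == "__main__":
    cost = daily_analysis_cost()
    print(f"analysis cost per day: ${cost}")
    print(f"equivalent engineer time: {engineer_minutes_equivalent(cost)} minutes")
```

Under these figures, a full day of automated analysis costs about as much as twelve minutes of a prompt engineer's time.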

Agent Tool Execution

AI agents with calibrated execution permissions

DarkOps agents don't just return text responses - they execute operations on systems with calibrated permissions. Like Claude Code, but with enterprise safety controls designed for production environments.

The invasiveness framework creates a permission system for AI tool use. Each agent's level determines which operations it can execute:

  • SENTINEL: Read-only observation (query metrics, analyze logs, inspect configurations)
  • ANALYST: Suggestions without modification (recommend optimizations, identify issues)
  • TECHNICIAN: Minor system changes (restart services, clear caches, adjust settings)
  • OPERATIVE: Moderate changes (create database indexes, modify configurations, deploy updates)
  • INFILTRATOR: Significant changes (schema migrations, architecture changes, critical deployments)

Combined with caution levels (Quartermaster → Diplomat → Ranger → Maverick → Ronin), this creates a two-dimensional framework for precise risk management. A TECHNICIAN with QUARTERMASTER caution makes careful, validated minor changes. An OPERATIVE with RONIN caution can execute moderate changes and moves fast when doing so.
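
The two-dimensional framework above can be sketched as a pair of ordered enums. This is a minimal illustration, not the DarkOps implementation: the class names, the numeric ordering, and especially the `needs_confirmation` rule are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import IntEnum


class Invasiveness(IntEnum):
    SENTINEL = 1     # read-only observation
    ANALYST = 2      # suggestions without modification
    TECHNICIAN = 3   # minor system changes
    OPERATIVE = 4    # moderate changes
    INFILTRATOR = 5  # significant changes


class Caution(IntEnum):
    # Ordered from most cautious to least cautious, as listed in the text.
    QUARTERMASTER = 1
    DIPLOMAT = 2
    RANGER = 3
    MAVERICK = 4
    RONIN = 5


@dataclass(frozen=True)
class AgentProfile:
    invasiveness: Invasiveness
    caution: Caution

    def may_execute(self, required: Invasiveness) -> bool:
        """An operation is allowed only at or below the agent's level."""
        return self.invasiveness >= required

    def needs_confirmation(self, required: Invasiveness) -> bool:
        """Hypothetical rule: the two most cautious levels confirm
        every operation that modifies a system."""
        modifies = required >= Invasiveness.TECHNICIAN
        return modifies and self.caution <= Caution.DIPLOMAT


if __name__ == "__main__":
    careful = AgentProfile(Invasiveness.TECHNICIAN, Caution.QUARTERMASTER)
    fast = AgentProfile(Invasiveness.OPERATIVE, Caution.RONIN)
    print(careful.may_execute(Invasiveness.OPERATIVE))    # blocked by level
    print(fast.needs_confirmation(Invasiveness.OPERATIVE))
```

Keeping the two axes separate means the permission check never depends on caution: caution only changes how an allowed operation is carried out.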

AI Safety Through Architecture

The specialization matrix solves AI safety for production environments. Instead of giving all agents full access or no access, DarkOps calibrates permissions based on task requirements. Database optimization needs an OPERATIVE. Log analysis only needs a SENTINEL. The architecture enforces these boundaries.

Enterprise AI Governance

Role-based AI capabilities for production environments

DarkOps implements enterprise AI governance through conditional tool provisioning - the first platform to enforce role-based AI capabilities. Junior staff get safe AI (Technician level - can read system intelligence but not modify it), senior architects get powerful AI (Operative level - can modify system intelligence).

AI Governance in Action

Scenario 1: Nulla (Technician) attempts context update → Gets read-only access → Shows governance restriction
Scenario 2: Ewen (Operative) updates context successfully → Changes propagate via 4-tier cache invalidation → Next conversation uses improved intelligence → Shows meta-cognitive capabilities with immediate effect

  • Conditional Tool Provisioning: Tools dynamically provided based on invasiveness level
  • Server-Side Authorization: Technicians can't escalate privileges even if they try
  • Audit Logging: Full change tracking for compliance (SOC2, HIPAA ready)
  • Immediate Effect: Context updates propagate via 4-tier cache invalidation
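
The two scenarios above can be sketched as a tool registry with a server-side check. This is an illustrative sketch under stated assumptions: the tool names `read_context` and `update_context`, the registry layout, and the audit record format are hypothetical. Only the level names and the Technician/Operative outcome come from the text.

```python
from typing import Dict, List

LEVELS = ["SENTINEL", "ANALYST", "TECHNICIAN", "OPERATIVE", "INFILTRATOR"]

# Hypothetical registry: tool name -> minimum invasiveness level.
TOOL_MIN_LEVEL: Dict[str, str] = {
    "read_context": "SENTINEL",
    "update_context": "OPERATIVE",
}


class AuthorizationError(PermissionError):
    """Raised when an agent calls a tool it was never provisioned."""


def provision_tools(level: str) -> List[str]:
    """Return only the tools the given level is allowed to see."""
    rank = LEVELS.index(level)
    return sorted(
        name
        for name, minimum in TOOL_MIN_LEVEL.items()
        if rank >= LEVELS.index(minimum)
    )


def execute_tool(level: str, tool: str, audit_log: List[dict]) -> str:
    """Re-check authorization on the server, then record the attempt.

    The check is repeated here so a client that names a tool it was
    not given still cannot run it.
    """
    allowed = tool in provision_tools(level)
    audit_log.append({"level": level, "tool": tool, "allowed": allowed})
    if not allowed:
        raise AuthorizationError(f"{level} may not call {tool}")
    return f"{tool} executed"


if __name__ == "__main__":
    log: List[dict] = []
    print(provision_tools("TECHNICIAN"))
    print(execute_tool("OPERATIVE", "update_context", log))
```

Denied attempts are logged before the error is raised, so the audit trail also covers failed escalation attempts.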

10-Layer Context Architecture

Progressive context assembly for specialized AI responses

  • Layers 1-2 (Foundation): Base system rules and 5 specialized categories: DataOps (data engineering), CodeOps (software ops), DocOps (documentation), LogOps (logging/monitoring), and IntelOps (meta-cognitive - agents that improve other agents' intelligence)
  • Layers 3-4 (Task & Agent): Action-specific frameworks and agent personality/communication style
  • Layers 5-6 (Behavior Control): Invasiveness level (what can be modified) and caution level (decision-making approach)
  • Layers 7-8 (Specialized Knowledge): Deep domain expertise and operational function patterns
  • Layer 9 (Field Notes): Accumulated wisdom retrieved via hybrid search (semantic + keyword), domain filtering with relevance thresholds, deduplication, and effectiveness-based re-ranking
  • Layer 10 (Conversation): Historical context and continuity from previous interactions
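
Progressive assembly of the ten layers described above can be sketched as an ordered merge. This is a minimal sketch, not the DarkOps Context Service: the layer keys, the section header format, and the choice to skip empty layers are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical keys, in the layer order described above (1 through 10).
LAYER_NAMES: List[str] = [
    "base_rules",
    "category",
    "task_framework",
    "agent_personality",
    "invasiveness",
    "caution",
    "domain_expertise",
    "function_patterns",
    "field_notes",
    "conversation_history",
]


def assemble_context(layers: Dict[str, str]) -> str:
    """Join the available layers in fixed order, skipping empty ones."""
    sections = []
    for number, name in enumerate(LAYER_NAMES, start=1):
        text = layers.get(name, "").strip()
        if text:
            sections.append(f"[Layer {number}: {name}]\n{text}")
    return "\n\n".join(sections)


if __name__ == "__main__":
    print(
        assemble_context(
            {
                "caution": "Validate every change before applying it.",
                "base_rules": "Never expose credentials.",
            }
        )
    )
```

Because order is fixed by `LAYER_NAMES` rather than by the caller, general rules always precede the more specific layers regardless of how the input is built.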

Key Technical Innovations

Production engineering patterns and AI governance architecture

  • Dual-File Validation Workflow: Solves critical LLM truncation problem through programmatic validation. Despite 35,229 characters of explicit instructions, LLMs truncate large content 30-40% of the time with '[Rest remains same]' placeholders. Export creates locked reference + editable working copy, LLM performs native file operations, then validation compares both to detect truncation (>10% loss), corruption patterns, or structure damage. Zero truncation failures in production (100+ updates tested). Engineering solution to fundamental LLM limitation.
  • Domain-Aware RAG with Retrieval Best Practices: Solves context bleeding through vector embeddings and domain-scoped retrieval. Domains chunked with overlap for context preservation. Field notes retrieved via hybrid search (semantic + keyword), filtered by domain scope with relevance thresholding (PostgreSQL tasks only get PostgreSQL context above quality cutoffs), deduplicated to remove redundant entries, then re-ranked by effectiveness scores. Multi-stage retrieval pipeline prevents AI confusion from irrelevant information.
  • Agent Specialization Matrix: Two-dimensional framework (Invasiveness × Caution) creates precise capability matching. 17 base templates dynamically compose to generate 120 unique agent configurations, achieving 60-70% token savings vs monolithic prompts. Agents share base layers and specialize only the deltas, so a surgical fix to one template can benefit 15 agents instantly. Agents range from read-only Sentinels to system-modifying Infiltrators, with decision-making from cautious Quartermasters to fast-acting Ronins.
  • Self-Improving Agents: Effectiveness scoring tracks which field notes produce successful outcomes. Adaptive learning thresholds adjust based on agent's learning velocity, enabling natural specialization emergence.
  • Production Engineering Patterns: Circuit breaker pattern prevents cascade failures, 4-tier caching architecture (Service Layer → Repository Layer → Database Static Cache → Redis Distributed Cache) achieving 90%+ cache hit rate with 36-48ms average context assembly, transaction-aware cache invalidation, comprehensive audit logging.
  • Multi-Layer Security: Template validation with regex blacklists, sandboxed execution, HMAC signatures, HTML sanitization, input size limits, pattern detection for injection attacks.
  • Repository Pattern Architecture: Composition over inheritance with specialized repositories coordinating multiple base repositories. Clean separation with dictionary returns (not ORM objects) preventing session detachment.
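
The validation step of the dual-file workflow described in the list above can be sketched as a comparison between the locked reference and the edited working copy. This is a simplified sketch: the 10% loss threshold and the '[Rest remains same]' placeholder come from the text, while the function name, the regular expression, and the message format are assumptions. Structure checks are omitted.

```python
import re
from typing import List

MAX_LOSS_RATIO = 0.10  # from the text: more than 10% loss counts as truncation

# Matches placeholders such as "[Rest remains same]" or "[rest remains the same]".
PLACEHOLDER = re.compile(r"\[\s*rest\s+remains\s+(the\s+)?same\s*\]", re.IGNORECASE)


def validate_working_copy(reference: str, working: str) -> List[str]:
    """Compare the edited copy with the locked reference.

    Returns a list of problems; an empty list means the copy passed.
    """
    problems: List[str] = []
    if reference:
        loss = (len(reference) - len(working)) / len(reference)
        if loss > MAX_LOSS_RATIO:
            problems.append(f"truncation: {loss:.0%} of content lost")
    if PLACEHOLDER.search(working):
        problems.append("corruption: placeholder text found")
    return problems


if __name__ == "__main__":
    original = "rule line\n" * 100
    edited = "rule line\n" * 40 + "[Rest remains same]"
    for problem in validate_working_copy(original, edited):
        print(problem)
```

The check is programmatic rather than prompt-based, so it catches truncation regardless of what instructions the model ignored.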

System Architecture

4-layer stack with production-grade patterns

Data Access Layer

Repository pattern with composition over inheritance. MintedBaseRepository provides CRUD operations, specialized repositories (AgentRepository, CategoryRepository, ConversationRepository) coordinate multiple base repositories. Session management with circuit breakers, LRU cache eviction, transaction-aware invalidation.
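
The circuit breaker mentioned above follows a well-known pattern. The sketch below is a generic version, not the DarkOps code: the class name, thresholds, and injectable clock are assumptions chosen to keep the example small and testable.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._failure_threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_timeout:
                raise CircuitOpenError("circuit open; call rejected")
            # Timeout elapsed: fall through and allow one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self._failure_threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


if __name__ == "__main__":
    breaker = CircuitBreaker(failure_threshold=2, reset_timeout=5.0)
    print(breaker.call(lambda: "session acquired"))
```

While the circuit is open, callers fail immediately instead of waiting on a struggling database, which is what keeps one failure from cascading.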

Services Layer

Context Service orchestrates 10-layer assembly with intelligent token management. Template Renderer processes Jinja2 templates, Template Validator enforces security (no script injection, no eval), Token Optimizer reduces token usage 35-60% while preserving meaning. Async operations throughout for parallel layer fetching.
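
Parallel layer fetching can be sketched with `asyncio.gather`. This is an illustration only: the `fetch_layers` name, the fetcher signature, and the choice to replace a failed layer with an empty string are assumptions, not a description of the actual Context Service.

```python
import asyncio
from typing import Awaitable, Callable, Dict


async def fetch_layers(
    fetchers: Dict[str, Callable[[], Awaitable[str]]]
) -> Dict[str, str]:
    """Run all layer fetchers concurrently and collect their text.

    A fetcher that raises yields an empty layer instead of failing the
    whole assembly (an assumed fail-safe default).
    """
    names = list(fetchers)
    results = await asyncio.gather(
        *(fetchers[name]() for name in names), return_exceptions=True
    )
    layers: Dict[str, str] = {}
    for name, result in zip(names, results):
        layers[name] = "" if isinstance(result, BaseException) else result
    return layers


async def _demo() -> None:
    async def base_rules() -> str:
        await asyncio.sleep(0.01)  # stands in for a database or cache read
        return "Never expose credentials."

    async def field_notes() -> str:
        raise RuntimeError("search backend unavailable")

    print(await fetch_layers({"base_rules": base_rules, "field_notes": field_notes}))


if __name__ == "__main__":
    asyncio.run(_demo())
```

With concurrent fetches, total assembly time tracks the slowest layer rather than the sum of all layers.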

API Layer

Flask Blueprint architecture for modular organization. RESTful patterns with consistent response formats, multi-layer security (authentication → authorization → validation), comprehensive audit logging, graceful degradation with fail-safe defaults.

Intelligence Layer

Multi-note system with domain scope awareness. Performance metrics tracking (daily success rates, competency scores), context effectiveness events (learning from each conversation), optimization recommendations (self-improvement suggestions), failure-driven learning with adaptive thresholds.
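
Domain-scoped selection of field notes can be sketched as filter, re-rank, then deduplicate. The sketch assumes a hybrid relevance score has already been computed for each note; the `FieldNote` fields, the 0.5 threshold, and the normalization used for deduplication are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class FieldNote:
    text: str
    domain: str
    relevance: float      # assumed precomputed hybrid (semantic + keyword) score
    effectiveness: float  # assumed share of past uses with a successful outcome


def select_field_notes(
    candidates: List[FieldNote],
    domain: str,
    min_relevance: float = 0.5,
    limit: int = 5,
) -> List[FieldNote]:
    """Keep in-domain notes above the threshold, best first, without repeats."""
    in_scope = [
        note
        for note in candidates
        if note.domain == domain and note.relevance >= min_relevance
    ]
    # Re-rank by effectiveness, using relevance to break ties.
    in_scope.sort(key=lambda n: (n.effectiveness, n.relevance), reverse=True)
    seen: Set[str] = set()
    selected: List[FieldNote] = []
    for note in in_scope:
        key = " ".join(note.text.lower().split())  # case/whitespace-insensitive
        if key not in seen:
            seen.add(key)
            selected.append(note)
    return selected[:limit]


if __name__ == "__main__":
    notes = [
        FieldNote("Run EXPLAIN ANALYZE before adding indexes", "postgresql", 0.9, 0.6),
        FieldNote("Set an eviction policy before load tests", "redis", 0.95, 0.9),
    ]
    for note in select_field_notes(notes, "postgresql"):
        print(note.text)
```

Sorting before deduplication means that when two notes say the same thing, the copy with the better track record is the one that survives.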

Potential as MCP Server

DarkOps's architecture maps naturally to Model Context Protocol (MCP) servers. The 10-layer context assembly, domain-aware filtering, and effectiveness scoring could serve as an MCP context management server for AI coding assistants.

Technical Stack

Production-ready Python/Flask architecture

Python 3.9+, Flask, SQLAlchemy 2.0, PostgreSQL, Redis, Jinja2, Async/Await

Patterns & Practices

Repository Pattern, Circuit Breaker, 4-Tier Caching, Blueprint Architecture, Transaction Management, LRU Eviction, HMAC Signatures

Documentation & Lectures

Comprehensive architectural documentation

Created detailed lecture series explaining the architecture from first principles. Each lecture uses metaphors (libraries, orchestras, theaters) to make complex systems understandable, then dives into technical implementation details.

  • Lecture I - Data Access Layer: Repository pattern, session management, cache architecture, transaction handling
  • Lecture II - Services Layer: Context assembly, template rendering, token optimization, security validation
  • Lecture III - API Layer: Blueprint architecture, request flow, error handling, proxy pattern
  • Lecture V - Intelligence Layer: Multi-note system, domain awareness, adaptive learning, self-improvement
  • Winston Series: Platform vision, network effects, competitive moats, future of AI governance

The lectures demonstrate technical communication ability - explaining sophisticated distributed systems architecture to varied audiences (developers, business leaders, architects).

Skills Demonstrated

What employers will see in DarkOps

  • Meta-Cognitive AI Architecture: Self-improving system with mission analyst loop ($0.02 per analysis vs $10k+ RLHF), layer-specific diagnostics, and fast feedback cycles
  • Enterprise AI Governance: Conditional tool provisioning for role-based AI capabilities, server-side authorization preventing privilege escalation, SOC2/HIPAA-ready audit logging
  • Custom RAG Implementation: Built retrieval-augmented generation before it was mainstream, implementing chunking with overlap, hybrid search (semantic + keyword), relevance thresholding, deduplication, domain-scoped filtering, and effectiveness-based re-ranking in a multi-stage retrieval pipeline
  • Production Engineering: 4-tier caching achieving 90%+ hit rate with 36-48ms context assembly, circuit breakers, graceful degradation, comprehensive error handling
  • Security Engineering: Multi-layer validation, injection prevention, sandboxed execution, audit logging
  • Distributed Systems: 4-layer architecture with clean separation, async operations, state management
  • Platform Thinking: Flask blueprint pattern for embeddable integration, repository pattern for data access
  • Technical Writing: 27,000+ lines of comprehensive documentation with metaphors and examples
  • Systems Architecture: Designed for multi-tenancy, cross-organizational intelligence, platform evolution