EyeNet—AI & Automation
AI & Automation Systems. Full infrastructure: LLM pipelines, ETL/ELT, containerized microservices, and production apps.
Summary
The Challenge
Hooks and events arrived with inconsistent schemas: missing fields, wrong types, variable structures. Without a robust normalization layer, the data could not be stored in a form usable for KPIs.
The Solution
A hybrid AI infrastructure combining proprietary APIs with a self-hosted GPU cluster. Sub-agent architecture for memory, context, and specialized delegation.
The Impact
65% reduction in manual work, 300+ docs processed weekly, 10K+ daily requests, and 92% extraction accuracy across two production systems.
“Using proprietary APIs for everything was not viable at scale. The decision to build our own GPU cluster changed the economics of the entire operation.”
The Infrastructure Bet
01 // The Real Challenge
What forced the architecture.
Three hard problems that shaped every technical decision.
Unnormalized Data
External hooks arrived with no fixed schema. Fields missing, types inconsistent, structures variable. We had to build a normalization layer before any AI could process the data.
DATA QUALITY
Controlled AI Costs
Using OpenAI/Gemini for everything scaled linearly in cost. We built a GPU cluster with open-source models and reserved proprietary APIs only where ROI justified it.
INFRASTRUCTURE
Agents with Memory
Assistants had to handle multiple intents, remember context, and delegate to specialized sub-agents without losing coherence in long conversations.
AI SYSTEMS
02 // The Systems
From sources to deliverables.
Two production systems. Different architecture, same principle: clean data → intelligent processing → actionable output.
03 // Key Decisions
Trade-offs that shaped the systems.
Own Cluster vs External APIs
For high-volume, repetitive use cases, proprietary API costs scale linearly with usage. We set up a GPU cluster running open-source models (LLaMA, Mistral, DeepSeek) for base workloads, reserving OpenAI/Gemini for cases where quality justified the cost.
Trade-off
Higher operational complexity in exchange for full control over latency, cost, and data privacy.
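A minimal sketch of how such a split could be expressed in code, under stated assumptions: the task kinds, the `needs_high_quality` flag, and the `Backend` labels are hypothetical and only illustrate the routing rule described above, not the production implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Backend(Enum):
    LOCAL_CLUSTER = "local_cluster"      # self-hosted open-source model
    PROPRIETARY_API = "proprietary_api"  # e.g. OpenAI or Gemini


@dataclass(frozen=True)
class Task:
    kind: str                 # hypothetical label, e.g. "extraction"
    needs_high_quality: bool  # hypothetical flag set by the caller


# Assumed set of high-volume, repetitive task kinds (illustrative only).
BASE_WORKLOADS = {"classification", "extraction", "summarization"}


def choose_backend(task: Task) -> Backend:
    """Keep base workloads on the cluster; escalate everything else."""
    if task.kind in BASE_WORKLOADS and not task.needs_high_quality:
        return Backend.LOCAL_CLUSTER
    return Backend.PROPRIETARY_API
```

Keeping the rule in one function makes the cost decision explicit and easy to audit when a task type moves between backends.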
Schema-First from Day One
Hooks arrived with variable structures. The temptation was to process and move on, but that would have made KPIs impossible. We designed the MongoDB schema around the aggregations we would need later, before writing the first pipeline.
Trade-off
More initial design time, zero technical debt in later analytics.
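As an illustration of the schema-first idea, the sketch below coerces a variable payload into one fixed document shape before storage. The field names (`event_type`, `source`, `amount`, `occurred_at`) and the accepted aliases are assumptions chosen for the example, not the actual schema.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def _to_float(value: Any) -> Optional[float]:
    """Coerce numbers and numeric strings; anything else becomes None."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def _to_datetime(value: Any) -> Optional[datetime]:
    """Accept epoch seconds or ISO 8601 strings; return a UTC-aware datetime."""
    if isinstance(value, bool):
        return None
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value, tz=timezone.utc)
    if isinstance(value, str):
        try:
            parsed = datetime.fromisoformat(value)
        except ValueError:
            return None
        return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)
    return None


def normalize_event(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Map a variable payload onto a fixed, typed shape (hypothetical fields)."""
    event_type = raw.get("event_type") or raw.get("type") or "unknown"
    return {
        "event_type": str(event_type).strip().lower(),
        "source": str(raw.get("source") or "unknown"),
        "amount": _to_float(raw.get("amount")),
        "occurred_at": _to_datetime(raw.get("occurred_at") or raw.get("timestamp")),
        "raw": raw,  # keep the original payload so it can be reprocessed later
    }
```

Because every stored document has the same keys and types, later aggregations can group and sum without per-record special cases.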
Sub-Agents vs Monolith
A single agent with all tools collapsed in long contexts and made routing errors. We separated into specialized sub-agents by domain (calendar, documents, data, general response) with a central orchestrator that delegates.
Trade-off
More nodes in the graph, explicit routing logic, but predictable and debuggable behavior.
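The orchestrator pattern can be sketched as follows. This is a simplified stand-in: the keyword lists and handler functions are hypothetical, and keyword matching takes the place of whatever intent classification the real orchestrator uses.

```python
from typing import Callable

Handler = Callable[[str], str]


# Hypothetical sub-agents, one per domain named in the text.
def calendar_agent(message: str) -> str:
    return f"[calendar] {message}"


def documents_agent(message: str) -> str:
    return f"[documents] {message}"


def data_agent(message: str) -> str:
    return f"[data] {message}"


def general_agent(message: str) -> str:
    return f"[general] {message}"


# Explicit routing table: first match wins (keywords are illustrative).
ROUTES = [
    (("meeting", "schedule", "calendar"), calendar_agent),
    (("invoice", "contract", "pdf", "document"), documents_agent),
    (("kpi", "report", "metric"), data_agent),
]


def route(message: str) -> Handler:
    """Pick the sub-agent for a message; fall back to the general agent."""
    text = message.lower()
    for keywords, handler in ROUTES:
        if any(keyword in text for keyword in keywords):
            return handler
    return general_agent


def orchestrate(message: str) -> str:
    """Central orchestrator: delegate the message to exactly one sub-agent."""
    return route(message)(message)
```

With routing in a plain table, a misrouted request can be traced to a single rule, which is the debuggability the trade-off refers to.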
Telegram as Alert System
Instead of building a monitoring dashboard from scratch, we integrated Telegram notifications directly into workflows. Every critical pipeline reports success/failure to the operator in real time.
Trade-off
Not Grafana, but instant, zero overhead, and the team already uses Telegram.
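For reference, a pipeline step can report to Telegram with a single call to the Bot API's `sendMessage` method. The sketch below uses only the Python standard library; the message format, the function names, and the pipeline name in the usage note are assumptions, and the actual workflows may send the notification through a different integration.

```python
import json
import urllib.request

TELEGRAM_API = "https://api.telegram.org"


def build_alert_request(token: str, chat_id: str, pipeline: str,
                        ok: bool, detail: str = "") -> urllib.request.Request:
    """Build the sendMessage request without sending it (easy to test)."""
    status = "SUCCESS" if ok else "FAILURE"
    text = f"[{status}] {pipeline}" + (f": {detail}" if detail else "")
    payload = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{TELEGRAM_API}/bot{token}/sendMessage",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_alert(token: str, chat_id: str, pipeline: str,
               ok: bool, detail: str = "") -> bool:
    """Send the alert; returns True when Telegram answers with HTTP 200."""
    request = build_alert_request(token, chat_id, pipeline, ok, detail)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status == 200
```

A failing step would call something like `send_alert(token, chat_id, "invoice-etl", ok=False, detail="timeout")`, where the bot token and chat ID come from the operator's own bot configuration.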
04 // Full Stack
Technologies and tools.
Ingestion
Processing
Storage
Delivery
Monitoring
05 // What Went Wrong
Documented limitations.
Problems we had to recognize and fix in production.
Dirty Data in Production
The first production webhooks arrived with fields in unexpected formats that the normalization schema didn't anticipate. It took several rapid iterations on the Code layer before the pipeline stabilized.
DATA QUALITY
Cluster vs Production Gap
Models that worked well on the local cluster failed in production due to quantization differences and available memory. We learned to separate model evaluation environments from production inference.
INFRASTRUCTURE
Context Window in Long Conversations
In long conversations, the main agent lost relevant context. Simple memory wasn't enough — we had to implement sliding window memory with context summarization.
AI SYSTEMS
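Sliding window memory with context summarization can be sketched roughly like this. The class name, the window size, and the `naive_summarize` placeholder are assumptions; in a real agent the summarizer would most likely be an LLM call rather than string concatenation.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def naive_summarize(previous_summary: str, evicted: str) -> str:
    """Placeholder summarizer (assumption): append and keep the last 500 chars."""
    combined = f"{previous_summary} {evicted}".strip()
    return combined[-500:]


class SlidingWindowMemory:
    """Keep the last N turns verbatim; fold older turns into a running summary."""

    def __init__(self, window_size: int = 4,
                 summarize: Callable[[str, str], str] = naive_summarize) -> None:
        self.window: Deque[Tuple[str, str]] = deque()
        self.window_size = window_size
        self.summary = ""
        self.summarize = summarize

    def add(self, role: str, content: str) -> None:
        """Store a turn and summarize whatever falls out of the window."""
        self.window.append((role, content))
        while len(self.window) > self.window_size:
            old_role, old_content = self.window.popleft()
            self.summary = self.summarize(self.summary, f"{old_role}: {old_content}")

    def context(self) -> List[str]:
        """Return the prompt context: summary first, then the recent turns."""
        lines: List[str] = []
        if self.summary:
            lines.append(f"Summary of earlier conversation: {self.summary}")
        lines.extend(f"{role}: {content}" for role, content in self.window)
        return lines
```

The prompt stays bounded by the window size plus one summary line, so older context is compressed instead of silently dropped.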