Filtered

#infrastructure

2 posts

Apr 15, 2026
Self-Hosted LLMs in Production: What It Actually Takes to Cut API Costs
A real case study on running 80B parameter models locally: hardware, costs, tradeoffs, and the numbers from a production self-hosted AI stack serving 4 concurrent users.
#llm #ollama #local-ai #gpu #self-hosted #consulting [#infrastructure]

Apr 15, 2026
Borrowing From Hermes Agent: A Self-Improvement Stack for a Multi-Agent Claude Code Fleet
How we ported six patterns from NousResearch's Hermes Agent (FTS5 recall, dialectic user model, auto-skill drafts, trajectory logging) onto our four-worker Claude Code system in one session.
#ai #claude-code #agents [#infrastructure] #consulting #self-improvement