Hermes Agent: The AI That Gets Smarter on Every Project

The Knowledge That Walks Out the Door

Every construction company has a person — maybe it's the senior estimator, maybe it's the 30-year superintendent — who just knows things. They know that concrete on waterfront sites always costs 15% more because of access logistics. They know which subcontractors start slow but finish strong. They know which architects take a week to answer questions and which take a month.

This knowledge took decades to accumulate. It lives entirely in that person's head. And when they retire, change companies, or get hit by a bus — it's gone. Poof. The next person starts from scratch.

This is one of the construction industry's biggest problems, and most AI tools don't solve it. ChatGPT, Claude, Codex — they're all stateless. That's a fancy way of saying they have no long-term memory. Every time you start a new conversation, the AI has completely forgotten everything from the last one. It's like hiring a consultant who gets amnesia every evening.

Hermes Agent is fundamentally different. It's an open-source AI agent built by Nous Research that has something no other major AI agent has: a built-in learning loop. Beyond answering your questions, it learns from every interaction, creates reusable procedures (called "skills"), improves those procedures over time, and remembers everything across sessions.

It literally gets smarter the more you use it.

What Is Hermes? (The Simple Version)

Imagine hiring a junior estimator who:

Has a perfect memory and never forgets a detail from any project
After completing a task, writes down the process they used so they can do it faster next time
Reviews and improves their own notes after every project
Shares their notes with colleagues so everyone benefits
Learns your personal communication preferences (you like bullet points, not paragraphs)

That's Hermes. It's not just an AI that answers questions — it's an AI that learns and improves.

What's an "AI Agent" vs. a "Chatbot"?

A chatbot (like ChatGPT) is like talking to a very smart friend on the phone. You ask questions, they answer, the call ends, they forget the conversation. A chatbot with memory is like texting — the history is preserved, but they don't learn from it.

An AI agent is like a remote employee. They don't just answer questions — they can research things, use tools (like looking up data in software), create files, send emails, and take action on your behalf. They work semi-independently.

Hermes is an AI agent that also learns on the job — like a new employee who gets better every week.

The Learning Loop: What Makes Hermes Special

Most AI agents work like this: you give them a task → they complete it → they forget everything. Hermes adds two critical extra steps: remember and improve.

Here's how Hermes compares to regular AI agents:

What You'd Want an Employee to Do	Regular AI (Claude, GPT)	Hermes
Complete a task you give them	✅ Yes	✅ Yes
Remember what they learned last week	❌ No — amnesia after each session	✅ Yes — persistent memory
Apply past lessons to new tasks	❌ No — starts from scratch	✅ Yes — recalls relevant knowledge
Write down their processes for future use	❌ No	✅ Yes — auto-creates "skills"
Get better at their job over time	❌ No — same quality forever	✅ Yes — skills get refined
Learn your personal preferences	❌ No — treats you the same as everyone	✅ Yes — models your style
Share knowledge with other team members	❌ No	✅ Yes — skills are portable

How Memory Works (Three Layers)

Hermes has three types of memory, each serving a different purpose. Think of it like how humans remember things:

Layer 1 — Working Memory: What you're talking about right now. Just like how you can hold a phone conversation in your head. When the conversation ends, the important parts get saved to long-term memory.

Layer 2 — Long-Term Memory: A permanent, searchable database of everything the agent has learned. Uses a technology called FTS5 (Full-Text Search) to quickly find relevant past knowledge. Also uses Honcho — a separate memory system — to build detailed models of the people it works with (more on this later).

Layer 3 — Procedural Memory (Skills): These are the "how to do things" instructions the agent writes for itself. When you learn to ride a bike, you don't have to think about pedaling anymore — it's automatic. Similarly, when Hermes learns a process (like estimating waterfront concrete), it creates a skill that it can invoke automatically in the future.
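To make the long-term memory layer more concrete, here is a minimal sketch of full-text recall using FTS5, which is SQLite's full-text search extension. It is not Hermes's actual storage code: the table name `memory`, the helper functions, and the sample notes (including the "Elm Ave." project) are assumptions made up for illustration, and it assumes your Python build ships SQLite with FTS5 enabled.

```python
import sqlite3


def build_memory(notes):
    """Create an in-memory SQLite database with an FTS5 index over notes."""
    conn = sqlite3.connect(":memory:")
    # In an FTS5 virtual table, every column is full-text indexed.
    conn.execute("CREATE VIRTUAL TABLE memory USING fts5(project, note)")
    conn.executemany("INSERT INTO memory (project, note) VALUES (?, ?)", notes)
    return conn


def recall(conn, query, limit=3):
    """Return notes matching all query terms, best match first."""
    rows = conn.execute(
        "SELECT project, note FROM memory WHERE memory MATCH ? "
        "ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


# Hypothetical notes; only the Main St. figures come from this article.
notes = [
    ("Main St.", "Waterfront site, tiny staging area, crane-assisted bucket pours"),
    ("Main St.", "Concrete actuals came in at $847K against a $735K estimate"),
    ("Elm Ave.", "Suburban site, pump trucks had easy access"),
]

conn = build_memory(notes)
hits = recall(conn, "waterfront crane")
for project, note in hits:
    print(f"{project}: {note}")
```

The point of the sketch is the shape of the lookup: past notes are indexed once, and a later task can pull back only the relevant ones by keyword instead of rereading everything.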

Quick Start: Installing Hermes

What You Need

A Mac, Linux, or Windows (with WSL2) computer
An API key from an AI provider (Anthropic, OpenAI, or others). Hermes itself is 100% free — you only pay for the AI model it uses, which typically costs pennies per task.

# Download and install Hermes (one command)
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

Then set up your AI provider:

hermes model set anthropic/claude-sonnet-4
export ANTHROPIC_API_KEY=sk-ant-your-key-here

Claude Sonnet 4 is a strong general-purpose model and a good fit for complex reasoning tasks like estimating and scheduling.

Start a conversation:

# Interactive mode — type back and forth like a chat
hermes
 
# Or give it a one-off task
hermes -q "Analyze the attached bid tabulation and tell me which trades came in above our historical averages"

The Big Example: How Hermes Learns to Estimate

Let's walk through a complete, realistic example of how Hermes learns over the course of multiple construction projects. This is the core value proposition — watch how the agent goes from "generic AI" to "your company's expert estimator."

Project 1: Main St. Commercial Build (The First Estimate)

You ask Hermes to estimate concrete for a foundation:

You: Estimate concrete for the Main St. foundation.
     Plans are in ~/projects/main-st/plans/

Hermes analyzes the plans, calculates quantities, pulls regional pricing data, and delivers an estimate: $735,000.

At this point, Hermes is working with generic industry data — it's like a first-day employee using textbook numbers. The estimate is reasonable but not calibrated to your company's reality.

Three months later, you finalize the actual costs and tell Hermes:

You: Actual concrete costs came in at $847K vs your estimate of $735K.
     The difference was mostly access logistics — it's a waterfront site,
     the staging area was tiny, and we needed crane-assisted bucket pours
     instead of pump trucks.

Here's where Hermes diverges from every other AI. It doesn't just say "oh, noted." It reasons about why it was wrong and creates a permanent skill:

"Waterfront sites with constrained access add approximately 15% to concrete costs due to logistics. This specifically affects foundation work requiring crane-assisted pours. I should create a skill so I remember this for future waterfront projects."

It automatically creates this skill:

# Waterfront Concrete Estimating
 
## When to Apply
- Project is waterfront or has constrained site access
- Foundation or below-grade concrete work
- Staging area < 5,000 SF or requires crane-assisted pours
 
## Adjustments
- Base concrete estimate: use standard RSMeans + regional factor
- Access logistics premium: +12-18% (use 15% as baseline)
- Crane-assisted pour premium: +8-12% on placement costs
- Weather contingency for waterfront: +3-5%
 
## Calibration Data
- Main St. Commercial Build: estimated $735K, actual $847K (+15.2%)
  - Primary driver: access logistics, crane time
 
## Confidence
LOW — based on single project. Refine after 2+ waterfront projects.

Notice the Confidence Rating

This is crucial. Hermes doesn't treat one data point as gospel. After one project, confidence is LOW. It will use this skill as a guideline, not a rule. After a second waterfront project, confidence upgrades to MEDIUM and the ranges get tighter. After ten projects, it's HIGH with narrow, reliable ranges. This is how experienced human estimators build intuition; Hermes just does it systematically.
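One way to picture "guideline, not a rule" is the sketch below. It maps the number of calibration projects to a confidence label and reports the access premium as a range rather than a single number. The function names and the thresholds are assumptions for illustration (MEDIUM at two projects follows the skill file's "Refine after 2+ waterfront projects" note, HIGH at ten follows the text), and the $1,000,000 base is a hypothetical figure, not project data.

```python
def confidence(num_projects, medium_at=2, high_at=10):
    """Map the number of calibration projects to a confidence label."""
    if num_projects >= high_at:
        return "HIGH"
    if num_projects >= medium_at:
        return "MEDIUM"
    return "LOW"


def premium_range(base, low=0.12, baseline=0.15, high=0.18):
    """Return (low, baseline, high) dollar premiums for a base estimate.

    The default percentages are the access logistics range from the skill file.
    """
    return (base * low, base * baseline, base * high)


label = confidence(1)                       # one project so far
low, mid, high = premium_range(1_000_000)   # hypothetical base estimate

print(f"Confidence: {label}")
print(f"Access premium: ${mid:,.0f} (range ${low:,.0f} to ${high:,.0f})")
```

With LOW confidence, a reviewer sees the whole range and can judge how much to trust the baseline. As the ranges in the skill tighten, the same call produces a narrower spread.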

Project 2: Harbor View Mixed-Use (Six Months Later)

You start a new waterfront project. You ask Hermes to estimate the foundation concrete.

You don't need to remind Hermes about the waterfront lessons. It automatically checks its skills library, finds the waterfront-concrete-estimating skill, and applies it:

Hermes: I've applied my waterfront access adjustments to this estimate:

        Base estimate:              $1,120,000
        Access logistics (+15%):      $168,000
        Crane placement (+10%):        $89,000
        Weather contingency (+4%):     $55,000
        ─────────────────────────────────────
        Adjusted estimate:          $1,432,000

        Note: My waterfront adjustment is based on one prior project
        (Main St., where actuals ran 15.2% over base). Confidence: LOW.
        I'll refine these numbers once we get actuals from this project.
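The arithmetic behind that breakdown can be reproduced with a short sketch. Two inputs are inferred from the numbers shown rather than stated anywhere in the article: a placement cost of about $890,000 (since the skill applies the crane premium to placement costs, and 10% of that is $89,000), and a weather contingency applied to the running subtotal rather than to the base. Treat both as assumptions.

```python
def adjusted_estimate(base, placement_cost, access=0.15, crane=0.10, weather=0.04):
    """Apply the waterfront adjustments and return each line item."""
    access_add = base * access             # premium on the whole base estimate
    crane_add = placement_cost * crane     # premium on placement costs only
    subtotal = base + access_add + crane_add
    weather_add = subtotal * weather       # assumed: contingency on the subtotal
    return {
        "base": base,
        "access": access_add,
        "crane": crane_add,
        "weather": weather_add,
        "total": subtotal + weather_add,
    }


# placement_cost is inferred from the $89,000 crane line, not given in the text.
result = adjusted_estimate(base=1_120_000, placement_cost=890_000)

for item, amount in result.items():
    print(f"{item:>8}: ${amount:>12,.0f}")
```

Under those assumptions the total comes to roughly $1,432,080, which matches the $1,432,000 figure in Hermes's message once the line items are rounded to the nearest thousand.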

Actuals come in at $1,389K (3% below estimate). Hermes updates its skill:

# Updated calibration data
projects:
  - name: "Main St. Commercial Build"
    estimated: 735000
    actual: 847000
    variance: "+15.2%"
    drivers: ["access logistics", "crane time"]
 
  - name: "Harbor View Mixed-Use"
    estimated: 1432000
    actual: 1389000
    variance: "-3.0%"
    drivers: ["slightly overestimated crane time"]
 
# Tightened ranges based on two data points
confidence: MEDIUM
adjusted_range:
  access_logistics: "12-16%"    # was 12-18%
  crane_placement: "8-10%"      # was 8-12%
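The variance figures in that calibration data are simple to check. The sketch below recomputes them from the estimated and actual values. The `CalibrationRecord` class is a hypothetical structure for illustration, not Hermes's internal format, and the mean absolute variance is an extra summary this sketch adds on top of what the skill file records.

```python
from dataclasses import dataclass


@dataclass
class CalibrationRecord:
    name: str
    estimated: int
    actual: int

    @property
    def variance_pct(self) -> float:
        """Signed variance of actual vs. estimate, as a percent of the estimate."""
        return (self.actual - self.estimated) / self.estimated * 100


records = [
    CalibrationRecord("Main St. Commercial Build", 735_000, 847_000),
    CalibrationRecord("Harbor View Mixed-Use", 1_432_000, 1_389_000),
]

for r in records:
    print(f"{r.name}: {r.variance_pct:+.1f}%")

# Average size of the miss, ignoring direction.
mean_abs_variance = sum(abs(r.variance_pct) for r in records) / len(records)
print(f"Mean absolute variance: {mean_abs_variance:.1f}%")
```

The first estimate missed by about 15% and the second by about 3%, which is the improvement the tightened ranges are meant to capture.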

Projects 5, 10, 20...

By project 20, Hermes has calibration data from waterfront, urban, suburban, and rural sites. Its estimating skills are tuned to your specific company's cost patterns rather than generic RSMeans data. It knows YOUR subcontractors' pricing tendencies, YOUR region's material costs, and YOUR team's productivity rates.

This is institutional knowledge that doesn't walk out the door when someone quits. It lives in Hermes's skills — searchable, portable, and continuously improving.

Skills: How They're Created, Used, and Shared

The Skill Lifecycle

Created: Hermes notices it did something that could be useful in the future. It writes down the process as a skill.

Active: The skill gets used on a real task. If it works, great — it stays active.

Refined: Every time the skill is used, Hermes compares the result to what actually happened and adjusts the skill's parameters. This is like a human estimator saying "okay, that number was a little high, I'll adjust next time."

Shared: Skills can be exported and shared with other agents, other projects, or other team members:

# Export a skill (like sharing your estimating cheat sheet with a colleague)
hermes skills export waterfront-concrete-estimating
 
# Import a skill from someone else
hermes skills import ~/shared-skills/waterfront-concrete-estimating/
 
# Browse community skills (like an app store for AI knowledge)
hermes skills search "construction estimating"
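The four lifecycle stages described above can be sketched as a small state machine. This is only a mental model: the `Stage` enum and the set of allowed transitions are assumptions chosen for illustration, and nothing here reflects how Hermes tracks skill state internally.

```python
from enum import Enum


class Stage(Enum):
    CREATED = "created"
    ACTIVE = "active"
    REFINED = "refined"
    SHARED = "shared"


# Assumed transitions: a skill must be used before it is refined or shared,
# and a refined or shared skill can go back into active use.
ALLOWED = {
    Stage.CREATED: {Stage.ACTIVE},
    Stage.ACTIVE: {Stage.REFINED, Stage.SHARED},
    Stage.REFINED: {Stage.ACTIVE, Stage.SHARED},
    Stage.SHARED: {Stage.ACTIVE},
}


def advance(current, target):
    """Move a skill to the target stage, or raise if the step is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target


stage = Stage.CREATED
for step in (Stage.ACTIVE, Stage.REFINED, Stage.SHARED):
    stage = advance(stage, step)
    print(stage.value)
```

The useful idea is that a skill earns its way forward: it is only refined or shared after it has been applied to real work.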

How Hermes Learns About People (Honcho Integration)

Hermes doesn't just learn about projects and costs — it learns about people. It uses Honcho (a memory system we've covered in a separate article) to build detailed models of every person it interacts with.

Over months of interaction, Hermes builds profiles like this:

Peer: John (Site Superintendent)
├── Communication style: Brief, action-oriented. Hates long explanations.
├── Schedule: Sends updates early morning (5:30-6:30 AM). Don't message after 6 PM.
├── Primary concerns: Safety first, then schedule, then cost.
├── Decision pattern: Decides fast, revisits if new info emerges.
├── Reliability: His daily log entries are consistently accurate.
└── Pet peeve: Being asked questions he already answered.

Peer: Sarah (Architect - Henderson Design)
├── Communication style: Detailed, formal. Prefers email over chat.
├── RFI response time: Average 8 business days (industry avg is 5).
├── Pro tip: Responds faster when RFI includes a sketch or markup.
├── Pricing accuracy: Her change order pricing is usually accurate.
└── Sensitivity: Resistant to design changes after the Construction Documents phase.

Why this matters in practice:

When Hermes is handling an RFI follow-up, it already knows:

For John: Give him a 2-sentence status update, not a paragraph
For Sarah: Include a markup in the follow-up email (she responds 40% faster when you do)
Timing: If it's been 7 days without a response from Sarah, it's time to follow up. But don't follow up at day 3 — that's premature for her pattern and will just annoy her.
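The timing rule in that list can be written down as a tiny sketch. The `PeerProfile` class and its fields are hypothetical and are not Honcho's data model. The 7-day threshold and the markup preference come from Sarah's example above.

```python
from dataclasses import dataclass


@dataclass
class PeerProfile:
    name: str
    follow_up_after_days: int     # wait this long before nudging
    include_markup: bool = False  # attach a sketch or markup to follow-ups


def should_follow_up(profile, days_waiting):
    """True once the wait has reached this person's follow-up threshold."""
    return days_waiting >= profile.follow_up_after_days


sarah = PeerProfile("Sarah", follow_up_after_days=7, include_markup=True)

for day in (3, 7):
    action = "follow up" if should_follow_up(sarah, day) else "wait"
    print(f"Day {day}: {action}")
```

A per-person threshold is the whole trick: the same open RFI triggers a nudge for one person and patience for another.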

This is the kind of interpersonal intelligence that takes humans years to develop. Hermes builds it systematically.

Connecting Hermes to Paperclip (Multi-Agent Orchestration)

If you're running multiple agents (not just Hermes), you can plug Hermes into Paperclip — the organizational layer that coordinates your whole AI team:

# Install the adapter (the bridge between Hermes and Paperclip)
npm install hermes-paperclip-adapter

Configure it when creating an agent in Paperclip:

{
  "name": "EstimatingAgent",
  "adapter": "hermes_local",
  "adapterConfig": {
    "model": "anthropic/claude-sonnet-4",
    "provider": "anthropic",
    "timeoutSec": 300,
    "persistSession": true,
    "toolsets": ["terminal", "file-ops", "web"],
    "checkpoints": true
  }
}

The key setting is persistSession: true. This means every time Paperclip wakes up the Hermes agent (on a scheduled "heartbeat"), the agent picks up exactly where it left off — with all its memories, skills, and context intact. It doesn't start fresh each time.
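As a rough illustration of what "picks up where it left off" means, the sketch below saves state to a file at the end of each wake-up and reloads it at the start of the next. It is not how Paperclip or the adapter store sessions. The `heartbeat` function, the JSON file, and its fields are all assumptions made for this example.

```python
import json
import tempfile
from pathlib import Path


def heartbeat(state_file: Path, new_note: str) -> dict:
    """One wake-up: load saved state, do a unit of work, save it again."""
    if state_file.exists():
        state = json.loads(state_file.read_text())
    else:
        state = {"wakeups": 0, "notes": []}  # first run starts empty
    state["wakeups"] += 1
    state["notes"].append(new_note)
    state_file.write_text(json.dumps(state))
    return state


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "session.json"
    heartbeat(path, "Started Harbor View estimate")
    second = heartbeat(path, "Applied waterfront adjustments")

print(second["wakeups"], second["notes"])
```

On the second wake-up the agent already has the note from the first one. Without persistence, every heartbeat would start from the empty state.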

Where to Run Hermes

Hermes runs anywhere — from your personal laptop to a cloud server to "serverless" infrastructure (where you only pay when it's actively doing something):

Where	Best For	Monthly Cost
Your laptop	Testing, solo use	$0 (just pay for AI model usage)
Docker container	Team deployment on an office server	$0 (just pay for AI model usage)
Daytona (cloud)	Always-on agent that hibernates when idle	~$5-20/mo + AI model usage
Modal (serverless)	Burst workloads (like bid season)	Pay per second of usage

# Run locally — simplest option
hermes
 
# Run in Docker — good for team deployments
docker run -e ANTHROPIC_API_KEY=... nousresearch/hermes-agent
 
# Run on Daytona — nearly free when idle, always available
hermes config set runtime daytona
hermes

What's "Serverless" Mean?

Traditional servers run 24/7, even when nobody's using them — like leaving your car running in the driveway all day. Serverless infrastructure only starts when there's work to do and shuts down when it's idle — like a taxi that only charges you when you're riding. For AI agents that might only work for a few minutes per day, this can reduce costs by 90%.
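Whether the savings reach 90% depends on how idle the agent is and on how the two pricing models compare. The sketch below shows one combination that lands there. Every number in it is hypothetical: the $0.10 and $0.30 hourly rates and the 24 active hours per month are placeholders, not quotes from any provider.

```python
HOURS_PER_MONTH = 24 * 30


def monthly_cost_always_on(hourly_rate):
    """A traditional server bills for every hour of the month."""
    return hourly_rate * HOURS_PER_MONTH


def monthly_cost_serverless(hourly_rate, active_hours):
    """Serverless bills only for the hours the agent is actually working."""
    return hourly_rate * active_hours


def savings_fraction(always_on, serverless):
    """Share of the always-on bill that serverless avoids."""
    return 1 - serverless / always_on


# Hypothetical: serverless costs 3x more per hour but runs under an hour a day.
always_on = monthly_cost_always_on(0.10)
serverless = monthly_cost_serverless(0.30, active_hours=24)
saved = savings_fraction(always_on, serverless)

print(f"Always-on: ${always_on:.2f}, serverless: ${serverless:.2f}, saved: {saved:.0%}")
```

If the agent were busy most of the day, the higher hourly rate would erase the savings, so it is worth plugging in your own usage and your provider's actual prices.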

What Hermes Knows After 6-12 Months

After running on your projects for 6-12 months, Hermes builds a knowledge base of calibrated estimating skills, vendor knowledge, and regional pricing data.

This is the institutional knowledge that construction companies have never been able to systematically capture. It doesn't live in someone's head. It lives in Hermes's skills and memory — searchable, shareable, and continuously improving.

The comparison is stark:

Without Hermes: Your senior estimator retires. The new estimator spends 2-3 years rebuilding that intuition through trial and error (and expensive mistakes).
With Hermes: Your senior estimator retires. The new estimator works alongside Hermes, which contains 3 years of calibrated estimating skills, vendor knowledge, and regional pricing data. Day one, they're producing estimates at a senior level.

Conclusion

Most AI tools are like calculators — they're useful when you pick them up, but they don't learn anything. Hermes is like a new hire who keeps a meticulous notebook, reviews it every evening, and comes back every morning a little smarter than the day before.

For construction — an industry where institutional knowledge is the single most valuable (and most fragile) asset a company has — that's not an incremental improvement. It's a fundamental shift.

Static tools assist. Hermes grows.
