Troubleshooting
This page helps you diagnose and resolve common AgentHub issues.
Quick Diagnostic Commands
# Check if AgentHub is running
curl -i http://localhost:8080/
# Check authentication
curl -H "Authorization: Bearer $TOKEN" http://localhost:8080/api/me
# Verify internal gRPC (if enabled)
agenthub actor inbox --actor-id test --limit 1
# Check disk space for event databases
df -h ~/.agenthub
# View recent logs
journalctl -u agenthub -n 100 --no-pager
Login Issues
Cannot Access Login Page
Symptoms: Browser cannot connect to http://localhost:8080
Diagnostic Steps:
- Check if AgentHub is running:
pgrep -a agenthub
- Verify the server is listening:
netstat -tlnp | grep 8080
# or
ss -tlnp | grep 8080
- Check config for correct listen address:
[server]
listen = "127.0.0.1:8080" # or "0.0.0.0:8080" for remote access
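The listening check can also be scripted so that it does not depend on `grep` matching the port number inside some other field. The sketch below is only an illustration: `port_in_listing` is a hypothetical helper name, and it assumes the usual `ss -tln` or `netstat -tln` layout in which addresses appear as `host:port`.

```shell
# Reads `ss -tln` or `netstat -tln` output on stdin and succeeds only if
# some address field ends in exactly :PORT (so 18080 does not match 8080).
# port_in_listing is a hypothetical helper, not part of AgentHub.
port_in_listing() {
  awk -v port="$1" '
    {
      for (i = 1; i <= NF; i++) {
        n = split($i, parts, ":")
        if (n > 1 && parts[n] == port) { found = 1 }
      }
    }
    END { exit (found ? 0 : 1) }
  '
}
```

Typical use would be `ss -tln | port_in_listing 8080 && echo "port 8080 has a listener"`.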
Solutions:
- Start AgentHub:
agenthub
- Check firewall rules if accessing remotely
- Verify port is not in use by another process
Invalid Credentials
Symptoms: "Invalid username or password" error
Solutions:
- Verify caps lock is off
- Check if the root user exists by inspecting ~/.agenthub/agenthub.db with sqlite3
- If the root password is lost, you may need to reset the database (data loss)
Join Token Issues
Symptoms: "Invalid join token" or "Join token expired"
Solutions:
- Generate new join challenge via admin API
- Complete join within token expiration window
- Verify PIN is entered correctly
Agent Startup Issues
Agent Cannot Start
Symptoms: Clicking Start shows error or no response
Common Causes:
| Cause | Diagnostic | Solution |
|---|---|---|
| Invalid workdir | Check path exists | Create directory or choose different path |
| Path outside safe_paths | Check safe_paths config | Add path to config or use allowed path |
| Missing ACP binary | Check configured codex_acp.binary (or startup logs showing config codex_acp_binary) and verify that exact path exists and is executable | Install the bundled ACP adapter or set codex_acp.binary to a valid executable |
| Permission denied | Check directory permissions | chmod 755 /path/to/workdir |
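Several rows of the table can be checked in one pass before clicking Start. The following is a minimal sketch, assuming you already know the workdir and the path configured in codex_acp.binary; `check_agent_prereqs` is a hypothetical helper, and the writability test is an assumption about what the agent needs rather than something AgentHub documents.

```shell
# Reports which startup prerequisites fail for a given workdir and ACP binary.
# check_agent_prereqs is a hypothetical helper, not an AgentHub command.
check_agent_prereqs() {
  workdir="$1"
  binary="$2"
  status=0
  if [ ! -d "$workdir" ]; then
    echo "workdir missing: $workdir"
    status=1
  elif [ ! -w "$workdir" ]; then
    # Assumption: the agent needs to write inside its workdir.
    echo "workdir not writable: $workdir"
    status=1
  fi
  if [ ! -e "$binary" ]; then
    echo "ACP binary missing: $binary"
    status=1
  elif [ ! -x "$binary" ]; then
    echo "ACP binary not executable: $binary"
    status=1
  fi
  return "$status"
}
```

For example, `check_agent_prereqs /path/to/workdir /path/to/your-configured-agenthub-codex-acp` prints one line per failed check and returns non-zero if any check fails.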
"Workdir path must be within allowed safe paths"
Solutions:
- Add path to safe_paths in config:
safe_paths = [
  "/home/you/projects",
  "/new/path/here",
]
- Use create_worktree mode (uses configured default_root)
- Restart AgentHub after config changes
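To see in advance whether a workdir would be rejected, you can approximate the containment rule locally. This is a sketch under a stated assumption: it treats "within" as a plain path-prefix match on the strings you pass in and does not resolve symlinks or relative segments, so AgentHub's own check may be stricter. `is_within_safe_paths` is a hypothetical helper.

```shell
# Succeeds if WORKDIR equals, or is nested under, one of the safe paths
# given as the remaining arguments. Plain string comparison only.
is_within_safe_paths() {
  workdir="${1%/}"          # ignore a single trailing slash
  shift
  for safe in "$@"; do
    safe="${safe%/}"
    case "$workdir" in
      "$safe" | "$safe"/*) return 0 ;;
    esac
  done
  return 1
}
```

With the example config, `is_within_safe_paths /home/you/projects/app /home/you/projects /new/path/here` succeeds, while `/home/you/projects-old` fails because the match requires a path separator after the safe path.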
ACP Binary Starts But Fails Immediately
Symptoms: Agent starts and exits quickly, or the server logs show ACP handshake / startup errors.
Diagnostic Steps:
- Check which binary AgentHub is launching by inspecting the configured codex_acp.binary value first.
- Verify that exact adapter binary is the expected one:
/path/to/your-configured-agenthub-codex-acp --version
- Compare the binary path and reported adapter version to the deployed AgentHub build, or rebuild/replace the adapter from the same repository revision if you are unsure they match.
Solutions:
- Rebuild from the current repository state if the binary is stale
- Avoid mixing an older fork-pinned ACP binary into a newer AgentHub deployment
- If you intentionally use a custom ACP binary, set codex_acp.binary to that exact path and verify protocol compatibility first
"Agent is already running"
Solutions:
- Wait for current run to complete
- Stop the agent first, then restart
- Check if zombie process exists:
ps aux | grep agenthub-codex-acp
Session and Output Issues
No Output or Stale Output
Symptoms: Agent shows "running" but no new output appears
Diagnostic Steps:
- Check connection badge in UI header
  - connected: SSE connection active
  - connecting/reconnecting: Connection issues
- Check browser console for SSE errors:
// In browser console
new EventSource('/api/agents/{agent_id}/events')
- Verify event database is writable:
ls -la ~/.agenthub/agent-events/{agent_id}.db
- Check server logs for ACP event sink errors
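The writability step is easy to misread from `ls -la` output alone, because SQLite also needs to create journal files next to the database, so the containing directory must be writable as well. A small sketch of that check follows; `check_event_db_writable` is a hypothetical helper, and it should be run as the same user that runs AgentHub for the result to be meaningful.

```shell
# Checks that an event database exists and that both the file and its
# directory are writable by the current user.
check_event_db_writable() {
  db="$1"
  dir=$(dirname "$db")
  [ -f "$db" ] || { echo "missing: $db"; return 1; }
  [ -w "$db" ] || { echo "not writable: $db"; return 1; }
  [ -w "$dir" ] || { echo "directory not writable: $dir"; return 1; }
  echo "ok: $db"
}
```

For example: `check_event_db_writable ~/.agenthub/agent-events/{agent_id}.db`, with `{agent_id}` replaced by the real ID.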
Recovery Actions:
- Keep the current session for evidence
- Create a fresh session with same prompt
- Compare behavior to isolate differences
- Check Connection Status and Recovery
History Replay Is Slow
Symptoms: Opening a completed session takes a long time to load
Solutions:
- Large sessions (>10k events) naturally take time
- Reduce event_retention_days in config
- Delete old agent event databases:
rm ~/.agenthub/agent-events/{old_agent_id}.db
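Before deleting anything, it can help to list candidates first. The sketch below assumes that a database's modification time is a reasonable proxy for when its agent was last active, which may not hold if cleanup jobs touch the files; `list_stale_event_dbs` is a hypothetical helper and only prints paths.

```shell
# Prints event databases in DIR that were last modified more than DAYS days ago.
# It never deletes; review the list, then remove files yourself.
list_stale_event_dbs() {
  dir="$1"
  days="$2"
  find "$dir" -maxdepth 1 -type f -name '*.db' -mtime +"$days" -print
}
```

For example, `list_stale_event_dbs ~/.agenthub/agent-events 30` shows databases untouched for over 30 days.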
Events Missing From History
Symptoms: Recent events don't appear in session view
Causes:
- Events persisted asynchronously (small delay normal)
- Database locked during cleanup (if vacuum_on_cleanup = true)
- Event database corruption
Solutions:
- Wait a few seconds and refresh
- Check server logs for SQLite errors
- Restart AgentHub if database appears stuck
Team Issues
Team Runtime Won't Start
Symptoms: "Start Team" fails or hangs
Diagnostic Steps:
- Check Team spec is valid JSON
- Verify all member_id references are valid
- Ensure leader_member_id references existing member
- Check entrypoint is defined
Common Errors:
| Error | Cause | Solution |
|---|---|---|
| spec.members must be an array | Invalid spec format | Ensure members is JSON array |
| spec.leader_member_id must reference a defined member | Leader not in members list | Add leader to members or fix reference |
| step already exists for run | Duplicate step key | Use unique step keys |
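The first two errors in the table can often be caught before starting the Team by checking the spec file with `jq`. This is a rough sketch built on assumptions the page does not confirm: that the file contains the spec object itself, that `members` is an array of objects, and that each member carries a `member_id` field. `validate_team_spec` is a hypothetical helper and requires `jq` to be installed.

```shell
# Succeeds only if the file is valid JSON, .members is an array, and
# .leader_member_id matches the member_id of some member.
# The field layout is an assumption; adjust it to your actual spec.
validate_team_spec() {
  jq -e '
    if (.members | type) != "array" then false
    else (.leader_member_id as $leader
          | $leader != null and any(.members[]; .member_id == $leader))
    end
  ' "$1" >/dev/null
}
```

For example, `validate_team_spec team.json || echo "spec needs fixing"`, where `team.json` is a placeholder file name.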
Permission Review Not Routing
Symptoms: Tool permission requests not reaching reviewers
Solutions:
- Check Team has leader defined
- Verify requester_role is set correctly
- Ensure agenthub actor ... commands work:
agenthub actor team-members --actor-id <leader_actor_id>
Team Messages Not Delivered
Symptoms: Messages sent but not received by members
Diagnostic Steps:
- Check actor inbox:
agenthub actor inbox --actor-id <member_id> --run-id <run_id>
- Verify mailbox routing is correct
- Check internal gRPC connectivity
Internal gRPC and Actor Issues
"internal grpc client not available"
Solutions:
- Enable internal gRPC in config:
[internal_grpc]
enabled = true
listen = "127.0.0.1:50051"
- Restart AgentHub
- Verify shared_secret is explicitly configured
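To confirm what the running config actually contains, it can be handy to read a single key out of one section without opening the file. The sketch below is a deliberately naive reader, not a TOML parser: it assumes simple `key = value` lines, and it would mangle values that contain `#` or span several lines. `toml_section_value` is a hypothetical helper.

```shell
# Prints the value of KEY inside [SECTION] of a TOML-like file.
# Usage: toml_section_value FILE SECTION KEY
toml_section_value() {
  awk -v section="[$2]" -v key="$3" '
    /^[ \t]*\[/ { in_section = ($1 == section); next }
    in_section && $0 ~ ("^[ \t]*" key "[ \t]*=") {
      sub(/^[^=]*=[ \t]*/, "")      # drop "key = "
      sub(/[ \t]*(#.*)?$/, "")      # drop trailing comment and spaces
      gsub(/^"|"$/, "")             # drop surrounding double quotes
      print
      exit
    }
  ' "$1"
}
```

For example, `toml_section_value ~/.agenthub/config.toml internal_grpc enabled` should print `true` once the section above is in place, and nothing if the key is absent.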
Actor CLI Commands Fail
Symptoms: agenthub actor inbox returns error
Diagnostic Steps:
- Verify internal gRPC is enabled
- Check shared_secret matches between config and CLI context
- Ensure authority process is running
- Verify --actor-id is correct
Solutions:
# Test basic connectivity
agenthub actor team-members --actor-id <id>
# Check inbox with explicit run scope
agenthub actor inbox --actor-id <id> --run-id <run_id> --limit 10
Agent Node Issues
Cannot Register Remote Node
Symptoms: "failed to connect to remote node" error
Diagnostic Steps:
- Verify remote AgentHub is running with internal gRPC enabled
- Check network connectivity:
telnet remote-host 50051
- Verify TLS certificates are valid
- Check firewall rules
Remote Agent Won't Start
Symptoms: Agent assigned to remote node but doesn't start
Solutions:
- Verify node is reachable from main control plane
- Check node's Default worktree root is configured or provide explicit Workdir
- Review node logs on remote machine
- Ensure same shared_secret across cluster
Performance Issues
High CPU Usage
Diagnostic Steps:
- Check active agent count
- Review event database sizes:
du -sh ~/.agenthub/agent-events/*.db
- Monitor cleanup operations
Solutions:
- Reduce event_retention_days
- Lower delete_batch_size for gentler cleanup
- Delete old/unused agents
High Disk Usage
Diagnostic Steps:
# Check AgentHub data directory
du -sh ~/.agenthub/*
# Find largest event databases
ls -lhS ~/.agenthub/agent-events/*.db | head -10
Solutions:
- Enable vacuum_on_cleanup = true
- Manually delete old agent databases
- Reduce retention period
Slow Query Performance
Causes:
- Large event databases without cleanup
- Missing indexes (should be automatic)
- Concurrent cleanup operations
Solutions:
- Regular cleanup via event_retention_days
- Schedule maintenance window for VACUUM
- Monitor with:
sqlite3 agent.db "PRAGMA integrity_check;"
Notification Issues
Push Notifications Not Received
Diagnostic Steps:
- Check browser notification permission
- Verify VAPID keys exist:
cat ~/.agenthub/vapid.json
- Check subscription status in UI
- Verify subject is configured
Solutions:
- Re-subscribe in browser
- Rotate VAPID keys if corrupted
- Check HTTPS requirement for production
Database Issues
SQLite Lock/Timeout Errors
Symptoms: "database is locked" errors in logs
Solutions:
- Reduce concurrent operations
- Check for long-running queries
- Ensure proper connection pooling
Database Corruption
Symptoms: SQLite integrity check failures
Recovery:
# Backup first
cp ~/.agenthub/agenthub.db ~/.agenthub/agenthub.db.backup
# Check integrity
sqlite3 ~/.agenthub/agenthub.db "PRAGMA integrity_check;"
# For event databases
sqlite3 ~/.agenthub/agent-events/{agent_id}.db "PRAGMA integrity_check;"
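When there are many event databases, running the integrity check one file at a time gets tedious. The loop below is a sketch that relies on `PRAGMA integrity_check` printing the single word `ok` for a healthy database; `integrity_report` is a hypothetical helper and needs the `sqlite3` CLI. It only reads the databases, so back them up first if you plan to attempt repairs afterwards.

```shell
# Runs PRAGMA integrity_check on every *.db file in DIR and prints one
# line per file. Returns non-zero if any database fails the check.
integrity_report() {
  dir="$1"
  failures=0
  for db in "$dir"/*.db; do
    [ -f "$db" ] || continue        # the glob matched nothing
    result=$(sqlite3 "$db" "PRAGMA integrity_check;" 2>&1) || result="error"
    if [ "$result" = "ok" ]; then
      echo "$db: ok"
    else
      echo "$db: FAILED"
      failures=$((failures + 1))
    fi
  done
  [ "$failures" -eq 0 ]
}
```

For example, `integrity_report ~/.agenthub/agent-events` covers all event databases in one run.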
Getting Help
When reporting issues, include:
- AgentHub version:
agenthub --version
- Config (sanitized):
cat ~/.agenthub/config.toml
- Logs at debug level:
# Add to config
[logging]
level = "debug"
- System info:
uname -a
df -h
free -h
- Reproduction steps
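Sanitizing the config by hand is error-prone, so a filter that masks likely secrets before you paste the file can help. The sketch below is an assumption about which keys are sensitive: it redacts any single-line `key = value` entry whose key contains `secret`, `password`, `token`, or `key`. `sanitize_config` is a hypothetical helper, and you should still read the output before sharing it.

```shell
# Replaces the value of any key that looks sensitive with "REDACTED".
# Reads config text on stdin and writes the masked text to stdout.
sanitize_config() {
  sed -E 's/^([[:space:]]*[A-Za-z0-9_.-]*(secret|password|token|key)[A-Za-z0-9_.-]*[[:space:]]*=[[:space:]]*).*/\1"REDACTED"/'
}
```

For example, `sanitize_config < ~/.agenthub/config.toml` turns a `shared_secret = "..."` line into `shared_secret = "REDACTED"` and leaves other lines unchanged.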
Recovery Strategy
For serious issues:
- Keep evidence: Don't delete failed sessions
- Create fresh environment: New workdir/worktree
- Isolate variables: Test with minimal config
- Incremental changes: Add complexity gradually
- Monitor: Watch logs during recovery