Token overhead analysis: 73% of each API call is fixed overhead (~13.9K tokens) — data + suggestions
Using a custom monitoring dashboard, the reporter analyzed six request dumps from a Hermes v0.6.0 deployment and found that about 73% of the tokens in every API call (~13,935 tokens) are fixed overhead: tool definitions account for 46.1% and the system prompt for 27.2%. This overhead is the same regardless of conversation content.
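To make the arithmetic behind these percentages explicit, here is a minimal sketch of how the shares could be computed from per-component token counts. The function name `overhead_breakdown` and the input numbers are hypothetical: the counts are round values chosen only to reproduce the reported percentages, not the actual figures from the request dumps.

```python
def overhead_breakdown(tool_tokens: int, system_tokens: int, conversation_tokens: int) -> dict:
    """Split one request's token count into fixed overhead and conversation content."""
    total = tool_tokens + system_tokens + conversation_tokens
    if total <= 0:
        raise ValueError("total token count must be positive")
    # Fixed overhead is everything sent on every call regardless of the conversation.
    fixed = tool_tokens + system_tokens
    return {
        "total_tokens": total,
        "fixed_tokens": fixed,
        "tool_share": tool_tokens / total,
        "system_share": system_tokens / total,
        "fixed_share": fixed / total,
    }


# Illustrative counts only (assumption): scaled so that the shares come out
# at the reported 46.1% (tool definitions) and 27.2% (system prompt).
example = overhead_breakdown(tool_tokens=4610, system_tokens=2720, conversation_tokens=2670)
print(f"tool definitions: {example['tool_share']:.1%}")
print(f"system prompt:    {example['system_share']:.1%}")
print(f"fixed overhead:   {example['fixed_share']:.1%}")
```

The two components sum to 73.3%, which matches the rounded 73% in the summary.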