Response truncated due to output length limit
When a model hits its max output tokens limit, the response gets truncated (finish_reason='length') and Hermes rolls back to the last complete assistant turn, an issue reported as common in long conversations with lengthy outputs.