Response truncated due to output length limit

#2706 Fixed

When a model hits its max output tokens limit, the response is truncated (finish_reason='length') and Hermes rolls back to the last complete assistant turn. The issue was reported as common in long conversations with lengthy outputs.
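The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not Hermes's actual code: the function name `rollback_to_last_complete_turn`, the list-of-dicts message format, and the assumption that the truncated reply is the last entry in the history are all hypothetical.

```python
from typing import Dict, List

Message = Dict[str, str]


def rollback_to_last_complete_turn(
    history: List[Message], finish_reason: str
) -> List[Message]:
    """Hypothetical helper: trim history after a truncated reply.

    Assumes the final entry in `history` is the assistant reply that
    the API returned together with `finish_reason`.
    """
    # Only finish_reason == "length" signals an output-limit truncation.
    if finish_reason != "length":
        return list(history)

    # Drop the truncated reply itself.
    complete = history[:-1]

    # Walk backwards to the most recent assistant message that finished.
    for i in range(len(complete) - 1, -1, -1):
        if complete[i]["role"] == "assistant":
            return list(complete[: i + 1])

    # No earlier assistant turn exists, so nothing is kept.
    return []
```

Under these assumptions, a history of four messages ending in a truncated assistant reply would be cut back to the first complete user and assistant exchange, while any other finish_reason leaves the history unchanged.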