
🧠 Managing conversation memory: an often overlooked lever for frugality

Written by Nicolas Movio
Updated this week

The impact of memory

With each new message in a conversation, the AI assistant reloads the entire history to maintain context.

Concretely:

  • 1st message: 100k tokens used

  • 2nd message: 100k (history) + 100k (new) = 200k tokens

  • 3rd message: 200k (history) + 100k (new) = 300k tokens

👉 The longer the conversation, the more each new interaction costs in resources—even if the message itself is short.
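The growth above can be sketched in a few lines. This is an illustrative model only (it assumes the full history is resent on every turn and uses a hypothetical fixed message size), not the billing logic of any particular assistant:

```python
def tokens_processed(new_tokens_per_message: int, n_messages: int) -> list[int]:
    """Tokens the model reprocesses at each turn, assuming the entire
    history is resent with every new message (hypothetical fixed
    message size, for illustration only)."""
    history = 0
    per_turn = []
    for _ in range(n_messages):
        history += new_tokens_per_message
        per_turn.append(history)  # each turn costs more as history grows
    return per_turn

costs = tokens_processed(100_000, 3)
# per-turn costs: [100000, 200000, 300000] — total 600000 tokens processed
```

Note that the total grows quadratically with the number of turns: three 100k-token exchanges already cost 600k tokens of processing, not 300k.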

The issue: if this context is no longer useful for your new request, you’re still paying (indirectly) for that memory without getting any value from it. At the same time, you are slowing down interactions (lower performance), consuming more tokens (higher cost), and increasing environmental impact (energy, but also water used to cool infrastructure).

The goal: stay frugal. See the article "AI: better, not more — adopting a frugal approach".

🧩 Quick reminder: memory + tokens

An assistant based on an LLM doesn’t “remember” like a human. For each message, it reprocesses part (or all) of the conversation history, within the limits of its context window. This window is measured in tokens (units of text).

👉 For a simple explanation and key orders of magnitude, see: LLM limitations and the concept of tokens
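To build an intuition for what "measured in tokens" means, a common rule of thumb for English text is roughly 4 characters per token. The heuristic below is only an approximation (real tokenizers vary by model and language):

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token for English
    text. A common rule of thumb, not an exact tokenizer."""
    return max(1, len(text) // 4)

estimate_tokens("Summarize this report in 5 key points")  # ≈ 9 tokens
```

This is only useful for orders of magnitude, e.g. noticing that pasting a 400,000-character report consumes on the order of 100k tokens.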


When to keep the same conversation

It makes sense to continue the same conversation when:

  • you are iterating on the same document or topic

  • you are progressively refining a response

  • you are asking follow-up questions tied to the previous context

  • you are requesting adjustments in format or content

Example:

  1. “Summarize this report in 5 key points”

  2. “Add one recommendation per point”

  3. “Put everything into a table”

Here, each message builds on the previous one: the memory is useful and justifies its cost.


When to start a new conversation

It is better to start a new conversation as soon as:

  • you switch to a completely different topic

  • you move to a new document unrelated to the previous one

  • you have completed a task and are starting another

  • the conversation becomes long and the initial context is no longer relevant

Example:

Conversation 1: marketing document analysis → Completed ✓

New conversation: writing a client email

👉 No connection between the two topics → no reason to keep the history from the first conversation.


The reflex to adopt

Ask yourself this simple question before continuing a conversation:

“Does the AI need the previous context to answer correctly?”

  • Yes → continue in the same conversation

  • No → start a new one

This simple habit can cut token usage (and therefore energy consumption) by a factor of 2 to 3 over a day of work, with no loss in response quality.
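The order of magnitude of that saving can be checked with the same illustrative model as before. The numbers below are hypothetical (6 unrelated tasks of 10k tokens each, full history resent every turn):

```python
TASK_TOKENS = 10_000   # hypothetical token size of each task's exchange
N_TASKS = 6            # unrelated tasks done over a day

# One long conversation: the growing history is resent for every task.
single_thread = sum(TASK_TOKENS * i for i in range(1, N_TASKS + 1))

# A fresh conversation per unrelated task: no stale history to resend.
fresh_threads = TASK_TOKENS * N_TASKS

# single_thread == 210000, fresh_threads == 60000: roughly 3.5x fewer tokens
```

The exact ratio depends on how many unrelated tasks you stack into one thread, but splitting them consistently lands in the "2 to 3×" range the article mentions.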


In summary

Situation → Recommended action

  • Iterating on the same topic → Keep the conversation

  • Refining a response → Keep the conversation

  • Changing topic → New conversation

  • Long conversation with outdated context → New conversation

  • New document unrelated to the previous one → New conversation

👉 Actively managing your conversations means avoiding paying for memory that no longer creates value.

It’s a simple, immediate, and highly effective lever to adopt a more frugal approach in everyday use.
