
🦾 Limitations of LLMs and the concept of tokens

Written by Ludivine Schmitt
Updated this week

AI assistants, like those used in Outmind, are now extremely powerful tools. They make it possible to search, analyze, and leverage large amounts of information in just a few seconds.

However, to get the most out of them, it is essential to understand their limitations: not so you distrust them, but so you use them more intelligently and effectively.


🧠 An LLM is not magic

An LLM (Large Language Model) does not “understand” things like a human. It does not reason based on real-world experience or business context.

👉 It predicts likely answers based on the information it has access to.

In practical terms, this means it can:

  • make mistakes or approximations

  • misinterpret a request if it is unclear

  • generate plausible but incorrect information (this is called a “hallucination”)

  • fail to take all available information into account

👉 And that’s normal… a bit like a human.

You can think of an LLM as a very fast and efficient intern who does not yet have deep knowledge of your company or the perspective of a more senior colleague.

With clear instructions, it can produce excellent work. Without a clear framework, it can make the same kinds of mistakes as a junior.

👉 It remains a support tool, not a source of absolute truth.


🔍 Working effectively with AI

The goal is not to “trust or not trust” AI, but to understand how to work with it.

In practice, this comes down to a few simple habits.

Reviewing important answers is essential, especially when dealing with sensitive elements such as dates, numbers, or names. It is also useful to explicitly ask the AI which sources it relies on, so you can quickly verify where the information comes from.

How you phrase your request also plays a key role. A clear, structured, and step-by-step instruction will almost always produce better results than a vague or overly broad request.

Finally, working step by step helps you stay in control of the assistant’s reasoning, just as you would with a colleague.

👉 The right mindset: use AI as a support tool, not as final validation.


🔢 What is a token?

AI models work with tokens, which are units of text.

A token can be a word, part of a word, or even a punctuation mark.

For example, a tokenizer might split the word “tokenization” into the segments “token” and “ization”, while a common short word like “the” is usually a single token, and punctuation marks often count as tokens of their own.

📐 Orders of magnitude

  • 1 token ≈ 0.75 word

  • 1 dense page ≈ 400 to 500 words

  • 1 million tokens ≈ 750,000 words, or ≈ 1,500 to 2,000 pages

👉 Note: these equivalences vary significantly depending on the type of content.
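The equivalences above can be turned into a quick back-of-the-envelope calculation. This is a minimal sketch using the article’s own ratios (1 token ≈ 0.75 word, a dense page ≈ 400 to 500 words, so 450 is taken as a midpoint); real tokenizers vary by content type, so treat the result as an estimate, not a measurement.

```python
# Rough conversion between tokens, words, and pages, using the
# approximate ratios from this article. Illustrative only.

WORDS_PER_TOKEN = 0.75   # ≈ 1 token per 0.75 word
WORDS_PER_PAGE = 450     # midpoint of the 400-500 words-per-page range

def tokens_to_pages(tokens: int) -> float:
    """Estimate how many dense pages a token count represents."""
    words = tokens * WORDS_PER_TOKEN
    return words / WORDS_PER_PAGE

# 1 million tokens ≈ 750,000 words ≈ 1,667 pages,
# which lands inside the article's 1,500-2,000 page range.
print(round(tokens_to_pages(1_000_000)))  # → 1667
```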


📏 Why tokens matter

AI models have a token limit. In other words, they can only process a certain amount of text at once.

This limit includes:

  • your question

  • the documents being analyzed

  • the generated answer

The more content you add, the more tokens you consume. Once the limit is reached, the model has to make trade-offs.
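The shared-budget idea above can be sketched in a few lines. The context size below is a hypothetical round number for illustration, not Outmind’s actual limit; the point is simply that the question, the documents, and the answer all draw from the same pool of tokens.

```python
# Hedged sketch: the token limit is one shared budget.
# CONTEXT_LIMIT is an illustrative figure, not a real product limit.

CONTEXT_LIMIT = 128_000  # hypothetical model context window, in tokens

def remaining_for_answer(question_tokens: int, document_tokens: int) -> int:
    """Tokens left for the generated answer after question + documents."""
    used = question_tokens + document_tokens
    return max(CONTEXT_LIMIT - used, 0)

# A short question plus a huge pile of documents leaves
# very little room for the model to write its answer:
print(remaining_for_answer(200, 125_000))  # → 2800
```

Once `remaining_for_answer` hits zero, the model is in exactly the trade-off situation described below: something has to be dropped, or the request fails.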

→ Not all content is equal

The model does not read “pages,” but raw tokens. Depending on the format, cost and efficiency vary greatly:

  • Narrative text (emails, articles) → efficient (few tokens per idea)

  • Tables / poorly extracted PDFs → costly (many tokens for little information)

  • Code → very costly (every symbol counts)

  • JSON / HTML / logs → token explosion

👉 1,000 pages of a novel ≠ 1,000 pages of code or tables.
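To make the “not all content is equal” point concrete, here is an illustrative estimator. The characters-per-token densities are rough assumptions (narrative prose tokenizes efficiently; JSON splits into many small tokens because of braces, quotes, and keys), not measured values from any specific tokenizer.

```python
# Illustrative only: the same information costs more tokens in a
# verbose format. Densities are assumed, not measured.

DENSITY = {"prose": 4.0, "json": 2.5}  # approx. characters per token

def estimate_tokens(text: str, kind: str) -> int:
    """Rough token count from character length and an assumed density."""
    return round(len(text) / DENSITY[kind])

prose = "The meeting is confirmed for Tuesday at 10 am."
data = '{"meeting": {"confirmed": true, "day": "Tuesday", "time": "10:00"}}'

# The JSON version carries the same facts but costs noticeably more tokens.
print(estimate_tokens(prose, "prose"), estimate_tokens(data, "json"))
```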


⚠️ Practical consequences

When too much information is sent at once, two main scenarios can occur:

  1. The LLM makes trade-offs: the model selects only part of the information to stay within its limit. As a result, some documents are ignored and the answer may be partial or less accurate.

  2. The request fails: if the limit is exceeded, the analysis may not complete (incomplete response, error, or interruption).

👉 This limitation is technical and independent of Outmind.


💡 Token best practices

In practice, the best results come from a more focused approach.

Instead of analyzing everything at once, it is better to work with batches of documents, break down complex requests, and focus on truly relevant information.

This not only helps bypass technical limits but also significantly improves the quality of the answers.

👉 In short: less volume, more precision.
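The “batches of documents” habit can be sketched as a simple splitting loop. `summarize` here is a hypothetical stand-in for whatever call actually analyzes a batch; the batch size of 4 is arbitrary and would depend on document size in practice.

```python
# Minimal sketch of working in batches instead of sending everything
# at once. `batch_size` and the document names are illustrative.

def batched(items: list, batch_size: int) -> list:
    """Split a list of items into consecutive groups of at most batch_size."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

documents = [f"doc_{n}" for n in range(10)]

for batch in batched(documents, batch_size=4):
    # In a real workflow, each batch would go into its own request,
    # keeping every request comfortably under the token limit.
    print(batch)  # three batches: 4, 4, and 2 documents
```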

Structure = cost + performance

How you structure your request has a direct impact:

  • unnecessary repetition → wasted tokens

  • overly long instructions → higher cost

  • poorly organized context → lower understanding

On the other hand, a clear and structured prompt uses fewer tokens and produces better results.

Reading more ≠ understanding better

Even if a model can process a large number of tokens, the longer the context, the more its “attention” gets diluted.

👉 Structuring and prioritizing information is often more important than raw quantity.


🧠 Key takeaways

AI assistants are extremely powerful tools, but they remain imperfect.

They can make mistakes, miss information, or be limited by technical constraints such as tokens. Most importantly, they do not replace human judgment.

👉 The right approach is to:

  • clearly frame your requests

  • verify the results

  • work step by step

The best results come from combining the power of AI with your own expertise.
