
🥸 Anonymize a Document with the Assistant

Written by Maxime Renault
Updated over 3 months ago

🎯 Goal

Allow a user to anonymize a document containing sensitive information, while preserving its structure and formatting (layout, tables, images, etc.).

The assistant can be launched on a file to detect and mask confidential data.


🧠 How It Works

Outmind provides a dedicated assistant specialized in document anonymization.
It follows a 3-step process (extraction, manual validation, then anonymization) to ensure a rigorous and user-controlled approach.

The assistant works locally on the selected file with no extrapolation: it analyzes only what is actually present in the document.


✅ Benefits

  • Securely share documents externally

  • Full control over what gets anonymized

  • Saves time: no need to reread the document line by line

  • Compliant with GDPR and internal confidentiality policies

  • Visual fidelity: formatting and layout remain intact


📌 Key Takeaway

This use case is especially valuable in pre-sales, support, legal, HR, or any scenario where you want to reuse a document without exposing sensitive information.

Here, the assistant acts as a document hygiene tool, automating a process that is both tedious and critical.


🔍 Real-World Example

A user wants to anonymize a client report before sending it to a partner, removing:

  • Names of involved individuals

  • Email or postal addresses

  • Project identifiers or internal references


🧪 Example Prompt

Context: I want to anonymize a document so it can be reused without exposing any sensitive information.


The file may contain confidential data that should be anonymized, while keeping its original formatting and layout (styles, tables, images, etc.).

Step 1 → Extract Sensitive Elements

Go through the entire document, including:

  • Main text

  • Headers and footers

  • Comments, notes, and metadata

  • Tables, charts, and image-embedded text (if OCR is available)

Identify and list all of the following sensitive elements:

  • Company names (e.g., “ACME”, “EDF”, “Deloitte”)

  • First and last names (e.g., “Grace”, “Mehrabe”, “Grace Mehrabe”)

  • Phone numbers (e.g., “+33 6 07 08 09 10” or similar formats)

  • Postal addresses (e.g., “4B Rue Saint Sauveur, 75002 PARIS”)

  • Email addresses (e.g., “grace@outmind.fr”)

  • Internal or client-specific URLs (e.g., https://intranet.edf.fr/projet-xyz)

  • Project or RFP identifiers (e.g., “AO-2024-EDF-02”)

  • Bank details, SIRET numbers, IBANs, etc.

  • Initials or acronyms tied to personal names (e.g., “G.M.”, short for “Grace Mehrabe”)

  • File metadata (author, company, title, etc.)

Present the full list of these elements with context where possible (e.g., snippet or page number).


Do not make any replacements at this stage.

Step 2 → Manual Validation

Once the list is shown, wait for me to validate the elements to anonymize.
I may choose to remove some, add new ones, or request specific replacement rules such as:

  • “All first names → FirstNameX”

  • “Emails → [anonymized email]”

Only after confirmation, proceed with the anonymization.
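
For readers who want to see the logic behind the prompt, the extract, validate, then replace flow can be sketched in a few lines of Python. This is only an illustration and not how the Outmind assistant is implemented: the patterns, the `Finding` class, and the `extract` and `anonymize` functions are hypothetical names, and the regular expressions cover just three of the element types listed above (emails, French-style phone numbers, and RFP identifiers shaped like the example).

```python
import re
from dataclasses import dataclass

# Hypothetical patterns for illustration; real documents need far broader coverage.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "phone": re.compile(r"\+33(?:\s?\d){9}"),
    "project_id": re.compile(r"\bAO-\d{4}-[A-Z]+-\d{2}\b"),
}


@dataclass
class Finding:
    kind: str      # which pattern matched
    value: str     # the sensitive text itself
    snippet: str   # surrounding text, shown to the reviewer as context


def extract(text, context=20):
    """Step 1: list candidate elements without replacing anything."""
    findings = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            start = max(0, match.start() - context)
            end = min(len(text), match.end() + context)
            findings.append(Finding(kind, match.group(), text[start:end]))
    return findings


def anonymize(text, approved, rules):
    """Final step: replace only the findings the user approved."""
    # Longer values go first so a short value never rewrites part of a longer one.
    for finding in sorted(approved, key=lambda f: len(f.value), reverse=True):
        text = text.replace(finding.value, rules.get(finding.kind, "[anonymized]"))
    return text


sample = "Contact grace@outmind.fr or +33 6 07 08 09 10 about AO-2024-EDF-02."
findings = extract(sample)

# Step 2 (manual validation), simulated here: the user keeps the project identifier.
approved = [f for f in findings if f.kind != "project_id"]
rules = {"email": "[anonymized email]", "phone": "[anonymized phone]"}

result = anonymize(sample, approved, rules)
print(result)
```

Keeping extraction and replacement in separate functions mirrors the prompt's rule that nothing is replaced until the list has been validated.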
