Legal · February 26, 2026 · 10 min read

Privilege-Aware Document Search: Role-Based PII Control for Legal RAG

Build RAG systems for law firms where partners, associates, paralegals, and client portals each see appropriate levels of case data. Supports ethical walls, attorney-client privilege, and ABA compliance.

Law firms are rapidly adopting RAG systems for case research, contract analysis, due diligence, and client communication drafting. The efficiency gains are compelling — associates can search across thousands of case files in seconds, partners can pull up settlement histories instantly, and paralegals can draft document summaries without manually reviewing every page.

But legal documents contain some of the most sensitive PII in any industry. Client identities, financial details, medical records in personal injury cases, settlement amounts, Social Security numbers for estate planning, and privileged attorney-client communications all flow through a firm's document management system. ABA Model Rule 1.6 imposes a duty of confidentiality that extends to every piece of technology a firm uses — including AI systems. When a RAG pipeline sends retrieved case files to an LLM, every piece of client data in those files is exposed to the model provider.

The solution is role-based PII control: different roles within a law firm see different levels of detail from the same underlying case data. A partner working a case sees everything. An associate sees case details but not opposing party contact information. A paralegal sees document summaries with personal identifiers tokenized. And a client portal shows only the client's own matter status. This article shows how to build that system using Blindfold's policy engine and entity-level tokenization.

The Confidentiality Challenge in Legal AI

Legal confidentiality is not just a best practice — it is an ethical obligation enforced by bar associations, courts, and malpractice insurers. When a law firm deploys a RAG system, several overlapping concerns create a uniquely difficult privacy problem:

  • Multi-client data. A firm handles matters for dozens or hundreds of clients simultaneously. A single RAG query might retrieve documents from multiple unrelated matters, exposing one client's data to someone working on another client's case.
  • Conflicts of interest. A firm may represent TechCorp in a patent case while simultaneously representing an employee suing TechCorp for discrimination. Information barriers (ethical walls) must prevent attorneys on one matter from accessing data on the other. A RAG system that retrieves across all case files obliterates these walls.
  • Paralegal access limits. Paralegals need access to case files for document preparation, filing, and scheduling — but they should not see privileged attorney-client communications or sensitive financial details that are irrelevant to their work.
  • Client portals. Clients increasingly expect self-service access to their case status, upcoming dates, and assigned attorneys. But a client portal must never expose opposing party details, internal strategy notes, or data from other matters.
  • Bar association scrutiny. State bar associations and the ABA are actively issuing opinions on AI use in legal practice. Firms that fail to implement adequate confidentiality protections risk disciplinary action, malpractice claims, and loss of client trust.

ABA Model Rule 1.6(c): “A lawyer shall make reasonable efforts to prevent the inadvertent or unauthorized disclosure of, or unauthorized access to, information relating to the representation of a client.” Sending unredacted case files to an LLM provider arguably violates this rule.

Role-Based Access with Blindfold Policies

The core idea is simple: each role in the firm gets a different tokenization configuration. When a user queries the RAG system, retrieved documents are tokenized according to that user's role before being sent to the LLM. The same case file produces different outputs depending on who is asking.

| Role | Sees | Redacted | Ethical Basis |
| --- | --- | --- | --- |
| Partner | Full access — all client data, privileged communications | Nothing | Case responsibility, fiduciary duty |
| Associate | Client names, case details, legal arguments | Opposing party contact info, settlement amounts | Working case but limited need-to-know |
| Paralegal | Case details, filing dates, document summaries | Client contact info, financial details, SSN | Document preparation, no client contact |
| Client Portal | Own case status, key dates, assigned attorneys | All other party PII, internal notes, strategy | Self-service access, limited to own matter |

This tiered model ensures each role sees exactly the information their ethical duties require — no more, no less. Partners carry full fiduciary responsibility and need unrestricted access. Associates work directly on cases but do not need opposing party contact details or exact settlement figures during routine research. Paralegals handle procedural tasks and should not see sensitive financial data or privileged communications. Client portals are the most restricted, showing only the requesting client's own matter information.

Implementation

Let's build a complete legal RAG system with role-based tokenization. We start with realistic case file data, define per-role entity configurations, and wire it all together with ChromaDB and OpenAI.

Step 1: Define the Case Data

Here are four documents that represent a typical law firm's case management system. Notice the variety of PII: names, emails, phone numbers, Social Security numbers, financial figures, addresses, and privileged strategy communications.

python
# Legal case files with realistic PII
CASE_FILES = [
    "Case File #CF-2024-001: Martinez v. TechCorp — Employment discrimination "
    "claim filed by Sofia Martinez (sofia.m@email.com, +1-305-555-0234, "
    "SSN 567-89-0123) against TechCorp Inc. Plaintiff alleges wrongful "
    "termination based on national origin. Seeking $500,000 in damages. "
    "Lead attorney: David Park. Settlement conference scheduled for "
    "2024-06-15.",

    "Case File #CF-2024-002: In re Estate of William Chen — Probate matter. "
    "Decedent William Chen (DOB 1945-02-18, SSN 123-45-6789). Estate value "
    "approximately $3.2M including property at 142 Oak Street, Palo Alto. "
    "Beneficiaries: Sarah Chen (daughter, sarah.chen@email.com) and Michael "
    "Chen (son). Executor: Sarah Chen. Tax ID: EIN 94-1234567.",

    "Case File #CF-2024-003: TechCorp v. InnovateCo — Patent infringement "
    "case. TechCorp claims InnovateCo's product violates US Patent "
    "#10,234,567. Damages estimate: $12M. InnovateCo CEO James O'Brien "
    "(james@innovateco.com, +1-415-555-0198). TechCorp GC: Lisa Park "
    "(lisa.park@techcorp.com). Expert witness: Dr. Robert Kim.",

    "Privileged Communication #PC-2024-001: Attorney David Park to client "
    "Sofia Martinez — strategy memo regarding settlement negotiations. We "
    "should counter at $350,000 given the deposition testimony weaknesses. "
    "Judge Thompson has historically favored mediation.",
]

Step 2: Configure Role-Based Entity Lists

Each role maps to a list of PII entity types that should be tokenized. Partners see everything (empty list means no tokenization). Associates get contact details and financial identifiers redacted. Paralegals get a broader set of entities tokenized. The client portal uses policy="strict" for maximum de-identification.

python
# Role-to-entity mapping: which PII types to tokenize per role
ROLE_ENTITIES = {
    "partner": [],  # full access — no tokenization
    "associate": [
        "email address",
        "phone number",
        "social security number",
        "address",
    ],
    "paralegal": [
        "email address",
        "phone number",
        "social security number",
        "address",
        "credit card number",
        "iban",
        "date of birth",
    ],
    "client_portal": None,  # uses policy="strict" for full de-identification
}
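The three-way branch implied by this mapping — strict de-identification, no tokenization, or a selective entity list — recurs in every query path below. A small hypothetical helper (the name `tokenization_mode` is ours, not part of any SDK) can centralize that decision so it is made the same way everywhere:

```python
def tokenization_mode(role, role_entities):
    """Map a role to a tokenization decision.

    Returns ("strict", None) for full de-identification,
    ("none", None) for unrestricted access, or
    ("entities", [...]) for selective masking.
    """
    entities = role_entities[role]
    if entities is None:
        return ("strict", None)
    if not entities:
        return ("none", None)
    return ("entities", entities)
```

Centralizing the branch means a new role added to `ROLE_ENTITIES` automatically gets consistent treatment in every query function.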

Step 3: Build the Legal RAG Class

The LegalRAG class handles document ingestion, role-aware querying, and privileged document filtering. Documents are ingested with metadata tags indicating privilege status and associated matter numbers, enabling both document-level and PII-level access control.

python
import os
import chromadb
from blindfold import Blindfold
from openai import OpenAI

class LegalRAG:
    def __init__(self):
        self.blindfold = Blindfold(
            api_key=os.environ["BLINDFOLD_API_KEY"],
        )
        self.openai = OpenAI()
        # get_or_create avoids an error if the collection already exists
        self.collection = chromadb.Client().get_or_create_collection(
            "legal_cases"
        )

    def ingest(self, documents, metadata_list):
        """Ingest case files with metadata for filtering."""
        for i, (doc, meta) in enumerate(
            zip(documents, metadata_list)
        ):
            self.collection.add(
                documents=[doc],
                ids=[f"doc-{i}"],
                metadatas=[meta],
            )

    def query(self, question, role, matter_id=None):
        """Query with role-based tokenization and privilege filtering."""

        # Build metadata filter based on role.
        # ChromaDB requires multiple conditions to be combined with $and.
        conditions = []
        if role != "partner":
            # Non-partners cannot see privileged communications
            conditions.append({"privileged": False})
        if role == "client_portal" and matter_id:
            # Client portal: only their own matter
            conditions.append({"matter_id": matter_id})

        if len(conditions) > 1:
            where_filter = {"$and": conditions}
        elif conditions:
            where_filter = conditions[0]
        else:
            where_filter = None

        # Retrieve relevant documents
        results = self.collection.query(
            query_texts=[question],
            n_results=3,
            where=where_filter,
        )
        context = "\n\n".join(results["documents"][0])

        # Apply role-based tokenization
        entities = ROLE_ENTITIES[role]
        if entities is None:
            # Client portal: strict de-identification
            tokenized = self.blindfold.tokenize(
                context, policy="strict"
            )
        elif len(entities) == 0:
            # Partner: no tokenization needed
            tokenized = None
        else:
            # Associate/Paralegal: selective tokenization
            tokenized = self.blindfold.tokenize(
                context, entities=entities
            )

        safe_context = tokenized.text if tokenized else context

        # Send to LLM
        messages = [
            {
                "role": "system",
                "content": (
                    "You are a legal research assistant. "
                    "Answer based only on the provided context. "
                    "Do not fabricate information."
                ),
            },
            {
                "role": "user",
                "content": (
                    f"Context:\n{safe_context}\n\n"
                    f"Question: {question}"
                ),
            },
        ]

        completion = self.openai.chat.completions.create(
            model="gpt-4o", messages=messages
        )
        response = completion.choices[0].message.content

        # Detokenize for the end user
        if tokenized:
            restored = self.blindfold.detokenize(
                response, tokenized.mapping
            )
            return restored.text
        return response

Step 4: Ingest with Document Metadata

Each document gets metadata that drives both the privilege filter and the ethical wall enforcement. The privileged flag marks attorney-client communications. The matter_id ties each document to a specific case.

python
# Document metadata for filtering
METADATA = [
    {"matter_id": "CF-2024-001", "privileged": False, "client": "Martinez"},
    {"matter_id": "CF-2024-002", "privileged": False, "client": "Chen"},
    {"matter_id": "CF-2024-003", "privileged": False, "client": "TechCorp"},
    {"matter_id": "CF-2024-001", "privileged": True, "client": "Martinez"},
]

# Initialize and ingest
rag = LegalRAG()
rag.ingest(CASE_FILES, METADATA)

# Query as different roles
question = "What is the settlement status in the Martinez case?"

partner_answer = rag.query(question, role="partner")
associate_answer = rag.query(question, role="associate")
paralegal_answer = rag.query(question, role="paralegal")
client_answer = rag.query(
    question, role="client_portal", matter_id="CF-2024-001"
)

What Each Role Sees

Given the same question — “What is the settlement status in the Martinez case?” — each role receives a fundamentally different response because the LLM itself only sees the data that role is permitted to access.
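To make the per-role difference concrete without the SDK, the toy sketch below masks known values from a hand-built map. This is purely illustrative — Blindfold detects entities automatically; here the `KNOWN_PII` map and `mask` function are stand-ins we invented for demonstration:

```python
# Stand-in for entity tokenization: masks known values by entity type.
# Purely illustrative — a real tokenizer detects entities automatically.
KNOWN_PII = {
    "email address": {"sofia.m@email.com": "<Email_Address_1>"},
    "social security number": {"567-89-0123": "<SSN_1>"},
    "person": {"Sofia Martinez": "<Person_1>"},
}

def mask(text, entity_types):
    """Replace every known value of the given entity types with its token."""
    for etype in entity_types:
        for value, token in KNOWN_PII.get(etype, {}).items():
            text = text.replace(value, token)
    return text

doc = "Sofia Martinez (sofia.m@email.com, SSN 567-89-0123) filed a claim."

# Associate: contact identifiers masked, names kept
associate_view = mask(doc, ["email address", "social security number"])

# Paralegal-style: personal identifiers masked as well
paralegal_view = mask(
    doc, ["person", "email address", "social security number"]
)
```

The associate's context still names Sofia Martinez but hides her email and SSN; the paralegal's context replaces the name too — exactly the gradient the role views below illustrate.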

Partner View

The partner sees everything, including the privileged strategy memo. The LLM receives all four documents with no tokenization:

Sofia Martinez filed an employment discrimination claim against TechCorp seeking $500,000 in damages. A settlement conference is scheduled for 2024-06-15. Per the privileged strategy memo from David Park, the recommendation is to counter at $350,000, considering weaknesses in deposition testimony. Judge Thompson has historically favored mediation.

Associate View

The associate sees case facts but not contact information or the privileged memo (filtered by metadata). Email, phone, SSN, and address are tokenized:

Sofia Martinez filed an employment discrimination claim against TechCorp seeking $500,000 in damages. The claim alleges wrongful termination based on national origin. Lead attorney is David Park. Settlement conference is scheduled for 2024-06-15.

Paralegal View

The paralegal sees case structure and dates but all personal identifiers and financial details are tokenized. The privileged memo is filtered out by metadata:

<Person_1> filed a claim against <Organization_1>. The case involves an employment discrimination claim alleging wrongful termination. Settlement conference is scheduled for 2024-06-15.

Client Portal View

The client portal is restricted to the client's own matter (via matter_id filter) and uses strict de-identification. The response is tailored to the client:

Your case Martinez v. TechCorp has a settlement conference scheduled for 2024-06-15. Contact your attorney David Park for details.

Key insight: The partner and client portal see the same underlying case data, but the LLM receives completely different inputs. The partner's LLM prompt contains full PII and privileged memos. The client portal's LLM prompt contains only the client's own matter data with strict de-identification applied. The privacy boundary exists before the LLM, not after.

Ethical Walls and Conflict Screening

Consider the conflict in our sample data: the firm represents Sofia Martinez in Case #001 (Martinez v. TechCorp) and also represents TechCorp in Case #003 (TechCorp v. InnovateCo). An attorney working on the Martinez case must not see TechCorp's internal strategy, and an attorney working on TechCorp's patent case must not see the employment discrimination matter where TechCorp is the defendant.

This requires two layers of protection:

  1. Document-level filtering (metadata). Each attorney has an assigned set of matter IDs. When they query the RAG system, only documents tagged with their assigned matters are retrieved. This is the ethical wall — a hard boundary enforced at the database level.
  2. PII-level tokenization (Blindfold). Even if a document is accidentally retrieved across the ethical wall (due to a metadata tagging error, for example), tokenization ensures that opposing party contact information, financial details, and other sensitive data are replaced with anonymous tokens before reaching the LLM.
python
def query_with_ethical_wall(self, question, role, attorney_matters):
    """Query with ethical wall enforcement."""

    # Layer 1: Document-level filter — only assigned matters
    where_filter = {
        "matter_id": {"$in": attorney_matters}
    }
    if role != "partner":
        # Also filter out privileged docs for non-partners
        where_filter = {
            "$and": [
                {"matter_id": {"$in": attorney_matters}},
                {"privileged": False},
            ]
        }

    results = self.collection.query(
        query_texts=[question],
        n_results=5,
        where=where_filter,
    )
    context = "\n\n".join(results["documents"][0])

    # Layer 2: PII-level tokenization — defense in depth
    entities = ROLE_ENTITIES[role]
    if entities is None:
        # Client portal: strict de-identification
        tokenized = self.blindfold.tokenize(
            context, policy="strict"
        )
        context = tokenized.text
    elif len(entities) > 0:
        tokenized = self.blindfold.tokenize(
            context, entities=entities
        )
        context = tokenized.text

    # ... send to LLM as before

This double-barrier approach means an attorney on the Martinez case querying about TechCorp will get no results from Case #003 (filtered by metadata). And even if a future indexing error places a TechCorp document in the wrong matter bucket, the tokenization layer will replace names, emails, and financial figures with tokens like <Person_1>, <Email_Address_1>, and <Money_Amount_1> before the LLM ever sees the content.

Warning: Metadata filtering alone is not sufficient for ethical wall compliance. A single mislabeled document can breach the wall. Tokenization provides a critical safety net by ensuring that even if the wrong document is retrieved, the sensitive PII within it is replaced with anonymous tokens before reaching the LLM.
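Because ChromaDB rejects a `where` dict with multiple top-level conditions, combining the ethical wall with the privilege screen requires `$and`. A small hypothetical helper (`build_where_filter` is our name, not a library function) keeps that rule in one place:

```python
def build_where_filter(role, assigned_matters):
    """Build a ChromaDB where-filter enforcing the ethical wall
    (assigned matters) plus the privilege screen for non-partners."""
    conditions = [{"matter_id": {"$in": assigned_matters}}]
    if role != "partner":
        conditions.append({"privileged": False})
    # ChromaDB requires multiple conditions to be wrapped in $and
    if len(conditions) == 1:
        return conditions[0]
    return {"$and": conditions}
```

A partner assigned to one matter gets a bare `matter_id` filter; any other role gets the `$and` of the matter filter and `{"privileged": False}`.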

Attorney-Client Privilege Protection

Privileged communications — strategy memos, legal advice, work product — require the highest level of protection. If a privileged document is sent to an LLM, the firm risks waiving privilege entirely. Courts have held that sharing privileged information with third parties, even inadvertently, can constitute a waiver.

The architecture uses three defenses against privilege waiver:

  1. Metadata filtering. Documents tagged {"privileged": true} are excluded from retrieval for all non-partner roles. Associates, paralegals, and client portals never see these documents in their search results.
  2. Tokenization as a safety net. Even if a privileged document is accidentally tagged as non-privileged and retrieved for an associate, the tokenization layer replaces client names, strategy details, and financial figures with anonymous tokens. The LLM sees “<Person_1> to <Person_2> — strategy memo regarding settlement negotiations. We should counter at <Money_Amount_1>” — which reveals no meaningful privileged information.
  3. Audit logging. Every query is logged with the role, retrieved document IDs, and tokenization details. If a privilege review is needed, the firm can trace exactly what data was exposed to which role and whether tokenization was applied.
python
def query_with_privilege_guard(self, question, role, matter_id=None):
    """Query with privilege protection and audit logging."""

    # Retrieve documents (privilege-filtered for non-partners)
    results = self._retrieve(question, role, matter_id)

    # Drop any privileged docs that slipped through the metadata filter.
    # (A bare `continue` would only skip the loop iteration — the doc
    # must actually be excluded from the context we build.)
    safe_docs = []
    for doc, meta in zip(
        results["documents"][0], results["metadatas"][0]
    ):
        if meta.get("privileged") and role != "partner":
            # Log the incident and exclude this document
            log_privilege_breach(
                role=role,
                document_id=meta["matter_id"],
                action="blocked",
            )
            continue
        safe_docs.append(doc)

    # Tokenize based on role (defense in depth)
    context = "\n\n".join(safe_docs)
    entities = ROLE_ENTITIES[role]

    if entities is None:
        tokenized = self.blindfold.tokenize(
            context, policy="strict"
        )
    elif len(entities) > 0:
        tokenized = self.blindfold.tokenize(
            context, entities=entities
        )
    else:
        tokenized = None

    # Audit trail
    log_query(
        role=role,
        question=question,
        docs_retrieved=results["ids"][0],
        tokenized=tokenized is not None,
        entities_redacted=entities if entities else [],
    )

    # ... continue with LLM call

Defense in depth: The combination of metadata filtering and PII tokenization means that a single point of failure — whether a mislabeled document, a database permission error, or a retrieval bug — does not result in a privilege waiver. Both layers must fail simultaneously for privileged content to reach the LLM in identifiable form.
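The post-retrieval privilege screen can be exercised in isolation. The sketch below (function name and sample data are ours, for illustration) simulates a privileged memo that slipped past the metadata filter and shows the in-retrieval check still dropping it for an associate:

```python
def screen_privileged(docs, metadatas, role):
    """Drop privileged documents for non-partner roles after retrieval."""
    kept = []
    for doc, meta in zip(docs, metadatas):
        if meta.get("privileged") and role != "partner":
            continue  # a real system would also log a privilege incident
        kept.append(doc)
    return kept

# Simulate layer 1 failing: a privileged memo was retrieved anyway
docs = [
    "Case summary for CF-2024-001.",
    "Strategy memo: counter at $350,000.",
]
metas = [
    {"privileged": False},
    {"privileged": True},  # reached retrieval despite the filter
]
```

For the associate, only the non-privileged summary survives; a partner still sees both — which is precisely the single-point-of-failure tolerance the callout describes.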

Bar Association Compliance

ABA Formal Opinion 512 and a growing number of state bar opinions address the use of AI in legal practice. While these opinions generally permit AI use, they impose specific obligations that directly affect how a legal RAG system should be built.

Informed Client Consent

Clients must be informed that AI tools are used in their representation. This is a disclosure requirement, not a technical one, but your RAG system should make it easy to generate a list of matters where AI was used. The audit logging in the code above provides the foundation for this disclosure.
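As a sketch of how those logs support disclosure, the snippet below aggregates matter IDs out of audit records — assuming the log also captures matter IDs, a hypothetical extension of the `log_query` calls shown earlier:

```python
# Hypothetical audit-log records, assuming each entry records matter IDs
AUDIT_LOG = [
    {"role": "associate", "matter_ids": ["CF-2024-001"], "tokenized": True},
    {"role": "partner", "matter_ids": ["CF-2024-001", "CF-2024-003"],
     "tokenized": False},
    {"role": "paralegal", "matter_ids": ["CF-2024-002"], "tokenized": True},
]

def matters_with_ai_use(log):
    """Collect every matter touched by an AI query, for client disclosure."""
    matters = set()
    for entry in log:
        matters.update(entry["matter_ids"])
    return sorted(matters)
```

Running this over the firm's logs yields the matter list a partner needs when preparing Rule 1.4 disclosures.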

Supervision of AI Outputs

Attorneys must review AI-generated content before relying on it. This is particularly important for RAG systems that synthesize information across multiple case files. The role-based architecture supports this by ensuring that partners and senior associates see the full context (including document sources) while junior staff see appropriately filtered views.

Confidentiality Protections

This is where Blindfold's role-based tokenization directly satisfies the bar's requirements. ABA opinions require “reasonable efforts” to protect client confidentiality when using AI tools. Sending unredacted client data to an LLM provider is difficult to justify as “reasonable.” Tokenization ensures that the LLM provider never receives identifiable client information — satisfying the confidentiality requirement across every role level.

| ABA Requirement | How This Architecture Addresses It |
| --- | --- |
| Informed consent (Rule 1.4) | Audit logs track which matters used AI, enabling client disclosure |
| Supervision (Rule 5.1, 5.3) | Partners see full context for review; role hierarchy enforces oversight |
| Confidentiality (Rule 1.6) | PII tokenized before reaching LLM; provider never sees client data |
| Competence (Rule 1.1) | Understanding and implementing AI safeguards demonstrates technical competence |
| Conflicts (Rule 1.7, 1.9) | Ethical walls enforced via metadata filtering and tokenization |

State Bar Variations

Several state bars have issued their own guidance on AI. California, Florida, New York, and Texas have all weighed in with opinions that largely align with the ABA framework but add state-specific wrinkles. For example, California requires specific client consent before using AI for “substantive legal work,” while Florida emphasizes the duty to understand the AI tool's limitations. The role-based architecture described here satisfies the common thread across all these opinions: protecting client data from unauthorized disclosure through the AI pipeline.

Putting It All Together

Here is the complete flow for a legal RAG query, showing every layer of protection:

  1. User authenticates and their role is determined (partner, associate, paralegal, or client portal user).
  2. Ethical wall check. The system verifies which matter IDs the user is authorized to access. For client portal users, this is their own matter only.
  3. Document retrieval. ChromaDB retrieves relevant documents filtered by authorized matter IDs. Privileged documents are excluded for non-partner roles.
  4. PII tokenization. Retrieved documents are passed through Blindfold with entity lists specific to the user's role. Partners get no tokenization. Associates get contact details tokenized. Paralegals get broader tokenization. Client portal users get strict de-identification.
  5. LLM call. The tokenized context and question are sent to the LLM. The model provider sees only anonymous tokens, never real PII.
  6. Detokenization. The LLM's response is detokenized using the mapping from step 4, restoring real names and details for the authorized user.
  7. Audit log. The query, role, retrieved documents, and tokenization status are recorded for compliance review.
python
# Complete query flow with all protections
def handle_legal_query(user, question):
    # 1. Determine role and authorized matters
    role = user.role  # "partner", "associate", "paralegal", "client_portal"
    authorized_matters = user.assigned_matters

    # 2. Retrieve with ethical wall and privilege filter
    where_filter = {"matter_id": {"$in": authorized_matters}}
    if role != "partner":
        where_filter = {
            "$and": [
                {"matter_id": {"$in": authorized_matters}},
                {"privileged": False},
            ]
        }

    results = collection.query(
        query_texts=[question], n_results=5, where=where_filter
    )
    context = "\n\n".join(results["documents"][0])

    # 3. Role-based tokenization
    entities = ROLE_ENTITIES[role]
    mapping = {}

    if entities is None:
        tokenized = blindfold.tokenize(context, policy="strict")
        context = tokenized.text
        mapping = tokenized.mapping
    elif len(entities) > 0:
        tokenized = blindfold.tokenize(context, entities=entities)
        context = tokenized.text
        mapping = tokenized.mapping

    # 4. LLM call with tokenized context
    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Legal research assistant."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    ).choices[0].message.content

    # 5. Detokenize for the end user
    if mapping:
        response = blindfold.detokenize(response, mapping).text

    # 6. Audit log
    log_query(
        role=role,
        question=question,
        docs=results["ids"][0],
        tokenized=bool(mapping),
    )

    return response

Try It Yourself

Build a role-based legal RAG system using the complete cookbook examples. The RBAC examples include role definitions, document metadata, ethical wall filtering, and privilege protection out of the box.