LangChain Integration with flexorch-audit AuditedLoader

AuditedLoader is a LangChain-compatible document loader that audits each file before it enters your chain. It masks PII in place and skips documents that fall below your minimum quality grade, so your vector store only receives content that has passed both checks.

Install

Install both packages

pip install flexorch-audit langchain-community

Load documents with AuditedLoader

from flexorch_audit.integrations.langchain import AuditedLoader

loader = AuditedLoader(
    file_paths=["contracts/agreement.pdf", "invoices/inv_001.pdf"],
    min_grade="B",                        # Skip documents graded C or D
    mask_pii=True,                        # Replace PII before loading
    locales=["universal", "tr", "de"],    # Restrict detection to these jurisdictions
)

docs = loader.load()

for doc in docs:
    print(doc.metadata["quality_grade"])       # "A"
    print(doc.metadata["pii_findings_count"])  # 2
    print(doc.page_content[:200])              # PII already masked

Documents that don’t meet min_grade are excluded from the returned list. Check loader.skipped after calling load() to see which files were dropped and why.
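The threshold rule can be pictured as an ordered comparison of letter grades. The sketch below is not the library's implementation: `GRADE_ORDER` and `split_by_grade` are hypothetical names, the file names and grades are made up, and it assumes grades rank from A (best) to D (worst), as the `min_grade` values listed under Parameters suggest.

```python
# Illustrative only: a stand-in for the min_grade rule, not flexorch-audit code.
GRADE_ORDER = ["A", "B", "C", "D"]  # assumed ranking, best to worst


def split_by_grade(graded_files, min_grade):
    """Split (path, grade) pairs into a kept list and a skipped list."""
    cutoff = GRADE_ORDER.index(min_grade)
    kept, skipped = [], []
    for path, grade in graded_files:
        # A grade passes when it ranks at or above the cutoff.
        if GRADE_ORDER.index(grade) <= cutoff:
            kept.append(path)
        else:
            skipped.append((path, grade))
    return kept, skipped


graded = [("agreement.pdf", "A"), ("inv_001.pdf", "C"), ("memo.pdf", "B")]
kept, skipped = split_by_grade(graded, min_grade="B")
print(kept)     # ['agreement.pdf', 'memo.pdf']
print(skipped)  # [('inv_001.pdf', 'C')]
```

With the default `min_grade="D"`, every grade passes under this ordering, so nothing would be skipped.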

Parameters

Parameter	Type	Default	Description
`file_paths`	`list[str]`	required	Paths to the documents you want to load
`min_grade`	`str`	`"D"`	Minimum quality grade to include (`"A"`, `"B"`, `"C"`, or `"D"`)
`mask_pii`	`bool`	`True`	Replace PII spans with `[MASKED_...]` placeholders before loading
`locales`	`list[str]`	all	Restrict PII detection to specific jurisdictions (e.g. `"tr"`, `"de"`, `"us"`)

In a RAG pipeline

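Use AuditedLoader as a drop-in replacement for any LangChain document loader. The documents it returns are already masked and quality-filtered, so you can pass them straight to your embeddings and vector store. The example below also needs the langchain-openai and faiss-cpu packages, which the install command above does not cover, and an OpenAI API key for the embeddings.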

from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from flexorch_audit.integrations.langchain import AuditedLoader

loader = AuditedLoader(
    file_paths=["docs/"],
    min_grade="B",
    mask_pii=True,
)
docs = loader.load()

embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(docs, embeddings)

retriever = vectorstore.as_retriever()

Documents graded below min_grade are skipped without raising an error. After calling load(), inspect loader.skipped for the excluded files and their grades.
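Because skipping raises no error, a strict `min_grade` can leave you with an empty list, and building a vector index from nothing is rarely what you want. One way to make that case visible is a small guard before indexing. This is a minimal sketch: `require_documents` is a hypothetical helper, not part of flexorch-audit, and it converts each skipped entry with `str()` so that it assumes nothing about the shape of those entries.

```python
def require_documents(docs, skipped):
    """Return docs unchanged, or raise if the quality filter removed everything."""
    if not docs:
        # str() is used because the exact shape of a skipped entry is not assumed.
        details = "; ".join(str(entry) for entry in skipped) or "no details available"
        raise ValueError(f"No documents passed the quality filter: {details}")
    return docs


# Plain lists stand in for loader.load() and loader.skipped here.
docs = require_documents(["doc-1", "doc-2"], skipped=[])
print(len(docs))  # 2
```

In the pipeline above, this would be called as `require_documents(loader.load(), loader.skipped)`. Python evaluates arguments left to right, so `load()` runs before `skipped` is read.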

As a pre-processing step

If you’re working with user-supplied text rather than files, use redact_for_llm() inline with a RunnableLambda to strip PII before it reaches your chain.

from langchain_core.runnables import RunnableLambda
from flexorch_audit import redact_for_llm

# redact_for_llm returns (clean_text, summary); index [0] keeps only the text.
# your_chain is a placeholder for the runnable you already have.
safe_chain = RunnableLambda(lambda text: redact_for_llm(text)[0]) | your_chain

redact_for_llm() always applies the redact strategy, replacing each PII span with a [MASKED_TYPE] label. See Masking if you need a different strategy.
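To make the redact strategy and the `(clean_text, summary)` return shape concrete, here is a toy stand-in that masks email addresses only. It is not how redact_for_llm() detects PII: `toy_redact`, the simplified regex, the `EMAIL` label, and the summary format are all assumptions made for illustration.

```python
import re

# Illustrative only: a simplified email pattern, not the library's detector.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def toy_redact(text):
    """Return (clean_text, summary), mirroring the tuple shape described above."""
    # subn() returns the rewritten string and the number of replacements made.
    clean_text, count = EMAIL_RE.subn("[MASKED_EMAIL]", text)
    summary = {"EMAIL": count}  # hypothetical summary format
    return clean_text, summary


clean, summary = toy_redact("Contact ada@example.com or bob@example.org today.")
print(clean)    # Contact [MASKED_EMAIL] or [MASKED_EMAIL] today.
print(summary)  # {'EMAIL': 2}
```

Indexing the result with `[0]`, as the chain above does, discards the summary and passes only the masked text downstream.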
