After you detect PII with detect_pii(), pass the findings to mask() to replace each sensitive span with a safe placeholder. mask() never modifies the original text; it always returns a new string.

Basic usage

from flexorch_audit import detect_pii, mask

text = "Email us at hello@example.com or call +49 30 1234567."

findings = detect_pii(text)
masked = mask(text, findings)

print(masked)
# Email us at [MASKED_EMAIL] or call [MASKED_PHONE_DE].
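
To make the span replacement concrete, here is a minimal, self-contained sketch of the idea. It does not use flexorch_audit: toy_detect(), toy_mask(), and the (start, end, type) finding format are hypothetical stand-ins, and the real detect_pii() output may be shaped differently.

```python
import re

# Hypothetical finding format: (start, end, pii_type).
# This is an assumption for illustration, not the library's actual format.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def toy_detect(text):
    """Find email addresses only; a stand-in for a real detector."""
    return [(m.start(), m.end(), "EMAIL") for m in EMAIL_RE.finditer(text)]


def toy_mask(text, findings):
    """Return a new string with each finding replaced by [MASKED_<TYPE>]."""
    result = text
    # Replace from the last span to the first so earlier offsets stay valid.
    for start, end, pii_type in sorted(findings, reverse=True):
        result = result[:start] + f"[MASKED_{pii_type}]" + result[end:]
    return result


text = "Email us at hello@example.com or write to sales@example.org."
masked = toy_mask(text, toy_detect(text))

print(masked)
# Email us at [MASKED_EMAIL] or write to [MASKED_EMAIL].
```

Working from the last span backwards matters because each placeholder usually differs in length from the text it replaces, which would shift every offset after it.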

Masking strategies

mask() supports four strategies. Pass your chosen strategy as the third argument, strategy.
Strategy          Example output     Best for
redact (default)  [MASKED_EMAIL]     Production datasets and compliance logs
replace           user@example.com   Plausible synthetic data for testing
token             <EMAIL_1>          Structure-preserving NLP pipelines
hash              a3f2b19c...        Deterministic, non-reversible anonymization
# Token strategy — keeps grammatical structure intact
masked = mask(text, findings, strategy="token")
# Email us at <EMAIL_1> or call <PHONE_DE_1>.

# Hash strategy — same input always produces the same hash
masked = mask(text, findings, strategy="hash")

# Replace strategy — substitutes plausible synthetic values
masked = mask(text, findings, strategy="replace")
Use token when you need to preserve sentence structure for downstream NLP. Use hash when you need to consistently anonymize the same value across multiple documents.
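
The difference between the two placeholder styles can be sketched without the library. The helpers below are hypothetical illustrations, not the flexorch_audit implementation: the choice of SHA-256, the 8-character digest prefix, and the rule that a repeated value reuses its token are all assumptions.

```python
import hashlib
from collections import defaultdict


def hash_placeholder(value, length=8):
    """Deterministic placeholder: the same value always yields the same digest prefix."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:length]


def token_placeholders(values_with_types):
    """Number each distinct value per type: <EMAIL_1>, <EMAIL_2>, ..."""
    counters = defaultdict(int)
    assigned = {}
    for value, pii_type in values_with_types:
        key = (pii_type, value)
        if key not in assigned:  # assumption: a repeated value reuses its token
            counters[pii_type] += 1
            assigned[key] = f"<{pii_type}_{counters[pii_type]}>"
    return assigned


# The same email hashed in two different documents gives the same placeholder.
in_doc_a = hash_placeholder("hello@example.com")
in_doc_b = hash_placeholder("hello@example.com")
print(in_doc_a == in_doc_b)
# True

tokens = token_placeholders([
    ("hello@example.com", "EMAIL"),
    ("+49 30 1234567", "PHONE_DE"),
    ("sales@example.org", "EMAIL"),
])
print(tokens[("EMAIL", "sales@example.org")])
# <EMAIL_2>
```

Tokens are only meaningful within one masking run, while a hash stays stable across runs and documents. An unsalted hash of a guessable value such as an email address can be matched by hashing candidate values, so it is not a substitute for access control.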

One-liner: redact_for_llm()

When you want to detect and mask in a single call, use redact_for_llm(), which is optimized for preparing text as LLM input. It runs detect_pii() and mask() internally and returns the cleaned text alongside a compact summary.
from flexorch_audit import redact_for_llm

clean_text, summary = redact_for_llm(text)

print(clean_text)
# Email us at [MASKED_EMAIL] or call [MASKED_PHONE_DE].

print(summary)
# {"count": 2, "types": ["email", "phone_de"]}
redact_for_llm() always uses the redact strategy. If you need a different strategy, call detect_pii() and mask() separately.
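
If you want the same text-plus-summary result with another strategy, a small wrapper is one option. The sketch below is an assumption-heavy illustration: redact_with_strategy() is a hypothetical helper, the findings are assumed to be dicts with a "type" key, and fake_detect() and fake_mask() are stand-ins so the example runs without the library.

```python
def redact_with_strategy(text, detect, mask, strategy="token"):
    """Detect, then mask with the given strategy; return (clean_text, summary)."""
    findings = detect(text)
    clean_text = mask(text, findings, strategy=strategy)
    # Assumption: each finding is a dict with a "type" key.
    types = []
    for finding in findings:
        if finding["type"] not in types:
            types.append(finding["type"])
    return clean_text, {"count": len(findings), "types": types}


def fake_detect(text):
    """Stand-in detector: finds one hard-coded email address."""
    needle = "hello@example.com"
    start = text.find(needle)
    if start == -1:
        return []
    return [{"type": "email", "start": start, "end": start + len(needle)}]


def fake_mask(text, findings, strategy="redact"):
    """Stand-in masker supporting only 'redact' and 'token'."""
    out = text
    # Numbering is by position, which is adequate for this one-finding demo.
    for i, f in reversed(list(enumerate(findings, start=1))):
        label = f["type"].upper()
        placeholder = f"<{label}_{i}>" if strategy == "token" else f"[MASKED_{label}]"
        out = out[:f["start"]] + placeholder + out[f["end"]:]
    return out


clean_text, summary = redact_with_strategy(
    "Email us at hello@example.com.", fake_detect, fake_mask, strategy="token"
)
print(clean_text)
# Email us at <EMAIL_1>.
print(summary["count"], summary["types"])
# 1 ['email']
```

With the real library you would presumably pass detect_pii and mask in place of the stand-ins and adjust the summary code to match the actual shape of the findings.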