Build CCPA-Compliant Data Pipelines for SaaS Platforms with GlobalShield API
Learn how to build CCPA-compliant data pipelines that detect, redact, and handle California consumer PII using GlobalShield API in Python. Covers opt-out flows and data deletion.

California's Consumer Privacy Act (CCPA) and its amendment, CPRA, give California residents four core rights: the right to know what personal data you collect, the right to delete it, the right to opt out of its sale, and the right to non-discrimination for exercising those rights.
For SaaS companies, these aren't just legal obligations — they're engineering requirements. Handling a deletion request means knowing exactly where that user's PII lives across your databases, logs, and data pipelines. Most teams don't have that visibility until a request arrives.
This guide shows how to build CCPA-compliant data pipelines using the GlobalShield API — automating PII detection, tagging, redaction, and deletion tracking across your entire data stack.
What CCPA Requires in Practice
The California Privacy Protection Agency (CPPA) reached a $2.75 million settlement in early 2026 — the largest to date — against a streaming company for failing to honor opt-out requests within the required 15-business-day window.
The technical requirements that trip up most SaaS teams:
- Data inventory: Know every place California resident PII is stored
- Deletion within 45 days: Fulfill verified deletion requests across all systems
- Opt-out propagation: Stop selling/sharing data within 15 business days
- Audit trail: Prove compliance with timestamped records
GlobalShield automates step one (finding PII) and integrates with your existing deletion workflows.
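Those two deadlines are worth wiring directly into your request tracker rather than eyeballing. A minimal sketch of the date math (calendar days for deletion, business days for opt-out; it ignores the one-time 45-day extension the law permits and does not account for holidays):

```python
from datetime import date, timedelta

def deletion_deadline(request_date: date) -> date:
    """CCPA deletion: 45 calendar days from the verified request."""
    return request_date + timedelta(days=45)

def opt_out_deadline(request_date: date) -> date:
    """Opt-out propagation: 15 business days (skips Sat/Sun only)."""
    d, remaining = request_date, 15
    while remaining > 0:
        d += timedelta(days=1)
        if d.weekday() < 5:  # Monday-Friday
            remaining -= 1
    return d

# Example: request received Monday 2026-03-02
print(deletion_deadline(date(2026, 3, 2)))  # 2026-04-16
print(opt_out_deadline(date(2026, 3, 2)))   # 2026-03-23
```

Treat these as engineering guardrails, not legal advice; your counsel should confirm how verification time and extensions affect the clock.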
Step 1: Setup
```bash
pip install requests pandas python-dotenv
```

```python
import os
import hashlib
import requests
from datetime import datetime
from dotenv import load_dotenv

load_dotenv()

# Set GLOBALSHIELD_API_KEY in your .env file
GLOBALSHIELD_KEY = os.getenv("GLOBALSHIELD_API_KEY")
GLOBALSHIELD_BASE = "https://apivult.com/api/globalshield"

HEADERS = {
    "X-RapidAPI-Key": GLOBALSHIELD_KEY,
    "Content-Type": "application/json"
}
```

Step 2: PII Detection Across Data Sources
```python
def scan_text_for_pii(text: str, source_label: str) -> dict:
    """
    Scan arbitrary text for California-regulated PII categories.
    CCPA covers: name, email, phone, SSN, driver's license,
    financial account numbers, IP address, biometrics, geolocation.
    """
    payload = {
        "text": text,
        "regulations": ["CCPA", "CPRA"],
        "detection": {
            "categories": [
                "name", "email", "phone", "ssn", "drivers_license",
                "financial_account", "ip_address", "geolocation",
                "precise_geolocation", "biometric_identifier"
            ],
            "confidence_threshold": 0.85,
            "include_context": True
        },
        "metadata": {
            "source": source_label,
            "scanned_at": datetime.utcnow().isoformat()
        }
    }
    resp = requests.post(
        f"{GLOBALSHIELD_BASE}/detect",
        json=payload,
        headers=HEADERS
    )
    resp.raise_for_status()
    return resp.json()["data"]

def scan_database_export(records: list[dict], source_label: str) -> list[dict]:
    """Scan a batch of database records for PII."""
    flagged = []
    for record in records:
        # Combine all text fields into one blob for scanning
        text_blob = " | ".join(
            str(v) for v in record.values() if v is not None
        )
        scan = scan_text_for_pii(text_blob, source_label)
        if scan["pii_detected"]:
            flagged.append({
                "record_id": record.get("id", "unknown"),
                "source": source_label,
                "pii_categories": scan["detected_categories"],
                "risk_level": scan["risk_level"],
                "entities": scan["entities"]
            })
    return flagged
```

Step 3: Redaction for Analytics and Logging
Logs and analytics pipelines often contain PII that should never be stored. Use GlobalShield to redact before persisting.
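As defense in depth, you can also mask obvious identifiers at the logging layer itself, so nothing PII-shaped reaches a handler even when the redaction API is skipped or unreachable. A minimal sketch using a regex-based `logging.Filter` (this local fallback is an assumption of this guide, not part of the GlobalShield API, and its single email pattern is illustrative only):

```python
import logging
import re

# Illustrative pattern; a production filter would cover more categories
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

class PIIMaskFilter(logging.Filter):
    """Mask email addresses in log messages before any handler persists them."""
    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = EMAIL_RE.sub("***", str(record.msg))
        return True  # never drop the record, only scrub it

logger = logging.getLogger("app")
logger.addFilter(PIIMaskFilter())
```

A filter attached to the logger runs before formatting, so the raw address never hits disk or a log aggregator.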
```python
def redact_for_analytics(text: str, strategy: str = "mask") -> str:
    """
    Redact PII from text before storing in analytics or logs.
    strategy: 'mask' (replace with ***), 'pseudonymize' (replace with token),
    'generalize' (replace with category label)
    """
    payload = {
        "text": text,
        "redaction": {
            "strategy": strategy,
            "regulations": ["CCPA"],
            "preserve_format": True  # Keeps structure, redacts values
        }
    }
    resp = requests.post(
        f"{GLOBALSHIELD_BASE}/redact",
        json=payload,
        headers=HEADERS
    )
    resp.raise_for_status()
    return resp.json()["data"]["redacted_text"]

# Example: Redact user activity log before writing to data warehouse
def process_event_log(raw_event: dict) -> dict:
    """Strip PII from event log before storing in analytics."""
    event = raw_event.copy()
    # Redact user-agent string, IP, and any user-supplied text
    fields_to_redact = ["user_agent", "ip_address", "search_query", "feedback_text"]
    for field in fields_to_redact:
        if field in event and event[field]:
            event[field] = redact_for_analytics(str(event[field]))
    event["pii_processed"] = True
    event["processed_at"] = datetime.utcnow().isoformat()
    return event
```

Step 4: Handle Deletion Requests (Right to Erasure)
```python
# Pseudonymization map: maps real user IDs to anonymous tokens
# Stored separately from the data — allows reversal for deletion
PSEUDONYM_STORE = {}  # In production: store in encrypted Redis/DB

def pseudonymize_user_id(real_id: str) -> str:
    """Create a consistent anonymous token for a user ID."""
    token = hashlib.sha256(f"ccpa-salt-{real_id}".encode()).hexdigest()[:16]
    PSEUDONYM_STORE[token] = real_id
    return token

def process_deletion_request(
    user_id: str,
    data_stores: list[str],
    request_date: str
) -> dict:
    """
    Orchestrate a CCPA deletion request across data stores.
    Returns a compliance audit record.
    """
    deletion_log = {
        "user_id_hash": hashlib.sha256(user_id.encode()).hexdigest(),
        "request_received": request_date,
        "deadline": "45 days from request date",
        "stores_processed": [],
        "status": "IN_PROGRESS"
    }
    for store in data_stores:
        # In a real system, each store handler would delete/anonymize records
        # Here we log the action for the audit trail
        deletion_log["stores_processed"].append({
            "store": store,
            "action": "DELETION_QUEUED",
            "queued_at": datetime.utcnow().isoformat()
        })
        print(f"  Queued deletion for {user_id} in {store}")
    deletion_log["status"] = "DELETION_QUEUED"
    deletion_log["audit_id"] = hashlib.sha256(
        f"{user_id}-{request_date}".encode()
    ).hexdigest()[:12].upper()
    return deletion_log

def verify_deletion_completion(deletion_log: dict, completed_stores: list[str]) -> dict:
    """Mark deletion complete and generate compliance certificate."""
    deletion_log["completed_stores"] = completed_stores
    deletion_log["completed_at"] = datetime.utcnow().isoformat()
    all_stores = {s["store"] for s in deletion_log["stores_processed"]}
    if set(completed_stores) >= all_stores:
        deletion_log["status"] = "DELETION_COMPLETE"
        deletion_log["compliance_status"] = "CCPA_COMPLIANT"
    else:
        pending = all_stores - set(completed_stores)
        deletion_log["status"] = "PARTIAL"
        deletion_log["pending_stores"] = list(pending)
    return deletion_log
```

Step 5: Opt-Out Signal Propagation
When a California resident opts out of data sale/sharing, you have 15 business days to stop. Automate the propagation:
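Note that opt-out signals do not only arrive through your form: California regulations also require honoring the Global Privacy Control signal browsers send automatically. A minimal detection sketch (per the GPC specification, the request header is `Sec-GPC: 1`; the function name is ours):

```python
def honors_gpc(headers: dict[str, str]) -> bool:
    """Return True if the request carries a Global Privacy Control opt-out.

    Per the GPC spec, user agents send the header `Sec-GPC: 1`.
    Header names are matched case-insensitively, as HTTP requires.
    """
    normalized = {k.lower(): v for k, v in headers.items()}
    return normalized.get("sec-gpc", "").strip() == "1"
```

On a `True` result, treat the request like any other opt-out and feed it into the propagation flow below.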
```python
def process_opt_out(user_id: str, opt_out_timestamp: str) -> dict:
    """
    Record and propagate a CCPA opt-out signal.
    Must reach all downstream sharing partners within 15 business days.
    """
    opt_out_record = {
        "user_id_hash": hashlib.sha256(user_id.encode()).hexdigest(),
        "opt_out_type": "DO_NOT_SELL_OR_SHARE",
        "received_at": opt_out_timestamp,
        "propagated_to": [],
        "compliance_deadline": "15 business days"
    }
    # Downstream partners to notify (anonymized IDs, not real names)
    downstream_partners = ["analytics_platform", "advertising_network", "data_enrichment"]
    for partner in downstream_partners:
        # In production: send API call to each partner's opt-out endpoint
        opt_out_record["propagated_to"].append({
            "partner": partner,
            "notified_at": datetime.utcnow().isoformat(),
            "status": "SENT"
        })
    opt_out_record["propagation_status"] = "COMPLETE"
    return opt_out_record
```

Step 6: CCPA Data Inventory Scanner
Run this periodically to maintain your data inventory:
```python
def run_ccpa_inventory_scan(
    data_sources: dict[str, list[dict]]
) -> dict:
    """
    Scan all data sources and produce a CCPA data inventory report.
    data_sources: { "source_name": [records], ... }
    """
    inventory = {
        "scan_date": datetime.utcnow().isoformat(),
        "sources_scanned": len(data_sources),
        "pii_findings": [],
        "summary_by_category": {}
    }
    for source_name, records in data_sources.items():
        print(f"Scanning {source_name} ({len(records)} records)...")
        findings = scan_database_export(records, source_name)
        for finding in findings:
            for category in finding["pii_categories"]:
                inventory["summary_by_category"][category] = (
                    inventory["summary_by_category"].get(category, 0) + 1
                )
        inventory["pii_findings"].extend(findings)
    inventory["total_records_with_pii"] = len(inventory["pii_findings"])
    inventory["risk_assessment"] = (
        "HIGH" if inventory["total_records_with_pii"] > 1000
        else "MEDIUM" if inventory["total_records_with_pii"] > 100
        else "LOW"
    )
    return inventory
```

CCPA Compliance Checklist for SaaS
| Requirement | Technical Implementation |
|---|---|
| Privacy notice | Disclose PII categories collected on signup |
| Right to know | API endpoint to return all data for a user_id |
| Right to delete | Deletion pipeline covering all data stores |
| Right to opt out | GPC signal support + opt-out form → propagation |
| Data minimization | PII redaction in analytics/logs before storage |
| 15-day opt-out deadline | Automated propagation on opt-out event |
| 45-day deletion deadline | Queued deletion with SLA monitoring |
| Audit trail | Immutable logs for all privacy operations |
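The audit-trail row deserves a sketch of its own. One common pattern (not GlobalShield-specific) is to hash-chain privacy operations so that any after-the-fact edit to the log is detectable:

```python
import hashlib
import json

def append_audit(chain: list[dict], event: dict) -> dict:
    """Append an event whose hash covers the previous entry's hash."""
    prev = chain[-1]["hash"] if chain else "GENESIS"
    body = json.dumps(event, sort_keys=True)
    entry = {
        "event": event,
        "prev": prev,
        "hash": hashlib.sha256(f"{prev}|{body}".encode()).hexdigest(),
    }
    chain.append(entry)
    return entry

def chain_valid(chain: list[dict]) -> bool:
    """Recompute every hash; any tampered entry breaks the chain."""
    prev = "GENESIS"
    for entry in chain:
        body = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256(f"{prev}|{body}".encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True
```

In practice the chain would live in append-only storage; the hash linkage just makes silent edits to deletion and opt-out records provably visible.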
The Cost of Non-Compliance
The CPPA's $2.75 million settlement in February 2026 for a delayed opt-out response signals that California regulators are moving from warnings to enforcement. The penalty cap is $7,500 per intentional violation — for companies processing millions of California records, single incidents can reach eight figures.
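To make that concrete, a quick ceiling calculation (the $7,500 figure is the statutory cap cited above; the violation count is illustrative, since each affected consumer record can count as a separate violation):

```python
PER_VIOLATION_CAP = 7_500  # CCPA cap per intentional violation, in USD

def worst_case_exposure(violations: int) -> int:
    """Statutory ceiling, assuming every violation is deemed intentional."""
    return violations * PER_VIOLATION_CAP

# Just 5,000 intentional violations already lands in eight figures
print(f"${worst_case_exposure(5_000):,}")  # $37,500,000
```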
Automate your CCPA compliance scanning today. Get started with GlobalShield API and run your first PII inventory scan free.