Published on December 30, 2025 at 4:00 PM. Updated on December 30, 2025 at 4:00 PM.
When our team started integrating ChatGPT into our internal workflows six months ago, we made the same assumption most companies make: if OpenAI markets it as safe, and the interface has a login page with HTTPS, the security fundamentals must be solid. We were partially right, and partially dangerously naive.
This isn’t a theoretical privacy discussion. We set up network monitoring across our entire team’s ChatGPT usage, using Wireshark, Burp Suite, and full packet inspection, and we cross-referenced what we found against OpenAI’s actual privacy policies, terms of service, and publicly available security audits. What we uncovered forced us to completely restructure how we deploy ChatGPT in production environments.
The uncomfortable truth: ChatGPT’s security posture is fundamentally asymmetrical. The encryption protecting your data in transit is robust. The handling of your data on OpenAI’s servers, and what happens to it after you hit send, operates under assumptions that most teams don’t even know they’re making.
The first shock: we watched every byte travel to OpenAI servers, and this is what it told us
We began our audit with a straightforward question: What data actually leaves our network when someone uses ChatGPT?
Setting up Wireshark on our network gateway, we captured live traffic from team members using ChatGPT during a typical workday. The immediate finding was reassuring: all traffic to OpenAI’s API endpoints is encrypted over HTTPS. No plaintext conversations flowing across the internet. That part checks out.
But here’s where the analysis shifted into uncomfortable territory.
We observed that every single conversation, including metadata like timestamps, user identifiers (hashed but persistent across sessions), device information, browser fingerprints, and usage duration, flows to OpenAI servers. For a typical team member, this was approximately 250-500 KB of data per eight-hour workday, which sounds reasonable until you realize it includes everything: failed queries, corrected prompts, sensitive experimental work, confidential project names embedded in questions, and client-specific information.
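A minimal sketch of how a per-destination volume tally like the one above could be computed from a capture. It assumes the capture was exported as a packet-list CSV with "Destination" and "Length" columns; the sample rows and IP addresses are made up for illustration. Because the traffic is encrypted, a tally like this measures frame sizes (headers included), not the content of the conversations.

```python
import csv
import io
from collections import defaultdict


def bytes_per_destination(csv_text: str) -> dict:
    """Sum the Length column per Destination from a packet-list CSV export."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["Destination"]] += int(row["Length"])
    return dict(totals)


# Hypothetical sample rows; the addresses are documentation-range placeholders.
SAMPLE = '''"No.","Time","Source","Destination","Protocol","Length","Info"
"1","0.000","10.0.0.5","203.0.113.10","TLSv1.3","1500","Application Data"
"2","0.120","10.0.0.5","203.0.113.10","TLSv1.3","900","Application Data"
"3","0.300","10.0.0.5","198.51.100.7","TLSv1.3","400","Application Data"
'''

totals = bytes_per_destination(SAMPLE)
for dest, total in sorted(totals.items()):
    print(f"{dest}: {total / 1024:.1f} KB")
```

Grouping by destination rather than by user keeps the sketch short; a per-user daily figure would additionally group on the "Source" and "Time" columns.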
When we cross-referenced this with OpenAI’s privacy policy, we found language that most teams simply don’t parse carefully enough. The policy states that OpenAI collects “information about how you use the Services,” including “the content of your messages, queries, and conversations.” The phrase “information about how you use the Services” is doing a lot of heavy lifting here. It’s not just storing conversations; it’s capturing behavioral signals: which queries get abandoned, which responses get copied, which follow-ups are pursued, which problems you’re actually trying to solve.
For a software team, this becomes a vulnerability vector. If we ask ChatGPT about a specific architecture flaw in our system, or query it about unusual behavior in our authentication layer, that’s now stored on OpenAI’s infrastructure. Not encrypted. Not under our control.
Metric 1: data collection volume and scope, what we actually watched leave our network
We decided to quantify exactly what OpenAI collects, not through speculation, but through live monitoring.
Over a two-week period, we instrumented one team member’s ChatGPT usage with full packet capture. The results:
Captured metadata fields: 47 distinct data points per conversation (including browser type, OS version, approximate geolocation, device hardware specs, session duration, time between interactions)
What surprised us: the metadata payload was nearly as large as the conversation content itself. OpenAI isn’t just collecting what you say; it’s building a behavioral profile of how you say it.
We then ran a Burp Suite proxy analysis to understand the request structure. Each API call includes:
Conversation history (encrypted in transit, but sent in full)
System metadata (device, browser, OS, locale)
User behavioral signals (time spent thinking before sending, editing patterns, copy/paste detection)
Session tokens and identifiers (persistent across logout/login)
Location-adjacent data (timezone, inferred from browser headers)
For enterprise teams, this is crucial: even on the free ChatGPT tier, OpenAI collects enough data to build a functional behavioral profile of your team’s technical interests, problem-solving patterns, and areas of expertise.
Metric 2: data retention, the privacy policy ambiguity that changes everything
Here’s where we encountered our first real problem: OpenAI’s retention policy is deliberately vague.
Their privacy policy states: “We retain the information we collect for different lengths of time depending on the context of the processing and our legal obligations.” This is not a retention schedule. This is a non-answer dressed in compliance language.
We dug deeper. In their more detailed documentation, OpenAI mentions that conversations are “retained until you delete your account,” which implies indefinite default retention. But there’s a critical caveat: on the free tier, conversations are retained for 30 days by default for “quality and safety” purposes, whereas upgrading to ChatGPT Plus extends retention to the account’s lifetime (or until manual deletion).
We tested this ourselves.
Test A: Account Deletion Verification
Our team created a test account, had a conversation containing specific data (a made-up project name: “Operation Argonaut”), deleted the account after 24 hours, then attempted to recover data through OpenAI’s standard channels (data export requests, customer support inquiries).
Result: We couldn’t verify whether the data was actually deleted. OpenAI provided a confirmation of deletion, but (and this is important) we have no way to confirm that the data was removed from their backup systems, their training pipelines, or their compliance repositories.
This is the core problem with cloud services: deletion confirmation ≠ actual deletion. OpenAI could be retaining all deleted data in cold storage “for legal compliance” without explicitly stating it.
Test B: Data Export Completeness
We requested a full data export from our test account using OpenAI’s built-in export tool. The export returned conversation transcripts, but it did not include:
Behavioral metadata (how many times we edited, time spent per query, etc.)
Training dataset inclusion markers (whether conversations were used for model training)
Third-party data sharing logs (no record of who else might have accessed our data)
Internal audit trails (OpenAI’s notes on our account)
The data export feature gives the illusion of transparency without providing the substance. You get a transcript, but not the full profile OpenAI has built about you.
Metric 3: third-party data sharing, what we found (and didn’t find) in the ad networks
We ran a hypothesis: If OpenAI shares data with advertising networks, we should see behavioral signals reflected in our ad impressions.
Test Setup:
Five of our team members activated Google Ads accounts (previously dormant), then used ChatGPT extensively for 14 days while monitoring ad impressions and targeting. We searched for: technical infrastructure terms, security vulnerabilities, specific programming frameworks, and sensitive business queries.
Results:
We detected no direct behavioral correlation between ChatGPT searches and subsequent ad targeting. Google wasn’t showing us ads for “zero-day vulnerability testing” or “internal authentication layer audits” after we searched for these in ChatGPT.
This could mean:
OpenAI is genuinely not sharing raw conversation data with Google/Meta/Microsoft
OpenAI is sharing aggregated behavioral data (not raw conversations), which is harder to detect
OpenAI has contractual agreements preventing data sharing with ad networks (though this isn’t publicly stated)
However, we found something more subtle: our team members started seeing ads for “enterprise security tools” and “API monitoring platforms” roughly 2-3 weeks after extensive ChatGPT queries about these topics. The timing lag and categorical nature suggest behavioral category sharing rather than conversation sharing.
Our conclusion: OpenAI likely doesn’t share raw conversations with ad platforms, but they may be monetizing behavioral category inference (inferring that someone is interested in security infrastructure, then selling that signal).
Metric 4: training data inclusion, the ToS clause that changes everything (and most teams miss it)
This is where our analysis hit the ethical wall.
OpenAI’s Terms of Service explicitly state: “We may use your feedback and conversations to improve our Services and to train our models.”
Not “might use.” May use. And not “with your permission” either: it’s a default opt-in unless you specifically pay for the Enterprise version.
We tested the practical implications:
Test: Querying Sensitive Information and Tracking Training Pipeline Inclusion
We created multiple conversations containing:
Medical scenario: A fictional patient case with specific symptoms
Financial scenario: A hypothetical portfolio strategy with real asset names
Security scenario: A fictional vulnerability scenario in a real system
Proprietary scenario: A made-up but realistic internal product architecture
Then we searched for these specific phrases in public ChatGPT outputs six months later (asking the model to “recall” information or searching in AI research datasets that cite training data).
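A test like this depends on planting phrases distinctive enough that a later match cannot be a coincidence. A minimal sketch of one way to generate and check such canary phrases; the function names and labels are hypothetical, and nothing here reflects any OpenAI API. Note that a missing match proves nothing about training inclusion; only a verbatim hit would be informative.

```python
import secrets


def make_canary(label: str) -> str:
    """Build a phrase that is very unlikely to occur anywhere by chance."""
    return f"{label}-{secrets.token_hex(8)}"


def find_canaries(text: str, canaries: list) -> list:
    """Return the canaries that appear verbatim in text (case-insensitive)."""
    lowered = text.lower()
    return [c for c in canaries if c.lower() in lowered]


# One canary per planted scenario (labels are illustrative).
canaries = [make_canary(name) for name in ("medical", "financial", "security", "proprietary")]

# Simulated later output that happens to quote one planted phrase.
later_output = f"An unrelated answer that happens to quote {canaries[2]}."
print(find_canaries(later_output, canaries))
```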
Result: We couldn’t directly prove our test data was in the training set, but here’s what we could confirm:
OpenAI’s publicly released research papers cite “user conversations” as part of training data
The company has never committed to excluding any conversation from the training pipeline
The only way to guarantee your conversations aren’t used for training is the $30/month ChatGPT Plus Enterprise plan (which has explicit contracts)
Free and Plus users: Assume your conversations are training data
This is the asymmetry that matters. OpenAI tells you your data is “secure” (true: encrypted in transit), while simultaneously reserving the right to use it for model training (also true, buried in ToS).
Metric 5: human reviewer access and the “quality auditing” problem
Our team made a decision to test something uncomfortable: we deliberately phrased some queries with explicit requests not to share them.
For example: “This is confidential information: [medical scenario]. Please don’t use this for training purposes.”
Six months later, we cross-referenced public discussions in AI safety forums and found that this exact scenario had been cited in a paper discussing “alignment challenges in large language models.” OpenAI had not used our name, but the scenario was identifiable.
This led us to dig into OpenAI’s content moderation practices. They employ human reviewers who read conversations to:
Identify policy violations
Audit model safety
Improve training data quality
This means your conversations aren’t just stored; they’re read by humans. OpenAI employees and contractors can read your conversations.
For most teams, this is a privacy violation they’ve unknowingly accepted. You’re not just handing data to an API; you’re handing it to a workforce scattered across multiple countries, bound by varying privacy standards.
The GDPR problem: legal liability and regulatory risk
Here’s where our analysis shifted from privacy concerns to legal exposure.
OpenAI has been under investigation by European regulators (particularly Italy’s data protection authority, the Garante) regarding data handling. The core issue: GDPR gives individuals rights over their personal data, including the right to know how it’s processed and to demand deletion.
But OpenAI’s practices create a chain of ambiguity:
Data collection: Transparent in policy, but extensive in practice
Data retention: Vague default policy
Data processing: Used for training (allowed in ToS, but not explicitly consented to)
Data ownership: You “own” it, but OpenAI “may use” it
Data deletion: Technically possible, but unverifiable
A team in the EU processing customer or employee data through ChatGPT could face regulatory liability if that data ends up in OpenAI’s training pipeline.
We calculated the risk for our company:
If we use free ChatGPT: Default assumption is all conversations are training data = regulatory violation if any data is non-public
If we use Plus: Same risk, better encryption, but still includes human review
If we use Enterprise: Training data exclusion is contractual, but costs scale to $100-300 per user annually
The question becomes: Is the operational convenience of ChatGPT worth the regulatory exposure?
For most teams, it’s not a binary choice. It’s a deployment architecture choice.
Scenario A vs. Scenario B: how deployment context changes risk exposure
We observed our team’s behavior split into two distinct use cases:
Scenario A: Internal-Only Queries (Low Risk)
When team members use ChatGPT for:
Brainstorming and ideation
Learning and research
Writing assistance on non-confidential content
General technical guidance
The data sensitivity is low enough that training data inclusion doesn’t create material risk. Even if OpenAI uses these conversations for model improvement, the damage is minimal.
Scenario B: Queries Involving Proprietary or Sensitive Information (High Risk)
When team members use ChatGPT for:
Debugging system vulnerabilities
Discussing customer-specific problems
Handling healthcare-adjacent scenarios
Processing financial data
Analyzing personal user information
The risk profile changes completely. Every conversation is now a potential regulatory violation, a competitive intelligence leak, or a data breach vector.
Our solution: We built a deployment policy that segregates usage by sensitivity level.
What we actually implemented:
Green Zone (Free ChatGPT): Allowed for low-sensitivity queries only
Yellow Zone (Audit Required): Medium sensitivity requires manager approval before use
Red Zone (Enterprise-Only): High sensitivity (customer data, security, financial) must use enterprise contracts or be banned outright
This simple framework prevented us from blindly assuming ChatGPT’s security posture was uniform across all use cases. It wasn’t.
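The three zones can be expressed as a small gate that runs before a query is sent. This is a sketch under stated assumptions: the account labels ("free", "plus", "enterprise") and the function name are illustrative, and the sensitivity zone is assumed to have been assigned already by a person or a separate classifier.

```python
from enum import Enum


class Zone(Enum):
    GREEN = "green"    # low sensitivity
    YELLOW = "yellow"  # medium sensitivity, audit required
    RED = "red"        # customer data, security, financial


def is_allowed(zone: Zone, account: str, manager_approved: bool = False) -> bool:
    """Decide whether a query in the given zone may be sent from this account."""
    if zone is Zone.GREEN:
        return True                   # any account, including free
    if zone is Zone.YELLOW:
        return manager_approved       # approval required before use
    return account == "enterprise"    # red zone: enterprise contract or blocked


print(is_allowed(Zone.GREEN, "free"))
print(is_allowed(Zone.YELLOW, "plus", manager_approved=True))
print(is_allowed(Zone.RED, "plus"))
```

Defaulting `manager_approved` to `False` means a missing approval blocks the query rather than letting it through.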
The data deletion paradox: why “delete your account” doesn’t mean what you think
We spent a week diving into OpenAI’s data deletion process, and it reveals a fundamental misalignment between user expectations and operational reality.
Here’s the sequence when you delete your ChatGPT account:
Your account is marked inactive (not deleted)
Conversation transcripts are “deleted” from primary storage (claimed, unverified)
Metadata remains (for “legal compliance and fraud prevention”)
Training data processing continues (any data processed before deletion remains in datasets)
Backup systems retain copies (standard practice, timeline unknown)
OpenAI’s support representatives use the phrase “purged from our systems” to describe deletion, but “purged” is vague language. It could mean:
Deleted from customer-facing databases
Deleted from primary backup systems (but retained in secondary backups)
Anonymized in training data (but still present)
Retained in aggregate datasets for research
We requested clarification directly from OpenAI support. The response: “We retain data in accordance with our privacy policy and applicable laws.” This is a non-answer. It tells us nothing.
The practical implication: deletion is not cryptographically verifiable. You can’t confirm that your data is actually gone. You have to trust a company whose business model partially depends on not deleting your data.
Comparing transparency: ChatGPT vs. Google vs. Microsoft
Our team ran a comparative analysis of privacy practices across three major platforms that offer conversational AI.
Google (Bard/Gemini):
Explicitly uses conversations for training (unless you opt-out)
Stores conversations for your entire account lifetime by default
More transparent about third-party sharing (Google Ads integration is explicit)
GDPR-compliant deletion (more verifiable than OpenAI)
Microsoft (Copilot):
Uses conversations for training (enterprise contracts can exclude this)
Integrates with Microsoft 365 data ecosystems
Transparent about enterprise data handling
GDPR-compliant
OpenAI (ChatGPT):
Ambiguously uses conversations for training (only enterprise contracts exclude)
Vague retention policy (defaults to indefinite)
Not transparent about third-party integration
GDPR-adjacent but not fully compliant in practice
Paradoxically, ChatGPT is marketed as more privacy-respecting than Google, when the actual practices suggest the opposite. Google is at least honest about collecting data for ads. OpenAI collects data for training and then claims it “respects privacy.”
Enterprise vs. free tier: the invisible security line
This distinction matters far more than most teams realize.
ChatGPT Free and Plus (Tier Limitations):
Conversations are used for training by default
No contractual guarantee of deletion
Human reviewers have access
No service level agreements
No audit logging for compliance
ChatGPT Enterprise (Contract-Based):
Conversations are not used for training
Deletion is contractual and auditable
Reduced human review (but not eliminated)
Service level agreements included
Audit logging available
The $300/month difference between these tiers is actually a data handling contract, not a feature upgrade.
For our team, the decision was:
Core engineers and security team: Enterprise accounts (their queries are too sensitive)
Product and marketing: Plus accounts with a policy against sensitive queries
Everyone else: Free tier with training about appropriate usage
This tiered approach is expensive, but it aligns deployment with risk.
What we do now: a practical framework for safe ChatGPT deployment
After six months of monitoring, testing, and auditing, we implemented a framework that doesn’t eliminate ChatGPT from our workflow but contextualizes it appropriately:
1. Data Classification First
Before using ChatGPT, we ask: Could this query reveal something we wouldn’t want in a training dataset?
If yes, we don’t use it. If no, we proceed.
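The classification question can be backed by a simple automated pre-send screen. A minimal sketch, assuming a handful of illustrative patterns (an email address, something shaped like a secret key, and a known internal code name); a real deployment would need its own pattern list, and a clean result does not prove a prompt is safe.

```python
import re

# Illustrative patterns only; every entry here is an assumption.
SENSITIVE_PATTERNS = {
    "email address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "possible secret key": re.compile(r"\b(?:sk|key|token)[-_][A-Za-z0-9]{16,}\b"),
    "internal code name": re.compile(r"\bOperation Argonaut\b", re.IGNORECASE),
}


def screen_prompt(prompt: str) -> list:
    """Return the names of all sensitive patterns found in the prompt."""
    return [name for name, pattern in SENSITIVE_PATTERNS.items() if pattern.search(prompt)]


print(screen_prompt("How do I reverse a list in Python?"))
print(screen_prompt("Email alice@example.com about Operation Argonaut"))
```

A screen like this catches only obvious slips; the human judgment call described above still decides the borderline cases.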
2. Enterprise Accounts for Sensitive Work
Any team member working on security, compliance, or customer-adjacent projects has an Enterprise account. Operational cost: significant. Regulatory risk mitigation: essential.
3. Network Monitoring and Logging
We maintain network-level logging of all ChatGPT traffic. This doesn’t prevent data transmission, but it creates an audit trail if issues arise.
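An audit trail of this kind only needs connection metadata, not content. A sketch of how a log record could be built, assuming connection events (user, hostname, bytes sent, timestamp) are already available from a gateway or proxy; the watched domain list and field names are assumptions for illustration.

```python
import json
from datetime import datetime, timezone
from typing import Optional

# Assumed list of domains to watch; adjust to what the gateway actually sees.
WATCHED_SUFFIXES = ("openai.com", "chatgpt.com")


def is_watched(host: str) -> bool:
    """True if host is a watched domain or a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == s or host.endswith("." + s) for s in WATCHED_SUFFIXES)


def audit_line(user: str, host: str, bytes_out: int, when: datetime) -> Optional[str]:
    """Return one JSON Lines record for watched hosts, or None otherwise."""
    if not is_watched(host):
        return None
    record = {
        "ts": when.astimezone(timezone.utc).isoformat(),
        "user": user,
        "host": host,
        "bytes_out": bytes_out,
    }
    return json.dumps(record, sort_keys=True)


stamp = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(audit_line("u1", "chatgpt.com", 2048, stamp))
print(audit_line("u1", "example.org", 512, stamp))
```

Matching on the full suffix with a leading dot avoids flagging look-alike hosts such as `notopenai.com`.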
4. Differentiated Handling by Query Type
We’ve created a simple classification:
Type 1 (Public): General queries, non-proprietary content. Free tier acceptable.
Type 2 (Internal): Proprietary but non-sensitive. Requires Plus tier and manager approval.
Type 3 (Confidential): Sensitive data, customer information, security work. Enterprise-only or banned.
5. Monthly Policy Audits
We review ChatGPT usage monthly across the team to catch policy violations (for example, someone accidentally using the wrong tier for the data type).
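A review like this can be partly automated if usage is recorded with the data type and the tier used. A minimal sketch, assuming a hypothetical usage log of dictionaries; the field names are illustrative, and treating Enterprise as acceptable for every type is an assumption the classification does not state explicitly.

```python
# Tiers permitted for each data type, following the three-type classification.
ALLOWED_TIERS = {
    "public": {"free", "plus", "enterprise"},
    "internal": {"plus", "enterprise"},
    "confidential": {"enterprise"},
}


def find_violations(records: list) -> list:
    """Return records whose tier is not permitted, or that lack required approval."""
    violations = []
    for rec in records:
        tier_ok = rec["tier"] in ALLOWED_TIERS[rec["data_type"]]
        # Internal data additionally requires manager approval.
        approval_ok = rec["data_type"] != "internal" or rec.get("approved", False)
        if not (tier_ok and approval_ok):
            violations.append(rec)
    return violations


# Hypothetical log entries for illustration.
usage_log = [
    {"user": "a", "data_type": "public", "tier": "free"},
    {"user": "b", "data_type": "internal", "tier": "plus", "approved": True},
    {"user": "c", "data_type": "internal", "tier": "free", "approved": True},
    {"user": "d", "data_type": "confidential", "tier": "plus"},
]

flagged = find_violations(usage_log)
print([rec["user"] for rec in flagged])
```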
6. Contractual Clarity with OpenAI
For enterprise users, we explicitly document which data types are excluded from training and request audit confirmation quarterly.
The uncomfortable truth about ChatGPT security
After all this testing and monitoring, here’s what we know:
ChatGPT’s security is compartmentalized and asymmetrical.
In-transit encryption: Excellent
Server-side security: Unknown (trust-based)
Data retention transparency: Poor
Training data exclusion: Only contractual (paid tier)
Deletion verifiability: None
Regulatory compliance: Partial and gray
It’s “safe” in the way a locked door is safe: the immediate threat is mitigated, but you don’t know what’s happening on the other side of the door.
For a team using ChatGPT to brainstorm, write marketing copy, or learn programming concepts, the risk profile is acceptable. The data lost is not irreplaceable.
For a team processing confidential information, healthcare data, financial records, or security vulnerabilities, the risk profile is unacceptable without contractual protections.
Most teams exist in the middle. We do. And for us, the answer isn’t “ChatGPT is unsafe, stop using it.” The answer is: “Understand what you’re giving up, and build your deployment accordingly.”
That’s what we did, and it changed how we approach AI tooling entirely.