The secret data collector: Who is really behind the verifications of LinkedIn, OpenAI & Co

Published on: May 2, 2026 / Updated on: May 2, 2026 – Author: Konrad Wolfenstein

Facial scans for the internet: How an invisible US startup is hoarding our biometric data

Source code leak reveals: What really happens to your ID data on Reddit, Roblox, and LinkedIn

Anyone who navigates the internet today increasingly has to prove their identity. Whether it's for the coveted LinkedIn "checkmark," access to OpenAI's powerful AI models, age verification on Reddit, or chatting on Roblox – reaching for a passport and taking the obligatory video selfie are becoming the new normal. But while users believe they are entrusting their sensitive biometric data directly to the respective platforms, an invisible power operates in the background: the US startup "Persona." With over 300 million verified identities, the company, backed by influential investors, has become the covert infrastructure of digital life. But the seemingly convenient system has a major flaw: a sensitive source code leak, questionable data retention periods, and close ties to US authorities raise serious data privacy concerns. What really happens to our faces and IDs? And why was Discord the only major platform to pull the plug after massive user protests? An investigation into the hidden server structures of a data giant.

The invisible backbone of the internet: How LinkedIn, Reddit, OpenAI, Roblox and other platforms use Persona, and what that means for your data

One company, 148,000 customers, 300 million data records — and hardly anyone knows about it

When a LinkedIn user verifies their profile, they believe they are dealing with LinkedIn. When a Reddit user confirms their age, they trust Reddit. When a Roblox player holds their face up to the camera, they do so believing Roblox is the party they are interacting with. In reality, it's the same company in all these cases: Persona Identities, Inc., a San Francisco-based startup founded in 2018 that has become the de facto identity infrastructure for large swathes of the internet. By 2024, Persona had completed over 300 million identity verifications, doubling both its revenue and its customer base. The list of platforms that use Persona's services reads like a cross-section of modern digital life, and yet the name Persona remains completely unknown to most users.

This is no coincidence. Persona's business model deliberately relies on invisibility: While the platforms cultivate user relationships and build trust, Persona handles the actual verification in the background. With a valuation of $2 billion after its Series D funding round in April 2025 and investors such as Founders Fund, Ribbit Capital, and Index Ventures, the company is among the most significant privately held identity technology companies worldwide. What appears to users as a convenient verification process is, in reality, a concentrated data collection infrastructure that goes far beyond what would be necessary for a simple identity check.

LinkedIn: 100 million verified profiles — and what's behind them

LinkedIn was one of the first major platforms to introduce Persona as a verification partner. In December 2025, the platform surpassed 100 million verified profiles worldwide—a milestone demonstrating how deeply Persona is already embedded in the identity infrastructure of the professional web. LinkedIn makes clear promises: Verified members receive, on average, 60 percent more profile views and 50 percent more engagement. Verified company pages report 10.9 times more views and 7.7 times more followers. These are powerful incentives—and they work.

The actual verification process takes place via the LinkedIn app. Users are prompted to hold their device against an NFC-enabled passport, scan the chip, and then take a live selfie. What LinkedIn receives is limited: the name as it appears on the passport, the passport type, the issuing country, a hashed identifier, and confirmation that the verification was successful. However, what Persona collects and processes in the course of this process is considerably more extensive, and LinkedIn's own documentation addresses it only with a brief reference to Persona's privacy policy. The subprocessors authorized for LinkedIn verification via Persona include AWS, Confluent, DBT, Elasticsearch, Google Cloud Platform, MongoDB, Sigma Computing, and Snowflake. Data storage on European servers with guaranteed EU legal compliance is not provided.

The issue of data deletion is particularly noteworthy. Persona confirms in its LinkedIn documentation that data is deleted after verification—but does not specify a concrete timeframe. In the context of other partnerships, for example with Discord, a storage period of up to seven days was acknowledged, which publicly contradicted the platform's statements. And the leaked source code of Persona's government infrastructure reveals that biometric facial lists can be stored for up to three years—more precisely, 1,095 days. What applies to commercial customers and what applies to government customers is deliberately kept vague in Persona's public communications.
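To make the gap between these retention figures concrete, the following minimal Python sketch converts each period into a deletion deadline. The collection date is a hypothetical value chosen for illustration, and the dictionary labels are invented names; only the day counts (7 and 1,095) come from the reporting above.

```python
from datetime import date, timedelta

# Retention periods in days, as reported in the text above.
# The keys are illustrative labels, not identifiers from Persona's code.
RETENTION_DAYS = {
    "discord_archived_support_page": 7,
    "government_face_lists": 1095,
}


def deletion_deadline(collected_on, retention_days):
    """Return the last day on which data collected on `collected_on` may be kept."""
    return collected_on + timedelta(days=retention_days)


# Hypothetical collection date, used only to show the arithmetic.
collected = date(2026, 1, 15)

for label, days in RETENTION_DAYS.items():
    years = days / 365
    print(f"{label}: {days} days (about {years:.2f} years), "
          f"deadline {deletion_deadline(collected, days)}")
```

For the same upload, the seven-day period ends on 2026-01-22, while the 1,095-day period runs until 2029-01-14, roughly 156 times longer.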

Reddit: Age verification as an entry point to biometric control

Reddit introduced Persona as part of its implementation of the UK's Online Safety Act (OSA), which mandates age verification for online platforms. The UK market served as a test case: users attempting to access age-restricted content were redirected to Persona for age verification. On Reddit's dedicated FAQ page, Persona explains that it acts as a data processor under Reddit's instructions, collects only date of birth and image data, and deletes all other information within three days. This initially sounds reasonable.

The situation is actually more complex. Reddit users who verified their identities report a process that goes far beyond simple age estimation: NFC chip scans of passports, live selfies, and detailed behavioral biometrics were collected, including typing speed, hesitation, and whether information was copied and pasted. Privacy activists discovered that the Persona source code identified by Celeste ran on a FedRAMP-authorized government server and included checks against FinCEN watchlists, politically exposed persons (PEP) lists, and global sanctions lists. All of this runs in parallel with the commercial verification processes for Reddit, LinkedIn, and other platforms, on the same underlying infrastructure. Following significant user protests in the UK, Reddit temporarily suspended forced Persona verification for new users, although already verified users remained in the system.

OpenAI: The oldest and most opaque connection

The relationship between OpenAI and Persona is particularly revealing—and chronologically the earliest of all major platform partnerships. Security researchers discovered via certificate transparency logs that a dedicated watchlist system under the domain openai-watchlistdb.withpersona.com had been active since November 2023—around 18 months before OpenAI publicly announced a requirement for identity verification to access GPT-5 in the summer of 2025. OpenAI had quietly added a sentence to its privacy policy in November 2024 referring to identity and age verification by third-party providers—without explicitly mentioning Persona.
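Dating a hostname through certificate transparency logs comes down to finding the earliest certificate that names it. The Python sketch below illustrates that step on records shaped like the JSON that public CT search services return; the field names `name_value` and `not_before` are assumed to follow that convention, and the sample records and hostnames are hypothetical placeholders, not real log data.

```python
from datetime import datetime


def earliest_issuance(entries, hostname):
    """Return the earliest not_before timestamp among certificates
    that name `hostname`, or None if no certificate names it."""
    wanted = hostname.lower()
    dates = []
    for entry in entries:
        # One certificate can cover several hostnames, separated by newlines.
        names = {n.strip().lower() for n in entry["name_value"].split("\n")}
        if wanted in names:
            dates.append(datetime.fromisoformat(entry["not_before"]))
    return min(dates) if dates else None


# Hypothetical sample records for illustration only.
sample_entries = [
    {"name_value": "watchlist.example.com",
     "not_before": "2024-03-01T00:00:00"},
    {"name_value": "watchlist.example.com\nwww.example.com",
     "not_before": "2023-11-16T00:00:00"},
    {"name_value": "other.example.com",
     "not_before": "2022-01-01T00:00:00"},
]

first_seen = earliest_issuance(sample_entries, "watchlist.example.com")
print(first_seen.date())
```

The earliest certificate only gives a lower bound on when a hostname was prepared for use; it does not show what ran behind it or when it first served traffic.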

In September 2024, Persona itself published a page explaining that OpenAI uses Persona to verify millions of users monthly, with over 99 percent of users being automatically verified in the background within seconds. This means that not only users actively undergoing verification are affected—background screening is continuous. To access the OpenAI API and models like GPT-5, developers must submit their ID and take three selfies from different angles—left, right, and front—to create a three-dimensional facial profile. Forrester Research assessed this decision as a response to regulatory pressure, geopolitical risks of model misuse, and the need to distinguish between legitimate corporate customers and state-sponsored actors. The consequence: Anyone wanting to use the most advanced AI tools must entrust their face to a US-based verification infrastructure.

Roblox: 151 million daily users — and a mandatory face scan

Roblox is, in terms of sheer user numbers, the most remarkable Persona client. In January 2026, Roblox introduced mandatory age verification for all users who want to use the platform's chat function. With 151 million daily active users, this represents an enforced biometric process on an unprecedented scale, especially since a significant portion of the user base consists of children and teenagers. The mechanism is simple: Anyone wanting to chat must either record a video selfie, which Persona algorithmically analyzes to estimate their age, or alternatively submit an official ID.

Roblox and Persona emphasize that biometric data and images are deleted immediately after age estimation. However, this claim has not been independently verified. The real problem lies elsewhere: Persona's age estimation technology is not, from a technical standpoint, age verification. It estimates age based on facial features without checking the result against a document. This means that Roblox users are subjected to a biometric scan that offers neither the accuracy of a true ID check nor the security of document-based verification. At the same time, the biometric data is transferred to a US infrastructure, which raises significant GDPR concerns for EU users, especially minors. Biometric data of minors is subject to particularly stringent requirements under Article 9 in conjunction with Article 8 of the GDPR, and Roblox has not publicly demonstrated that its current implementation meets them.

Persona leak reveals surveillance infrastructure — What EU users need to know now

Discord: The most public retreat and what it reveals

Discord is the only major company to have publicly ended its partnership with Persona, providing an explicit explanation—a rare moment of transparency in the sector. In January 2026, Discord rolled out Persona for age verification purposes in the UK as part of an undisclosed pilot program. When the partnership became public, it sparked a massive user backlash. Discord's CTO, Stanislav Vishnevskiy, admitted the company had failed in its communication. The real escalation came when security researchers not only uncovered the connection to Peter Thiel's Founders Fund but also found an archived support page that mentioned Persona retaining data for seven days—directly contradicting previous statements from the company about near-instant deletion.

Discord subsequently formulated a new, clear set of requirements for all future verification partners: Biometric data must be processed entirely on the user's device and must not leave the device. Persona explicitly did not meet this standard—and was therefore excluded from the contract. This requirement is groundbreaking from both a technical and data protection perspective: It defines on-device processing as the minimum standard for biometric age verification. The fact that none of Persona's other major clients have publicly demanded this standard demonstrates how far the industry still is from this norm. Adding to the difficulties, Discord was also facing a separate data breach concurrently with the Persona case: At another age verification provider, the official identification documents of approximately 70,000 users were compromised—a situation that vividly illustrated the structural risk of outsourcing biometric checks to third-party providers.

VRChat: Identity verification in virtual reality

VRChat, the social virtual reality platform with a dedicated international user base, also implemented Persona as its age verification partner. Due to increasing regulatory requirements for online platforms, particularly in the UK and the EU, VRChat was compelled to implement an age verification system. The system chosen: Persona. The community reaction was fierce. User forums were flooded with detailed analyses of the Persona infrastructure, references to the connection to Thiel's Founders Fund, and a collective call to reconsider the implementation.

VRChat emphasizes in its official communications that it does not receive images of identity documents or facial scans—only a hash value that confirms verification. Persona also claims to receive no information about the user's identity within VRChat. This aligns with Persona's own data protection architecture, but from a data privacy perspective, it is only a partial solution: Persona itself possesses all raw biometric data, regardless of what is shared with the platform. European users of the VRChat community rightly point out that eID technology—which enables age verification without transmitting biometric data—exists as a secure alternative, but is explicitly not supported by Persona. From a European data protection perspective, this rejection is difficult to justify.

Upwork: When identity verification becomes a job requirement

At Upwork, the world's largest freelance platform, Persona verification has a particularly direct economic impact. Freelancers are required to verify their identity, either proactively by paying 35 Connects (the platform's internal currency) or compulsorily when applying for loans or US-specific jobs, or after an account suspension. Those who fail to complete verification within seven days risk account suspension.

This is remarkable in several respects. First, Upwork directly links biometric verification to economic participation: Anyone who wants to work and get paid must be verified, and this is done via a third-party US infrastructure. For freelancers in the EU, this means transferring their biometric data into US jurisdiction, with no realistic alternative available. Second, Upwork allows routine re-verifications: Users report being asked to verify their identity multiple times, even when there is no apparent reason. Third, Upwork's own communication regarding the scope of data processing is vague: The platform describes verification as an identity check without specifying the exact data flows to Persona or Persona's data retention policies.

The hidden government infrastructure: What the source code leak reveals

The truly disturbing aspect of Persona's infrastructure isn't what the company publicly communicates, but rather what an accidental configuration error revealed. In February 2026, security researchers discovered that 53 megabytes of source code from Persona's government platform were publicly accessible, with no attack or unauthorized access required. The Vite build system had left source maps publicly available, a configuration flaw that exposed the entire internal architecture.
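Source maps are the files a bundler emits so that minified code can be traced back to its original sources; if they are deployed alongside the production assets, anyone can reconstruct the source tree. In Vite, the `build.sourcemap` option controls whether they are generated. The Python sketch below shows one way a team might check a build output directory before deployment. The function name and directory layout are hypothetical, and this is an illustration of the general check, not a description of Persona's pipeline.

```python
import re
from pathlib import Path

# Matches the trailing comment bundlers add to point at a source map,
# e.g. "//# sourceMappingURL=app.js.map" (JS) or "/*# sourceMappingURL=... */" (CSS).
SOURCE_MAP_COMMENT = re.compile(r"[#@]\s*sourceMappingURL=(\S+)")


def find_exposed_source_maps(build_dir):
    """Return sorted relative paths of files that would expose source maps.

    A file is reported if it is a .map file itself, or if it is a .js/.css
    asset containing a sourceMappingURL comment.
    """
    root = Path(build_dir)
    findings = set()
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        if path.suffix == ".map":
            findings.add(rel)
        elif path.suffix in {".js", ".css"}:
            text = path.read_text(encoding="utf-8", errors="ignore")
            if SOURCE_MAP_COMMENT.search(text):
                findings.add(rel)
    return sorted(findings)
```

Calling `find_exposed_source_maps("dist")` on a build directory returns an empty list when no maps or map references are present; a deployment script could refuse to publish otherwise.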

What the researchers found exceeded expectations: 2,456 source files documenting 269 different verification checks. The platform, running on FedRAMP-authorized government infrastructure under the name ONYX, contains complete modules for the direct submission of Suspicious Activity Reports (SARs) to FinCEN, the U.S. Treasury Department's Financial Crimes Enforcement Network, and its Canadian counterpart, FINTRAC. It features facial biometric databases that are matched against watchlists, PEP (Politically Exposed Persons) facial recognition modules, and 13 different types of tracking lists, including faces, browser fingerprints, and geolocations. The code explicitly shows that biometric face lists can be stored for up to 1,095 days, that is, three years. At the same time, a new subdomain, onyx.withpersona-gov.com, appeared in the certificate transparency logs. Its timing coincides with the Fivecast ONYX tool, an AI-powered surveillance tool that ICE commissioned for $4.2 million and that generates risk assessments from social media and the dark web. Whether the shared name is a coincidence or points to a structural connection has not been conclusively established. What is established, however, is that Persona's government infrastructure runs on the same technical foundation as the commercial infrastructure that powers LinkedIn, Reddit, and OpenAI.

FedRAMP and Persona's Dual Role

One aspect that receives little attention in public discussion is Persona's FedRAMP certification. FedRAMP, the Federal Risk and Authorization Management Program, is the U.S. system for certifying the security of cloud services for federal agencies. In October 2025, Persona achieved FedRAMP Low Impact Authorized certification and is on track for FedRAMP Moderate Ready certification. This means Persona is now an accredited infrastructure provider for U.S. federal agencies. The commercial platform that verifies LinkedIn users and the government platform that provides identity checks to federal agencies are thus under the same umbrella: technically separated into distinct deployments, but legally and organizationally part of the same corporate structure.

For European users, this is crucial because it takes the CLOUD Act issue to a new level. The CLOUD Act obligates US companies to grant data access to US authorities upon request—regardless of server location. A company that simultaneously processes commercial biometric data from millions of users worldwide and is an accredited infrastructure provider for US federal agencies represents a combination where the boundaries between commercial service and government infrastructure are structurally blurred. This is not a suspicion—it is a description of the business model.

What users should know and do

Understanding the Persona infrastructure changes how we evaluate a seemingly simple decision: Should I verify my LinkedIn profile? The 60 percent increase in profile views sounds tempting. But the real question is: What am I handing over in return? The answer is multifaceted.

Anyone who verifies their identity via Persona on LinkedIn, Reddit, OpenAI, or Roblox is handing over biometric data to a US company that is a FedRAMP-certified government infrastructure provider, that is backed by Peter Thiel's Founders Fund (Thiel is also co-founder and chairman of Palantir), and whose leaked source code reveals a surveillance infrastructure that goes far beyond simple identity verification. EU residents have rights of access, erasure, and objection under the GDPR. Data deletion requests can be submitted directly to Persona via DSAR (Data Subject Access Request). However, several users report that Persona responds to such requests with automated and vague replies.

For platforms themselves, the lesson lies in Discord's decision: on-device processing as the standard, not the exception. Age verification doesn't have to mean that raw biometric data passes through a US server infrastructure. European eID systems, national identity wallets, and decentralized verification architectures exist as technical alternatives. The fact that they aren't used is a political and economic decision—not a technical necessity.

The structural question: Who is building the Internet of identities?

This inventory doesn't provide answers, but rather raises a question that is becoming increasingly urgent: Who should control the internet's identity infrastructure? In 2026, the de facto answer is a private company in San Francisco, backed by Peter Thiel's Founders Fund, with a dual role as a commercial KYC provider and accredited US government infrastructure. LinkedIn has verified 100 million profiles within this system. Reddit verifies age cohorts under British and soon EU-wide pressure. Roblox subjects 151 million daily users to mandatory facial scanning. OpenAI requires biometric verification for access to the world's most powerful AI models.

This concentration is not inevitable. It is the result of market decisions made under weak regulatory control and with a lack of public transparency. The EU AI Act, the GDPR, and the EU Digital Package provide the legal tools to regulate this concentration and promote European alternatives. What is lacking is the political will to enforce them, and public awareness that the question of who manages people's digital identities is not a technical detail, but a core issue of democratic self-determination in the digital age.