Compa benchmarks compensation against actual offer letters in real time – a dataset that gets more accurate the more companies participate.
ENTRY ANGLES
Build data-exchange platforms using the give-to-get model for operational data (compensation benchmarks, marketing metrics, CRM contacts, procurement pricing) · Build a give-to-get platform for AI training data in fragmented-but-valuable sectors · Build data verification and anti-gaming mechanisms for give-to-get platforms
VERTICALS
CAPABILITIES
Data verification and anti-gaming mechanisms, Data aggregation and collective access infrastructure, Industry-specific dataset validation
Every company thinks its compensation is competitive. Most are wrong – not because they're stingy, but because salary survey data is stale by the time it's published, and self-reported by companies with every incentive to shade their numbers. Compa built something different: a real-time benchmark drawn from actual offer letters.
Salary benchmarking isn't new: research firms and employers have long run compensation surveys. But those surveys are periodic, so the numbers have aged by the time a company acts on them. They are also self-reported by the survey participants, which makes them unreliable.
Compa's data avoids both problems. It's collected in real time from a source with no incentive to mislead: actual offer letters.
The mechanism is straightforward. Companies connect their Applicant Tracking Systems (ATS) to the Compa platform. From that point:
- The offers they extend to candidates flow into Compa's database.
- In return, they receive anonymized, aggregated data on offers made by other participating companies.
Personal information about candidates is stripped out entirely. What remains is market signal: what companies are actually offering, right now, for specific roles.
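The flow described above (offers in, candidate details removed, aggregates out) can be sketched roughly as follows. This is a minimal illustration, not Compa's actual pipeline: the field names, the `MIN_COMPANIES` threshold, and the choice of a median are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, List

# Hypothetical field names; the article does not describe Compa's schema.
PII_FIELDS = {"candidate_name", "candidate_email"}

# Assumed threshold: publish a figure only when enough distinct companies
# contributed, so no single participant's offers can be read off directly.
MIN_COMPANIES = 3


def strip_pii(offer: Dict) -> Dict:
    """Return a copy of the offer without candidate-identifying fields."""
    return {k: v for k, v in offer.items() if k not in PII_FIELDS}


def aggregate_offers(offers: List[Dict]) -> Dict[str, float]:
    """Median base salary per role, limited to roles with enough contributors."""
    by_role = defaultdict(list)
    for offer in offers:
        by_role[offer["role"]].append(offer)

    result = {}
    for role, group in by_role.items():
        companies = {o["company"] for o in group}
        if len(companies) >= MIN_COMPANIES:
            result[role] = median(o["base_salary"] for o in group)
    return result
```

A role with offers from only one or two companies is simply left out of the output here, which is one simple way to keep aggregated data from pointing back at a specific contributor.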
Why offers rather than accepted salaries? Because an offer is what Compa calls the "marketing interface" – the first competitive move a company makes for any candidate. Win that first moment or lose the candidate before the conversation really begins.
One challenge any compensation database faces: job titles are inconsistent across companies. The same role might be called "Senior Engineer" at one company and "Staff Software Developer" at another. Conversely, the same title can mean very different things at different organizations. Regional variation adds another layer of noise.
Compa handles this by letting each client company define mapping rules – translating titles and geographies from the shared database into their own internal taxonomy. The data becomes useful in the client's context, not just in the abstract.
Another variable: company type matters for compensation benchmarking. Large established companies, well-funded high-growth startups, and smaller businesses all operate on different compensation scales. Compa lets clients create peer-group clusters – groups of companies they consider relevant compensation benchmarks – and view average offers per role within each cluster separately.
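To make the two client-side controls concrete, here is a small sketch of title-mapping rules combined with peer-group averages. The company names, rule table, cluster names, and the `total_comp` field are hypothetical; the article does not say how Compa represents any of this internally.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Optional, Tuple

# Hypothetical client-defined rules: (source company, source title) -> internal level.
TITLE_RULES = {
    ("AcmeCo", "Senior Engineer"): "Engineer L5",
    ("BetaCorp", "Staff Software Developer"): "Engineer L5",
}

# Hypothetical peer groups chosen by the client.
PEER_GROUPS = {
    "large_established": {"AcmeCo"},
    "high_growth": {"BetaCorp", "GammaLabs"},
}


def map_title(company: str, title: str) -> Optional[str]:
    """Translate an external title into the client's own taxonomy, if a rule exists."""
    return TITLE_RULES.get((company, title))


def average_by_cluster(offers: List[Dict]) -> Dict[Tuple[str, str], float]:
    """Average total compensation per (peer group, internal level)."""
    buckets = defaultdict(list)
    for offer in offers:
        level = map_title(offer["company"], offer["title"])
        if level is None:
            continue  # no mapping rule for this title; skip rather than guess
        for cluster, members in PEER_GROUPS.items():
            if offer["company"] in members:
                buckets[(cluster, level)].append(offer["total_comp"])
    return {key: mean(values) for key, values in buckets.items()}
```

Keying rules on the company as well as the title reflects the point made earlier: the same title can mean different things at different organizations, so a title alone is not enough to place an offer.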
Because offer data flows in and is distributed in real time, clients can react to market shifts as they happen.
Compa currently focuses exclusively on technology companies. Its database contains data on more than 110,000 offers from companies including Stripe, Nvidia, DoorDash, Squarespace, Dropbox, Instacart, Autodesk, Vimeo, and MongoDB.
Subscription pricing starts at $50,000 per year.
The startup launched in 2021 with $3.9M to build its core offer-management platform. The compensation database product launched in spring 2023. Since then, the client count has grown 800% and revenue has increased 10x.
Compa has now raised a new $10M round.
Compa's foundational model is "give-to-get": to receive data on others' offers, you have to share your own. Pricing, which starts at $50K per year, likely depends on how many offers a company makes – meaning companies that contribute little data pay more, since they're extracting more than they're contributing.
This exact model was [covered previously](/review/daj-chtoby-vzjat) in the fall of 2021, in a review of Varos – a platform that enables e-commerce businesses and SaaS companies to exchange marketing and financial benchmarks on the same give-to-get basis: click-through rates, conversion rates, repeat purchase rates, retention, average order value, and revenue. At the time of that review, Varos had just come out of Y Combinator with $125K. By 2022, it had raised $4.3M more.
Two other startups apply a closely related version of the model. Crossbeam ([covered here](/review/vmeste-prodadim)) and Reveal both built platforms that let B2B companies automatically exchange CRM data with each other. The goal: identify warm introductions – situations where one company already has a customer relationship that another company is trying to establish. Reveal has raised $54.3M; Crossbeam, $116.9M.
David Sacks of the "PayPal Mafia" has argued that the give-to-get model could become a defining mechanic for AI startups – specifically for data acquisition. His argument: companies that need proprietary training data could use a give-to-get exchange to accumulate it, contributing their own data in exchange for access to others'. That approach could particularly suit AI startups in healthcare, finance, scientific research, manufacturing, creative applications, and legal document analysis.
For context: the earliest known give-to-get startup was Jigsaw, founded in 2004. They built a shared contact database where participants earned credits for adding and updating contact records, then spent those credits to access the database. Salesforce acquired Jigsaw in 2010 for $142M.
The inherent risks of give-to-get – data security, data quality, gaming the system with fabricated contributions – are real. But Airbnb, Uber, and BlaBlaCar all looked dangerously risky before they were obviously right. Risk and return tend to travel together. Which suggests that some application of give-to-get is still waiting to become the next name on that list.
The opportunity: build data-exchange platforms on the give-to-get model. Two directions, neither of which is fully occupied.
The more established path follows Compa, Varos, Crossbeam, and Reveal – traditional operational data exchange. Compensation benchmarks, marketing metrics, CRM contacts, procurement pricing by category, logistics costs. The common thread: valuable datasets that individual companies can't generate alone but can access collectively.
The less explored path is give-to-get for AI training data. Many industries have companies sitting on proprietary datasets that AI models need – but no single company has enough to train on. Healthcare, finance, manufacturing, legal – these are sectors where data accumulation is fragmented but valuable. The key design challenge is verification: you need mechanisms to catch participants submitting garbage data just to harvest good data from others. That's a real engineering problem, but a solvable one.
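As one illustration of what such a verification mechanism might look like for numeric data, the sketch below flags submissions that sit far from the values already in the pool, using a modified z-score based on the median absolute deviation. This is a generic outlier check chosen for the example, not a method described by any of the companies above; the function names and the 3.5 threshold are assumptions.

```python
from statistics import median
from typing import List


def is_outlier(value: float, reference: List[float], threshold: float = 3.5) -> bool:
    """Flag a value whose modified z-score against the reference pool exceeds the threshold."""
    med = median(reference)
    mad = median(abs(x - med) for x in reference)
    if mad == 0:
        # All reference values are (nearly) identical; anything different stands out.
        return value != med
    modified_z = 0.6745 * (value - med) / mad
    return abs(modified_z) > threshold


def contribution_quality(submissions: List[float], reference: List[float]) -> float:
    """Share of a participant's submissions that pass the outlier check."""
    if not submissions:
        return 0.0
    passed = sum(1 for value in submissions if not is_outlier(value, reference))
    return passed / len(submissions)
```

A platform could, for example, grant access only to participants whose quality score stays above some cutoff. A statistical check like this catches crude fabrication only; plausible-looking fake data would need other controls, such as pulling records directly from a system of record the way Compa reads offers from the ATS.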
The give-to-get model has been validated enough times that the question isn't whether it works – it's which dataset category represents the sharpest current opening.