Neon pays users to record phone conversations, then sells that data to AI companies training voice models – a bet on proprietary data as the durable moat.
ENTRY ANGLES
Build data collection businesses where the collection mechanism IS the product · Create products that continuously collect high-quality, unique data as a byproduct of normal usage · Develop data collection systems for AI training with willing user participation
VERTICALS
CAPABILITIES
Data collection and pipeline infrastructure, Product design that incentivizes user participation in data generation, Understanding of what training data is valuable to AI model developers
NEON FOUNDER
“Data is the fuel, and AI is the engine that runs on it.”
Neon is an app that pays users for recording their phone conversations.
The business model: Neon sells that data to AI companies that need it to train voice models and conversational AI agents.
To avoid privacy issues, Neon only records a call if both participants are Neon users. As a result, anyone who wants to earn through the app needs to convince the people they regularly talk to on the phone to sign up as well.
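The both-participants rule amounts to a simple gate that runs before any recording starts. Here is a minimal Python sketch, assuming a hypothetical set of registered phone numbers and a hypothetical `should_record` helper; the source does not describe how Neon actually implements the check.

```python
def should_record(caller: str, callee: str, registered_users: set[str]) -> bool:
    """Return True only when both call participants are registered users."""
    return caller in registered_users and callee in registered_users


# Hypothetical accounts, for illustration only.
registered_users = {"+15550001", "+15550002"}

print(should_record("+15550001", "+15550002", registered_users))  # True
print(should_record("+15550001", "+15550099", registered_users))  # False
```

The gate also explains the growth mechanic: a user whose contacts are not in the registered set earns nothing from calls with them.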
All recordings are automatically stripped of names, addresses, phone numbers, and other personally identifiable information.
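To give a rough sense of what automated stripping involves, here is a minimal Python sketch that redacts phone numbers and email addresses from a text transcript. Everything in it is an assumption for illustration: the source does not say whether Neon works on audio or transcripts, and the `redact` function and both patterns are hypothetical. Names and addresses generally need entity recognition rather than regular expressions, so they are not covered here.

```python
import re

# Hypothetical patterns: only phone numbers and email addresses are handled.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def redact(transcript: str) -> str:
    """Replace matched emails and phone numbers with placeholder tokens."""
    transcript = EMAIL_RE.sub("[EMAIL]", transcript)
    transcript = PHONE_RE.sub("[PHONE]", transcript)
    return transcript


print(redact("Call me at +1 (555) 010-2030 or write to ann@example.com."))
# Call me at [PHONE] or write to [EMAIL].
```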
When users sign up, they set their own asking price – anywhere from $0.15 to $1.00 per minute. Neon analyzes call quality first, and if it meets the bar, sends the user confirmation that it's ready to start paying. Lower prices naturally improve the odds of getting accepted.
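The per-minute pricing is easy to work through. The Python sketch below assumes that a user is paid their asking rate for every accepted minute; the `call_earnings` helper is hypothetical, and only the $0.15 to $1.00 range comes from the article.

```python
MIN_RATE, MAX_RATE = 0.15, 1.00  # per-minute bounds stated in the article


def call_earnings(rate_per_minute: float, minutes: float) -> float:
    """Earnings for one call, assuming every minute is accepted and paid."""
    if not MIN_RATE <= rate_per_minute <= MAX_RATE:
        raise ValueError("asking price must be between $0.15 and $1.00 per minute")
    return round(rate_per_minute * minutes, 2)


print(call_earnings(0.15, 30))  # 4.5
print(call_earnings(1.00, 30))  # 30.0
```

Under these assumptions, a 30-minute call is worth between $4.50 and $30.00 depending on the asking price, which is the trade-off users weigh against their odds of being accepted.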
Beyond per-minute earnings, Neon also runs ongoing prize draws: $100 daily, $1,000 weekly, $10,000 monthly, and $100,000 annually.
To stay eligible, users need to have natural, varied conversations with different people, refer new users, and avoid gaming the system with fake calls – including AI-generated ones.
Neon raised its first $1.5 million last May to build the app. It launched in September and climbed to the top of the App Store charts. Then a vulnerability was discovered that allowed call recordings and personal data to be extracted, which led to the app's removal.
In February, Neon released a rebuilt version with improved data protections. It has since raised $25 million in equity funding plus an undisclosed amount in debt financing.
Y Combinator's recent demo day produced eight standout companies that captured the most investor attention. One of them was Luel – which, like Neon, is in the business of collecting and selling data for AI training.
Luel also sources data from users and pays them for it – then anonymizes, cleans, and standardizes it for resale. The data types span audio recordings, video content, and conversations of various kinds. The existing audio catalog includes multi-language phone conversations, monologues, doctor-patient dialogues, call center recordings, and other specialized interaction types recorded in various conditions.
Video data is less extensive so far: examples include recordings of craftspeople doing manual work and footage capturing ordinary everyday activities.
Why are investors suddenly paying attention to this category? Because the quality of an AI model depends less on algorithms than on the volume and quality of the data used to train it. As one AI specialist put it: "Data is the fuel, and AI is the engine that runs on it."
Without Stack Overflow, there would be no AI that can write code. And voice AI agents will only learn to converse as naturally as humans if they're trained on enormous volumes of real human conversation.
AI developers need that training data from somewhere – which has created an entirely new category of companies that do nothing but collect data at scale and sell it to model developers.
These data collection companies break into two types. The first collects broadly useful "everyday" data – standard phone conversations, for example. The second collects expert data: domain-specific knowledge needed to build specialized AI models and agents.
Rubric ([covered previously](/review/razrabatyvat-ii-platformy-uzhe-ne-tak-vygodno)), another recent Y Combinator graduate, illustrates the expert-data variant: it is building infrastructure to systematically extract knowledge from human experts across various fields.
Rubric makes an interesting claim: the window for capturing authentic pre-AI expertise is at most a decade, the period in which the generation that built its knowledge and intuition before AI existed is still around.
"Data is the new oil" has been a talking point for years. But it only became a concrete business reality with the rise of AI.
For AI product developers, the most important question has shifted: it's no longer "what does your product do well?" – it's "what makes your data good?" More data, better data, more unique data – those are the real competitive levers, even when using the same base models as everyone else. The practical question is what you're actively doing to collect better data than your competitors, and whether that compounds into a moat.
The more radical move is to skip the product layer entirely and build a data collection business. Here the key distinction between Neon and Luel is how the collection happens. Luel collects data through individual efforts, each of which requires deliberate, discrete work. Neon built a product that collects data continuously, in a repeatable and scalable way, with users as willing participants. The collection is the product. What data useful for AI training could you start collecting at scale – and who would buy it?