- Chain of Thought
Vana: All You Need to Know
Vana is the Robin Hood of data, giving ownership back to the people
GM! Happy spot ETH ETF day to everyone who celebrates. No matter which direction the market moves, this is a historic moment.
A year ago, I wouldn't have imagined ETH ETFs happening this soon.
But now, let's move on to our main agenda.
Every week we present to you early, interesting crypto AI projects that caught our eye during our independent research.
Research Highlight: Vana
This week we look at Vana, which is on a mission to revolutionise data ownership for training AI models.
Vana wants to:
Liberate data from walled gardens
Shift ownership of AI back to the user
It's the Robin Hood of data, giving ownership back to the people.
Source: Vana
User Data is Immensely Valuable.
User data is used to personalize products, provide targeted marketing, and keep an edge over competitors.
With the rise of AI and model training, the value of user data has multiplied manyfold.
Unfortunately, big tech companies monopolize and keep this private data for themselves. Companies like Reddit and Twitter have closed off access to their developer APIs to stop others from training on their data.
Thankfully, data privacy laws retain users' right to their data. Vana leverages this.
If enough users willingly export their data and make it publicly available, could this create the largest, most comprehensive data treasury in the world?
Source: Vana
Users collectively hold ~100 times the data used to train GPT-4. Imagine the capabilities of the models that are trained on this data.
This vast reservoir of high-quality data, such as messages from Instagram and Reddit, could significantly enhance AI model performance.
Vana's Secret Sauce: the Data DAO
Why would users willingly contribute their data?
The answer lies in our favourite word in crypto: incentives!
Big tech generates billions in revenue from harnessing user data. Imagine if users could own a share of the profits their data helped to create. Vana solves this through a concept called Data DAOs.
Source: Vana
The Data DAO allows users to pool and govern their data, rewarding them with a token representing ownership of the particular dataset.
The DAO decides what to do with the data, such as renting it out for training purposes or selling copies of the data.
Some of the Data DAOs on Vana's testnet include:
Finquarium (Financial)
Flirtual (Dating)
Volara (Twitter)
Reddit DAO, the largest with over 140K users
What Role does Crypto Play?
The Vana network is an EVM-compatible L1 that optimizes for data transactions.
Source: Vana
Users first upload their data to the relevant Data Liquidity Pool (DLP), which works almost like a subnet.
Upon upload, the user's data is encrypted, and this transaction is recorded on Vana. Validators then verify the encrypted data to ensure its quality and integrity.
This is done through Proof-of-Contribution, a valuation metric specific to each dataset. For example, the Reddit DAO uses karma as a contribution metric. Once the data is validated, another transaction is recorded on Vana, and the data is added to the DLP.
People looking to train models can then purchase access to the DLP APIs or the data can be sold to data buyers at the discretion of the DAO. Users can see their data contributions through their own EVM wallet.
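To make the flow above concrete, here is a minimal Python sketch of the upload, validate, and accept steps. It is an illustration only, not Vana's actual code: the `DLP` and `Contribution` classes, the `ledger` list standing in for on-chain records, scoring from plain metadata, and the "score above zero" acceptance rule are all assumptions made for this example.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Contribution:
    wallet: str               # contributor's EVM wallet address (made-up value below)
    encrypted_blob: bytes     # assumed to be encrypted before it reaches the pool
    metadata: Dict[str, int]  # hypothetical features a validator can score
    score: float = 0.0
    validated: bool = False


@dataclass
class DLP:
    name: str
    # Pool-specific Proof-of-Contribution metric (e.g. karma for a Reddit pool).
    score_fn: Callable[[Dict[str, int]], float]
    ledger: List[str] = field(default_factory=list)  # stand-in for on-chain transactions
    accepted: List[Contribution] = field(default_factory=list)

    def upload(self, c: Contribution) -> str:
        # Record a fingerprint of the encrypted data, never the raw data itself.
        digest = hashlib.sha256(c.encrypted_blob).hexdigest()
        self.ledger.append(f"upload:{c.wallet}:{digest}")
        return digest

    def validate(self, c: Contribution) -> bool:
        # Score the contribution; the acceptance threshold (> 0) is an assumption.
        c.score = self.score_fn(c.metadata)
        c.validated = c.score > 0
        if c.validated:
            self.ledger.append(f"validated:{c.wallet}:{c.score}")
            self.accepted.append(c)
        return c.validated


# Usage: a Reddit-style pool that scores contributions by karma.
pool = DLP("reddit", score_fn=lambda m: float(m.get("karma", 0)))
contribution = Contribution("0xabc", b"ciphertext", {"karma": 1200})
pool.upload(contribution)
pool.validate(contribution)
print(len(pool.ledger), contribution.score)
```

The pluggable `score_fn` mirrors the idea that each dataset defines its own contribution metric; a financial or dating pool would pass in a different function.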
The VANA token will be used to pay transaction fees and govern the network. Block rewards are distributed as follows:
70% of block rewards to the top 16 DLPs based on metrics like transactions facilitated and verified data
30% of block rewards to Propagators (Validators on the root network)
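As a rough worked example of the 70/30 split, the sketch below divides a single block reward. The 70%, 30%, and 16-slot figures come from the breakdown above. Everything else is an assumption for illustration: the `split_block_reward` function, the single numeric score per DLP, and the pro-rata weighting inside the DLP share (the article only says the ranking is "based on metrics like transactions facilitated and verified data").

```python
from fractions import Fraction

DLP_SHARE = Fraction(70, 100)         # from the article
PROPAGATOR_SHARE = Fraction(30, 100)  # from the article
TOP_DLP_SLOTS = 16                    # from the article


def split_block_reward(block_reward, dlp_scores):
    """Split one block reward between the top DLPs and the Propagators.

    dlp_scores maps a DLP name to a hypothetical integer performance score.
    Paying DLPs in proportion to their score is an assumption.
    """
    reward = Fraction(block_reward)
    # Keep only the highest-scoring DLPs, up to the slot limit.
    ranked = sorted(dlp_scores.items(), key=lambda kv: kv[1], reverse=True)
    top = ranked[:TOP_DLP_SLOTS]
    total_score = sum(score for _, score in top)
    dlp_pool = reward * DLP_SHARE
    if total_score == 0:
        payouts = {name: Fraction(0) for name, _ in top}
    else:
        payouts = {
            name: dlp_pool * Fraction(score) / total_score
            for name, score in top
        }
    return {"dlps": payouts, "propagators": reward * PROPAGATOR_SHARE}


result = split_block_reward(100, {"reddit": 3, "twitter": 1})
print(float(result["propagators"]))     # 30.0
print(float(result["dlps"]["reddit"]))  # 52.5
```

Using `Fraction` keeps the split exact, so the DLP and Propagator payouts always add back up to the full block reward.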
Research-Level Alpha
Vana launched its Satori testnet on June 11. You can earn rewards by participating in the testnet in several ways. It's early days, so there are not many active participants yet.
Create a DLP: the pool of data to which users can contribute. This is competitive, and only 16 DLP slots are available.
Run a Validator: validate the quality of data contributions. It takes about 2 hours to set up, and you can run validators for multiple DLPs. You can register here.
Submit Test Data: contribute data to a DLP. I connected my Twitter account to this, and it took less than two minutes.
The Team
Founder Anna Kazlauskas was a core engineer at Celo blockchain before founding Vana. She graduated from MIT with degrees in both computer science and economics.
Arthur Abal (COO) and Colin Stevenson (Head of Data) were both previously at Appen, a company specializing in high-quality, human-annotated data for machine learning and AI.
Matthias Knauth (Head of Product) was Head of Product at both Credmark and First Coin GmbH.
The team has raised $20M in funding from Paradigm and other notable investors.
At the Imagination in Action summit, Anna Kazlauskas highlighted:
"Yeah, so in summary, I think foundation models, they really tend towards monopolies. They require these huge upfront investments in the form of research, data, and compute.
And it's very tempting, I think, for the open source AI community, or more broadly, anyone who's not big tech, to just sort of settle and say like, okay, we're going to do the best we can with the last generation of models that big tech companies open source to us and give us access to.
But we really don't need to settle for being a few generations behind, right? You can actually have a collective of users create their own best model because we have the data and the compute to make it possible."
Our Thoughts
Data DAOs are a promising concept. Using cryptoeconomic incentives to bootstrap a valuable network is arguably one of the most compelling use cases for crypto. Here's a great article on the opportunities and challenges in Data DAOs by Variant Fund.
The primary challenge for Vana lies in scaling user contributions. With 140K users in the Reddit DAO, Vana has made a significant start, but it's still far from achieving critical mass for the data pool to be useful.
Another data product, Grass, leverages a passive process by piggybacking off a user's idle bandwidth to scrape the internet. Vana, however, faces a bigger obstacle as it is an active process: users must first recognize the value of their data and then take action to contribute.
The actual value of individual user data is uncertain. For instance, if you are an infrequent Reddit poster and you contribute your data to the network, how much will you earn? If the earnings are minimal (e.g., a few dollars), it won't be sufficient to incentivize widespread data contributions.
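A quick back-of-envelope calculation shows why the per-user number matters. The sketch below is illustrative only: the 140K contributor count is the Reddit DAO figure mentioned earlier, while the revenue figures and the equal split are hypothetical assumptions, not real sales data. A real pool would likely weight payouts by contribution.

```python
def average_payout(dataset_revenue_usd: float, contributors: int) -> float:
    """Average payout per contributor, assuming an equal split (an assumption)."""
    if contributors <= 0:
        raise ValueError("contributors must be positive")
    return dataset_revenue_usd / contributors


REDDIT_DAO_USERS = 140_000  # from the article

# Hypothetical revenue scenarios, chosen only to show the order of magnitude.
for revenue in (100_000, 1_000_000, 10_000_000):
    payout = average_payout(revenue, REDDIT_DAO_USERS)
    print(f"${revenue:,} in revenue -> ${payout:.2f} per contributor")
```

Under these assumptions, even a hypothetical $1M data deal works out to roughly $7 per contributor, which is squarely in "a few dollars" territory.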
A critical aspect to watch is how Vana creates demand drivers for the VANA token. The team must generate enough attention and liquidity for the token to spark a flywheel of network growth to reach the scale it requires.
The success of Vana hinges on its ability to cultivate an ecosystem where data contributions and token incentives synergize effectively, driving sustained engagement and expansion.
That's it! If you have specific feedback or anything interesting you'd like to share, please just reply to this email. We read everything.
Cheers,
Teng Yan & Joshua
This newsletter is intended solely for educational purposes and does not constitute financial advice. It is not an endorsement to buy or sell assets or make financial decisions. Always conduct your own research and exercise caution when making investment choices.