[Short general description]: Ocean is a protocol and network that incentivizes data providers to provide a vast supply of high-quality data, for use in training artificial intelligence (AI) models.
Ocean incentivizes not only high-quality priced data but also high quality public or commons data. In turn, this helps to power data marketplaces. Ultimately, Ocean intends to develop a protocol and network - a tokenised ecosystem - that incentivises making this data available.
[Main problems tackled]:
1) Strong incentives to submit, refer, and make available quality data.
2) Curation market for reputation of data.
3) Only keeper nodes provably making high-quality data assets available will be able to reap rewards
4) Block rewards for a given data are distributed based on amount of stake in that data asset, and its popularity
1) marketplaces - a place where providers and consumers can interact
2) keepers - which collectively maintain the network
3) user registry - whitelist of good actors
4) data curation - list of available data
5) data pricing - either free or requested price by provider for data access.
6 ) verification - via a challenge response mechanism that makes sure the keeper actually made a data set available.
Token design: challenge-response protocols where (1) in case of the data having to leave the provider’s premise, a history of provenance for all the consumer’s received data is published to Ocean’s immutable record and where (2) in case of an on-premise computation the data consumer is provably guaranteed a correct model execution on the purchased data.
Ocean core software uses several building blocks, either as libraries or connecting to them via networks. This includes IPDB4 network running BigchainDB5 software for storing metadata and COALAIP6 rights. Data blobs themselves may be stored on-premise, on the centralized cloud, or on the decentralized cloud. On-premise storage may pair with on-premise processing; in which case only the result of the processing is made available to the data consumer.
Ocean incentivizes not only high-quality priced data but also high-quality public or commons data, making a substrate for decentralized data marketplaces.