Data Watermarking: The missing link to on-/off-chain implementation of distributed data marketplaces
IMDEA Networks is the beneficiary of this project
  • Financed by: Next Generation Internet (NGI) ONTOCHAIN first open call
  • Duration: March 2021 to May 2021
  • Contact: Nikolaos LAOUTARIS, Principal Investigator for IMDEA Networks

Data Marketplaces (DMs), in which data sellers make datasets available for purchase by data buyers, are emerging fast in the big data market as a way to monetise personal or aggregate, often anonymised, datasets. Monolithic DMs operating under a single authority require their users to place full trust in a single company or organisation. They may also end up producing additional monopolies or oligopolies on the Internet. Several efforts are therefore under way to develop distributed marketplaces, often on top of Distributed Ledger Technologies (DLTs).

Fully on-chain approaches run into scalability problems when faced with large datasets, such as those traded over DMs. To facilitate trustworthy off-chain handling of datasets in distributed DMs, DW-marking will develop a new breed of digital watermarking techniques for protecting ownership and establishing accountability in the off-chain handling of datasets.

Existing digital watermarking techniques for media, such as video, images, and software, are not well suited to DMs, since they were developed for large binary files of a particular encoding that can be easily manipulated without affecting the contained information (e.g., slightly changing the colour tone of a few pixels). Such operations cannot be applied to datasets that carry structured and loosely structured information in the form of strings, integers, and floating-point numbers. Any change to such information can render it useless (e.g., changing a character in a string) or inaccurate (integers, floating-point numbers).
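To make the problem concrete, the following minimal sketch applies a media-style trick, overwriting the least significant bit of each value, to a small column of integers. The function names and the sample values are illustrative assumptions and are not part of DW-marking. The sketch shows only that an operation which is invisible in a pixel directly alters the data when applied to a dataset field.

```python
def embed_lsb(values, bits):
    """Overwrite the least significant bit of each integer with a watermark bit.

    This mimics a classic image-watermarking trick; it is used here only to
    show the distortion it causes on structured data.
    """
    return [(v & ~1) | b for v, b in zip(values, bits)]


def extract_lsb(values):
    """Read the watermark bits back from the least significant bits."""
    return [v & 1 for v in values]


# Hypothetical column of a traded dataset (e.g. ages) and a 4-bit mark.
ages = [34, 27, 51, 42]
mark = [1, 0, 0, 1]

marked = embed_lsb(ages, mark)

# Count how many records no longer hold their true value.
changed = sum(1 for original, new in zip(ages, marked) if original != new)
```

The mark can be recovered with `extract_lsb(marked)`, but any entry whose original low bit differs from the mark bit now holds a wrong value. In an image this shift would be imperceptible; in a dataset it is an inaccurate record.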

Therefore, we will develop a new breed of watermarking techniques suited to the nature of datasets traded in contemporary DMs. In addition to developing core watermarking techniques, we will also develop protocols for using them to power dataset provenance primitives, as well as interfaces for connecting off-chain watermarking techniques with on-chain primitives for the same purpose.
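One common way to link off-chain data to an on-chain record, shown here purely as an illustration, is to keep the dataset off-chain and store only a fixed-size digest of it on the ledger. The sketch below is an assumption about how such a link could look; it is not the project's interface, and the names `dataset_digest`, `make_provenance_record`, and `matches_record`, as well as the record layout, are hypothetical. The record is a plain dictionary standing in for whatever an on-chain primitive would store.

```python
import hashlib
import json


def dataset_digest(rows):
    """Return a SHA-256 hex digest of a canonical JSON encoding of the rows."""
    canonical = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_provenance_record(rows, owner_id):
    """Build the small record that would be stored on-chain (hypothetical layout)."""
    return {"owner": owner_id, "digest": dataset_digest(rows)}


def matches_record(rows, record):
    """Check whether an off-chain copy is byte-identical to the registered dataset."""
    return dataset_digest(rows) == record["digest"]


# Hypothetical dataset and seller identifier.
rows = [{"id": 1, "city": "Madrid"}, {"id": 2, "city": "Athens"}]
record = make_provenance_record(rows, "seller-1")

# A copy with a single altered field no longer matches the registered digest.
tampered = [{"id": 1, "city": "Madrid"}, {"id": 2, "city": "Athina"}]
```

A digest of this kind only recognises exact copies: any edit to the dataset, however small, produces a different digest. That limitation is why a watermark embedded in the data itself is needed to establish ownership of copies that have been modified.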

The ONTOCHAIN project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 957338.