We propose OmniLytics, a blockchain-based secure data trading marketplace for
machine learning applications. Using OmniLytics, many distributed data
owners can contribute their private data to collectively train an ML model
requested by a model owner, and receive compensation for their contribution.
OmniLytics enables such model training while simultaneously providing 1) model
security against curious data owners; 2) data security against curious
model owners and data owners; 3) resilience to malicious data owners who provide
faulty results to poison model training; and 4) resilience to malicious model
owners who intend to evade payment. OmniLytics is implemented as a blockchain
smart contract to guarantee the atomicity of payment. In OmniLytics, a model
owner splits its model into private and public parts and publishes the
public part on the contract. Through the execution of the contract, the
participating data owners securely aggregate their locally trained models to
update the model owner’s public model and receive reimbursement through the
contract. We implement a working prototype of OmniLytics on the Ethereum blockchain
and perform extensive experiments to measure its gas cost, execution time, and
model quality under various parameter combinations. For training a CNN on the
MNIST dataset, the model owner (MO) is able to boost its model accuracy from 62% to 83%
within 500 ms of blockchain processing time. This demonstrates the effectiveness
of OmniLytics for practical deployment.
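
To make the secure aggregation step concrete, the following is a minimal sketch of one common technique, pairwise additive masking, under stated assumptions. It is not taken from the OmniLytics protocol itself, which this abstract does not specify. The names `MODULUS`, `SCALE`, `pairwise_masks`, and `secure_average` are hypothetical, and the masks are generated in one place purely for illustration; in a real deployment each pair of data owners would derive its shared mask privately. The sketch also leaves out the smart contract, the payment logic, and any defence against poisoned updates.

```python
import random

MODULUS = 2**32  # hypothetical ring size for masked integer arithmetic
SCALE = 10**4    # hypothetical fixed-point scale for real-valued weights


def quantize(update):
    """Encode real-valued weights as integers modulo MODULUS."""
    return [round(x * SCALE) % MODULUS for x in update]


def pairwise_masks(num_owners, dim, seed=0):
    """Build one mask per owner such that all masks sum to zero mod MODULUS.

    Assumption: masks are simulated centrally here; in practice each pair
    (i, j) would agree on its shared mask without revealing it to others.
    """
    rng = random.Random(seed)
    masks = [[0] * dim for _ in range(num_owners)]
    for i in range(num_owners):
        for j in range(i + 1, num_owners):
            shared = [rng.randrange(MODULUS) for _ in range(dim)]
            for k in range(dim):
                masks[i][k] = (masks[i][k] + shared[k]) % MODULUS  # owner i adds
                masks[j][k] = (masks[j][k] - shared[k]) % MODULUS  # owner j subtracts
    return masks


def mask_update(update, mask):
    """What a data owner would submit: its quantized update plus its mask."""
    return [(q + m) % MODULUS for q, m in zip(quantize(update), mask)]


def aggregate(masked_updates):
    """Sum the masked updates; the pairwise masks cancel in the total."""
    total = [0] * len(masked_updates[0])
    for masked in masked_updates:
        for k, value in enumerate(masked):
            total[k] = (total[k] + value) % MODULUS
    return total


def dequantize_average(total, num_owners):
    """Map the modular sum back to signed reals and average over owners."""
    signed = [v - MODULUS if v >= MODULUS // 2 else v for v in total]
    return [v / SCALE / num_owners for v in signed]


def secure_average(updates, seed=0):
    """Average local updates while the aggregator only sees masked values."""
    masks = pairwise_masks(len(updates), len(updates[0]), seed)
    masked = [mask_update(u, m) for u, m in zip(updates, masks)]
    return dequantize_average(aggregate(masked), len(updates))


if __name__ == "__main__":
    local_updates = [[0.1, -0.2], [0.3, 0.4], [0.2, 0.1]]
    print(secure_average(local_updates))
```

Each masked update on its own looks uniformly random to the aggregator, yet the sum equals the sum of the true updates because every shared mask is added once and subtracted once. Integer arithmetic modulo `MODULUS` is used so that this cancellation is exact rather than subject to floating-point error.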
