New Machine-Learning Model Improves Security, Privacy for Data
High tech is keeping the world’s economies running right now — from the demand for COBOL programmers to keep the antiquated check-writing process going at the IRS and Treasury, to new machine-learning platforms and AI-driven advances in food distribution. Now a new system of AI oversight is in development that could improve security in machine-learning systems that rely on sensitive private data.
In an academic paper published this week on arXiv.org, a team of researchers from Princeton, Microsoft, the nonprofit Algorand Foundation and Technion presented Falcon, a framework for secure computation of AI models on distributed systems. It’s the first secure C++ framework to support high-capacity AI models and batch normalization, a technique for improving both the speed and stability of model training. Falcon automatically aborts operations when it detects malicious attackers and can outperform existing solutions by up to a factor of 200.
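To make the batch normalization technique mentioned above concrete, here is a minimal plaintext sketch of what the operation computes — normalize each feature of a batch to zero mean and unit variance, then apply a learned scale and shift. This is an illustration of the standard technique, not code from the Falcon framework, and the parameter names (`gamma`, `beta`) are the conventional ones rather than anything taken from the paper:

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Batch normalization: center and rescale each feature (column)
    of a batch, then apply learned scale (gamma) and shift (beta)."""
    mean = x.mean(axis=0)           # per-feature mean over the batch
    var = x.var(axis=0)             # per-feature variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)  # normalized activations
    return gamma * x_hat + beta

# A toy batch of three examples with two features each.
batch = np.array([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0]])
out = batch_norm(batch)
```

The secure-computation challenge Falcon addresses is that steps like the division by the standard deviation are expensive to evaluate when the data is encrypted or secret-shared, which is why native support for batch normalization in a secure framework is notable.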
The project has drawn great interest from AI developers, especially those who need strong security when operating health care systems and research models. A venturebeat.com story expanded on the problem:
“The Royal Free London NHS Foundation Trust, a division of the U.K.’s National Health Service based in London, provided Alphabet’s DeepMind with data on 1.6 million patients without their consent. Google — whose health data-sharing partnership with Ascension became the subject of scrutiny in November — abandoned plans to publish scans of chest X-rays over concerns that they contained personally identifiable information. This past summer, Microsoft quietly removed a data set (MS Celeb) with more than 10 million images of people after it was revealed that some weren’t aware they had been included.”
Techniques called federated learning and homomorphic encryption are being used to help mitigate these privacy issues, but machine-learning models built with them still suffer computational lags, because protecting the massive data sets needed to maximize accuracy adds significant overhead.
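As a rough illustration of the federated-learning idea referenced above — training on data held by multiple parties without pooling the raw data — here is a minimal sketch of federated averaging, the canonical aggregation step. The client weights and data-set sizes are invented for the example; real systems exchange full model parameter tensors over many training rounds:

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Aggregate locally trained model weights from several data
    holders: each client trains on its own private data, and only
    the weights are combined, weighted by local data set size."""
    total = sum(client_sizes)
    return sum(w * (n / total)
               for w, n in zip(client_weights, client_sizes))

# Three hypothetical clients with locally trained weight vectors.
clients = [np.array([0.2, 0.4]),
           np.array([0.6, 0.8]),
           np.array([1.0, 1.2])]
sizes = [100, 100, 200]   # examples held by each client
global_weights = federated_average(clients, sizes)  # → [0.7, 0.9]
```

The lag the article mentions comes from repeating rounds like this — and, when homomorphic encryption is layered on top, from performing the arithmetic on ciphertexts instead of plain numbers.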
Falcon divides the users in a distributed AI scenario into two types: data holders, who own the training data sets, and query users, who query the system after training. Falcon leverages new protocols for the computation of nonlinear functions, like rectified linear units (ReLU), a type of activation function. Falcon also supports both semi-honest protocols, where parties are assumed to follow the prespecified rules exactly and can’t change their inputs or outputs, and maliciously secure protocols, which remain safe even when corrupted parties deviate from the rules by switching inputs and outputs or ignoring the rules entirely.
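Two of the building blocks named above can be sketched in a few lines. Below is a toy illustration of additive secret sharing — the basic trick that lets multiple parties jointly hold a value while no single party learns it — alongside the plaintext ReLU function that Falcon’s custom protocols evaluate securely. This is a didactic sketch of the general technique, not Falcon’s actual three-party protocol, and computing ReLU *on the shares themselves* (without reconstruction) is exactly the hard part the new protocols address:

```python
import random

MODULUS = 2**32  # ring size; all arithmetic is modulo this value

def share(secret, n_parties=3):
    """Split a secret into additive shares: any single share looks
    uniformly random, but all shares sum to the secret mod MODULUS."""
    shares = [random.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares

def reconstruct(shares):
    """Recombine the shares to recover the secret."""
    return sum(shares) % MODULUS

def relu(x):
    """ReLU, the nonlinear activation: linear protocols handle sums
    and scalings of shares cheaply, but this max() is what requires
    Falcon's specialized secure-comparison machinery."""
    return max(0, x)

x = 41
shares = share(x)
assert reconstruct(shares) == x  # no single share reveals x
```

The semi-honest versus malicious distinction maps onto this sketch as a question of trust: semi-honest protocols assume each party submits its true shares, while maliciously secure protocols add checks so that a party swapping or corrupting its shares is detected — which is why Falcon can abort when it spots an attacker.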
“[T]he sensitive nature of [certain] data demands deep learning frameworks that allow training on data aggregated from multiple entities while ensuring strong privacy and confidentiality guarantees,” conclude the researchers. “A synergistic combination of secure computing primitives with deep learning algorithms would enable sensitive applications to benefit from the high prediction accuracies of neural networks.”
Read more at venturebeat.com.