Reducing the Data Requirements of Smart Machines

Today's machine learning (ML) systems learn by example, ingesting large amounts of data that human analysts have individually labeled to produce the expected output. As these systems have progressed, deep neural networks (DNNs) have emerged as the state of the art in ML models. DNNs can power tasks such as machine translation and speech recognition with a high degree of accuracy. Training a DNN, however, requires a considerable amount of labeled data, possibly as many as 10^10 training examples. Collecting and labeling that much data takes a lot of time and is costly.

The challenge with ML models is not limited to collecting and labeling large amounts of data. These systems are also prone to breaking down when the operating environment changes even slightly. If the conditions under which a recognition or speaker identification system operates change and the system has to be retrained on a new data set, adapting the model can take almost as much time and energy as developing one from scratch.

To reduce the cost and time of training and adapting ML models, DARPA is launching a new program called Learning with Less Labels (LwLL). Through LwLL, DARPA will research new learning algorithms that need far less data to train or update.

Wade Shen, a DARPA program manager, said that under LwLL the agency is looking to reduce both the amount of data required to build a model from scratch and the amount of data a model needs to adapt to changes. In other words, a system that today needs millions of images for training might in future need just one, and adapting a system might take about 100 labeled examples instead of the millions required at present.

To achieve this goal, LwLL researchers will explore two technical areas. The first focuses on building learning algorithms that can learn and adapt efficiently. Researchers will work on algorithms that reduce the number of labeled examples required, and they will establish a program metric to ensure that system performance is not compromised.

In the second technical area, researchers will characterize ML problems in terms of both the difficulty of the decisions involved and the complexity of the data used to make them.