Pocket Guide

Access and Collect Labeled Data for Deep Learning

Why Is So Much Training Data Necessary?

Deep learning networks try to classify abstract patterns without prior experience or existing knowledge to draw from. Because a network lacks the domain knowledge that a human expert would build into a traditional method, deep learning needs more training data to compensate. Your network will be only as good as the labeled data you provide. Several methods exist for acquiring labeled data.

Knowledge vs Size graph

Collect Your Own Data

You can build a database from scratch by collecting data from sensors. This is a good option in some cases, such as with autonomous vehicles, because billions of vehicles are on the road. Collecting your own data seems straightforward at first, but it involves two demanding tasks: gathering data that covers the entire solution space, and then labeling all of that data.
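One way to check whether collected data spans the solution space is to tally labeled samples for every combination of operating conditions and flag the sparse ones. The following is a minimal sketch, not a prescribed workflow. The condition axes (`weather`, `lighting`), the sample records, and the `coverage_gaps` helper are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product

# Hypothetical condition axes that together define the solution space.
CONDITIONS = {
    "weather": ["clear", "rain", "snow"],
    "lighting": ["day", "night"],
}


def coverage_gaps(samples, min_count):
    """Return condition combinations with fewer than min_count labeled samples."""
    # Count only samples that already carry a label.
    counts = Counter(
        tuple(sample[axis] for axis in CONDITIONS)
        for sample in samples
        if sample.get("label") is not None
    )
    # Walk every combination in the solution space, including unseen ones.
    return {
        combo: counts[combo]
        for combo in product(*CONDITIONS.values())
        if counts[combo] < min_count
    }


# Illustrative records only; real sensor logs would be far larger.
samples = [
    {"weather": "clear", "lighting": "day", "label": "pedestrian"},
    {"weather": "clear", "lighting": "day", "label": "vehicle"},
    {"weather": "rain", "lighting": "night", "label": None},  # collected, not yet labeled
    {"weather": "snow", "lighting": "day", "label": "vehicle"},
]

gaps = coverage_gaps(samples, min_count=2)
for combo, count in gaps.items():
    print(combo, "has only", count, "labeled sample(s)")
```

Combinations that appear in `gaps` are the ones that need more collection or labeling effort before training.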

Data collection diagram

Example: Synthesizing Waveform Data

RF modulation schemes and the impairments that produce noise on them are well known, so they are perfect candidates for synthesized training data. The real test is how well a network trained on synthesized data can label actual RF data.
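As a rough illustration of synthesized training data, the sketch below generates labeled frames of BPSK and QPSK symbols and adds white Gaussian noise at a randomly chosen signal-to-noise ratio. It is a simplified sketch under stated assumptions: the two schemes, the use of additive noise as the only impairment, the frame sizes, and the function names (`synthesize_frame`, `build_dataset`) are illustrative choices, not taken from this guide. A realistic generator would also model impairments such as fading, frequency offset, and timing drift.

```python
import cmath
import math
import random

# Ideal constellation points with unit average power (assumed schemes).
CONSTELLATIONS = {
    "BPSK": [1 + 0j, -1 + 0j],
    "QPSK": [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)],
}


def synthesize_frame(scheme, num_symbols, snr_db, rng):
    """Return one frame of random symbols from `scheme` with added Gaussian noise."""
    points = CONSTELLATIONS[scheme]
    # For unit signal power, total noise power follows directly from the SNR.
    noise_power = 10 ** (-snr_db / 10)
    sigma = math.sqrt(noise_power / 2)  # split evenly between real and imaginary parts
    frame = []
    for _ in range(num_symbols):
        symbol = rng.choice(points)
        noise = complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
        frame.append(symbol + noise)
    return frame


def build_dataset(frames_per_scheme, num_symbols, snr_range_db, seed=0):
    """Return a list of (frame, label) pairs; the label is the scheme name."""
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    dataset = []
    for scheme in CONSTELLATIONS:
        for _ in range(frames_per_scheme):
            snr_db = rng.uniform(*snr_range_db)
            frame = synthesize_frame(scheme, num_symbols, snr_db, rng)
            dataset.append((frame, scheme))
    return dataset


dataset = build_dataset(frames_per_scheme=100, num_symbols=128, snr_range_db=(0, 30))
print(len(dataset), "labeled frames generated")
```

Because the generator knows which scheme produced each frame, every frame is labeled for free. The open question raised above remains: a network trained on such frames still has to be validated against captured RF data.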