Thinking machines have been around since 1948, when the Manchester Small-Scale Experimental Machine, nicknamed the Baby, became the first computer to run a program stored in its own memory.

Nearly seventy years later, computers speak freely to us, take our commands, field our questions, deliver mail in the office, greet customers in stores, cook pancakes, produce creative works, win strategy-based games, provide companionship in hospital wards, assist in the operating room and on the factory floor, fight wars, drive cars, and are increasingly a fixture in our homes and workplaces.
A computer’s brain is empty, however, until it’s fed the information and algorithms it needs in order to process what it learns and deliver results.

An algorithm is a step-by-step set of instructions, as in a recipe. One way to teach a computer is to feed it huge quantities of information and rules about the world so that it can call upon an encyclopedic store of know-how, coupled with algorithms that tell it what to do with all that information when given a particular challenge. This is the “big data” approach, and it has had its successes.
Another way is to let the computer learn from examples. The concept is to use a large number of training examples from which the computer can infer the rules for recognizing a box when it sees one, whether the box is expertly or sloppily drawn or merely suggested by a few dashed-off lines. As a rule, the more training examples the computer sees, the more accurate it becomes.
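
To make the idea of inferring a rule from examples concrete, here is a minimal sketch in Python. It is a toy, not how a real image recognizer works: the hidden rule (a number counts as "yes" if it is at least 0.7), the function names, and the sample sizes are all assumptions made up for illustration. The learner never sees the rule itself, only labeled examples, and guesses a cutoff from them.

```python
import random

def true_label(x):
    # Hidden rule the learner never sees directly (an assumption for this toy).
    return x >= 0.7

def learn_threshold(examples):
    """Infer a cutoff from labeled examples: midway between the largest
    'no' example and the smallest 'yes' example."""
    no_values = [x for x, label in examples if not label]
    yes_values = [x for x, label in examples if label]
    low = max(no_values) if no_values else 0.0
    high = min(yes_values) if yes_values else 1.0
    return (low + high) / 2

def accuracy(threshold, test_points):
    # Fraction of unseen points the learned cutoff labels correctly.
    correct = sum((x >= threshold) == true_label(x) for x in test_points)
    return correct / len(test_points)

rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
test_points = [rng.random() for _ in range(2000)]

results = {}
for n in (5, 50, 5000):
    examples = [(x, true_label(x)) for x in (rng.random() for _ in range(n))]
    threshold = learn_threshold(examples)
    results[n] = (threshold, accuracy(threshold, test_points))
    print(n, round(threshold, 3), round(results[n][1], 3))
```

Each printed line shows the number of training examples, the cutoff the learner guessed, and its accuracy on points it has not seen before. With thousands of examples the guessed cutoff should land very close to 0.7; with only a handful it can be well off.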

The big idea here is that the computer works with artificial neural networks (ANNs), which are loosely inspired by the biological neural networks in our brains. Just as biological neurons are interconnected, the artificial ones are arranged in layers that pass information from one to the next.
The first layer learns primitive features, such as an edge in an image (the straight side of the box, say) or the tiniest unit of speech, such as the sound of the letter “b.” It does this by finding combinations of digitized pixels (for image recognition) or sound waves (for speech recognition) that occur more often than they would by chance.
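
As a rough illustration of what "detecting an edge" means, the Python sketch below compares each pixel in a tiny made-up image with its right-hand neighbor. One caveat: in a real network the first layer learns its edge detectors from data, whereas the rule here is written by hand, and the image and function name are assumptions chosen for illustration.

```python
# A tiny 6x6 grayscale "image": dark background (0) with a bright square (1).
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

def vertical_edge_response(img):
    """Subtract each pixel from its right-hand neighbor.
    A nonzero value marks a change in brightness, i.e. a vertical edge."""
    return [[row[c + 1] - row[c] for c in range(len(row) - 1)] for row in img]

edges = vertical_edge_response(image)
for row in edges:
    print(row)
```

Rows that cross the square print `[1, 0, 0, 0, -1]`: a 1 where the image goes from dark to bright (the left side of the box), a -1 where it goes from bright to dark (the right side), and 0 wherever nothing changes.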

Once that layer accurately detects these features, they’re fed to the next layer, which trains itself through a large input of examples to recognize more complex features, like the corner of the box, or a combination of speech sounds, such as “b-o-x.”
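
The hand-off from one layer to the next can be sketched in the same spirit. In the Python example below, a first "layer" finds vertical and horizontal edges in a small made-up image, and a second "layer" reports a corner wherever both kinds of edge meet in the same 2x2 patch of pixels. This is a hand-wired illustration under simplifying assumptions: a real network would learn both layers from examples instead of following rules written in advance.

```python
# Dark background (0) with a bright square (1), as a 6x6 grid of pixels.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
H, W = len(image), len(image[0])

# Layer 1: simple edge detectors.
# vertical[r][c] fires when pixel (r, c) differs from its right-hand neighbor.
vertical = [[abs(image[r][c + 1] - image[r][c]) for c in range(W - 1)]
            for r in range(H)]
# horizontal[r][c] fires when pixel (r, c) differs from the pixel below it.
horizontal = [[abs(image[r + 1][c] - image[r][c]) for c in range(W)]
              for r in range(H - 1)]

# Layer 2: a corner is a 2x2 patch (top-left pixel at r, c) that contains
# at least one vertical edge and at least one horizontal edge.
corners = [
    (r, c)
    for r in range(H - 1)
    for c in range(W - 1)
    if (vertical[r][c] or vertical[r + 1][c])
    and (horizontal[r][c] or horizontal[r][c + 1])
]
print(corners)  # [(0, 0), (0, 4), (4, 0), (4, 4)]
```

The second layer never looks at the raw pixels; it works only with the edge signals passed up from the first layer, and it finds the four corners of the box.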

The process is repeated in successive layers until the system can reliably recognize the object or sound it is attempting to identify. Further, the computer can train itself on known data and apply what it knows to new data.

Denise Shekerjian is a writer and lawyer with a keen interest in creativity and artificial intelligence. You can find her at soulofaword.com.