Machines that learn to do, and do to learn: What is artificial intelligence?
Artificial intelligence is much talked about, but what exactly is it? Georgios Petropoulos explores the origins, methods and potential of machine learning.
Artificial intelligence (AI) refers to intelligence exhibited by machines. It lies at the intersection of big data, machine learning and computer programming. Computer programming contributes the necessary design and operational framework. It can make machines capable of carrying out a complex series of computations automatically. These computations can be linked to specific actions in the case of robots (which are in principle programmable through computers). Machine learning enables computer programs to acquire knowledge and skills, and even improve their own performance. Big data provides the raw material for machine learning, and offers examples on which computer programs can “practice” in order to learn, exercise, and ultimately perform their assigned tasks more efficiently.
The idea of intelligent machines arose in the early 20th century. From the beginning, the idea of “human-like” intelligence was key. Following Vannevar Bush’s seminal work from 1945, where he proposed “a system which amplifies people’s own knowledge and understanding”, Alan Turing asked the question: “Can a machine think?” In his famous 1950 imitation game, Turing proposed a test of a machine’s ability to exhibit intelligent behaviour equivalent to that of a human. A human evaluator judges a text-only conversation between a human and a machine that is designed to generate human-like responses. The evaluator would be aware that one of the two partners in conversation is a machine, and all participants would be separated from one another. If the evaluator cannot reliably tell the machine from the human, the machine is said to have passed the test.
The specific term “artificial intelligence” was first used by John McCarthy in the summer of 1956, when he held the first academic conference on the subject at Dartmouth. However, the traditional approach to AI was not really about independent machine learning. Instead, the aim was to specify rules of logical reasoning and real-world conditions which machines could be programmed to follow and react to. This approach was time-consuming for programmers and its effectiveness relied heavily on the clarity of rules and definitions.
For example, applying this rule-and-content approach to machine language translation would require the programmer to proactively equip the machine with all grammatical rules, vocabulary and idioms of the source and target languages. Only then could one feed the machine a sentence to be translated. As words cannot be reduced only to their dictionary definition and there are many exceptions to grammar rules, this approach would be inefficient and ultimately offer poor results, at least if we compare the outcome with a professional, human translator.
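To make the rule-and-content approach concrete, here is a minimal Python sketch of such a translator. The vocabulary, the idiom table and the English-to-French pairing are hypothetical and chosen only for illustration; they are not taken from any real translation system. Every word and idiom has to be typed in by hand, and anything the programmer did not anticipate simply falls through.

```python
# Hypothetical hand-written resources: a programmer must supply every entry.
VOCABULARY = {"the": "le", "cat": "chat", "sleeps": "dort", "it": "il", "rains": "pleut"}
IDIOMS = {"it rains cats and dogs": "il pleut des cordes"}


def translate(sentence: str) -> str:
    """Translate word by word, unless the whole sentence is a known idiom."""
    text = sentence.lower().strip()
    if text in IDIOMS:
        return IDIOMS[text]
    # Words missing from the vocabulary are returned in brackets, untranslated.
    return " ".join(VOCABULARY.get(word, f"[{word}]") for word in text.split())


print(translate("The cat sleeps"))          # le chat dort
print(translate("It rains cats and dogs"))  # il pleut des cordes
print(translate("It rains buckets"))        # il pleut [buckets]
```

The third sentence shows the weakness described above: a phrase that falls outside the hand-written rules comes back only partly translated.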
Modern AI has deviated from this approach by adopting the notion of machine learning. This shift follows in principle Turing’s recommendation to teach a machine to perform specific tasks as if it were a child. By building a machine with sufficient computational resources, offering training examples from real world data and by designing specific algorithms and tools that define a learning process, rather than specific data manipulations, machines can improve their own performance through learning by doing, inferring patterns, and hypothesis checking.
Thus it is no longer necessary to programme in advance long and complicated rules for a machine’s specific operations. Instead programmers can equip machines with flexible mechanisms that facilitate their adaptation to their task environment. At the core of this learning process are artificial neural networks, inspired by the networks of neurons in the human brain. An article in The Economist provides a nice illustration of how a simple artificial neural network works. It is organised in layers. Data is introduced to the network through an input layer. Then come the hidden layers, in which information is processed, and finally an output layer where results are released. Each neuron within the network is connected to many others, as both inputs and outputs, but the connections are not equal. They are weighted such that a neuron’s different outward connections fire at different levels of input activation. A network with many hidden layers can combine, sort or divide signals by applying different weights to them and passing the result to the next layer. The number of hidden layers is indicative of the ability of the network to detect increasingly subtle features of the input data. The training of the network takes place through the adjustment of neurons’ connection weights, so that the network gives the desired response when presented with particular inputs.
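The layered structure and the idea of training by weight adjustment can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method of any particular system: the network size, the starting weights, the sigmoid activation, the bias terms and the learning rate are all hypothetical choices, and only the output layer is adjusted, to keep the example short.

```python
import math


def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Each neuron sums its weighted inputs, adds a bias, and applies sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
            for neuron_weights, bias in zip(weights, biases)]


def network_forward(inputs, layers):
    """Pass data from the input layer through every later layer in turn.
    Returns the activations of every layer, the input included."""
    activations = [inputs]
    for weights, biases in layers:
        activations.append(layer_forward(activations[-1], weights, biases))
    return activations


def adjust_output_layer(layers, inputs, target, rate=0.5):
    """One training step on the final layer only (single output neuron):
    nudge each connection weight so the output moves toward the target."""
    activations = network_forward(inputs, layers)
    hidden, output = activations[-2], activations[-1][0]
    delta = (target - output) * output * (1.0 - output)  # error times sigmoid slope
    weights, biases = layers[-1]
    weights[0] = [w + rate * delta * h for w, h in zip(weights[0], hidden)]
    biases[0] += rate * delta


# Hypothetical network: 2 inputs, one hidden layer of 3 neurons, 1 output neuron.
layers = [
    ([[0.5, -0.4], [0.3, 0.8], [-0.6, 0.1]], [0.0, 0.1, -0.1]),  # hidden layer
    ([[0.7, -0.2, 0.4]], [0.05]),                                 # output layer
]
sample, target = [1.0, 0.0], 1.0
before = network_forward(sample, layers)[-1][0]
adjust_output_layer(layers, sample, target)
after = network_forward(sample, layers)[-1][0]
print(before, after)  # the output moves closer to the target after one adjustment
```

Real networks adjust the weights of every layer, usually with backpropagation, and repeat the step over many examples; the single step here only shows the direction of travel.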
The goal of the neural network is to solve problems in the same way that a hypothesised human brain would, albeit without any “conscious” codified awareness of the rules and patterns that have been inferred from the data. Modern neural network projects typically work with a few thousand to a few million neural units and millions of connections, which are still several orders of magnitude less complex than the human brain and closer to the computing power of a worm (see the Intel AI Documentation for further details). While networks with more hidden layers are expected to be more powerful, training deep networks can be rather challenging, owing to the difference in speed at which every hidden layer learns.
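One common explanation for layers learning at different speeds is the so-called vanishing-gradient effect; the text above does not name it, so treat this as an illustrative assumption. The sketch below uses a deliberately artificial network, a chain of five one-neuron layers with sigmoid activations and all weights set to 1, and computes how strongly the output responds to a small change in each layer's weight.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def weight_gradients(x, weights):
    """Chain of one-neuron layers: each output is sigmoid(weight * previous output).
    Returns d(final output)/d(weight) for every layer, first layer first."""
    activations, slopes = [x], []
    for w in weights:
        a = sigmoid(w * activations[-1])
        activations.append(a)
        slopes.append(a * (1.0 - a))      # derivative of sigmoid at this layer
    gradients = []
    downstream = 1.0                      # product of the later layers' factors
    for k in reversed(range(len(weights))):
        gradients.append(activations[k] * slopes[k] * downstream)
        downstream *= weights[k] * slopes[k]
    return gradients[::-1]


gradients = weight_gradients(x=1.0, weights=[1.0] * 5)
for layer, gradient in enumerate(gradients, start=1):
    print(f"layer {layer}: {gradient:.6f}")
```

Because the slope of the sigmoid never exceeds 0.25, each extra layer between a weight and the output shrinks the signal further. In this toy chain the first layer's gradient comes out hundreds of times smaller than the last layer's, so the early layers would learn far more slowly.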
By categorising the ways this artificial neuron structure can interact with the source data and stimuli, we can identify three different types of machine learning:
- Supervised learning: the neural network is provided with examples of inputs and corresponding desired outputs. It then “learns” how to accurately map inputs to outputs by adjusting the weights and activation thresholds of its neural connections. This is the most widely used technique. A typical use would be training email servers to choose which emails should automatically go to the spam folder. Another task that can be learnt in this way is finding the most appropriate results for a query typed in a search engine.
- Unsupervised learning: the neural network is provided with example inputs and then it is left to recognise features, patterns and structure in these inputs without any specific guidance. This type of learning can be used to cluster the input data into classes on the basis of their statistical properties. It is particularly useful for finding things that you do not know the form of, such as as-yet-unrecognised patterns in a large dataset.
- Reinforcement learning: the neural network interacts with an environment in which it must perform a specific task, and receives feedback on its performance in the form of a reward or a punishment. This type of learning corresponds, for example, to the training of a network to play computer games and achieve high scores.
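The three types of learning listed above differ mainly in the kind of feedback the learner receives, which a short Python sketch can make visible. To stay brief it uses deliberately tiny stand-ins instead of neural networks: a perceptron for supervised learning, a two-cluster k-means for unsupervised learning and an epsilon-greedy learner for reinforcement learning. All data, word lists, rewards and parameter values are hypothetical.

```python
import random


# --- Supervised: inputs come with the desired outputs (labels). ---
def train_perceptron(examples, epochs=10):
    """Adjust the weights whenever the prediction disagrees with the label."""
    weights, bias = [0.0] * len(examples[0][0]), 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = label - predict(weights, bias, features)
            weights = [w + error * x for w, x in zip(weights, features)]
            bias += error
    return weights, bias


def predict(weights, bias, features):
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


# --- Unsupervised: inputs only; the algorithm groups them itself. ---
def two_means(values, rounds=10):
    """1-D k-means with k=2: returns a cluster index (0 or 1) per value."""
    centres = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(rounds):
        labels = [0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1 for v in values]
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centres[k] = sum(members) / len(members)
    return labels


# --- Reinforcement: no labels, only a reward after each action. ---
def learn_by_reward(reward_of, n_actions, steps=200, epsilon=0.1, seed=0):
    """Estimate the average reward of each action by trial and error."""
    rng = random.Random(seed)
    totals, counts = [0.0] * n_actions, [0] * n_actions
    for step in range(steps):
        if step < n_actions:            # try every action once first
            action = step
        elif rng.random() < epsilon:    # occasionally explore
            action = rng.randrange(n_actions)
        else:                           # otherwise exploit the best estimate
            action = max(range(n_actions), key=lambda a: totals[a] / counts[a])
        totals[action] += reward_of(action)
        counts[action] += 1
    return [t / c for t, c in zip(totals, counts)]


# Hypothetical spam data: counts of the words ["free", "winner", "meeting", "report"].
emails = [([1, 1, 0, 0], 1), ([0, 0, 1, 0], 0), ([1, 0, 0, 0], 1),
          ([0, 0, 1, 1], 0), ([0, 1, 0, 0], 1), ([0, 0, 0, 1], 0)]
weights, bias = train_perceptron(emails)
is_spam = predict(weights, bias, [1, 1, 0, 1])   # an email not seen in training

clusters = two_means([1.0, 1.2, 0.8, 5.0, 5.3, 4.7])

# Hypothetical game: action 1 scores a point, action 0 scores nothing.
values = learn_by_reward(lambda action: 1.0 if action == 1 else 0.0, n_actions=2)
print(is_spam, clusters, values)
```

The spam filter is told the right answer for every training email, the clustering routine is told nothing and still separates the small values from the large ones, and the game player discovers which action pays only through the rewards it collects.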
Since artificial neural networks are based on a posited structure and function of the human brain, a natural question to ask is whether machines can outperform human beings. Indeed, there are several examples of games and competitions in which machines can now beat humans. By now, machines have topped the best humans at most games traditionally held up as measures of human intellect, including chess (recall for example the 1997 game between IBM’s Deep Blue and the champion Garry Kasparov), Scrabble, Othello, and Jeopardy!. Even in more complex games, machines seem to be quickly improving their performance through their learning process. In March 2016, the AlphaGo computer program from the AI startup DeepMind (which was bought by Google in 2014) beat Lee Sedol in a five-game match of Go – the oldest board game, invented in China more than 2,500 years ago. This was the first time a computer Go program had beaten a 9-dan professional without handicaps.
Probably the most striking performance of machine learning took place in the ImageNet Large Scale Visual Recognition Challenge, which evaluates algorithms for object detection and image classification at large scale. For any given word, ImageNet contains several hundred images. In the annual ImageNet contest several research groups compete in getting their computers to recognise and label images automatically. Humans on average label an image correctly 95% of the time. The corresponding figure for the winning AI system in 2010 was 72%, but over the next couple of years the error rate fell sharply. In 2015, machines managed to achieve an accuracy of 96%, reducing the error rate below the human average level for the first time.
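The paragraph above switches between accuracy and error rate, so it may help to put the quoted figures side by side. The short Python sketch below uses only the three percentages given in the text; the dictionary keys are labels chosen here for convenience.

```python
# Accuracy figures quoted in the text, in percent.
accuracy = {"human average": 95, "winner 2010": 72, "best 2015": 96}

# The error rate is the share of images labelled incorrectly.
error_rate = {name: 100 - value for name, value in accuracy.items()}

# Relative reduction in machine error between 2010 and 2015.
reduction = (error_rate["winner 2010"] - error_rate["best 2015"]) / error_rate["winner 2010"]

print(error_rate)          # {'human average': 5, 'winner 2010': 28, 'best 2015': 4}
print(f"{reduction:.0%}")  # 86%
```

Seen this way, the machine error rate fell from 28% to 4%, just under the 5% human average.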
It is important to understand that many of these machines are programmed to perform specific tasks, narrowing the scope of their operation. So humans are still superior in performing general tasks and using experience acquired in one task to deliver another task. Take, for example, the ImageNet challenge. As one of the challenge’s organisers, Olga Russakovsky, pointed out in 2015, “the programs only have to identify images as belonging to one of a thousand categories; humans can recognise a larger number of categories, and also (unlike the programs) can judge the context of an image”.
Multitask learning and general-task AI are still lagging behind human cognitive ability and performance. Indeed, they are the next big challenges for AI research teams. For example, driving a specific route in a controlled environment is quite a different task for a self-driving car from driving on the open road amid varied and unpredictable traffic and weather conditions.
Nevertheless, the rapid improvement in the performance of machines through learning has been easily observable since 2012, when deep learning neural networks started to be constructed and operated. Technological advances have increased the rate at which machines improve their function, further accelerating the progress of AI.
So what does this mean? AI could bring substantial social benefits, which will improve many aspects of our lives. For example, smart machines can make healthcare more effective, by providing more accurate and timely diagnoses and treatments. The increased ability of machine scanners to analyse images like X-rays and CT scans can reduce the error margin in a diagnosis. It can also lead to great time efficiencies. Forbes illustrates how efficient AI can be in the fight against breast cancer. Until now, women have depended on monthly home exams and annual mammograms to detect breast cancer. Cyrcadia Health, a cancer therapy startup, has developed a sensor-filled patch that can be inserted comfortably under a bra for daily wear. Connecting through the woman’s smartphone or PC, the patch uses machine-learning algorithms to track the woman’s breast tissue temperatures and analyse this data at the Cyrcadia lab. If it detects a change in pattern, the technology will quickly alert the woman, and her healthcare provider, to schedule a follow-up with her doctor.
The first generation of AI machines has already arrived as computer algorithms in online translation, search, digital marketplaces and collaborative economy markets. Algorithms are learning how to perform their tasks more efficiently, providing a better and higher quality experience for the online users. Such efficiency gains through smart technology lead to high social benefits as reported by numerous studies (for instance, see Petropoulos, 2017 for the benefits from the collaborative economy, which are made possible by AI and machine learning).
The final destination of AI research is still uncertain. But machines will continue to become ever smarter, performing the tasks assigned to them ever more efficiently. Depending on their design and construction, they can have many applications. However, they will also interact with humans in sometimes challenging ways. Policymakers and researchers alike need to be prepared for the AI revolution.
Republishing and referencing
Bruegel considers itself a public good and takes no institutional standpoint. Anyone is free to republish and/or quote this post without prior consent. Please provide a full reference, clearly stating Bruegel and the relevant author as the source, and include a prominent hyperlink to the original post.