The sorcery at work in machine learning systems nowadays owes itself to a pedigree of pioneers in artificial intelligence, mathematics and computer science who laid the flagstones for others to walk on. Whilst I am always interested in reading about how the giant research labs of Google, Netflix, Baidu and so on are pushing the envelope of what can be achieved, I thought I would reflect on a few of the forefathers of machine learning systems as a reminder of how we got here.
1. Alan Turing, The Father of Theoretical Computer Science and AI (1938 onwards)
The German Enigma Machine, with its complex combination of mechanical and electrical subsystems, was used to encipher and decipher messages from the 1920s and was most notably used by the German Navy in World War 2. Turing is best known for his work at Bletchley Park in England, the home of the United Kingdom’s Government Code & Cypher School, where he created Banburismus, a cryptanalytic process used to crack German naval messages by helping to narrow down the unknown, mutually exclusive settings that the Germans had chosen for the machine.
Turing’s legacy also includes the design of the Automatic Computing Engine, for which he presented a paper in 1946 detailing the architecture of a stored-program computer which, although it was never fully built, inspired the design of future generations of computers.
Turing also proposed a test known as the Turing test, regarded as the bedrock of modern artificial intelligence, to assess whether a computer can be considered intelligent or not. The test states that if a panel of human beings communicating with an unknown entity believes that the entity is human, then the computer is said to have passed the test.
2. Walter Pitts, Computational Neural Networks (1943)
The astonishing story of Walter Pitts is that of a boy who was brought up in a family which encouraged him to leave school early. After all, the sooner you leave school, the sooner you can start earning, or so Pitts’s father maintained. Through the disordered life of Walter Pitts, the homeless stints and the trials of not being afforded a structured education, he lived in libraries, gatecrashed lectures at the University of Chicago and absorbed Bertrand Russell’s Principia Mathematica, which came in handy when he met the neurophysiologist Warren McCulloch.
Famously, on reading the work, he wrote to Russell highlighting errors in the first volume, prompting Russell to invite him to Cambridge in England to study.
Walter Pitts is remembered, along with Warren McCulloch, for writing the 1943 paper ‘A Logical Calculus of the Ideas Immanent in Nervous Activity’, which architected the notion that the brain was a universal computing device. Pitts’s application of mathematics to neural networks was the seed for looking at neural networks in the context of artificial intelligence.
3. Arthur Samuel, The World’s First Self-Learning Program (1956)
Machine learning lectures are not complete without featuring an image somewhere of Samuel sitting at an IBM machine flicking switches with a checkers board in front of him. In 1956 Samuel showed the capability of the computer on television by demonstrating his checkers-learning program, widely regarded as the world’s first self-learning program.
Because computer memory was so limited, Samuel used alpha-beta pruning to reduce the number of nodes that had to be examined in the search tree growing out of the present state of the game. As the program learnt, the depth of the search trees increased.
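To give a feel for why pruning saves work, here is a minimal sketch of alpha-beta pruning next to plain minimax. It is not Samuel’s program: the tiny tree of made-up leaf scores (TREE), the function names and the visited list used for counting are all assumptions made for illustration.

```python
import math


def minimax(node, maximizing, visited):
    """Plain minimax: every leaf in the tree is examined."""
    if isinstance(node, int):  # a leaf holds a (made-up) position score
        visited.append(node)
        return node
    values = [minimax(child, not maximizing, visited) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node, maximizing, visited, alpha=-math.inf, beta=math.inf):
    """Minimax with alpha-beta pruning: branches that cannot change the result are skipped."""
    if isinstance(node, int):
        visited.append(node)
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, visited, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:  # the minimising player already has a better option elsewhere
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, visited, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:  # the maximising player already has a better option elsewhere
            break
    return best


# Hypothetical two-ply game tree: three moves for us, two replies each for the opponent.
TREE = [[3, 5], [2, 9], [0, 1]]

full, pruned = [], []
print(minimax(TREE, True, full), "after examining", len(full), "leaves")
print(alphabeta(TREE, True, pruned), "after examining", len(pruned), "leaves")
```

Both searches settle on the same move value, but the pruned search looks at fewer positions, which is the saving that mattered on a machine with very little memory.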
What Samuel disproved is the notion that computers had to be explicitly programmed in order to perform a task. Samuel’s work on the checkers program continued until the 1970s by which time it was able to beat players of a good amateur level.
4. Herbert A Simon, The General Problem Solver (1959)
Simon was a social scientist with an enormous spread of deep interests, from economics to cognitive psychology. He approached decision making from the perspective of rational intelligence gathering, taking alternatives into consideration. This led to him creating the General Problem Solver (GPS) program in 1959, which was able to separate problem-solving strategy from information about particular problems.
Simon was also co-author of an assembly-style language called Information Processing Language (IPL), which was used for manipulating lists and which Simon used for the GPS. Simon’s efforts always ran in parallel with a relentless crusade to understand human problem-solving behaviour, which led to the development of the Elementary Perceiver and Memoriser (EPAM) theory, which looked at how expertise is assembled from chunks of information.
In 1975 he won the Turing Award along with Allen Newell for their contributions to the field of artificial intelligence.
5. Frank Rosenblatt, Neural Networks in Practice (1960)
The comparison between neural networks for computation and the human brain is one that some modern-day data scientists steer away from, but Frank Rosenblatt’s Mark 1 Perceptron in 1960 was the first computer to use neural networks that mimicked the human brain.
Rosenblatt’s work was a manifestation of the work that Pitts and McCulloch had previously done and made computational learning through neural networks a reality. Despite the hype around this groundbreaking technology, further research within the field lay largely dormant until the 1980s.
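As a rough illustration of what learning meant for a perceptron, here is a minimal sketch of the perceptron learning rule in software. The Mark 1 itself was a hardware machine, so the function names, the learning rate and the choice of the logical AND function as training data are all assumptions made for the example.

```python
def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs exceeds the threshold, else stay silent (0)."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Nudge the weights towards the target every time the perceptron gets a sample wrong."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            if error != 0:
                weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
                bias += learning_rate * error
                mistakes += 1
        if mistakes == 0:  # every sample is classified correctly, so stop early
            break
    return weights, bias


# Illustrative training data: the logical AND of two binary inputs.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND_SAMPLES)
for inputs, target in AND_SAMPLES:
    print(inputs, "->", predict(weights, bias, inputs), "(expected", target, ")")
```

Nobody tells the program what the weights should be; it arrives at them by correcting its own mistakes. A single perceptron like this can only separate classes with a straight line, which is one limitation often mentioned when discussing the quiet years that followed.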
6. James William Cooley, Fast Fourier Transform for Signal Representation (1965)
The recent improvement in the performance of convolutional neural networks by Facebook AI Research was down to their custom implementation of the Fast Fourier Transform algorithm, fbFFT. James Cooley was an American mathematician who, together with John Tukey, developed the Fast Fourier Transform, a way of switching a signal between representations. In a convolutional neural network this maps signals into a form that is easier to work with throughout the nodes of the network.
The Fast Fourier Transform makes it much cheaper to perform convolutions, that is, to combine two inputs into a single function: once both inputs are in the frequency domain, the convolution becomes a simple element-by-element multiplication. This is far more efficient for neural networks and machine learning systems and allows powerful compute hardware to perform at its quickest.
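To make that concrete, here is a small sketch of convolution by way of the Fast Fourier Transform, checked against the slow direct method. It is a textbook radix-2 recursion written in plain Python and has nothing to do with how fbFFT is implemented; the function names and the sample signals are assumptions made for the example.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT; the input length must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result


def fft_convolve(a, b):
    """Convolve two sequences by multiplying their spectra element by element."""
    out_len = len(a) + len(b) - 1
    size = 1
    while size < out_len:  # pad with zeros up to the next power of two
        size *= 2
    spectrum_a = fft([complex(x) for x in a] + [0j] * (size - len(a)))
    spectrum_b = fft([complex(x) for x in b] + [0j] * (size - len(b)))
    product = [x * y for x, y in zip(spectrum_a, spectrum_b)]
    restored = fft(product, inverse=True)
    return [(value / size).real for value in restored[:out_len]]


def direct_convolve(a, b):
    """The straightforward method: multiply every pair of elements."""
    result = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            result[i + j] += x * y
    return result


signal = [1.0, 2.0, 3.0]   # illustrative input
kernel = [0.0, 1.0, 0.5]   # illustrative filter
print([round(v, 6) for v in fft_convolve(signal, kernel)])
print(direct_convolve(signal, kernel))
```

On inputs this small the direct method is perfectly fine. The transform pays off as the inputs grow, because the direct method multiplies every pair of elements while the transform needs on the order of n log n operations.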
7. Joseph Weizenbaum, Birth of Natural Language Processing Applications (1966)
Weizenbaum is credited with publishing a program called ELIZA, which is regarded as the first program able to demonstrate Natural Language Processing. In particular, the DOCTOR script appeared to be able to simulate a conversation between the user and a slightly aloof psychotherapist, although when cornered the program’s strategy was to ask an open question which side-stepped any vagueness. The illusion raised ethical issues and questions, especially in Weizenbaum’s mind when his secretary asked him to leave the room so that she could use DOCTOR in private.
In 1976, Weizenbaum’s book ‘Computer Power and Human Reason: From Judgment to Calculation’ was published. It looked at the ethics of artificial intelligence and asserted that whilst it is possible to create artificial intelligence, it would be stripped of other factors which humans use in decision-making processes, such as compassion.
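The general trick can be sketched in a few lines: match the user’s words against a handful of patterns, reflect them back as a question, and fall back on an open prompt when nothing matches. The rules and the fallback below are invented for illustration and are not taken from Weizenbaum’s DOCTOR script.

```python
import re

# Hypothetical rules: a pattern to look for and a reply template that reuses the match.
RULES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father|family)\b", re.IGNORECASE), "Tell me more about your {0}."),
]

# Open prompt used when the program is cornered and no rule applies.
FALLBACK = "Please go on."


def respond(message):
    """Return a reflected question if a rule matches, otherwise the open fallback."""
    text = message.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(match.group(1).lower())
    return FALLBACK


for line in ["I feel tired.", "My mother calls a lot", "The weather is odd"]:
    print(">", line)
    print(respond(line))
```

There is no understanding anywhere in this sketch, only text substitution, which is the kind of gap between appearance and substance that the illusion rests on.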
I would also like to write a synopsis of modern-day pioneers of machine learning and deep learning, although with the money that is directed at research labs, the pool of people working in the field nowadays is huge. What are your thoughts? Geoffrey Hinton? Andrew Ng? Or maybe specific tools or organisations, such as Netflix or Google?