I have spent the last ten years working on big data, analytics, and artificial intelligence projects. Artificial intelligence has become a matter of curiosity for the masses, especially in the last five years. In this series of articles, I will try to answer common questions about artificial intelligence. I aim to address both ordinary people who wonder about artificial intelligence and those with a professional interest in the subject. I'm going to start with definitions of artificial intelligence, machine learning, deep learning, and artificial neural networks, because it is not possible to have a holistic view of the subject without grasping these terms and the relations between them.
Artificial intelligence refers to systems that create the impression of intelligence. Today, the systems we call artificial intelligence are essentially computer programs.
Artificial intelligence systems can have a body, or they can exist purely as software. Humanoid robots and autonomous cars are examples of artificial intelligence with a body. Siri, Google Now, Cortana, and navigation applications are examples of software AI running on mobile devices.
In traditional programming, a system is designed by writing a program that processes input data and generates output data. In machine learning, the program itself is created from the input and output data.
In this context, machine learning can be defined as the task of writing programs automatically from input and output data by means of a learning algorithm. Machine learning methods are used when the problem is too complicated to express as explicit rules and plenty of input/output data is available. Translation from one language to another is a good example of this situation. Because of its complex nature, it isn't easy to write a translation program without machine learning algorithms. But if we have sufficient volumes of input and output data for translation (text to be translated and its translations), translation programs can be created with machine learning methods.
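To make the contrast concrete, here is a minimal sketch in Python. Instead of translation (which needs huge models), I use temperature conversion as a stand-in problem; the function names and sample data are my own illustration, not something from a real library. The "learning algorithm" here is simple least-squares fitting of a line, one of the simplest possible learners.

```python
# Traditional programming: a human writes the rule explicitly.
def to_fahrenheit_by_rule(celsius):
    return celsius * 9 / 5 + 32

# Machine learning: only input/output examples are given; a learning
# algorithm (here, least-squares fitting of a line) produces the program.
inputs = [0.0, 10.0, 20.0, 30.0, 40.0]
outputs = [to_fahrenheit_by_rule(c) for c in inputs]  # observed examples

n = len(inputs)
mean_x = sum(inputs) / n
mean_y = sum(outputs) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(inputs, outputs)) \
        / sum((x - mean_x) ** 2 for x in inputs)
intercept = mean_y - slope * mean_x

def to_fahrenheit_learned(celsius):
    # The "program" written automatically from the data.
    return slope * celsius + intercept

print(to_fahrenheit_learned(25.0))  # close to 77.0, learned from examples alone
```

The learned function was never told the 9/5-plus-32 rule; it recovered it purely from input/output pairs, which is the essence of the definition above.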
Deep learning, a machine learning method, has become popular since the early 2010s. In deep learning, artificial neural networks are built with many layers. Unlike traditional machine learning methods, whose performance tends to plateau as the data grows, deep learning models continue to learn in proportion to the size of the data.
Deep learning is successfully applied to data types such as free text, audio, and images, which traditional machine learning methods cannot process well. Thanks to the layers they contain, deep artificial neural networks can learn the hierarchy within the data.
To give an example of data hierarchy: free text contains a hierarchy of letters, words, sentences, and paragraphs. Thanks to their hierarchical structure, deep artificial neural networks that have learned a word do not have to learn its letters over again. Similarly, although it is harder to describe mathematically, groups of patterns that recur within images also form an internal hierarchy. A deep learning model that learns what a cat looks like can perceive the image holistically rather than treating it pixel by pixel.
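The "many layers" idea can be sketched in a few lines of plain Python. This is only the forward pass of a tiny network with made-up random weights (a real network would learn them from data), but it shows the structure: each layer transforms the previous layer's output, which is what lets deeper layers represent higher-level features, letters before words, edges before cat faces.

```python
import random

def relu(vector):
    # A common activation function: zero out negative values.
    return [max(0.0, v) for v in vector]

def dense(vector, weights):
    # One fully connected layer: each row of weights produces one output.
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]

random.seed(0)
def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

x = [random.uniform(-1, 1) for _ in range(8)]  # raw input (e.g. pixel values)
w1 = rand_matrix(6, 8)   # layer 1 weights: low-level features
w2 = rand_matrix(4, 6)   # layer 2 weights: mid-level features
w3 = rand_matrix(2, 4)   # layer 3 weights: high-level representation

h1 = relu(dense(x, w1))   # layer 1 output
h2 = relu(dense(h1, w2))  # layer 2 builds on layer 1
out = dense(h2, w3)       # final representation builds on layer 2
print(len(out))  # 2
```

The point is the stacking: `h2` is computed from `h1`, not from the raw input, so each layer can reuse what the layers below it have already extracted.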
Image Source: envano.com
The definition of artificial intelligence is quite vague. Until recently, people said, "Computers can't recognize objects the way we instantly recognize a cat when we see one." Over the past 7-8 years, however, computers have begun to recognize hundreds of different objects in photographs with over 95% accuracy. As we witness on Facebook, artificial intelligence algorithms can automatically recognize and tag people in pictures.
Traditionally, "expert systems" dominated artificial intelligence; in recent years, machine learning methods have gained popularity. Expert systems are programs written manually by domain experts to reach a specific goal.
It is not easy to grasp the terms used in artificial intelligence without understanding how they relate to each other. Roughly, artificial intelligence includes machine learning, and machine learning includes deep learning. An artificial intelligence system can contain both code written by programmers and program code produced by one or more machine learning models.
How can a computer program learn from data? To visualize the situation, let's create an algorithm that learns from a sample dataset. Our goal is an algorithm that predicts the height of high school students.
There are four kinds of students in our table: girls who play basketball, girls who don't, boys who play basketball, and boys who don't. Using the sample table, let's calculate the average height of the students in each group.
| Gender | Playing Basketball? | Avg. Height (cm) |
|--------|---------------------|------------------|
We created a height prediction algorithm based on the students' average heights. Now we have an algorithm that can roughly predict a high school student's height when we know the student's gender and whether he or she plays basketball. By applying this algorithm to students in other classes, we can estimate their heights. I want to make a few comments on the subject.
- An algorithm does not need to understand the meaning of the data it processes to make a prediction.
- The sample we used is far too small for a real prediction model; if the sample had included a 190 cm girl who didn't play basketball, she would have distorted the whole model. I used a small sample to make it easy to visualize.
- I included only the data likely to be effective in predicting students' heights. If we had much more input data, the machine learning algorithm would try to determine which variables are effective predictors and use them automatically. The number of such variables can reach hundreds or even thousands.
In this example, we created a model that predicts a numeric variable from categorical variables simply by taking averages. Of course, real machine learning algorithms do not work with such simple logic; my goal was to make the idea clear. Sound and image data are also modeled by examining the relationships within the data.
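The group-average model above fits in a few lines of Python. The heights below are made-up illustration values, not the article's data; the "training" step computes one average per (gender, plays basketball) group, and the "prediction" step is just a lookup.

```python
from collections import defaultdict
from statistics import mean

# Made-up sample data: (gender, plays_basketball, height in cm).
students = [
    ("girl", True, 178), ("girl", True, 174),
    ("girl", False, 162), ("girl", False, 166),
    ("boy", True, 188), ("boy", True, 184),
    ("boy", False, 172), ("boy", False, 176),
]

# "Training": average height per (gender, plays_basketball) group.
groups = defaultdict(list)
for gender, plays, height in students:
    groups[(gender, plays)].append(height)
model = {key: mean(heights) for key, heights in groups.items()}

# "Prediction": look up the group average for a new student.
def predict_height(gender, plays_basketball):
    return model[(gender, plays_basketball)]

print(predict_height("girl", True))  # 176.0 -- the average of 178 and 174
```

Note that the algorithm never "understands" what gender or basketball mean; it only groups values and averages them, which is exactly the first comment in the list above.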
Thanks for reading.
Cover Image Source: pixabay.com