Effective learning: the near future of AI
There is no doubt that the ultimate future of AI is to reach and surpass human intelligence. But that is a very ambitious feat to achieve. Even the most optimistic among us bet that human-level AI (AGI or ASI) is still 10 to 15 years away, with skeptics willing to bet that it will take ages, if it is possible at all. Well, that is not the subject of this post (you should read this post instead if you want to know more about superintelligence). Here we will talk about a more tangible and nearer future and discuss the emerging, powerful AI algorithms and techniques that we believe will shape the near future of AI.
AI has started to outperform humans in a few selected, specific tasks: for example, beating doctors at diagnosing skin cancer and defeating Go world champions. But the same systems and models fail when asked to perform tasks different from the ones they were trained for. This is why, in the long run, a generally intelligent system that efficiently performs many tasks without needing to be retrained for each one is dubbed the future of AI. But in the near future, long before AGI is in place, how will scientists ensure that AI-powered algorithms overcome the problems they face today, get out of the laboratory and become objects of everyday use?
As you look around, AI is winning one castle at a time (read our articles on how AI overtakes humans, part one and part two). What could possibly go wrong in such a win-win game? Humans produce more and more data (the fodder AI consumes) over time, and our hardware capabilities keep improving as well. After all, more data and better computation are the reasons the Deep Learning revolution started in 2012, aren't they? The truth is that human expectations grow even faster than data and compute do. Data scientists must think of solutions beyond what currently exists to solve real-world problems. For example, image classification is, as most people think, scientifically a solved problem (if we resist the urge to say "100% accuracy or GTFO"). We can categorize images (e.g. into images of cats or dogs) at human-level accuracy using AI. But can this already be used for real use cases? Can AI provide a solution to the more practical problems humans face? In some cases yes, but in many cases we are not there yet.
We will walk you through the challenges that are the main obstacles to developing real-world solutions with AI. Suppose you want to classify images of cats and dogs; we will use this example throughout the article.
Our example algorithm: Classify images of cats and dogs
The graphic below summarizes the challenges:
Challenges involved in real-world AI development
Let's talk about these challenges in detail:
Learning with less data:
- The training data that the most effective deep learning algorithms consume needs to be labeled according to the content / functionality it contains. This process is called annotation.
- Algorithms cannot use raw data found around us as-is. Annotating a few hundred (or a few thousand) data points is easy, but our human-level image classification algorithms took a million annotated images to learn well.
- So the question is: is annotating a million images even feasible? If not, how can AI scale with less annotated data?
Solving a more generic problem:
- Although the datasets an algorithm trains on are fixed, their use in the real world is far more varied.
- Although we have improved computer vision algorithms to the point where they detect objects as well as humans do, each of these algorithms, as mentioned earlier, solves a very specific problem, whereas human intelligence is much more generic.
- Our example AI algorithm, which classifies cats and dogs, will not be able to identify a rare dog species if it is never fed images of that species.
Fitting additional data:
- Incremental data is another major challenge. In our example, if we try to recognize cats and dogs, we might train our AI on images of cats and dogs of different species for our first deployment. But when a new species is discovered, we have to train the algorithm to recognize "Kotpies" alongside the previous species.
- While the new species might be more similar to the others than we think, so that the algorithm can be easily adapted, there are cases where this is harder and requires full retraining and re-evaluation.
- The question is: can AI at least adapt to these small changes?
To make AI immediately usable, the idea is to solve the aforementioned challenges with a set of approaches we call Effective Learning (please note that this is not an official term; I am coining it just to avoid writing meta-learning, transfer learning, few-shot learning, adversarial learning and multi-task learning every time). We at ParallelDots use these approaches to solve narrow problems with AI, winning small battles while preparing for a more complete AI to conquer bigger wars. We present these techniques to you one by one.
Note that most of these effective learning techniques are not new; they are just seeing a resurgence now. Researchers were already using them back in the Support Vector Machine (SVM) era. Adversarial learning, on the other hand, came out of Goodfellow's recent work on GANs, and neural reasoning is a new set of techniques for which datasets have become available only very recently. Let's consider in depth how these techniques will help shape the future of AI.
Transfer Learning
As the name suggests, in Transfer Learning, learning is transferred from one task to another within the same algorithm. Algorithms trained on one task (the source task) with a larger dataset can be transferred, with or without modification, as part of an algorithm trying to learn a different task (the target task) on a (relatively) smaller dataset.
Traditional learning vs transfer learning. Credits: IEEE Computer Society
Using the parameters of an image classification algorithm as a feature extractor for a different task, such as object detection, is a simple application of Transfer Learning. It can also be used to perform complex tasks: the algorithm Google developed a while ago to classify diabetic retinopathy better than doctors used Transfer Learning. Surprisingly, the diabetic retinopathy detector was actually a real-world image classifier (like our dog / cat image classifier) whose learning was transferred to classify eye scans.
The parts of neural networks transferred from the source task to the target task are called pre-trained networks in the Deep Learning literature. Fine-tuning is when the errors of the target task are slightly back-propagated into the pre-trained network, instead of using the pre-trained network without modification. A good walkthrough for computer vision can be viewed here. The simple concept of Transfer Learning is very important in our set of Effective Learning methodologies.
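To make the idea concrete, here is a minimal fine-tuning sketch. We use PyTorch and torchvision purely for illustration; the framework, the ResNet-18 backbone and the hyperparameters are our assumptions, not a prescription:

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a network pre-trained on ImageNet (the source task).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the backbone so it acts as a fixed feature extractor.
for param in model.parameters():
    param.requires_grad = False

# Swap in a new head for our 2-class target task (cats vs. dogs).
model.fc = nn.Linear(model.fc.in_features, 2)

# Train only the new head. For fine-tuning proper, unfreeze some backbone
# layers so the target-task error back-propagates into the pre-trained network.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```

With the backbone frozen, this is plain feature extraction; unfreezing the last few blocks (usually with a smaller learning rate) turns it into the fine-tuning described above.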
Multi-Task Learning
In Multi-Task Learning, several learning tasks are solved at the same time, while exploiting the commonalities and differences across tasks. It is surprising, but sometimes learning two or more tasks together (known as the main task and auxiliary tasks) can improve the results on each. Please note that not every pair, triplet or quartet of tasks can act as auxiliaries. But when it works, it is a free accuracy increment.
Running three tasks with MTL. Credits: Sebastian Ruder
For example, at ParallelDots, our sentiment, intent and emotion detection classifiers were trained with Multi-Task Learning, which increased their accuracy compared to training them separately. The best semantic role labeling and POS tagging systems we know of in NLP are Multi-Task Learning systems, as is one of the best systems for semantic and instance segmentation in Computer Vision. Google has suggested multimodal multi-task learners (one model to rule them all) which can learn from both vision and text datasets in the same model.
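A minimal sketch of what such a setup can look like (the architecture, tasks and sizes below are illustrative, not our production models):

```python
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    """One shared encoder, one head per task."""
    def __init__(self, vocab_size=10000, emb_dim=128, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)  # shared encoder
        self.sentiment_head = nn.Linear(hidden, 3)  # main task
        self.emotion_head = nn.Linear(hidden, 6)    # auxiliary task

    def forward(self, tokens):
        _, (h, _) = self.lstm(self.embed(tokens))
        features = h[-1]                            # shared representation
        return self.sentiment_head(features), self.emotion_head(features)

model = MultiTaskNet()
criterion = nn.CrossEntropyLoss()

def joint_loss(tokens, sentiment_labels, emotion_labels):
    sent_logits, emo_logits = model(tokens)
    # Errors from both tasks update the shared encoder together.
    return criterion(sent_logits, sentiment_labels) + \
           criterion(emo_logits, emotion_labels)
```

The whole trick is in the summed loss: gradients from both tasks flow into the same encoder, which is what produces the free accuracy increment when the tasks help each other.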
A very important aspect of Multi-Task Learning seen in real-world applications is domain adaptation: to make any task bulletproof, we need it to cope with the many domains its data can come from. An example in our cat and dog use case would be an algorithm capable of recognizing images from different sources (e.g. VGA cameras, HD cameras or even infrared cameras). In such cases, an auxiliary domain-classification loss (which source did the image come from?) can be added to any task, and the machine then learns so that the algorithm keeps improving on the main task (categorizing images into cat or dog pictures) while deliberately getting worse at the auxiliary task. This is done by back-propagating the inverted error gradient of the domain-classification task, as sketched below. The idea is that the algorithm learns features that discriminate for the main task but forgets the features that differentiate the domains, since those would not help. Multi-Task Learning and its domain-adaptation cousin are among the most successful effective learning techniques we know of and have a big role to play in shaping the future of AI.
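The inverted-gradient trick is usually implemented as a "gradient reversal layer" (popularized by Ganin and Lempitsky's domain-adversarial training). A minimal PyTorch sketch, with illustrative names:

```python
import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips the gradient on the backward pass."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # The inverted gradient pushes the feature extractor to *forget*
        # domain-specific features while the domain classifier still learns.
        return -ctx.lamb * grad_output, None

def domain_adaptation_loss(features, domain_labels, domain_classifier, lamb=1.0):
    reversed_feats = GradReverse.apply(features, lamb)
    logits = domain_classifier(reversed_feats)
    return F.cross_entropy(logits, domain_labels)
```

This auxiliary loss is simply added to the main cat-vs-dog loss; the reversal makes the shared features improve on the main task while getting deliberately worse at telling domains apart.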
Adversarial Learning
Adversarial learning as a field evolved from the research work of Ian Goodfellow. While its most popular application is undoubtedly Generative Adversarial Networks (GANs), which can be used to generate stunning images, there are several other uses for this set of techniques. Typically, this game-theory-inspired technique pits two algorithms against each other, a generator and a discriminator, whose purpose is to fool each other as they train. The generator can be used to generate new, original images as we have seen, but it can also generate representations of any other data that hide details from a discriminator. This is why the concept interests us so much.
Generative Adversarial Networks. Credits: O'Reilly
This is a new field, and the image-generation capability is probably what most people focus on. But we think it will also evolve new use cases, as we will see later.
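For reference, the generator-discriminator game boils down to a training step like this (a bare-bones sketch; `gen` and `disc` stand for any generator and discriminator networks, our assumption):

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()

def gan_step(gen, disc, real, opt_g, opt_d, noise_dim=100):
    batch = real.size(0)
    fake = gen(torch.randn(batch, noise_dim))

    # Discriminator: label real data 1 and generated data 0.
    opt_d.zero_grad()
    d_loss = bce(disc(real), torch.ones(batch, 1)) + \
             bce(disc(fake.detach()), torch.zeros(batch, 1))
    d_loss.backward()
    opt_d.step()

    # Generator: try to make the discriminator call its fakes "real".
    opt_g.zero_grad()
    g_loss = bce(disc(fake), torch.ones(batch, 1))
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```

Each side's objective is the other's failure, which is exactly the game-theoretic tension described above.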
The domain-adaptation setup described earlier can be improved with a GAN loss. Here, the auxiliary loss is a GAN system rather than pure domain classification: a discriminator tries to classify which domain the data came from, while a generator component tries to fool it by presenting the data's features as if they were random noise. In our experience, this works better than simple domain adaptation (although it is also more erratic to code).
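One plausible way to wire this up (our sketch, not an exact production loss) is to replace the reversed-gradient loss with a pair of GAN-style objectives, optimized in alternation like a GAN's discriminator and generator steps:

```python
import torch.nn.functional as F

def domain_discriminator_loss(domain_disc, features, domain_labels):
    # The discriminator learns which domain each feature vector came from.
    # detach(): this loss must not update the feature extractor.
    return F.cross_entropy(domain_disc(features.detach()), domain_labels)

def feature_confusion_loss(domain_disc, features):
    # The feature extractor is rewarded when the discriminator is maximally
    # unsure, i.e. cross-entropy against a uniform distribution over domains.
    log_probs = F.log_softmax(domain_disc(features), dim=1)
    return -log_probs.mean()
```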
Few Shot Learning
Few Shot Learning is the study of techniques that enable Deep Learning algorithms (or any Machine Learning algorithm) to learn from fewer examples than a traditional algorithm would need. One Shot Learning is basically learning from a single example of each category; by induction, k-shot learning means learning from k examples of each category.
One Shot Learning using some examples of a class. Credits: Google DeepMind
Few Shot Learning as a domain is seeing an influx of papers at all the major Deep Learning conferences, and there are now specific datasets on which to benchmark results, just as MNIST and CIFAR serve as benchmarks for normal machine learning. One Shot Learning is seeing a number of applications in image classification tasks such as feature detection and representation.
Several methods are used for Few Shot Learning, including Transfer Learning, Multi-Task Learning and Meta-Learning, as all or part of the algorithm. There are other approaches too, such as designing a clever loss function, using dynamic architectures or using optimization hacks. Zero Shot Learning, a class of algorithms that claims to predict answers for categories the algorithm has never even seen, is essentially a set of algorithms that can scale with new kinds of data.
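As one concrete example among those methods, here is a nearest-prototype episode in the spirit of Prototypical Networks (Snell et al.); `encoder` stands for any embedding network and is our assumption:

```python
import torch

def proto_classify(encoder, support_x, support_y, query_x, n_classes):
    """Classify queries from only k labelled examples per class."""
    support_emb = encoder(support_x)   # embed the k-shot support set
    query_emb = encoder(query_x)

    # One prototype per class: the mean embedding of its support examples.
    prototypes = torch.stack([
        support_emb[support_y == c].mean(dim=0) for c in range(n_classes)
    ])

    # A query belongs to the class of its nearest prototype.
    dists = torch.cdist(query_emb, prototypes)   # (n_query, n_classes)
    return (-dists).softmax(dim=1)               # class probabilities
```

Because classification is just distance-to-prototype, adding a brand-new class at test time only requires embedding its handful of examples, with no retraining.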
Meta-Learning
Meta-Learning is exactly what it sounds like: an algorithm that trains itself in such a way that, on seeing a dataset, it produces a new machine learning predictor for that particular dataset. The definition feels very futuristic at first glance. You go "whoa! that's what a Data Scientist does", and it automates the "sexiest job of the 21st century"; and in some sense, meta-learners have started to do exactly that (refer to this blog post from Google and this research paper).
Example of a Meta-Learning configuration for few-shot image classification. Credits: Ravi et al.
Meta-Learning has recently become a hot topic in Deep Learning, with many research papers published, most commonly applying the technique to hyperparameter and neural network optimization, the search for good network architectures, few-shot image recognition and fast reinforcement learning. You can find a more complete article on its use cases here.
Some people refer to this complete automation of deciding both parameters and hyperparameters, as well as the network architecture, as AutoML, and you might find people referring to Meta-Learning and AutoML as different fields. Despite all the hype around them, the truth is that meta-learners are still algorithms: avenues for advancing Machine Learning as data grows in complexity and variety.
Most Meta-Learning papers are clever hacks over a system that, as Wikipedia puts it, has the following properties:
- The system must include a learning sub-system, which adapts with experience.
- Experience is gained by exploiting meta-knowledge extracted either from a previous learning episode on a single dataset, or from different domains or problems.
- The learning bias should be chosen dynamically.
The learning sub-system is essentially a configuration that adapts when data from a new domain (or an entirely new problem) is introduced. Such data can vary in the number of classes, in complexity, in colors, textures and objects (for images), or in styles and language patterns (for natural language), among other characteristics. Check out some super cool papers here: Meta-Learning Shared Hierarchies and Meta-Learning with Temporal Convolutions. You can also build Few Shot or Zero Shot algorithms using Meta-Learning architectures. Meta-Learning is one of the most promising techniques that will help shape the future of AI.
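To show what "an algorithm that trains itself" can mean in code, here is a compressed sketch in the style of MAML (Finn et al., Model-Agnostic Meta-Learning); the tiny linear model and the data shapes are purely illustrative:

```python
import torch
import torch.nn.functional as F

w = torch.zeros(10, 2, requires_grad=True)   # the meta-learned initialization
meta_opt = torch.optim.Adam([w], lr=1e-3)

def forward(weights, x):
    return x @ weights                        # minimal linear "network"

def maml_step(tasks, inner_lr=0.1):
    """tasks: list of (support_x, support_y, query_x, query_y) tensors."""
    meta_loss = 0.0
    for sx, sy, qx, qy in tasks:
        # Inner loop: one gradient step on the task's small support set.
        loss = F.cross_entropy(forward(w, sx), sy)
        grad, = torch.autograd.grad(loss, w, create_graph=True)
        w_adapted = w - inner_lr * grad
        # Outer loop: judge the adapted weights on held-out query data.
        meta_loss = meta_loss + F.cross_entropy(forward(w_adapted, qx), qy)
    meta_opt.zero_grad()
    meta_loss.backward()   # gradients flow through the inner update itself
    meta_opt.step()
```

The meta-learner never optimizes for any single dataset; it optimizes for being one gradient step away from solving whatever small dataset it is shown, which is why the same machinery powers Few Shot Learning.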
Neural Reasoning
Neural Reasoning is the next big thing beyond image classification. It is a step above pattern recognition, where algorithms go beyond simply identifying and classifying text or images. Neural Reasoning solves more generic questions in text or visual analysis. For example, the image below shows a set of questions that Neural Reasoning can answer about a picture.
Examples of questions on neural reasoning. Credits: CLEVR
This new set of techniques emerged after the publication of Facebook's bAbI dataset and the more recent CLEVR dataset. These upcoming techniques, which decipher relationships rather than just patterns, have immense potential to solve not only neural reasoning but several other hard problems, including few-shot learning problems.
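One representative architecture from this line of work is the Relation Network of Santoro et al., used on CLEVR. The sketch below shows the core idea of reasoning over object pairs rather than lone patterns (the sizes and MLPs are illustrative, and we omit the question embedding that the full model mixes into each pair):

```python
import torch
import torch.nn as nn

class RelationNet(nn.Module):
    def __init__(self, obj_dim=64, hidden=256, n_answers=28):
        super().__init__()
        self.g = nn.Sequential(                  # scores one *pair* of objects
            nn.Linear(2 * obj_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU())
        self.f = nn.Sequential(                  # reasons over all pair scores
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_answers))

    def forward(self, objects):                  # objects: (batch, n_obj, obj_dim)
        b, n, d = objects.shape
        # Build every ordered pair of objects, score each pair, then sum:
        # the answer depends on relations between objects, not single patterns.
        left = objects.unsqueeze(2).expand(b, n, n, d)
        right = objects.unsqueeze(1).expand(b, n, n, d)
        pairs = torch.cat([left, right], dim=-1).reshape(b, n * n, 2 * d)
        return self.f(self.g(pairs).sum(dim=1))
```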
Now that we know what these techniques are, let's go back and see how they solve the basic problems we started with. The table below provides an overview of how the effective learning techniques meet the challenges:
Capabilities of effective learning techniques
- All the techniques mentioned above help solve the problem of training with less data in one way or another. While Meta-Learning yields architectures that mold themselves to the available data, Transfer Learning borrows knowledge from another domain to compensate for scarce data. Few Shot Learning is dedicated to the problem as a scientific discipline, and adversarial learning can help augment datasets.
- Domain adaptation (a type of Multi-Task Learning), adversarial learning, and (sometimes) Meta-Learning architectures help solve problems caused by data coming from varied domains.
- Meta-Learning and Few Shot Learning help solve problems of incremental data.
- Neural reasoning algorithms have immense potential for solving real-world problems when incorporated as meta-learners or few-shot learners.
Please note that these effective learning techniques are not new Deep Learning / Machine Learning techniques, but rather augment existing ones with hacks that make them far more useful. So you will still see our regular tools like Convolutional Neural Networks and LSTMs in action, just with extra spice added. These effective learning techniques, which operate with less data and can perform many tasks at once, make it easier to productionize and commercialize AI-based products and services. At ParallelDots, we recognize the power of effective learning and have embedded it as a core part of our research philosophy.