Let's face it: forgetting things sucks. It's frustrating not to remember where you left your keys or to stumble over your words because you can't recall the name of that colleague you just ran into at the grocery store. However, forgetfulness is core to the human condition, and in fact, we're lucky that we're able to forget.
For humans, forgetting is more than just a failure to remember; it's an active process that helps the brain take in new information and make decisions more effectively.
Now, data scientists are applying neuroscience principles to improve machine learning, convinced that human brains may hold the key to unlocking strong artificial intelligence.
According to a recent paper in Neuron, our brains are meant to act as information filters: take in a big pile of messy data, filter for the useful bits, then clear out any irrelevant details in order to tell a story or make a decision. The unused pieces are deleted to make space for new data, like running a disk cleanup on a computer. In neurobiology terms, forgetting happens when synaptic connections between neurons weaken or are eliminated over time, and as new neurons develop, they rewire the circuits of the hippocampus, overwriting existing memories (New Atlas).
For humans, forgetting has two benefits:
- It enhances flexibility by reducing the influence of outdated information on our decision-making
- It prevents overfitting to specific past events, promoting generalizations (Neuron)
In order to adapt effectively, humans need to be able to strategically forget.
But what about computers?
Herein lies one of the big challenges for artificial intelligence: computers forget differently than humans. Deep neural networks are the most successful technique for a range of machine learning tasks, but they don't forget like we do.
Let's take a simplified example. If you teach an English-speaking child Spanish, the child will apply relevant clues from English (perhaps nouns, verb tenses, and sentence building) and simultaneously forget the parts that aren't pertinent, such as accents, mumbling, and intonation. The child can incrementally learn and build while strategically forgetting.
In contrast, if a neural network is trained to learn English, its parameters are adapted to solve for English. If you then teach it Spanish, the new adaptations for Spanish will overwrite the knowledge the network previously acquired for English, effectively deleting it and starting anew. This is called "catastrophic forgetting", and "it's one of the fundamental limitations of neural networks" (DeepMind).
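To make catastrophic forgetting concrete, here is a minimal sketch, not taken from the DeepMind work. It assumes a tiny linear model trained with plain gradient descent, and the two tasks (`w_task_a`, `w_task_b`) are made-up stand-ins for "English" and "Spanish". The model learns task A, then trains only on task B, and we measure how well it still does on task A.

```python
import numpy as np

def mse(w, X, y):
    """Mean squared error of a linear model with weights w."""
    return float(np.mean((X @ w - y) ** 2))

def train(w, X, y, lr=0.1, steps=500):
    """Plain gradient descent on the MSE; nothing protects old knowledge."""
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))

# Two hypothetical tasks that share inputs but need different weights.
w_task_a = np.array([1.0, -2.0, 0.5])
w_task_b = np.array([-1.0, 0.5, 2.0])
y_a, y_b = X @ w_task_a, X @ w_task_b

w = train(np.zeros(3), X, y_a)
loss_a_before = mse(w, X, y_a)  # task A has just been learned

w = train(w, X, y_b)            # continue training on task B only
loss_a_after = mse(w, X, y_a)   # task A error after learning task B
loss_b_after = mse(w, X, y_b)   # task B error after learning task B

print(loss_a_before, loss_a_after, loss_b_after)
```

In this toy setup the error on task A should be close to zero right after training on it, then jump once the same weights are retrained for task B, because the single set of parameters is simply overwritten.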
While it's still new territory, scientists have recently made strides in exploring a few potential ways to overcome this limitation.
Teaching AI to Strategically Forget: Three Approaches
#1. Long Short-Term Memory Networks (LSTM)
LSTMs are a type of recurrent neural network that uses specific learning mechanisms to decide, at any point, which pieces of information to remember, which to update, and which to pay attention to (Edwin Chen).
It's easiest to explain how LSTMs work by using a movie analogy: Imagine that a computer is trying to predict what will happen next in a movie by analyzing previous scenes. In one scene, a woman holds a knife: does the computer guess she's a chef or a murderer? In another, the woman and a man are eating sushi under a golden archway: are they in Japan or at McDonald's? Maybe it's actually St. Louis?
Pretty difficult to predict.
LSTMs aid in this process by helping a neural network 1) forget/remember, 2) save and 3) focus:
- Forget/Remember: "If a scene ends, for example, the model should forget the current scene location, the time of day, and reset any scene-specific information; however, if a character dies in the scene, it should continue remembering that he's no longer alive. Thus, we want the model to learn a separate forgetting/remembering mechanism: when new inputs come in, it needs to know which beliefs to keep or throw away." (Edwin Chen)
- Save: When the model sees a new image, it needs to learn whether any information about the image is worth using and saving. If the woman walks past a billboard in a certain scene, will it be important to remember the billboard, or is it simply noise?
- Focus: We need to remember that the woman in the movie is a mother, because we will see her children later on, but it is perhaps not important in a scene that she isn't in, so we don't need to focus on it during that scene. In the same way, not everything stored in the neural network's long-term memory is immediately relevant, so the LSTM helps to determine which parts to focus on at any given time while keeping everything safely stored for later.
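The three mechanisms above map roughly onto the forget, input and output gates of a standard LSTM cell. The sketch below is a bare NumPy version of one time step, assuming the common textbook formulation. The names `lstm_step`, `W` and `b`, the sizes, and the random weights are all illustrative; a real network would learn `W` and `b` from data.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step.

    x: input vector (D,), h_prev: previous working memory (H,),
    c_prev: previous long-term memory (H,),
    W: stacked gate weights (4*H, D+H), b: stacked gate biases (4*H,).
    """
    H = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    f = sigmoid(z[0:H])          # forget gate: how much old memory to keep
    i = sigmoid(z[H:2 * H])      # input gate: how much new information to save
    o = sigmoid(z[2 * H:3 * H])  # output gate: which memory to focus on now
    g = np.tanh(z[3 * H:4 * H])  # candidate information from the current input
    c = f * c_prev + i * g       # long-term memory: forget some, save some
    h = o * np.tanh(c)           # working memory: the focused view of c
    return h, c

rng = np.random.default_rng(0)
D, H = 4, 3  # hypothetical input and memory sizes
W = rng.normal(scale=0.1, size=(4 * H, D + H))
b = np.zeros(4 * H)

h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(5, D)):  # five made-up "scenes"
    h, c = lstm_step(x, h, c, W, b)
```

Note that `c` is kept in full between steps, while `h` exposes only the part the output gate selects, which is the "keep everything stored but focus on what matters now" behavior described above.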
#2. Elastic Weight Consolidation (EWC)
EWC is an algorithm published in March 2017 by researchers at Google's DeepMind that mimics a neuroscience process called synaptic consolidation. During synaptic consolidation, our brains assess a task and compute the importance of the many neurons used to perform it, weighing some neurons as more critical to performing the task correctly. These critical neurons are coded as important and are less likely to be overwritten in subsequent tasks. Similarly, in neural networks, multiple connections (like neurons) are used to perform a task. EWC codes some connections as critical and thus protects them from being overwritten or forgotten.
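As a rough illustration of the idea, the sketch below adds an EWC-style quadratic penalty, `(lam / 2) * sum(fisher * (w - anchor) ** 2)`, to plain gradient descent on a toy linear model. Everything here is an assumption made for the example: the two tasks, the value of `lam`, and the use of the mean squared input as the diagonal Fisher importance (which holds for a linear model with unit-variance Gaussian noise). It is not DeepMind's implementation.

```python
import numpy as np

def mse(w, X, y):
    """Mean squared error of a linear model with weights w."""
    return float(np.mean((X @ w - y) ** 2))

def train(w, X, y, anchor=None, fisher=None, lam=0.0, lr=0.05, steps=1000):
    """Gradient descent on the MSE, optionally with an EWC-style penalty
    (lam / 2) * sum(fisher * (w - anchor) ** 2) pulling w toward anchor."""
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        if anchor is not None:
            grad = grad + lam * fisher * (w - anchor)
        w = w - lr * grad
    return w

rng = np.random.default_rng(0)
n = 200

# Task A (hypothetical): relies on feature 0; feature 1 is almost silent.
X_a = rng.normal(size=(n, 2)) * np.array([1.0, 0.01])
theta_a = np.array([1.0, 0.0])  # weights assumed already learned on task A
y_a = X_a @ theta_a

# Task B (hypothetical): both features carry the same signal s, so many
# different weight vectors solve it equally well.
s = rng.normal(size=n)
X_b = np.column_stack([s, s])
y_b = 3.0 * s

# Importance of each weight for task A (diagonal Fisher, see assumption above).
fisher_a = np.mean(X_a ** 2, axis=0)

w_plain = train(theta_a.copy(), X_b, y_b)
w_ewc = train(theta_a.copy(), X_b, y_b, anchor=theta_a, fisher=fisher_a, lam=5.0)

print("task A loss, plain:", mse(w_plain, X_a, y_a))
print("task A loss, EWC:  ", mse(w_ewc, X_a, y_a))
print("task B loss, EWC:  ", mse(w_ewc, X_b, y_b))
```

In this toy case both runs can solve task B, but plain training changes the weight that task A depends on, while the penalized run should push almost all of the change into the weight task A barely uses.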
In the chart below, you can see what happened when the researchers applied EWC to a game of Atari: the blue line is a standard deep learning process, and the red and brown lines are aided by EWC.
[Chart: blue line = standard deep learning; red and brown lines = improvements with the help of EWC]
#3. Bottleneck Theory
In the fall of 2017, the AI community was humming over a talk by Naftali Tishby, a computer scientist and neuroscientist from the Hebrew University of Jerusalem, in which he presented evidence for what he called the Bottleneck Theory. "The idea is that a network rids noisy input data of extraneous details as if by squeezing the information through a bottleneck, retaining only the features most relevant to general concepts" (Quanta).
As Tishby explains it, neural networks go through two phases while learning: fitting and compressing. During fitting, the network learns to label its training data, and during compression, a much longer process, it "sheds information about the data, keeping track of only the strongest features" (Quanta), those that will be most relevant to helping it generalize. In this way, compressing is a way of strategically forgetting, and manipulating this bottleneck could be a tool AI researchers use to construct new objectives and architectures for stronger neural networks in the future.
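One way to see what "compressing without losing what matters" means is to measure it with mutual information. The sketch below is a hand-built toy, not a trained network and not Tishby's experiment: an input `x` takes four equally likely values, the label `y` depends only on which pair `x` falls in, and we compare a representation that keeps everything (`t_raw`) with one that merges each pair (`t_compressed`). The helper names are made up for this example.

```python
import numpy as np

def mutual_information(joint):
    """Mutual information in bits, given a joint probability table p(a, b)."""
    joint = np.asarray(joint, dtype=float)
    pa = joint.sum(axis=1, keepdims=True)  # marginal p(a), shape (n_a, 1)
    pb = joint.sum(axis=0, keepdims=True)  # marginal p(b), shape (1, n_b)
    mask = joint > 0                       # skip zero cells (0 * log 0 = 0)
    return float(np.sum(joint[mask] * np.log2(joint[mask] / (pa @ pb)[mask])))

def joint_table(a, b, n_a, n_b, p):
    """Build p(a, b) from paired outcomes a[k], b[k] with probability p[k]."""
    table = np.zeros((n_a, n_b))
    for ai, bi, pi in zip(a, b, p):
        table[ai, bi] += pi
    return table

x = np.arange(4)                       # four equally likely inputs
p_x = np.full(4, 0.25)
y = np.array([0, 0, 1, 1])             # label depends only on the pair
t_raw = x                              # representation that keeps everything
t_compressed = np.array([0, 0, 1, 1])  # representation that merges each pair

i_x_raw = mutual_information(joint_table(x, t_raw, 4, 4, p_x))
i_y_raw = mutual_information(joint_table(t_raw, y, 4, 2, p_x))
i_x_comp = mutual_information(joint_table(x, t_compressed, 4, 2, p_x))
i_y_comp = mutual_information(joint_table(t_compressed, y, 2, 2, p_x))

print(i_x_raw, i_y_raw)    # information kept about the input / the label
print(i_x_comp, i_y_comp)  # same, after compression
```

Here the compressed representation holds 1 bit about the input instead of 2, yet still holds the full 1 bit about the label: it has forgotten only the detail that was irrelevant to the task.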
As Tishby says, "the most important part of learning is actually forgetting."
It's possible that our brains and distinctly human processes, like forgetting, hold the map to creating strong artificial intelligence, but scientists are collectively still figuring out how to read the directions.
Machine Un-Learning: Why Forgetting Might Be the Key to AI was originally published in Hacker Noon on Medium.