Prof. John Laird, the John L. Tishman Professor of Engineering, and alumna Shiwali Mohan (PhD CSE 2014) have received the Blue Sky Award at the 2018 Association for the Advancement of Artificial Intelligence conference for their paper, “Learning Fast and Slow: Levels of Learning in General Autonomous Intelligent Agents.” The Blue Sky Award is given to researchers for papers that present ideas and visions that can stimulate the research community to pursue new directions, such as new problems, new application domains, or new methodologies.
General autonomous agents must overcome a number of challenges when learning. They have to continually react to their environment, focusing their computational resources on making the best decision for the current situation using all their available knowledge. They also need to learn everything they can from their experience, building up their knowledge so that they are prepared for making the best decisions in the future.
The researchers proposed that in human-like agents, learning can be split into two levels. Level 1, or L1, consists of fixed architectural learning mechanisms that are innate and automatic. Level 2, or L2, consists of deliberate learning strategies that are controlled by the agent’s knowledge.
Level 1 Learning
L1 consists of architectural learning algorithms that automatically and continually detect regularities in an agent’s experience and reasoning and modify the agent’s long-term memories. They are innate, fast, effortless, and outside the agent’s control. An agent cannot explicitly invoke them (“I will learn this right now!”) or explicitly inhibit them (“I refuse to learn this!”); however, the agent can adopt strategies to influence what they learn. There are no restrictions on the types of knowledge representations that an L1 algorithm can learn. L1 algorithms can learn directly from an agent’s perceptual stream, but also from simple feature-based statistical representations and internally created relational symbolic representations.
Level 2 Learning
L2 consists of deliberate learning strategies that create the experiences from which L1 algorithms learn. L2 strategies are voluntary: they are deliberately initiated by the agent’s reasoning and knowledge, and become a goal or task that directs behavior. These include simple strategies such as repeating a phone number to learn it or using flashcards to memorize the meaning of words in a foreign language, but also complex strategies such as those involved in scientific research. In pursuit of them, an agent can use any and all of its cognitive capabilities, such as analogy, attention, decision making, dialog, goal-based reasoning, metareasoning, natural language reasoning, planning, spatial reasoning, temporal reasoning, and theory of mind, to generate the experiences from which the L1 mechanisms learn. In contrast to L1 algorithms, which are prisoners to the agent’s ongoing experience, an L2 strategy has the ability to control the agent’s experiences.
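The division of labor between the two levels can be illustrated with a toy sketch. It is not the researchers' implementation or any existing agent architecture: the `Agent` class, the exposure-count memory, and the `recall_threshold` parameter are all assumptions made up for illustration. The point it tries to show is that the L1 update runs as a side effect of every experience, while the L2 strategy (rehearsing a phone number) has no learning mechanism of its own and only generates more experiences.

```python
from collections import Counter


class Agent:
    """Toy agent with one automatic (L1) mechanism and one deliberate (L2) strategy.

    Everything here is hypothetical and exists only to illustrate the two levels.
    """

    def __init__(self, recall_threshold=3):
        # Long-term memory: item -> number of times it has been experienced.
        self.memory_strength = Counter()
        self.recall_threshold = recall_threshold

    def experience(self, item):
        """Every experience passes through here; L1 learning is a side effect."""
        self._l1_update(item)

    def _l1_update(self, item):
        # Assumed L1 mechanism: each exposure strengthens the memory a little.
        # Agent-level strategies never call this directly.
        self.memory_strength[item] += 1

    def can_recall(self, item):
        return self.memory_strength[item] >= self.recall_threshold

    def rehearse(self, item, max_repetitions=10):
        """Assumed L2 strategy: deliberately repeat an item until it sticks.

        It only creates experiences; the learning itself is done by L1.
        """
        repetitions = 0
        while not self.can_recall(item) and repetitions < max_repetitions:
            self.experience(item)
            repetitions += 1
        return repetitions


agent = Agent()
agent.experience("555-0142")      # hearing the number once is not enough
print(agent.can_recall("555-0142"))
print(agent.rehearse("555-0142"))  # number of deliberate repetitions needed
print(agent.can_recall("555-0142"))
```

In this sketch, `rehearse` changes what the agent learns only by controlling what it experiences, which mirrors the claim that L2 strategies direct behavior while L1 mechanisms do the actual learning.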
The researchers suggest that when trying to develop general agents with broad learning capabilities, it is possible to develop a core set of primitive, automatic learning mechanisms that are shared by complex deliberate learning strategies. The deliberate strategies leverage these primitive mechanisms and do not have any strategy-specific learning mechanisms of their own. The researchers see this as an exciting path forward for the development of general autonomous intelligent agents. Finally, they hypothesize that non-human animals have only L1 mechanisms and are unable to use L2 strategies.