Essentials of General Intelligence: The direct path to AGI
August 22, 2002 by Peter Voss
General intelligence comprises the essential, domain-independent skills necessary for acquiring a wide range of domain-specific knowledge — the ability to learn anything. Achieving this with “artificial general intelligence” (AGI) requires a highly adaptive, general-purpose system that can autonomously acquire an extremely wide range of specific knowledge and skills and can improve its own cognitive ability through self-directed learning. This chapter in the forthcoming book, Real AI: New Approaches to Artificial General Intelligence, describes the requirements and conceptual design of a prototype AGI system.
This paper explores the concept of “artificial general intelligence” (AGI) — its nature, importance, and how best to achieve it. Our theoretical model posits that general intelligence comprises a limited number of distinct, yet highly integrated, foundational functional components. Successful implementation of this model will yield a highly adaptive, general-purpose system that can autonomously acquire an extremely wide range of specific knowledge and skills. Moreover, it will be able to improve its own cognitive ability through self-directed learning. We believe that, given the right design, current hardware/ software technology is adequate for engineering practical AGI systems. Our current implementation of a functional prototype is described below.
The idea of “general intelligence” is quite controversial; I do not substantially engage this debate here but rather take the existence of such non-domain-specific abilities as a given (Gottfredson 1998). It must also be noted that this essay focuses primarily on low-level (i.e. roughly animal-level) cognitive ability. Higher-level functionality, while an integral part of our model, is only addressed peripherally. Finally, certain algorithmic details are omitted for reasons of proprietary ownership.
2. General Intelligence
Intelligence can be defined simply as an entity’s ability to achieve goals — with greater intelligence coping with more complex and novel situations. Complexity ranges from the trivial — thermostats and mollusks (that in most contexts don’t even justify the label “intelligence”) — to the fantastically complex: autonomous flight control systems and humans.
Adaptivity, the ability to deal with changing and novel requirements, also covers a wide spectrum: from rigid, narrowly domain-specific to highly flexible, general-purpose. Furthermore, flexibility can be defined in terms of scope and permanence — how much, and how often, it changes. Imprinting is an example of limited scope and high permanence, while innovative, abstract problem solving is at the other end of the spectrum. While entities with high adaptivity and flexibility are clearly superior — they can potentially learn to achieve any possible goal — there is a hefty efficiency price to be paid: for example, had Deep Blue also been designed to learn language, direct airline traffic, and do medical diagnosis, it would not have become Chess champion (all other things being equal).
General intelligence comprises the essential, domain-independent skills necessary for acquiring a wide range of domain-specific knowledge (data & skills) — i.e. the ability to learn anything (in principle). More specifically, this learning ability needs to be autonomous, goal-directed, and highly adaptive:
Autonomous — Learning occurs both automatically, through exposure to sense data (unsupervised), and through bi-directional interaction with the environment, including exploration/ experimentation (self-supervised).
Goal-directed — Learning is directed (autonomously) towards achieving varying and novel goals and sub-goals — be they “hard-wired,” externally specified, or self-generated. Goal-directedness also implies very selective learning and data acquisition (from a massively data-rich, noisy, complex environment).
Adaptive — Learning is cumulative, integrative, contextual and adjusts to changing goals and environments. General adaptivity not only copes with gradual changes, but also seeds and facilitates the acquisition of totally novel abilities.
General cognitive ability stands in sharp contrast to inherent specializations such as speech- or face-recognition, knowledge databases/ ontologies, expert systems, or search, regression or optimization algorithms. It allows an entity to acquire a virtually unlimited range of new specialized abilities. The mark of a generally intelligent system is not having a lot of knowledge and skills, but being able to acquire and improve them — and to be able to appropriately apply them. Furthermore, knowledge must be acquired and stored in ways appropriate both to the nature of the data, and to the goals and tasks at hand.
For example, given the correct set of basic core capabilities, an AGI system should be able to learn to recognize and categorize a wide range of novel perceptual patterns that are acquired via different senses, in many different environments and contexts. Additionally, it should be able to autonomously learn appropriate, goal-directed responses to such input contexts (given some feedback mechanism).
We take this concept to be valid not only for high-level human intelligence, but for lower-level animal-like ability. The degree of “generality” (i.e., adaptability) varies along a continuum from genetically “hard-coded” responses (no adaptability), to high-level animal flexibility (significant learning ability as in, say, a dog), and finally to self-aware human general learning ability.
Core Requirements for General Intelligence
General intelligence, as described above, demands a number of irreducible features and capabilities. In order to proactively accumulate knowledge from various (and/ or changing) environments, it requires:
1. Senses to obtain features from “the world” (virtual or actual),
2. A coherent means for storing knowledge obtained this way, and
3. Adaptive output/ actuation mechanisms (both static and dynamic).
Such knowledge also needs to be automatically adjusted and updated on an ongoing basis; new knowledge must be appropriately related to existing data. Furthermore, perceived entities/ patterns must be stored in a way that facilitates concept formation and generalization. An effective way to represent complex feature relationships is through vector encoding (Churchland 1995).
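Vector encoding of feature relationships can be sketched in a few lines. The feature names, values, and similarity measure below are hypothetical illustrations, not the chapter’s actual encoding scheme: entities become points in a fixed feature space, and related entities end up close together.

```python
import math

def encode(features, dims):
    """Encode an entity's scalar features as a point in a fixed vector space.
    `dims` fixes the feature order; missing features default to 0.0."""
    return [float(features.get(d, 0.0)) for d in dims]

def similarity(a, b):
    """Cosine similarity: vectors pointing in similar directions score near 1."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Invented feature dimensions and entities, purely for illustration.
DIMS = ["size", "speed", "brightness"]
cat = encode({"size": 0.3, "speed": 0.6, "brightness": 0.4}, DIMS)
kitten = encode({"size": 0.2, "speed": 0.5, "brightness": 0.4}, DIMS)
truck = encode({"size": 0.9, "speed": 0.8, "brightness": 0.1}, DIMS)
```

With such an encoding, “kitten” lands nearer to “cat” than “truck” does, which is the property that later clustering and concept formation rely on.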
Any practical applications of AGI (and certainly any real-time uses) must inherently be able to process temporal data as patterns in time — not just as static patterns with a time dimension. Furthermore, AGIs must cope with data from different sense probes (e.g., visual, auditory, and data), and deal with such attributes as: noisy, scalar, unreliable, incomplete, multi-dimensional (both space/ time dimensional, and having a large number of simultaneous features), etc. Fuzzy pattern matching helps deal with pattern variability and noise.
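A minimal sketch of fuzzy pattern matching of the kind described: a stored pattern matches a noisy observation if every feature lies within a tolerance band. The tolerance value and the sample patterns are illustrative assumptions, not the system’s actual parameters.

```python
def fuzzy_match(stored, observed, tolerance=0.15):
    """Match if every feature of `observed` is within `tolerance`
    of the corresponding stored feature; absorbs sensor noise."""
    return all(abs(s - o) <= tolerance for s, o in zip(stored, observed))

pattern = [0.5, 0.2, 0.8]
noisy = [0.55, 0.15, 0.82]   # the same pattern plus measurement noise
other = [0.1, 0.9, 0.3]      # a genuinely different pattern
```

Here `noisy` still matches `pattern` despite its perturbed features, while `other` does not; widening or narrowing `tolerance` trades robustness against discrimination.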
Another essential requirement of general intelligence is to cope with an overabundance of data. Reality presents massively more features and detail than is (contextually) relevant, or than can be usefully processed. This is why the system needs to have some control over what input data is selected for analysis and learning — both in terms of which data, and also the degree of detail. Senses (“probes”) are needed not only for selection and focus, but also in order to ground concepts — to give them (reality-based) meaning.
While input data needs to be severely limited by focus and selection, it is also extremely important to obtain multiple views of reality — data from different feature extractors or senses. Provided that these different input patterns are properly associated, they can help to provide context for each other, aid recognition, and add meaning.
In addition to being able to sense via its multiple, adaptive input groups and probes, the AGI must also be able to act on the world — be it for exploration, experimentation, communication, or to perform useful actions. These mechanisms need to provide both static and dynamic output (states and behavior). They, too, need to be adaptive and capable of learning.
Underlying all of this functionality is pattern processing. What is more, not only are sensing and action based on generic patterns, but so is internal cognitive activity. In fact, even high-level abstract thought, language, and formal reasoning — abilities outside the scope of our current project — are “just” higher-order elaborations of this (Margolis 1987).
Advantages of Intelligence being General
The advantages of general intelligence are almost too obvious to merit listing; how many of us would dream of giving up our ability to adapt and learn new things? In the context of artificial intelligence, this issue takes on a new significance.
There exists an inexhaustible demand for computerized systems that can assist humans in complex tasks that are highly repetitive, dangerous, or that require knowledge, senses or abilities that their users may not possess (e.g., expert knowledge, “photographic” recall, overcoming disabilities, etc.). These applications stretch across almost all domains of human endeavor.
Currently, these needs are filled primarily by systems engineered specifically for each domain and application (e.g., expert systems). Problems of cost, lead-time, reliability, and the lack of adaptability to new and unforeseen situations severely limit market potential. Adaptive AGI technology, as described in this paper, promises to significantly reduce these limitations and to open up these markets. It specifically implies —
That systems can learn (and be taught) a wide spectrum of data and functionality
They can adapt to changing data, environments and uses/ goals
This can be achieved without program changes — capabilities are learned, not coded.
More specifically, this technology can potentially:
Significantly reduce system “brittleness” through fuzzy pattern matching and adaptive learning — increasing robustness in the face of changing and unanticipated conditions or data.
Learn autonomously, by automatically accumulating knowledge about new environments through exploration.
Allow systems to be operator-trained to identify new objects and patterns; to respond to situations in specific ways, and to acquire new behaviors.
Eliminate programming in many applications. Systems can be employed in many different environments, and with different parameters simply through self-training.
Facilitate easy deployment in new domains. A general intelligence engine with pluggable custom input/ output probes allows rapid and inexpensive implementation of specialized applications.
From a design perspective, AGI offers the advantage that all effort can be focused on achieving the best general solutions — solving them once, rather than once for each particular domain. AGI obviously also has huge economic implications: because AGI systems acquire most of their knowledge and skills (and adapt to changing requirements) autonomously, programming lead times and costs can be dramatically reduced, or even eliminated.
The fact that no (artificial!) systems with these capabilities currently exist seems to imply that it is very hard (or impossible) to achieve these objectives. However, I believe that, as with other examples of human discovery and invention, the solution will seem rather obvious in retrospect. The trick is correctly choosing a few critical development options.
3. Shortcuts to AGI
When explaining Artificial General Intelligence to the uninitiated, one often hears the remark that, surely, everyone in AI is working to achieve general intelligence. This indicates how deeply misunderstood intelligence is. While it is true that eventually conventional (domain-specific) research efforts will converge with those of AGI, without deliberate guidance this is likely to be a long, inefficient process. High-level intelligence must be adaptive, must be general — yet very little work is being done to specifically identify what general intelligence is, what it requires, and how to achieve it.
In addition to understanding general intelligence, AGI design also requires an appreciation of the differences between artificial (synthetic) and biological intelligence, and between designed and evolved systems.
Our particular approach to achieving AGI capitalizes on extensive analysis of these issues, and on an incremental development path that aims to minimize development effort (time and cost), technical complexity, and overall project risks. In particular, we are focusing on engineering a series of functional (but low-resolution/ capacity) proof-of-concept prototypes. Performance issues specifically related to commercialization are assigned to separate development tracks. Furthermore, our initial effort concentrates on identifying and implementing the most general and foundational components first, leaving high-level cognition such as abstract thought, language, and formal logic for later development (more on that later). We also focus more on selective, unsupervised, dynamic, incremental, interactive learning; on noisy, complex, analog data; and on integrating entity features and concept attributes in one comprehensive network.
While our project may not be the only one proceeding on this particular path, it is clear that by far the majority of AI work being done today follows a substantially different overall approach. Our work focuses on:
General rather than domain-specific cognitive ability
Acquired knowledge and skills, versus loaded databases and coded skills
Bi-directional, real-time interaction, versus batch processing
Adaptive attention (focus & selection), versus human pre-selected data
Core support for dynamic patterns, versus static data
Unsupervised and self-supervised, versus supervised learning
Adaptive, self-organizing data structures, versus fixed neural nets or databases
Contextual, grounded concepts, versus hard-coded, symbolic concepts
Explicitly engineering functionality, versus evolving it
Conceptual design, versus reverse-engineering
General proof-of-concept, versus specific real applications development
Animal level cognition, versus abstract thought, language, and formal logic.
Let’s look at each of these choices in greater detail.
General rather than domain-specific cognitive ability. The advantages listed in the previous section flow from the fact that generally intelligent systems can ultimately learn any specialized knowledge and skills possible — human intelligence is the proof! The reverse is obviously not true.
A complete, well-designed AGI’s ability to acquire domain-specific capabilities is limited only by processing and storage capacity. What is more, much of its learning will be autonomous — without teachers, and certainly without explicit programming. This approach implements (and capitalizes on) the essence of “Seed AI” — systems with a limited, but carefully chosen, set of basic, initial capabilities that allow them (in a “bootstrapping” process) to dramatically increase their knowledge and skills through self-directed learning and adaptation. By concentrating on carefully designing the seed of intelligence, and then nursing it to maturity, one essentially bootstraps intelligence. In our AGI design this self-improvement takes two distinct forms/ phases:
1. Coding the basic skills that allow the system to acquire a large amount of specific knowledge.
2. The system reaching sufficient intelligence and conceptual understanding of its own design, to enable it to deliberately improve its own design.
Acquired knowledge and skills, versus loaded databases and coded skills. One crucial measure of general intelligence is its ability to acquire knowledge and skills, not how much it possesses. Many AI efforts concentrate on accumulating huge databases of knowledge and coding massive amounts of specific skills. If AGI is possible — and evidence presented here and elsewhere seems overwhelming — then much of this effort will be wasted. Not only will an AGI be able to acquire these additional smarts (largely) by itself, but moreover, it will also be able to keep its knowledge up-to-date, and to improve it. Not only will this save initial data collection and preparation as well as programming, it will also dramatically reduce maintenance.
An important feature of our design is that there are no traditional databases containing knowledge, nor programs encoding learned skills: all acquired knowledge is integrated into an adaptive central knowledge/ skills network. Patterns representing knowledge are associated in a manner that facilitates conceptualization and sensitivity to context. Naturally, such a design is potentially far less prone to brittleness, and more resiliently fault-tolerant.
Bi-directional, real-time interaction, versus batch processing. Adaptive learning systems must be able to interact bi-directionally with the environment — virtual or real. They must both sense data and act/ react on an ongoing basis. Many AI systems do all of their learning in batch mode and have little or no ability to learn incrementally. Such systems cannot easily adjust to changing environments or requirements — in many cases they are unable to adapt beyond the initial training set without reprogramming or retraining.
In addition to real-time perception and learning, intelligent systems must also be able to act. Three distinct areas of action capability are required:
1. Acting on the “world” — be it to communicate, to navigate or explore, or to manipulate some external function or device in order to achieve goals.
2. Controlling or modifying the system’s internal parameters (such as learning rate or noise tolerance, etc.) in order to set or improve functionality.
3. Controlling the system’s sense input parameters, such as focus, selection, and resolution (granularity), as well as adjusting feature extraction parameters.
Adaptive attention (focus & selection), versus human pre-selected data. As mentioned earlier, reality presents far more sense data abundance, detail, and complexity than are required for any given task — or than can be processed. Traditionally, this problem has been dealt with by carefully selecting and formatting data before feeding it to the system. While this human assistance can improve performance in specific applications, it is often not realized that this additional intelligence resides in the human, not the software.
Outside guidance and training can obviously speed learning; however, AGI systems must inherently be designed to acquire knowledge by themselves. In particular, they need to control what input data is processed — where specifically to obtain data, in how much detail, and in what format. Absent this capability, the system will either be overwhelmed by irrelevant data or, conversely, be unable to obtain crucial information, or get it in the required format. Naturally, such data focus and selection mechanisms must themselves be adaptive.
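One simple way to picture adaptive data selection is a novelty-driven filter: attend only to the inputs the system currently predicts worst. The predictor, scoring rule, and attention budget below are assumptions made for illustration, not the mechanism described in the text.

```python
def select_for_attention(inputs, predict, budget=2):
    """Score each candidate input by prediction error (surprise/novelty)
    and keep only the `budget` most surprising ones for further analysis."""
    scored = sorted(inputs, key=lambda x: abs(x - predict(x)), reverse=True)
    return scored[:budget]

# Toy predictor: the system expects values near 0.5, so extremes are novel.
surprising = select_for_attention([0.5, 0.51, 0.9, 0.1], lambda x: 0.5)
```

The filter discards the well-predicted inputs (0.5 and 0.51) and keeps the two surprising ones; in a full system the predictor itself would adapt, so what counts as “surprising” changes as the system learns.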
Core support for dynamic patterns, versus static data. Temporal pattern processing is another fundamental requirement of interactive intelligence. At least three aspects of AGI rely on it: perception needs to learn/ recognize dynamic entities and sequences, action usually comprises complex behavior, and cognition (internal processing) is inherently temporal. In spite of this obvious need for intrinsic support for dynamic patterns, many AI systems only process static data; temporal sequences, if supported at all, are often converted (“flattened”) externally to eliminate the time dimension. Real-time temporal pattern processing is technically quite challenging, so it is not surprising that most designs try to avoid it.
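A rough sketch of intrinsic temporal support: instead of flattening a sequence into a static vector, a recognizer consumes samples one at a time and signals when its recent history matches a stored dynamic pattern. The pattern, tolerance, and interface are illustrative assumptions.

```python
from collections import deque

class SequenceRecognizer:
    """Recognize a stored temporal pattern in a live stream,
    one sample at a time, with a fuzzy per-sample tolerance."""
    def __init__(self, pattern, tolerance=0.1):
        self.pattern = pattern
        self.tolerance = tolerance
        self.window = deque(maxlen=len(pattern))  # recent history

    def feed(self, sample):
        """Consume one sample; return True when the recent history
        matches the stored pattern within tolerance."""
        self.window.append(sample)
        if len(self.window) < len(self.pattern):
            return False
        return all(abs(a - b) <= self.tolerance
                   for a, b in zip(self.window, self.pattern))

rec = SequenceRecognizer([0.2, 0.6, 0.9])
stream = [0.0, 0.2, 0.6, 0.9, 0.1]
hits = [t for t, x in enumerate(stream) if rec.feed(x)]
```

The recognizer fires at the time step where the stored sequence completes, without ever seeing the stream as a whole; this is the essential difference from batch-mode “flattened” processing.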
Unsupervised and self-supervised, versus supervised learning. Auto-adaptive systems such as AGIs require comprehensive capabilities to learn without supervision. Such teacher-independent knowledge and skill acquisition falls into two broad categories: unsupervised (data-driven, bottom-up), and self-supervised (goal-driven, top-down). Ideally these two modes of learning should seamlessly integrate with each other — and of course, also with other, supervised methods.
Here, as in other design choices, general adaptive systems are harder to design and tune than more specialized, unchanging ones. We see this particularly clearly in the overwhelming focus on back-propagation in artificial neural network (ANN) development. Relatively little research aims at better understanding and improving incremental, autonomous learning. Our own design places heavy emphasis on these aspects.
Adaptive, self-organizing data structures, versus fixed neural nets or databases. Another core requirement imposed by data/ goal-driven, real-time learning is having a flexible, self-organizing data structure. On the one hand, knowledge representation must be highly integrated, while on the other hand it must be able to adapt to changing data densities (and other properties), and to varying goals or solutions. Our AGI encodes all acquired knowledge and skills in one integrated network-like structure. This central repository features a flexible, dynamically self-organizing topology. The vast majority of other AI designs rely either on loosely-coupled data objects or agents, or on fixed network topologies and pre-defined ontologies, data hierarchies or database layouts. This often severely limits their self-learning ability, adaptivity and robustness, or creates massive communication bottlenecks or other performance overhead.
Contextual, grounded concepts, versus hard-coded, symbolic concepts. Concepts are probably the most important design aspect of AGI; in fact, one can say that “high-level intelligence is conceptual intelligence.” Core characteristics of concepts include their ability to represent ultra-high-dimensional fuzzy sets that are grounded in reality, yet fluid with regard to context. In other words, they encode related sets of complex, coherent, multi-dimensional patterns that represent features of entities. Concepts obtain their grounding (and thus their meaning) by virtue of patterns emanating from features sensed directly from entities that exist in reality. Because concepts are defined by value ranges within each feature dimension (sometimes in complex relationships), some kind of fuzzy pattern matching is essential. In addition, the scope of concepts must be fluid; they must be sensitive and adaptive to both environmental and goal contexts.
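The idea of a concept as value ranges within each feature dimension, matched fuzzily, can be illustrated as follows. The “cup” concept, its two dimensions, and the linear-decay membership rule are invented for the example, not taken from the system described.

```python
def membership(concept, instance):
    """Graded concept membership: each dimension defines a preferred
    (lo, hi) range; values inside score 1.0, values outside decay
    linearly with distance. Overall score is the weakest dimension
    (a fuzzy AND across dimensions)."""
    scores = []
    for dim, (lo, hi) in concept.items():
        v = instance.get(dim, 0.0)
        if lo <= v <= hi:
            scores.append(1.0)
        else:
            dist = (lo - v) if v < lo else (v - hi)
            scores.append(max(0.0, 1.0 - dist / (hi - lo)))
    return min(scores)

# Hypothetical "cup" concept over two feature dimensions (in meters).
cup = {"height": (0.05, 0.15), "opening": (0.04, 0.12)}
mug = {"height": 0.10, "opening": 0.08}
bucket = {"height": 0.40, "opening": 0.35}
```

A mug is a full member of the concept while a bucket is excluded; context sensitivity could be modeled by letting the ranges themselves shift with the current goal or environment.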
Autonomous concept formation is one of the key tests of intelligence. The many AI systems based on hard-coded or human-defined concepts fail this fundamental test. Furthermore, systems that do not derive their concepts via interactive perception are unable to ground their knowledge in reality, and thus lack crucial meaning. Finally, concept structures whose activation cannot be modulated by context and degree of fit are unable to capture the subtlety and fluidity of intelligent generalization. In combination, these limitations will cripple any aspiring AGI.
Explicitly engineering (and learning) functionality, versus evolving it. Design by evolution is extremely inefficient — whether in nature or in computer science. Moreover, evolutionary solutions are generally opaque; optimized only to some specified “cost function,” not comprehensibility, modularity, or maintainability. Furthermore, evolutionary learning also requires more data or trials than are available in everyday problem solving.
Genetic and evolutionary programming do have their uses — they are powerful tools that can be used to solve very specific problems, such as optimization of large sets of variables; however, they are generally not appropriate for creating large systems or infrastructures. Artificially evolving general intelligence directly seems particularly problematic because there is no known function measuring such capability along a single continuum — and absent such direction, evolution doesn’t know what to optimize. One approach to dealing with this problem is to try to coax intelligence out of a complex ecology of competing agents — essentially replaying natural evolution.
Overall, it seems that genetic programming techniques are appropriate when one runs out of specific engineering ideas. Here is a short summary of the advantages of explicitly engineered functionality:
Designs can directly capitalize on and encode the designer’s knowledge and insights.
Designs have comprehensible design documentation.
Designs can be far more modular — less need for the multiple functionality and high inter-dependency of sub-systems found in evolved systems.
Systems can have a more flowchart-like, logical design — evolution has no foresight.
They can be designed with debugging aids — evolution didn’t need that.
These features combine to make systems easier to understand, debug, and interface with — and, importantly, they allow multiple teams to work on the design simultaneously.
Conceptual design, versus reverse-engineering. In addition to avoiding the shortcomings of evolutionary techniques, there are also numerous advantages to designing and engineering intelligent systems based on functional requirements, rather than trying to copy evolution’s design of the brain. As aviation has amply demonstrated, it is much easier to build planes than it is to reverse-engineer birds — much easier to achieve flight via thrust than flapping wings.
Similarly, in creating artificial intelligence it makes sense to capitalize on our human intellectual and engineering strengths — to ignore design parameters unique to biological systems, instead of struggling to copy nature’s designs. Designs explicitly engineered to achieve desired functionality are much easier to understand, debug, modify, and enhance. Furthermore, using known and existing technology allows us to best leverage existing resources. So why limit ourselves to the single solution to intelligence created by a blind, unconscious Watchmaker with his own agenda (survival in an evolutionary environment very different from that of today)?
Intelligent machines designed from scratch carry neither the evolutionary baggage, nor the additional complexity for epigenesis, reproduction, and integrated self-repair, of biological brains. Obviously this doesn’t imply that we can learn nothing from studying brains, just that we don’t have to limit ourselves to biological feasibility in our designs. Our (currently) only working example of high-level general intelligence (the brain) provides a crucial conceptual model of cognition, and can clearly inspire numerous specific design features.
Here are some desirable cognitive features that can be included in an AGI design that would not (and in some cases, could not) exist in a reverse-engineered brain:
More effective control of neurochemistry (“emotional states”)
Selecting the appropriate degree of logical thinking versus intuition
More effective control over focus and attention
Being able to learn instantly, on demand
Direct and rapid interfacing with databases, the Internet, and other machines — potentially having instant access to all available knowledge
Optional “photographic” memory and recall (“playback”) on all senses!
Better control over remembering and forgetting (freezing important knowledge, and being able to unlearn)
The ability to accurately backtrack and review thought and decision processes (retrace and explore logic pathways)
Patterns, nodes and links can easily be tagged (labeled) and categorized
The ability to optimize the design for the available hardware instead of being forced to conform to the brain’s requirements
The ability to utilize the best existing algorithms and software techniques — irrespective of whether they are biologically plausible
Custom designed AGI (unlike brains) can have a simple speed/ capacity upgrade path
The possibility of comprehensive integration with other AI systems (like expert systems, robotics, specialized sense pre-processors, and problem solvers)
The ability to construct AGIs that are highly optimized for specific domains
Node, link, and internal parameter data is available as “input data” (full introspection)
Design specifications are available (to the designer and to the AGI itself!)
Seed AI design: A machine can inherently be designed to more easily understand and improve its own functioning — thus bootstrapping intelligence to ever higher levels.
General proof-of-concept, versus specific real applications development. Applying given resources to minimalist proof-of-concept designs improves the likelihood of cutting a swift, direct path towards an ultimate goal. Having identified high-level artificial general intelligence as our goal, it makes little sense to squander resources on inessentials. In addition to focusing our efforts on the ability to acquire knowledge autonomously, rather than capturing or coding it, we further aim to speed progress towards full AGI by reducing cost and complexity through —
Concentrating on proof-of-concept prototypes, not commercial performance. This includes working at low data resolution and volume, and putting aside optimization. Scalability is addressed only at a theoretical level, and not necessarily implemented.
Working with radically-reduced sense and motor capabilities. The fact that deaf, blind, and severely paralyzed people can attain high intelligence (Helen Keller, Stephen Hawking) indicates that these are not essential to developing AGI.
Coping with complexity through a willingness to experiment and implement poorly understood algorithms — i.e. using an engineering approach. Using self-tuning feedback loops to minimize free parameters.
Not being sidetracked by attempting to match the performance of domain-specific designs — focusing more on how capabilities are achieved (e.g. learned conceptualization, instead of programmed or manually specified concepts) rather than raw performance.
Developing and testing in virtual environments, not physical implementations. Most aspects of AGI can be fully evaluated without the overhead (time, money, and complexity) of robotics.
Animal level cognition, versus abstract thought, language, and formal logic. There is ample evidence that achieving high-level cognition requires only modest structural improvements over animal capability. Discoveries in cognitive psychology point towards generalized pattern processing being the foundational mechanism for all higher-level functioning. On the other hand, the relatively small differences between higher animals and humans are also witnessed by studies of genetics, the evolutionary timetable, and developmental psychology.
The core challenge of AGI is achieving the robust, adaptive conceptual learning ability of higher primates or young children. If human-level intelligence is the goal, then pursuing robotics, language, or formal logic (at this stage) is a costly sideshow — whether motivated by misunderstanding the problem, or by commercial or “political” considerations.
Summary. While our project leans heavily on research done in many specialized disciplines, it is one of the few efforts dedicated to integrating such interdisciplinary knowledge with the specific goal of developing general artificial intelligence. We firmly believe that many of the issues raised above are crucial to the early achievement of truly intelligent adaptive learning systems.
4. Foundational Cognitive Capabilities
General intelligence requires a number of foundational cognitive abilities. At a first approximation, it must be able to —
Remember and recognize patterns representing coherent features of reality
Relate such patterns by various similarities, differences, and associations
Learn and perform a variety of actions
Evaluate and encode feedback from a goal system
Autonomously adjust its system control parameters.
As mentioned earlier, this functionality must handle a very wide variety of data types and characteristics (including temporal), and must operate interactively, in real-time. The expanded description below is based on our particular implementation; however, the features listed would generally be required (in some form) in any implementation of artificial general intelligence.
Pattern learning, matching, completion, and recall. The primary method of pattern acquisition consists of a proprietary adaptation of lazy learning (Aha 1997, Yip 1997). Our implementation stores feature patterns (static and dynamic) with adaptive fuzzy tolerances that subsequently determine how similar patterns are processed. Our recognition algorithm matches patterns on a competitive winner-take-all basis, as a set or aggregate of similar patterns, or by forced choice. It also offers inherent support for pattern completion and recall (where appropriate).
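The general idea of lazy pattern learning with fuzzy tolerances can be sketched as a nearest-neighbor lookup: exemplars are simply stored, and recognition picks the closest stored pattern that falls within its own tolerance. The `PatternStore` class, distance measure, and threshold values below are illustrative assumptions, not the proprietary algorithm described in the text:

```python
# Hypothetical sketch of lazy pattern learning with per-pattern fuzzy
# tolerances and winner-take-all matching. All names and numbers are
# illustrative assumptions.

class PatternStore:
    def __init__(self):
        self.patterns = []  # list of (vector, tolerance) pairs

    def learn(self, vector, tolerance=0.1):
        # Lazy learning: simply store the exemplar along with its tolerance.
        self.patterns.append((list(vector), tolerance))

    def match(self, probe):
        # Winner-take-all: return the stored pattern closest to the probe,
        # provided the probe falls within that pattern's fuzzy tolerance.
        best, best_dist = None, float("inf")
        for vec, tol in self.patterns:
            dist = max(abs(a - b) for a, b in zip(vec, probe))
            if dist <= tol and dist < best_dist:
                best, best_dist = vec, dist
        return best

store = PatternStore()
store.learn([0.2, 0.8])
store.learn([0.9, 0.1])
print(store.match([0.25, 0.75]))  # close to the first pattern, so it wins
print(store.match([0.5, 0.5]))    # outside every tolerance: no match
```

Storing raw exemplars (rather than fitting a global model) is what makes the learning "lazy": all generalization work happens at match time.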
Data accumulation and forgetting. Because our system learns patterns incrementally, mechanisms are needed for consolidating and pruning excess data. Sensed patterns (or sub-patterns) that fall within a dynamically set noise/error tolerance of existing ones are automatically consolidated by a Hebbian-like mechanism that we call "nudging." This algorithm also accumulates certain statistical information. On the other hand, patterns that turn out not to be important (as judged by various criteria) are deleted.
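A consolidation mechanism of this kind can be sketched as a running average: a newly sensed pattern within tolerance of a stored one is merged into it (and a sample count updated) rather than stored separately. This is an assumption-level illustration of the general technique, not the actual "nudging" algorithm:

```python
# Illustrative consolidation ("nudging"-style): merge a new pattern into a
# stored one when they are within tolerance, via a count-weighted running
# average. The tolerance, distance measure, and rates are assumptions.

def nudge(stored, count, new, tolerance=0.1):
    """Return (vector, count) after attempting to merge `new` into `stored`."""
    dist = max(abs(a - b) for a, b in zip(stored, new))
    if dist > tolerance:
        return stored, count  # too different: would be stored as a new pattern
    # The stored pattern drifts toward the input; the count tracks how many
    # samples it now summarizes (the "statistical information").
    merged = [(s * count + n) / (count + 1) for s, n in zip(stored, new)]
    return merged, count + 1

vec, n = [0.50, 0.50], 1
vec, n = nudge(vec, n, [0.52, 0.48])  # within tolerance: consolidated, n == 2
vec, n = nudge(vec, n, [0.90, 0.10])  # outside tolerance: left unchanged
print(vec, n)
```

Forgetting would then amount to periodically deleting entries whose counts (or other importance criteria) stay low.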
Categorization and clustering. Vector-coded feature patterns are acquired in real-time and stored in a highly adaptive network structure. This central self-organizing repository automatically clusters data in hyper-dimensional vector space. Our matching algorithm's ability to recall patterns by any dimension provides inherent support for flexible, dynamic categorization. Additional categorization mechanisms facilitate grouping patterns by further parameters, associations, or functions.
Pattern hierarchies and associations. Patterns of perceptual features do not stand in isolation; they are derived from coherent external reality. Encoding relationships between patterns serves the crucial functions of added meaning, context, and anticipation. Our system captures low-level, perception-driven pattern associations such as: sequential or coincidental in time, nearby in space, related by feature group or sense modality. Additional relationships are encoded at higher levels of the network, including actuation layers. This overall structure somewhat resembles the "dual network" described by Goertzel (1993).
Pattern priming and activation spreading. The core function of association links is to prime related nodes. This helps to disambiguate pattern matching, and to select contextual alternatives. Where activation is particularly strong and perceptual activity is low, stored patterns will be "recognized" spontaneously. Both the scope and decay rate of such activation spreading are controlled adaptively. These dynamics combine with the primary, perception-driven activation to form the system's short-term memory.
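One way to picture priming via activation spreading: each cycle, activation decays on every node while flowing out along weighted association links. The decay rate here is fixed for illustration, whereas the text describes it (and the scope of spreading) as adaptively controlled:

```python
# Minimal sketch of one activation-spreading cycle over a dict-of-dicts
# link structure. Link weights, decay, and node names are assumptions.

def spread(activation, links, decay=0.5):
    """Decay existing activation, then propagate it along weighted links."""
    new = {node: act * decay for node, act in activation.items()}
    for src, act in activation.items():
        for dst, weight in links.get(src, {}).items():
            new[dst] = new.get(dst, 0.0) + act * weight
    return new

links = {"cat": {"fur": 0.6, "meow": 0.3}}
act = {"cat": 1.0}          # "cat" is perceived
act = spread(act, links)    # "fur" and "meow" are now primed
print(act)
```

The lingering, decaying trace across cycles is what plays the role of short-term memory in this picture.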
Action patterns. Adaptive action circuits are used to control parameters in the following three domains:
1) Senses, including adjustable feature extractors, focus and selection mechanisms
2) Output actuators for navigation and manipulation
3) Meta-cognition and internal controls.
Different action states and behaviors (action sequences) for each of these control outputs can be created at design time (using a configuration script) or acquired interactively. Real-time learning occurs either by means of explicit teaching, or autonomously through random exploration. Once acquired, these actions can be tied to specific perceptual stimuli or whole contexts through various stimulus-response mechanisms. These S-R links (both activation and inhibition) are dynamically modified through ongoing reinforcement learning.
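The S-R reinforcement idea can be sketched as a table of (stimulus, action) weights nudged toward received rewards, with action selection favoring the strongest (least inhibited) link. The table representation, learning rate, and names are illustrative assumptions:

```python
# Hedged sketch of stimulus-response links modified by reinforcement.
# Positive rewards strengthen (activate) a link; negative rewards
# weaken (inhibit) it.

def reinforce(sr_weights, stimulus, action, reward, rate=0.2):
    key = (stimulus, action)
    old = sr_weights.get(key, 0.0)
    sr_weights[key] = old + rate * (reward - old)  # move weight toward reward

def select_action(sr_weights, stimulus, actions):
    # Pick the action whose S-R link is strongest for this stimulus.
    return max(actions, key=lambda a: sr_weights.get((stimulus, a), 0.0))

weights = {}
reinforce(weights, "light_on", "approach", reward=1.0)
reinforce(weights, "light_on", "retreat", reward=-1.0)
print(select_action(weights, "light_on", ["approach", "retreat"]))  # "approach"
```

Teaching and random exploration differ only in who supplies the (stimulus, action, reward) triples fed into such an update.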
Meta-cognitive control. In addition to adaptive perception and action functionality, an AGI design must also allow for extensive monitoring and control of overall system parameters and functions. Any complex interactive learning system contains numerous crucial control parameters such as noise tolerance, learning and exploration rates, priorities and goal management, and myriad others. Not only must the system be able to adaptively control these many interactive vectors, it must also appropriately manage its various cognitive functions (such as recognition, recall, action, etc.). Our design deals with these requirements by means of a highly adaptive introspection/control "probe."
High-level intelligence. Our AGI model posits that no additional foundational functions are necessary for higher-level cognition. Abstract thought, language, and logical thinking are all elaborations of core abilities. This controversial point is taken up further below.
5. An AGI in the making
The functional prototype currently under development at Adaptive A.I. Inc. aims to embody all the above-mentioned choices, requirements, and features. Our development path is as follows:
1) Development framework
2) Memory core and interface structure
3) Individual foundational cognitive components
4) Integrated low-level cognition
5) Increasing level of functionality.
The software comprises an AGI engine framework with the following basic components:
A set of pluggable, programmable (virtual) sensors and actuators (called “probes”)
A central pattern store/engine including all data and cognitive algorithms
A configurable, dynamic 2D virtual world, plus various training and diagnostic tools.
The AGI engine design is based on, and embodies insights from, a wide range of research in cognitive science — including computer science, neuroscience, epistemology (Rand 1990, Kelley 1986), and psychology (Margolis 1987). Particularly strong influences include: embodied systems (Brooks 1994), vector-encoded representation (Churchland 1995), adaptive self-organizing neural nets (esp. Growing Neural Gas, Fritzke 1995), unsupervised and self-supervised learning, perceptual learning (Goldstone 1998), and fuzzy logic (Kosko 1997).
While our design includes several novel and proprietary algorithms, our key innovation is the particular selection and integration of established technologies and prior insights.
AGI Engine Architecture & Design Features
Our AGI engine (which provides this foundational cognitive ability) can logically be divided into three parts (see figure above):
Cognitive core
Control/interface logic
Input/output probes
This "situated agent architecture" reflects the importance of having an AGI system that can dynamically and adaptively interact with the environment. From a theory-of-mind perspective it acknowledges both the crucial need for concept grounding (via senses) and the absolute need for experiential, self-supervised learning.
The components listed below have been specifically designed with features required for adaptive general intelligence in (ultimately) real environments. Among other things, they deal with a great variety and volume of static and dynamic data, cope with fuzzy and uncertain data and goals, foster coherent integrated representations of reality, and — most of all — promote adaptivity.
Cognitive Core: This is the central repository of all static and dynamic data patterns, including all learned cognitive and behavioral states and sequences. All data is stored in a single, integrated node-link structure. A key design innovation is the specific encoding of pattern "fuzziness" (in addition to other attributes). The core allows for several node/link types with differing dynamics to help define the network's cognitive structure.
The network's topology is dynamically self-organizing — a feature inspired by the "Growing Neural Gas" design (Fritzke 1995). This allows network density to adjust to actual data feature and/or goal requirements. Various adaptive local and global parameters further define network structure and dynamics in real time.
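The heart of a Growing-Neural-Gas-style adaptation (after Fritzke 1995) is that, for each input, the best-matching node and its topological neighbors move toward that input, so network density comes to track the data. The step below is a simplified, assumption-level sketch of that single operation (the full GNG also ages edges and inserts new nodes, which is omitted here):

```python
# One simplified GNG-style adaptation step. Node positions are lists,
# edges a dict of neighbor lists; the learning rates are illustrative.

def gng_step(nodes, edges, x, eps_winner=0.2, eps_neighbor=0.02):
    # Find the best-matching node (smallest squared Euclidean distance).
    winner = min(nodes, key=lambda i: sum((a - b) ** 2 for a, b in zip(nodes[i], x)))
    # Move the winner strongly, and its topological neighbors weakly,
    # toward the input vector.
    moves = [(winner, eps_winner)] + [(j, eps_neighbor) for j in edges.get(winner, [])]
    for i, eps in moves:
        nodes[i] = [w + eps * (xi - w) for w, xi in zip(nodes[i], x)]
    return winner

nodes = {0: [0.0, 0.0], 1: [1.0, 1.0]}
edges = {0: [1], 1: [0]}
w = gng_step(nodes, edges, [0.1, 0.1])
print(w, nodes[0])  # node 0 wins and drifts toward the input
```

Repeating this over an input stream, together with node insertion in high-error regions, is what lets the topology grow where the data actually lives.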
Control and Interface Logic: An overall control system coordinates the network's execution cycle, drives various cognitive and housekeeping algorithms, and controls/adapts system parameters. Via an Interface Manager, it also communicates data and control information to and from the probes.
Probes: The Interface Manager provides for dynamic addition and configuration of probes. Key design features of the probe architecture include the ability to have programmable feature extractors, variable data resolution, and focus & selection mechanisms. Such mechanisms for data selection are imperative for general intelligence: even moderately complex environments have a richness of data that far exceeds any system's ability to usefully process it.
The system handles a very wide variety of data types and control signal requirements — including those for visual, sound, and raw data (e.g., database, internet, keyboard), as well as various output actuators. A novel "system probe" provides the system with monitoring and control of its internal states (a form of meta-cognition). Additional probes — either custom interfaces with other systems or additional real-world sensors/actuators — can easily be added to the system.
Development Environment/Language/Hardware. The complete AGI engine plus associated support programs are implemented in (object-oriented) C# under Microsoft's .NET framework. The system is designed for optional remoting of various components, thus allowing for some distributed processing. Current tests show that practical (proof-of-concept) prototype performance can be achieved on a single, conventional PC (2 GHz, 512 MB). Even a non-performance-tuned implementation can process several complex patterns per second on a database of well over a million stored features.
6. From Algorithms to General Intelligence
This section covers some of our near-term research and development; it aims to illustrate our expected path toward meaningful general intelligence. While this work barely approaches higher-level animal cognition (exceeding it in some aspects, but falling far short in others such as sensory-motor skills), we take it to be a crucial step in proving the validity and practicality of our model. Furthermore, the actual functionality achieved should be highly competitive, if not unique, in applications where significant autonomous adaptivity and data selection, lack of brittleness, dynamic pattern processing, flexible actuation, and self-supervised learning are central requirements.
General intelligence doesn't comprise one single, brilliant knock-out invention or design feature; instead, it emerges from the synergetic integration of a number of essential fundamental components. On the structural side, the system must integrate sense inputs, memory, and actuators, while on the functional side various learning, recognition, recall and action capabilities must operate seamlessly on a wide range of static and dynamic patterns. In addition, these cognitive abilities must be conceptual and contextual — they must be able to generalize knowledge, and interpret it against different backgrounds.
A key milestone in our project is testing the integrated functionality of the basic cognitive components within our overall AGI framework. A number of custom-developed, highly configurable test utilities are used to test the cohesive functioning of the whole system. This automated training and evaluation is supplemented by manual experimentation in numerous different environments and applications. Experience gained from these tests helps to refine the complex dynamics of interacting algorithms and parameters.
One of the general difficulties with AGI development is determining absolute measures of success. Part of the reason is that this field is still nascent, and thus no agreed definitions, let alone tests or measures, of low-level general intelligence exist. As we proceed with our project we expect to develop ever more effective protocols and metrics for assessing cognitive ability. Our system's performance evaluation is guided by this description: "General intelligence comprises the ability to acquire (and adapt) the knowledge and skills required for achieving a wide range of goals in a variety of domains."
In this context, "acquisition" includes all of the following: automatic, via sense inputs (feature/data driven); explicitly taught; discovered through exploration or experimentation; and internal processes (e.g., association, categorization, statistics).
“Adaptation” implies that new knowledge is integratedappropriately.
"Knowledge and skills" refer to all kinds of data and abilities (states and behaviors) that the system acquires for the short or long term.
Our initial protocol for evaluating AGIs aims to cover a wide spectrum of domains and goals by simulating sample applications in 2D virtual worlds. In particular, these tests should assess the degree to which the foundational abilities operate as an integrated, mutually supportive whole — and without programmer intervention! Here are three examples:
Sample Test Domains for Initial Performance Criteria
Adaptive Security Monitor. This system scans video monitors and alarm panels that oversee a secure area (say, a factory or office building), and responds appropriately to abnormal conditions. Note that this is somewhat similar to a site-monitoring application at MIT (Grimson 1998).
This simulation calls for a visual environment that contains a lot of detail but has only limited dynamic activity — this is its normal state (green). Two levels of abnormality exist: (i) minor, or known, disturbance (yellow); (ii) major, or unknown, disturbance (red).
The system must initially learn the normal state by simple exposure (automatically scanning the environment) at different resolutions (levels of detail). It must also learn "yellow" conditions by being shown a number of samples (some at high resolution). All other states must output "red."
Standard operation is to continuously scan the environment at low resolution. If any abnormal condition is detected, the system must learn to change to higher resolution in order to discriminate between "yellow" and "red."
The system must adapt to changes in the environment (and to entirely different environments) by simple exposure training.
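The scan-then-escalate logic of this scenario can be sketched as a tiny decision procedure: a cheap low-resolution check against the learned normal state, with a switch to high resolution only when something deviates. The scene encoding and classifier here are stand-in assumptions, not the system's actual learned pattern matching:

```python
# Toy control logic for the security-monitor scenario. "low" and "high"
# stand for the low- and high-resolution views of the same scene; the
# string labels are illustrative assumptions.

def classify(scene, normal_low, known_yellow_high):
    if scene["low"] == normal_low:
        return "green"                     # cheap low-resolution check passes
    # Abnormal at low resolution: escalate to high resolution to discriminate
    # a known, minor disturbance from an unknown one.
    if scene["high"] in known_yellow_high:
        return "yellow"
    return "red"

normal = "quiet"
yellows = {"cleaning_crew"}
print(classify({"low": "quiet", "high": "quiet"}, normal, yellows))           # green
print(classify({"low": "motion", "high": "cleaning_crew"}, normal, yellows))  # yellow
print(classify({"low": "motion", "high": "intruder"}, normal, yellows))       # red
```

Adapting to a new environment then just means re-learning `normal` and the `yellows` set by exposure, with no change to the control logic.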
Sight Assistant. The system controls a movable "eye" (by voice command) that enables the identification (by voice output) of at least a hundred different objects in the world. A trainer will dynamically teach the system new names, associations, and eye-movement commands.
The visual probe can select among different scenes (simulating rooms) and focus on different parts of each scene. The scenes depict objects of varying attributes: color, size, shape, various dynamics, etc. (and combinations of these), against different backgrounds.
Initial training will be to attach simple sound commands to maneuver the "eye," and to associate word labels with selected objects. The system must then reliably execute voice commands and respond with appropriate identification (if any). Additional functionality could be to have the system scan the various scenes when idle, and to automatically report selected important objects.
Object identification must cover a wide spectrum of different attribute combinations and tolerances. The system must easily learn new scenes, objects, words and associations, and also adapt to changes in any of these variables.
Maze Explorer. A (virtual) entity explores a moderately complex environment. It discovers what types of objects aid or hinder its objectives, while learning to navigate this dynamic world. It can also be trained to perform certain behaviors.
The virtual world is filled with a great number of different objects (see previous example). In addition, some of these objects move through space with varying speeds and dynamics, and may be solid and/or immovable. Groups of different kinds of objects have pre-assigned attributes that mark them as negative or positive. The AGI engine controls the direction and speed of an entity in this virtual world. Its goal is to learn to navigate around immovable and negative objects to reliably reach hidden positives.
The system can also be trained to respond to operator commands to perform behaviors of varying degrees of complexity (for example, actions similar to "tricks" one might teach a dog). This "Maze Explorer" can easily be set up to deal with fairly complex tasks.
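The core of what the Maze Explorer must learn can be sketched as value estimation per object type: rewards from encounters accumulate into estimates the agent can then use to avoid negatively valued types. The object names, rewards, and learning rate below are purely illustrative assumptions:

```python
# Illustrative value learning for the Maze Explorer: each encounter with an
# object type nudges that type's value estimate toward the experienced
# reward; negatively valued types are then avoided.

def update_value(values, obj_type, reward, rate=0.3):
    old = values.get(obj_type, 0.0)
    values[obj_type] = old + rate * (reward - old)

values = {}
for _ in range(5):                        # a few simulated encounters
    update_value(values, "thorn", -1.0)   # hindering object
    update_value(values, "food", +1.0)    # helping object

avoid = [t for t, v in values.items() if v < 0]
print(avoid)  # the explorer learns to steer around "thorn"-type objects
```

Navigation itself would then bias direction and speed away from objects whose learned values are negative and toward those that are positive.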
Towards Increased Intelligence
Clearly, the tasks described above do not by themselves represent any kind of breakthrough in artificial intelligence research. They have been achieved many times before. However, what we do believe to be significant and unique is the achievement of these various tasks without any task-specific programming or parameterization. It is not what is being done, but how it is done.
Development beyond these basic proof-of-concept tests will advance in two directions: 1) significantly increasing resolution, data volume, and complexity in applications similar to the tests; 2) adding higher-level functionality. In addition to work aimed at further developing and proving our general intelligence model, there are also numerous practical enhancements that can be made. These include implementing multi-processor and network versions, and integrating our system with databases or with other existing AI technology such as expert systems, voice recognition, robotics, or sense modules with specialized feature extractors.
By far the most important of these future developments concerns higher-level ability. Here is a partial list of action items, all of which derive from lower-level foundations:
Spread activation and retain context over extended periods
Support more complex internal temporal patterns, both for enhanced recognition and anticipation, and for cognitive and action sequences
Internal activation feedback for processing without input
Deduction, achieved through selective concept activation
Advanced categorization by arbitrary dimensions
Learning of more complex behavior
Abstract and merged concept formation
Structured language acquisition
Increased awareness and control of internal states (introspection)
Learning logic and other problem-solving methodologies.
7. Other Research
Many different approaches to AI exist; some of the differences are straightforward while others are subtle and hinge on difficult philosophical issues. As such, the exact placement of our work relative to that of others is difficult and, indeed, open to debate. Our view that "intelligence is a property of an entity that engages in two-way interaction with an external environment" technically puts us in the area of "agent systems" (Russell 1995). However, our emphasis on a connectionist rather than classical approach to cognitive modeling places our work in the field of "embodied cognitive science" (see Pfeifer and Scheier 1999 for a comprehensive overview).
While our approach is similar to other research in embodied cognitive science, in some respects our goals are substantively different. A key difference is our belief that a core set of cognitive abilities working together is sufficient to produce general intelligence. This is in marked contrast to others in embodied cognitive science who consider intelligence to be necessarily specific to a set of problems within a given environment. In other words, they believe that autonomous agents always exist in ecological niches. As such, they focus their research on building very limited systems that effectively deal with only a small number of problems within a specific limited environment. Almost all work in the area follows this pattern — see Braitenberg (1984), Brooks (1994) or Arbib (1992) for just a few well-known examples. Their stance contradicts the fact that humans possess general intelligence; we are able to effectively deal with a wide range of problems that are significantly beyond anything that could be called our "ecological niche."
Perhaps the closest project to ours that is strictly in the area of embodied cognitive science is the Cog project at MIT (Brooks 1993). The project aims to understand the dynamics of human interaction through the construction of a human-like robot complete with upper torso, a head, eyes, arms and hands. While this project is significantly more ambitious than other projects in terms of the level and complexity of the system's dynamics and abilities, the system is still essentially niche-focused (elementary human social and physical interaction) when compared to our own efforts at general intelligence.
Probably the closest work to ours, in the sense that it also aims to achieve general rather than niche intelligence, is the Novamente project under the direction of Ben Goertzel. (The project was formerly known as Webmind — see Goertzel 1997, 2001.) Novamente relies on a hybrid of low-level neural-net-like dynamics for activation spreading and concept priming, coupled with high-level semantic constructs to represent a variety of logical, causal and spatial-temporal relations. While the semantics of the system's internal state are relatively easy to understand compared to a strictly connectionist approach, the classical elements in the system's design open the door to many of the fundamental problems that have plagued classical AI over the last fifty years. For example, high-level semantics require a complex meta-logic contained in hard-coded high-level reasoning and other high-level cognitive systems. These high-level systems contain significant implicit semantics that may not be grounded in environmental interaction but are rather hard-coded by the designer — thus causing symbol grounding problems (Harnad 1990). The relatively fixed, high-level methods of knowledge representation and manipulation that this approach entails are also prone to the "frame problem" (McCarthy and Hayes 1969; Pylyshyn 1987) and "brittleness" problems. In a strictly embodied cognitive science approach, as we have taken, all knowledge is derived from agent-environment interaction, thus avoiding these long-standing problems of classical AI.
Andy Clark (1997) is another researcher whose model closely resembles our own, but there are no implementations specifically based on his theoretical work. Igor Aleksander's (now dormant) MAGNUS project (1996) also incorporated many key AGI concepts that we have identified, but it was severely limited by a classical AI, finite-state machine approach. Valeriy Nenov and Michael Dyer of UCLA (1994) used "massively" parallel hardware (a CM-2 Connection Machine) to implement a virtual, interactive perceptual design close to our own, but with a more rigid, pre-programmed structure. Unfortunately, this ambitious, ground-breaking work has since been abandoned. The project was probably severely hampered by the limited hardware of the time.
Moving further away from embodied cognitive science to purely classical research in general intelligence, perhaps the best-known system is the Cyc project being pursued by Lenat (1990). Essentially, Lenat sees general intelligence as being "common sense." He hopes to achieve this goal by adding many millions of facts about the world into a huge database. After many years of work and millions of dollars in funding there is still a long way to go, as the sheer number of facts that humans know about the world is truly staggering. We doubt that a very large database of basic facts is enough to give a computer much general intelligence — the mechanisms for autonomous knowledge acquisition are missing. Being a classical approach to AI, this also suffers from the fundamental problems of classical AI listed above. For example, the symbol grounding problem again: if facts about cats and dogs are just added to a database that the computer can use even though it has never seen or interacted with an animal, are those concepts really meaningful to the system? While his project also claims to pursue "general intelligence," it is really very different from our own, both in its approach and in the difficulties it faces.
Analysis of AI's ongoing failure to overcome its long-standing limitations reveals that it is not so much that Artificial General Intelligence has been tried and has failed, but rather that the field has largely been abandoned — be it for theoretical, historical, or commercial reasons. Certainly, our particular type of approach, as detailed in previous sections, is receiving scant attention.
8. Fast-track AGI — Why so Rare?
Widespread application of AI has been hampered by a numberof core limitations that have plagued the field since the beginning, namely:
The expense and delay of custom programming individual applications
Systems’ inability to automatically learn from experience, or to be user teachable/ trainable
Reliability and performance issues caused by “brittleness” (the inability of systems to automatically adapt to changing requirements, or data outside of a predefined range)
Their limited intelligence and common sense.
The most direct path to solving these long-standing problems is to conceptually identify the fundamental characteristics common to all high-level intelligence, and to engineer systems with this basic functionality, in a manner that capitalizes on human and technological strengths.
General intelligence is the key to achieving robust autonomous systems that can learn and adapt to a wide range of uses. It is also the cornerstone of self-improving, or Seed AI — using basic abilities to bootstrap higher-level ones. This essay has identified foundational components of general intelligence, as well as crucial considerations particular to the effective development of the artificial variety. It highlighted the fact that very few researchers are actually following this most direct route to AGI.
If the approach outlined above is so promising, then why has it received so little attention? Why is hardly anyone actually working on it?
A short answer: Of all the people working in the field called “AI”:
80% don’t believe in the concept of General Intelligence (but instead, in a large collection of specific skills and knowledge)
Of those that do, 80% don’t believe that artificial, human-level intelligence is possible – either ever, or for a long, long time
Of those that do, 80% work on domain-specific AI projects for commercial or academic-political reasons (results are more immediate)
Of those left, 80% have a poor conceptual framework…
Even though the above is a caricature, it contains more than a grain of truth.
A great number of researchers reject the validity or importance of "general intelligence." For many, controversies in psychology (such as those stoked by The Bell Curve) make this an unpopular, if not taboo, subject. Others, conditioned by decades of domain-specific work, simply do not see the benefits of Seed AI — solving the problems only once.
Of those that do not in principle object to general intelligence, many don't believe that AGI is possible — in their lifetime, or ever. Some hold this position because they themselves tried and failed "in their youth." Others believe that AGI is not the best approach to achieving "AI," or are at a total loss as to how to go about it. Very few researchers have actually studied the problem from our (the general intelligence/Seed AI) perspective. Some are actually trying to reverse-engineer the brain — one function at a time. There are also those who have moral objections, or who are afraid of it.
Of course, a great many are so focused on particular, narrow aspects of intelligence that they simply don't get around to looking at the big picture — they leave it to others to make it happen. It is also important to note that there are often strong financial and institutional pressures to pursue specialized AI.
All of the above combine to create a dynamic where Real AI is not "fashionable" — getting little respect, funding, and support — further reducing the number of people drawn into it!
These should be more than enough reasons to account for the dearth of AGI progress. But it gets worse. Researchers actually trying to build AGI systems are further hampered by a myriad of misconceptions, poor choices, and lack of resources (funding and research). Many of the technical issues were explored previously (see sections 3 and 7), but a few others are worth mentioning:
Epistemology. Models of AGI can only be as good as their underlying theory of knowledge — the nature of knowledge, and how it relates to reality. The realization that high-level intelligence is based on conceptual representation of reality underpins design decisions such as adaptive, fuzzy vector encoding, and an interactive, embodied approach. Other consequences are the need for sense-based focus and selection, and contextual activation. The central importance of a highly integrated pattern network — especially one including dynamic patterns — becomes obvious on understanding the relationship between entities, attributes, concepts, actions, and thoughts. These and several other insights lay the foundation for solving problems related to grounding, brittleness, and common sense. Finally, there is still a lot of unnecessary confusion about the relationship between concepts and symbols. A dynamic that continues to handicap AI is the lingering schism between traditionalists and connectionists. This unfortunately helps to perpetuate a false dichotomy between explicit symbols/schema and incomprehensible patterns.
Theory of Mind. Another area of concern is the sloppy formulation and poor understanding of several key concepts: consciousness, intelligence, volition, meaning, emotions, common sense, and "qualia." The fact that hundreds of AI researchers attend conferences every year where key speakers proclaim that "we don't understand consciousness (or qualia, or whatever), and will probably never understand it" indicates just how pervasive this problem is. Marvin Minsky's characterization of consciousness as a "suitcase word" is correct. Let's just unpack it!
Errors like these are often behind research going off at a tangent relative to stated long-term goals. Two examples are an undue emphasis on biological feasibility, and the belief that embodied intelligence cannot be virtual — that it has to be implemented in physical robots.
Cognitive psychology. It goes without saying that a proper understanding of the concept "intelligence" is key to engineering it. In addition to epistemology, several areas of cognitive psychology are crucial to unraveling its meaning. Misunderstanding intelligence has led to some costly disappointments, such as manually accumulating huge amounts of largely useless data (knowledge without meaning), efforts to achieve intelligence by combining masses of dumb agents, or trying to obtain meaningful conversation from an isolated network of symbols.
Project focus. The few projects that do pursue AGI based on relatively sound models run yet another risk: they can easily lose focus. Sometimes commercial considerations hijack a project's direction, while others get sidetracked by (relatively) irrelevant technical issues, such as trying to match an unrealistically high level of performance, fixating on biological feasibility of design, or attempting to implement high-level functions before their time. A clearly mapped-out developmental path to human-level intelligence can serve as a powerful antidote to losing sight of "the big picture." A vision of how to get from "here" to "there" also helps to maintain motivation in such a difficult endeavor.
Research support. AGI utilizes, or more precisely is an integration of, a large number of existing AI technologies. Unfortunately, many of the most crucial areas are sadly under-researched. They include:
Incremental, real-time, unsupervised/ self-supervised learning (vs. back-propagation)
Integrated support for temporal patterns
Dynamically-adaptive neural network topologies
Self-tuning of system parameters, integrating bottom-up (data driven) and top-down (goal/ meta-cognition driven) auto-adaptation
Sense probes with auto-adaptive feature extractors.
Naturally, these very limitations feed back to reduce support for AGI research.
Cost and difficulty. Achieving high-level AGI will be hard. However, it will not be nearly as difficult as most experts think. A key element of "Real AI" theory (and its implementation) is to concentrate on the essentials of intelligence. Seed AI becomes a manageable problem — in some respects much simpler than other mainstream AI goals — by eliminating huge areas of difficult, but inessential, AI complexity. Once we get the crucial fundamental functionality working, much of the additional "intelligence" (ability) required is taught or learned, not programmed. Having said this, I do believe that very substantial resources will be required to scale the system up to human-level storage and processing capacity. However, the far more moderate initial prototypes will serve as proof-of-concept for AGI while potentially seeding a large number of practical new applications.
Understanding general intelligence and identifying its essential components are key to building next-generation AI systems: systems that are far less expensive, yet significantly more capable. In addition to concentrating on general learning abilities, a fast-track approach should also seek a path of least resistance, one that capitalizes on human engineering strengths and available technology. Sometimes, this involves selecting the AI road less traveled.
We believe that the theoretical model, cognitive components, and framework described above, joined with our other strategic design decisions, provide a solid basis for achieving practical AGI capabilities in the foreseeable future. Successful implementation will significantly address many traditional problems of AI. Potential benefits include:
Minimizing initial environment-specific programming (through self-adaptive configuration)
Substantially reducing ongoing software changes, because a large amount of additional functionality and knowledge will be acquired autonomously via self-supervised learning
Greatly increasing the scope of applications, as users teach and train additional capabilities
Improving flexibility and robustness by enabling systems to adapt to changing data patterns, environments and goals.
AGI promises to make an important contribution toward realizing software and robotic systems that are more usable, intelligent, and human-friendly. The time seems ripe for a major initiative down this new path of human advancement that is now open to us.
To be published in the forthcoming book, Real AI: New Approaches to Artificial General Intelligence. Reproduced with permission. See Essentials of General Intelligence: The direct path to AGI for updates.
Aha, D.W. (Ed.) (1997). Lazy Learning. Artificial Intelligence Review, 11:1-5. Kluwer Academic Publishers.
Aleksander, I. (1996). Impossible Minds. Imperial College Press.
Arbib, M.A. (1992). Schema theory. In S.C. Shapiro (Ed.), Encyclopedia of Artificial Intelligence, 2nd ed. (pp. 1427-1443). John Wiley.
Braitenberg, V. (1984). Vehicles: Experiments in synthetic psychology. MIT Press.
Brooks, R.A., and Stein, L.A. (1993). Building brains for bodies. Memo 1439, Artificial Intelligence Lab, Massachusetts Institute of Technology.
Brooks, R.A. (1994). Coherent behavior from many adaptive processes. In D. Cliff, P. Husbands, J.A. Meyer, and S.W. Wilson (Eds.), From animals to animats: Proceedings of the Third International Conference on Simulation of Adaptive Behavior (pp. 421-430). MIT Press.
Churchland, P.M. (1995). The Engine of Reason, the Seat of the Soul: A Philosophical Journey into the Brain. MIT Press.
Clark, A. (1997). Being There: Putting Brain, Body and World Together Again. MIT Press.
Fritzke, B. (1995). A growing neural gas network learns topologies. In Tesauro, G., Touretzky, D.S., and Leen, T.K. (Eds.), Advances in Neural Information Processing Systems 7 (pp. 625-632). MIT Press.
Goertzel, B. (1997). From complexity to creativity: Explorations in evolutionary, autopoietic, and cognitive dynamics. Plenum Press.
Goertzel, B. (2001). Creating internet intelligence: Wild computing, distributed digital consciousness, and the emerging global brain. Plenum Press.
Goldstone, R.L. (1998). Perceptual learning. Annual Review of Psychology, 49, 585-612.
Gottfredson, L.S. (1998). The general intelligence factor. [Special Issue]. Scientific American, 9(4), 2, 24-29.
Grimson, W.E.L., Stauffer, C., Lee, L., and Romano, R. (1998). Using adaptive tracking to classify and monitor activities in a site. Proc. IEEE Conf. on Computer Vision and Pattern Recognition, pp. 22-31.
Harnad, S. (1990). The symbol grounding problem. Physica D, 42, 335-346.
Kelley, D. (1986). The Evidence of the Senses. Louisiana State University Press.
Kosko, B. (1997). Fuzzy Engineering. Prentice Hall.
Lenat, D.B., and Guha, R.V. (1990). Building Large Knowledge Based Systems. Addison-Wesley.
Margolis, H. (1987). Patterns, Thinking, and Cognition: A Theory of Judgment. University of Chicago Press.
McCarthy, J., and Hayes, P.J. (1969). Some philosophical problems from the standpoint of artificial intelligence. Machine Intelligence, 4, 463-502.
Nenov, V.I., and Dyer, M.G. (1994). Language learning via perceptual/motor association: A massively parallel model. In Kitano, H., and Hendler, J.A. (Eds.), Massively Parallel Artificial Intelligence (pp. 203-245). AAAI Press/The MIT Press.
Pfeifer, R., and Scheier, C. (1999). Understanding intelligence. MIT Press.
Pylyshyn, Z.W. (Ed.) (1987). The Robot's Dilemma: The frame problem in A.I. Ablex.
Rand, A. (1990). Introduction to Objectivist Epistemology. Meridian.
Russell, S.J., and Norvig, P. (1995). Artificial Intelligence: A modern approach. Prentice Hall.
Wang, P. (1995). Non-axiomatic reasoning system: Exploring the essence of intelligence. PhD thesis, Indiana University.
Yip, K., and Sussman, G.J. (1997). Sparse representations for fast, one-shot learning. Proc. of National Conference on Artificial Intelligence, July 1997.
Intellectual property is owned by Adaptive A.I. Inc.
"Brittleness" in AI refers to a system's inability to automatically adapt to changing requirements, or to cope with data outside of a predefined range, thus "breaking."
Back-propagation is one of the most powerful supervised training algorithms; it is, however, not particularly amenable to incremental learning.
"Priming," as used in psychology, refers to an increase in the speed or accuracy of a decision that occurs as a consequence of prior exposure or activation.
This section was co-authored with Shane Legg.
Meaning that many different meanings are thrown together in a jumble, or at least packaged together in one "box," under one label.