Gödel, Escher, Bach
Introduction: A Musico-Logical Offering
- Notice that every type of "copy" preserves all the information in the original theme, in the sense that the theme is fully recoverable from any of the copies. Such an information-preserving transformation is often called an isomorphism
- The strange loop phenomenon occurs whenever, by moving upwards (or downwards) through the levels of some hierarchical system, we unexpectedly find ourselves right back where we started... Sometimes I use the term "tangled hierarchy" to describe a system in which a strange loop occurs.
- the tightness of the loop (how many steps before the starting point is regained)
- haziness in level counting
- infinity, since what else is a loop but a way of representing an endless process in a finite way.
- These two levels might be the only explicitly portrayed levels. But their mere presence invites the viewer to look upon himself as part of yet another level; and by taking that step, the viewer cannot help getting caught up in Escher's implied chain of levels, in which, for any one level, there is always another level above it of "greater reality", and likewise, there is always a level below, "more imaginary" than it is. This can be mind-boggling in itself. However, what happens if the chain of levels is not linear, but forms a loop? What is real, then, and what is fantasy?
- A conflict between the finite and the infinite.
- The Epimenides paradox: "All Cretans are liars" is a one-step strange loop.
- Gödel is making mathematical reasoning introspective. The Incompleteness Theorem hinges upon the writing of a self-referential mathematical statement.
- Statements of number theory, but also statements "about" number theory.
- Provability is a weaker notion than truth, no matter what axiomatic system is involved.
- No fixed system, no matter how complicated, could represent the complexity of whole numbers
- the attempts to mechanize the thought process of reasoning. Our ability to reason has often been claimed to be what distinguishes us from other species; so it seems somewhat paradoxical to mechanize that which is most human. Yet even the ancient Greeks knew that reasoning is a patterned process, and is at least partially governed by stateable laws. Aristotle codified syllogisms, and Euclid codified geometry; but thereafter, many centuries had to pass before progress in the study of axiomatic reasoning would take place again.
- Russell's paradox - most sets are not members of themselves, but some "self-swallowing" sets do contain themselves. The culprit is self-reference, but it is difficult to eradicate because it can be hard to figure out just where self-reference is occurring.
- Russell and Whitehead tried to eradicate it, but only at the cost of introducing an artificial-seeming hierarchy. At the bottom is an object language, then a metalanguage, then a metametalanguage, etc
- Babbage and his "Difference Engine" and "Analytical Engine", the AE could "eat its own tail"
- The arrival of computers in the 1930s-40s brought together axiomatic reasoning, mechanical computation, and the psychology of intelligence.
- Turing "ineluctable holes" in any computer
Some essential abilities of intelligence:
- respond to situations very flexibly
- take advantage of fortuitous circumstances
- make sense out of ambiguous or contradictory messages
- recognize the relative importance of different elements of a situation
- find similarities between situations despite differences which may separate them
- find differences between situations despite similarities which may link them
- synthesize new concepts by taking old concepts and putting them together in new ways
- come up with ideas which are novel
The seemingly unbreachable gulf between the formal and the informal, the animate and the inanimate, the flexible and the inflexible.
- The flexibility of intelligence comes from the enormous number of different rules, and levels of rules.
- Without doubt, Strange Loops involving rules that change themselves, directly or indirectly, are at the core of intelligence.
Julien Offray de la Mettrie - L'homme machine ("Man a Machine")
1. The MU Puzzle
- "formal system" - you must not do anything which is outside the rules (the Requirement of Formality)
- "theorem" - instead of being proven, they are produced
- "axiom" - a "free" theorem. A formal system may have zero, one, several, or even infinitely many axioms
- "rules of production/ rules of inference" - to "shunt" strings around
- "derivation" - an explicit, line-by-line demonstration of how to produce a theorem according to the rules of the formal system. Derivation is modeled on proof, but it is an austere cousin of a proof
- "decision procedure" - one requirement on formal systems is that the set of axioms must be characterized by a decision procedure - there must be a litmus test of axiomhood. This ensures that there is no problem in getting off the ground at the beginning at least. That is the difference between the set of axioms and the set of theorems: the former has a decision procedure, but the latter may not.
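These definitions come alive in the MIU system behind the MU puzzle: one axiom (MI), four typographical rules, and theorems that are produced rather than proven. A minimal sketch (the function names are mine, not the book's):

```python
from collections import deque

def miu_successors(s):
    """Apply each of the four MIU production rules to the string s."""
    out = set()
    if s.endswith("I"):                  # Rule 1: xI  -> xIU
        out.add(s + "U")
    if s.startswith("M"):                # Rule 2: Mx  -> Mxx
        out.add("M" + s[1:] * 2)
    for i in range(len(s) - 2):          # Rule 3: xIIIy -> xUy
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])
    for i in range(len(s) - 1):          # Rule 4: xUUy -> xy
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])
    return out

def miu_theorems(max_len):
    """Produce every theorem of length <= max_len by breadth-first
    derivation from the single axiom MI."""
    theorems = {"MI"}
    queue = deque(["MI"])
    while queue:
        for t in miu_successors(queue.popleft()):
            if len(t) <= max_len and t not in theorems:
                theorems.add(t)
                queue.append(t)
    return theorems
```

Working inside the system, this search can only produce theorems; concluding that MU is a nontheorem requires jumping outside it (the count of I's is never divisible by 3).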
Inside and outside the system:
- it is possible to program a machine to do a routine task in such a way that the machine will never notice even the most obvious facts about what it is doing
- it is possible for a machine to act unobservant. It is impossible for a human to act unobservant.
- Most machines made so far are pretty close to being totally unobservant.
- intelligence can jump out of the task it is performing, and survey what it has done; it is always looking for, and often finding, patterns.
- There are cases where only a rare individual will have the vision to perceive a system which governs many people's lives, a system which had never before even been recognized as a system; then such people often devote their lives to convincing other people that the system really is there, and that it ought to be exited from
- It is very important when studying formal systems to distinguish working within the system from making statements or observations about that system.
- Every human being is capable to some extent of working inside a system and simultaneously thinking about what he is doing. Actually, in human affairs, it is often next to impossible to break things neatly up into "inside the system" and "outside the system"; life is composed of so many interlocking and interwoven and often inconsistent "systems".
- Mechanical Mode (M-Mode) and Intelligent Mode (I-Mode) and Un-Mode (U-Mode)
- What do we mean by a test? A guarantee that we will get our answer in a finite length of time.
2. Meaning and Form in Mathematics
- Axiom schema
- Well-formed string - those strings which, when interpreted symbol for symbol, yield grammatical sentences
- Bottom-up - working its way up from the basics
- Top-Down - working its way down to the basics
- Meaningless interpretation - with no isomorphic connection between theorems of the system and reality
- Meaningful interpretation - any old word can be used as an interpretation of p, but only "plus" yields a meaningful interpretation
Do words and thoughts follow formal rules, or do they not? That problem is the problem of this book.
- Any formal system which tells you how to make longer theorems from shorter ones, but never the reverse, has got to have a decision procedure for its theorems. There is not much deep interest in formal systems with lengthening rules only
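The pq-system of this chapter is exactly such a lengthening-only system, so a brute-force decision procedure really does terminate: generate every theorem up to the candidate's length, then check membership. A sketch from my recollection of the system's definition (axiom schema xp-qx-, rule xpyqz -> xpy-qz-):

```python
def is_pq_theorem(s):
    """Decide theoremhood in the pq-system by exhaustive generation.
    Axiom schema: xp-qx- for any hyphen string x.
    Rule of production: from xpyqz, produce xpy-qz-.
    The rule only lengthens strings, so generating all theorems up to
    len(s) is a genuine decision procedure, guaranteed to halt."""
    n = len(s)
    theorems = set()
    x = "-"
    while 2 * len(x) + 4 <= n:           # seed with all short-enough axioms
        theorems.add(x + "p-q" + x + "-")
        x += "-"
    frontier = set(theorems)
    while frontier:
        new = set()
        for t in frontier:
            q = t.index("q")
            longer = t[:q] + "-" + t[q:] + "-"   # xpyqz -> xpy-qz-
            if len(longer) <= n and longer not in theorems:
                new.add(longer)
        theorems |= new
        frontier = new
    return s in theorems
```

Under the interpretation p = plus, q = equals, - = one, the function returns True exactly for correct additions; e.g. --p---q----- reads "2 plus 3 equals 5".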
Isomorphism - when two complex structures can be mapped onto each other, in such a way that to each part of one structure there is a corresponding part in the other structure, where the two parts play similar roles in their respective structures:
- The perception of an isomorphism between two known structures is a significant advance in knowledge - and I claim that it is such perceptions of isomorphism which create meanings in the minds of people.
- It is not always totally clear when you really have found an isomorphism. This word has all the usual vagueness of words - which is a defect but also an advantage
- Symbol-word correspondence is called "interpretation"
- When you confront a formal system you know nothing of, you have to try to assign interpretations to its symbols in a meaningful way - such that a higher-level correspondence emerges between true statements and theorems. The only way to proceed is by trial and error, based on educated guesses. When you hit a good choice, all of a sudden things just feel right, and work speeds up enormously. Pretty soon, everything falls into place.
- Mathematicians, linguists, philosophers, and some others set up formal systems whose theorems reflect some portion of reality isomorphically. The choice of symbols is a highly motivated one, along with the choice of typographical rules of production.
Symbols of a formal system, though initially without meaning, cannot avoid taking on meaning of sorts, at least if an isomorphism is found. The difference between meaning in a formal system and in a language is very important, however:
- In a language, when we have learned a meaning for a word, we then make new statements based on the meaning of the word. In a sense, the meaning becomes "active" - since it brings into being a new rule for creating sentences.
- This means that the command of language is not like a finished product - the rules for making sentences increase when we learn new meanings.
- On the other hand, in a formal system, the theorems are predefined by the rules of production. We can choose meanings based on an isomorphism but we cannot add new theorems to the established theorems due to the requirement of formality
- In a formal system, the meaning must remain passive
An interpretation will be meaningful to the extent that it accurately reflects some isomorphism to the real world. When different aspects of the real world are isomorphic to each other (like plus and minus), one single formal system can be isomorphic to both and take on two passive meanings. This kind of double-valuedness is extremely important!
Can all of reality be turned into a formal system? In a very broad sense the answer seems to be yes, with the rules being the laws of physics and the sole axiom the configuration of the particles at the beginning of time
The digit-shunting laws for multiplication are based mostly on a few properties of addition and multiplication which are assumed to hold for all numbers. The stability of these rules, no matter the context, is part of what we mean by number. But are numbers so clean and crystalline and regular that their nature can be completely captured in the rules of a formal system?
- Reasoning tells us that Euclid's proof of the infinitude of primes is true
- It is achieved through a generalization, but we could never check directly whether it is true. We believe it because we believe in reasoning
We use the word "all" in a few ways which are defined by the thought processes of reasoning:
- That is, there are "rules" which our usage of "all" obeys. We may be unconscious of them, and tend to claim we operate on the basis of the meaning of the word, but that is only a way of saying we are guided by rules which we never make explicit.
- We have used words all our lives in certain patterns, and instead of calling the patterns "rules", we attribute the courses of our thought processes to the "meanings" of words.
- The operations in Euclid's brain when he invented the proof must have involved millions of neurons, many of which fired hundreds of times in a single second. The mere utterance of a sentence involves hundreds of thousands of neurons
3. Figure and Ground
- The requirement of formality is the essential thing which keeps you from mixing up the I-mode and the M-mode - it keeps you from mixing up arithmetical facts with typographical theorems
- Checking whether Cx is not a theorem is not an explicitly typographical operation… You have to go outside the system. This is a rule which violates the whole idea of formal systems, in that it asks you to act informally, ie outside the system.
- Holes in the system are only negatively defined - they are the things that are left out of a list that is positively defined.
- Figure and ground - when a figure is drawn inside a frame, its complementary shape (ground, background, or negative space) has also been defined. In most drawings, the artist is much less interested in the ground than the figure.
- Can you somehow create a drawing containing words in both the figure and the ground?
- A cursively drawable figure is one whose ground is merely an accidental by-product.
- A recursive figure is one whose ground can be seen as a figure in its own right… the figure is "twice-cursive"
- There is a natural and intuitive notion of recognizable forms. Are both the foreground and the background recognizable forms? If so then the drawing is recursive.
- There exist recognizable forms whose negative space is not any recognizable form, or, more technically, there exist cursively drawable figures which are not recursive.
It turns out that:
- There exist formal systems whose negative space (set of non-theorems) is not the positive space (set of theorems) of any formal system
- There exist recursively enumerable sets which are not recursive.
- There exist formal systems for which there is no typographical decision procedure.
- If the members of F were always generated in order of increasing size, then we could always characterize its complement G. The problem is that many r.e. sets are generated by methods which throw in elements in an arbitrary order, so you never know if a number which has been skipped over for a long time will get included if you just wait a little longer.
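That asymmetry can be made concrete. A generator for an r.e. set gives only a semidecision procedure: it answers "yes" in finite time for members, but on a non-member it can only run out of patience, never say a definite "no". A toy sketch (the set of sums of two positive cubes, enumerated in an order that jumps around):

```python
def sums_of_two_cubes():
    """Enumerate a^3 + b^3 for positive a, b, grouped by a + b.
    The values arrive out of numerical order (e.g. 28 before 16),
    just like an r.e. set generated by arbitrary-order rules."""
    k = 2
    while True:
        for a in range(1, k):
            yield a ** 3 + (k - a) ** 3
        k += 1

def semidecide(x, generator, budget):
    """Watch the enumeration for up to `budget` elements.
    True  -> x appeared (definitely a member).
    None  -> patience exhausted: x might still appear later."""
    for i, y in enumerate(generator()):
        if y == x:
            return True
        if i >= budget:
            return None
```

No budget, however large, turns the None answer into a "no": that is exactly why an r.e. set need not be recursive.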
The principle for representing primality formally is:
- that there is a test for divisibility which can be done without any backtracking. You march steadily upward, testing first for divisibility by 2, then 3, and so on.
- It is this monotonicity or unidirectionality that allows primality to be captured. And it is the potential for a formal system to involve arbitrary amounts of backwards-forwards inference that is responsible for such limitative results as Gödel's Theorem, Turing's Halting Problem, and the fact that not all recursively enumerable sets are recursive.
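The monotone test described above can be written down directly: the trial divisor only ever increases, so the procedure is guaranteed to halt, which is what makes primality formally capturable (a routine sketch):

```python
def is_prime(n):
    """Primality by steadily ascending trial division -- no backtracking.
    The divisor d only moves upward, so termination is guaranteed:
    either some d divides n, or d * d overshoots n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False     # found a divisor: n is composite
        d += 1
    return True              # marched past sqrt(n) without a hit: prime
```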
4. Consistency, Completeness, and Geometry
- The more complex the isomorphism, in general, the more "equipment" - both hardware and software - is required to extract the meaning from the symbols.
- The key element in answering the question "What is consciousness?" will be the unraveling of the nature of the "isomorphism" which underlies meaning.
- The story of the Contracrostipunctus itself is an example of the backfirings which it discusses. So it is referring to itself indirectly, in that its own structure is isomorphic to the events it portrays.
Mapping between the Contracrostipunctus and Gödel's theorem:
- Phonograph - axiomatic system for number theory
- Low-fidelity phonograph - weak axiomatic theory
- High-fidelity phonograph - strong axiomatic theory
- Perfect phonograph - complete system for number theory
- Blueprint of phonograph - axioms and rules of formal system
- Record - string of the formal system
- Playable sound - theorem of the axiomatic system
- Unplayable sound - nontheorem of the axiomatic system
- Sound - true statement of number theory
- Reproducible sound - interpreted theorem of the system
- Unreproducible sound - true statement which isn’t a theorem
- Song title "I cannot be played on record player X" - Gödel string "I cannot be derived in formal system X"
Incompleteness in a system is when truth transcends theoremhood in the system.
The History of Euclidean Geometry:
- Euclid, around 300 B.C., compiled and systematized all of what was known about plane and solid geometry in his day in his "Elements", which was a bible of geometry for over 2,000 years
- He was the founder of "rigor" in mathematics
- Every word which we use has a meaning to us, which guides us in our use of it. The more common the word, the more associations we have with it, and the more deeply rooted is its meaning. Therefore, when someone gives a definition for a common word in the hopes that we will abide by that definition, it is a foregone conclusion that we will not do so but will instead be guided, largely unconsciously, by what our minds find in their associative stores
- Euclid's first four postulates:
- A straight line segment can be drawn joining any two points.
- Any straight line segment can be extended indefinitely in a straight line.
- Given any straight line segment, a circle can be drawn having the segment as radius and one end point as center.
- All right angles are congruent.
- The fifth is not so terse and elegant:
- If two lines are drawn which intersect a third in such a way that the sum of the inner angles on one side is less than two right angles, then the two lines inevitably must intersect each other on that side if extended far enough.
- Euclid could find no proof for the fifth postulate and so had to assume it.
- In 1823, non-Euclidean geometry was discovered simultaneously by János Bolyai and Nikolai Lobachevsky, while Adrien-Marie Legendre believed he had proved the fifth postulate.
- They didn’t deny the fifth postulate, but defined instead the parallel postulate:
- Given any straight line, and a point not on it, there exists one, and only one, straight line which passes through that point and never intersects the first line, no matter how far they are extended.
- If you assert that no such line exists, you get elliptical geometry; if you assert that two or more such lines exist, you get hyperbolic geometry.
- If you use everyday words like "line" and "point", these are "undefined terms" which get defined only implicitly
- Consistency is not a property of a formal system per se, but depends on the interpretation which is proposed for it, so inconsistency is not an intrinsic property of any formal system
Varieties of Consistency:
- Consistency of a formal system is when every theorem, when interpreted, becomes a true statement.
- Inconsistency is when there is at least one false statement among the interpreted theorems
- Internal consistency is when all interpreted theorems are compatible with one another.
- Internal inconsistency is when two or more theorems have interpretation which are incompatible with each other.
- A system plus interpretation is consistent with the external world if every theorem comes out true
- Internal consistency depends on consistency with the external world, but now the external world can be any imaginable world, instead of the one we live in.
- The most lenient consistency would be logical consistency, then mathematical consistency, then physical consistency, then biological consistency.
- Usually, the borderline between uninteresting and interesting is drawn between physical and mathematical consistency.
- Formal systems are often built up in a kind of sequential or hierarchical manner. FS1 may be built up with rules and axioms that give certain intended passive meanings to its symbols. Then it is incorporated into FS2, with more symbols. The passive meanings from FS1 remain valid and are a skeleton for determining the passive meaning of the new symbols in FS2. And FS2 can be incorporated into FS3 and so on.
If we want to be able to communicate at all, we have to agree on some common base, and it pretty well has to include logic.
The core of number theory, the counterpart to absolute geometry (first 4 postulates), is called Peano arithmetic. Number theory is a bifurcated theory, with standard and nonstandard versions, and it can have an infinite number of different brands.
Completeness:
- Consistency is when every theorem, upon interpretation, comes out true (in some imaginable world).
- Completeness is when all statements which are true (in some imaginable world), and which can be expressed as well-formed strings of the system, are theorems.
5. Recursive Structures and Processes
What is recursion? It is nesting and variations on nesting. The concept is very general (stories inside stories, movies inside movies, paintings inside paintings, Russian dolls inside Russian dolls):
- A recursive definition never defines something in terms of itself, but always in terms of simpler versions of itself.
- To "push" means to suspend operations on the task you're currently working on, without forgetting where you are - and to take up a new task. The new task is usually said to be "on a lower level" than the earlier task
- To "pop" is the reverse - it means to close operations on one level, and to resume operations exactly where you left off, one level higher.
- How do you remember exactly where you were on each level? You store the relevant information in the "stack". When you pop back up a level, it is the stack that restores your context.
- Modulation in music is a type of stack, and as music listeners, we don't have very reliable deep stacks.
- German sentences, with the verb at the end, are kinds of stacks too. Every language has constructions which involve stacks, and there are always ways of rephrasing sentences so that the depth of stacking is minimal.
- At least one pathway inside a Recursive Transition Network must not involve any recursive calls in order to avoid an infinite regress, so that the definition will, eventually, "bottom out".
- A structure in which there is no single "highest level", or monitor, is called a heterarchy.
- Expanding a node is a little like replacing a letter in an acronym by the word it stands for.
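The push/pop/stack vocabulary above maps directly onto an explicit stack in code. A toy sketch (the nested-story structure and names are mine): pushing saves our position in the current story, popping restores it exactly.

```python
def walk_nested_stories(story):
    """Traverse a nested-story structure with an explicit stack.
    Each element is either a sentence (str) or an inner story (list).
    A push suspends the current level without forgetting where we are;
    a pop resumes exactly where we left off, one level higher."""
    stack = []                          # saved (story, position) contexts
    current, pos = story, 0
    trace = []
    while True:
        if pos == len(current):
            if not stack:
                return trace            # back at the top level: done
            current, pos = stack.pop()  # pop: restore the saved context
            trace.append(f"pop to level {len(stack)}")
            continue
        item = current[pos]
        pos += 1
        if isinstance(item, list):
            stack.append((current, pos))     # push: save context, descend
            trace.append(f"push to level {len(stack)}")
            current, pos = item, 0
        else:
            trace.append(item)
```

Walking ["A", ["B", ["C"], "D"], "E"] interleaves the sentences with push/pop events, mirroring the stories within stories of the Little Harmonic Labyrinth.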
Recursion at the lowest level of matter:
- Particles are nested inside each other in a way which can be described recursively, perhaps even by some sort of grammar.
- If particles didn't interact with each other, things would be incredibly simple. Particles without interactions are called bare particles, and they don't exist.
- When you turn on the interactions, then particles get tangled together. These real particles are said to be renormalized. No particles can even be defined without referring to all other particles, whose definitions in turn depend on the first particles, etc, in a never-ending loop.
- There is a sort of grammar to Feynman diagrams, which is the result of basic laws of physics, such as conservation of energy, conservation of electric charge, etc, and like the grammar of languages, it has a recursive structure, which allows deep nestings of structures inside each other.
Copies and Sameness
- Escher took the idea of an object's parts being copies of the object itself and made it into Fishes and Scales
- A fish's DNA, sitting in each one of the fish's cells, is a very convoluted copy of the entire fish
- What is there that is the "same" about all Escher drawings? A creator's signature or style is contained inside every tiny section of his creations
- When are two things the same? We shall see how deeply this simple question is connected with the nature of intelligence.
- Recursion is based on the "same" thing happening on several different levels at once. But the events on different levels aren't exactly the same - rather we find some invariant feature in them, despite many ways in which they differ. In the Little Harmonic Labyrinth, all the stories on different levels are quite unrelated - their sameness resides only in the fact that they are stories and they involve the Tortoise and Achilles.
Programming and Recursion:
- One of the essential skills in computer programming is to perceive when two processes are the same in this extended sense, for that leads to modularization - the breaking up of a task into natural subtasks.
- Instead of writing out a sequence of many similar operations to be carried out you can write a loop, and the body of the loop can vary in some predictable ways.
- A bounded loop is one where the maximum number of steps is known in advance. A free loop is dangerous, because it may never finish, leaving the computer in an infinite loop.
- Loops can be nested in one another
- Loops are a type of subroutine or procedure, which simply group a set of operations together into a single unit that can be called by name, which leads to modularization in programming. The procedure is fed a list of parameters which guide its choices of what operations to perform.
- When programming a chess computer, the computer tries to identify the best move, and needs to then look at the opponent and work out what their best move is in the new context. This involves a "look-ahead tree", with the move as the trunk, responses as main branches, counter-responses as subsidiary branches, and so on.
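The look-ahead tree is naturally written as a recursive procedure that calls itself once per branch, flipping perspective at each level. A hypothetical sketch (the move generator and evaluation function are stand-ins supplied by the caller, not chess-specific):

```python
def look_ahead(position, depth, maximizing, moves, evaluate):
    """Score a position by exploring the look-ahead tree (minimax):
    the current move is the trunk, responses are the main branches,
    counter-responses the subsidiary branches, and so on, down to a
    fixed depth where a static evaluation is applied."""
    options = moves(position)
    if depth == 0 or not options:
        return evaluate(position)            # leaf of the tree
    scores = [look_ahead(p, depth - 1, not maximizing, moves, evaluate)
              for p in options]
    return max(scores) if maximizing else min(scores)
```

Each recursive call pushes one level deeper into the tree and evaluates the position from the opponent's point of view, which is why the maximizing flag alternates.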
Hofstadter's Law:
- It always takes longer than you expect, even when you take into account Hofstadter's Law!
Recursion and Unpredictability:
- For a set to be recursively enumerable means that it can be generated from a set of starting points (axioms) by the repeated application of rules of inference. Thus the set grows and grows, with each new element compounded somehow out of previous elements, in a sort of "mathematical snowball". But this is the essence of recursion - something being defined in terms of simpler versions of itself, instead of explicitly (like Fibonacci numbers).
- Recursive enumeration is a process in which new things emerge from old things by fixed rules.
- Suitably complicated recursive systems might be strong enough to break out of any predetermined patterns. And isn't this one of the defining properties of intelligence?
- Instead of just considering programs composed of procedures which can recursively call themselves, why not get really sophisticated and invent programs which can modify themselves, extending them, improving them, generalizing them, fixing them and so on? This kind of tangled recursion probably lies at the heart of intelligence.
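The Fibonacci numbers mentioned above are the standard example of such a definition in terms of simpler versions of itself, bottoming out at explicit base cases:

```python
def fib(n):
    """FIBO(n) = FIBO(n - 1) + FIBO(n - 2): each value is compounded
    out of previous elements, the 'mathematical snowball' in miniature."""
    if n <= 2:
        return 1            # base cases: the recursion bottoms out here
    return fib(n - 1) + fib(n - 2)
```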
6. The Location of Meaning
When is one thing not always the same?
The idea of an objective meaning of a message will turn out to be related to the simplicity with which intelligence can be described:
- There are cases where by investing sufficient effort, you can pull very recondite pieces of information out of certain structures. In fact, the pulling-out may involve such complicated operations that it makes you feel you are putting in more information than you are pulling out.
- a molecule of DNA - a genotype - is converted into a physical structure - a phenotype - by a very complex process, involving the manufacture of proteins, the replication of the DNA, the replication of cells, the gradual differentiation of cell types, etc. This unrolling of phenotype from genotype - epigenesis - is the most tangled of tangled recursions.
- The DNA's structure contains the information of the phenotype's structure, ie the two are isomorphic. However, the isomorphism is an exotic one - it is highly non-trivial to divide the phenotype and genotype into parts which can be mapped onto each other. Prosaic isomorphisms, by contrast, would be one in which the parts of one structure are easily mappable onto the parts of the other.
- "genetic meaning" - information about phenotype structure - is spread all through the small parts of a molecule of DNA, though no-one knows the language yet. Note that understanding this "language" is different from cracking the genetic code. The genetic code is like figuring out the phonetic values of the letters of a foreign alphabet, without figuring out the grammar of the language or the meaning of any of its words (which, I think, we have made some progress on over the last 45 years!)
- This genetic meaning contained in DNA is one of the best possible examples of implicit meaning. A set of mechanisms far more complicated than the genotype must operate on it to make the phenotype, and the various parts of the genotype serve as triggers for these mechanisms. Portions of the DNA trigger the manufacture of proteins, those proteins trigger hundreds of new reactions, they in turn trigger the replicating-operation which, in several steps, copies the DNA and so on. The phenotype is the "revelation", the pulling-out of the information that was present in the DNA.
Do the fragments of a smashed record contain intrinsic meaning?
- The edges of the separate pieces fit together and in that way allow the information to be reconstituted - but something much more complex is going on here.
- Then there is the question of the intrinsic meaning of a scrambled telephone call.
- There is a vast spectrum of degrees of inherency of meaning. It is interesting to place epigenesis in this spectrum. As development of an organism takes place, can it be said that the information is being "pulled out" of its DNA? Is that where all of the information about the organism's structure resides?
- The DNA relies on the fact that some very complex cellular chemical processes will happen, but does not seem to contain any code which brings them about:
- Is it that so much of the information is outside the DNA that it is not reasonable to look upon the DNA as anything more than a very intricate set of triggers, like a sequence of buttons to be pushed on a jukebox (ie chemical context is necessary), or
- Is all the information there but in a very implicit form (and only intelligence is needed to pull out this "intrinsic meaning")?
Identifying messages:
- If we were to receive a message from an alien civilization, how would we recognize it as a message at all? How to identify a frame?
- If an alien civilization were to find a record of Bach, its shape, acting as a trigger, gives them some information - it seems to be an artifact and may be an information-bearing artifact. This idea, triggered by the object itself, creates a new context in which it can be perceived.
- If it turns out that beings throughout the universe do share cognitive structures with us to the extent that even emotions overlap, then in some sense, the record can never be out of its natural context - that context is part of the scheme of things in nature.
- It would be much harder for a record of found sounds from John Cage to be understood as we do. The issue is whether any message has, per se, enough compelling inner logic that its context will be restored automatically whenever intelligence of a high enough level comes in contact with it. If some message did have that context-restoring property, then it would seem reasonable to consider the meaning of the message as an inherent property of the message.
- One of the ways that we identify decoding mechanisms is the fact that they do not add any meaning to the signs or objects which they take as input: they merely reveal the intrinsic meaning of those signs or objects.
Generally we can say that meaning is part of an object to the extent that it acts upon intelligence in a predictable way.
Three layers of any message:
- To understand the inner message is to have extracted the meaning intended by the sender.
- To understand the frame message is to recognize the need for a decoding mechanism.
- To understand the outer message is to build, or know how to build, the correct decoding mechanism for the inner message.
We normally use a shorthand beneath which there lies a wealth of subconscious, deliberately concealed or declared associations so extensive and intricate that they probably equal the sum and uniqueness of our status as an individual person:
- The way of listening to a composition by Elliott Carter is radically different from the way of listening appropriate to a work by John Cage.
- A novel by Beckett must in a significant sense be read differently from one by Bellow
- A painting by de Kooning and one by Warhol require different perceptional-cognitive attitudes
- Perhaps works of art are trying to convey their style more than anything else. If you could ever plumb a style to its very bottom, you could dispense with the creations in that style. "Style", "outer message", "decoding technique" - all ways of expressing the same basic idea
- Where an aperiodic crystal is found "packaged" inside a very regular geometric structure, there may lurk an inner message.
One cannot avoid the problem that one has to find out how to decode the inner message from the outside. The inner message itself may provide clues and confirmations, but those are at best triggers acting upon the bottle finder (or upon the people whom he enlists to help):
- It seems that brains come equipped with hardware for recognizing that certain things are messages, and for decoding those messages. This minimal inborn ability to extract inner meaning is what allows the highly recursive, snowballing process of language acquisition to take place. The inborn hardware is like a jukebox: it supplies the additional information which turns mere triggers into complete messages.
In our chauvinism, we would call any being with a brain sufficiently like our own, "intelligent", and refuse to recognize other types of objects as intelligent.
Suppose we take the initial pair of values (1, 1) as a genotype from which the phenotype (the full Fibonacci sequence) is pulled out by a recursive rule. By sending the genotype alone, we fail to send the information which allows reconstitution of the phenotype. The genotype does not contain the full specification of the phenotype. The second version - a "long" genotype - contains so much information that the mechanism by which the phenotype is pulled out can be inferred by intelligence alone. The long genotype transmits not only an inner message, but also an outer message, which enables the inner message to be read. The clarity of the outer message resides in the sheer length of the message. This is not unexpected - it parallels precisely what happens in deciphering ancient texts, where the likelihood of success depends crucially on the amount of text available.
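The short-versus-long genotype point can be sketched in a few lines (my own illustration, not from the book): the bare pair (1, 1) is only an inner message; the recursive rule is the decoding mechanism the receiver must already possess.

```python
def phenotype(genotype, n=10):
    """Unfold the phenotype (the Fibonacci sequence) from the short
    genotype (1, 1) using the recursive rule. The rule itself is the
    outer message that the bare pair (1, 1) fails to transmit."""
    seq = list(genotype)
    while len(seq) < n:
        seq.append(seq[-1] + seq[-2])  # the recursive rule
    return seq

# A long prefix of the sequence, by contrast, is a "long genotype":
# its sheer length lets an intelligent receiver infer the rule.
print(phenotype((1, 1)))  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```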
The John Cage record does carry meaning, but it is like the buttons on a jukebox. The meaning is mostly contained inside the listener to begin with, and the music serves only to trigger it. And this jukebox, unlike pure intelligence, is highly earthbound, depending on idiosyncratic sequences of events all over our globe for long periods of time. Hoping that John Cage's music will be understood by another civilization is like hoping that your favorite tune, on a jukebox on the moon, will have the same code buttons as in a saloon in Saskatoon.
To appreciate Bach requires far less cultural knowledge than to appreciate Cage, even though Bach is so much more complex and organized and Cage so devoid of intellectuality. But this is a strange reversal: intelligence loves patterns and balks at randomness. For most people, the randomness in Cage's music requires much explanation, and even after explanations, they may feel they are missing the message - whereas with much of Bach, words are superfluous. In that sense, Bach's music is more self-contained than Cage's. Still, it is not clear how much of the human condition is presumed by Bach.
7. The Propositional Calculus
- Propositional reasoning depends on the correct usage of the words "and", "if... then", "or", and "not".
- The "fantasy rule" lets you write down any well-formed string and ask what if this were an axiom or a theorem
- You can "push" into a fantasy, see its premise, a series of theorems and its outcome, and then "pop" back up to the previous level
- You can carry over theorems from the level above into the fantasy, but you cannot export theorems out from the fantasy up to the previous level. Otherwise you could write anything as the first line of a fantasy, and then lift it out into the real world as a theorem.
- One way to look at the fantasy rule is to say that an observation about the system is inserted into the system.
- In the propositional calculus the whole thing is done purely typographically. There is nobody down "in there" thinking about the meaning of the strings. It is all done mechanically, thoughtlessly, rigidly, even stupidly.
- The rules of the system are:
- Joining rule - If x and y are theorems then (x+y) is a theorem.
- Separation rule - If (x+y) is a theorem then x and y are both theorems.
- Double tilde rule - The string "~~" can be deleted from any theorem or inserted into any theorem, provided that the resulting string is, itself, well-formed
- Fantasy rule (deduction theorem) - If y can be derived when x is assumed to be a theorem then "if x then y" is a theorem.
- Carry-over rule - Inside a fantasy, any theorem from the "reality" one level higher can be brought in and used.
- Rule of detachment (modus ponens) - If x and "if x then y" are both theorems, then y is a theorem
- Contrapositive rule - "if x then y" and "if not y then not x" are interchangeable
- De Morgan's rule - "not x and not y" and "not (x or y)" are interchangeable
- Switcheroo rule - "x or y" and "if not x then y" are interchangeable
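A few of these rules can be sketched as pure string manipulations, done "mechanically, thoughtlessly", with nothing inspecting meaning. This is my own toy illustration: the ASCII notation ('&' for and, '->' for if-then, '~' for not, angle brackets for grouping) is an assumption, and separation below handles only conjuncts containing no '&' of their own.

```python
def joining(x, y):
    """Joining rule: from theorems x and y, produce <x&y>."""
    return f"<{x}&{y}>"

def separation(s):
    """Separation rule: from a theorem <x&y>, recover x and y.
    (Toy version: assumes the first conjunct contains no '&'.)"""
    assert s.startswith("<") and s.endswith(">")
    x, y = s[1:-1].split("&", 1)
    return x, y

def detachment(x, implication):
    """Rule of detachment (modus ponens): from x and <x->y>, produce y."""
    prefix = f"<{x}->"
    assert implication.startswith(prefix) and implication.endswith(">")
    return implication[len(prefix):-1]

theorem = joining("P", "~Q")        # '<P&~Q>'
print(separation(theorem))          # ('P', '~Q')
print(detachment("P", "<P->Q>"))    # 'Q'
```

Nobody "in there" thinks about what P or Q mean; the functions shuffle characters, which is exactly the point of the typographical system.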
- The propositional calculus specifies the form of statements that are universally true, and this throws a new light onto the core truths of the universe: they are not only fundamental, but also regular: they can be produced by one set of typographical rules
- Could there conceivably be a mechanical decision procedure which distinguished genuine Zen koans from other things?
Shortcuts and derived rules:
- You can never give an ultimate, absolute proof that a proof in some system is correct. You can give a proof of a proof of a proof - but the validity of the outermost system always remains an unproven assumption, accepted on faith.
- A derived theorem or theorem schema is a derived rule. It is part of the knowledge we have about the system. The theory about the propositional calculus is metatheory, reasoning in the I-mode, outside the system.
- As soon as you admit a shortcut, you are outside the system. You could formalize the metatheory too, but no matter how many levels you formalize, someone will eventually want to make shortcuts in the top level.
- Even if a system can think about itself, it is not outside itself
- A proof is always something informal, a product of normal thought, while a derivation tries to reach the same goal via a logical structure whose methods are all explicit and very simple.
- Derivations are often much longer than their corresponding proofs. A proof is simple but depends on the complexities of human thought, while a derivation is often so big that it is impossible to grasp.
- In systems based on the Propositional Calculus, contradictions cannot be contained; they infect the whole system like an instantaneous global cancer
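A worked sketch of the infection (my own derivation, using only the rules listed above): suppose both P and ~P were theorems; then any string Q whatsoever would follow.

```latex
\begin{align*}
&(1)\ \neg P && \text{given theorem} \\
&(2)\ \quad [\ \text{push into fantasy} \\
&(3)\ \quad \neg Q && \text{premise} \\
&(4)\ \quad \neg P && \text{carry-over of (1)} \\
&(5)\ \quad ]\ \text{pop out of fantasy} \\
&(6)\ \langle \neg Q \supset \neg P \rangle && \text{fantasy rule} \\
&(7)\ \langle P \supset Q \rangle && \text{contrapositive} \\
&(8)\ Q && \text{detachment, using theorem } P
\end{align*}
```

Since Q was arbitrary, a single contradiction makes every well-formed string a theorem.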
8. Typographical Number Theory
- Three examples of indirect self-reference are included in the preceding dialog. To see them, you have to look at the form, as well as the content.
- Open formulas, with free variables, express a property; strings whose variables are all quantified express a truth or falsity.
- A formula with at least one free variable - an open formula - is called a predicate.
- Assertion of existence and universal assertion.
The five Peano postulates:
- Genie is a djinn
- Every djinn has a meta (which is also a djinn)
- Genie is not the meta of any djinn
- Different djinns have different metas
- And... If Genie has X, and each djinn relays X to its meta, then all djinns get X
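In the usual notation (0 for Genie, the successor function S for "meta", and X for an arbitrary property), the five postulates read:

```latex
\begin{align*}
&1.\ \ 0 \in \mathbb{N} \\
&2.\ \ \forall n \in \mathbb{N}:\ S(n) \in \mathbb{N} \\
&3.\ \ \forall n \in \mathbb{N}:\ S(n) \neq 0 \\
&4.\ \ \forall m, n \in \mathbb{N}:\ S(m) = S(n) \Rightarrow m = n \\
&5.\ \ \bigl(X(0) \wedge \forall n\,(X(n) \Rightarrow X(S(n)))\bigr) \Rightarrow \forall n\, X(n)
\end{align*}
```

Postulate 5 is the induction principle: Genie gets X, every djinn relays X to its meta, so all djinns get X.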
Incompleteness:
- A rule concluding "all" from infinitely many cases cannot be used in the M-mode. Only people who are thinking about the system can ever know that an infinite set of strings are all theorems. Thus this is not a rule that can be stuck inside any formal system.
- A system is ω-incomplete if all the strings in a pyramidal family are theorems, but the universally quantified summarizing string is not a theorem.
- TNT now has the same strength as the system in Principia Mathematica, and it can now prove every theorem that you would find in a standard treatise on number theory
- If TNT were complete then all number theorists would be put out of business. As it turns out, this is impossible
- Gödel showed that in order to pull the heavy rope across the gap, you can't use a lighter rope; there just isn't a strong enough one. Any system that is strong enough to prove TNT's consistency is at least as strong as TNT itself. And so circularity is inevitable.
9. Mumon and Gödel
- One of the basic tenets of Zen Buddhism is that there is no way to characterize what Zen is. No matter what verbal space you try to enclose Zen in, it resists, and spills over. But Zen koans are a central part of Zen study, verbal though they are. Koans are supposed to be triggers which, though they do not contain enough information in themselves to impart enlightenment, may possibly be sufficient to unlock the mechanisms inside one's mind that lead to enlightenment. But in general, the Zen attitude is that words and truth are incompatible, or at least that no words can capture truth.
- Mumon (no gate) in China (1183 to 1260) and Fibonacci in Italy (1180 to 1250) lived at the same time.
- Zen paradoxes attempt to "break the mind of logic". When one is in a bewildered state, one's mind does begin to operate nonlogically. Only by stepping outside of logic can one make the leap to enlightenment
Dualism:
- Dualism is the conceptual division of the world into categories. In fact it is just as much a perceptual division of the world into categories: human perception is by nature a dualistic phenomenon.
- As soon as you perceive an object, you draw a line between it and the rest of the world. You divide the world artificially into parts and you thereby miss the way.
- Words lead to some truth - some falsehood perhaps as well - but certainly not to all truth. Relying on words to lead you to the truth is like relying on an incomplete formal system to lead you to the truth.
- The dilemma of mathematicians is what else is there to rely on but formal systems?
- The dilemma of Zen people is, what else is there to rely on but words? "It cannot be expressed with words and it cannot be expressed without words."
- "The way does not belong to things seen, nor to things unseen. It does not belong to things known, nor to things unknown. Do not seek it, study it, or name it. To find yourself on it, open yourself wide as the sky."
Ism:
- Ism is an antiphilosophy, a way of being without thinking. The masters of ism are rocks, trees, clams, but it is the fate of higher animal species to have to strive for ism, without ever being able to attain it fully. Still one is granted glimpses of ism.
- To suppress perception, to suppress logical, verbal, dualistic thinking - this is the essence of Zen, of ism. This is the Un-mode - not Intelligent, not Mechanical, just "Un"
- Zen is holism carried to its logical extreme. If holism claims that things can only be understood as wholes, not as sums of their parts, Zen goes one further, in maintaining that the world cannot be broken into parts at all. To divide the world into parts is to be deluded, and to miss enlightenment.
- An enlightened state is one where the borderlines between the self and the rest of the universe are dissolved. But what is that state, if not death? How can a live human being dissolve the borderlines between himself and the outside world?
- There is always further to go; enlightenment is not the end-all of Zen. And there is no recipe which tells how to transcend Zen. Zen is a system and cannot be its own metasystem; there is always something outside of Zen, which cannot be fully understood or described within Zen.
- What kind of decoding mechanism, I wonder, would it take to suck the three baskets out of one character? Perhaps one with two hemispheres.
Indra’s Net:
- Tells of an endless net of threads throughout the universe, the horizontal threads running through space, the vertical ones through time. At every crossing of threads is an individual, and every individual is a crystal bead.
- The great light of "Absolute Being" illuminates and penetrates every crystal bead, and every crystal bead reflects not only the light from every other crystal in the net - but also every reflection of every reflection throughout the universe.
- This brings forth an image of renormalized particles: in every electron, there are virtual photons, positrons, neutrinos, muons… In every photon, there are virtual electrons, protons, neutrons, pions… In every pion there are…
- But then another image arises: that of people, each one reflected in the minds of many others, who in turn are mirrored in yet others, and so on.
Mu and Muon:
- If you do not pass the barrier of the patriarchs or if your thinking road is not blocked, whatever you think, whatever you do, is like a tangling ghost. You may ask: What is a barrier of a patriarch? This one word "MU" is it.
- There is a way to embed all problems about any formal system in number theory.
- The discovery of Gödel-numbering has been likened to the discovery, by Descartes, of the isomorphism between curves in a plane and equations in two variables: incredibly simple, once you see it - and opening onto a vast new world.
- Typographical rules for manipulating numerals are actually arithmetical rules for operating on numbers.
- This simple observation is at the heart of Gödel's method, and it will have an absolutely shattering effect. It tells us that once we have a Gödel-numbering for any formal system, we can straightaway form a set of arithmetical rules which complete the Gödel isomorphism. We can transfer the study of any formal system into number theory.
- The set of producible numbers of any system is a recursively enumerable set. And the set of nonproducible numbers? Is that always recursively enumerable?
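A sketch of the isomorphism, following the book's numbering for the MIU-system (M→3, I→1, U→0): the typographical rules become arithmetical rules. The code below (my own, not from the book) arithmetizes the first two rules; the other two - replacing III by U, and dropping UU - are similar but messier, since they act in the middle of the numeral.

```python
def rule1(n):
    """xI -> xIU: a number ending in 1 may have a 0 appended."""
    assert n % 10 == 1
    return n * 10

def rule2(n):
    """Mx -> Mxx: after the leading 3, the rest may be doubled."""
    digits = len(str(n)) - 1          # how many digits follow the 3
    m = n - 3 * 10**digits            # the part after the M
    return 3 * 10**(2 * digits) + m * 10**digits + m

# Starting from MI = 31, derive MII = 311, then MIIII = 31111:
print(rule2(31))      # 311
print(rule2(311))     # 31111
print(rule1(31111))   # 311110  (MIIIIU)
```

Nothing here mentions strings: producing theorems of MIU and producing the numbers 31, 311, 31111, … are one and the same process.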
- Through MUMON and strings like it, TNT is now capable of speaking "in code" about the MIU system.
- Is MUMON a theorem of TNT?
- In reality there is no such thing as an uncoded message. There are only messages written in more familiar codes, and those written in less familiar codes. The meaning of a message is pulled out of the code by some sort of mechanism, or isomorphism. It may be difficult to discover the method by which the decoding should be done; but once that method has been discovered, the message becomes transparent as water.
- When a code is familiar enough, it ceases appearing like a code. One forgets that there is a decoding mechanism. The message is identified with its meaning.
Gödel-numbering TNT:
- The natural trick would be to turn TNT’s capability of mirroring other formal systems back on itself.
- TNT contains strings which talk about other strings of TNT. The metalanguage in which we, on the outside, can speak about TNT, is at least partially imitated inside TNT itself. The architecture of any formal system can be mirrored inside N (number theory). It is in the nature of any formalization of number theory that its metalanguage is embedded within it.
- The Central Dogma of Mathematical Logic - a string of TNT has an interpretation in N; and a statement of N may have a second meaning as a statement about TNT.
- G says "G is not a theorem of TNT", so NOT-G must say "G is a theorem":
- G: "I am not a theorem of TNT."
- NOT-G: "My negation is a theorem of TNT."
10. Levels of Description, and Computer Systems
- We go to the doctor, who looks at us on lower levels than we think of ourselves. We read about DNA and "genetic engineering" and sip our coffee.
- Flickering dots and the movie we're watching - we have these two wildly different representations of what is on the screen, but that does not confuse us. We can just shut one out, and pay attention to the other - which is what all of us do. Which one is more real? It depends whether you're a human, a dog, a computer, or a television set.
- One of the major problems of AI research is to figure out how to bridge the gap between these two descriptions; how to construct a system which can accept one level of description, and produce the other.
- Chess masters perceive the distribution of pieces in chunks. Certain types of situation recur - certain patterns - and the grand master is sensitive to those high-level patterns. He thinks on a different level from the novice - his set of concepts is different.
- The grand master rarely looks ahead any further than the novice, and he usually examines only a handful of possible moves. He sees the board through a filter - he literally does not see the bad moves, just as even the novice does not see the illegal moves.
- This might be called implicit pruning of the giant branching tree of possibilities, in contrast to the explicit pruning of thinking of all available moves and then rejecting them one by one
- Similarly, a gifted mathematician doesn't usually think up and try out all sorts of false pathways to the desired theorem, as less gifted people might do; rather, he just "smells" the promising paths and takes them immediately.
- Intelligence depends crucially on the ability to create high-level descriptions of complex arrays such as chess boards, television screens, printed pages, or paintings.
- What is confusing is when a single system admits of two or more descriptions on different levels which nevertheless resemble each other in some way. Then we find it hard to avoid mixing levels when we think about the system, and can easily get totally lost.
- Undoubtedly, this happens when we think about our own psychology - for instance, when we try to understand people’s motivations for various actions. There are many levels in the human mental structure. Our confusion about who we are is certainly related to the fact that we consist of a large set of levels, and we use overlapping language to describe ourselves on all of those levels.
Computer levels:
- How does the computer know what instruction to execute at any given time? The CPU has a special pointer which points at (ie stores the address of) the next word which is to be interpreted as an instruction. The CPU fetches that word from memory, and copies it electronically into a special word belonging to the CPU itself (words in the CPU are normally called "registers"). Then the CPU executes that instruction.
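The fetch-execute cycle can be sketched as a toy interpreter (a hypothetical four-instruction machine of my own invention, not any real CPU), with pc playing the role of the special pointer register:

```python
def run(program):
    """Fetch-decode-execute loop for a toy machine with two
    registers: pc (instruction pointer) and acc (accumulator)."""
    pc, acc = 0, 0
    while True:
        op, arg = program[pc]   # fetch the word pc points at
        pc += 1                 # advance the pointer
        if op == "SET":
            acc = arg
        elif op == "ADD":
            acc += arg
        elif op == "JMPNZ":     # jump if acc is nonzero
            if acc != 0:
                pc = arg
        elif op == "HALT":
            return acc

print(run([("SET", 5), ("ADD", 7), ("HALT", None)]))   # 12
```

Note that an instruction is just a word in memory: the same data either is or isn't an instruction depending only on whether pc happens to point at it.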
- The difference between machine language and assembly language is like that between painfully specifying each nucleotide, atom by atom, and specifying it by simply giving its name (ie, A, G, C, or T). There is a tremendous saving of labor in this very simple chunking operation, although conceptually not much has been changed.
- There is no reason to be reluctant about describing things from a higher-level point of view. So one can think of the assembly language program running concurrently with the machine language program. We have two modes of describing what the CPU is doing.
- Compilers operate at the next higher level, as do interpreters, but the latter work in real time, reading a new line, "understanding it", and executing it simultaneously.
- As sophistication increased, we realized a partially written compiler could compile extensions of itself, so that once a minimal core of a compiler had been written, it could translate bigger compilers into machine language - which in turn could translate yet bigger compilers, until the final, full-blown compiler had been compiled.
- This is called bootstrapping, and is similar to the attainment by a child of a critical level of fluency in his native language, from which point on his vocabulary and fluency can grow by leaps and bounds, since he can use language to acquire new language.
- System programmers struggle to provide high-level, rather than low-level error messages. Bugs are only manifest to people at a high level.
- Next up, emulators and operating systems.
- It is virtually certain that there is something like an operating system in the brain, handling of many stimuli at the same time, decisions of what should have priority over what and for how long, instantaneous interrupts caused by emergencies, etc
- The many levels in a complex computer system have the combined effect of cushioning the user, preventing him from having to think about the many lower-level goings-on which are most likely totally irrelevant to him anyway.
- It's possible to allow imprecision in program inputs, but once such translation is built in, these transgressions are no longer transgressions, because they have been built into the rules
- Sometimes the user is aware of the built-in flexibilities of the language and its translator, but there are so many of them and they interact with each other in such a complex way that he cannot tell how his programs will be interpreted. Even the author of the program may not be able to anticipate how it will react to a given type of unusual construction.
- One key for the understanding and creation of intelligence lies in the constant development and refinement of the languages in terms of which processes for symbol manipulation are describable.
- The space of all possible programs is so huge that no one can have a sense of what is possible. Each higher-level language is naturally suited for exploring certain regions of program space, thus the programmer, by using that language, is channeled into those areas of program space. He is not forced by the language into writing programs of any particular type, but the language makes it easy for him to do certain kinds of things. Proximity to a concept, and a gentle shove, are often all that is needed for a major discovery - and that is the reason for the drive towards languages of ever higher levels.
- The idea that you know all about yourself is so familiar from interaction with people that it was natural to extend it to the computer… Their question was not unlike asking a person "Why are you making so few red blood cells today?" People do not know about the "operating system" level of their bodies.
- All the flexibility (of system inputs) has to bottom out somewhere. There must be a hardware level which underlies it all and which is inflexible. It may lie deeply hidden, and there may be so much flexibility on levels above it that few users feel the hardware limitations, but it is inevitably there.
- The amazing flexibility of our minds seems nearly irreconcilable with the notion that our brains must be made out of fixed-rule hardware, which cannot be reprogrammed. We cannot make our neurons fire faster or slower, we cannot rewire our brains, we cannot redesign the interior of a neuron, we cannot make any choices about the hardware - and yet we can control how we think.
- But there are clearly aspects of thought which are beyond our control. We cannot make ourselves smarter by an act of will, we cannot learn a new language as fast as we want, we cannot make ourselves think faster than we do, we cannot make ourselves think about several things at once.
- This is a kind of primordial self-knowledge which is so obvious that it is hard to see it at all, it is like being conscious that the air is there.
- to suggest ways of reconciling the software of mind with the hardware of brain is a main goal of this book.
From tornados to quarks:
- When a team of football players assembles, the individual players retain their separateness, but some processes are going on in their brains which are evoked by the team-context, and which would not go on otherwise, so that in a minor way, the players change identity when they become part of the larger team - this is a "nearly decomposable system" involving weakly interacting modules, like many systems studied in physics.
- There are also "nearly indecomposable systems", where modules have nearly no independent identity outside the system.
- Various levels of analysis, from nuclear physicist to atomic physicist, to chemist, to molecular biologist, to cell biologist. Each level is, in some sense, "sealed off".
- Although there is always some leakage between the hierarchical levels of science, so that a chemist cannot afford to ignore lower-level physics entirely, or a biologist chemistry entirely, there is almost no leakage from one level to a distant level. That is why people can have intuitive understandings of other people without necessarily understanding the quark model, the structure of nuclei, the nature of electron orbits, the structure of proteins, the organelles in a cell, the methods of intercellular communication, the physiology of the various organs in the human body, or the complex interactions among organs. All that a person needs is a chunked model of how the highest level acts, and as we all know, such models are very realistic and successful.
- We sacrifice determinism for simplicity. A chunked model defines a space within which behavior is expected to fall, and specifies probabilities of its falling in different parts of that space.
- As you program in ever higher-level languages, you know less and less precisely what you’ve told the computer to do.
- All the internal rumbling provoked by the input of a high-level statement is invisible to you, just as when you eat a sandwich you are spared conscious awareness of the digestive processes that it triggers.
- It is a visible consequence of the overall system organization - an epiphenomenon.
11. Brains and Thoughts
- Thought must depend on representing reality in the hardware of the brain.
- We must have active symbols, rather than passive typographic symbols.
- Not all descriptions of a person need be attached to some central symbol for that person, which stores the person’s name. Descriptions can be manufactured and manipulated in themselves. We can invent nonexistent people by making descriptions of them; we can merge two descriptions when we find they represent a single entity; we can split one description into two when we find it represents two things, not one - and so on.
- This « calculus of descriptions » is at the heart of thinking. It is said to be intensional and not extensional, which means that descriptions can « float » without being anchored down to specific, known objects. The intensionality of thought is connected to its flexibility; it gives us the ability to imagine hypothetical worlds, to amalgamate different descriptions or chop one description into separate pieces, and so on.
- Fantasy and fact intermingle very closely in our minds, and this is because thinking involves the manufacture and manipulation of complex descriptions, which need in no way be tied down to real events or things.
- A flexible, intensional representation of the world is what thinking is all about
Nerve cells:
- The most important cells in the brain are nerve cells, or neurons, of which there are about 10bn. There are about 10 times as many glia.
- Each neuron possesses a number of synapses (entry ports) and one axon (output channel).
- The input and output are electrochemical flows: that is, moving ions.
- In between the entry ports and the output channel is the cell body, where "decisions" are made.
- The type of decision which a neuron faces, up to 1k times per second, is whether or not to fire, ie to release ions down its axon, which eventually will cross over into the entry ports of one or more other neurons, thus causing them to make the same sort of decision. If the sum of all inputs exceeds a certain threshold, the decision is yes; otherwise no.
- Some inputs can be negative, which cancel out positive inputs, and it is simple addition which rules the lower level of the mind
- There may be as many as 200k separate entry ports to a neuron.
- The cerebrum is the largest part of the human brain and is divided into left and right hemispheres.
- The outer few mm of each hemisphere are coated with a layered "bark", or cerebral cortex. The amount of cerebral cortex is the major distinguishing feature, in terms of anatomy, between human brains and brains of less intelligent species.
- Cat, monkey, and human brains all have dedicated areas of the cortex at the back of their brains where visual processing is done: the visual cortex. In each of them, this area is broken up into three subregions, called areas 17, 18, and 19. Within each column in each area, layers of simple, complex, and hyper-complex cells are organized in similar ways, but from this level down, each individual is different.
Symbols:
- A common phenomenon is the sense of something "crystallizing" in your mind at the moment of recognition, which takes place not when the light rays hit your retina, but sometime later, after some part of your intelligence has had a chance to act on the signals.
- We are led to the conclusion that for each concept there is a fairly well-defined module which can be triggered - a neural complex. These hypothetical neural complexes are "symbols", which can be either dormant or awake (activated).
- When a symbol is activated, it sends out messages, or signals, whose purpose is to try to awaken or trigger other symbols.
- Symbols symbolize things and neurons don’t. Symbols are the hardware realizations of concepts.
- It seems reasonable to think that the brush strokes of language are also brush strokes of thought, and therefore that symbols represent concepts of about this size. A symbol is roughly something for which you know a word or stock phrase, or with which you associate a proper name. And the representation in the brain of a more complex idea, such as a problem in a love affair, would be a very complicated sequence of activations of various symbols by other symbols.
- Each symbol can be a category or individual, a class or an instance, a type or a token. Instance symbols can exist side by side with class symbols, and are not just modes of activation of the latter.
- The prototype principle - The most specific event can serve as a general example of a class of events.
- There is generality in the specific.
- Instance symbols often inherit many of their properties from the classes to which they belong. Unconsciously, you will rely on a host of presuppositions about a movie. These are built into the class symbol as expected links to other symbols (ie, potential triggering relations) and are default options, which provide reasonable guesses, but can be overridden by the specificity of a particular instance.
- A fresh and simple instance is like a child without its own ideas or experiences - it relies entirely on its parents’ experiences and opinions and just parrots them. But gradually, as it interacts more and more with the rest of the world, the child acquires its own idiosyncratic experiences and inevitably begins to split away from the parents. Eventually, the child becomes a full-fledged adult. In the same way, a fresh instance can split off from its parent class over a period of time, and become a class, or prototype, in its own right.
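Default options and their overriding can be sketched as a dictionary of expected links (my own illustration; the "movie" attributes are hypothetical):

```python
# Class symbol: expected links to other symbols, built in as defaults.
MOVIE_DEFAULTS = {"duration_hours": 2, "has_sound": True, "in_color": True}

def instance_symbol(class_defaults, **specifics):
    """An instance symbol starts from its class's reasonable guesses;
    specific knowledge about this particular instance overrides them."""
    links = dict(class_defaults)   # parrot the parent class...
    links.update(specifics)        # ...until idiosyncratic experience splits off
    return links

generic = instance_symbol(MOVIE_DEFAULTS)
silent_film = instance_symbol(MOVIE_DEFAULTS, has_sound=False, in_color=False)
print(generic["has_sound"])       # True  (default guess from the class)
print(silent_film["has_sound"])   # False (overridden by specificity)
```

The fresh instance with no specifics is the "child without its own ideas"; as specifics accumulate, less and less of the parent class shows through.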
- When encountering someone for the first time, their symbol starts as a satellite orbiting around its mother symbol, like an artificial satellite circling the Earth, which is so much bigger and more massive. Then there comes an intermediate stage where one symbol is more important than the other, but they can be seen as orbiting around each other - something like the Earth and the Moon. Finally, the new symbol becomes quite autonomous; now it might easily serve as a class symbol around which could start rotating new satellites - symbols for other people who are less familiar but who have something in common with him, and for whom he can serve as a temporary stereotype, until you acquire more information, enabling the new symbols also to become autonomous.
- The problem of the « reality » of boundaries drawn between what are perceived to be autonomous or semi-autonomous clusters will create endless trouble when we relate it to symbols in the brain.
- We do not have a separate instance for every object or person we meet. We can rely on a single class symbol, eg "person", to timeshare itself among all the different people
- Two or more symbols can act as one, under the proper conditions.
- Overlapping and tangled symbols are probably the norm, so that each neuron, far from being a member of a unique symbol, is probably a functioning part of hundreds of symbols.
- Perhaps several symbols can coexist in the same set of neurons by having different characteristic firing patterns
- Is it possible that one single symbol could be isolated from all others? Probably not. Just as objects in the world always exist in a context of other objects, so symbols are always connected to a constellation of other symbols.
- A symbol ‘s identity lies precisely in its ways of being connected (via potential triggering links) to other symbols. The network by which symbols can potentially trigger each other constitutes the brain’s working model of the real universe, as well as of the alternate universes which it considers.
- Our facility for making instances out of classes and classes out of instances lies at the basis of our intelligence, and it is one of the great differences between human thought and the thought processes of other animals. Other species do not form general concepts as we do, or imagine hypothetical worlds - variants on the world as it is, which aid in figuring out which future pathway to choose.
- Bees and other insects do not seem to have the power to generalize.
- What would happen if I did this? This type of thought process requires an ability to manufacture instances and to manipulate them as if they were symbols standing for objects in a real situation, although that situation may not be the case, and may never be the case.
- Nouns may have fairly localized symbols, while verbs and prepositions might have symbols with many "tentacles" reaching all around the cortex.
- You can create artificial universes, in which there can happen nonreal events with any amount of detail that you care to imbue them with. But the class symbols themselves, from which all of this richness springs, are deeply grounded in reality.
- We have in our brains chunked laws not only of how inanimate objects act, but also of how plants, animals, people, and societies act - chunked laws of biology, psychology, sociology and so on.
- Our representation of reality ends up being able only to predict probabilities of ending up in certain parts of abstract spaces of behavior - not to predict anything with the precision of physics.
12. Minds and Thoughts
- There can be no isomorphism between two brains on the neural or macroscopic suborgan level.
- But on the symbol level, there could be functional isomorphisms between symbols and triggering patterns.
- These would not be exact (even identical twins have different memories and thought symbols), but clearly some humans think more alike than others do.
- What is a partial isomorphism or conceptual nearness?
- It is not accurate to think of a symbol as simply on or off. Each symbol can be activated in many different ways, and the type of activation will be influential in determining which other symbols it tries to activate.
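A minimal spreading-activation sketch of this idea (an assumed toy model, not anything specified in the book; the link weights are invented): a symbol carries a graded activation level, and weighted links determine which neighbors it tends to excite.

```python
# Toy spreading activation: symbols are not simply on or off; each holds
# a graded activation level, and weighted triggering links decide which
# other symbols get partially excited. All weights here are made up.

from collections import defaultdict

links = {
    "dog": {"bone": 0.9, "leash": 0.6},
    "bone": {"dog": 0.5, "skeleton": 0.4},
}

def spread(activation, steps=1, threshold=0.1):
    for _ in range(steps):
        nxt = defaultdict(float, activation)
        for sym, level in activation.items():
            for nbr, w in links.get(sym, {}).items():
                nxt[nbr] += level * w          # graded, not binary
        # prune symbols whose activation stays below threshold
        activation = {s: a for s, a in nxt.items() if a >= threshold}
    return activation

print(spread({"dog": 1.0}))   # bone and leash become partially active
```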
- Two spider webs are never exactly the same, yet there is still some sort of style or form that infallibly brands a given species' web.
- "Jabberwocky" contains made-up words that are "exciters of nearby symbols" through sounding similar.
- When considering the translations, an exact translation is not possible, but some rough equivalence is attainable. There is a kind of rough isomorphism, partly global, partly local, between the brains of all the readers of these three poems.
The ASU:
- Your personal ASU (your attempt to recreate a roadmap of the USA from memory) will be very much like the USA in the area where you grew up. And wherever your travels have chanced to lead you or where you have perused maps with interest, your ASU will have spots of striking agreement with the USA: a few small towns in N. Dakota or Montana, perhaps, or the whole of metropolitan NY might be quite faithfully reproduced.
- There is no local or global isomorphism between the USA and your ASU. Some correspondences will extend down to the very local level, yet there may not be a single town that is found in both Montanas. What is relevant is the centrality of the city, in terms of economics, communication, transportation, etc. The more vital the city, the more certain it will be to occur in both the ASU and the USA.
- We can use the agreement of big cities to establish points of reference with which I can communicate the location of smaller cities in my ASU to yours.
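A rough sketch of this anchoring idea (my own illustration; the graphs and the use of degree centrality as a stand-in for economic/communication "vitality" are assumptions): the most central cities recur in both maps and serve as shared reference points.

```python
# Two "maps" never match town for town, but the most vital cities (here
# approximated by degree centrality, i.e. number of direct connections)
# appear in both and can anchor communication about smaller towns.

def centrality(roads):
    deg = {}
    for a, b in roads:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg

usa = [("NY", "Chicago"), ("NY", "Boston"), ("NY", "DC"), ("Chicago", "Denver")]
asu = [("NY", "Chicago"), ("NY", "DC"), ("Chicago", "Fargo")]

shared = set(centrality(usa)) & set(centrality(asu))
# rank the shared cities by their combined centrality in both maps
anchors = sorted(shared,
                 key=lambda c: centrality(usa)[c] + centrality(asu)[c],
                 reverse=True)
print(anchors[0])   # the most vital shared city -> 'NY'
```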
- It is necessary to start with identical external conditions - otherwise nothing will match.
- It would be useful to be able to pinpoint what this invariant core of human intelligence is, and then to be able to describe the kinds of "embellishments" which can be added to it, making each one of us a unique embodiment of this abstract and mysterious quality called "intelligence".
Common knowledge:
- A large proportion of every human's network of symbols is universal. It is so taken for granted that we don't even notice it. Instead we look beyond the standard overlap and generally find some major differences, as well as some unexpected additional overlap.
- The triggering patterns of people with other languages will be somewhat different from our own, but still the major class symbols, and the major routes between them, will be universally available, so that more minor routes can be described with reference to them.
- What's the difference between true fluency and a mere ability to communicate? Choice of word and word frequency, but mainly there is the problem of different associations, which are attached to the culture as a whole - its history, geography, religion, children's stories, literature, technological level, and so on.
- It is not the difference in native language, but that in culture (or subculture) that gives rise to perceptual difference. The relationships between the symbols of people with different native languages have every reason to be quite similar, because everyone lives in the same world. When you come down to more detailed aspects of the triggering patterns, you will find that there is less in common.
- In the ASU, a thought corresponds to a trip. The towns which are passed through represent the symbols which are excited.
- But when a thought recurs in someone's mind sufficiently often, it can get chunked into a single concept. In the ASU a commonly-taken trip would become, in some fashion, a new town or city! Therefore we must remember that cities represent not only the elementary symbols but also symbols which get created as a result of the chunking ability of the brain.
- If virtually any sequence of symbols can be activated in any desired order, it may seem that a brain is an indiscriminate system, which can absorb or produce any thought whatsoever. But, in fact, there are certain kinds of thoughts which we call knowledge or beliefs, which play quite a different role from random fancies, or humorously entertained absurdities. How can we characterize the difference between dreams, passing thoughts, beliefs, and pieces of knowledge?
- There are some pathways which are taken routinely in going from one place to another. Other pathways can only be followed if one is led through them by the hand. There are "potential pathways", which would be followed only if special external circumstances arose.
- The pathways which one relies on over and over again are pathways that incorporate knowledge - facts (declarative knowledge) and how-tos (procedural knowledge)
- Pieces of knowledge merge gradually with beliefs. This leaves us with fancies, lies, falsities, absurdities, etc. All these "aberrant" kinds of thoughts are composed, at rock bottom, completely out of beliefs or pieces of knowledge.
- Dreams are perhaps just random meanderings about the ASU's of our minds.
- "Jabberwocky" is like an unreal journey around an ASU, hopping from one state to another very quickly, following very curious routes.
Describing our minds:
- We are able to describe in chunked terms the activity of our minds at any given time. We chunk only that part which is active, but if someone asks about a subject which is coded in a currently inactive area, we can almost instantly gain access to the appropriate area and come up with a chunked description of it. We have zero information on the neural level of the brain: our description is so chunked that we don't even know what part of our brain we are talking about.
- The number of far-fetched pathways which can be followed in our brains is without bound, just as is the number of insane itineraries that could be planned on an ASU. But what is a "sane" itinerary? Or a "reasonable" thought? The brain does not forbid any pathway, but maybe offers more resistance to some than to others.
- External circumstances at any time will play a large determining role in choosing the route.
- Thoughts that clash totally may be produced by a single brain, depending on the circumstances. We all are bundles of contradictions, and we manage to hang together by bringing out only one side of ourselves at a given time. The selection cannot be predicted in advance, because the conditions which force the selection are not known in advance. What the brain state can provide, if properly read, is a conditional description of the selection of routes.
- A chunked description of a brain state will consist of a probabilistic catalogue, in which are listed those beliefs which are most likely to be induced (and those symbols which are most likely to be activated) by various sets of reasonably likely circumstances, themselves described on a chunked level
The self:
- Consciousness is that property of a system that arises whenever there exist symbols in the system which obey triggering patterns somewhat like the ones described above.
- The symbol for the self is probably the most complex of all the symbols in the brain. It is a kind of subsystem, rather than a mere symbol - a constellation of symbols, each of which can be separately activated under the control of the subsystem itself. It functions almost as an independent "subbrain" equipped with its own repertoire of symbols which can trigger each other internally. It also communicates with the rest of the brain. It is a symbol that has grown so complicated that it has many subsymbols which interact among themselves.
- Other subsystems represent the people we know intimately. A subsystem symbolizing a friend can activate many of the symbols in my brain just as I can. I can virtually feel myself in his shoes, running through thoughts that he might have, activating symbols in sequences which reflect his thinking patterns more accurately than my own. My symbol of my friend constitutes my own chunked description of his brain.
- My friend and I share many symbols, including the basic symbols for many objects.
- The self-subsystem can play the role of soul - in communicating constantly with the rest of the subsystems and symbols in the brain, it keeps track of what symbols are active, and in what way. It has to have symbols for mental activity - symbols for symbols and symbols for the actions of symbols.
- This way of describing awareness - as the monitoring of brain activity by a subsystem of the brain itself - seems to resemble the nearly indescribable sensation which we call consciousness.
- All the stimuli coming into the system are centered on one small mass in space. It would be quite a glaring hole in a brain's symbolic structure not to have a symbol for the physical object in which it is housed, and which plays a larger role in the events it mirrors than any other object.
13. BlooP and FlooP and GlooP
- Primitive recursivity and general recursivity
- An orderly system of sufficient complexity that it can mirror itself cannot be totally orderly - it must contain some strange, chaotic features
- Recursive function theory
- Primitive recursive truths involve only predictably terminating calculations. These core truths serve for N as Euclid's first four postulates served for geometry; they allow you to throw out certain candidates before the game begins, on the grounds of "insufficient power". From here on out, the representability of all primitive recursive truths will be the criterion for calling a system "sufficiently powerful".
- The quest for decision procedures for formal systems involves solving the mystery of unpredictably long searches - chaos - among the integers.
- Is nature chaotic or patterned? And what is the role of intelligence in determining the answer to this question?
- Can every problem, like an orchard, be seen from such an angle that its secret is revealed? Or are there some problems in number theory which, no matter what angle they are seen from, remain mysteries?
- BlooP is our language for defining predictably terminating calculations. The standard name for functions which are BlooP-computable is primitive recursive functions; and the standard name for properties that can be detected by BlooP-tests is primitive recursive predicates. Thus the function "2 to the 3 to the n" is a primitive recursive function and the statement, "n is a prime number" is a primitive recursive predicate.
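The BlooP/FlooP distinction can be sketched in Python (my own illustration: Python's `for` over a pre-computed range stands in for BlooP's bounded loops, and an unbounded `while` stands in for FlooP's free loops):

```python
# BlooP-style: every loop has a ceiling fixed before it starts, so the
# program predictably terminates. FlooP-style: unbounded while-loops are
# allowed, so termination is no longer guaranteed.

def power(base, exp):                 # BlooP-style building block
    result = 1
    for _ in range(exp):              # "loop at most exp times"
        result *= base
    return result

def two_to_the_three_to_the(n):       # the primitive recursive 2^(3^n)
    return power(2, power(3, n))

def is_prime(n):                      # a primitive recursive predicate
    if n < 2:
        return False
    for d in range(2, n):             # bounded test -> always terminates
        if n % d == 0:
            return False
    return True

def search(pred):                     # FlooP-style: unbounded search;
    n = 0                             # may never terminate for some pred
    while not pred(n):
        n += 1
    return n

print(two_to_the_three_to_the(2))     # 2^9 -> 512
print(is_prime(17))                   # -> True
```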
- We must distinguish between representability and expressibility. Expressing a predicate is a mere matter of translation from English into a strict formalism. For a predicate to be represented, on the other hand, is a much stronger notion. It means that:
- All true instances of the predicate are theorems.
- All false instances are nontheorems.
- From Cantor, we get two distinct types of infinity:
- One kind describes how many entries there can be in an infinite directory or table
- Another describes how many real numbers there are, and this latter is "bigger", in the sense that the real numbers cannot be squeezed into a table whose length is described by the former kind of infinity
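Cantor's diagonal argument, in miniature (a finite table standing in for the infinite one): the diagonal-flipped sequence differs from row k at position k, so it cannot appear anywhere in the table.

```python
# Diagonalization: flip the k-th digit of the k-th row. The resulting
# sequence disagrees with every row somewhere, so no table (directory)
# of this kind can list all binary sequences / real numbers.

table = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 0, 1],
]

diagonal_flip = [1 - table[k][k] for k in range(len(table))]
print(diagonal_flip)                          # -> [1, 0, 1, 0]
print(all(diagonal_flip != row for row in table))   # -> True: not in table
```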
- The idea (from Gödel, Cantor, Turing) is to feed the termination tester its own Gödel number
- No one has ever found any more powerful computer language than FlooP. It is widely believed that this is impossible. This hypothesis was formulated in the 1930s by Alan Turing and Alonzo Church and is called the Church-Turing Thesis. If the Thesis were false, people could calculate Reddiag(N) for any value of N even though there is no way to program a computer to do so; if it is true, Reddiag lies beyond rule-governed human calculation as well.
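The diagonal trick against a claimed termination tester can be made concrete (my own sketch; `halts` below is a deliberately naive, hypothetical candidate): `diag` does the opposite of whatever the tester predicts about it, so the tester must mispredict `diag` fed to itself.

```python
# Any candidate termination tester can be refuted by diagonalization.
# Here `halts` is a hypothetical (necessarily fallible) tester; `diag`
# inverts its verdict about itself.

def halts(program, data):
    return False          # a naive candidate: predicts "never halts"

def diag(program):
    if halts(program, program):
        while True:       # predicted to halt? then loop forever
            pass
    return "halted"       # predicted to loop? then halt at once

# The tester says diag(diag) never halts...
print(halts(diag, diag))  # -> False
# ...yet it plainly does, refuting the tester on this very input:
print(diag(diag))         # -> 'halted'
```

Any other total candidate for `halts` falls to the same construction: whichever answer it gives about `diag` on itself, `diag` does the opposite.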
- BlooP-computable is synonymous with "primitive recursive". Now FlooP-computable functions can be divided into:
- Those which are computable by terminating FlooP programs, i.e. are general recursive
- Those which are computable only by nonterminating FlooP programs, i.e. are partial recursive
14. On Formally Undecidable Propositions of TNT and Related Systems
- The arithmetical version of quining - arithmoquining - will allow us to make a TNT-sentence which is "about itself".
- a' is the Gödel number of the formula gotten by arithmoquining the formula with Gödel number a
- a' is the arithmoquinification of a
- It's not enough to quine - you must quine a quine-mentioning sentence
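GEB's "quining" operation - preceding a phrase by its own quotation - is easy to mechanize, and shows why quining a quine-mentioning phrase is the crucial step:

```python
# To quine a phrase: write it preceded by itself in quotes. Quining an
# ordinary phrase yields a sentence about that phrase; quining a phrase
# that mentions quining yields a sentence about itself.

def quine(phrase):
    return f'"{phrase}" {phrase}'

print(quine("is a sentence fragment"))
# -> "is a sentence fragment" is a sentence fragment

liar = quine("yields falsehood when quined")
print(liar)
# -> "yields falsehood when quined" yields falsehood when quined
# This sentence asserts the falsity of the result of quining its quoted
# part - i.e., of itself: Epimenides by substitution, with no "this".
```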
- We have gradually pulled a high-level interpretation - a sentence of meta-TNT - out of what was originally a low-level interpretation - a sentence of number theory.
- We have achieved a way of substituting a description of a number, rather than its numeral, into a predicate.
- It can be shown, by lengthy but fairly straightforward reasoning, that - as long as TNT is consistent - this oath-of-consistency by TNT is not a theorem of TNT. So TNT's powers of introspection are great when it comes to expressing things, but fairly weak when it comes to proving them. This is quite a provocative result, if one applies it metaphorically to the human problem of self-knowledge.
- TNT's incompleteness is of the omega variety. This means that there is some infinite pyramidal family of strings all of which are theorems, but whose associated "summarizing string" is a non-theorem.
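Schematically, the pyramidal family looks like this (my notation: P stands in for the relevant proof-pair predicate, written informally rather than in strict TNT symbols):

```latex
% Every member of the infinite pyramidal family is separately a theorem:
\vdash \neg P(0), \quad \vdash \neg P(S0), \quad \vdash \neg P(SS0), \;\dots
% yet the single string summarizing them all at once is not:
\qquad\text{but}\qquad \nvdash \;\forall a{:}\,\neg P(a)
% That gap between the instances and their universal summary is
% omega-incompleteness.
```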
- This shows that TNT can be extended, just as absolute geometry could be.
- We will call the numbers that ~G is announcing to us, the supernatural numbers (following negative, irrational, complex, and imaginary numbers)
- Supernatural numbers share all the properties of natural numbers, as long as those properties are given to us in theorems of TNT. Everything that can be formally proven about natural numbers is thereby established also for supernatural numbers.
- They are best visualized as infinitely large integers. Although TNT can rule out negative numbers, fractions, irrational numbers, and complex numbers, it cannot rule out infinitely large integers. The problem is that there is no way even to express the statement "There are no infinite quantities."
- Let us call the number which makes a TNT-proof-pair with G's Gödel number "I"; it should be just the size of a number which specifies the structure of a proof of G, no bigger, no smaller.
- "I", like "i" (the square root of -1) is nonunique. It is some specific one of the many possible supernatural numbers which form TNT-proof-pairs with the arithmoquinification of u.
- A supernatural theorem of TNT - namely G - may assert a falsity, but all natural theorems still assert truths.
- Supernatural schoolchildren who learn their supernatural plus-tables cannot know their supernatural times-tables - and vice versa. You cannot know both at the same time.
- When we decided to formalize TNT, we preselected the terms we would use as interpretation words (number, plus, times, and so on), thereby committing ourselves to whatever passive meanings these terms might take on. We didn't know that there would be some questions about numbers which TNT would leave open, and which therefore could be answered by extensions of TNT heading off in different directions.
- Physicists will always use a variety of different geometries, choosing in any given situation the one that seems simplest and most convenient. They don't just study the 3D space we live in, but also Hilbert space, momentum space, reciprocal space, phase space, etc.
- Every theorem of TNT remains a theorem in any extension of TNT.
- You fit your mathematics to the world, and not the other way round.
- The thought processes involved in doing mathematics, just like those in other areas, involve "tangled hierarchies" in which thoughts on one level can affect thoughts on any other level. Levels are not cleanly separated, as the formalist version of what mathematics is would have one believe.
15. Jumping out of the System
- Gw was not clever enough to foresee its own embeddability inside number theory.
- Any system, no matter how complex or tricky it is, can be Gödel-numbered, and then the notion of its proof-pairs can be defined - and this is the petard by which it is hoist. Once a system is well-defined, or "boxed", it becomes vulnerable.
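The "boxing" step is mechanical, as a small Gödel-numbering sketch shows (my own toy coding via prime-power exponents, not Gödel's exact scheme): once a system's strings are integers, statements about the system become statements about numbers.

```python
# Toy Gödel numbering: code character k of a string as (k-th prime)
# raised to the character's code. Unique factorization makes the coding
# reversible, so nothing about the string is lost.

def primes(n):
    # first n primes by trial division (fine for short strings)
    found = []
    candidate = 2
    while len(found) < n:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found

def godel_number(s):
    num = 1
    for p, ch in zip(primes(len(s)), s):
        num *= p ** ord(ch)
    return num

def decode(num):
    chars = []
    k = 0
    while num > 1:
        p = primes(k + 1)[-1]
        exp = 0
        while num % p == 0:
            num //= p
            exp += 1
        chars.append(chr(exp))
        k += 1
    return "".join(chars)

g = godel_number("S0=0")
print(decode(g))   # -> 'S0=0': the string is fully recoverable
```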
- TNT is "essentially incomplete". The downfall occurs essentially because the system is powerful enough to have self-referential sentences.
- Some fundamentally new kind of step has been taken - a sort of irregularity has been encountered. Thus a new name must be supplied ad hoc.
- As the ordinals get bigger and bigger, there are irregularities, and irregularities in the irregularities, and irregularities in the irregularities in the irregularities, etc. No single schema, no matter how complex, can name all the ordinals. And from this, it follows that no algorithmic method can tell how to apply the method of Gödel to all possible kinds of formal systems. And unless one is rather mystically inclined, one must therefore conclude that any human being simply will reach the limits of his own ability to Gödelize at some point. From there on out, formal systems of that complexity, though admittedly incomplete for the Gödel reason, will have as much power as that human being.
- "Lucas cannot consistently assert this sentence." - He is just on a par with a sophisticated formal system.
- It is possible for a program to modify itself, but such modifiability has to be inherent in the program to start with, so that cannot be counted as an example of "jumping out of the system".
- It is important to see the difference between perceiving oneself, and transcending oneself. You cannot quite break out of your own skin and be on the outside of yourself. TNT can talk about itself, but it cannot jump out of itself.
- Why add Sagredo at all? It gives the impression of stepping out of the system, in some intuitively appealing sense.
- Perhaps self-transcendence is the central theme of Zen. A Zen person is always trying to understand more deeply what he is, by stepping more and more out of what he sees himself to be, by breaking every rule and convention which he perceives himself to be chained by - including those of Zen itself. Somewhere along this elusive path may come enlightenment. By gradually deepening one's self-awareness, by gradually widening the scope of "the system", one will in the end come to a feeling of being at one with the entire universe.
16. Self-Ref and Self-Rep
- The word sequences are the tips of the icebergs and the processing which must be done to understand them is the hidden part.
- Self-reproducing object = self-rep - we want to have the feeling that, to the maximum extent possible, it explicitly contains the directions for copying itself. There is an intuitive borderline, on one side of which we perceive true self-directed self-rep, and on the other, mere copying carried out by an inflexible and autonomous copying system.
- Self-referencing object = self-ref
- What is a copy?
- Self-rep by retrograde motion - reversing
- Self-rep by translation - into another language. G is an outstanding example of a self-ref via translation
- Why is the act of making young called "self-reproduction"? There is a coarse-grained isomorphism between parent and child, which preserves the information about species. What is reproduced is the class, not the instance.
- Complex systems where self-reps vie against each other for survival.
- What is the original? We will see self-reps where data, program, interpreter, and processor are all extremely intertwined and in which self-rep involves replicating all of them at once.
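The classic programmatic self-rep is a quine, where one string plays both roles - data (a template of the program's text) and program (instructions for copying the template into itself). Ignoring these comment lines, the two-line program below prints exactly its own source:

```python
# A minimal Python quine: s is a template for the program's text, and
# print(s % s) substitutes the template into itself. Data and program
# are the same object, playing both roles at once.

s = 's = %r\nprint(s %% s)'
print(s % s)
```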
Typogenetics
- The Central Dogma of Molecular Biology = DNA > RNA > proteins
- We do begin with some arbitrary strand, somewhat like an axiom in a formal system. But we have, initially, no "rules of inference" - that is, no enzymes. However, we can translate each strand into one or more enzymes! Thus, the strands themselves will dictate the operations which will be performed on them, and those operations will in turn produce new strands which will dictate further enzymes, etc. This is mixing levels with a vengeance.
- Definition of gene - that portion of a strand that codes for a single enzyme.
- The mechanism which reads strands and produces the enzymes which are coded inside them is called a ribosome
- The genetic code is a mapping from triplets of nucleotides into amino acids. The genetic code is stored in the DNA itself
- From its central throne in the nucleus, DNA sends off long strands of messenger RNA to the ribosomes in the cytoplasm, and the ribosomes, making use of the flashcards of tRNA hovering about them, efficiently construct proteins, amino acid by amino acid, according to the blueprint contained in the mRNA. Only the primary structure of the proteins is dictated by the DNA, but this is enough, for as they emerge from the ribosomes, the proteins "magically" fold up into complex conformations which then have the ability to act as powerful chemical machines.
- Enzymes are the universal mechanisms for getting things done in the cell. There are enzymes which stick things together and take them apart and modify them and activate them and deactivate them and copy them and repair them and destroy them.
- There are several levels of meaning which can be read from a strand of DNA, depending on the size of the chunks:
- Lowest level (bases/nucleotides) - each DNA strand codes for an equivalent RNA strand - the process of decoding being "transcription".
- If you chunk DNA into triplets, then a "genetic decoder" can read the DNA as a sequence of amino acids. This is "translation".
- DNA is readable as a code for a set of proteins. The physical pulling-out of proteins from genes is called "gene expression". (This is currently the highest level we know how to read.)
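The two lowest reading levels can be sketched directly (the codon-to-amino-acid entries below are a small excerpt of the real genetic code; the full code maps all 64 triplets, and real transcription produces the complement of the template strand - the simple T-to-U rewrite here follows the coding-strand convention):

```python
# Level 1 (transcription): DNA strand -> equivalent RNA strand.
# Level 2 (translation): RNA read in chunks of three (codons) -> amino acids.

CODON_TABLE = {            # RNA triplet -> amino acid (partial table)
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "AAA": "Lys",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}

def transcribe(dna):
    return dna.replace("T", "U")      # coding-strand shorthand

def translate(rna):
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = CODON_TABLE.get(rna[i:i + 3], "?")
        if aa == "STOP":              # stop codon ends the protein
            break
        protein.append(aa)
    return protein

rna = transcribe("ATGTTTGGCTAA")
print(rna)              # -> 'AUGUUUGGCUAA'
print(translate(rna))   # -> ['Met', 'Phe', 'Gly']
```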
- There is every reason to believe that DNA codes for such features as nose shape, musical talent, quickness of reflexes, etc. Could we read these from a strand of DNA?
- Recognition is one of the central themes of cellular and subcellular biology. How do molecules recognize each other? It's like knowing if a string is a theorem or not. Is there a decision procedure?
- Self-assembling units - get away with self-reproduction without telling the cell anything about their construction
- non-self-assembling units - need to give instructions as to how to assemble themselves
- Cellular differentiation - how do different cells, sharing exactly the same DNA, perform different roles - such as a kidney cell, a bone marrow cell, and a brain cell?
- Morphogenesis (birth of form) - how does intercellular communication on a local level give rise to large-scale, global structures and organizations - such as the various organs of the body, the shape of the face, the suborgans of the brain, and so on?
- Nature feels quite comfortable in mixing levels which we tend to see as quite distinct.
17. Church, Turing, Tarski, and Others
- Every aspect of thinking can be viewed as a high-level description of a system which, on a low level, is governed by simple, even formal, rules.
- The only way to understand such a complex system as a brain is by chunking it on higher and higher levels, and thereby losing some precision at each step. What emerges at the top level is the "informal system" which obeys so many rules of such complexity that we do not yet have the vocabulary to think about it. And that is what Artificial Intelligence research is hoping to find.
- In many real-life situations, deductive reasoning is inappropriate, not because it would give wrong answers, but because there are too many correct but irrelevant statements which can be made: there are just too many things to take into account simultaneously for reasoning alone to be sufficient.
- A sense of judgement - "What is important here, and what is not?" - is called for. Tied up with this is a sense of simplicity, a sense of beauty. Where do these intuitions come from? How can they emerge from an underlying formal system?
- Ramanujan - famous Indian mathematician - his lack of rigor:
- His memory, and his powers of calculation, were very unusual, but they could not reasonably be called "abnormal". If he had to multiply two large numbers, he multiplied them in the ordinary way; he could do it with unusual rapidity and accuracy, but not more rapidly and accurately than any mathematician who is naturally quick and has the habit of computation… With his memory, his patience, and his power of calculation, he combined a power of generalization, a feeling for form, and a capacity for rapid modification of his hypotheses, that were often really startling, and made him, in his own field, without a rival in his day.
- Most people have a sense of quantity up to about six; he had it - for sheep in a field, words in a sentence, etc. - up to 30.
- Nothing occult takes place during the performances of lightning calculators, but simply that their minds race through intermediate steps with the kind of self-confidence that a natural athlete has in executing a complicated motion quickly and gracefully. They do not reach their answers by some sort of instantaneous flash of enlightenment (though subjectively it may feel that way to some of them), but - like the rest of us - by sequential calculation, which is to say, by FlooPing (or BlooPing) along.
- When one computes something, one's mental activity can be mirrored isomorphically in some FlooP program. The assumption is that there exist software entities in the brain which play the role of various mathematical constructs. There is no assertion of isomorphic activity on the lower level of the brain and computer (i.e. neurons and bits).
- A number theory problem, once stated, is complete in and of itself. A real-world problem, on the other hand, is never sealed off from any part of the world with absolute certainty.
- We can liken real-world thought processes to a tree whose visible part stands sturdily above ground but depends vitally on its invisible roots which extend way below ground, giving it stability and nourishment. In this case the roots symbolize complex processes which take place below the conscious level of the mind - processes whose effects permeate the way we think but of which we are unaware.
- When it comes to real-world understanding, it seems that there is no simple way to skim off the top level, and program it alone. The triggering patterns of symbols are just too complex. There must be several levels through which thoughts may percolate and bubble.
- Imagery and analogical thought processes intrinsically require several layers of substrate and are therefore intrinsically non-skimmable. It is precisely at this point that creativity starts to emerge.
- High-level meaning is an optional feature of a neural network - one which may emerge as a consequence of evolutionary environmental pressures.
- Brain processes do not possess any more mystique - even though they possess more levels of organization - than, say, stomach processes.
- All brain processes are derived from a computable substrate.
- There is no reason to believe that a computer's faultlessly functioning hardware could not support high-level symbolic behavior which would represent such complex states as confusion, forgetting, or appreciation of beauty. It would require that there exist massive subsystems interacting with each other according to a complex "logic". The overt behavior could appear either rational or irrational; but underneath it would be the performance of reliable, logical hardware.
- All intelligences are just variations on a single theme; to create true intelligence, AI workers will just have to keep pushing to ever lower levels, closer and closer to brain mechanisms, if they wish their machines to attain the capabilities we have.
- There is no way of expressing the notion of truth inside TNT. Notice that this makes truth a far more elusive property than theoremhood, for the latter is expressible.
- Our minds contain interpreters which accept two-dimensional patterns and then "pull" from them high-dimensional notions which are so complex that we cannot consciously describe them. The same could be said about how we respond to music.
- Syntactic qualities of form - well-formedness, which can be detected by predictably terminating tests
- Semantic qualities of form - require open-ended tests. The act of pulling out a string's meaning involves, in essence, establishing all the implications of its connections to all other strings, and this leads, to be sure, down an open-ended trail. So "semantic" properties are connected to open-ended searches because, in an important sense, an object's meaning is not localized within the object itself. This is not to say that no understanding of any object's meaning is possible until the end of time, for as time passes, more and more of the meaning unfolds. However, there are always aspects of its meaning which will remain hidden arbitrarily long.
- The music interpreter works by setting up a multidimensional cognitive structure - a mental representation of the piece - which it tries to integrate with pre-existent information by finding links to other multidimensional mental structures which encode previous experiences. As this process takes place, the full meaning gradually unfolds.
- Syntactic properties reside unambiguously inside the object under consideration, whereas semantic properties depend on its relations with a potentially infinite class of other objects, and therefore are not completely localizable. There is nothing cryptic or hidden, in principle, in syntactic properties, whereas hiddenness is of the essence in semantic properties.
- When you hear the Epimenides sentence (This sentence is false), your brain sets up some coding of the sentence, it tries to classify it as true or false, and this act of classification physically disrupts the coding of the sentence…
18. Artificial Intelligence: Retrospects
- Tesler's theorem - "AI is whatever hasn't been done yet."
- Translation involves having a mental model of the world being discussed, and manipulating symbols in that model. A program which makes no use of a model of the world as it reads the passage will soon get hopelessly bogged down in ambiguities and multiple meanings.
- Skillful game players choose their moves according to mental processes which they do not fully understand - they use their intuitions. Now there is no known way that anyone can bring to light all of his own intuitions; the best one can do via introspection is to use "feeling" or "meta-intuition" - an intuition about one's intuitions - as a guide, and try to describe what one thinks one's intuitions are about.
- "Wash me!" on a car - a game based on reading the "me" at the wrong level. How far back do we ordinarily trace the I in a sentence? We look for a sentient being to attach the authorship to. But what is a sentient being? Something onto which we can map ourselves comfortably.
- Problem reduction - whenever one has a long-range goal, there are usually subgoals whose attainment will aid in the attainment of the main goal. If one breaks up a given problem into a series of new subproblems, then breaks those in turn into subsubproblems, and so on, in a recursive fashion, one eventually comes down to very modest goals which can presumably be attained in a couple of steps. Or so, at least, it would seem...
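A minimal sketch of the problem-reduction idea in code - the goal names and the decomposition table are invented for illustration, not from the book:

```python
# Problem reduction: a goal is either primitive (attainable directly)
# or is decomposed into subgoals, which are reduced recursively.

DECOMPOSITIONS = {
    "get dinner": ["buy ingredients", "cook"],
    "buy ingredients": ["go to store", "pay"],
}

def solve(goal):
    """Return the flat list of primitive steps that attains `goal`."""
    subgoals = DECOMPOSITIONS.get(goal)
    if subgoals is None:            # a modest goal: attainable directly
        return [goal]
    steps = []
    for sub in subgoals:            # recursive reduction into subproblems
        steps.extend(solve(sub))
    return steps

print(solve("get dinner"))         # primitive steps, in order
```

Note the caveat in the text applies to the sketch too: nothing here guarantees the decomposition bottoms out in steps that are actually attainable.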
- The look ahead technique is not based on planning: it simply has no goals and explores a huge number of pointless alternatives. Having a goal enables you to develop a strategy for the achievement of that goal, and this is a completely different philosophy from looking ahead mechanically.
The dog and bone problem - where a dog can see a bone through a wire fence, but has to go out of his way to a gate to get into the other garden.
- In some sense, all problems are abstract versions of the dog and bone problem.
Many problems are not in physical space but in some sort of conceptual space. When you realize that direct motion towards the goal in that space runs you into some sort of abstract fence, you can do one of two things.
- 1. Try moving away from the goal in some sort of random way, hoping that you may come upon a hidden gate through which you can pass and then reach your bone, or
- 2. Try to find a new space in which you can represent the problem, and in which there is no abstract fence separating you from your goal - then you can proceed straight towards the goal in this new space.
- The first method may seem like the lazy way to go, and the second method may seem like a difficult and complicated way to go.
- And yet, solutions which involve restructuring the problem space more often than not come as sudden flashes of insight rather than as products of a series of slow, deliberate thought processes. Probably these intuitive flashes come from the extreme core of intelligence.
- What AI sorely lacks is programs which can "step back" and take a look at what is going on, and with this perspective, reorient themselves to the task at hand.
Stepping back:
- An intelligent program would presumably be one which is versatile enough to solve problems of many different sorts. It would learn to do each different one and would accumulate experience in doing so. It would be able to work within a set of rules and yet also, at appropriate moments, to step back and make a judgment about whether working within that set of rules is likely to be profitable in terms of some overall set of goals which it has. It would be able to choose to stop working within a given framework, if need be, and to create a new framework of rules within which to work for a while.
- Having an overview is tantamount to choosing a representation within which to work; and working within the rules of the system is tantamount to trying the technique of problem reduction within that selected framework.
- Hardy's comment on Ramanujan's style - particularly his willingness to modify his own hypotheses - illustrates this interplay between the M-mode and the I-mode in creative thought.
Representation of knowledge:
- Representation of knowledge is at the crux of AI.
- Backwards chaining begins with the goal and works its way backwards, presumably towards things that may already be known.
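A hedged sketch of backward chaining - the rules and facts are made up, and real systems would handle variables and unification, which this toy omits:

```python
# Backward chaining: start from the goal, find a rule whose conclusion
# matches it, and recurse on that rule's premises until everything
# bottoms out in facts that are already known.

RULES = {                          # conclusion -> premises
    "mortal(socrates)": ["man(socrates)"],
    "man(socrates)": ["human(socrates)"],
}
FACTS = {"human(socrates)"}

def prove(goal):
    if goal in FACTS:                          # already known: chain ends
        return True
    premises = RULES.get(goal)
    if premises is None:                       # no rule concludes this goal
        return False
    return all(prove(p) for p in premises)     # work backwards from the goal

print(prove("mortal(socrates)"))
```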
- Analogical awareness - comparing situations and spotting similarities.
- You get bored with something not when you have exhausted its repertoire of behavior, but when you have mapped out the limits of the space that contains its behavior. The behavior space of a person is just about complex enough that it can continually surprise other people.
- The external form of a sentence - its composition in terms of elementary signs - does not divide up so neatly into syntactic and semantic aspects.
- How many levels should a system have? How much and what kind of "intelligence" should be placed on which level? These are some of the hardest problems facing AI today. Since we know so little about natural intelligence, it is hard for us to figure out which level of an artificially intelligent system should carry out what part of a task.
19. Artificial Intelligence: Prospects
- I believe that "almost" situations and unconsciously manufactured subjunctives represent some of the richest potential sources of insight into how human beings organize and categorize their perceptions of the world.
- The "slippability" of a feature of some event or circumstance depends on a set of nested contexts in which the event or circumstance is perceived to occur.
We build up our mental representation of a situation layer by layer:
- The lowest layer establishes the deepest aspect of the context - sometimes being so low that it cannot vary at all, like the 3D nature of our world, which is so ingrained that most of us never would imagine letting it slip mentally. It is a constant constant.
- Then there are layers that establish temporarily, though not permanently, fixed aspects of situations, which could be called background assumptions - in the back of your mind, you know they can vary, but most of the time you unquestioningly accept them as unchanging aspects. These are still "constants", like the rules of a football game.
- Then there are parameters, which are more variable but temporarily held constant, like the weather, the opposing team, etc. There could be, and probably are, several layers of parameters.
- Finally we reach the shakiest aspects of your mental representation of the situation - the variables, like whether a player stepped out of bounds or not, which are mentally loose and which you don't mind letting slip away from their real values for a short moment.
Frames:
- The theory of representing knowledge in frames relies on the idea that the world consists of quasi-closed subsystems, each of which can serve as a context for others without being too disrupted, or creating too much disruption in the process.
- The existence of default values for slots allows the recursive process of filling slots to come to an end. You say, I will fill in the slots myself as far as three layers down; beyond that I will take the default options.
- The nested structure of a frame gives you a way of zooming in and looking at small details from as close up as you wish. It is like looking through a telescope with lenses of different power; each lens has its own uses. It is important that one can make use of all the different scales; often detail is irrelevant and even distracting
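A small sketch of the default-slot idea - the frames and slot names here are invented, and the "depth" argument stands in for the choice of how many layers down to fill before accepting the defaults:

```python
# Frames with default slot values: filling proceeds recursively only to a
# chosen depth; beyond that, the default options stand in, so the
# recursive process of filling slots comes to an end.

FRAMES = {
    "party": {"food": "party-food", "mood": "festive"},
    "party-food": {"cake": "cake-frame", "drinks": "punch"},
    "cake-frame": {"flavor": "chocolate", "candles": "lit"},
}

def expand(frame, depth):
    """Fill slots recursively down to `depth`; deeper slots keep defaults."""
    result = {}
    for slot, value in FRAMES[frame].items():
        if value in FRAMES and depth > 0:
            result[slot] = expand(value, depth - 1)   # zoom in on this slot
        else:
            result[slot] = value                      # take the default option
    return result

print(expand("party", 1))   # one layer of detail; deeper slots left as defaults
```

This also mirrors the telescope image above: calling `expand` with a larger `depth` is switching to a higher-power lens.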
Processing (of Bongard Problems):
- Raw data are preprocessed. Some salient features are detected. The names of these features constitute a mini-vocabulary for the problem; they are drawn from a general salient-feature vocabulary.
- In a second stage of preprocessing, some knowledge about elementary shapes is used; and, if any are found, their names are also made available. This is roughly the point at which the conscious and the unconscious meet, in humans.
- Now that the picture is "understood" to some extent in terms of familiar concepts, some looking around is done. Tentative descriptions are made for one or a few of the twelve boxes. Each of these descriptions sees the box through a "filter".
- Clearly a lot of information has been thrown away and even more could be thrown away. A priori, it is very hard to know what it would be smart to throw away and what to keep.
- Sameness detectors and description schemas or templates.
- The concept network, in which all the known nouns, adjectives, etc are linked in ways which indicate their interrelations. It is just brimming with info about relations between terms such as what is the opposite of what, what is similar to what, what often occurs with what, and so on.
- A sameness detector spots the recurrence of the concept three in all the slots dealing with o's, and this is strong reason to undertake a second template-restructuring operation. The first was suggested by the concept net, the second by SAM.
- Of course, an enormous amount of information has been thrown away concerning the sizes, positions, and orientations of these triangles, and many other things as well. But that is the whole point of making descriptions instead of just using the raw data! It is like funneling.
- There is a constant back-and-forth interaction of individual descriptions, templates, the sameness-detector SAM, and the concept network.
- Everything in the net can be talked about - both nodes and links. Nothing in the net is on a higher level than anything else
- One of the main functions of the concept network is to allow early wrong ideas to be modified slightly to slip into variations which may be correct.
- Related to this notion of slipping between closely related terms is the notion of seeing a given object as a variation on another object. One has to be able to bend concepts when it is appropriate. Nothing should be absolutely rigid. On the other hand, things shouldn't be so wishy-washy that nothing has any meaning at all, either. The trick is to know when and how to slip one concept into another.
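A toy version of a concept network, to make the slippage idea concrete - the nodes and links are invented examples, and a real network would of course be vastly larger and richer:

```python
# A tiny concept network: concepts linked by labeled relations
# (opposite-of, similar-to, ...). "Slipping" a concept means following
# one of its links to a near neighbor.

LINKS = {
    ("high", "opposite-of"): "low",
    ("high", "similar-to"): "tall",
    ("circle", "similar-to"): "ellipse",
}

def slip(concept, relation):
    """Return the concept reached by following `relation`, if any."""
    return LINKS.get((concept, relation))

print(slip("circle", "similar-to"))   # a slight variation on "circle"
print(slip("high", "opposite-of"))
```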
- The program must be sufficiently flexible that it can go back and forth between such different representations for a given part of a drawing. It is wise to store old representations, rather than forgetting them and perhaps having to reconstruct them, for there is no guarantee that a newer representation is better than an old one. Thus, along with each old representation should be stored some of the reasons for liking it and disliking it.
- The program should construct descriptions of descriptions (or metadescriptions). Perhaps on this second level some common feature will emerge.
- The program should not be doomed if, malaphorically speaking, it "barks up the wrong alley" for a while
- Focusing involves making a description whose focus is some part of the drawing in the box, to the exclusion of everything else.
- Filtering involves making a description which concentrates on some particular way of viewing the content of the box, and deliberately ignores all other aspects.
- Focusing has to do with objects (nouns) and filtering with concepts (adjectives)
- The Bongard-problem world is a place where science is done, where the purpose is to discern patterns in the world. As patterns are sought, templates are made, unmade, and remade, slots are shifted from one level of generality to another, filtering and focusing are done. There are discoveries on all levels of complexity.
- Real science does not divide up into normal periods and conceptual revolutions, rather paradigm shifts pervade - there are just bigger and smaller ones, and paradigm shifts on different levels.
- The intuition to know when it makes sense to blur distinctions, to try redescriptions, to backtrack, to shift levels, and so on, is something which probably comes only with much experience in thought in general.
- Our notion of simplicity may be universal.
- The skill of solving Bongard-problems lies very close to the core of pure intelligence.
- Some of the problems of visual patterns which we human beings seem to have completely flattened into our unconscious are quite amazing - recognition of faces, of hiking trails in forests and mountains, and reading text without hesitation in hundreds, thousands of different typefaces.
Actors:
- Actors with the ability to exchange messages become somewhat autonomous agents. Each actor can have its idiosyncratic way of interpreting any given message - thus a message's meaning will depend on the actor it is intercepted by.
- Let us call a frame with the capability of generating and interpreting complex messages a "symbol".
- There must be a whole range of messages - with and without destinations, a central receiving dock, different classes of urgency, SASEs, extremely long messages that can be sent slowly, etc - like a postal or telephone system
- Enzymes in a cell are like agents.
- Have many different species of triggerable subroutines just lying around waiting to be triggered.
- Sameness detectors could be implemented as enzyme-like subprograms.
- If new SAMs could be synthesized, that would be like the seepage of pattern detection into lower levels of our minds.
- Fission is the gradual divergence of a new symbol from its parent symbol. It's more or less inevitable as it becomes autonomous and has its own interactions with the outside world.
- Fusion is when two or more originally unrelated symbols participate in a joint activation, passing messages so tightly back and forth that they get bound together and the combination can thereafter be treated as if it were a single symbol. But when do two concepts really become one? Is there some precise instant when a fusion takes place?
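A minimal sketch of actors as autonomous agents - the actor names and their responses are invented; the point is only that the same message means different things to different receivers:

```python
# Actors with mailboxes: each actor has its own idiosyncratic handlers,
# so a message's meaning depends on the actor that receives it.

class Actor:
    def __init__(self, name, handlers):
        self.name = name
        self.handlers = handlers        # this actor's private interpretations
        self.mailbox = []

    def send(self, message):
        self.mailbox.append(message)    # messages queue up until processed

    def run(self):
        results = []
        while self.mailbox:
            msg = self.mailbox.pop(0)
            handler = self.handlers.get(msg, lambda: f"{self.name} ignores {msg}")
            results.append(handler())
        return results

optimist = Actor("optimist", {"news": lambda: "great!"})
pessimist = Actor("pessimist", {"news": lambda: "terrible."})
for a in (optimist, pessimist):
    a.send("news")                      # the same message to both actors
print(optimist.run(), pessimist.run()) # different meanings on receipt
```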
Abstractions, Skeletons, Analogies:
- The whole process can be seen as a succession of mappings of ideas onto each other at varying levels of abstraction. This is conceptual mapping and the abstract structures which connect up two different ideas are conceptual skeletons.
- Where do abstract views come from? How do you make abstract views of specific notions?
- A view that has been abstracted from a concept along some dimension is a conceptual skeleton. It is like a set of constant features (not parameters or variables) - features which should not be slipped in a subjunctive instant replay or mapping-operation. Having no parameters or variables of its own to vary, it can be the invariant core of several different ideas.
- They must exist on different levels of abstraction and along different conceptual dimensions.
- One of the major characteristics of each idiosyncratic style of thought is how new experiences get classified and stuffed into memory, for that defines the "handles" by which they will later be retrievable. And for events, objects, ideas, and so on there is a wide variety of handles. All of them can act as ports of access.
- There are partitions between these aspects of one symbol, partitions that prevent my thought from spilling over sloppily in the manner of free associations. My mental partitions are important because they contain and channel the flow of my thoughts.
- When two ideas are seen to share conceptual skeletons on some level of abstraction, different things can happen. Usually the first stage is that you zoom in on both ideas, and, using the higher-level match as a guide, you try to identify corresponding subideas. Sometimes the match can be extended recursively downwards several levels, revealing a profound isomorphism. Sometimes it stops earlier, revealing an analogy or similarity. And then there are times when the high-level similarity is so compelling that, even if there is no apparent lower-level continuation of the map, you just go ahead and make one: this is the forced match.
- I have presented a number of related ideas connected with the creation, manipulation, and comparison of symbols. Most of them have to do with slippage in some fashion, the idea being that concepts are composed of some tight and some loose elements, coming from different levels of nested contexts (frames). The loose ones can be dislodged and replaced rather easily, which, depending on the circumstances, can create a "subjunctive instant replay", a forced match, or an analogy. A fusion of two symbols may result from a process in which parts of each symbol are dislodged and other parts remain.
Creativity and Randomness:
- When programs cease to be transparent to their creators, then the approach to creativity has begun.
- Randomness is an intrinsic feature of thought, not something which has to be artificially inseminated whether through dice, decaying nuclei, random number tables, or what-have-you. It is an insult to human creativity to imply that it relies on such arbitrary sources.
- Perhaps what differentiates highly creative ideas from ordinary ones is some combined sense of beauty, simplicity, and harmony. Superficially similar ideas are often not deeply related, and deeply related ideas are often superficially disparate.
- The elusive sense for patterns which we humans inherit from our genes involves all the mechanisms of representation of knowledge, including nested contexts, conceptual skeletons and conceptual mapping, slippability, descriptions and meta-descriptions and their interactions, fission and fusion of symbols, multiple representations (along different dimensions and different levels of abstraction), default expectations, and more.
20. Strange Loops, Or Tangled Hierarchies
- Reasoning involves an infinite regress.
- Machines may someday have wills despite the fact that no magic program spontaneously appears in memory from out of nowhere. Instead it will be by reason of organization and structure on many levels of hardware and software;
- When humans think, we certainly do change our own mental rules, and we change the rules that change the rules, and on and on - but these are, so to speak, software rules. However, the rules at bottom do not change. Neurons run in the same simple way the whole time. You can't "think" your neurons into running some nonneural way, although you can make your mind change its style or subject of thought. You have access to your thoughts, but not to your neurons. Software rules on various levels can change; hardware rules cannot - in fact, the software's flexibility is due to their rigidity.
A game of chess with multiple boards, to move the pieces, the rules, the metarules, the metametarules, etc:
- We begin by collapsing the whole array of boards into a single board, so when you move a piece you also change the rules. The distinction between game, rules, metarules, metametarules, etc has been lost. What was once a nice clean hierarchical setup has become a Strange Loop or tangled hierarchy. There are still different levels but the distinction between lower and higher has been wiped out.
- But there are still plenty of rules that are inviolate - you still take turns and have conventions for how to interpret the various levels. There is, in fact, an Inviolate level, the I-level, on which the interpretation conventions reside, and a Tangled level, the T-level, on which the tangled hierarchy resides. The I-level governs the T-level, but the T-level cannot affect the I-level.
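A tiny sketch of the self-modifying game - the moves and rules here are invented, not from the book. The `rules` table is the Tangled level, which moves can rewrite; the `play` function itself is the Inviolate level, which no move can touch:

```python
# The collapsed chess game: a move can change the rules of the game,
# but the interpretation conventions (the play() function) are inviolate.

rules = {"step": 1}                # T-level: rules that a move may change

def play(move):                    # I-level: fixed conventions for moves
    """One turn. A move can change the rules; it cannot change play()."""
    if move == "advance":
        return rules["step"]       # an ordinary move, under current rules
    if move == "double-step-rule":
        rules["step"] *= 2         # a move that rewrites a rule
        return None

print(play("advance"))             # move under the original rules
play("double-step-rule")           # this move changes the rules...
print(play("advance"))             # ...so the same move now means more
```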
- Then of course we could tangle the I-level and the T-level, but then a new I-level would emerge. Each time you think you have reached the end, there is some new variation on the theme of jumping out of the system, which requires a kind of creativity to spot.
- In any system there is always some "protected" level which is unassailable by the rules on other levels, no matter how tangled their interaction may be among themselves.
In our thoughts:
- Symbols activate other symbols, and all interact heterarchically. Furthermore, the symbols may cause each other to change internally, in the fashion of programs acting on other programs. Because of the tangled hierarchy of symbols, the illusion is created that there is no inviolate level. One thinks there is no such level because that level is shielded from our view.
- Only the symbol hierarchy is a tangled hierarchy. The neural tangle is just a "simple" tangle, which doesn't involve violations of presumed level distinctions.
- We feel self-programmed. In fact we couldn't feel any other way, for we are shielded from the lower levels, the neural tangle. Our thoughts seem to run about in their own space, creating new thoughts and modifying old ones, and we never notice any neurons helping us out!
- Is it possible to lay down laws as to what evidence is and how to make sense out of situations? Probably not, for any rigid rules would undoubtedly have exceptions, and nonrigid rules are not rules.
- Deciding what is valid or what is true is an art; and it relies as deeply on a sense of beauty and simplicity as it does on rock-solid principles of logic or reasoning or anything else which can be objectively formalized. Truth is too elusive for any human or any collection of humans ever to attain fully, and will be for AI too, even if it surpasses us.
Introspection:
- The total picture of who I am is integrated in some enormously complex way inside the entire mental structure and contains in each one of us a large number of unresolved, possibly unresolvable inconsistencies. These undoubtedly provide much of the dynamic tension which is so much a part of being human.
- How can you tell if your own logic is peculiar or not, given that you have only your own logic to judge itself? Only versions of formal number theory which assert their own consistency are inconsistent!
- Just as we cannot see our faces with our own eyes, is it not reasonable to expect that we cannot mirror our complete mental structures in the symbols which carry them out?
- Once the ability to represent your own structure has reached a certain critical point, that is the kiss of death: it guarantees that you can never represent your self totally.
- Step by step, inexorably, Western science has come towards investigation of the human mind - which is to say, of the observer.
- Art in this century has abandoned representation (the beginnings of abstract art), then surrealism came along. In conceptual art, it is not that there is no code by which ideas are conveyed to the viewer. Actually, the code is a much more complex thing, involving statements about the absence of codes and so forth - that is, it is part code, part metacode, and so on.
- "Ism" is the spirit of Zen in art. And just as the central problem of Zen is to unmask the self, the central problem of art in this century seems to be to figure out what art is.
My belief is that the explanations of "emergent" phenomena in our brains - ideas, hopes, images, analogies, and finally consciousness and free will - are based on a kind of Strange Loop, an interaction between levels in which the top level reaches back down towards the bottom level and influences it, while at the same time being itself determined by the bottom level. In other words, a self-reinforcing "resonance" between different levels - quite like the Henkin sentence which, by merely asserting its own provability, actually becomes provable. The self comes into being at the moment it has the power to reflect itself.
- From this balance between self-knowledge and self-ignorance comes the feeling of free will.
- A writer isn't quite sure how some images fit together in his mind, and he experiments around, expressing things first one way and then another, and finally settles on some version. But does he know where it all came from? Only in a vague sense. Much of the source, like an iceberg, is deep underwater, unseen - and he knows that.
- The important idea is that this "vortex" of self is responsible for the tangledness, for the Gödelian-ness of the mental processes
- The Musical Offering is a fugue of fugues, a Tangled Hierarchy like those of Escher and Gödel, an intellectual construction which reminds me, in ways I cannot express, of the beautiful many-voiced fugue of the human mind. And that is why in my book the three strands of Gödel, Escher, and Bach are woven into an Eternal Golden Braid.