On Selection and Intelligence

Intuitively, it makes sense that intelligence would persist in Nature, because for all animals, brain power is useful, and therefore, at a minimum, it shouldn’t hurt a species. That said, it’s not clear why Homo sapiens in particular has grown more intelligent over time. As a general matter, selection should explain it. For example, it’s generally accepted that some kind of apocalyptic event led to the demise of the largest dinosaurs, presumably an asteroid hitting the Earth. This killed off the largest animals above ground, presumably because they needed more food to survive than the small ones, and a catastrophic event could of course limit the total amount of food available to all animals. Smaller animals could survive on less food, and then eventually grow larger once conditions improved, which seems to be roughly what happened. So we can plainly see the mechanics involved: the total food supply literally shrinks, which kills off anything that needs too much food, and whether or not this is exactly what happened, it makes sense, mechanically. Now why on Earth would animals get more intelligent as a function of time?

I think the answer is the same: disaster. When disaster strikes, the ability to predict the behavior of your environment could mean the difference between life and death. Because reality is governed by physics and mathematics, disaster could again be responsible for the selection of intelligence. So it’s indirect in this view: disaster forces selection for mechanical intuition and mathematics, and both of those require intelligence. Just imagine starting a fire near a pack of wolves, near a body of water. Now set that same fire near a group of monkeys. Do the same with a group of people. Only the stupidest people won’t realize that the water probably won’t catch fire, whereas the same is not true of the wolves, and might not be true of the monkeys. That’s all you need, disaster, to have literally all animals replaced by humans, and further, that’s all you need to get rid of stupid people as well, which could explain why humans have generally become more intelligent as a function of time.

Keep in mind that a UTM is equivalent in power to processing language, suggesting that once you speak a language, you have all the intelligence you need, as a technical matter, to do all of computable mathematics. This suggests that in some sense humanity peaked at the development of language, and I think this could explain the proliferation of Phoenician-like scripts, all the way out to Mongolia, for the simple reason that the Phoenicians might have been the first people to develop a written language, of course taking into account the possibility of a prior group that was subject to disaster.

On the Complexity of a Graph

Introduction

I’ve been thinking about representational complexity for a long time, and in fact one of the first papers I wrote was on the intersection number of an infinite graph. Despite literally 20 years of thinking on the subject, for some reason, it wasn’t until today that I was able to think clearly about the Kolmogorov Complexity of a graph. Sure, it’s a complex topic, but the answer is pretty trivial. Specifically, every graph can be represented as an N x N adjacency matrix, where N is the number of vertices. However, this is obviously too large a structure to be the K-complexity, since it contains duplicate entries (I’m assuming an undirected graph with no loops). That is, entry (i,j) in the matrix is the same as entry (j,i), and as such contributes no new information about the graph. Therefore, the K-complexity of a graph must be less than N^2.
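For concreteness, the following Octave code builds the adjacency matrix of a complete graph on three vertices (the example graph is my choice, for illustration), and checks the symmetry that makes roughly half of the entries redundant:

M = [0 1 1;
     1 0 1;
     1 1 0]; % adjacency matrix of the complete graph on three vertices
assert(isequal(M, M')); % symmetric: entry (i,j) always equals entry (j,i)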

We can improve our measure of K-complexity by noting that any graph can be described by simply listing its edges. For example, a complete graph on three vertices can be represented by the set of vectors E = \{(1,2), (2,3), (3,1)\}. That is, each vector represents an edge, and the entries of the vector tell you the labels of the vertices connected by the edge in question. In this case E connects vertex 1 to 2, 2 to 3, and 3 to 1, producing a circuit on three vertices, i.e., a complete graph on three vertices. As a consequence, the K-complexity of a graph cannot exceed the complexity of its edge set. Each label in a graph is at most the order of the graph (i.e., the number of vertices N), and as a result, we can encode each label in the edge set with at most \log(N) bits. Each edge consists of two labels, and so the K-complexity of a graph must be less than or equal to 2|E|\log(N) + C.
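As a minimal sketch of this bound, the following Octave code computes the encoding length 2|E|\log(N) for the edge set above (the constant C, i.e., the length of the decoding program, is omitted):

N = 3; % order of the graph
E = {[1 2], [2 3], [3 1]}; % edge set of the complete graph on three vertices
bits_per_label = ceil(log2(N)); % each vertex label fits in log(N) bits
total_bits = numel(E) * 2 * bits_per_label % 2 labels per edge: 12 bits here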

Structural Complexity

Note that 2|E|\log(N) is again an upper bound on the K-complexity of a graph; it’s just much better than N^2, since it at least takes into account the number of edges, and uses a more efficient representation of the graph. However, particular graphs can definitely be expressed in fewer bits. Specifically, the empty graph, the complete graph, and a Hamiltonian circuit, each of a given order N, can be expressed in \log(N) + C bits, where C is a constant that is the length of the program that constructs the graph, which we can think of as simply generating the list of edges that defines the graph. For example, the following Octave code will generate a Hamiltonian circuit of any order N:

N = 5; % the order of the circuit; treat N as the input (any N >= 3 will do)
E = cell(1, N); % the edge list, stored as a cell array
E{1} = [1, N]; % the closing edge, from vertex 1 back to vertex N
for i = 1 : N - 1
  E{i + 1} = [i, i + 1]; % the path edges (1,2), (2,3), ..., (N-1,N)
endfor

As a result, we can produce a Hamiltonian circuit of any order, using at most \log(N) + C bits, where C is the length of the program above in bits. Analogous code will produce a complete graph, as shown below, whereas an empty graph can be produced by simply specifying N (i.e., there are no edges), which again requires exactly \log(N) + C bits, but using different code. Note that even in these cases, it’s still possible that an even shorter, presumably non-universal program produces a given graph as its output when fed as the input to a UTM. That is, e.g., U(x) = M(G), where x is a binary string, U is some UTM, and M(G) is the matrix that defines the graph G. As a theoretical matter, the length of x could be even smaller than \log(N), where again N is the order of G. This would be true for empty graphs, complete graphs, and Hamiltonian circuits of order N, where the number N itself is highly compressible.
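Here is one such analogous construction in Octave, a minimal sketch that generates the edge set of a complete graph of order N:

N = 5; % example order; treat N as the input
E = {}; % the edge list
for i = 1 : N - 1
  for j = i + 1 : N
    E{end + 1} = [i, j]; % one edge for every pair i < j
  endfor
endfor
% numel(E) is now N*(N-1)/2, the maximum possible number of edges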

Kolmogorov-Random Graphs

It turns out it’s pretty easy to prove that Kolmogorov-Random graphs exist, in the sense that we can produce graphs using Kolmogorov-Random strings. There’s obviously scholarship on point, but I think the ideas I’ve set out in this note are a lot easier to think about, and unify perfectly with the complexity of strings. Specifically, recall that we can represent any graph using its edge set. If we modify the method used above, we can show that Kolmogorov-Random strings will produce graphs when fed as inputs to a UTM, which implies that those graphs are literally Kolmogorov-Random, in the same sense as a binary string.

Returning to the example of a complete graph on three vertices above, let’s represent that graph using three sets: the set of vertices adjacent to v_1, the set of vertices adjacent to v_2, and the set of vertices adjacent to v_3. Note that as we move forward in the labels, e.g., from vertex 1 to vertex 2, we gain information about the graph. Specifically, there’s no reason to include v_1 in the set of vertices adjacent to v_2, since that information is already contained in the set of vertices adjacent to v_1. Because it’s a complete graph on three vertices, we would therefore have the following: \{v_2, v_3\}, \{v_3\}, \{\}. As a result, the maximum number of elements in the set for v_1 is N-1, for v_2 it’s N-2, and so on. We can therefore represent the graph as a “matrix” where the number of columns begins at N-1 for v_1, and decreases by 1 for each row. Note that as a result, the N-th row has no columns, implying that the “matrix” has N-1 rows. We can treat every entry in this “matrix” as a binary number indicating the presence or absence of a given edge. For example, if entry (1,3) is 1, then edge (1,3) is included in the graph. This is just like an ordinary adjacency matrix, except it contains no duplicate entries, and therefore the number of columns is not constant. Note that it therefore corresponds to a binary string of length (N - 1) + (N - 2) + \ldots + 1 = \frac{N(N - 1)}{2} = {N \choose 2}, which is also the maximum number of edges.
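The following Octave code is a minimal sketch of this correspondence: it packs the upper triangle of an adjacency matrix M into a binary string s of length {N \choose 2}, and then unpacks s to recover M (the example graph is my choice, for illustration):

N = 4; % example order
M = [0 1 0 1; 1 0 1 0; 0 1 0 1; 1 0 1 0]; % adjacency matrix of a circuit on 4 vertices
s = [];
for i = 1 : N - 1
  s = [s, M(i, i+1 : N)]; % row i contributes N - i bits
endfor
% s now has N*(N-1)/2 = 6 entries; reversing the process recovers M:
R = zeros(N);
k = 1;
for i = 1 : N - 1
  R(i, i+1 : N) = s(k : k + N - i - 1);
  k = k + N - i;
endfor
R = R + R'; % restore the duplicate entries below the diagonal
assert(isequal(R, M));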

Let s be a Kolmogorov-Random string of length {n \choose 2}, where n is the order of the graph in question. All we have to do is restructure s into the shape of the “matrix” described above, which will produce a graph G. Because s is Kolmogorov-Random, there cannot be another binary string shorter than s (beyond an additive constant) that will generate G on a UTM, by which we mean generating the edges of G. Assume such a string y exists. It follows that E(G) = U(y), which in turn allows us to construct a “matrix” of the type above, which will in turn produce s. Because s is Kolmogorov-Random, there cannot be a string shorter than s that generates s, yet y is exactly such a string, which produces a contradiction, completing the proof.

Note that such a graph G cannot be a complete graph, since a complete graph has trivial complexity, as we’ve seen above. It is therefore interesting that a Kolmogorov-Random graph has a Kolmogorov Complexity of exactly the number of edges in a complete graph of the same order. Because not all numbers are of the form {n \choose 2}, we cannot say that the Kolmogorov Complexity of a Kolmogorov-Random graph is always the number of edges in a complete graph of the same order, though it might nonetheless be true.

Computability and Infinite Sets of Graphs

Let H_N denote the Hamiltonian circuit of order N, and let \Gamma = \{H_1, H_2, \ldots \}. Despite the fact that the set is infinite, because each such graph has a simple structure, we can define a computable test that will allow us to determine whether or not a graph G is in that set. To sketch the algorithmic test, we would begin by testing the degree of each vertex in G, and if any vertex does not have degree 2, the algorithm terminates and returns “false”. If the degree of each vertex is in fact 2, then we traverse the edges of the graph starting from any vertex, and if we return to the starting vertex without having visited every vertex of G (i.e., the graph is a disjoint union of smaller circuits), we again return “false”; otherwise we return “true”. As a result, there is a computable test that allows us to state whether or not a given graph G is in \Gamma. Note that as a consequence, we can also generate \Gamma using brute force, by producing all graphs of orders N = 1, 2, \ldots and testing which of those belong in \Gamma. Similarly, as noted above, we can algorithmically construct a Hamiltonian circuit of every order, and can therefore test whether or not a given graph is in the set of graphs generated by that process. Note that the constructive algorithm above is very different from the algorithmic test we just articulated in this section. Therefore, we have an interesting equivalence, in that the existence of an algorithmic test implies a constructive algorithm (via brute force), and a constructive algorithm implies an algorithmic test (again via brute force, e.g., in the case where there are multiple graphs of a given order).
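Here’s a minimal Octave sketch of that test, assuming the graph is given as an N x N adjacency matrix (the function name and representation are my assumptions, for illustration):

function result = is_hamiltonian_circuit(M)
  N = size(M, 1);
  if any(sum(M, 2) != 2) % every vertex must have degree exactly 2
    result = false;
    return;
  endif
  % Traverse from vertex 1; with all degrees equal to 2 the walk must
  % eventually return to the start, so the real test is whether it
  % visits all N vertices before doing so.
  prev = 0; curr = 1; visited = 1;
  while true
    nbrs = find(M(curr, :)); % the two neighbors of curr
    next = nbrs(nbrs != prev)(1); % step to the neighbor we didn't come from
    if next == 1
      break; % back at the starting vertex
    endif
    prev = curr; curr = next; visited = visited + 1;
  endwhile
  result = (visited == N);
endfunction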

At the same time, because the set of graphs is countably infinite, and the set of algorithms is countably infinite, the number of infinite subsets of the set of all graphs is uncountable and therefore larger than the set of all algorithms. Therefore, it must be the case that there are infinite sets of graphs that do not have an algorithmic test, and therefore do not have a constructive algorithm. One simple example follows from the existence of non-computable infinite strings. Again, the set of all infinite binary strings is uncountable, whereas the set of all algorithms is countable. As a consequence, there must be infinite strings that cannot be generated by a finite program running on a UTM.

Let \sigma be such a non-computable infinite string. Define \bar{\Gamma} as the subset of \Gamma such that H_i is included in \bar{\Gamma} if \sigma(i) = 1. That is, where \sigma(i) = 1, we include the corresponding element of \Gamma. As noted above, there is a simple algorithmic test to determine whether or not each element of \bar{\Gamma} is a Hamiltonian circuit, but this is distinct from an algorithmic test that determines whether or not a given graph G is in \bar{\Gamma}. That is, not all Hamiltonian circuits are in \bar{\Gamma}. Now assume that such an algorithmic test T exists. It follows that we can generate every Hamiltonian circuit starting with N = 1, and T will tell us whether or not the circuit of that particular order is included in \bar{\Gamma}. If H_i is included, set binary string entry \zeta(i) = 1, and otherwise set \zeta(i) = 0. It follows that \zeta = \sigma. However, we generated \zeta on a UTM, which is not possible, since \sigma is by definition non-computable, which leads to a contradiction. This implies that T does not exist. This demonstrates that even if a set is computable, it will have non-computable subsets, just like the natural numbers, and all countably infinite sets generally.

Properties without Computable Tests

In the previous section I noted that Hamiltonian circuits are simple structures, and therefore, we can come up with simple algorithms that test whether or not a graph is a Hamiltonian circuit. In contrast, testing whether or not a given graph contains a Hamiltonian circuit as a subgraph is believed to be intractable. As such, there are simple structures that correspond to simple algorithms, though finding those same simple structures within more complex graphs becomes intractable. This raises the question of whether or not there are properties that cannot be tested for algorithmically at all. The answer is yes.

Order the set of all finite graphs G_1, G_2, \ldots, and let \gamma be an infinite binary string. If \gamma(i) = 1, then we can interpret this as G_i having property \gamma given the ordering, or equivalently, view \gamma as defining a set of graphs. There are again more binary strings than there are algorithms, and therefore, in this view, more properties than there are algorithms. The question is whether there are more interesting and / or useful properties than there are algorithms, but the bottom line is, the number of properties is uncountable, and therefore the density of testable properties is zero. You might be tempted to say that all such properties are uninteresting, or not useful, but unfortunately, whether or not a program will halt is precisely one of these properties, which is both useful and interesting, suggesting at least the possibility of other such useful and interesting properties.

The Distribution of Graph Properties

Ramsey Theory is one of the most astonishing fields of mathematics, in particular because, like all combinatorics, it has intrinsic physical meaning, in that combinatorics provides information about the behavior of reality itself. What’s astonishing about Ramsey Theory in particular is that it’s fair to say it shows that structure increases as a function of scale, in that as objects get larger, they must have certain properties. What I realized today is that the number of properties that they can have also grows as a function of scale. For intuition, consider all graphs on N = 1, 2, 3 vertices. You can see right away that the complete graph on 3 vertices is the smallest possible Hamiltonian circuit, and therefore the smallest Hamiltonian graph. It is simply not possible to have a Hamiltonian graph with fewer vertices. This suggests the possibility that the number of properties that a graph can have grows as a function of its order, and we show, using some not terribly impressive examples, that this in fact must be true, suggesting the possibility that useful and interesting properties of objects continue to arise as a function of their scale.

Let \gamma be an infinite binary string, and define an infinite subset of the natural numbers A such that k \in \mathbb{N} is included in A if \gamma(k) = 1. Now define an infinite set of graphs \Gamma such that H_i is in \Gamma if the number of edges of H_i is divisible by A(i), the i-th smallest element of A. Because each H_i is a Hamiltonian circuit, the number of edges in H_i is simply i. Putting it all together, we have an infinite set of graphs, each of which is a Hamiltonian circuit of a different order, and whether or not a given order is included in the set is determined by whether or not that order is divisible by the corresponding integer value in A. Note that larger graphs will typically have orders that are divisible by more numbers, though the number of divisors does not increase monotonically as a function of n \in \mathbb{N}. Therefore, in this view, larger graphs have more properties, in the sense of being included in more sets of this type.
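As a minimal Octave sketch of the construction, using a hypothetical finite prefix of \gamma for illustration:

gamma = [1 0 1 1 0 1 0 1]; % hypothetical prefix of the infinite string
A = find(gamma); % k is in A iff gamma(k) = 1, so here A = [1 3 4 6 8]
included = []; % orders i for which H_i is in Gamma
for i = 1 : numel(A)
  if mod(i, A(i)) == 0 % H_i has exactly i edges, so test divisibility by A(i)
    included(end + 1) = i;
  endif
endfor
included % here, only H_1 qualifies, since mod(1, 1) == 0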

Complexity Versus Computability

Intuitively, there is a connection between complexity and computability. In at least one case, this is true, in that the Kolmogorov Complexity of a non-computable infinite string is countably infinite. That is, if a string is non-computable, then there is no finite input to a UTM that will generate that string. As a result, its complexity cannot be finite. At the same time, we can simply copy the string from the input tape to the output tape, assuming both tapes are infinite, so its complexity cannot exceed the length of the string itself. Therefore, the Kolmogorov Complexity of a non-computable infinite string is countably infinite. However, this intuition is subject to more careful examination in the case of infinite sets, in that you can construct non-computable sets of low-complexity objects.

Now assume that \gamma from the example in the previous section is non-computable, which implies that both A and \Gamma are non-computable. As discussed above, it follows that there is no computable test for inclusion in \Gamma. At the same time, each of the graphs in \Gamma is a Hamiltonian circuit, with a Kolmogorov Complexity of at most \log(N) + C. Note that the ratio \frac{\log(N)}{{N \choose 2}} approaches zero as N approaches infinity. As such, we have an infinite set of graphs that is non-computable, but each of the individual graphs is a low-complexity object, though note that the set was generated using a non-computable string.
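As a quick numerical illustration of how fast that ratio vanishes, in Octave:

for N = [10 100 1000]
  printf("N = %4d: log2(N) / nchoosek(N, 2) = %g\n", N, log2(N) / nchoosek(N, 2));
endfor
% prints roughly 0.0738, 0.00134, and 2e-05, respectively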