Shannon’s entropy equation can be used to solve resource management problems:
Let’s say we have 5 nodes in a search space, and we expect each node to be about the same distance from some goal state. In this case, there’s no reason to allocate more computational resources to one node than any other, since ex ante we expect all nodes to perform the same.
Now imagine that we expect one of those nodes to be significantly closer to the goal state than the others. It would be rational in that case to dedicate more computational power to the states generated by that node.
We can use Shannon’s equation to answer the question of how much additional power should be allocated.
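As a minimal sketch of how that allocation could work (the exponential weighting of distance estimates and the example values are my assumptions, not something the note fixes), we can turn per-node distance estimates into allocation weights and measure their entropy:

```python
import math

def allocation_weights(estimated_distances, temperature=1.0):
    """Turn per-node distance estimates into allocation weights:
    nodes with smaller estimated distance to the goal get more compute."""
    scores = [math.exp(-d / temperature) for d in estimated_distances]
    total = sum(scores)
    return [s / total for s in scores]

def shannon_entropy(p):
    """H(p) = -sum(p_i * log2(p_i)), in bits."""
    return -sum(q * math.log2(q) for q in p if q > 0)

# Five nodes, all equidistant: allocation is uniform, entropy is maximal.
uniform = allocation_weights([10, 10, 10, 10, 10])
# One node expected to be much closer: allocation skews, entropy drops.
skewed = allocation_weights([2, 10, 10, 10, 10])
```

When all nodes look equally promising, the weights come out uniform and the entropy hits its maximum of log2(5) bits; as one node pulls ahead, the entropy of the allocation falls, which is one way to quantify "how much additional power" the favored node should receive.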
We can also use information theory to help guide the search process itself. If we know nothing about a state-space, we’ll probably want to fully expand all the nodes until we have a robust set of nodes that satisfies some minimum diversity threshold.
Generating that final, sufficiently diverse state-space will necessarily require producing a sequence of intermediate state-spaces, each with its own entropy.
We can then apply my standard technique in AI: select the state-space whose generation produced the greatest change in entropy during that process. This is the state-space whose generation changed the structure of the search the most, and we can use it as the starting point for more carefully unpacking the depths below.
Finally, generating a state-space is generally very computationally intensive, with a cost that is likely exponential in the number of iterations (i.e., a state-space can be represented as a tree whose breadth grows exponentially with depth). As a result, it’s rational to try to compress the number of dimensions used in navigating the state-space.
I introduced an approach to dimension compression in the note below this one that makes use of my categorization algorithm, but one possibly faster approach is to simply test how sensitive the value we’re attempting to predict is to changes in the input dimension in question. That is, we would start with dimension 1, and test the response of the function we’re evaluating to changes in the value of that dimension, and repeat this process for each dimension. Then, we allocate resources according to how sensitive the output variable is to the dimension in question.
So for example, if we’re trying to find the minimum of a function like f(x, y) = 1000x^2 + y, the value of f is going to be far more sensitive to changes in the value of x than to changes in the value of y. This is easy to test, so we can run this type of test quickly for each dimension of a dataset, and then allocate resources using Shannon’s equation (where the most sensitive dimensions get the most resources), or simply select some fixed number of the most sensitive dimensions. This could dramatically reduce the workload of the state-space navigation function for high-dimensional functions.
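A sketch of the per-dimension sensitivity test (the finite-difference probe, the step size `eps`, and the example target `f` are illustrative assumptions; the note leaves the exact perturbation scheme open):

```python
def dimension_sensitivities(f, x, eps=1e-3):
    """Estimate how sensitive f is to each input dimension via a
    one-sided finite-difference probe around the point x."""
    base = f(x)
    sens = []
    for i in range(len(x)):
        perturbed = list(x)
        perturbed[i] += eps
        sens.append(abs(f(perturbed) - base) / eps)
    return sens

def sensitivity_allocation(f, x, eps=1e-3):
    """Normalize sensitivities into allocation weights, so the most
    sensitive dimensions receive the most resources."""
    s = dimension_sensitivities(f, x, eps)
    total = sum(s)
    return [v / total for v in s]

def top_k_dimensions(f, x, k, eps=1e-3):
    """Indices of the k dimensions the output responds to most."""
    s = dimension_sensitivities(f, x, eps)
    return sorted(range(len(s)), key=lambda i: s[i], reverse=True)[:k]

# Illustrative target: the output depends far more on x[0] than x[1].
f = lambda v: 1000 * v[0] ** 2 + v[1]
```

Either `sensitivity_allocation` (the proportional-allocation reading) or `top_k_dimensions` (the fixed-number reading) can then feed the state-space navigation step, restricting the search to the dimensions that actually move the output.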
I’ll follow up with code and a research note.