Issues with my Code bin

My code bin has apparently been displaying minus signs as dashes, making copying and pasting inconvenient, to say the least, so my apologies for not noticing this sooner.

As a result, I’ve attached the code for all of my algorithms as zip files to my researchgate page.

Link: Code for all Algorithms


Vectorized Image Partitioning

In this article, I’m going to present a low-degree polynomial runtime image partition algorithm that can quickly and reliably partition an image into objectively distinct regions, using only the original image as input, without any training dataset or other exogenous information. All of the code necessary to run the algorithm is available on my code bin.

There’s also a simple script “Consolidate Partition – CMND LINE” that consolidates these features into larger objects.
I’ll follow up with a “crawler” that will assign contiguous regions unique labels to make feature extraction more convenient.


Vectorized Image Boundary Detection

In a previous note, I introduced an algorithm that can, without any dataset or other prior information, reassemble a scrambled image to its original state. In this note, I’m going to introduce a related algorithm that can quickly find boundaries in an image without any dataset.

Though I’ve yet to formally analyze the run time of the boundary detection algorithm, it is remarkably fast, and certainly a low-degree polynomial. I suspect this algorithm will have applications in real-time object tracking, lane detection, and probably machine generated art.

The core insight of both algorithms is that real life objects are generally structured in a manner that causes colors to be locally consistent. That is, if we look closely at an object, we’ll find that color palettes that are proximate in space are generally similar. Therefore, as we traverse an image in a given direction, if we find that the color palette changes sharply, then we’ve probably transitioned to a new region of the image.

Both algorithms measure local color consistency using a vectorized implementation of set intersection that I’ve developed. Specifically, both test how many colors two regions have in common by taking the intersection of the color sets in the two regions. However, unlike ordinary intersection, these algorithms measure what I call the “\delta-intersection” of two sets, which, rather than test for equality, tests whether the norm of the difference between two color vectors is less than some value \delta. If the norm of the difference is less than \delta, then the two colors are treated as “the same” for purposes of calculating the intersection between the two sets that contain the colors in question.

This value of \delta is optimized using the same methods that I introduced previously, which make use of information theory, producing a context-dependent level of distinction that allows us to say whether any two given colors are close enough to be considered the same. You can find an overview of my work in artificial intelligence in my research paper, A New Model of Artificial Intelligence.

How it works

This is just a high level summary, as I’m leaving Sweden tomorrow for New York City, and plan to write a comprehensive, formal research paper on all of my work on color processing and color perception, which is at this point, extensive.

The first step is to break up the image into rectangular regions using my core image partition algorithm. This will impose a grid upon the image that already contains the macroscopic boundaries of the objects within the image, but because it’s structured as a grid, it also contains edges that don’t correspond to actual boundaries. The example below shows a photograph I took in Williamsburg, Brooklyn, of a small brick wall, together with the image as broken into regions by the partition algorithm. The image on the right shows the average color of each region generated by the partition algorithm, as a visual aid.


The next step is to go through each of these regions, first by “reading” each row of the partition from left to right, and calculating the \delta-intersection of the colors contained in neighboring regions. So beginning with the top left region of the image, which is mostly white, we calculate the \delta-intersection between that region and the region to its immediate right, which is also mostly white. As we transition from left to right, we’re going to eventually reach regions that have a darker tone, causing the intersection count to start to drop off. As you can see, regions (1,3) and (1,4) have significantly different coloring, which is going to cause the \delta-intersection count to suddenly drop off, suggesting that there’s a boundary, which is actually the case.

All of my work so far in artificial intelligence has made use of finite differences to construct categories, and more generally, distinguish between objects. For example, the value of \delta above that we use to distinguish between colors is a fixed value that is intended to be used as a limit on the difference between two vectors, above which, we distinguish. I’ve coined the term minimum sufficient difference to describe this concept of \delta generally. That is, in this case, \delta is the minimum sufficient difference between two color vectors necessary to distinguish between the colors. In simple terms, if the difference between two colors is more than \delta, then they’re not the same in the context of the image, and their difference exceeds the minimum sufficient difference for distinction in that context.

However, when “reading” an image from left to right, there might not be a single minimum sufficient difference between intersection counts capable of identifying all of the boundaries in an image. As a simple example, consider the following sequence of integers:

1, 2, 5, 107, 210, 250.

Let’s pick a fixed value of \delta = 6. Reading this sequence from left to right, and calculating the difference between adjacent entries, we would place delimiters as follows:

1, 2, 5 || 107 || 210 || 250.

If these numbers represent the intersection counts between neighboring regions in an image, then this partition is probably wrong, since the numbers 107, 210, and 250, probably all correspond to a single, new region that begins at 107. That is, the correct partition is probably the following:

1, 2, 5 || 107, 210, 250.

This partition cannot be produced using a fixed finite difference. Specifically, since 5 and 107 are categorized separately, it must be the case that 107 - 5 = 102 > \delta. Because 107 and 210 are categorized together, it must be the case that 210 - 107 = 103 < \delta. But obviously, it cannot be the case that \delta < 102 and \delta > 103. Nonetheless, we might need to produce this partition, so as a result, the boundary algorithm makes use of a ratio test, rather than a finite difference test. Specifically, it tests the ratio between the intersection counts of neighboring regions.

Continuing with the sequence of integers above, we would calculate the ratio between 1 and 2 (.5), 2 and 5 (.4), 5 and 107 (.0467), 107 and 210 (.5095), and 210 and 250 (.84). Using this approach, we can fix a minimum ratio of \Delta = .4, which will cause us to draw a boundary at the right place, between 5 and 107.

Applying this approach to the original image of the wall generates the following partition:


I’ve come up with a method of optimizing \Delta that you can see in the code, which is available in my code bin (see, “delimit_image”). I’ll follow up with a feature extraction algorithm based upon this technique, that will identify contiguous regions in an image.

I’ll explain all of this in greater detail from the other side of the Atlantic, in New York City. In the interim, here are three more examples of boundaries generated by the algorithm:





Recovering a Distorted Image With No Prior Information

In this note, I’m going to present an algorithm that can, without any prior information, take a scrambled image, and reassemble it to its original state. In short, it shows that even if we know nothing about a distorted image, we can still figure out what the image was supposed to look like using information theory. The only input to the algorithm is data derived from the scrambled image itself, and the algorithm isn’t recognizing objects. It is instead, rearranging the scrambled image in a manner intended to maximize the expectation that any objects in the image will be reassembled.

Human beings can solve these types of puzzles easily because we have experience with real world objects, and know, whether consciously or not, what things should look like, generally. Because this algorithm operates with no prior information at all, it suggests that our intuitions for structure might have an even more primordial origin that goes beyond experience, and is instead rooted in how objects in the natural world are generally organized.

This algorithm is yet another example of the utility of my approach to artificial intelligence, which is to make theoretical assumptions based upon information theory about what should happen given a particular input, and proceed mechanistically without a dataset, with the expectation that what is generated should be useful and accurate. Interestingly, this particular algorithm can be fairly characterized as a matrix algebra algorithm, since it swaps entries in a matrix according to a simple closed form formula. As a result, this algorithm has more in common with Gaussian Elimination than a typical machine learning algorithm.

In this note, I’m going to apply this algorithm only to images, but I suspect it has more general applications in information recovery. In particular, I think this algorithm could be used to reassemble not only images, but 3D objects, and though not an area of my personal interest, it seems like it could also have applications in code-breaking and DNA analysis.

Partitioning an Image

Previously, I’ve discussed my image partition algorithm, which makes use of assumptions based in information theory in order to partition an image into objectively distinct features, without any prior information (i.e., without a dataset). You can read about this algorithm, and others, in my working paper, “A New Model of Artificial Intelligence“.

In short, the image partition algorithm breaks an image into regions that have maximally distinct amounts of color information, as measured by the entropy of the color distribution in each region. The result is a very fast edge-detection algorithm that can then be used to extract shape information, and color information, which can in turn facilitate object recognition and image classification. For more on this, see my working paper, “A New Model of Artificial Intelligence: Application to Data II“.

At a high level, the image partition algorithm takes a random image and asks, even though I know nothing about this image, where do I expect to find the edges of the objects in the image? This first example above is a picture of a bird that I found online, and on the right hand side, you can see how the partition algorithm extracts the shape and edge information from the original image. The image on the right consists of all points in the original image that the algorithm believed to be part of the foreground of the original image.

The second example is a photograph I took in Williamsburg, Brooklyn, of a small brick wall that a street artist painted red, and wrote a message on. The image on the left is the original image, and the image on the right is the image after being processed by the partition algorithm, which also produces a series of weights, based upon whether the algorithm thinks the region in question is a foreground feature. In this case, I’ve applied a “macro” version of the partition algorithm that searches for macroscopic objects, such as the wall. The darkness of a region indicates the probability of the region being a foreground feature, with darker regions less likely to be part of the foreground of the image. These weights are not relevant for purposes of this note, but they help to distinguish between the regions in the image identified by the partition algorithm, and demonstrate the algorithm’s ability to locate macroscopic boundaries of objects.

The algorithm I’ll present in this note takes a random image and asks, even though I know nothing about the original state of the image, what was it supposed to look like?

Measuring Local Consistency

Standard Deviation and Average Color

The fundamental observation that underlies the image reassembly algorithm is that when an image is in its original state, colors are generally locally consistent. I’ve deliberately selected this example since it features a bright red wall, green grass, and a blue sky, with each more or less sequestered from the other, which exaggerates this general principle.

If we permute the regions in this image, we’re going to create greater local variation in color, but we’re not going to affect the standard deviation of the colors. This might seem like a trivial observation, but the implication is that permuting the regions adds disorder that is not measured by the standard deviation of the colors.

Having made this observation, I set out to measure what it is that’s changing as we permute the regions in an image. What I’ve come up with is a tractable, and useful measure of local color consistency that also serves as the measure that ultimately allows the reassembly algorithm to function. At a high level, the algorithm swaps regions in the scrambled image, and tests whether the local color consistency score increased or decreased as a result of the swap, and proceeds mechanistically with the goal of maximizing local color consistency. The images above show the average color of each region in the original image, the image after being scrambled 25 times, and a graph showing the total color consistency score as a function of the number of scrambles (i.e., starting with the original image and randomly swapping regions).

Color, Distinction, and Complexity

As noted, at a high level, the reassembly algorithm operates by maximizing the consistency between the colors contained within two regions. This requires us to measure the similarity between sets of colors, which in turn requires that we measure the similarity between individual colors.

Because colors are typically represented as RGB vectors, it’s both natural and convenient to measure the difference between two colors using the norm of the difference between their corresponding color vectors. However, for purposes of the reassembly algorithm, we’ll also have to be able to say whether we should distinguish between two colors. That is, we’ll have to develop a binary test that determines whether or not two colors are similar enough to constitute a “match”.

This question cannot be answered by looking only to the norm of the difference between two color vectors, since the norm operator will produce only a measure of difference. That is, on its own, this difference cannot tell us whether we should distinguish between the two colors. This is essentially the same question that I answered in the development of all of my algorithms, which is, given the context, what is the minimum sufficient difference between two objects that justifies distinguishing between them?

Distinction is what drives complexity, both as a practical, and theoretical matter. That is, the more granular our distinctions are, the more complexity any observed object will have, and the less we distinguish, the less complexity any observed object will have.  Though the actual algorithmic complexity (i.e., Kolmogorov Complexity) of an object is not computable, we can get close enough using the Shannon entropy, which is a built-in function in both Matlab and Octave.

As a result, we can find a useful answer to the question of what minimum difference justifies distinguishing between two objects in a dataset by iterating from \delta = 0 up to the standard deviation of the dataset, and measuring the rate of change in the entropy of the structure generated by each value of \delta (i.e., each level of distinction). For example, if \delta = 0, then any difference between two objects will cause us to distinguish between them. Specifically, if \delta = 0, and the norm of the difference between two color vectors is non-zero, then we will distinguish between those two colors.

The core observation that underlies nearly all of my work in artificial intelligence is that the value of \delta that generates the greatest change in the measured structure of the object is the value of \delta at which the greatest change in the structure of the object became apparent. That is, as we iterate through values of \delta, there will be some value that maximizes the rate of change in the entropy of the object, which is the point at which the measured structure of the object changed the most.

Imagine adjusting the focal point of a camera lens – at all focal points before and after the correct focal point, the image is simply blurry, suggesting that the rate of change in structure is probably roughly the same on either side, until you approach the actual focal point, when suddenly, a drastic change in perceived structure occurs. Though this is not a mathematically precise analogy, it is useful for establishing an intuition for why this approach works as a general matter.

The “Delta-Intersection” of Sets

The algorithm measures the similarity between two regions in an image by counting the intersection of the sets of colors contained in the two regions. Ordinarily, intersection is measured by counting the number of common elements contained in two sets. In this case, we treat two colors as “the same” for purposes of calculating the intersection, so long as the norm of the difference between their respective color vectors is less than \delta. That is, we test whether the norm of the difference is less than \delta, rather than testing for equality, which is the ordinary way of calculating intersection. This is done to address the practical reality that because RGB encoding allows for an enormous number of colors, we’re unlikely to find a substantial intersection between two regions if we test for equality between colors.

The object whose complexity we’re measuring is in this case a matrix that contains the intersection count between a given region and its four neighbors (up, down, left, and right). That is, each region in the image can be associated with a row and a column index (i,j), and this matrix contains the total number of colors that the (i,j)-region has in common each of its neighbors, as measured by the \delta-intersection operator. That is, the (i,j) entry of the matrix contains the sum of the \delta-intersections between the (i,j) region and each of its four neighbors (i.e., we ignore the diagonals).

Note that as we ratchet up the value of \delta, more colors will qualify as the same, increasing the intersection count between all regions. The algorithm will find the optimum value of \delta that maximizes the rate of change in the entropy of the matrix.

In calculating the entropy of the matrix, we treat the intersection count in each entry as a frequency, divide by the total intersection count across all entries, and then treat the resulting set of numbers as a probability distribution. This means that as the intersection count becomes more uniform across the image, the entropy of the matrix will increase. In turn, this means that images that are uniformly color-consistent will have a high entropy, whereas images that have pockets of highly consistent regions, and other pockets of less consistent regions, will have lower entropy. This is similar to how entropy is used generally, as a measure of dispersion, but in this case, the quantity whose dispersion we are measuring is the intersection count as a portion of the total intersection count over the entire matrix.

The results of the algorithm of course depend upon the complexity of the image, and as the example above shows, it does produce errors with more complex scenes, and fails to recognize some of the obvious outlier regions. Nonetheless, it clearly uncovers a significant portion of the original structure of the image, and moreover, I expect to be able to improve its performance without significantly impacting runtime by tweaking some of the underlying variables, such as the number of colors it samples in each region.

After I’m convinced that I’ve maximized its performance, I’m going to follow up with a more formal working paper that measures the performance of the algorithm, and possibly, present other applications of this algorithm outside of image reconstruction. But for now, I can say that it is a very low-degree polynomial runtime algorithm (I believe O(M^3), where M is the number of regions the image is broken into) that performs well, and can be run on cheap consumer devices.

I’ll also follow up with a new image partition algorithm that makes use of the \delta-intersection.

The relevant Matlab / Octave code can be found in the Algorithms tab (see, “unsup_image_rebuild_fast”). Subroutines can be found by typing the function name in the search field on my code bin.


Angular Momentum and Power Generation

Every generator I’ve ever seen is housed in a fixed mount, and uses a rotating magnet in a column, powered by something like combustion, or wind, to generate a current through Faraday induction.

But if you have rotational motion, you get spin acceleration that is orthogonal to the original rotational motion “for free”:

I was wondering if there are generators that use floating columns to take advantage of this additional acceleration. If not, why? Is it not enough acceleration to justify the added complexity?

It’s a weird property as a general matter, and it depends upon the mass of the rotating object, even if less than all of the mass is rotating. E.g., if you hang a weight from a rotating wheel, the wheel spins faster, even though the weight isn’t adding to the rotation of the wheel at all.

It seems to me that, as a result, something like a gyroscopic cage for a generator would allow for significant additional acceleration.

Though generally unrelated, today is also Richard Feynman’s birthday, who I’ve always looked up to as a role model in terms of his delivery – compression is consideration, for otherwise you’re making your audience do the work of untangling your message.


A Unified Model of the Gravitational, Electrostatic, and Magnetic Forces

This is a research note I wrote a while back that presents a unified model of the Gravitational, Electrostatic, and Magnetic Forces, each rooted in my model of physics, that is in turn based upon information theory and computer theory. I clarified these ideas in a follow up paper (available here), but the equations and concepts remain generally unchanged.

A Unified Model of the Gravitational, Electrostatic, and Magnetic Forces

Thought Experiment on the Cantor Set as a Frequency

Imagine we had a line segment with a finite length, and then iteratively extracted the center \frac{1}{3}; and then extracted the center \frac{1}{3} of the two resulting segments, and so on.

The set of points that remains after this process is of course the Cantor Set:

Note that this set will be bounded by the two end points of the original line segment.

Now imagine that we gave the line segment some velocity, sending the entire Cantor Set in between, in tact, through space. Further, imagine that we had a sensor at a fixed point along the path of the set that lights up every time a point in the set crosses the sensor.

Because there are a countable number of gaps in the line segment, the light will blink on and off with some frequency, an infinite number of times. The signal generated will depend upon both the gaps in the set, and the velocity of the line segment.

Also note that the amount of time it takes for the line segment to cross the sensor is given by t = \frac{L}{v}, where L is the length of the segment, and v is the velocity of the segment. Because L and v are both finite, t is finite.

Now imagine that we have two such line segments S_1 and S_2, both of length L, but that S_1 is travelling with a faster velocity of v_1 > v_2. Because v_1 > v_2, it will take less time for S_1 to cross the sensor, causing the sensor to be triggered for a shorter amount of time by S_1 than S_2.

For example, the length of the gap in the middle (the largest gap initially removed) has a length of \frac{L}{3}. The amount of time it takes for this gap to cross the sensor is \frac{L}{3v}, which will obviously depend upon the velocity of the segment. If we assume that the light turns off once the sensor hits this gap, then the amount of time the light is off during this gap will vary with the velocity of the segment.

The same will be true of all gaps in the set.

This implies that the signal generated by S_1 is objectively distinguishable from the signal generated by S_2, despite the fact that both will cause the sensor to trigger an infinite number of times.

This same thought experiment works with any bounded set that has a countable number of “holes” in it.

Note that this (admittedly theoretical) hypothetical suggests that an infinite signal can be conveyed in a finite amount of time.

Unsupervised 3D Feature Extraction and Edge Detection Algorithm

In this note, I’ll present an unsupervised algorithm that can extract three-dimensional features from an ordinary two-dimensional image, and detect edges within the image, thereby extracting two-dimensional shape information, in each case in polynomial time.

A research note explaining the algorithm, together with the code, is available on my researchgate homepage here, and I’ve also attached a PDF.

The relevant scripts are also attached as PDFs below. Any missing scripts can be found in previous posts, and in the “Algorithms” tab.