My Letter to the Congressional Artificial Intelligence Caucus

Dear Members of Congress,

I’m a machine learning engineer, and I’m emailing you today because my research shows unambiguously that small-scale, cheap devices can easily be turned into potentially dangerous machines using algorithms that require relatively few lines of code to function.

Similar algorithms can likely be hidden on consumer devices, since they require so little power and memory to function.

Attached is a short paper that shows how this can be done using an ordinary PC, including all of the code necessary to run the algorithms, so that you can see for yourself that this is in fact the case.

Prior to working in artificial intelligence, I was a derivatives lawyer at BlackRock, Inc. and McDermott Will & Emery LLP, and an author for The Atlantic. I was a mathematician prior to practicing law.

Though the motivation for my research is commercial, I thought it would be reckless for me not to bring these algorithms to the attention of a responsible party in government. Since there is no clear regulator in this space that I’m aware of, I thought the Members copied on this email would be a good starting point.

As a general matter, my research shows that artificial intelligence is probably far more powerful than the government and the public realize. The bottom line is that malicious actors could use these techniques to do serious harm to the American people and our infrastructure using small devices.

The full suite of my research papers on artificial intelligence is available here:

https://www.researchgate.net/profile/Charles_Davi

I would be happy to answer any questions you might have.

Best Regards,

Charles Davi

ATTACHED: Autonomous_Real_Time_AI


Generating Independent Predictions

Attached is a script that calls my real-time prediction functions to generate a series of independent predictions.

This is probably a solid base for applying other machine learning techniques to extract more information from these predictions.

Autonomous Real-Time Deep Learning

In a previous paper, I introduced a new model of artificial intelligence rooted in information theory that can solve deep learning problems quickly and accurately in polynomial time. In this paper, I’ll present another set of algorithms that are so efficient, they allow for real-time deep learning on consumer devices. The obvious corollary of this paper is that existing consumer technology can, if properly exploited, drive artificial intelligence that is vastly more powerful than traditional machine learning and deep learning techniques.

Link here: https://www.researchgate.net/publication/335870861_Autonomous_Real-Time_Deep_Learning

PDF: Autonomous Real-Time Deep Learning.

A New Model of Artificial Intelligence: II

In the next few days I’m going to assemble all of the work I’ve done since publishing my first paper into a single comprehensive paper, both for my own clarity of thought, and to allow readers to get a better understanding of the work I’ve done. In particular, I’m going to cover all my recent work on real-time AI.

I’ll also discuss the various measures of distinction I’ve made use of, such as Euclidean distance, intersection, and other operators. In addition, I’ll introduce new methods I’ve developed, in which we generate a different value of delta for each dimension, and then count the number of dimensions of two vectors that are within the applicable delta of each other, as well as filter out dimensions. Finally, I’ll discuss the state-space navigation and function optimization algorithms I published, but never discussed.
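As an illustrative sketch of the per-dimension delta comparison (my own Python illustration, not the code from my published scripts; the optional dimension filter is an assumed interface):

```python
def count_matching_dims(x, y, deltas, keep=None):
    """Count the dimensions of vectors x and y that are within the
    applicable delta of each other, optionally filtering dimensions.

    x, y   : equal-length sequences of numbers
    deltas : one delta value per dimension
    keep   : optional set of dimension indices to consider; None keeps all
    """
    count = 0
    for i, (xi, yi, d) in enumerate(zip(x, y, deltas)):
        if keep is not None and i not in keep:
            continue  # this dimension has been filtered out
        if abs(xi - yi) <= d:
            count += 1
    return count
```

The count can then serve as a similarity score between the two vectors, with filtered dimensions simply ignored.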

In short, there are many algorithms I’ve published but not discussed, and an enormous amount of work I’ve yet to even publish. My goal is to select the very best of this work, and present it in a single comprehensive paper that will hopefully revolutionize the field.

Generating Novel Data

Attached is an algorithm that takes a dataset of videos and quickly generates a new sequence of frames from those videos that looks very realistic. In short, given a dataset of videos, the algorithm selects frames from the original videos and reassembles them into a new video in a manner that looks very realistic. I’m still tweaking this, but it works, and the results are quite cool.

The concept is more general, and I plan on using it to produce novel images given datasets of images, in particular datasets of paintings. This approach appears to be a very fast substitute for the types of composite images that are generated by neural networks. On the dataset below, this algorithm runs in about 2 seconds.
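As a rough sketch of the idea (my Python illustration, not the attached script), one simple way to reassemble frames from a pool of videos is to greedily chain nearest neighbors, so that each successive frame closely resembles the one before it:

```python
import math

def frame_distance(a, b):
    """Euclidean distance between two frames, each a flat vector of pixel values."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def reassemble(frames, start=0):
    """Greedily order frames so that each frame is followed by the
    nearest unused frame, producing a smooth-looking new sequence."""
    order = [start]
    unused = set(range(len(frames))) - {start}
    while unused:
        current = frames[order[-1]]
        nearest = min(unused, key=lambda i: frame_distance(current, frames[i]))
        order.append(nearest)
        unused.remove(nearest)
    return order
```

Because each step minimizes the visual jump to the next frame, the resulting sequence tends to look continuous even though its frames come from different source videos.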

The image files for the video I used can be found on dropbox.

The actual command line script is below:

Imitate_Data_NN

Regrettably, I’m having some issues with my researchgate page, and it looks like all attachments have simply disappeared. A decent amount of my code is still available on this blog, but if you need a particular script, feel free to send me a message.

Ayin Real-Time Engine

Attached is an algorithm that does real-time deep learning.

The “observations” input is the dataset, which is meant to simulate a buffer: as data is read from it, it is added to a training dataset, from which predictions are generated. However, once the accuracy of the predictions meets or exceeds the value of “threshold”, the algorithm stops learning and only makes predictions. If the accuracy drops below “threshold”, then learning begins again. “N” is the dimension of the dataset; data beyond dimension N is ignored.
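Since the engine itself is in the attached scripts, here is only a hedged sketch of the control flow described above, in Python, using a plain nearest-neighbor predictor and a rolling accuracy window (the window is my own assumption, not a parameter from the attached code):

```python
import math

def dist(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def run_engine(observations, labels, threshold, window=4):
    """Read (vector, label) pairs as if from a buffer. While rolling
    accuracy is below threshold, observed pairs are added to the training
    set; once accuracy meets or exceeds threshold, learning stops, and it
    resumes if accuracy later drops below threshold."""
    train = []        # (vector, label) pairs observed while learning
    recent = []       # rolling record of correct (1) / incorrect (0)
    predictions = []
    for x, y in zip(observations, labels):
        # predict from the training set gathered so far (None if empty)
        pred = min(train, key=lambda t: dist(t[0], x))[1] if train else None
        predictions.append(pred)
        recent = (recent + [1 if pred == y else 0])[-window:]
        accuracy = sum(recent) / len(recent)
        if accuracy < threshold:  # learning is on only below threshold
            train.append((x, y))
    return predictions
```

On an easily separable stream, the sketch stops adding training data once its rolling accuracy reaches the threshold, while continuing to emit predictions.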

Also attached is a command line script demonstrating how to use the algorithm.

Ayin_RTE

find_NN

Ayin_CMNDLINE

Finally, here’s related code that does real-time video classifications:

Real_Time_Video_CMNDLINE

The image files for the video training example can be found on dropbox.

Real-Time Autonomous Video Classification

Below is an algorithm that does real-time video classification at an average rate of approximately 3 frames per second, running on an iMac.

Each video consists of 10 frames of HD images, roughly 700 KB per frame. The individual unprocessed frames are assumed to be available in memory, simulating reading from a buffer.

This algorithm requires no prior training, and learns on the fly as new frames become available.

The particular task solved by this algorithm is classifying the gestures in the video:

I raise either my left hand or my right hand in each video.

The accuracy in this case is 95.455%: the algorithm correctly classified 42 of the 44 videos.

Though the algorithm is generalized, this particular instance is used to do gesture classification in real time, allowing human-machine interactions to occur without the machine having any prior knowledge about the gestures that will be used.

That is, this algorithm can autonomously distinguish between gestures in real-time, at least when the motions are sufficiently distinct, as they are in this case.

This is the same classification task that I presented here:

https://www.researchgate.net/publication/335224609_Autonomous_Deep_Learning

The only difference is that in this case, I used the real-time prediction methods I’ve been working on.

The image files from the video frames are available here:

https://www.dropbox.com/s/9qg0ltm1243t7jo/Image_Sequence.zip?dl=0

Though there is a testing and training loop in the code, this is just a consequence of the dataset, which I previously used in a supervised model. That is, predictions are based upon only the data that has already been “observed”, and not the entire dataset.
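As a minimal sketch of that observed-data-only behavior (my Python illustration, not the linked code), each video can be reduced to a single summary vector, classified against only the videos seen so far, and then added to the observed set:

```python
import math

def dist(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def classify_stream(videos, labels):
    """Classify each video using only previously observed videos.

    videos : list of videos, each a list of frames (flat pixel vectors)
    labels : the true label of each video, revealed after prediction
    """
    seen = []   # (summary_vector, label) for videos already observed
    preds = []
    for video, label in zip(videos, labels):
        # summarize a video by averaging its frames into one vector
        summary = [sum(col) / len(video) for col in zip(*video)]
        preds.append(min(seen, key=lambda s: dist(s[0], summary))[1] if seen else None)
        seen.append((summary, label))
    return preds
```

The first video necessarily gets no prediction, and every later prediction depends only on what has already been “observed”, never on the full dataset.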

Note that you’ll have to adjust the file path and number of files in the attached scripts for your machine.

The time-stamps printed to the command line represent the amount of time elapsed per video classification, not the amount of time elapsed per frame. Simply divide the time per video by 10 to obtain the average time per frame.

Code here:

https://www.researchgate.net/project/Information-Theory-SEE-PROJECT-LOG/update/5d740d823843b0b9826313af

Real-Time Function Prediction

Below is a script that allows for real-time function prediction.

Specifically, it can take in a training set of millions of observations, and an input vector, and immediately return a prediction for any missing data in the input vector.

Running on an iMac, using a training set of 1.5 million vectors, the prediction algorithm had an average run time of 0.027 seconds per prediction.

Running on a Lenovo laptop, also using a training set of 1.5 million vectors, the prediction algorithm had an average run time of 0.12268 seconds per prediction.

Note that this happens with no training beforehand, which means that the training set can be updated continuously, allowing for real-time prediction.

So if our function is of the form z = f(x,y), then our training set would consist of points over the domain for which the function was evaluated, and our input vector would be a given (x,y) pair within the domain of the function, but outside the training set.
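As a toy sketch of that setup (mine, in Python; the attached script is the actual implementation), a nearest-neighbor lookup over a training set of evaluated points returns the stored z value of the closest (x, y):

```python
import math

def predict_z(train, x, y):
    """train: (x, y, z) samples; return the z of the nearest (x, y)."""
    nearest = min(train, key=lambda p: (p[0] - x) ** 2 + (p[1] - y) ** 2)
    return nearest[2]

# build a training set by evaluating z = sin(x + y) over a grid
train = [(i * 0.1, j * 0.1, math.sin(i * 0.1 + j * 0.1))
         for i in range(50) for j in range(50)]
```

On an input like (1.23, 2.34), which lies inside the domain but outside the grid, the prediction lands close to the true value of sin(3.57), and the training list can be appended to at any time without a separate training step.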

I’ve attached a command line script that demonstrates how to use the algorithm, applying it to a sine curve in three-space (see “9-6-19NOTES”).

Code available here:

https://www.researchgate.net/project/Information-Theory-SEE-PROJECT-LOG/update/5d72d55f3843b0b98262f6f8

Prometheus Express

I’ve developed a new version of Prometheus that produces nearly instantaneous classifications running on an ordinary consumer device. It’s based upon a real-time learning engine that I’m in the process of developing.

It’s also a GUI-based application: you simply point, click, and classifications are automatically generated.

Download Prometheus Express.

Real-time Clustering

I’ve developed an algorithm that can generate a cluster for a single input vector in a fraction of a second. See “no_model_optimize_cluster” using the link below. This will allow you to extract items that are similar to a given input vector from a dataset, with no prior training, essentially instantaneously.
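As a hedged sketch of the behavior (not the “no_model_optimize_cluster” code itself, and with delta taken as a given rather than an optimized value), a cluster about a single input vector is just the set of items within delta of it:

```python
import math

def dist(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def cluster_about(dataset, x, delta):
    """Return the items in dataset within delta of the input vector x."""
    return [p for p in dataset if dist(p, x) <= delta]

def mean_difference(cluster, x):
    """Average norm of the difference between x and the items in its cluster."""
    return sum(dist(p, x) for p in cluster) / len(cluster)
```

The second function is the kind of measurement the test script below performs over an entire dataset: the average norm of the difference between an input and the items in its cluster.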
Further, in a previous research note, I presented a related hypothesis: that there is a single objective value of delta that warrants distinction for any given dataset.

To test this hypothesis again, I’ve also included a script that repeatedly calls the clustering function over an entire dataset, and measures the norm of the difference between the items in each cluster. The resulting difference appears to be very close to the value of delta generated by my categorization algorithm, providing further evidence for this hypothesis.
The code is available here: Real-time Clustering