Searching for extraterrestrial intelligence in our own genome

What?

In 2017, after narrowly missing out on a job at a genomic research company, a curious idea popped into my mind: what if someone had left us a signal in our own genome? I assume that the idea was a remnant of my favorite movie, 2001: A Space Odyssey, which (spoiler alert!) posits an alien intelligence modifying early humans to make them smarter, and then leaving us a signal that we could not discover until we managed to colonize the Moon (predicted, in that year before the first Moon landing, to happen by 2001).

Sally and I had never watched Ancient Aliens at that time, and I wasn’t even aware of a 2013 paper that had claimed to find an intelligent signal in our genome. It just seemed like an interesting idea.

I decided that the logic of some of the early SETI searches should also apply to this hypothesis: the simplest and most unambiguous way for an alien intelligence to leave us such a signal would be for them to encode the universal constant π into our genome somewhere, mapping each of the four bases to one of the four dibits (two-bit integers). There are 4! = 24 possible ways to perform this mapping, and you can read the genome in either direction, so there are 48 combinations of mapping and direction to check.
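The count of 48 can be checked with a few lines of C. The following is an illustrative sketch rather than the project's actual code: the function names `base_index`, `nth_mapping` and `encode`, the A, C, G, T index order, and the order in which the 24 assignments are numbered are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Index of a base in the (assumed) order A, C, G, T; -1 for anything else. */
int base_index(char b)
{
    switch (b) {
    case 'A': return 0;
    case 'C': return 1;
    case 'G': return 2;
    case 'T': return 3;
    default:  return -1;
    }
}

/* Fill map[] with the perm-th (0..23) assignment of the dibits 0..3
 * to the bases A, C, G, T, using the factorial number system. */
void nth_mapping(int perm, int map[4])
{
    static const int fact[4] = {6, 2, 1, 1};   /* 3!, 2!, 1!, 0! */
    int pool[4] = {0, 1, 2, 3};                /* dibits not yet assigned */
    int i, j;

    assert(perm >= 0 && perm < 24);
    for (i = 0; i < 4; i++) {
        int idx = perm / fact[i];
        perm %= fact[i];
        map[i] = pool[idx];
        for (j = idx; j < 3; j++)              /* remove pool[idx] */
            pool[j] = pool[j + 1];
    }
}

/* Translate n bases into n dibits under map[], reading left to right
 * (reverse == 0) or right to left (reverse != 0).
 * Returns 0 on success, -1 if a non-ACGT character (such as N) is met. */
int encode(const char *seq, size_t n, const int map[4], int reverse,
           unsigned char *out)
{
    size_t i;

    for (i = 0; i < n; i++) {
        int b = base_index(seq[reverse ? n - 1 - i : i]);
        if (b < 0)
            return -1;
        out[i] = (unsigned char)map[b];
    }
    return 0;
}
```

Looping perm over 0..23 and reverse over 0..1 visits all 48 combinations. For example, with perm 0 (A=0, C=1, G=2, T=3) read left to right, GATTACA becomes the dibits 2 0 3 3 0 1 0.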

I wrote some rudimentary programs to search for π in the reference human genome, and satisfied myself that the results were no more statistically significant than they would be for searching for a random string of bits rather than π. A nice idea, but no cigar.
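To give a feel for what "no more significant than a random string" means, here is a rough back-of-the-envelope sketch. It is not the analysis used in the project, and it rests on a loudly stated assumption: that bases are independent and uniformly random, which real genomes are not. The function names are hypothetical.

```c
#include <assert.h>

/* Expected number of chance hits for one specific k-base pattern among
 * n_positions starting positions and the given number of
 * mapping/direction combinations, ASSUMING independent, uniformly
 * random bases (a crude model; real genomes are neither). */
double expected_chance_hits(double n_positions, int k, int combinations)
{
    double e = n_positions * (double)combinations;
    int i;

    assert(k >= 0 && combinations > 0);
    for (i = 0; i < k; i++)
        e /= 4.0;          /* each extra base matches with probability 1/4 */
    return e;
}

/* Smallest pattern length k (in bases) at which fewer than one chance
 * hit is expected under the same assumption. */
int chance_match_length(double n_positions, int combinations)
{
    int k = 0;

    while (expected_chance_hits(n_positions, k, combinations) >= 1.0)
        k++;
    return k;
}
```

Taking a round illustrative figure of 3.1e9 positions and 48 combinations, `chance_match_length` returns 19. Under this crude model, a match of about 18 bases (36 bits) to π, or to any other fixed bit string, would be expected somewhere in the genome by chance alone, so only a substantially longer match would be interesting.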

In 2024 I was again thinking about this idea. I decided that my 2017 search was indicative, but not really comprehensive, for two reasons. First, by searching for as many successive bits of π as I could, I was not taking into account the possibility that some of the bases may have been corrupted in the reference human genome, either through mutation or because the reference genome is in some sense an “average” over all humans, which could be particularly problematic if the signal were contained in the “junk” (now called “non-coding”) parts of the genome. Second, I hadn’t even considered the real possibility that such an alien intelligence might have stripped off the leading 3 of π and just encoded the fractional part, which I would have missed completely.

I decided to relaunch this personal project and fix both of these flaws. To this end, in September 2024 Sally and I each purchased 100X Whole Genome Sequencing from Nebula Genomics, so that I could run my search against relatively accurate genomic sequences for at least two particular humans, rather than just on the reference genome. We received our data about six weeks later. I also decided to figure out a way to search for π that would be robust against a moderate number of leading bits being missing, and robust against some of the bits being corrupted by substitution mutations, but would still allow me to make an analysis of the likelihood that the results were statistically significant.
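One simple way to get both kinds of robustness is to slide a fixed-length window of π along the sequence, count mismatches instead of demanding an exact match, and repeat for several starting offsets into π. The sketch below illustrates that idea only; it is not the method used in the paper. The names `pi_bit` and `best_mismatches`, the choice to treat the leading 3 as the two bits 11, and the 66-bit limit are all assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>

#define PI_BITS 66   /* 2 integer bits + 64 fractional bits held below */

/* Bit i of pi written in binary as 11.00100100..., with i == 0 the
 * leading bit of the integer part 3. Fractional bits come from the
 * hexadecimal expansion 3.243F6A88 85A308D3... */
int pi_bit(size_t i)
{
    static const unsigned long frac[2] = {0x243F6A88UL, 0x85A308D3UL};
    size_t f;

    assert(i < PI_BITS);
    if (i < 2)
        return 1;
    f = i - 2;
    return (int)((frac[f / 32] >> (31 - f % 32)) & 1UL);
}

/* Slide len dibits of pi, starting skip bits into the expansion, along
 * dibits[0..n-1] and return the smallest number of mismatching dibits
 * (Hamming distance, i.e. substitutions only). *best_pos receives the
 * first window achieving it. Returns len + 1, leaving *best_pos
 * untouched, if the sequence is shorter than the pattern. */
size_t best_mismatches(const unsigned char *dibits, size_t n,
                       size_t skip, size_t len, size_t *best_pos)
{
    size_t best = len + 1, pos, k;

    assert(skip + 2 * len <= PI_BITS);
    for (pos = 0; pos + len <= n; pos++) {
        size_t miss = 0;
        /* stop early once this window can no longer beat the best */
        for (k = 0; k < len && miss < best; k++) {
            int want = 2 * pi_bit(skip + 2 * k) + pi_bit(skip + 2 * k + 1);
            if (dibits[pos + k] != want)
                miss++;
        }
        if (miss < best) {
            best = miss;
            *best_pos = pos;
        }
    }
    return best;
}
```

Here skip = 0 searches for π including its leading 3, skip = 2 searches for the fractional part only, and odd values of skip cover signals that start one bit out of step. Running this for each combination of mapping and direction, and for the same number of random bit strings of the same length, gives two sets of scores that can be compared directly.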

This project sat on the back burner until I had some time to kick it off in April 2025. Then, in June 2025, Sally and I happened to watch the 2018 episode of Ancient Aliens where the 2013 paper was discussed. In that instant I realized that my back-burner project was not just a personal academic curiosity, but could in fact provide an important test for an open and contentious scientific claim. I decided to immediately ramp up the project, and to release my results publicly, no matter what I found. I therefore wrote up almost all of the paper below, and almost all of this page, before actually performing the experiment.

To cut to the chase: I again found no evidence for such a signal in the human genome.

Paper

In August 2025 I had nearly finished writing up the paper, but I realized that I needed to refactor my code to remove the large placeholder blocks of N codes in the reference genome, as they were destroying my ability to perform a clean statistical analysis.
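One way to keep placeholder N blocks out of the statistics is to search only within maximal runs of real bases, so that no search window ever spans an N. The following sketch shows that splitting step under stated assumptions: the function names are hypothetical, only uppercase A, C, G and T count as bases, and this is not necessarily how the refactored project code handles it.

```c
#include <assert.h>
#include <stddef.h>

/* Non-zero if c is one of the four uppercase base codes. */
int is_acgt(char c)
{
    return c == 'A' || c == 'C' || c == 'G' || c == 'T';
}

/* Find the next maximal run of A/C/G/T characters in seq[from..n-1].
 * On success stores its start index in *start and returns its length;
 * returns 0 (leaving *start untouched) when only non-ACGT characters,
 * such as N, remain. */
size_t next_segment(const char *seq, size_t n, size_t from, size_t *start)
{
    size_t i = from, j;

    assert(from <= n);
    while (i < n && !is_acgt(seq[i]))     /* skip the N block */
        i++;
    j = i;
    while (j < n && is_acgt(seq[j]))      /* extend the run of bases */
        j++;
    if (j > i)
        *start = i;
    return j - i;
}
```

Calling `next_segment` repeatedly, each time passing the end of the previous run as `from`, walks through every N-free segment. The total number of usable starting positions is then the sum over segments, which is the figure a chance-match calculation needs.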

While refactoring that code, I took a small diversion into studying the raw sequence data itself. To avoid delaying this almost-complete project, I decided to put up an incomplete version of the paper:

In October 2025 I realized that I had incorrectly processed the reverse complement reads in the Nebula Genomics data. That mistake does not affect this project, because I explicitly search in both directions and for all possible mappings of bases to bits. But it does mean that I need to correct my codebase before completing the paper and making my codebase publicly available.
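The reason a reverse-complement mix-up cannot hide a signal from this kind of search is that complementing is just a swap of bases (A with T, C with G), so reading a reverse-complemented sequence under one mapping gives the same dibits as reading the original sequence backwards under another of the 24 mappings. The sketch below illustrates that equivalence; the function names and the A, C, G, T index order are assumptions of this example, not the project's code.

```c
#include <assert.h>
#include <stddef.h>

/* Watson-Crick complement of a base; anything else becomes N. */
char complement(char b)
{
    switch (b) {
    case 'A': return 'T';
    case 'C': return 'G';
    case 'G': return 'C';
    case 'T': return 'A';
    default:  return 'N';
    }
}

/* Reverse-complement seq[0..n-1] in place. */
void reverse_complement(char *seq, size_t n)
{
    size_t i;

    assert(seq != NULL || n == 0);
    for (i = 0; i < n / 2; i++) {
        char tmp = complement(seq[i]);
        seq[i] = complement(seq[n - 1 - i]);
        seq[n - 1 - i] = tmp;
    }
    if (n % 2 == 1)                       /* middle base of an odd length */
        seq[n / 2] = complement(seq[n / 2]);
}

/* Given a base-to-dibit mapping in A, C, G, T index order, build the
 * mapping that gives each base the dibit of its complement. */
void complement_mapping(const int map[4], int out[4])
{
    out[0] = map[3];   /* A takes T's dibit */
    out[1] = map[2];   /* C takes G's dibit */
    out[2] = map[1];   /* G takes C's dibit */
    out[3] = map[0];   /* T takes A's dibit */
}

/* Dibit assigned to base b under map[], or -1 for a non-ACGT character. */
int dibit_of(char b, const int map[4])
{
    switch (b) {
    case 'A': return map[0];
    case 'C': return map[1];
    case 'G': return map[2];
    case 'T': return map[3];
    default:  return -1;
    }
}
```

For any sequence s of length n with reverse complement rc, `dibit_of(rc[i], map)` equals `dibit_of(s[n - 1 - i], cmap)`, where `cmap` is the output of `complement_mapping`. A search over both directions and all 24 mappings therefore sees the same dibit strings whether or not a read was reverse-complemented.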

I intend to complete the above paper using the corrected codebase some time in 2026.

Code

I performed the experiments described in the paper above using simple code that I wrote in ANSI C. As noted above, in October 2025 I realized that I had incorrectly processed the Nebula Genomics data, and needed to fix my codebase. I plan to release that codebase here some time in 2026.

Disclaimers

This page describes personal hobby research that I have undertaken since 2017. All opinions expressed herein are mine alone. All code provided here is from my personal codebase, and is supplied under the MIT-0 License.