Submit your pics for ‘forever’ storage in DNA

Researchers are looking to collect 10,000 original images from around the world to preserve them indefinitely in synthetic DNA, which holds promise as a revolutionary storage medium that lasts much longer and is many orders of magnitude denser than current technologies.

What would you pick? A picture of your family, an endangered landscape, a page of poetry, or a snapshot that sends a message to the future?

The team has already encoded important compositions in DNA molecules, including the Universal Declaration of Human Rights, the top 100 books of Project Gutenberg, songs from the Montreux Jazz Festival, and an OK Go video.

The #MemoriesInDNA Project invites the public to submit original photographs that they’d like to see preserved in DNA for millennia. The images—which you can upload at the project website—will be encoded in synthetic DNA and made available to researchers worldwide. The researchers also are encouraging people to share their images on social media with the hashtag #MemoriesInDNA and include a story about why the photograph or video is important to them.

“It’s your turn to show us what should be preserved in DNA forever,” says Luis Ceze, professor in the University of Washington’s Paul G. Allen School of Computer Science & Engineering. “We want people to go out and take a picture of something that they want the world to remember—it’s a fun opportunity to send a message to future generations and help our research in the process.”

DNA data storage has emerged as a potential solution to bridge the growing gap between the amount of digital data generated today—by everything from commercial video to space imagery to medical records—and our ability to affordably and efficiently store that data.

Unlike data centers, which require acres of land and account for nearly 2 percent of the total electricity consumption in the United States, DNA molecules can store information millions of times more compactly. The basic process converts the strings of ones and zeroes in digital data into the four basic building blocks of DNA sequences—adenine, guanine, cytosine, and thymine. It employs synthetic DNA molecules created in a lab, not living DNA.
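As a rough illustration of that conversion, the sketch below maps every two bits to one base. The mapping table and the function names are assumptions made for this example; the article does not describe the team's actual encoding, and practical schemes typically add error correction and avoid long runs of the same base.

```python
# Hypothetical 2-bits-per-base table (an assumption, not the team's actual code).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Reverse of encode: read two bits per base and regroup into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Each byte becomes four bases under this toy mapping, so the round trip is lossless as long as the strand is read back without errors.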

The team, made up of University of Washington computer scientists and electrical engineers and Microsoft researchers in the Molecular Information Systems Lab, works with Twist Bioscience, the manufacturer of the synthetic DNA, and holds the current world record for the amount of data stored in DNA. So far they have been able to encode photographic images and video in DNA, then retrieve those individual molecular “files” and convert them back into digital data.

Their next challenge involves exploring how to perform meaningful data processing directly in DNA—without having to convert the images back into their electronic form.

“Let’s suppose you have a trillion images encoded in DNA and want to find all the photographs that have a red car in them, or to find out whether a person’s face exists in those images,” says Ceze. “We want to be able to do that information processing in DNA directly—to search in a smart way and make the molecules themselves carry out that computer vision work.”

The team will encode approximately 10,000 of the crowdsourced images in manufactured snippets of DNA. The researchers’ approach to searching images directly in DNA relies on the fact that certain nucleotides stick to others—A binds to T and C binds to G.

They can introduce into the solution strands of DNA that carry a coded “query”: essentially, a string of complementary DNA that binds to every photograph with a red car, certain facial features, or whatever else meets the criteria of the query. By attaching magnetic nanoparticles to the query DNA, they can use a magnet to pull out all the similar images that have stuck to it.
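The pairing rule behind such a query (A with T, C with G) can be sketched in a few lines. Everything specific here is hypothetical: the feature tag `GATTACA`, the strand contents, and the exact-match test in `binds` are stand-ins for illustration. Real strands bind approximately rather than exactly, and how to encode visual features so that similar images bind is the open research question the article describes.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand: str) -> str:
    """Sequence that pairs base-by-base with `strand` (read in the opposite direction)."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def binds(query: str, strand: str) -> bool:
    """Simplified model: the query sticks if its exact complement occurs in the strand."""
    return reverse_complement(query) in strand


def pull_out(query: str, pool: dict[str, str]) -> list[str]:
    """Names of all strands in the pool that the query binds to."""
    return [name for name, strand in pool.items() if binds(query, strand)]


# Hypothetical pool: each image strand starts with a made-up feature tag.
RED_CAR_TAG = "GATTACA"
pool = {
    "photo_1": RED_CAR_TAG + "CCGGAA",
    "photo_2": "CTTGGAC" + "CCGGAA",
}

query = reverse_complement(RED_CAR_TAG)
print(query)                  # TGTAATC
print(pull_out(query, pool))  # ['photo_1']
```

In this toy model the magnet step corresponds to `pull_out`: only strands carrying the tag complementary to the query are retrieved.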

“Having a set of diverse images from around the world will help us invent new ways to make molecules work with each other to carry out these computations directly,” says Microsoft partner architect and collaborator Douglas Carmean.

The team will employ machine learning to devise methods to map and encode all the visual features contained in a photograph—such as colors, curves, lines, and objects—in DNA. The main challenge is doing that in a way that allows scientists to extract similar things and perform meaningful data processing.

“We will use neural networks to explore ways to classify visual patterns in the images and video that we encode in DNA,” says Georg Seelig, associate professor of electrical engineering and in the Allen School. “For example, are there more red cars than blue cars in a photograph? Or are there people riding bicycles?”

“With proof-of-concept achieved for DNA as a digital data storage media, we are working to drive down the cost of synthesizing DNA to enable its potential as a widely-available commercial solution for the growing body of precious data in digital format, such as archival data, financial, and health record backups, and all long-term data retention where current media is not practical,” says Emily M. Leproust, CEO of Twist Bioscience. “MemoriesInDNA is a fabulous project to showcase the technological, scientific and cultural importance of DNA worldwide and we look forward to our role in this historic event.”

#MemoriesInDNA will provide an important library of images to be encoded in a separately funded project that the Defense Advanced Research Projects Agency (DARPA) Molecular Informatics program supports.

To be included in the DNA image collection, photographs cannot be under copyright by any other party and must be free of violent or inappropriate content. The image dataset will be preserved in DNA indefinitely and shared with researchers worldwide. For more details about how to upload and share images, visit the #MemoriesInDNA Project website.

Source: University of Washington