Molecular computing is an emerging field to which chemistry, biophysics, molecular biology, electronic engineering, solid-state physics and computer science all contribute. It involves encoding, manipulating and retrieving information at the macromolecular level, in contrast to current techniques, which accomplish these functions through the miniaturization of bulk semiconductor devices into integrated circuits. Biological systems have unique abilities such as pattern recognition, learning, self-assembly and self-reproduction, as well as high-speed, parallel information processing. The aim of this article is to show how these characteristics can be exploited to build computing systems with many advantages over their inorganic (Si, Ge) counterparts.
DNA computing began in 1994, when Leonard Adleman demonstrated that it was feasible by solving an instance of a real problem, the Hamiltonian Path Problem (a close relative of the Traveling Salesman Problem), with a molecular computer. In theoretical terms, some scientists argue that the actual beginnings of DNA computation should be attributed to Charles Bennett's work. Adleman, now considered the father of DNA computing, is a professor at the University of Southern California and launched the field with his paper "Molecular Computation of Solutions to Combinatorial Problems." Since then, he has demonstrated how the massive parallelism of a trillion DNA strands can simultaneously attack different parts of a computation to crack even the toughest combinatorial problems.
Adleman's Hamiltonian Path Problem:
The objective is to find a path from the start vertex to the end vertex that passes through every other vertex exactly once. This problem is difficult for conventional computers because it is NP-complete: as the number of vertices grows, the number of candidate paths explodes, making large instances intractable for sequential machines, but well suited to massively parallel approaches such as DNA computing. Adleman chose the Hamiltonian Path Problem because it is a well-known hard problem.
The following algorithm solves the Hamiltonian Path problem:
1. Generate random paths through the graph.
2. Keep only those paths that begin with the start city (A) and conclude with the end city (G).
3. If the graph has n cities, keep only those paths with n cities (here n = 7).
4. Keep only those paths that enter all cities at least once.
5. Any remaining paths are solutions.
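The five steps above can be sketched in software. The following minimal Python simulation (the graph below is a hypothetical seven-city instance chosen for illustration, not Adleman's actual graph) generates all edge-consistent paths and then applies the filters in turn:

```python
from itertools import product

# Hypothetical directed graph on seven cities A..G (an assumed example,
# not Adleman's original instance).
cities = "ABCDEFG"
edges = {("A", "B"), ("A", "D"), ("B", "C"), ("B", "E"), ("C", "D"),
         ("C", "G"), ("D", "E"), ("E", "F"), ("F", "G")}
start, end = "A", "G"
n = len(cities)

def candidate_paths(length):
    """Step 1: enumerate all edge-consistent paths of a given length
    (a DNA computer generates these massively in parallel)."""
    for p in product(cities, repeat=length):
        if all((p[i], p[i + 1]) in edges for i in range(length - 1)):
            yield p

solutions = [
    p for p in candidate_paths(n)          # step 3: exactly n cities
    if p[0] == start and p[-1] == end      # step 2: correct endpoints
    and set(p) == set(cities)              # step 4: every city entered
]                                          # step 5: what remains is a solution
```

On this example graph the filters leave exactly one Hamiltonian path, A-B-C-D-E-F-G. The sequential enumeration here is precisely what a conventional computer must do; the point of Adleman's experiment is that DNA performs step 1 for all candidates at once.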
The key was using DNA to perform the five steps of the above algorithm. Adleman's first step was to synthesize DNA strands of known sequences, each strand 20 nucleotides long. He represented each of the seven vertices of the graph by a separate strand, and further represented each edge between two vertices, such as vertex 1 to vertex 2, by a DNA strand consisting of the last ten nucleotides of the strand representing vertex 1 followed by the first ten nucleotides of the vertex-2 strand. Then, through the sheer number of DNA molecules (3×10^13 copies for each edge in this experiment!) joining together in all possible combinations, many random paths were generated. Adleman used well-established techniques of molecular biology to weed out the Hamiltonian path, the one that entered all vertices, starting at the start vertex and ending at the end vertex. After generating the numerous random paths in the first step, he used the polymerase chain reaction (PCR) to amplify and keep only the paths with the correct start and end vertices. The next two steps kept only those strands that passed through all seven vertices, entering each vertex at least once. At this point, any paths that remained coded for a Hamiltonian path, thus solving the problem.
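The encoding can be illustrated in a few lines of Python. The sequences below are randomly generated stand-ins (not Adleman's actual oligonucleotides), but they show the scheme: an edge strand overlaps the second half of one vertex strand and the first half of the next, so the reverse complement of a vertex strand acts as a splint that holds two consecutive edge strands together for ligation:

```python
import random

random.seed(0)  # illustrative sequences only, not Adleman's actual oligos
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def rev_comp(s):
    """Watson-Crick reverse complement: the strand that hybridizes to s."""
    return s.translate(COMPLEMENT)[::-1]

# Assign each vertex a random 20-nucleotide strand.
vertices = ["v1", "v2", "v3"]
seq = {v: "".join(random.choice("ACGT") for _ in range(20)) for v in vertices}

def edge_strand(u, v):
    """Edge u->v: last 10 nt of u's strand + first 10 nt of v's strand."""
    return seq[u][10:] + seq[v][:10]

# Concatenating the edges v1->v2 and v2->v3 places v2's full 20-nt
# sequence across the junction, so the reverse complement of v2's strand
# base-pairs across both edge strands and splints them for ligation.
path = edge_strand("v1", "v2") + edge_strand("v2", "v3")
splint = rev_comp(seq["v2"])
assert path[10:30] == seq["v2"]       # v2's strand spans the junction
assert rev_comp(splint) == seq["v2"]  # the splint pairs with exactly that span
```

This overlap construction is why random mixing alone suffices for step 1: any two edge strands that share a vertex can be joined, so every path through the graph self-assembles somewhere in the tube.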
BIOMOLECULAR COMPUTING
Biomolecular computing, ‘computation performed by biomolecules’, is challenging traditional approaches to computation both theoretically and technologically. Often placed within the wider context of ‘natural’ or even ‘unconventional’ computing, the study of natural and artificial molecular computations is adding to our understanding of both biology and computer science, well beyond the framework of neuroscience. The papers in this special theme document only a part of an increasing involvement of Europe in this far-reaching undertaking. In this introduction, I wish to outline the current scope of the field and assemble some basic arguments that biomolecular computation is of central importance to both computer science and biology. Readers will also find arguments for not dismissing DNA Computing as limited to exhaustive search, and for a qualitatively distinctive advantage over all other types of computation, including quantum computing.
The idea that molecular systems can perform computations is not new and was indeed more natural in the pre-transistor age. Most computer scientists know of von Neumann’s discussions of self-reproducing automata in the late 1940s, some of which were framed in molecular terms. Here the basic issue was that of bootstrapping: can a machine construct a machine more complex than itself?
Equally important was the idea, which appears less natural in the current age of a strict dichotomy between hardware and software, that the computations of a device can alter the device itself. This vision is natural at the scale of molecular reactions, although it may appear utopian to those running huge chip-production facilities. Alan Turing also looked beyond purely symbolic processing to natural bootstrapping mechanisms in his work on self-structuring in molecular and biological systems. Purely chemical computers have been proposed by Ross and Hjelmfelt, extending this approach. In biology, the idea of molecular information processing took hold starting from the unraveling of the genetic code and translation machinery, and extended to genetic regulation, cellular signaling, protein trafficking, morphogenesis and evolution, all of this independently of developments in the neurosciences. For example, because of the fundamental role of information processing in evolution, and the ability to address these issues on laboratory time scales at the molecular level, I founded the first multi-disciplinary Department of Molecular Information Processing in 1992. In 1994 came Adleman's key experiment demonstrating that the tools of laboratory molecular biology could be used to program computations with DNA in vitro. The huge information storage capacity of DNA and the low energy dissipation of DNA processing led to an explosion of interest in massively parallel DNA Computing. For serious proponents of the field, however, there was never a claim that brute-force search with DNA could cope indefinitely with the exponential growth in the number of alternative solutions. In a new field, one starts with the simplest algorithms and proceeds from there: as a number of contributions and patents have shown, DNA Computing is not limited to simple algorithms or even, as we argue here, to a fixed hardware configuration.
After 1994, universal computation and complexity results for DNA Computing rapidly ensued (recent examples of ongoing projects here are reported in this collection by Rozenberg and Csuhaj-Varju). The laboratory procedures for manipulating populations of DNA were formalized and new sets of primitive operations proposed: the connection with recombination and so-called splicing systems was particularly interesting, as it strengthened the view of evolution as a computational process. Essentially, three classes of DNA Computing are now apparent: intramolecular, intermolecular and supramolecular. Cutting across this classification, DNA Computing approaches can be distinguished as either homogeneous (ie well stirred) or spatially structured (including multi-compartment or membrane systems, cellular DNA computing and dataflow-like architectures using microstructured flow systems), and as either in vitro (purely chemical) or in vivo (ie inside cellular life forms). Approaches differ in their level of programmability, automation, generality and parallelism (eg SIMD vs MIMD) and in whether the emphasis is on achieving new basic operations, new architectures, error tolerance, evolvability or scalability. The Japanese project led by Hagiya focuses on intramolecular DNA Computing, constructing programmable state machines in single DNA molecules which operate by means of intramolecular conformational transitions. Intermolecular DNA Computing, of which Adleman's experiment is an example, is still the dominant form, focusing on the hybridization between different DNA molecules as a basic step of computation; this is common to the three projects reported here that have an experimental component (McCaskill, Rozenberg and Amos). Beyond Europe, the group at the University of Wisconsin is prominent in exploiting a surface-based approach to intermolecular DNA Computing using DNA chips.
Finally, supramolecular DNA Computing, as pioneered by Erik Winfree, harnesses the self-assembly of rigid DNA molecules with different sequences to perform computations. The connection with nanomachines and nanosystems is clear and will become more pervasive in the near future.