quines

The basic idea is this: It is impossible (in most programming languages) for a program to manipulate itself (i.e. its textual representation — or a representation from which its textual representation can be easily derived) directly.

To make this possible anyway, we build the program from two parts, one which we call the code and one which we call the data. The data represents (the textual form of) the code, and it is derived from it in an algorithmic way (mostly by putting quotation marks around it, but sometimes in a slightly more complicated way). The code uses the data to print the code (which is easy because the data represents the code); then it uses the data to print the data (which is possible because the data is obtained by an algorithmic transformation from the code).
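To make the two parts concrete, here is a minimal sketch in Python (our choice of language for illustration; the names DATA, build_quine and run are hypothetical helpers, not part of any standard quine). The data is a template for the code, and the "algorithmic transformation" is Python's repr quoting, requested by the %r format directive. Rather than trust the construction, the sketch builds the two-line program, runs it, and checks that it prints itself.

```python
import contextlib
import io

# The "data": a textual template of the whole program. The %r marks the
# place where the data itself must reappear in quoted form, and %% stands
# for a literal percent sign.
DATA = 'data = %r\nprint(data %% data)'


def build_quine(data: str) -> str:
    """Return the program text: the quoted data followed by the code."""
    return data % data


def run(source: str) -> str:
    """Execute a program given as text and return everything it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


program = build_quine(DATA)
output = run(program)

print(program)
# print() appends one newline, so the output is the source plus "\n".
print(output == program + "\n")  # prints True
```

The generated program has exactly two lines: the first assigns the data, the second is the code that prints the data twice, once as is and once quoted.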

This idea is summarized by the sentence “quine ‘quine’”. Here, the verb to quine (invented by Douglas R. Hofstadter) means “to write (a sentence fragment) a first time, and then to write it a second time, but with quotation marks around it” (for example, if we quine “say”, we get “say ‘say’”). Thus, if we quine “quine”, we get “quine ‘quine’”, so that the sentence “quine ‘quine’” is a quine… In this linguistic analogy, the verb “to quine” plays the role of the code, and “quine” in quotation marks plays the role of the data.
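The verb can itself be written as a small function. The following Python sketch is only an illustration of the definition just given; the function name quine and the choice of single curly quotation marks are ours.

```python
def quine(fragment: str) -> str:
    """Write the fragment once, then a second time inside quotation marks."""
    return f"{fragment} ‘{fragment}’"


print(quine("say"))    # say ‘say’
print(quine("quine"))  # quine ‘quine’
```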

We will henceforth use the words “code” and “data” a lot, to designate the code and data parts of the quine as just explained.

If we are to take an analogy with cellular biology (thanks to Douglas Hofstadter again), what I have called the “code” would be the cell, and the “data” would be the cell's DNA: the cell is able to create a new cell using the DNA, and this involves, among other things, replicating the DNA itself. So the DNA (the data) contains all the necessary information for the replication, but without the cell (the code), or at least some other code to make the data live, it is a useless, inert piece of data.

Note how the data may contain (depending on how it's interpreted) bits that aren't used to write the code, but are still copied when the data is written on the output. Such bits are called introns, in analogy with the parts of the genetic code which aren't used to produce proteins. The example we gave above had an intron (the string sx), clearly marked as such. Quite obviously an intron can be modified with great ease; it is a kind of subliminal information that is reproduced with the quine, although it is not necessary to the quine. The possible existence of introns will be the key feature making multi-quines (something we will talk about later) possible.
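As a rough illustration of an intron, here is a Python sketch (the names TEMPLATE, build_quine and run are hypothetical, and this is not the example with the string sx mentioned above). The data is a pair: its first component is the template used to write the code, and its second component is an intron that the code never reads but that is copied along whenever the data is printed.

```python
import contextlib
import io

# First component of the data: the template for the code. The code only
# ever reads data[0]; the rest of the data is carried along untouched.
TEMPLATE = 'data = %r\nprint(data[0] %% (data,))'


def build_quine(intron: str) -> str:
    """Return a quine whose data carries the given intron."""
    data = (TEMPLATE, intron)
    return TEMPLATE % (data,)


def run(source: str) -> str:
    """Execute a program given as text and return everything it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


first = build_quine("hello")
second = build_quine("any other text")

# Both programs reproduce themselves (print() adds the final newline),
# although they differ in their intron.
print(run(first) == first + "\n")    # prints True
print(run(second) == second + "\n")  # prints True
print(first != second)               # prints True
```

Changing the intron yields a different program that is still a quine, which is the sense in which an intron can be modified with great ease.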

via madore.org.