Friday, July 28, 2006

perl's intermediate form

Quick: what does perl convert Perl into before running it? A syntax tree, or a series of bytecodes?

The answer is mu, which according to my interpretation of Pirsig's explanation in ZMM means "your question assumes an answer space which is too narrow". Actually, the intermediate form is a tree of bytecodes! Read `perldoc perlhack` and `perldoc perlguts` and see for yourself. The tree comes in handy for modeling hierarchies of lexical scope: just propagate context downward from parent to child. It also comes in handy for constant folding, i.e., replacing an operation with its result at compile time: replace the operation node with a constant node, and eliminate all of its children. After the optimization steps, perl does not walk the tree recursively. It executes a node and then follows that node's pointer to the next node to execute, regardless of where each bytecode sits in the tree. The bytecodes themselves are just indexes into a table of the C implementations of each Perl operation.
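To make the idea concrete, here is a toy sketch in Python of a "tree of bytecodes": nodes carry an opcode that indexes a dispatch table, constant folding rewrites the tree, and execution follows a per-node `next` pointer instead of recursing. Everything in it (the `Op` class, the opcode numbers, the `pp_*` functions, `fold`, `link`, `run`) is made up for illustration and is not perl's actual data structure or API; it also assumes a simple post-order execution order and a value stack.

```python
# Hypothetical opcodes: indexes into DISPATCH below (not perl's real opcode numbers).
OP_CONST, OP_ADD, OP_MUL, OP_VAR = range(4)


class Op:
    """One node of the toy op tree."""

    def __init__(self, opcode, children=(), value=None):
        self.opcode = opcode            # index into DISPATCH
        self.children = list(children)  # tree structure, used while compiling
        self.value = value              # constant value or variable name
        self.next = None                # execution-order successor, set by link()


# One implementation per opcode; each works on a shared value stack.
def pp_const(op, stack, env):
    stack.append(op.value)


def pp_add(op, stack, env):
    right = stack.pop()
    left = stack.pop()
    stack.append(left + right)


def pp_mul(op, stack, env):
    right = stack.pop()
    left = stack.pop()
    stack.append(left * right)


def pp_var(op, stack, env):
    stack.append(env[op.value])


# The "bytecode" is just a position in this table.
DISPATCH = [pp_const, pp_add, pp_mul, pp_var]


def fold(op):
    """Constant folding: replace an arithmetic node whose children are all
    constants with a single constant node, dropping the children."""
    op.children = [fold(child) for child in op.children]
    if op.opcode in (OP_ADD, OP_MUL) and all(
        child.opcode == OP_CONST for child in op.children
    ):
        stack = [child.value for child in op.children]
        DISPATCH[op.opcode](op, stack, {})   # run the op once, at "compile time"
        return Op(OP_CONST, value=stack.pop())
    return op


def link(root):
    """Thread the tree in post-order (operands before operators) and
    return the first op to execute."""
    order = []

    def visit(op):
        for child in op.children:
            visit(child)
        order.append(op)

    visit(root)
    for current, following in zip(order, order[1:]):
        current.next = following
    return order[0]


def run(first, env):
    """Execute by chasing next pointers; the tree shape no longer matters."""
    stack = []
    op = first
    while op is not None:
        DISPATCH[op.opcode](op, stack, env)
        op = op.next
    return stack.pop()


# x * (2 + 3): the addition is folded to the constant 5 before running.
tree = Op(OP_MUL, [
    Op(OP_VAR, value="x"),
    Op(OP_ADD, [Op(OP_CONST, value=2), Op(OP_CONST, value=3)]),
])
folded = fold(tree)
first = link(folded)
result = run(first, {"x": 4})
```

In this sketch the tree exists only for the compile-time passes; once `link` has set the `next` pointers, `run` never looks at `children` again, which is the property described above.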

Say what you like about Perl's syntax and semantics, but some very smart people have worked on the implementation; its regular expression engine alone has a reputation for being without equal. A good related link is Squawks of the Parrot, the blog of Dan Sugalski, who worked on Parrot, the interpreter/VM intended for Perl 6 but also for any other similar language. His archives are well categorized, and he wrote about such high-level concepts as continuations, co-routines, and garbage collection from the perspective of a VM designer. Explanations at that level are pretty fascinating, especially when Dan writes so clearly. If you've ever programmed in assembly at all, you should be able to understand. (Actually, since he's often writing about the Parrot VM and not something like x86, it may be easier to comprehend!)
