Ruby has anonymous code blocks; Python doesn't.
Anonymous code blocks are (apparently) an important feature in implementing DSLs, much touted by Ruby protagonists.
As far as I can tell, the major difference between code blocks in Ruby and functions in Python is that code blocks are executed in the current scope. You can rebind local variables in the scope in which the code block is executed. Python functions have a lexical scope, with the exception that you can't rebind variables in the enclosing scope.
Note
It turns out that this is wrong. Ruby code blocks are lexically scoped like Python functions. This article is really an exploration of dynamic scoping.
If you define a function inside a function or method which uses the variable 'x', this will be loaded from the scope in which the function was defined; not the scope in which it is executed. This is enormously useful, but perhaps not always the desired behaviour. If a function assigns to the variable 'x' this will always be inside the scope of the function and not affect the scope the function was defined in or executed in.
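A small sketch of both rules, written for Python 3. The names make_reader, make_writer and call_elsewhere are made up for the illustration.

```python
def make_reader():
    x = "defined here"

    def reader():
        # 'x' is loaded from the scope where reader was defined.
        return x

    return reader


def make_writer():
    x = "original"

    def writer():
        # Assignment creates a new local 'x'; the enclosing 'x' is untouched.
        x = "changed inside writer"
        return x

    inner_result = writer()
    return inner_result, x


def call_elsewhere(func):
    x = "defined at the call site"  # never seen by func
    return func()


print(call_elsewhere(make_reader()))  # defined here
print(make_writer())                  # ('changed inside writer', 'original')
```

The reader ignores the 'x' at the call site, and the writer leaves the enclosing 'x' alone. Those are the two behaviours the rest of the article tries to get around.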
I thought it would be fun to try and implement this feature of anonymous code blocks for Python, using code objects. This should be a fun way to learn more about the implementation of Python scoping rules by experimenting with byte-code. If this sounds like it's a hack, then it's only because it is. It is interesting to note, however, that Aspect Oriented Programming is a well-accepted technique in Java, and is mainly implemented at the bytecode level.
This article looks at the byte-code operations used in code objects and experiments with creating new ones. Although the details of the byte-codes are shown, no great technical knowledge should be needed to follow the article.
Python doesn't have code blocks. It does have code objects. These can be executed in the current scope, but they are inconvenient to create inside a program. The code must be stored as a string, compiled and then executed.
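As a rough illustration of those three steps (string, compile, execute), here is a sketch in Python 3 syntax. The dictionary named scope is an assumption standing in for "the current scope", so that the example behaves the same wherever it is run.

```python
import types

# Step 1: the code has to live in a string.
source = "y = x + 1\nx = 'rebound'\n"

# Step 2: compile it into a code object.
code = compile(source, "<block>", "exec")
print(isinstance(code, types.CodeType))  # True

# Step 3: execute it. Names are looked up and bound in the namespace we pass in.
scope = {"x": 41}
exec(code, scope)
print(scope["y"], scope["x"])  # 42 rebound
```

Note that the code both reads 'x' from the namespace and rebinds it there, which is the behaviour a function body can't give us.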
Functions store a code object representing the body of the function as the func_code attribute. For a reference on function attributes, see the function type. The byte-code contains instructions telling the interpreter how to load and store values. It is the combination of the function attributes and the byte-code, including code object attributes, that implements the scoping rules.
You can't just execute the code object of a function:
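A minimal sketch of the failure, assuming Python 3 (where func_code is spelled `__code__`). The function names are invented for the example, and the exact wording of the error differs between interpreter versions.

```python
def outer():
    z = 3

    def inner():
        return z  # 'z' comes from the enclosing scope

    return inner


inner = outer()
code = inner.__code__  # func_code in Python 2

try:
    exec(code)
except TypeError as error:
    # exec has no enclosing scope to supply 'z' from, so it refuses.
    failure = error
    print("exec refused:", error)
else:
    failure = None
```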
The co_freevars attribute of the code object contains the names of the variables from the enclosing scope used by the code object. There are various other attributes, like co_varnames, which tell the interpreter how to load names. For a reference on code objects, see: Code Objects (Unofficial Reference Wiki).
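To make those attributes concrete, here is a short Python 3 sketch that inspects a nested function. The functions outer and inner are hypothetical.

```python
def outer():
    z = 3

    def inner(x):
        y = 1
        return x + y + z

    return inner


code = outer().__code__

print(code.co_freevars)  # ('z',)      names taken from the enclosing scope
print(code.co_varnames)  # ('x', 'y')  arguments first, then other locals
print(code.co_argcount)  # 1
```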
Code objects are immutable, or at least the interesting attributes are read only, so we can't just change the attributes we are interested in.
We can create new code objects. The documentation doesn't seem to encourage this, though.
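For readers on a modern interpreter, the following sketch shows both points: the attributes are read-only, but a modified copy can be made. It assumes Python 3.8 or later, where code objects have a replace() method. This is not the approach the article goes on to use, and the function greet is invented for the example.

```python
import types


def greet():
    return "hello"


original = greet.__code__

# Direct assignment is refused because the attribute is read-only.
try:
    original.co_name = "renamed"
except (AttributeError, TypeError) as error:
    print("read-only:", error)

# replace() returns a *new* code object with the given fields changed.
renamed_code = original.replace(co_name="renamed")

# Wrap the new code object in a function so it can be called.
renamed = types.FunctionType(renamed_code, globals())
print(renamed_code.co_name, renamed())
```

The original code object is left exactly as it was. Only the copy carries the new name.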
In order to implement code blocks I would like to take the code objects from a function and transform them into ones which can be executed in the current scope.
There is an interesting recipe which transforms bytecodes and creates new code objects in this way: Implementing the make statement by hacking bytecodes.
Luckily there is an easier way.
There is a great module called Byteplay. This lets you manipulate byte-codes and create new code objects. Ideal for my purposes.
It is also great for exploring byte-codes. Let's see what the byte-code looks like for some functions. The Python Byte Code Instructions comes in handy here.
The following Python creates three code blocks and uses Byteplay to print out the names of the byte-code operations. The three code blocks come from a function which is defined in the global scope, the same code (without the argument 'x') compiled from a string in the global scope, and a function defined inside another function.
This prints out the following (you don't need to read it all):
From Function:
[(SetLineno, 6), (LOAD_CONST, 1), (STORE_FAST, 'y'), (SetLineno, 7), (LOAD_FAST, 'x'), (PRINT_ITEM, None), (PRINT_NEWLINE, None), (SetLineno, 8), (LOAD_FAST, 'y'), (PRINT_ITEM, None), (PRINT_NEWLINE, None), (SetLineno, 9), (LOAD_GLOBAL, 'z'), (PRINT_ITEM, None), (PRINT_NEWLINE, None), (LOAD_CONST, None), (RETURN_VALUE, None)]

From current scope:
[(SetLineno, 2), (LOAD_CONST, 1), (STORE_NAME, 'y'), (SetLineno, 3), (LOAD_NAME, 'y'), (PRINT_ITEM, None), (PRINT_NEWLINE, None), (SetLineno, 4), (LOAD_NAME, 'z'), (PRINT_ITEM, None), (PRINT_NEWLINE, None), (LOAD_CONST, None), (RETURN_VALUE, None)]

Code defined in another scope, using a local rather than a global.
[(SetLineno, 66), (LOAD_CONST, 1), (STORE_FAST, 'y'), (SetLineno, 67), (LOAD_FAST, 'x'), (PRINT_ITEM, None), (PRINT_NEWLINE, None), (SetLineno, 68), (LOAD_FAST, 'y'), (PRINT_ITEM, None), (PRINT_NEWLINE, None), (SetLineno, 69), (LOAD_DEREF, 'z'), (PRINT_ITEM, None), (PRINT_NEWLINE, None), (LOAD_CONST, None), (RETURN_VALUE, None)]
In summary, this tells us:
Store a local variable: STORE_FAST
Load an argument: LOAD_FAST
Load a variable local to function: LOAD_FAST
Load a global: LOAD_GLOBAL
Load a value from the enclosing scope: LOAD_DEREF
Load a value from the same scope: LOAD_NAME
Store a value in the same scope: STORE_NAME
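If you don't have Byteplay to hand, much the same distinctions can be seen with the standard library's dis module. The sketch below assumes Python 3, so print is a function call rather than PRINT_ITEM, and newer CPython releases may fuse or rename some of the FAST instructions. The function and variable names are made up for the illustration.

```python
import dis

z = "global z"


def from_function(x):
    y = 1
    print(x)
    print(y)
    print(z)


# The same body (without the argument 'x') compiled from a string.
from_string = compile("y = 1\nprint(y)\nprint(z)\n", "<string>", "exec")


def outer():
    z = "enclosing z"

    def nested(x):
        y = 1
        print(x)
        print(y)
        print(z)  # 'z' now comes from the enclosing function

    return nested


def opnames(code):
    # Collect just the operation names, in order.
    return [instruction.opname for instruction in dis.get_instructions(code)]


for label, code in [
    ("From function:", from_function.__code__),
    ("From current scope:", from_string),
    ("Defined in another scope:", outer().__code__),
]:
    print(label)
    print(opnames(code))
```

The exact listings vary by interpreter version, but LOAD_GLOBAL shows up for the function, LOAD_NAME and STORE_NAME for the compiled string, and LOAD_DEREF for the nested function.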
So in order to rescope a code block to execute in the current scope, we need to transform LOAD_FAST and LOAD_DEREF into LOAD_NAME, and STORE_FAST and STORE_DEREF (which we haven't seen here) into STORE_NAME.
The Byteplay module allows us to iterate over the opcodes. It stores them as a list of tuples. Because lists are mutable we can replace the byte-codes we are interested in.
The Byteplay module also has a dictionary called opmap, which is a mapping of opcode names to their symbolic values.
At the start of the function AnonymousCodeBlock we use Code.from_code to turn the function byte-code object into a Byteplay object. By the end, so far, we have a list newBytecode which holds our transformed bytecode.
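The listing-level part of that transformation can be sketched with the standard library alone. This is not Byteplay and not the article's AnonymousCodeBlock: it only rewrites a symbolic list of (name, argument) tuples obtained from dis, and it does not assemble a new code object. The names RESCOPE, symbolic_listing and rescope are hypothetical, and newer CPython versions may emit fused instructions that this simple table does not cover.

```python
import dis

# The substitutions described above: fast/deref access becomes name access.
RESCOPE = {
    "LOAD_FAST": "LOAD_NAME",
    "LOAD_DEREF": "LOAD_NAME",
    "STORE_FAST": "STORE_NAME",
    "STORE_DEREF": "STORE_NAME",
}


def symbolic_listing(code):
    # A list of (operation name, argument) tuples, similar in spirit
    # to the listings printed earlier.
    return [(ins.opname, ins.argval) for ins in dis.get_instructions(code)]


def rescope(listing):
    # Lists are mutable, but building a new list keeps the original intact.
    return [(RESCOPE.get(opname, opname), arg) for opname, arg in listing]


def outer():
    z = 10

    def block():
        y = z
        return y

    return block


new_listing = rescope(symbolic_listing(outer().__code__))
print(new_listing)

# dis.opmap maps operation names to their numeric opcodes.
print(dis.opmap["LOAD_NAME"])
```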
There is one more step. We need to turn this back into a code object, but one which executes in the current scope. This means that we need to set the freevars attribute to () (empty) and the newlocals attribute to False.
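As far as I understand it, Byteplay's newlocals attribute corresponds to the CO_NEWLOCALS flag on the underlying code object. The Python 3 sketch below uses only the standard library to compare a function's code object with code compiled from a string. The helper has_newlocals is a made-up name.

```python
import inspect


def thunk():
    y = 1


function_code = thunk.__code__
module_code = compile("y = 1\n", "<string>", "exec")


def has_newlocals(code):
    # CO_NEWLOCALS asks the interpreter for a fresh local namespace.
    return bool(code.co_flags & inspect.CO_NEWLOCALS)


print(has_newlocals(function_code))  # True
print(has_newlocals(module_code))    # False

# Neither of these uses variables from an enclosing scope.
print(function_code.co_freevars, module_code.co_freevars)  # () ()
```

Code that runs in the current scope looks like the second case: no new local namespace and no free variables.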
Because we're not interested in functions which take arguments, we ought to check the function we've been passed. inspect.getargspec makes this easy. The full AnonymousCodeBlock looks like this.
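For the argument check on its own (not the full AnonymousCodeBlock), here is a small sketch. It assumes a modern Python 3, where inspect.getargspec has been removed, so it uses inspect.signature instead. The helper name require_no_arguments is hypothetical.

```python
import inspect


def require_no_arguments(function):
    # Reject anything that declares parameters of any kind.
    parameters = inspect.signature(function).parameters
    if parameters:
        names = ", ".join(parameters)
        raise TypeError("code blocks can't take arguments (found: %s)" % names)
    return function


def thunk():
    pass


def takes_argument(x):
    pass


require_no_arguments(thunk)  # accepted silently

try:
    require_no_arguments(takes_argument)
except TypeError as error:
    print(error)
```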
To use AnonymousCodeBlock you pass it a function. It returns a code object which represent the body of the function. You can execute this with a call to exec. Local variables used by the code, and names bound by it, will be looked up and bound in the scope in which you execute the code.
The above code uses two functions which work with the variables 'x' and 'z'. One of the functions (thunk) is used directly. The second (innerThunk) is obtained by calling getInnerThunk. If you run it (I won't spoil the surprise), you'll see that it does what it should. The variable 'x' is printed and then changed, whether the function comes from an inner scope or not, and whichever scope it is executed in.
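The behaviour described here (names looked up and rebound wherever the code is executed) can be imitated without any byte-code rewriting. In this Python 3 sketch a code object compiled from a string stands in for the result of AnonymousCodeBlock, and two dictionaries stand in for two different scopes. Both substitutions are assumptions made for the illustration.

```python
# A stand-in for a rescoped code block: print 'x', then change it.
block = compile("print(x)\nx = x + 1\n", "<block>", "exec")

first_scope = {"x": 1}
second_scope = {"x": 100}

exec(block, first_scope)   # prints 1
exec(block, second_scope)  # prints 100
exec(block, first_scope)   # prints 2, the earlier rebinding stuck

print(first_scope["x"], second_scope["x"])  # 3 101
```

The same code object sees a different 'x' each time, depending only on where it is executed.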
So there we have it, an implementation of anonymous code blocks for Python, sort of.
Note
Note that AnonymousCodeBlock doesn't change global lookups. You probably shouldn't use it in production code either.