equip.rewriter package¶
Submodules¶
equip.rewriter.merger¶
Responsible for merging two bytecodes at the specified places, as well as making sure the resulting bytecode (and code_object) is properly created.
copyright:
license: Apache 2, see LICENSE for more details.
-
class
equip.rewriter.merger.
CodeObject
(co_origin)[source]¶ Bases:
object
Class responsible for merging two code objects, and generating a new one. This effectively creates the new bytecode that will be executed.
-
JUMP_OP
= [93, 110, 120, 121, 122, 143, 111, 112, 113, 114, 115, 119]¶
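The numeric values in JUMP_OP are easier to read with names attached. The sketch below assumes the numbers follow the CPython 2.7 opcode table (other interpreter versions renumber opcodes); the `PY27_OPNAMES` and `PY27_RELATIVE` tables and the `describe_jump_ops` helper are illustrative and not part of equip.

```python
# Opcode numbers and names taken from the CPython 2.7 opcode table.
# Assumption: equip's JUMP_OP refers to that table.
PY27_OPNAMES = {
    93: 'FOR_ITER',
    110: 'JUMP_FORWARD',
    111: 'JUMP_IF_FALSE_OR_POP',
    112: 'JUMP_IF_TRUE_OR_POP',
    113: 'JUMP_ABSOLUTE',
    114: 'POP_JUMP_IF_FALSE',
    115: 'POP_JUMP_IF_TRUE',
    119: 'CONTINUE_LOOP',
    120: 'SETUP_LOOP',
    121: 'SETUP_EXCEPT',
    122: 'SETUP_FINALLY',
    143: 'SETUP_WITH',
}

# In CPython 2.7 these opcodes take an offset relative to the next instruction;
# the remaining jump opcodes take an absolute bytecode address.
PY27_RELATIVE = frozenset([93, 110, 120, 121, 122, 143])

JUMP_OP = [93, 110, 120, 121, 122, 143, 111, 112, 113, 114, 115, 119]


def describe_jump_ops(jump_ops):
    """Return (opcode, name, kind) triples for each jump opcode."""
    return [
        (op, PY27_OPNAMES[op], 'relative' if op in PY27_RELATIVE else 'absolute')
        for op in jump_ops
    ]


for op, name, kind in describe_jump_ops(JUMP_OP):
    print('%3d  %-22s %s' % (op, name, kind))
```

Read this way, the list is the relative jumps followed by the absolute ones.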
-
MERGE_BACKLIST
= ('co_code', 'co_firstlineno', 'co_name', 'co_filename', 'co_lnotab', 'co_flags', 'co_argcount')¶ List of fields in the code_object not to merge. We only keep the ones from the original code_object.
-
add_global_name
(global_name)[source]¶ Adds the
global_name
as a known imported name. The instrument bytecode will get modified to change any LOAD_* to a LOAD_GLOBAL when finding this name.
Parameters: global_name – The imported global name.
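The rewrite described for add_global_name can be pictured on a toy instruction list. This is a minimal sketch, not equip's implementation: instructions are `(opname, name)` pairs, the helper `promote_to_global` is hypothetical, and it assumes that only `LOAD_NAME` and `LOAD_FAST` are candidates for the rewrite.

```python
# Assumption: these are the LOAD_* opcodes that get promoted; the real set
# used by equip is not stated in this reference.
PROMOTABLE = frozenset(['LOAD_NAME', 'LOAD_FAST'])


def promote_to_global(instructions, global_names):
    """Rewrite loads of known imported names into LOAD_GLOBAL.

    `instructions` is a list of (opname, name) pairs; `global_names` is the
    set of names registered as imported globals.
    """
    rewritten = []
    for opname, name in instructions:
        if opname in PROMOTABLE and name in global_names:
            opname = 'LOAD_GLOBAL'
        rewritten.append((opname, name))
    return rewritten


instrument = [('LOAD_NAME', 'logger'), ('LOAD_FAST', 'x'), ('LOAD_CONST', 'logger')]
print(promote_to_global(instrument, set(['logger'])))
```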
-
get_op_oparg
(op, arg, bc_index=0)[source]¶ Retrieve the opcode (op) and its argument (oparg) from the supplied opcode and argument.
Parameters: - op – The current opcode.
- arg – The current dereferenced argument.
- bc_index – The current bytecode index.
-
-
class
equip.rewriter.merger.
Merger
[source]¶ Bases:
object
-
AFTER
= 2¶ Only valid for
MethodDeclaration
. This specifies that the instrument code should be injected before each return of the method (i.e., before each encountered RETURN_VALUE
in the bytecode).
-
AFTER_IMPORTS
= 6¶ Valid for
ModuleDeclaration
or MethodDeclaration
. This specifies that the instrument code should be injected after the encountered imports.
-
BEFORE
= 1¶ Only valid for
MethodDeclaration
. This specifies that the instrument code should be injected before the body.
-
BEFORE_IMPORTS
= 5¶ Valid for
ModuleDeclaration
or MethodDeclaration
. This specifies that the instrument code should be injected before the encountered imports.
-
INSTRUCTION
= 4¶ Valid for all
Declaration
. This specifies that the instrument code should be injected after each instruction.
-
LINENO
= 3¶ Valid for all
Declaration
. This specifies that the instrument code should be injected each time the current line number changes.
-
MODULE_ENTER
= 8¶ Valid for
ModuleDeclaration
. This specifies that the code should be injected at the beginning of the module.
-
MODULE_EXIT
= 9¶ Valid for
ModuleDeclaration
. This specifies that the code should be injected at the end of the module.
-
RETURN_VALUES
= 7¶ Unused.
-
UNKNOWN
= 0¶ Error case for the kind of location for the merge.
-
static
already_instrumented
(bc_source, bc_input)[source]¶ Checks if the instrumentation in bc_input is already in bc_source
-
static
get_final_bytecode
(bc_source, bc_input, co_source, co_input, location, ins_lineno, ins_offset=-1)[source]¶ Computes the final sequences of opcodes and keeps the old values. It also tracks which sequences come from the instrument code and which from the original code, so that jumps can be resolved.
Parameters: - bc_source – The bytecode of the original code.
- bc_input – The instrument bytecode to inject.
- co_source – The original code object.
- co_input – The instrument code object.
- location – The location of the instrumentation. It should be one of BEFORE, AFTER, LINENO, etc.
- ins_lineno – The line number to inject the instrument at. Only valid when the injection location is LINENO.
- ins_offset – Not used.
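The idea of a merged sequence that remembers where each instruction came from can be sketched with plain lists. This is a simplified model under stated assumptions: instructions are opcode names only, only the BEFORE and AFTER locations are handled, the return-value capture is ignored, and `place_instrument` is a hypothetical helper rather than the actual get_final_bytecode.

```python
def place_instrument(source, instrument, location):
    """Return a list of (origin, opname) pairs.

    `source` and `instrument` are lists of opcode names standing in for real
    bytecode. `location` is 'BEFORE' or 'AFTER' in this sketch.
    """
    tagged_instrument = [('instrument', opname) for opname in instrument]
    merged = []
    if location == 'BEFORE':
        # The instrument runs before the body.
        merged.extend(tagged_instrument)
    for opname in source:
        if location == 'AFTER' and opname == 'RETURN_VALUE':
            # The instrument is injected before each return.
            merged.extend(tagged_instrument)
        merged.append(('source', opname))
    return merged


body = ['LOAD_CONST', 'RETURN_VALUE']
for origin, opname in place_instrument(body, ['NOP'], 'AFTER'):
    print('%-10s %s' % (origin, opname))
```

Keeping the origin next to each instruction is what later allows a jump to be resolved against targets from the same piece of code.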
-
static
inline_instrument
(dst_bytecode, src_bytecode, original_lineno, instr_counter=-1, template=None, location=0)[source]¶ Inline the instrument bytecode in place of the current state of dst_bytecode.
Parameters: - dst_bytecode – The list that contains the final bytecode.
- src_bytecode – The bytecode of the instrument.
- original_lineno – The line number from the original bytecode, so we always map the instrument code line numbers to the code being instrumented.
- instr_counter – A counter to track the frames of the different instrumentation code being inlined. This is used to resolve jump targets.
- template – An instrumentation can follow a template; if so, the actual template is supplied here. An example is the AFTER instrumentation, which requires capturing the return value. Defaults to None.
-
static
merge
(co_source, co_input, location=0, ins_lineno=-1, ins_offset=-1, ins_import_names=None)[source]¶ The merger makes sure that the bytecode is properly inserted where it should be, but also that the consts/names/locals/etc. are re-indexed. We will always append at the end of the current tuples.
We first need to compute the new bytecode, resolve the jumps, and only then dump it. If we emitted the bytecode as we go, we could not know where an absolute or relative jump would land, since instrument code can be inserted in between.
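Appending to the end of the existing tuples implies a simple index shift for every argument of the instrument code. The sketch below illustrates that re-indexing for a tuple such as `co_names`; `append_and_remap` is a hypothetical helper, and it assumes entries are appended without de-duplication.

```python
def append_and_remap(source_items, input_items):
    """Append `input_items` after `source_items`.

    Returns the merged tuple and a dict that maps each index valid in
    `input_items` to the matching index in the merged tuple.
    """
    offset = len(source_items)
    merged = tuple(source_items) + tuple(input_items)
    remap = dict((index, index + offset) for index in range(len(input_items)))
    return merged, remap


source_names = ('os', 'path')
instrument_names = ('logging', 'info')
merged_names, remap = append_and_remap(source_names, instrument_names)

# An instrument instruction that referred to name index 1 ('info') must now
# refer to index 3 in the merged tuple.
instrument_code = [('LOAD_GLOBAL', 0), ('LOAD_ATTR', 1)]
reindexed = [(opname, remap[arg]) for opname, arg in instrument_code]
print(merged_names)
print(reindexed)
```

Because the original entries keep their positions, the arguments of the original bytecode do not need to change.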
-
static
merge_exit
(new_co, bc_source, bc_input, ins_import_names=None)[source]¶ Special handler for inserting code at the very end of a module.
-
static
resolve_jump_targets
(bytecode, new_co)[source]¶ Resolves targets of jumps. Since we add new bytecode, absolute (resp. relative) jump address (resp. offset) can change and we need to track the changes to find the new targets.
The resolver works in two phases:
- Create the list of bytecode indices based on the size of the opcode and its argument.
- For each jump opcode, take its argument and resolve it in the same part of the bytecode (e.g., instrument bytecode or original bytecode).
Parameters: - bytecode – The structure computed by get_final_bytecode, which overlays the final bytecode sequences and their origin.
- new_co – The currently created CodeObject.
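The two phases can be sketched on a toy model. The assumptions are: CPython 2.7 instruction sizes (1 byte without an argument, 3 bytes with one), instructions represented as `(origin, old_offset, op, arg)` tuples, and a jump that targets the original instruction itself rather than code inserted in front of it. The `resolve_jumps` function is illustrative, not equip's code.

```python
HAVE_ARGUMENT = 90  # CPython 2.7: opcodes >= 90 carry a 2-byte argument
RELATIVE_JUMPS = frozenset([93, 110, 120, 121, 122, 143])
ABSOLUTE_JUMPS = frozenset([111, 112, 113, 114, 115, 119])


def size_of(op):
    """Size in bytes of one CPython 2.7 instruction."""
    return 3 if op >= HAVE_ARGUMENT else 1


def resolve_jumps(merged):
    """Return (new_offset, op, arg) triples with jump arguments rewritten."""
    # Phase 1: compute the new offset of every instruction from its size.
    new_offsets = []
    position = 0
    for origin, old_offset, op, arg in merged:
        new_offsets.append(position)
        position += size_of(op)
    where = {}
    for index, (origin, old_offset, op, arg) in enumerate(merged):
        where[(origin, old_offset)] = new_offsets[index]

    # Phase 2: resolve each jump against targets with the same origin.
    resolved = []
    for index, (origin, old_offset, op, arg) in enumerate(merged):
        after_jump = new_offsets[index] + size_of(op)
        if op in ABSOLUTE_JUMPS:
            arg = where[(origin, arg)]
        elif op in RELATIVE_JUMPS:
            old_target = old_offset + size_of(op) + arg
            arg = where[(origin, old_target)] - after_jump
        resolved.append((new_offsets[index], op, arg))
    return resolved


merged = [
    ('source', 0, 110, 1),         # JUMP_FORWARD to old offset 4
    ('source', 3, 1, None),        # POP_TOP
    ('instrument', 0, 9, None),    # NOP (inserted)
    ('instrument', 1, 9, None),    # NOP (inserted)
    ('source', 4, 100, 0),         # LOAD_CONST
    ('source', 7, 83, None),       # RETURN_VALUE
    ('source', 8, 113, 4),         # JUMP_ABSOLUTE to old offset 4
]
for row in resolve_jumps(merged):
    print(row)
```

The two inserted bytes push the old offset 4 to the new offset 6, so the relative jump grows from 1 to 3 and the absolute jump now points at 6.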
-
-
equip.rewriter.merger.
RETURN_CANARY_NAME
= '_______0x42024_retvalue'¶ This global name is always injected as a new variable in co_varnames, and used to carry the return values. We essentially add:
    STORE_FAST '_______0x42024_retvalue'
    ... instrument code that can use `{return_value}`
    LOAD_FAST '_______0x42024_retvalue'
    RETURN_VALUE
as specified by the RETURN_INSTR_TEMPLATE.
-
equip.rewriter.merger.
RETURN_INSTR_TEMPLATE
= ((125, '_______0x42024_retvalue'), (-2, None), (124, '_______0x42024_retvalue'))¶ The template that dictates how return values are being captured.
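One way to read the template is as a sequence in which the `(-2, None)` entry marks the slot for the instrument code. That reading is an assumption drawn from the description of RETURN_CANARY_NAME; the `expand_template` helper below is hypothetical. Opcodes 125 and 124 are `STORE_FAST` and `LOAD_FAST` in CPython 2.7.

```python
RETURN_CANARY_NAME = '_______0x42024_retvalue'
STORE_FAST, LOAD_FAST = 125, 124   # CPython 2.7 opcode numbers
INSTRUMENT_PLACEHOLDER = -2        # assumed meaning of the (-2, None) entry

RETURN_INSTR_TEMPLATE = (
    (STORE_FAST, RETURN_CANARY_NAME),
    (INSTRUMENT_PLACEHOLDER, None),
    (LOAD_FAST, RETURN_CANARY_NAME),
)


def expand_template(template, instrument):
    """Replace the placeholder entry with the instrument instructions."""
    expanded = []
    for op, arg in template:
        if op == INSTRUMENT_PLACEHOLDER:
            expanded.extend(instrument)
        else:
            expanded.append((op, arg))
    return expanded


# A one-instruction instrument (NOP is opcode 9 in CPython 2.7).
for op, arg in expand_template(RETURN_INSTR_TEMPLATE, [(9, None)]):
    print(op, arg)
```

The expanded sequence is what would sit in front of a `RETURN_VALUE`: the value on the stack is saved, the instrument runs, and the value is pushed back before returning.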
equip.rewriter.simple¶
A simplified interface (yet the main one) to handle the injection of instrumentation code.
copyright:
license: Apache 2, see LICENSE for more details.
-
class
equip.rewriter.simple.
SimpleRewriter
(decl)[source]¶ Bases:
object
The current main rewriter that works for one Declaration object. Using this rewriter will modify the given declaration object by possibly replacing all of its associated code objects.
-
KNOWN_FIELDS
= ('method_name', 'lineno', 'file_name', 'class_name', 'arg0', 'arg1', 'arg2', 'arg3', 'arg4', 'arg5', 'arg6', 'arg7', 'arg8', 'arg9', 'arg10', 'arg11', 'arg12', 'arg13', 'arg14', 'arguments', 'return_value')¶ List of the parameters that can be used for formatting the code to inject. The values are:
- method_name: The name of the method that is being called.
- lineno: The start line number of the declaration object being instrumented.
- file_name: The file name of the current module.
- class_name: The name of the class a method belongs to.
-
static
format_code
(decl, python_code, location)[source]¶ Formats the supplied python_code as a format string, using the values listed in KNOWN_FIELDS.
Parameters: - decl – The declaration object (e.g., MethodDeclaration, TypeDeclaration, etc.).
- python_code – The python code to format.
- location – The kind of insertion to perform (e.g., Merger.BEFORE).
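The formatting step behaves like an ordinary `str.format` call over the known fields. The sketch below uses made-up values for four of the fields; in equip they would come from the declaration object. It also shows that literal braces in the injected code have to be doubled.

```python
# Hypothetical values; equip derives the real ones from the declaration.
values = {
    'method_name': 'compute',
    'lineno': 12,
    'file_name': 'example.py',
    'class_name': 'Calculator',
}

template = "print('enter {class_name}.{method_name} at {file_name}:{lineno}')"
formatted = template.format(**values)
print(formatted)

# Literal braces must be written as {{ and }} so that format() keeps them.
dict_template = "info = {{'line': {lineno}}}"
print(dict_template.format(**values))
```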
-
static
get_code_object
(python_code)[source]¶ Compiles the supplied code and returns the code_object to be merged with the source code_object.
Parameters: python_code – The python code to compile.
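Turning a snippet into a code object can be done with the built-in `compile`. This is a minimal sketch of that step; the function name `get_code_object_sketch` and the `'<instrument>'` file name are placeholders chosen for the example, not values taken from equip.

```python
import types


def get_code_object_sketch(python_code):
    """Compile a snippet in 'exec' mode and return its code object."""
    return compile(python_code, '<instrument>', 'exec')


co = get_code_object_sketch("import sys\nsys.stdout.write('hello\\n')")
print(isinstance(co, types.CodeType))
print(co.co_names)   # names referenced by the snippet
print(co.co_consts)  # constants referenced by the snippet
```

The `co_names` and `co_consts` tuples printed here are the kind of data that has to be re-indexed when the snippet is merged into another code object.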
-
static
get_formatting_values
(decl, location)[source]¶ Retrieves the dynamic values to be added in the format string. All values are statically computed, but formal parameters (of methods) are passed by name so it is possible to dereference them in the inserted code (same for the return value).
Parameters: - decl – The declaration object.
- location – The kind of insertion to perform (e.g., Merger.BEFORE).
-
static
indent
(original_code, indent_level=0)[source]¶ Lousy helper that indents the supplied python code, so that it will fit under an if statement.
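An indentation helper of this kind fits in a few lines. The version below is a sketch under assumptions: the `unit` parameter and the choice to leave blank lines untouched are illustrative and not taken from equip's source.

```python
def indent_sketch(original_code, indent_level=0, unit='    '):
    """Prefix every non-blank line with `indent_level` copies of `unit`."""
    prefix = unit * indent_level
    lines = []
    for line in original_code.split('\n'):
        lines.append(prefix + line if line.strip() else line)
    return '\n'.join(lines)


snippet = "a = 1\nprint(a)"
guarded = "if True:\n" + indent_sketch(snippet, 1)
print(guarded)
```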
-
insert_before
(python_code)[source]¶ Insert code at the beginning of the method’s body.
The submitted code can be formatted using the fields declared in KNOWN_FIELDS. Since string.format is used once the values are dumped, the injected code should be properly structured.
Parameters: python_code – The python code to be formatted, compiled, and inserted at the beginning of the method body.
-
insert_enter_code
(python_code, import_code=None)[source]¶ Insert generic code at the beginning of the module. The code is wrapped in an if __name__ == '__main__' statement.
Parameters: - python_code – The python code to compile and inject.
- import_code – The import statements, if any, to add before the insertion of python_code. Defaults to None.
-
insert_exit_code
(python_code, import_code=None)[source]¶ Insert generic code at the end of the module. The code is wrapped in an if __name__ == '__main__' statement.
Parameters: - python_code – The python code to compile and inject.
- import_code – The import statements, if any, to add before the insertion of python_code. Defaults to None.
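The wrapping used for module enter and exit code can be sketched as plain string manipulation. This is an illustration only: `wrap_in_main_guard` is a hypothetical helper, and placing the imports outside the guard is an assumption.

```python
def wrap_in_main_guard(python_code, import_code=None):
    """Return source that runs `python_code` only when __name__ is '__main__'."""
    lines = []
    if import_code:
        # Assumption: imports are emitted before the guard, not inside it.
        lines.append(import_code.rstrip('\n'))
    lines.append("if __name__ == '__main__':")
    for line in python_code.rstrip('\n').split('\n'):
        lines.append('    ' + line if line.strip() else line)
    return '\n'.join(lines) + '\n'


wrapped = wrap_in_main_guard("started = time.time() > 0", "import time")
print(wrapped)

# The guarded body only runs when the namespace claims to be the main module.
main_namespace = {'__name__': '__main__'}
exec(wrapped, main_namespace)
other_namespace = {'__name__': 'some_module'}
exec(wrapped, other_namespace)
print('started' in main_namespace, 'started' in other_namespace)
```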
-
insert_generic
(python_code, location=0, ins_lineno=-1, ins_offset=-1, ins_module=False, ins_import=False)[source]¶ Generic code injection utility. It first formats the supplied python_code, compiles it to get the code_object, and merges this new code_object with the one of the current declaration object (decl). The insertion is done by the Merger.
When the injection is done, this method recursively updates all references to the old code_object in the parents (when a parent changes, it is updated as well and its new code_object is propagated upwards). This process is required because Python’s code objects are nested in their parents’ code objects, and they are all read-only. This process breaks any references that were held on previously used code objects (e.g., don’t do this while the instrumented code is running).
Parameters: - python_code – The code to be formatted and inserted.
- location – The kind of insertion to perform.
- ins_lineno – When an insertion should occur at one given line of code, use this parameter. Defaults to -1.
- ins_offset – When an insertion should occur at one given bytecode offset, use this parameter. Defaults to -1.
- ins_module – Specify that the code insertion should happen in the module itself and not in the current declaration.
- ins_import – True if the method is called for inserting an import statement.
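Why a change has to propagate to the parents can be shown with standard code objects. Since code objects are immutable and nested inside the `co_consts` of their parent, replacing one means rebuilding every enclosing code object. The sketch below uses `code.replace`, available in CPython 3.8 and later, purely for illustration; equip itself does not necessarily work this way, and `replace_nested` is a hypothetical helper that walks top-down instead of propagating upwards.

```python
import types


def replace_nested(parent_co, old_co, new_co):
    """Return a copy of `parent_co` in which `old_co` is replaced by `new_co`.

    Code objects are read-only, so each parent on the path to `old_co` is
    rebuilt; untouched code objects are returned as they are.
    """
    if parent_co is old_co:
        return new_co
    changed = False
    new_consts = []
    for const in parent_co.co_consts:
        if isinstance(const, types.CodeType):
            updated = replace_nested(const, old_co, new_co)
            changed = changed or (updated is not const)
            const = updated
        new_consts.append(const)
    if not changed:
        return parent_co
    return parent_co.replace(co_consts=tuple(new_consts))


def first_code_const(co):
    """Return the first nested code object found in co.co_consts."""
    return next(c for c in co.co_consts if isinstance(c, types.CodeType))


module_co = compile("def greet():\n    return 'hello'\n", '<module>', 'exec')
old_func_co = first_code_const(module_co)
new_func_co = first_code_const(
    compile("def greet():\n    return 'instrumented'\n", '<module>', 'exec'))

new_module_co = replace_nested(module_co, old_func_co, new_func_co)
namespace = {}
exec(new_module_co, namespace)
print(namespace['greet']())
```

The original `module_co` still refers to the old function code, which is why references held on previously used code objects stop reflecting the instrumented version.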
-