eric4.DebugClients.Python.Cyclops.Cyclops

Module implementing Cyclops to stare at cycles in Python data structures.

Cyclops started life as a faster implementation of Lars Marius Garshol's Plumbo.py, but almost instantly diverged due to feature bloat.

The only object of real interest is class CycleFinder. First get an instance of that:

import Cyclops
z = Cyclops.CycleFinder()

This creates a cycle-tracker with an empty "root set". Then tell z which objects you're curious about (more on that later).

After running your regular code, tell z to look for cycles:

z.find_cycles(purge_dead_roots=0)
Look for cycles reachable from the root set. Return 1 if a cycle was found, else 0. If optional arg purge_dead_roots is true, first remove objects from the root set whose true refcount (exclusive of CycleFinder internal references) is 0. See also method install_cycle_filter.

Then various methods can be used to get useful reports. Which are *most* useful you'll just have to play with. If you have a few small cycles, show_cycles is very revealing; if you have many cycles, show_sccs and show_arcs *may* cut to the heart of the problem quickly.

z.show_stats()
Print some stats about the last run of find_cycles: number of distinct structured objects reachable from the root set, number of distinct objects involved in a cycle, number of cycles found, number of strongly connected components, number of objects in the root set, and number of edges (graph arcs) examined.
z.show_cycles()
Print each cycle found to stdout.
z.show_cycleobjs(compare=typename_address_cmp)
Print a listing of all objects involved in some cycle. The objects are first sorted via the optional "compare" function. By default, sorts first by type name and then by address (id); among objects of instance type, sorts by the instances' class names; among objects of class type, sorts by the classes' names.
z.show_sccs(compare=typename_address_cmp)
Print a listing of cycle objects partitioned into strongly connected components (that is, the largest groupings possible such that each object in an SCC is reachable from every other object in that SCC). Within each SCC, objects are sorted as for show_cycleobjs.
z.show_arcs(compare=None)
Print a listing of all arc types involved in some cycle. An arc is characterized by:
1) its source object type or class,
2) the selector (e.g. attribute) from the source, and
3) its destination object type or class.
The output is a table with four columns: count, source, selector, destination. The table is sorted by default on columns 2, 3, 4 in that order; the first column (count) is ignored. Specify an alternative compare function to change the sort; the compare function sees pairs of the form ((source_type, selector, destination_type), count) where each element of the triple is a string and count is an int.
z.show_obj(obj)
Print a 2-line description of obj, including its address, adjusted refcount, type name and a short representation of its contents. See the method docstring for details.

There are two ways to add objects to the root set:

z.run(func, args=(), kwargs={})
Run func (with optional args and keyword arguments kwargs), adding every object initialized by an __init__ function to the root set. This is the closest Cyclops gets to plumbo's "lazy" mode. It's factored out here so it can be intermixed with the next method.
z.register(obj)
Add obj to the root set.

To see the current root set,

z.get_rootset()
Return the root set, as a list of (rc, cyclic?, obj) tuples. See method docstring for details. In short, rc is the true refcount (CycleFinder internal references are subtracted out); cyclic is true iff the object is in a known cycle; and obj is the object.

Finally, to empty the root set again:

z.clear()
Empty the root set, and release all other internal references to register'ed objects.

Notes:

CHANGING THE SET OF INTERESTING TYPES

Methods of a CycleFinder object can be used to alter/query its view of the set of types find_cycles "chases":

z.chase_type(t, t_refs_func, t_tag_func)
Add type t to the set of interesting types. t_refs_func is a function of one argument x, where type(x) is t; it should return a sequence (typically a list or tuple) of all objects directly reachable from x. t_tag_func is a function of two arguments x and i, where type(x) is t and i in range(len(t_refs_func(x))). It should return a brief string describing how t_refs_func(x][i] was obtained from x. See the _XXX_refs and _XXX_tag functions at the start of the module for examples of all this.
z.dont_chase_type(t)
Remove type t from the set of interesting types. find_cycles will not attempt to chase the objects reachable from objects of type t.
z.get_chased_types()
Return the set of interesting types, as a list.

CHANGING THE SET OF INTERESTING CYCLES

Sometimes there are "expected" cycles you would like to ignore; e.g., this can happen if you install a module-chaser, because there are cycles in Python's implementation you typically don't care about (for example, if your module imports sys, there's a cycle because sys.modules points back to your module!).

z.install_cycle_filter(filter_func=None)
filter_func=None -> a way to ignore "expected" cycles.

filter_func is a function of one argument, a cycle. Each time find_cycles finds a cycle, the cycle is passed to filter_func. The cycle is ignored unless filter_func returns true. Passing None for filter_func restores the default behavior (do not ignore any cycle).

A cycle is a list of (object, index) pairs, where the first object in the list is the same as the last object in the list, and where the object at cycle[i][0] is the cycle[i][1]'th object obtained from object cycle[i-1][0]. cycle[0][1] should be ignored (it tells us how we got to the first item in the cycle to begin with, but that's irrelevant to the cycle).

CASE STUDY

Below is the driver I used to track cycles in IDLE; it's a replacement for IDLE's idle.py.

At first it didn't install function or module chasers, or a cycle filter, and printed everything. This turned up a bunch of easy cases, and the show_sccs output was surprisingly revealing (all the cycles fell into a handful of SCCs, which corresponded to distinct cycle-creating IDLE subsystems). show_arcs was also very helpful in getting the big picture. show_cycles output was too voluminous to be helpful.

After those cycles were broken, the job got harder. A module chaser was added, which turned up another class of cycles, and then a function chaser turned up 100s more. Most of these involved expected cycles due to Python's implementation, so a cycle filter was installed to ignore cycles that didn't contain at least one class instance. The remaining cycles were isolated special cases, and only show_cycles output was of real use.

After all cycles were purged, IDLE was still leaking, so driver output was added to display the root-set objects still alive at the end. This turned up many cases where objects were living only because registration in the root set was keeping them alive. So a loop was added to the driver that repeatedly purges dead root-set objects and tries again. The __del__ methods of the purged roots caused other root objects to become trash, and after several iterations of this the output reaches a steady state.

IDLE is still leaking (as of 18-Jul-1999), but ever less so, and while Cyclops is no longer finding cycles, the driver's "live at the end" output is still the best clue I've got for guessing what to do next.

Interesting: At the start of this, it turned out that almost all cycles were reachable from places outside themselves. That is, they would not have been considered trash even if Python used a mark-&-sweep form of garbage collection. IDLE's problems, in large part inherited from Tkinter, are simply that "everything points to everything else". The good news is that Guido was able to clean most of this up just by adding reference- purging code to widgets' explicitly-called destroy() methods.

#! /usr/bin/env python
import PyShell
import Cyclops
import types

def mod_refs(x):
    return x.__dict__.values()

def mod_tag(x, i):
    return "." + x.__dict__.keys()[i]

def func_refs(x):
    return x.func_globals, x.func_defaults

def func_tag(x, i):
    return (".func_globals", ".func_defaults")[i]

def instance_filter(cycle):
    for obj, index in cycle:
        if type(obj) is types.InstanceType:
            return 1
    return 0

# Note: PyShell binds ModifiedInterpreter.locals  to __main__.__dict__,
# and __main__ is *us*.  So if we don't keep the CycleFinder instance
# out of the global namespace here, z starts chewing over its own
# instance attributes.  Nothing breaks, but the output is at best
# surprising.

def hidez():
    z = Cyclops.CycleFinder()
    z.chase_type(types.ModuleType, mod_refs, mod_tag)
    z.chase_type(types.FunctionType, func_refs, func_tag)
    z.install_cycle_filter(instance_filter)
    z.run(PyShell.main)
    z.find_cycles()
    z.show_stats()
    z.show_cycles()
    # z.show_cycleobjs()
    # z.show_sccs()
    z.show_arcs()
    while 1:
        print "*" * 70
        print "non-cyclic root set objects:"
        sawitalready = {}
        numsurvivors = numdead = 0
        for rc, cyclic, x in z.get_rootset():
            if not sawitalready.has_key(id(x)):
                sawitalready[id(x)] = 1
                if rc == 0:
                    print "DEAD",
                    numdead = numdead + 1
                    z.show_obj(x)
                elif not cyclic:
                    numsurvivors = numsurvivors + 1
                    z.show_obj(x)
        x = None
        print numdead, "dead;", numsurvivors, "non-cycle & alive"
        if numdead == 0:
            break
        print "releasing dead root-set objects and trying again"
        z.find_cycles(1)
        z.show_stats()

hidez()

Global Attributes

_ClassType
_InstanceType
__version__
_class_refs
_class_tag
_default_refs_dispatcher
_default_tag_dispatcher
_quickrepr
_repr
_tuple_refs
_tuple_tag

Classes

CycleFinder Class for finding cycles in Python data structures.
X Class for the test
Y Empty class for the test.
_CyclopsRepr A class to make short string representations of objects.

Functions

_dict_refs Function returning a sequence of all objects directly reachable from x.
_dict_tag Function returning a string describing how the reference was obtained from x.
_instance_method_refs Function returning a sequence of all objects directly reachable from x.
_instance_method_tag Function returning a string describing how the reference was obtained from x.
_instance_refs Function returning a sequence of all objects directly reachable from x.
_instance_tag Function returning a string describing how the reference was obtained from x.
_list_refs Function returning a sequence of all objects directly reachable from x.
_list_tag Function returning a string describing how the reference was obtained from x.
_test Function to test the module.
typename_address_cmp Method to do some special comparison based on type or name and address.


CycleFinder

Class for finding cycles in Python data structures.

See Cyclops module description for details.

Derived from

object

Class Attributes

None

Methods

CycleFinder Create a cycle finder with empty root set.
__find_cycles Private method to find cycles.
__init_tracer Private method to initialize the tracer.
__obj2arcname Private helper method.
__reset Private method to perform a reset.
__study_cycle Private method to analyze a cycle.
_print_cycle Method to print a cycle.
_print_separator Method to print a separator line.
chase_type t, t_refs_func, t_tag_func -> chase type t.
clear Remove all internal references to external objects.
dont_chase_type t -> remove type t from the set of chased types.
find_cycles purge_dead_roots=0 -> look for cycles, return true if found.
get_chased_types Return the set of chased types, as a list.
get_rootset Return the root set, as a list of (rc, cyclic?, obj) tuples.
install_cycle_filter filter_func=None -> a way to ignore "expected" cycles.
register obj -> add object obj to the root set.
run func, args=(), kwargs={} -> add objects to root set by magic.
show_arcs compare=None -> print unique arc types in cycles.
show_cycleobjs compare=typename_address_cmp -> print all objects in cycles.
show_cycles Print all cycles to stdout.
show_obj obj -> print short description of obj to sdtout.
show_sccs compare=typename_address_cmp -> print SCCs.
show_stats Print statistics for the last run of find_cycles.

CycleFinder (Constructor)

CycleFinder()

Create a cycle finder with empty root set.

CycleFinder.__find_cycles

__find_cycles(obj, i, id=id, type=type, len=len)

Private method to find cycles.

CycleFinder.__init_tracer

__init_tracer(frame, event, args)

Private method to initialize the tracer.

CycleFinder.__obj2arcname

__obj2arcname(obj)

Private helper method.

CycleFinder.__reset

__reset()

Private method to perform a reset.

CycleFinder.__study_cycle

__study_cycle(slice)

Private method to analyze a cycle.

CycleFinder._print_cycle

_print_cycle(slice)

Method to print a cycle.

CycleFinder._print_separator

_print_separator()

Method to print a separator line.

CycleFinder.chase_type

chase_type(t, t_refs_func, t_tag_func)

t, t_refs_func, t_tag_func -> chase type t.

See module docstring for details.

CycleFinder.clear

clear()

Remove all internal references to external objects.

Empties the root set. Does not change the set of types this CycleFinder chases. Does not change the cycle filter in effect.

CycleFinder.dont_chase_type

dont_chase_type(t)

t -> remove type t from the set of chased types.

See module docstring for details.

CycleFinder.find_cycles

find_cycles(purge_dead_roots=0)

purge_dead_roots=0 -> look for cycles, return true if found.

Identify all cycles among objects reachable from the root set. Return true iff at least one cycle is found.

This should be called before any of the show_XXX methods. Note that it's OK to add more objects to the root set and call it again, or to change the set of chased types, etc. find_cycles starts over from scratch each time it's called.

If optional arg purge_dead_roots is true (default false), before searching for cycles the root set is purged of all objects that the preceding run of find_cycles determined had a true refcount of 0 (that is, the root set objects that are still alive only because they appear in the root set). Purging these allows their finalizers to get invoked, which may allow a cascade of other objects (including cycles) to go away too.

See also method install_cycle_filter.

CycleFinder.get_chased_types

get_chased_types()

Return the set of chased types, as a list.

CycleFinder.get_rootset

get_rootset()

Return the root set, as a list of (rc, cyclic?, obj) tuples.

Should be called after find_cycles. For each object in the root set, returns a triple consisting of

refcount
number of outstanding references less those due to CycleFinder internals; see show_obj docstring for more details; this will be None if find_cycles hasn't been run, or not since the last clear()
cyclic?
true (1) iff obj is known to be in a cycle
obj
the object

CycleFinder.install_cycle_filter

install_cycle_filter(filter_func=None)

filter_func=None -> a way to ignore "expected" cycles.

See module docstring for details. This is a callback function invoked whenever find_cycles() finds a cycle; the cycle is ignored unless the callback returns true.

CycleFinder.register

register(obj)

obj -> add object obj to the root set.

CycleFinder.run

run(func, args=(), kwargs={})

func, args=(), kwargs={} -> add objects to root set by magic.

Function func is invoked with arguments args and keyword arguments kwargs. For the duration of the call, each class instance initialized by an __init__ call is automatically added to the root set. The result of invoking func is returned.

CycleFinder.show_arcs

show_arcs(compare=None)

compare=None -> print unique arc types in cycles.

See module docstring for details. Briefly, each arc in a cycle is categorized by the type of the source node, the kind of arc (how we got from the source to the destination), and the type of the destination node. Each line of output consists of those three pieces of info preceded by the count of arcs of that kind. By default, the rows are sorted first by column 2 (source node type), then by columns 3 and 4.

CycleFinder.show_cycleobjs

show_cycleobjs(compare=typename_address_cmp)

compare=typename_address_cmp -> print all objects in cycles.

Prints to stdout. Each distinct object find_cycles found in a cycle is displayed. The set of objects found in cycles is first sorted by the optional "compare" function. By default, objects are sorted using their type name as the primary key and their storage address (id) as the secondary key; among objects of instance type, sorts by the instances' class names; among objects of class type, sorts by the classes' names.

CycleFinder.show_cycles

show_cycles()

Print all cycles to stdout.

CycleFinder.show_obj

show_obj(obj)

obj -> print short description of obj to sdtout.

This is of the form

<address> rc:<refcount> <typename>
repr: <shortrepr>

where

<address>
hex address of obj
<refcount>
If find_cycles() has been run and obj is in the root set or was found in a cycle, this is the number of references outstanding less the number held internally by CycleFinder. In most cases, this is what the true refcount would be had you not used CycleFinder at all. You can screw that up, e.g. by installing a cycle filter that holds on to references to one or more cycle elements. If find_cycles() has not been run, or has but obj wasn't found in a cycle and isn't in the root set, is "?".
<typename>
type(obj), as a string. If obj.__class__ exists, also prints the class name.
<shortrepr>
repr(obj), but using a variant of the std module repr.py that limits the number of characters displayed.

CycleFinder.show_sccs

show_sccs(compare=typename_address_cmp)

compare=typename_address_cmp -> print SCCs.

Prints to stdout. Shows the objects in cycles partitioned into strongly connected components (that is, the largest groupings possible such that each object in an SCC is reachable from every other object in that SCC). Within each SCC, objects are sorted as for show_cycleobjs.

CycleFinder.show_stats

show_stats()

Print statistics for the last run of find_cycles.

Up


X

Class for the test

Derived from

None

Class Attributes

None

Methods

X
__repr__

X (Constructor)

X(name)

X.__repr__

__repr__()
Up


Y

Empty class for the test.

Derived from

None

Class Attributes

None

Methods

None
Up


_CyclopsRepr

A class to make short string representations of objects.

The std repr.repr sorts dicts by keys, but we refer to the keys via numeric subscript in cycle reports, so first a derived class that leaves dicts in raw order. Also, instances, instance methods and classes frequently appear in cycle reports, so avoid chopping their reprs at all. We're only trying to prevent massive expense for massive lists, tuples & dicts.

Derived from

_repr.Repr

Class Attributes

repr_instance_method

Methods

_CyclopsRepr Constructor
repr_class Method overriding the base class formatter.
repr_dictionary Method overriding the base dictionary formatter.
repr_instance Method overriding the base instance formatter.

_CyclopsRepr (Constructor)

_CyclopsRepr()

Constructor

_CyclopsRepr.repr_class

repr_class(x, level)

Method overriding the base class formatter.

x
class object to be shown
level
level of detail
Returns:
string representation of the class object

_CyclopsRepr.repr_dictionary

repr_dictionary(x, level)

Method overriding the base dictionary formatter.

x
dictionary to be shown
level
level of detail
Returns:
string representation of the dictionary

_CyclopsRepr.repr_instance

repr_instance(x, level)

Method overriding the base instance formatter.

x
instance object to be shown
level
level of detail
Returns:
string representation of the instance object
Up


_dict_refs

_dict_refs(x)

Function returning a sequence of all objects directly reachable from x.

x
object to check
Returns:
sequence of all objects directly reachable from x
Up


_dict_tag

_dict_tag(x, i)

Function returning a string describing how the reference was obtained from x.

x
object to check
i
index into list of references
Returns:
a string describing how the reference was obtained from x
Up


_instance_method_refs

_instance_method_refs(x)

Function returning a sequence of all objects directly reachable from x.

x
object to check
Returns:
sequence of all objects directly reachable from x
Up


_instance_method_tag

_instance_method_tag(x, i)

Function returning a string describing how the reference was obtained from x.

x
object to check
i
index into list of references
Returns:
a string describing how the reference was obtained from x
Up


_instance_refs

_instance_refs(x)

Function returning a sequence of all objects directly reachable from x.

x
object to check
Returns:
sequence of all objects directly reachable from x
Up


_instance_tag

_instance_tag(x, i)

Function returning a string describing how the reference was obtained from x.

x
object to check
i
index into list of references
Returns:
a string describing how the reference was obtained from x
Up


_list_refs

_list_refs(x)

Function returning a sequence of all objects directly reachable from x.

x
object to check
Returns:
sequence of all objects directly reachable from x
Up


_list_tag

_list_tag(x, i)

Function returning a string describing how the reference was obtained from x.

x
object to check
i
index into list of references
Returns:
a string describing how the reference was obtained from x
Up


_test

_test()

Function to test the module.

Up


typename_address_cmp

typename_address_cmp(x, y)

Method to do some special comparison based on type or name and address.

x
first object
y
second object
Returns:
comparisson result as returned from cmp
Up