DTraceing Python (long post)

Patric Fors patric at zro.se
Fri Apr 18 12:17:48 BST 2008


Sorry if this post is abit long, even for a first post in the  
mailinglist...

SHORT VERSION:
Using DTrace as a tool to speed up Bazaar, check http://blogs.sun.com/bmc/entry/dtrace_at_google

LONG VERSION:
First a disclaimer; I'm new to Python, new to Bazaar and DTrace,  
mostly used Subversion as a VCS.

My "first" encounter with DVCS was quite recently, just a month ago  
actually.
Heared about Arch some years ago, but could not wrap my head around  
the distributed part, so I ignored it.
Also seen some talk about Mercurial on OpenSolaris pages, but just  
wondered why they didn't use Subversion. :)

I've just converted some of my homegrown scripts from beeing "manual- 
zip-file-revision-controlled" to
live in a Bazaar repository branch, just to get a "feel" of how it  
works before taking Bazaar to my development
environment at work (and converting that svn repo to bzr).
Having a svn repo at home seemed abit clunky, but I wanted some  
version management of my files.

Well, started with Mercurial last month, then compiled Git after some  
buzz on the net (never used it though) and
finally chose to use Bazaar since it feels helpfull and intuative,  
just as a tool should be.
So why not try to help out abit, even if I can't program Python (yet).

Reading on the mailinglist about performance, and the hunt for  
improvement I thought that I could share
some links that I stumbled upon regarding Python and DTrace.

I've searched the mailinglist archives for "DTrace" and found none,  
but hey, that doesn't mean you haven't tried it right? ;)

Links:
"Recently, I gave a Tech Talk at Google on DTrace...jump ahead to  
about 57:16 to see a demo of DTrace for Python..."
http://blogs.sun.com/bmc/entry/dtrace_at_google

DTraceToolkit available from http://opensolaris.org/os/community/dtrace/dtracetoolkit/

Of course, I needed to try some of the scripts in the DTraceToolkit,  
below is some of the results.
Unfortunately I'm abit "blind" since I can't interpret the output data  
from it, so I hope it's giving
abit more to you than it gave to me.

(Python on Solaris and Max OS X have the DTrace bindings in them)

=============================
First abit on the bzr environment where my scripts live:
(MacBook Pro, old model)

nollan at amanda:Graph-o-Grams$ python -V
Python 2.5.1

nollan at amanda:Graph-o-Grams$ uname -a
Darwin amanda.local 9.2.2 Darwin Kernel Version 9.2.2: Tue Mar  4  
21:17:34 PST 2008; root:xnu-1228.4.31~1/RELEASE_I386 i386

nollan at amanda:Graph-o-Grams$ pwd
/Users/nollan/utveckling/WoW/Graph-o-Grams

nollan at amanda:Graph-o-Grams$ ls -l trunk/
total 96
-rwxr-xr-x@ 1 nollan  nollan  31830 17 Apr 20:49 HealDmgVectorGraph.rb
-rw-r--r--@ 1 nollan  nollan  13355 16 Apr 13:01 graphics.r

nollan at amanda:Graph-o-Grams$ bzr version-info trunk/
revision-id: patric at zro.se-20080417192121-y24d2xivootj3r5l
date: 2008-04-17 21:21:21 +0200
build-date: 2008-04-18 10:28:26 +0200
revno: 8
branch-nick: trunk

nollan at amanda:Graph-o-Grams$ time bzr branch trunk test
Branched 8 revision(s).

real	0m0.966s
user	0m0.351s
sys	0m0.197s

=============================

First script is "py_flowtime.d", from the comments in the file:
    py_flowtime.d - snoop Python functions with flow and delta times.
                    Written for the Python DTrace provider.

TERMINAL WINDOW 1 (the snooping one)
nollan at amanda:Python$ pwd
/Users/nollan/Documents/Datorer/DTrace/DTraceToolkit-0.99/Python
nollan at amanda:Python$ sudo ./py_flowtime.d > D_dump.txt

TERMINAL WINDOW 2
nollan at amanda:Graph-o-Grams$ bzr branch trunk test
Branched 8 revision(s).

nollan at amanda:Python$ ls -l D_dump.txt
-rw-r--r--  1 nollan  nollan  15808395 18 Apr 10:35 D_dump.txt

Ok, file is mighty huge, just giving head data and some big deltas
nollan at amanda:Python$ head D_dump.txt
   C TIME(us)         FILE             DELTA(us)  -- FUNC
   1 26745361758      os.py                    3 -> <module>
   1 26745361827      os.py                   69   -> _get_exports_list
   1 26745361964      os.py                  136   <- _get_exports_list
   1 26745362101      os.py                  137 <- <module>
   1 26745362145      os.py                   44 -> <module>
   1 26745362191      os.py                   46 <- <module>
   1 26745362254      os.py                   62 -> <module>
   1 26745362299      os.py                   45 <- <module>
   1 26745362349      os.py                   50 -> <module>

nollan at amanda:Python$ egrep "  [0-9]{4,} + [<->]" D_dump.txt
     1 26745366325      os.py                 1570   <- _exists
     0 26745430872      site.py               1827   <- main
     1 26745765127      lazy_import.py        1793     <- _import
     1 26745775524      <string>              1699       <- <module>
     1 26745778069      <string>              1693       <- <module>
     1 26745967573      ast.py                5309                 <-  
<module>
     1 26748632333      socket.py              
2320                       <- <module>
     1 26748740335      lazy_import.py         
2287                         <- _import
     1 26750293236      index.py              
26702                                 <- iter_all_entries
     0 26752439940      lazy_import.py        1074                 <-  
_import
     1 26752881042      transform.py          1146                      
<- create_file
     1 26753791493      index.py             59286                      
<- iter_all_entries

=============================

Next test is "py_calltime.d", from the comments in the file:
    py_calltime.d - measure Python elapsed times for functions.
                    Written for the Python DTrace provider.

TERMINAL WINDOW 1 (the snooping one)
nollan at amanda:Python$ pwd
/Users/nollan/Documents/Datorer/DTrace/DTraceToolkit-0.99/Python
nollan at amanda:Python$ sudo ./py_calltime.d > D_dump2.txt
^C

TERMINAL WINDOW 2
nollan at amanda:Graph-o-Grams$ bzr branch trunk test2
Branched 8 revision(s).

nollan at amanda:Python$ ls -l D_dump2.txt
-rw-r--r--  1 nollan  nollan  243547 18 Apr 10:49 D_dump2.txt

File is smaller, but since it's quite big even this time I only paste  
head and tail of it

nollan at amanda:Python$ head D_dump2.txt
Tracing... Hit Ctrl-C to end.


Count,
    FILE                 TYPE        
NAME                                COUNT
    StringIO.py          func        
<module>                                1
    UserDict.py          func        
UserDict                                1
    UserDict.py          func        
__init__                                1
    __init__.py          func        
<lambda>                                1
    __init__.py          func        
CDLL                                    1


nollan at amanda:Python$ tail D_dump2.txt
    sre_parse.py         func       _parse_sub                         
2727871
    bzrdir.py            func       sprout                             
2908930
    lazy_import.py       func       __getattribute__                   
3107195
    lazy_import.py       func       _import                            
3121159
    lazy_import.py       func       _replace                           
3157305
    builtins.py          func       run                                
4416570
    commands.py          func       run_bzr_catch_errors               
4905873
    commands.py          func       main                               
4907181
    bzr                  func       <module>                           
4957344

"lazy_import.py" seems not to be that lazy at all :-)

Some sort of conclusion, well since I've no clue on the inner workings  
on Bazaar and Python,
I have no idea if above text is just ASCII noise or might even give  
some ideas to try out
DTrace to find and eliminate some "speed bumps" in the code.

With Best Regards,
Patric Fors






More information about the bazaar mailing list