DTraceing Python (long post)
Patric Fors
patric at zro.se
Fri Apr 18 12:17:48 BST 2008
Sorry if this post is abit long, even for a first post in the
mailinglist...
SHORT VERSION:
Using DTrace as a tool to speed up Bazaar, check http://blogs.sun.com/bmc/entry/dtrace_at_google
LONG VERSION:
First a disclaimer; I'm new to Python, new to Bazaar and DTrace,
mostly used Subversion as a VCS.
My "first" encounter with DVCS was quite recently, just a month ago
actually.
Heared about Arch some years ago, but could not wrap my head around
the distributed part, so I ignored it.
Also seen some talk about Mercurial on OpenSolaris pages, but just
wondered why they didn't use Subversion. :)
I've just converted some of my homegrown scripts from beeing "manual-
zip-file-revision-controlled" to
live in a Bazaar repository branch, just to get a "feel" of how it
works before taking Bazaar to my development
environment at work (and converting that svn repo to bzr).
Having a svn repo at home seemed abit clunky, but I wanted some
version management of my files.
Well, started with Mercurial last month, then compiled Git after some
buzz on the net (never used it though) and
finally chose to use Bazaar since it feels helpfull and intuative,
just as a tool should be.
So why not try to help out abit, even if I can't program Python (yet).
Reading on the mailinglist about performance, and the hunt for
improvement I thought that I could share
some links that I stumbled upon regarding Python and DTrace.
I've searched the mailinglist archives for "DTrace" and found none,
but hey, that doesn't mean you haven't tried it right? ;)
Links:
"Recently, I gave a Tech Talk at Google on DTrace...jump ahead to
about 57:16 to see a demo of DTrace for Python..."
http://blogs.sun.com/bmc/entry/dtrace_at_google
DTraceToolkit available from http://opensolaris.org/os/community/dtrace/dtracetoolkit/
Of course, I needed to try some of the scripts in the DTraceToolkit,
below is some of the results.
Unfortunately I'm abit "blind" since I can't interpret the output data
from it, so I hope it's giving
abit more to you than it gave to me.
(Python on Solaris and Max OS X have the DTrace bindings in them)
=============================
First abit on the bzr environment where my scripts live:
(MacBook Pro, old model)
nollan at amanda:Graph-o-Grams$ python -V
Python 2.5.1
nollan at amanda:Graph-o-Grams$ uname -a
Darwin amanda.local 9.2.2 Darwin Kernel Version 9.2.2: Tue Mar 4
21:17:34 PST 2008; root:xnu-1228.4.31~1/RELEASE_I386 i386
nollan at amanda:Graph-o-Grams$ pwd
/Users/nollan/utveckling/WoW/Graph-o-Grams
nollan at amanda:Graph-o-Grams$ ls -l trunk/
total 96
-rwxr-xr-x@ 1 nollan nollan 31830 17 Apr 20:49 HealDmgVectorGraph.rb
-rw-r--r--@ 1 nollan nollan 13355 16 Apr 13:01 graphics.r
nollan at amanda:Graph-o-Grams$ bzr version-info trunk/
revision-id: patric at zro.se-20080417192121-y24d2xivootj3r5l
date: 2008-04-17 21:21:21 +0200
build-date: 2008-04-18 10:28:26 +0200
revno: 8
branch-nick: trunk
nollan at amanda:Graph-o-Grams$ time bzr branch trunk test
Branched 8 revision(s).
real 0m0.966s
user 0m0.351s
sys 0m0.197s
=============================
First script is "py_flowtime.d", from the comments in the file:
py_flowtime.d - snoop Python functions with flow and delta times.
Written for the Python DTrace provider.
TERMINAL WINDOW 1 (the snooping one)
nollan at amanda:Python$ pwd
/Users/nollan/Documents/Datorer/DTrace/DTraceToolkit-0.99/Python
nollan at amanda:Python$ sudo ./py_flowtime.d > D_dump.txt
TERMINAL WINDOW 2
nollan at amanda:Graph-o-Grams$ bzr branch trunk test
Branched 8 revision(s).
nollan at amanda:Python$ ls -l D_dump.txt
-rw-r--r-- 1 nollan nollan 15808395 18 Apr 10:35 D_dump.txt
Ok, file is mighty huge, just giving head data and some big deltas
nollan at amanda:Python$ head D_dump.txt
C TIME(us) FILE DELTA(us) -- FUNC
1 26745361758 os.py 3 -> <module>
1 26745361827 os.py 69 -> _get_exports_list
1 26745361964 os.py 136 <- _get_exports_list
1 26745362101 os.py 137 <- <module>
1 26745362145 os.py 44 -> <module>
1 26745362191 os.py 46 <- <module>
1 26745362254 os.py 62 -> <module>
1 26745362299 os.py 45 <- <module>
1 26745362349 os.py 50 -> <module>
nollan at amanda:Python$ egrep " [0-9]{4,} + [<->]" D_dump.txt
1 26745366325 os.py 1570 <- _exists
0 26745430872 site.py 1827 <- main
1 26745765127 lazy_import.py 1793 <- _import
1 26745775524 <string> 1699 <- <module>
1 26745778069 <string> 1693 <- <module>
1 26745967573 ast.py 5309 <-
<module>
1 26748632333 socket.py
2320 <- <module>
1 26748740335 lazy_import.py
2287 <- _import
1 26750293236 index.py
26702 <- iter_all_entries
0 26752439940 lazy_import.py 1074 <-
_import
1 26752881042 transform.py 1146
<- create_file
1 26753791493 index.py 59286
<- iter_all_entries
=============================
Next test is "py_calltime.d", from the comments in the file:
py_calltime.d - measure Python elapsed times for functions.
Written for the Python DTrace provider.
TERMINAL WINDOW 1 (the snooping one)
nollan at amanda:Python$ pwd
/Users/nollan/Documents/Datorer/DTrace/DTraceToolkit-0.99/Python
nollan at amanda:Python$ sudo ./py_calltime.d > D_dump2.txt
^C
TERMINAL WINDOW 2
nollan at amanda:Graph-o-Grams$ bzr branch trunk test2
Branched 8 revision(s).
nollan at amanda:Python$ ls -l D_dump2.txt
-rw-r--r-- 1 nollan nollan 243547 18 Apr 10:49 D_dump2.txt
File is smaller, but since it's quite big even this time I only paste
head and tail of it
nollan at amanda:Python$ head D_dump2.txt
Tracing... Hit Ctrl-C to end.
Count,
FILE TYPE
NAME COUNT
StringIO.py func
<module> 1
UserDict.py func
UserDict 1
UserDict.py func
__init__ 1
__init__.py func
<lambda> 1
__init__.py func
CDLL 1
nollan at amanda:Python$ tail D_dump2.txt
sre_parse.py func _parse_sub
2727871
bzrdir.py func sprout
2908930
lazy_import.py func __getattribute__
3107195
lazy_import.py func _import
3121159
lazy_import.py func _replace
3157305
builtins.py func run
4416570
commands.py func run_bzr_catch_errors
4905873
commands.py func main
4907181
bzr func <module>
4957344
"lazy_import.py" seems not to be that lazy at all :-)
Some sort of conclusion, well since I've no clue on the inner workings
on Bazaar and Python,
I have no idea if above text is just ASCII noise or might even give
some ideas to try out
DTrace to find and eliminate some "speed bumps" in the code.
With Best Regards,
Patric Fors
More information about the bazaar
mailing list