even more Python vs Perl performance

Following my last two posts (http://blog.entek.org.uk/?p=106 and http://blog.entek.org.uk/?p=112) I took some profilers to my codes.

First up my revised Python implementation:

> python -m cProfile checkmail2
...
         16225093 function calls in 8.663 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    8.663    8.663 :1()
        1    0.000    0.000    0.000    0.000 UserDict.py:17(__getitem__)
        1    5.906    5.906    8.662    8.662 checkmail2:3()
        1    0.000    0.000    8.663    8.663 {execfile}
       37    0.000    0.000    0.000    0.000 {method '__enter__' of 'file' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
       22    0.000    0.000    0.000    0.000 {method 'get' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 {method 'iteritems' of 'dict' objects}
 16224990    2.755    0.000    2.755    0.000 {method 'startswith' of 'str' objects}
       37    0.001    0.000    0.001    0.000 {open}
        1    0.000    0.000    0.000    0.000 {posix.listdir}

Quite clearly there is a huge number (16 million!) of calls to startswith which is the biggest time-sink outside the main script.

Comparing the Perl implementation:

> perl -d:DProf checkmail3
...
> dprofpp
Total Elapsed Time = 3.426896 Seconds
  User+System Time = 3.401494 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 0.12   0.004  0.007      4   0.0010 0.0016  main::BEGIN
 0.03   0.001  0.001      5   0.0002 0.0002  File::Basename::BEGIN
 0.03   0.001  0.001      1   0.0009 0.0009  warnings::BEGIN
 0.03   0.001  0.001     37   0.0000 0.0000  File::Basename::_strip_trailing_sep
 0.03   0.001  0.001     37   0.0000 0.0000  File::Basename::fileparse
 0.00   0.000  0.002     37   0.0000 0.0000  File::Basename::basename
 0.00   0.000  0.000      1   0.0003 0.0003  File::Glob::doglob
 0.00   0.000  0.000      1   0.0001 0.0001  DynaLoader::dl_load_file
 0.00   0.000  0.000      1   0.0001 0.0003  XSLoader::load
 0.00   0.000  0.000      1   0.0001 0.0001  File::Basename::fileparse_set_fstype
 0.00   0.000  0.000      1   0.0001 0.0001  Exporter::import
 0.00   0.000  0.000      2   0.0000 0.0000  warnings::import
 0.00   0.000  0.000      3   0.0000 0.0000  strict::import
 0.00   0.000  0.000      1   0.0000 0.0003  File::Glob::csh_glob
 0.00   0.000  0.000      1   0.0000 0.0000  strict::bits

Ignoring the actual times which are not directly comparable due to the profiling overheads we can clearly see Perl is benefiting hugely from the inbuilt regex engine as there are 0 function calls associated with each line check.

I did replace the ‘str.startswith’ implementation of the Python script with a version which used ‘re’ regex objects, but this showed even worse performance:

> time python checkmail4
...
python checkmail4 8.42s user 0.33s system 99% cpu 8.765 total

Profiling this one we see the overhead of using ‘re.match’ was about double that of ‘str.startswith’ and, obviously, the number of calls remained the same. On top of this I introduced additional overhead of two calls to ‘re.compile’ at the start of the script, which the profiler showed incurred not insignificant function calls of their own:

> python -m cProfile checkmail4
...
         16225312 function calls in 12.416 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   12.416   12.416 :1()
        1    0.000    0.000    0.000    0.000 UserDict.py:17(__getitem__)
        1    6.662    6.662   12.415   12.415 checkmail4:3()
        2    0.000    0.000    0.000    0.000 re.py:186(compile)
        2    0.000    0.000    0.000    0.000 re.py:227(_compile)
        2    0.000    0.000    0.000    0.000 sre_compile.py:367(_compile_info)
        2    0.000    0.000    0.000    0.000 sre_compile.py:38(_compile)
        4    0.000    0.000    0.000    0.000 sre_compile.py:480(isstring)
        2    0.000    0.000    0.000    0.000 sre_compile.py:486(_code)
        2    0.000    0.000    0.000    0.000 sre_compile.py:501(compile)
       15    0.000    0.000    0.000    0.000 sre_parse.py:144(append)
        2    0.000    0.000    0.000    0.000 sre_parse.py:146(getwidth)
        2    0.000    0.000    0.000    0.000 sre_parse.py:184(__init__)
       21    0.000    0.000    0.000    0.000 sre_parse.py:188(__next)
        2    0.000    0.000    0.000    0.000 sre_parse.py:201(match)
       19    0.000    0.000    0.000    0.000 sre_parse.py:207(get)
        2    0.000    0.000    0.000    0.000 sre_parse.py:307(_parse_sub)
        2    0.000    0.000    0.000    0.000 sre_parse.py:385(_parse)
        2    0.000    0.000    0.000    0.000 sre_parse.py:669(parse)
        2    0.000    0.000    0.000    0.000 sre_parse.py:73(__init__)
        2    0.000    0.000    0.000    0.000 sre_parse.py:96(__init__)
        2    0.000    0.000    0.000    0.000 {_sre.compile}
 16224990    5.751    0.000    5.751    0.000 {built-in method match}
        1    0.000    0.000   12.416   12.416 {execfile}
        6    0.000    0.000    0.000    0.000 {isinstance}
       44    0.000    0.000    0.000    0.000 {len}
       37    0.000    0.000    0.000    0.000 {method '__enter__' of 'file' objects}
       59    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
       24    0.000    0.000    0.000    0.000 {method 'get' of 'dict' objects}
        2    0.000    0.000    0.000    0.000 {method 'items' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 {method 'iteritems' of 'dict' objects}
        4    0.000    0.000    0.000    0.000 {min}
       37    0.002    0.000    0.002    0.000 {open}
       13    0.000    0.000    0.000    0.000 {ord}
        1    0.000    0.000    0.000    0.000 {posix.listdir}

Quite clearly from the profiler output each call to either ‘str.startswith’ or ‘re.match’ use a very small amount of processor time (too small to be output) but the cumulative effect of 16 million calls is where the big slowdown was occurring. To get around this I tried implementing the ‘str.startswith’ version using string splicing (i.e. “line[:5] == ‘From ‘” rather than “line.startswith(‘From ‘)”) and the result was dramatic:

> time python checkmail4
...
python checkmail4 3.86s user 0.31s system 99% cpu 4.186 total

The profiler output for this version shows that the number of function calls is now on a par with the Perl implementation:

> python -m cProfile checkmail4
...
         110 function calls in 4.311 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    4.311    4.311 :1()
        1    0.000    0.000    0.000    0.000 UserDict.py:17(__getitem__)
        1    4.308    4.308    4.311    4.311 checkmail4:3()
        1    0.000    0.000    4.311    4.311 {execfile}
       37    0.000    0.000    0.000    0.000 {method '__enter__' of 'file' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
       29    0.000    0.000    0.000    0.000 {method 'get' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 {method 'iteritems' of 'dict' objects}
       37    0.003    0.000    0.003    0.000 {open}
        1    0.000    0.000    0.000    0.000 {posix.listdir}

This puts the Python version within 0.6s of the Perl version, which is close enough for me. Especially considering this is effectively comparing Perl to Python on Perl’s hometurf of text matching.

I think Perl would probably still out perform Python if I was wanting to do something more fancy involving regex substitutions but Python’s performance issues, in this case, seem to be purely down to function call overheads which Perl sidesteps by incorporating the regex engine into the core language.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>