Monday, March 22, 2010

pexpect and subprocess

If you want to run a program from your Python script, you have two main options: subprocess and pexpect.  Recently I wanted to run ssh-keygen from a Python script and tried it both ways, so here is a side-by-side comparison.

pexpect:
import pexpect

def get_fingerprint_pexpect(key_file_name):
    command = 'ssh-keygen -lf %s' % (key_file_name,)
    child = pexpect.spawn(command)
    result_id = child.expect([
            pexpect.TIMEOUT,
            'No such file',
            'fail',
            'error',
            'not a public key',
            '[0-9a-f]{2}:[0-9a-f:]+',])
    if result_id == 0:
        raise Exception('Timeout occurred.')
    if result_id == 1:
        raise Exception('File "%s" does not exist.' % (key_file_name,))
    if result_id in (2, 3, 4):
        raise ValueError('Improperly formatted key.')

    fingerprint = child.match.group()
    return fingerprint

subprocess:
import re
import subprocess

NO_SUCH_FILE_ERROR = re.compile('No such file')
KEY_FORMAT_ERROR = re.compile('(fail|error|not a public key)')
FINGERPRINT = re.compile('[0-9a-f]{2}:[0-9a-f:]+')

def get_fingerprint_subprocess(key_file_name):
    command = ['ssh-keygen', '-lf', key_file_name,]
    result = subprocess.Popen(command,
            stdout = subprocess.PIPE,
            stderr = subprocess.STDOUT,
            stdin = subprocess.PIPE).communicate()[0]
    no_such_file_error = NO_SUCH_FILE_ERROR.search(result)
    if no_such_file_error:
        raise Exception('File "%s" does not exist.' % (key_file_name,))

    key_format_error = KEY_FORMAT_ERROR.search(result)
    if key_format_error:
        raise ValueError('Improperly formatted key.')

    fingerprint = FINGERPRINT.search(result)
    if fingerprint:
        return fingerprint.group()

    raise Exception('Error generating key fingerprint.')

What the pexpect code is doing
With pexpect, you hand it a list of responses you expect to see.  In this case we expect to see either a timeout, one of a few potential error strings, or the fingerprint.  pexpect compiles each string in this list into a regular expression automatically.

When we call "child.expect" with the list of responses we expect, we get back an id that refers to the index of the matching item in this list.  So when we say "if result_id == 1...", we're saying "if the resulting response was the item at index 1 (i.e. 'No such file')...".
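
To make the index lookup concrete, here is a rough stand-in that uses only the standard library. It is not pexpect's implementation (pexpect matches incrementally as output arrives from the child), and the names first_match_index and TIMEOUT, along with the sample strings, are made up for illustration. The placeholder at index 0 keeps the positions lined up with the list in the pexpect code above.

```python
import re

TIMEOUT = object()  # hypothetical placeholder standing in for pexpect.TIMEOUT

PATTERNS = [TIMEOUT, 'No such file', 'fail', 'error', 'not a public key']

def first_match_index(patterns, text):
    """Return (index, match) for the pattern that matches earliest in text.

    Ties go to the pattern listed first. Non-string entries (the timeout
    placeholder) are skipped. Returns (None, None) if nothing matches.
    """
    best = None
    for index, pattern in enumerate(patterns):
        if not isinstance(pattern, str):
            continue
        match = re.search(pattern, text)
        if match is None:
            continue
        # Keep the earliest match; on a tie the lower index stays.
        if best is None or match.start() < best[1].start():
            best = (index, match)
    return best if best is not None else (None, None)

index, match = first_match_index(PATTERNS, 'key.pub: No such file or directory')
print(index, match.group())
```

The returned index is what the "if result_id == ..." checks branch on, and the match object plays the role of "child.match".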

We use the following regex to represent the fingerprint:
[0-9a-f]{2}:[0-9a-f:]+
Then later we say "child.match.group()" to get the actual content that was matched.
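
The leading "[0-9a-f]{2}:" part of that regex matters. The short sketch below runs the regex against a made-up sample line, assumed to resemble the colon-separated output of "ssh-keygen -lf" (key length, fingerprint, comment, key type). It also tries a looser pattern to show what goes wrong without the prefix. The sample string and the names STRICT and LOOSE are illustrative only.

```python
import re

# Made-up sample: "<bits> <fingerprint> <comment> (<type>)".
SAMPLE_OUTPUT = ('2048 00:11:22:33:44:55:66:77:88:99:aa:bb:cc:dd:ee:ff '
                 'user@host (RSA)')

STRICT = re.compile('[0-9a-f]{2}:[0-9a-f:]+')  # the regex from the article
LOOSE = re.compile('[0-9a-f:]+')               # same thing without the prefix

strict_match = STRICT.search(SAMPLE_OUTPUT).group()
loose_match = LOOSE.search(SAMPLE_OUTPUT).group()

print(strict_match)  # the full fingerprint
print(loose_match)   # the key length, because "2048" is all hex digits
```

Requiring two hex digits followed by a colon skips past the key length and lands on the fingerprint itself.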

What the subprocess code is doing
First off, we split up the command into its component parts ("ssh-keygen", "-lf", and the key file name).  We redirect stdout, stderr, and stdin so that we can read and write them easily.  Passing "subprocess.PIPE" tells subprocess to open a pipe between our script and that stream of the child process.  Setting stderr to "subprocess.STDOUT" sends error output into the same pipe as stdout.
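
One practical reason to pass the command as a list is that each element reaches the program as exactly one argument, even if it contains spaces. The sketch below uses shlex.split as a stand-in for shell-style splitting of a command string (an assumption for illustration, not a claim about how any particular library splits commands), with a hypothetical file name.

```python
import shlex

key_file_name = 'my key.pub'  # hypothetical file name containing a space

# String form: something has to split this back into arguments,
# and the space in the file name produces one argument too many.
as_string = 'ssh-keygen -lf %s' % (key_file_name,)
split_args = shlex.split(as_string)

# List form: the file name stays a single argument.
as_list = ['ssh-keygen', '-lf', key_file_name]

# Quoting the file name makes the string form split correctly again.
quoted_args = shlex.split('ssh-keygen -lf %s' % (shlex.quote(key_file_name),))

print(split_args)
print(as_list)
print(quoted_args)
```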

Calling subprocess.Popen(...).communicate() waits for the program to finish and gives you a tuple of (stdout, stderr).  Because stderr was redirected into stdout, the second item is None, so we take only the first item by adding the "[0]" to the end.
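
Here is a small sketch of that tuple, written for Python 3. To keep it runnable anywhere, it launches a tiny Python child (a stand-in for ssh-keygen, not part of the original code) that writes one line to stdout and one to stderr. The universal_newlines flag is an addition so that the output comes back as text rather than bytes.

```python
import subprocess
import sys

# Stand-in child process: one line to stdout, one line to stderr.
CHILD_CODE = (
    "import sys\n"
    "sys.stdout.write('to stdout\\n'); sys.stdout.flush()\n"
    "sys.stderr.write('to stderr\\n'); sys.stderr.flush()\n"
)

process = subprocess.Popen(
    [sys.executable, '-c', CHILD_CODE],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,   # merge stderr into the stdout pipe
    stdin=subprocess.PIPE,
    universal_newlines=True,    # return str instead of bytes
)
stdout_data, stderr_data = process.communicate()

print(repr(stdout_data))  # contains both lines
print(stderr_data)        # None, because stderr went to the stdout pipe
```

Without universal_newlines (or an explicit decode), Python 3 returns bytes, and str regex patterns like the ones above would not match against it.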

Subprocess doesn't do any matching for us like pexpect does.  It simply returns the output, so you must work out what it means on your own.  In this example, we have some regular expressions that we match against to find out what the result was.

Lastly, if the response was none of these, we treat it as an unexpected error, much like the pexpect timeout.

Conclusion
You may be thinking now that subprocess and pexpect are interchangeable, but they're not.  If you want to run a command once and read its output, subprocess is quick and easy.  But if you need to interact with the program while it runs (for example, the "ssh" command prompts for a password, and it can't be passed in via stdin), you should stick with pexpect.  It can be tricky working out all of the regular expressions, but pexpect can log everything it sees (for example, by assigning a file object to the child's logfile attribute), which makes debugging much easier.

3 comments:

  1. It would be great if you could shed some more light on the conclusion, i.e. what to use when. I am still confused between the two. I obviously want to interact with the shell (providing a password, etc.) but I also want to be on the safe side.
