programmerin: 2010

pexpect and subprocess

If you want to run a program from a Python script, you have two main options: subprocess and pexpect. I recently needed to run ssh-keygen from a Python script and tried it both ways, so here is a comparison of the two.

pexpect:

import pexpect

def get_fingerprint_pexpect(key_file_name):
    command = 'ssh-keygen -lf %s' % (key_file_name,)
    child = pexpect.spawn(command)
    result_id = child.expect([
            pexpect.TIMEOUT,
            'No such file',
            'fail',
            'error',
            'not a public key',
            '[0-9a-f]{2}:[0-9a-f:]+',])
    if result_id == 0:
        raise Exception('Timeout occurred.')
    if result_id == 1:
        raise Exception('File "%s" does not exist.' % (key_file_name,))
    if result_id in (2, 3, 4):
        raise ValueError('Improperly formatted key.')

    fingerprint = child.match.group()
    return fingerprint

subprocess:

import re
import subprocess

NO_SUCH_FILE_ERROR = re.compile('No such file')
KEY_FORMAT_ERROR = re.compile('(fail|error|not a public key)')
FINGERPRINT = re.compile('[0-9a-f]{2}:[0-9a-f:]+')

def get_fingerprint_subprocess(key_file_name):
    command = ['ssh-keygen', '-lf', key_file_name,]
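    # universal_newlines=True makes communicate() return text rather than
    # bytes on Python 3, so the str patterns above can search it.
    result = subprocess.Popen(command,
            stdout = subprocess.PIPE,
            stderr = subprocess.STDOUT,
            stdin = subprocess.PIPE,
            universal_newlines = True).communicate()[0]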

    no_such_file_error = NO_SUCH_FILE_ERROR.search(result)
    if no_such_file_error:
        raise Exception('File "%s" does not exist.' % (key_file_name,))

    key_format_error = KEY_FORMAT_ERROR.search(result)
    if key_format_error:
        raise ValueError('Improperly formatted key.')

    fingerprint = FINGERPRINT.search(result)
    if fingerprint:
        return fingerprint.group()

    raise Exception('Error generating key fingerprint.')

What the pexpect code is doing
With pexpect, you hand it a list of responses you expect to see. In this case we expect a timeout, one of a few error strings, or the fingerprint. pexpect automatically compiles the strings in this list into regular expressions.

When we call "child.expect" with the list of responses we expect, we get back an id that refers to the index of the matching item in this list. So when we say "if result_id == 1...", we're saying "if the resulting response was the item at index 1 (i.e. 'No such file')...".

We use the following regex to represent the fingerprint:

[0-9a-f]{2}:[0-9a-f:]+

Then later we say "child.match.group()" to get the actual content that was matched.
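To see why the pattern insists on two hex digits followed by a colon, here is a small check that uses only the standard re module. The sample line is made up for illustration (the hex pairs and the path are not real ssh-keygen output), and it assumes the colon-separated fingerprint format this post is matching.

```python
import re

# Hypothetical sample line; the fingerprint and path are invented for this sketch.
sample = ('2048 1a:2b:3c:4d:5e:6f:70:81:92:a3:b4:c5:d6:e7:f8:09 '
          '/tmp/id_rsa.pub (RSA)')

# The pattern from the post: two hex digits, a colon, then more hex and colons.
strict = re.compile('[0-9a-f]{2}:[0-9a-f:]+')
# A looser pattern that accepts any run of hex digits and colons.
loose = re.compile('[0-9a-f:]+')

strict_match = strict.search(sample).group()
loose_match = loose.search(sample).group()

print(strict_match)
print(loose_match)
```

On this sample the looser pattern stops at the leading key size, because "2048" is itself a run of hex digits, while the stricter pattern skips ahead to the fingerprint.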

What the subprocess code is doing
First off, we split the command into its component parts ("ssh-keygen", "-lf", and the key file name). We redirect stdout, stderr, and stdin so that we can read the output easily. Passing "subprocess.PIPE" tells subprocess to open a pipe to the child process, which lets our script talk to its stdout and stdin. Setting stderr to "subprocess.STDOUT" sends error output to the same place as stdout.

Calling subprocess.Popen(...).communicate() gives you a tuple of the output from stdout and stderr. Because stderr was merged into stdout, everything we need is in the first item, so we add "[0]" to the end to pick it out.
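Here is a minimal sketch of what that tuple looks like when stderr is redirected into stdout. It does not run ssh-keygen; the child is a stand-in Python one-off (the child_code name and its messages are invented for this example) that writes one line to each stream.

```python
import subprocess
import sys

# Stand-in child process: a tiny Python program that writes one line to
# stdout and then one line to stderr (ssh-keygen itself is not needed here).
child_code = (
    'import sys\n'
    'sys.stdout.write("to stdout\\n")\n'
    'sys.stdout.flush()\n'
    'sys.stderr.write("to stderr\\n")\n'
)

process = subprocess.Popen([sys.executable, '-c', child_code],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True)  # return text rather than bytes
out, err = process.communicate()

print(repr(out))  # both lines arrive through the stdout pipe
print(err)        # no separate stderr pipe was opened
```

Both lines come back in the first item of the tuple, and the second item is None because stderr never had a pipe of its own.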

subprocess doesn't do any matching for us like pexpect does. It simply returns the output, so you must work out what it means on your own. In this example, we search the output with a few regular expressions to find out what the result was.

Lastly, if the response was none of these, we treat it as an unexpected error, much like the pexpect timeout.

Conclusion
You may now be thinking that subprocess and pexpect are interchangeable, but they're not. If you want to do something quick and easy, subprocess is nice. But if you're dealing with more complicated things (for example, the "ssh" command requires a password, and it can't be passed in via stdin), you should stick with pexpect. It can be tricky to work out all of the regular expressions, but pexpect can keep a log (the logfile attribute of the spawned child) that shows everything pexpect is seeing.
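For contrast, here is a minimal sketch of the easy case, where the child program does read its input from stdin, so subprocess alone is enough. The child is a stand-in Python one-liner (child_code is a name invented for this example), not ssh.

```python
import subprocess
import sys

# Stand-in child: reads everything from stdin and echoes it back upper-cased.
child_code = 'import sys; sys.stdout.write(sys.stdin.read().upper())'

process = subprocess.Popen([sys.executable, '-c', child_code],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True)  # exchange text rather than bytes

# communicate() sends the input, closes stdin, and waits for the child to exit.
out, _ = process.communicate('hello\n')
print(out)
```

This only works because the child is willing to take its input from stdin. A program that refuses to, like the ssh password prompt mentioned above, is where pexpect earns its keep.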

How to create a temporary file in python

I recently had to write some code to use ssh-keygen to get an SSH key's fingerprint (I wasn't allowed to use any fancy ssh Python modules). ssh-keygen expected the key to be stored in a file, so to get around it I had to create a temporary file. Here's how you do it in Python:

import tempfile
import os

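temp_file = tempfile.NamedTemporaryFile()
# The file is opened in binary mode by default, so write bytes.
temp_file.write(b'hi there')
# Flush Python's buffer and ask the OS to write the data to disk,
# so that other programs can read it.
temp_file.flush()
os.fsync(temp_file.fileno())

print('Created temporary file "%s"' % (temp_file.name,))

# Closing the file also deletes it.
temp_file.close()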

Of course, instead of just printing the file name, I passed the temporary file to ssh-keygen to extract the key's fingerprint. Also, always remember to close your file when you're done; closing a NamedTemporaryFile is also what deletes it. Even if an exception happens, you should close your file handles before leaving so that you don't leave behind files you don't need.
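One way to get the "close it even if an exception happens" behaviour is a with block, sketched below. This is not the code I actually used: the child process is a stand-in that just prints the file's contents (reader_code is an invented name), where the real thing would be ssh-keygen. It also assumes a POSIX system, where another process can open the temporary file by name while it is still open here.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for ssh-keygen: a child process that prints the contents of the
# file whose name it is given on the command line.
reader_code = 'import sys; sys.stdout.write(open(sys.argv[1]).read())'

with tempfile.NamedTemporaryFile() as temp_file:
    temp_file.write(b'hi there')
    temp_file.flush()  # make sure the child process sees the data
    temp_path = temp_file.name
    seen_by_child = subprocess.check_output(
            [sys.executable, '-c', reader_code, temp_path],
            universal_newlines=True)

# Leaving the with block closed the file, which also deleted it,
# and that would have happened even if the child process had failed.
still_exists = os.path.exists(temp_path)
print(seen_by_child)
print(still_exists)
```

A try/finally around temp_file.close() would do the same job; the with block is just shorter.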

Monday, March 22, 2010

pexpect and subprocess

Wednesday, March 17, 2010

Hashing a string on the command line

How to create a temporary file in python