Monday, July 5, 2010

parallel ssh tool roundup

So I'm in the market for a good "parallel" ssh tool.  Basically, I want to ssh and type some commands like I always do, except that instead of one command's output, I want the command to be run on a bunch of machines and I want responses from each.  I've used things like mpiexec in the past, but I was hoping for something more ad-hoc.  I just want to specify hosts on the command line (with no prior setup).  I really don't even want to require ssh keys or having the same password if I can avoid it.  Like I said, just ssh as usual, run some commands, get multiple results.

Anyway, these are the tools I've come across (in no particular order).  I'll outline what I regard as the advantages and disadvantages of each below.
  1. pdsh - "a high-performance, parallel remote shell utility" from Lawrence Livermore National Laboratory
  2. pssh - an implementation of some parallel ssh tools in python
  3. dsh - dancer's / distributed shell
  4. pydsh - a python version of dancer's shell
  5. clusterssh - "a tool for making the same change on multiple servers"
  6. mussh - "a shell script ... to execute a command or script over ssh on multiple hosts"
  7. sshpt - "SSH Power Tool (sshpt) enables you to execute commands and upload files to many servers simultaneously via SSH"
  8. multixterm - part of the expect project
  9. clusterit - "a collection of clustering tools, to turn your ordinary everyday pile of UNIX workstations into a speedy parallel beast"
  10. dish - "The diligence shell 'dish' executes commands via ssh/rsh/telnet/mysql simultaneously on several systems"

pdsh from LLNL is first up.  Aside from the fact that it is a native binary (and hence slightly less flexible than something written in bash or python), pdsh has a wide variety of features, including different distribution algorithms, readline support, and cluster resource manager integration (which I really don't need).  I wasn't able to get dynamic modules to work under cygwin (to be fair, I didn't give it much effort), but once configured with static modules, pdsh works as advertised.

pssh is a suite of tools for using ssh across multiple machines.  I give it points for using python and hence running anywhere with a python interpreter.  However, it doesn't appear to have an interactive mode; the command to run (or the file to be copied, or whatever) must be supplied on the command line.

dsh is the original distributed shell tool.  The original C implementation has led to versions in python and perl.  An implementation is included in ClusterIt as well.  I didn't bother to compile it, mostly due to its dependencies, but it still seems like a good candidate in this realm.

ClusterSSH also looks interesting and like it would meet my needs.  A Perl implementation of anything always garners my favor.  But, like dvt from ClusterIt, ClusterSSH relies on the X11 protocol for controlling its parallel terminals.

Kudos to the guys who wrote mussh for sticking with bash and only bash.  By sticking to shell scripting they've come up with an easy way to get started running commands across a bunch of machines.  However, without some sort of terminal support or REPL (which simply might not be possible using only bash), my requirement for interactive usage is not met.

ClusterIt actually does much more than I want or need.  It contains an entire suite of tools for running programs on a cluster as well as managing the configuration.  The dvt (distributed virtual terminal) does in fact look like the kind of thing I want, but with its heavy reliance on X11 for communications between the main terminal and where the commands are executed, I'm writing clusterit off without much further thought.


multixterm appears to be an expect script that has been around a while.  But it doesn't seem to have been included with my expect distribution, and with all these other tools available, I'm not keen on tracking it down.

dish looks promising; it uses tcl (with threads) and expect to control multiple ssh sessions.  Unfortunately, cygwin's default tcl build is without threads, and my primary use case is to ssh from my laptop to a bunch of machines.  You'd be correct to point out I should get a real computer... and trust me, I'm working on it, but for now that is not within my control.

In the end, maybe I should write my own using paramiko and throw another contender on the list.
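
For what it's worth, here is a minimal sketch of where such a contender might start, in Python.  It is not a finished tool: it is non-interactive, `ssh_run` and `run_on_hosts` are names made up for this example, it assumes paramiko is installed, and it accepts unknown host keys automatically, which is convenient for ad-hoc use but not something to rely on where security matters.  It runs one command on each host through a thread pool and collects each host's complete output.

```python
from concurrent.futures import ThreadPoolExecutor


def ssh_run(host, command, username, password):
    """Run one command on one host with paramiko and return its stdout as text."""
    import paramiko  # third-party; imported lazily so the rest works without it

    client = paramiko.SSHClient()
    # Assumption for an ad-hoc tool: accept unknown host keys automatically.
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username=username, password=password, timeout=10)
    try:
        _stdin, stdout, _stderr = client.exec_command(command)
        return stdout.read().decode(errors="replace")
    finally:
        client.close()


def run_on_hosts(hosts, command, run_one, max_workers=10):
    """Run command on every host in parallel; return {host: output or error text}."""
    hosts = list(hosts)

    def safe(host):
        try:
            return run_one(host, command)
        except Exception as exc:  # one bad host should not abort the others
            return "ERROR: %s" % exc

    # max_workers caps how many ssh sessions are in flight at once.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        outputs = list(pool.map(safe, hosts))
    return dict(zip(hosts, outputs))
```

With placeholder host names and credentials, a call might look like `run_on_hosts(["web1", "web2"], "uptime", lambda h, c: ssh_run(h, c, "me", "secret"))`.  Passing the per-host runner in as an argument keeps the thread pool independent of paramiko, so a different password per host is just a different lambda.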

1 comment:

Unknown said...

http://m.a.tt/er/massh
http://m.a.tt/er/ambit

Shameless self-promotion. Originally Massh was a serialized script for pushing, then running small, procedural 'jobs' on multiple hosts and was named deploy.sh (not the Jetty deploy.sh - yet another one). Like you, I found myself searching and trying the parallel ssh'ers that existed at that time. I tried a few and settled on one from your list that will remain unnamed.

I work in a very large environment. Initially I used my chosen parallel ssh'er rather conservatively - slowly but surely using it for more complicated tasks, on increasingly more hosts. It did not take long for said ssh'er to completely fall over with performance issues (setting inflight ssh connections to 10 for a couple hundred target hosts crushed the box I was running the ssh'er from). Another area where almost all ssh'ers failed miserably was output. Most of them were fine dealing with a single line of output per target host. Most of them were useless for multi-line output per target host (what's the point of returning out-of-order output for df -h [/usr of host x directly under / of host y]?). This inspired me to write Massh.
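
To illustrate the ordering problem, here is a small Python sketch.  It is not how Massh does it; `group_output` is a made-up name and the approach is only one assumption about a reasonable fix: buffer each host's complete output, then emit one labeled block per host in host order, so lines from different hosts never interleave.

```python
def group_output(results):
    """Format (host, output) pairs, given in any arrival order, as one block per host."""
    blocks = []
    for host, output in sorted(results, key=lambda pair: pair[0]):
        lines = output.splitlines() or ["(no output)"]
        # Prefix every line with its host so multi-line output stays attributable.
        blocks.append("\n".join("%s: %s" % (host, line) for line in lines))
    return "\n".join(blocks)
```

The trade-off is that nothing is shown for a host until its command has finished, which is usually acceptable for short commands like df -h.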

Massh is written in Bash, but has existed previously in Perl (it went from Bash, to Perl, to a Perl variant written almost entirely by an employee of mine at a past company, back to Bash). Massh has a companion script - Ambit - that handles enumerating hostnames and managing HostGroups from an ever-increasing number of sources:

Files
Predefined System Wide HostGroups
Predefined User Specific HostGroups
Predefined Network HostGroups [DNS TXT records]
last but not least...
Ambit expandable string[s] on the command line (example below)

In the near future I will be adding support for Genders (used by pdsh) and NIS netgroups. In fact Genders support is probably already working but I've yet to test it.


Example: Ambit Expanding Hostlist from String Provided on the Command Line

$ ambit ns[1..5].[apple,google,twitter,facebook].com
ns1.apple.com
ns1.facebook.com
ns1.google.com
ns1.twitter.com
ns2.apple.com
ns2.facebook.com
ns2.google.com
ns2.twitter.com
ns3.apple.com
ns3.facebook.com
ns3.google.com
ns3.twitter.com
ns4.apple.com
ns4.facebook.com
ns4.google.com
ns4.twitter.com
ns5.apple.com
ns5.facebook.com
ns5.google.com
ns5.twitter.com
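
For readers curious how this kind of expansion can work, here is a rough Python sketch.  It is not Ambit's code (Ambit is a Bash script), and `expand` is a made-up name; it only assumes the two bracket forms shown above, a numeric range like [1..5] and a comma-separated list, and sorts the result as plain strings to match the listing.

```python
import itertools
import re

# Matches one bracket group such as [1..5] or [apple,google].
_GROUP = re.compile(r"\[([^\]]*)\]")


def _choices(body):
    """Turn the text inside one bracket group into a list of strings."""
    m = re.fullmatch(r"(\d+)\.\.(\d+)", body)
    if m:
        start, end = int(m.group(1)), int(m.group(2))
        return [str(n) for n in range(start, end + 1)]
    return body.split(",")


def expand(pattern):
    """Expand every bracket group in pattern and return the sorted host names."""
    # Splitting on a capturing group leaves literal text at even indexes
    # and the group bodies at odd indexes.
    parts = _GROUP.split(pattern)
    options = [_choices(p) if i % 2 else [p] for i, p in enumerate(parts)]
    # Plain string sort: fine for single-digit ranges like the example.
    return sorted("".join(combo) for combo in itertools.product(*options))
```

Calling `expand("ns[1..5].[apple,google,twitter,facebook].com")` yields the same 20 names in the same order as the listing above.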

Massh does the following very well and very parallelized:

Runs Commands on Target Hosts
Runs Local Scripts on Target Hosts
Pushes Files to Target Hosts
Pulls Files From Target Hosts

Here is a short vid that gives a good indication of Massh's performance (the # of hosts in the vid is 200):

massh-is-fast.mov

This image shows well ordered and clear multi-line output.

multi-line-output.jpg

As for these two requirements:

"I just want to specify hosts on the command line (with no prior setup). I really don't even want to require ssh keys or having the same password if I can avoid it."

Massh does depend on ssh private/public keys - BUT - I have a working first build of a parallelized public key deployer. It *safely* prompts the user for their password for the target hosts and then quickly installs the public key in parallel. The problem is the "[not] having the same password" part. I really don't know of an effective way to manage multiple passwords that can be referenced by an ssh'er during auth AND is actually secure. I almost always insist that users have no password set (shadow entries are set to the invalid string !) and that SSH only allow key-based authentication. Let me know if you try out Massh. Feel free to contact me if you have any issues or want to suggest a feature.

Regards,
Mike (Massher) Marschall