Category Archives: ZSH

ZSH map and filter functions using “anonymous functions”

UPDATE: Arash Rouhani picked this up and took it quite a bit further here.

The other day I was thinking that it would be handy to have a map function (in the functional programming sense) for zsh, and I couldn’t see anything in zsh itself that looked like what I wanted. So a quick google turned up this page by Yann Esposito, which gives an implementation of not only map but also filter and fold. Very cool!

The only problem, as the author points out, is that it is inconvenient to have to define a separate named function just to use these facilities. So I groveled around in the zsh docs and found the (e) flag for parameter expansion, which performs parameter, command, and arithmetic expansion on the expanded value. That led me to write new versions of map and filter and related commands that work with anonymous “functions” — really just bits of code that get evaluated with $1 set to something — like so:
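To see the (e) flag in isolation, here is a tiny illustration (the variable names are just for show, and the guard only matters if you paste this into a non-zsh shell):

```shell
# zsh's (e) flag re-expands a parameter's value; guarded so the
# snippet is inert under shells that lack it.
if [ -n "$ZSH_VERSION" ]; then
  tmpl='hello, $name'      # plain text; $name is NOT expanded here
  name=world
  print -- "$tmpl"         # prints the literal: hello, $name
  print -- "${(e)tmpl}"    # (e) re-expands the value: hello, world
fi
```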

### Map each of a list of integer Xs to X+1:

$ mapa '$1+1' {1..4}
2
3
4
5

### Map each FOO.scala to FOO.class:

$ map '$1:r.class' test{1,2}.scala
test1.class
test2.class

### Get the subset which are ordinary files (bin is a dir):

$ filterf 'test -f' bin test{1,2}.scala
test1.scala
test2.scala

### Get the even numbers between 1 and 5:

$ filtera '$1%2 == 0' {1..5}
2
4

### Map each filename to 0 if it is an ordinary file, 1 otherwise:

$ each '[ -f $1 ]; echo $?' /bin test{1,2}.scala
1
0
0

### Given a directory tree containing some Foo.task files,
### an isStartable function that returns success if a task is startable now,
### and a startTask function that starts it, start all startable tasks.

$ eachf startTask $( filterf isStartable **/*.task )

Here are the functions:

###### map{,a}

### map Word Arg ...
### For each Arg, evaluate and print Word with Arg as $1.
### Returns last nonzero result, or 0.

function map() {
  typeset f="$1"; shift
  typeset x
  typeset result=0
  for x; map_ "$x" "$f" || result=$?
  return $result
}
function map_() {
  print -- "${(e)2}"
}

### mapa ArithExpr Arg ...   # is shorthand for
### map '$[ ArithExpr ]' Arg ...

function mapa() {
  typeset f="\$[ $1 ]"; shift
  map "$f" "$@"
}

###### each{,f}

### each Command Arg ...
### For each Arg, execute Command with Arg as $1.
### Returns last nonzero result, or 0.

function each() {
  typeset f="$1"; shift
  typeset x
  typeset result=0
  for x; each_ "$x" "$f" || result=$?
  return $result
}
function each_() {
  eval "$2"
}

### eachf Command Arg ...   # is shorthand for
### each 'Command "$1"' Arg ...

function eachf() {
  typeset f="$1 \"\$1\""; shift
  each "$f" "$@"
}

###### filter{,f,a}

### filter Command Arg ...
### For each Arg, print Arg if Command is successful with Arg as $1.

function filter() {
  typeset f="$1"; shift
  typeset x
  for x; filter_ "$x" "$f"
  return 0
}
function filter_() {
  eval "$2" && print -- "$1"
}

### filterf Command Arg ...   # is shorthand for
### filter 'Command "$1"' Arg ...

function filterf() {
  typeset f="$1 \"\$1\""; shift
  filter "$f" "$@"
}

### filtera ArithRelation Arg ...  # is shorthand for
### filter '(( ArithRelation ))' Arg ...

function filtera() {
  typeset f="(( $1 ))"; shift
  filter "$f" "$@"
}

Writing this kind of code is tricky for me; it’s easy to get the quoting wrong. For example,

$ each 'echo "$1"' \*

should just print an asterisk, but at first I was getting a list of files.

Anyway, it was fun. I might add fold and friends later, when I have more time.
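If you want to experiment in the meantime, here is a rough sketch of what a fold in the same style could look like. Treat it as a sketch rather than polished code: the Word gets evaluated with $1 as the accumulator so far and $2 as the next argument.

```shell
# Sketch of a fold in the same anonymous style; zsh only, so it is
# guarded to be inert under other shells.
if [ -n "$ZSH_VERSION" ]; then
  fold() {
    typeset f="$1" acc="$2"; shift 2
    typeset x
    for x in "$@"; do
      set -- "$acc" "$x"    # make $1 = accumulator, $2 = element
      acc="${(e)f}"
    done
    print -- "$acc"
  }

  fold '$[ $1 + $2 ]' 0 {1..4}   # ((((0+1)+2)+3)+4 -> 10
fi
```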

Oh, one last thing. I use this function for testing zsh code:

TEST() {
  echo TEST: "${(qq)@}"
  "$@"
}

That way I can write a suite of many tests like this one

TEST filtera '$1%2 == 0' {1..5}

and get output like this, showing me what is being tested, followed by the results:

TEST: 'filtera' '$1%2 == 0' '1' '2' '3' '4' '5'
2
4

Without the (qq) you wouldn’t be able to tell where one argument ends and the next begins:

TEST: filtera $1%2 == 0 1 2 3 4 5
2
4

Basic calculator in the shell

I don’t like having to crank up a calculator to do simple calculations. Sometimes I start ruby or scala to do them, but I really like just being able to do them right in the shell. I use zsh, and have code in my .zshrc file to allow me to evaluate expressions very simply while in the shell. In addition, it keeps the last result in variable z (the last letter), which can be used in further calculations or in other commands (as $z).

Here is an example of its use. Note that I’m giving PS1 a value that makes it stand out, just to make this post easier to read — don’t interpret that as a special prompt. The inline comments should make it clear what’s going on.

% PS1="=======> "
=======>               # still in the shell; this is just our prompt
=======>               # z is given 0 as its initial value
=======> + 5           # add 5 to z
5
=======> + 7           # add 7 to z
12
=======> - 3           # subtract 3 from z
9
=======> * 4           # multiply z by 4
36
=======> / 6           # divide z by 6
6
=======> +             # with no value, this means "double z"
12
=======> *             # square z
144
=======> -             # negate z
-144
=======> /             # take the reciprocal of z
-0.0069444444
=======> * 10000
-69.4444444444
=======> int           # remove the fractional part
-69
=======> z - ( 1 / (z/10) )   # calculation using z
-68.8550724638
=======> z             # say again?
-68.8550724638
=======> echo $z >foo  # use that number in a command
=======> , (9/5)**3    # 9/5 cubed, but what's wrong?
1
=======> , (9.0/5)**3   # oh, 9/5 is integer division
5.832

That’s pretty concise and straightforward, no? Here’s the relevant code in my .zshrc file. Note the use of noglob to prevent * from being replaced with filenames:

alias a=alias
typeset -F z=0.0
a calc='noglob calc_'
calc_() {
  # echo 1>&2 'calc: $* = ' "$@"
    if [ $# = 2 ]; then
        [ "$2" = \- ] && set -- 0.0  \- "$1"   #  3 -  =>  0.0 - 3
        [ "$2" = \/ ] && set -- 1.0  \/ "$1"   #  3 /  =>  1.0 / 3
        [ "$2" = \+ ] && set -- "$1" \+ "$1"   #  3 +  =>    3 + 3
        [ "$2" = \* ] && set -- "$1" \* "$1"   #  3 *  =>    3 * 3
    fi
    (( z = $* ))
    case $z in
        *.*) echo $z | sed -e 's/0*$//' -e 's/\.$//'
             ;;
          *) echo $z
             ;;
    esac
}
a      ,='calc'        # comma starts a new calculation
a      z='calc z'
a      0='calc 0.0'
a      1='calc 1.0'
a -- '+'='calc z +'
a -- '-'='calc z -'
a -- '*'='calc z *'
a -- '/'='calc z /'
int() {
    calc $( z | sed 's/\..*//' )
}
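As an aside, the sed pipeline that tidies the output is worth a look on its own. It strips trailing zeros and then a bare trailing dot; note that calc_ only applies it to values containing a dot, so integers are never mangled. Extracted for illustration (the trim name is my own, not part of the code above):

```shell
# The cleanup step from calc_, pulled out for illustration: strip
# trailing zeros, then a trailing dot.  (calc_ only uses it on values
# containing '.', so an integer like 140 is never touched.)
trim() { printf '%s\n' "$1" | sed -e 's/0*$//' -e 's/\.$//'; }
trim 6.0000000000     # -> 6
trim 36.5000000000    # -> 36.5
trim -0.0069444444    # -> -0.0069444444 (no trailing zeros to strip)
```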

I discovered something interesting while modifying the code. It wasn’t working, and I couldn’t figure out why not. I had leading and trailing spaces in the alias bodies, for readability — on a lark I decided to remove them, not really thinking it would help but running out of sensible things to try, and sure enough that fixed the problem. Zsh, at least, appears to treat an alias differently if its body begins with a space. A quick search didn’t turn up anything about that, so if you know what that’s about, please comment!

Give your app a shell-based CLI

I want to share a neat trick for making powerful CLIs (command-line interfaces). I used it at Sun eons ago in a tool called “warlock”, which statically analyzes multi-threaded C programs for locking problems — data races, deadlocks, etc.

But I should start with the project I was working on before that — MP-SAS, an architectural simulator for Sparc multiprocessor systems. The simulator had a CLI, and occasionally somebody would add something like the ability to re-execute the previous line, or the ability to store some result in a named variable.

I argued that we should stop adding shell-like features piecemeal and instead rig the simulator to start as a daemon alongside a real interactive shell, like ksh, and arrange for commands run from the shell to be able to talk to the simulator. Then you could do everything that you already know how to do in the shell while talking to the simulator.

They didn’t go for it — one guy in particular was convinced that it would be too slow. But a year later I had put such a CLI on my next project, warlock. Well, guess what — the performance was just fine. Not only did you get all of the interactive features like recalling lines, line editing, completion, and so on, but you also got scripting. Anything you could do in a shell you could do in warlock. And not somebody else’s shell — *your* shell, whichever shell you happened to like, with all of the aliases you use, with all of the environment variables you have set, etc. I used zsh for interactive warlock work, but scripts typically used sh for compatibility. Ksh, bash, csh, tcsh — they could all be used as the front-end to warlock.

For example, you could give warlock commands like

  load xlt_*.ll  # load files matching a pattern
  locks | grep xlt | sort >locks.out  # save sorted info about certain locks to a file
  func foo<TAB>  # complete a function name

The shell integration was pretty handy! There was even a feature with which you could push the current state of the analyzer on a stack, perform some experiment, and then return to that saved state by popping it off the stack. This was fairly trivial to implement — the “save” command caused the daemon to fork(). The parent waited for the child to exit, and the child responded to further requests from the user. When you said “exit”, the child exited and the parent took over again.
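That copy-on-fork behavior is easy to see in any shell, since a subshell is itself a fork: the child gets a private copy of the state, and the parent's copy survives the "experiment" untouched. A toy illustration (the variable names are invented):

```shell
# Toy version of the save trick: a subshell is a fork, so it gets a
# private copy of the state.
state=original
child=$(
  state=experimental      # mutate state freely in the child...
  echo "$state"
)
echo "$child"   # -> experimental
echo "$state"   # -> original  (...the parent never sees the change)
```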

This is not to say that warlock was a highly usable program — few actually suffered with it long enough to get good results. One who did, Frits Vanderlinden, managed to discipline an entire group of engineers writing Solaris device drivers to make their code clean of warlock errors before checking in changes, and he claimed that as a result warlock caught “countless” bugs in driver code, making Solaris releases that much more solid.

Anyway, the lack of usability wasn’t the fault of the shell integration — I was always quite happy with the way that turned out.

The way I implemented it, when you ran the warlock CLI you were really invoking a perl script (I would probably use python or ruby today, but perl was a great choice then) that did the following:

  1. Set up a temp directory for the session.
  2. Created two named pipes in it, COMMANDS and RESPONSES.
  3. Started the warlock analyzer (a C++ program, but you could do it in Java or whatever) in the background.
  4. Started a shell (whatever was specified in env var WARLOCK_SHELL, or sh by default) with its path augmented to include a directory containing warlock’s commands, as ordinary executables.
  5. Waited around for either the shell or the analyzer to exit.
  6. Cleaned things up.

If you invoked the tool with -c Command, it just passed that on to the shell — batch mode processing.

The analyzer just went into a simple loop in which it basically did

  • Open the COMMANDS pipe, read a command, and close the pipe.
  • Open the RESPONSES pipe, write the response, and close the pipe.

A command like “funcs -v” would just write “funcs -v” to the COMMANDS pipe and read the results on the RESPONSES pipe. But because funcs is just another command to the shell, you could use pipes, redirection, for loops — whatever — to accomplish some task.
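Here is a toy sketch of that protocol in plain sh, with an invented one-line stand-in for the analyzer (this is not the real warlock code, just the shape of it):

```shell
# Toy sketch of the named-pipe protocol.  A fake "analyzer" loops in
# the background: open COMMANDS, read a command, close; then open
# RESPONSES, write the reply, close.
dir=$(mktemp -d)
mkfifo "$dir/COMMANDS" "$dir/RESPONSES"

while read -r cmd < "$dir/COMMANDS"; do   # reopens the pipe each turn
  case $cmd in
    exit) break ;;
    *)    echo "analyzer saw: $cmd" > "$dir/RESPONSES" ;;
  esac
done &

# A client "command" is then just: write the request, read the reply.
echo "funcs -v" > "$dir/COMMANDS"
reply=$(cat "$dir/RESPONSES")
echo "$reply"                   # -> analyzer saw: funcs -v
echo exit > "$dir/COMMANDS"     # tell the fake analyzer to quit
wait
rm -rf "$dir"
```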

Anyway, that’s the idea. An alternative, by the way, is to link with a ksh library. That would give you better performance, if you need to perform hundreds of commands per second. However, it would force your users to use that particular shell, limit you to that shell’s features, and limit the hosts you could run on. Another option would be to use something like Guile, if your users wouldn’t mind it. That would give your users a very powerful scripting environment.  On the other hand, it probably doesn’t have interactive features on a par with modern shells, and most people would have quite a learning curve to use your program.

Different techniques would be appropriate for different situations. I’ve used this technique of grafting an actual shell into the CLI twice now, and both times the result has been great. There you have it!

UPDATE:

I recently did yet another such CLI, this time in Python using the awesome Requests library, that talks directly to a RESTful API (no named pipes). Very nice!