Bash

From AdminWiki


Latest revision as of 15:05, 11 April 2009


Bash Handling

Job Control

Interactive

Key Combinations

I will not use the ^X notation for key combos, since it is a bit misleading in my opinion: after all, we mostly type lower-case letters. People who have remapped the Ctrl or Meta keys will know how to substitute accordingly. And, of course, the terminal type must be set correctly.

Bash input modes

Bash's key bindings can be set to mimic an editor: they can behave either like emacs or like vi. You do this with:

set -o emacs 

for the emacs mode, obviously, and

set -o vi

for the vi mode.

Most distributions ship GNU bash with emacs mode preset.

For the purpose of readability, we will give the key combinations as follows:

  • Ctrl-a/A means: "Ctrl-a in emacs mode, A in vi mode"
  • Ctrl-r/" means: "Ctrl-r in emacs mode, the same (Ctrl-r) in vi mode"
  • Ctrl-o/- means: "Ctrl-o in emacs mode, nothing identical in vi mode (or not known to the author ;)"

A word about the vi mode

The vi mode is only "something like" the vi you know. By default you are in insert mode; you get to command mode (the mode where the key combos work) by pressing Esc. Ctrl-c returns you to a fresh prompt in insert mode. Pressing "i" gets you back into insert mode. Be aware that some vi key combos do not behave quite the same in bash.

Searching in bash history

Of course you know that bash will pull lines from the history into the buffer with the up and down cursor keys.

Ctrl-r/" allows you to recall-by-typing from the bash history.

Executing in bash history

Ctrl-o/- (emacs mode only) does nothing on its own, but when pressed on a history entry (recalled with Ctrl-r or the cursor keys), it executes that entry and then, if there is one, puts the next command from the history into the input buffer.

Example:

philip@dinky:~$ vi test.c
philip@dinky:~$ make test
cc     test.c   -o test
philip@dinky:~$ ./test
Hello World!
philip@dinky:~$ (press Ctrl-r vi here)
(reverse-i-search)`vi': vi test.c (press enter here)
philip@dinky:~$ vi test.c
philip@dinky:~$ (press Ctrl-r mak here)
(reverse-i-search)`mak': make test (press Ctrl-o here)
philip@dinky:~$ make test
cc     test.c   -o test
philip@dinky:~$ ./test

Moving about the prompt/command line

There are some slight differences between the emacs and the vi versions, mostly about what is considered to be in the copy buffer, where the cursor ends up, etc.

  • ctrl-a/0 move the cursor to the beginning of the bash prompt/line
  • ctrl-e/A move the cursor to the end of line
  • ctrl-w/- cut the word before the cursor (including any whitespace in between)
  • ctrl-u/" cut everything from line beginning to, but not including the cursor position
  • ctrl-k/", /D cut everything from, and including the cursor position to end of line
  • ctrl-_/u undo
  • alt-b/- move the cursor back one word
  • alt-f/- move the cursor forward one word
  • ctrl-y/p paste the last thing cut
  • -/h move cursor left
  • -/l move cursor right
  • -/a start inserting after current cursor position
  • -/D cut everything from, and not including
  • -/dd delete line (save it for pasting)
  • ctrl-c/" reset input buffer (do not save for pasting)
  • arrow-up/k move up through history
  • arrow-down/j move down through history

Environment

Profile

Your environment in bash is controlled by several files which may be read at login:

  • /etc/profile - read by every login shell; /etc/bashrc is read only where the distribution's startup files source it
  • .bash_profile, .bash_login, .profile - only the first one found will be read
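
To see which of the per-user files a login shell would pick up, the "first one found" rule can be sketched as a small function. This is only an illustration: the function name first_login_file is made up here, and it assumes the usual lookup order .bash_profile, .bash_login, .profile.

```shell
#!/bin/bash
# Print the first per-user login file that exists and is readable.
# first_login_file is a hypothetical helper name, not a bash builtin.
first_login_file() {
    # $1 = home directory to inspect (defaults to $HOME)
    local home=${1:-$HOME} f
    for f in .bash_profile .bash_login .profile; do
        if [ -r "$home/$f" ]; then
            printf '%s\n' "$home/$f"
            return 0
        fi
    done
    return 1   # none of the three files was found
}
```

Calling first_login_file without arguments inspects your own home directory.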

sudo, su

If you su, do it with su -, which sets up a complete and correct environment for root. Environment variables that were messed up in your user session will be reset to sane defaults, and you will be put in the correct $HOME.

Variables controlling funny things

FIXME PS1, PS2, HOME, ... what can you think of?

Bash Programming

Gotchas

A word about POSIX compliancy

Bash is the most common shell on Linux and on most other Unix systems. If you are into shell programming, you might know that there is a distinct difference between

#!/bin/sh

and

#!/bin/bash (or anything else, like /bin/ksh for that matter)

Using /bin/sh implies that your shell script is POSIX compliant and uses nothing but commands and constructs supported by the POSIX shell. This is a tad difficult to achieve when working in bash, but it can be done. On many systems /bin/sh is still a symlink to /bin/bash, and bash switches to POSIX-compliant behaviour when called under that name. On other systems, there are efforts to use a smaller, more lightweight shell as the default /bin/sh, like Debian's dash. Although dash is not fully POSIX compliant, it is a good tool for developing platform-independent shell scripts, since it does not support bash-isms.

That said, use #!/bin/sh as the file magic for shell scripts that are meant to be portable, and always execute your script with ./script so that you use the right interpreter for testing as well.

If you choose not to be platform independent, make sure you reflect that by making #!/bin/bash the file magic.
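
As a small illustration of the difference, here is a sketch of a prefix test written with POSIX constructs only, with the bash-only spelling mentioned in a comment. The function name starts_with and the sample path are made up for this example.

```shell
#!/bin/sh
# Portable: uses only POSIX constructs (case, printf), no bash-isms.
# starts_with is a hypothetical helper name.
starts_with() {
    # $1 = string, $2 = prefix; returns 0 if the string starts with the prefix
    case "$1" in
        "$2"*) return 0 ;;
        *)     return 1 ;;
    esac
}

if starts_with "/etc/profile" "/etc"; then
    printf '%s\n' "path is below /etc"
fi

# The bash-only equivalent of the test would be:
#   [[ $1 == "$2"* ]]
# which a strict POSIX shell such as dash does not understand.
```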

Do I want to do this in shell?

There usually is a lot of ideologically driven discussion about how or when or why someone should do a shell script instead of a perl/python/ruby/... script.

There are a few things to consider when writing scripts in shell.

Use efficient shell utilities. Efficient shell utilities are normally those that accomplish a complex task which is not as easy to implement in "real" scripting languages like perl or python. Examples would be sorting (sort(1), uniq(1)), certain repeating directory/file operations (xargs(1), find(1)), complete applications (rsync(1), tar(1)) and similar.

Inefficient shell utilities would be those whose function can be quickly implemented using builtins of scripting languages, such as basename(1), cat(1) and the like.

There also is something to be said about how you use those tools. For example, find(1) will not stat() the files it combs through unnecessarily. But if you (for example) use xargs(1) and stat(1) afterwards to filter for links, the only effect is more processes and more overhead compared to simply using the right find(1) switches and options. Scenarios such as that also exist in favour of scripting languages.
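
The symlink case above could look like the following sketch. The function name list_links is made up for illustration; the point is only that find(1) filters by file type itself, so no extra processes are needed.

```shell
#!/bin/sh
# Let find(1) do the filtering: one process, no xargs(1), no stat(1).
# list_links is a hypothetical helper name.
list_links() {
    # $1 = directory to search
    find "$1" -type l
}

# The wasteful variant described above would pipe every path that find
# prints through xargs and stat, and then filter that output for links.
```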

  • What good and efficient shell tools (like find, awk, xargs, sort, uniq, grep ...) can you use to accomplish your task?
  • How many inefficient shell tools (like basename, cat, ...) would you be required to use?
  • How efficient/well can you use your tools? Consider find(1), which can be a powerful tool and can very often be used on its own to solve huge problems - but only if the user knows it *very* well.
  • Is it supposed to be platform independent? Then you shouldn't count on bash builtins to speed things up.
  • Will you still profit from the job control your shell provides?
  • Will you profit from piping? Or will it add lag/overhead?

This list of questions is certainly not complete. You should decide on a per-case basis what to use. Loading an interpreter and precompiling bytecode can also cost a lot. Also, not all interpreted languages ship with optimized (machine code) versions of certain tasks, such as string matching the way grep does it, sorting, file operations etc.

Finally, let me remind you that every execution of a tool like awk or sed means

  • a complete fork of the bash process
  • read operation from the disk for loading the tool
  • setting up a linking environment
  • processing libc loading, nameservice caching etc
  • and a whole lot more

This is the reason why you should avoid find ... | xargs ... constructs with large result sets.
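
To make the cost concrete, here is a minimal sketch that gets the same result once with an external tool and once with a parameter expansion inside the shell. The path is just sample data. Note that the two are not identical in every corner case (for example, paths with a trailing slash).

```shell
#!/bin/sh
path="/var/log/messages.1"    # sample value, assumed for illustration

# External tool: the shell forks and executes basename(1).
name_fork=$(basename "$path")

# Parameter expansion: strips everything up to the last "/",
# without starting a new process.
name_builtin=${path##*/}

echo "$name_fork $name_builtin"
```

Inside a loop over thousands of paths, the second form avoids thousands of process creations.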

Plus, the | (pipe) operator comes with some performance/resource caveats. Long command | command | command constructs can lead to

  • Additional buffers (= bottlenecks)
  • Often huge execution times, during which all of these processes stay in memory
  • Processes trying to continue their work without being able to do so
  • Receiving the result only when all other operations have finished, as in
du -xm / |sort -g

All those warnings might sound very far fetched to you, but consider a system where several thousand instances of small utilities are started each minute - like a monitoring server. Even modern multicore systems will be "impressed" by this load at some point.

Redirection

Standard Input, Output and Error

Digression for a better understanding of file descriptors

If you are on a system with a /proc filesystem (and may god have mercy on your soul if you are not), you will find that there are interesting symlinks in /proc/<pid>/fd. Let me elaborate.

Consider the following: you are logged on, have an interactive shell like bash, and you are the owner of your terminal (tty). Now, find out your PID:

philip@dinky:~$ w |head -3
 15:15:04 up 72 days, 17 min, 35 users,  load average: 0.15, 0.11, 0.09
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT
philip   tty2     -                11Feb09 58days  0.10s  0.00s /bin/login -- 
philip@dinky:~$ ps waux |grep bash |grep tty2
philip   22008  0.0  0.0  21396  1372 tty2     S    Feb11   0:00 -bash
philip@dinky:~$ 

Now, with that information, let's have a peek at our file descriptors:

philip@dinky:~$ ls -l /proc/22008/fd
total 0
lrwx------ 1 philip philip 64 2009-04-10 15:16 0 -> /dev/tty2
lrwx------ 1 philip philip 64 2009-04-10 15:16 1 -> /dev/tty2
lrwx------ 1 philip philip 64 2009-04-10 15:16 2 -> /dev/tty2
lrwx------ 1 philip philip 64 2009-04-10 15:16 255 -> /dev/tty2
philip@dinky:~$ 

These are, of course, links to the terminal of your current interactive shell. This is why software that is still attached to your terminal usually dies when you close the terminal while something is running (typically from a SIGHUP signal rather than a SIGPIPE).

So, what do those numbers mean? When software opens files for reading or writing, the operating system hands it file descriptors, which are plain integer numbers. Thus, the PID plus the file descriptor uniquely identifies an open file system wide - a bit like an IP address + port pair for a TCP connection.

  • 0 is standard input
  • 1 is standard output
  • 2 is standard error
  • everything else are filedescriptors received when the software opened other files, like configuration files, log files etc.
  • 255 is just there to mess with my explanation.

The whole "standard X" metaphor takes a bit to wrap one's head around, but after using it extensively and with deep understanding for a while, one will be able to prevent a lot of shit from happening (processes dying under certain conditions, missing logfiles, mailbombs from cron etc) and finally lead a happy life.

The standard file descriptors are the fds where software writes and receives its user interaction by default. Yet software does not have to use them: while every running piece of software starts out with these file descriptors (it inherits them from its parent process), it can close them at will. Software can indeed decide to "unbind" itself from standard input and output by closing the respective file descriptor. Analogously, the respective symlinks in the proc fs will disappear, and so will the dependency on those files (like the terminal of the interactive shell).
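
Opening and closing a file descriptor can also be tried from within bash itself. The following is only a sketch: the function name fd_demo and the use of fd 3 are arbitrary choices, the /proc lookup works on Linux only, and $BASHPID needs bash 4 or newer (the check is simply skipped otherwise).

```shell
#!/bin/bash
# fd_demo is a hypothetical helper name; fd 3 is an arbitrary free descriptor.
fd_demo() {
    local out
    out=$(mktemp) || return 1

    exec 3>"$out"                  # open fd 3 for writing to the temp file
    echo "hello via fd 3" >&3      # write through the new descriptor

    # On Linux, the open descriptor shows up as a symlink in /proc.
    if [ -e "/proc/$BASHPID/fd/3" ]; then
        ls -l "/proc/$BASHPID/fd/3"
    fi

    exec 3>&-                      # close fd 3; the symlink disappears again
    cat "$out"
    rm -f "$out"
}
```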

By using redirection, you can remap those standard file descriptors in your own way. Consider doing something like this on one terminal:

philip@dinky:~$ cat > /dev/null

and try finding the PID of this cat process, then do something like the following on another shell (without killing the cat of course):

philip@dinky:~$ ls /proc/10519/fd -l
total 0
lrwx------ 1 philip philip 64 2009-04-10 15:31 0 -> /dev/pts/17
l-wx------ 1 philip philip 64 2009-04-10 15:31 1 -> /dev/null
lrwx------ 1 philip philip 64 2009-04-10 15:31 2 -> /dev/pts/17

So, I hope this now starts making sense to you: by redirecting (which is a bash/shell feature, by the way), you can remap the file descriptors to an endpoint you choose. This can be useful for logging errors to a file, as in

find . 2>nopermissions.txt

or to pipe all output through a filter, as in

strace echo Hello World 2>&1 |less

The latter redirects standard error to wherever standard output points, which is the pipe, and sends the resulting combined output through less.

I encourage you to play with redirection a bit before using it for bigger purposes. At the end of the day, almost all of the bash behaviour for redirection makes sense in the greater picture of the "everything is a file" metaphor - you just need to figure it all out ;)

redirect output and error

 ls -l 1>normal_out 2>error_out

if you want to have both in one file

 ls -l 1>normal_out 2>&1
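
The order of the redirections matters, because 2>&1 means "point fd 2 at whatever fd 1 points to right now". A minimal sketch to try this out follows; the function emit and the file names are made up for the example.

```shell
#!/bin/bash
# emit is a hypothetical helper that writes one line to each stream.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

tmp=$(mktemp -d)

# Order 1: stdout goes to the file first, then stderr is pointed at the same file.
emit > "$tmp/both" 2>&1
both_lines=$(grep -c '' "$tmp/both")

# Order 2: stderr is pointed at the current stdout (here: the command
# substitution), and only afterwards is stdout moved to the file.
captured=$(emit 2>&1 > "$tmp/only_out")
only_out=$(cat "$tmp/only_out")

echo "lines in both: $both_lines"
echo "captured: $captured"
echo "only_out: $only_out"
rm -rf "$tmp"
```

With the second order, the error line never reaches the file.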

Named Pipes (FIFO)s

Named Pipes or First-In-First-Out Buffers are what their names suggest: named pipes, the same thing as what you use by typing | after a command, but with a name.

Ok, so what gives? A lot actually. As special files, FIFOs behave a little differently. Consider having output in one program/shellscript and you want to use it in another shellscript, but there simply is no way of piping them together.

Some people would use a temporary file for that, and real files have some advantages and disadvantages over FIFOs which one should be aware of when using them:

  • First and foremost, FIFOs block. What does that mean? If you read on one end and nobody is writing on the other end, the read will block, as will a write operation for that matter, if nobody is reading.

Use two terminals to verify. First terminal:

philip@jambalaya:~$ mkfifo fifo
philip@jambalaya:~$ echo 1 > fifo

aaand this will block. But open a second terminal, and do something along the lines of

philip@jambalaya:~$ cat fifo
1

aaand there you go! The block on the first terminal is released, and the echo command can exit. The same will happen to normal processes that try to read files (which normally do not block, or not for long anyway), so this is why grepping a directory with FIFOs in it is a bad idea.

  • FIFOs don't use disk space. This eccentricity is the beauty of special files - they are only logical items for the kernel, which use an inode, but no real disk space.
  • FIFOs need to have filesystem support. This is a direct result of being a kernel matter and this is also why you are out of luck on (V)FAT partitions, among others.
  • FIFOs cannot "cache" anything - once data has been read from a FIFO, it is gone.
  • One cannot seek in FIFOs - first-in-first-out is all that matters.
  • FIFOs are a kernel feature - not a bash or filesystem feature.
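
Because of the blocking behaviour described above, it can help to check for FIFOs before reading files in bulk. The following sketch uses the standard -p test and find's -type f; the function names is_fifo and grep_regular_files are made up for this example.

```shell
#!/bin/sh
# is_fifo and grep_regular_files are hypothetical helper names.

# Return 0 if the given path is a named pipe.
is_fifo() {
    [ -p "$1" ]
}

# Search only regular files below a directory, so that grep never
# opens a FIFO and never blocks waiting for a writer.
grep_regular_files() {
    # $1 = pattern, $2 = directory
    find "$2" -type f -exec grep -l -- "$1" {} +
}
```

grep_regular_files prints the names of the matching files, like grep -l does.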

So, what can you use FIFOs for?

Using FIFOs as a way of interprocess communication with shellscripts

First, create a fifo:

philip@jambalaya:~$ mkfifo fifo

Put this in a bash shellscript and run it:

break="no"
while [ "$break" != "yes" ]
do
       # cat blocks until a writer opens the fifo, and returns at EOF
       break=$(cat fifo)
       echo "may i exit? $break."
done
echo "thanks!"

and try:

philip@jambalaya:~$ echo -n no > fifo
philip@jambalaya:~$ echo -n yes > fifo
philip@jambalaya:~$
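
For experimenting in a single terminal, here is a self-contained variation of the same idea. It is a sketch, not the script above: it keeps the FIFO open on fd 3 and reads line by line, so it does not depend on the timing between two separate writers. The function name fifo_demo, the temporary directory and the background writer are assumptions made for this example.

```shell
#!/bin/bash
# fifo_demo is a hypothetical helper name.
fifo_demo() {
    local dir fifo answer
    dir=$(mktemp -d) || return 1
    fifo="$dir/fifo"
    mkfifo "$fifo" || return 1

    # Stand-in for the second terminal: a background writer that sends
    # both answers, one per line, through a single open of the FIFO.
    printf 'no\nyes\n' > "$fifo" &

    exec 3< "$fifo"                 # blocks until the writer has opened the FIFO
    while read -r answer <&3; do
        echo "may i exit? $answer."
        [ "$answer" = "yes" ] && break
    done
    exec 3<&-                       # close our end of the FIFO

    wait                            # reap the background writer
    rm -rf "$dir"
    echo "thanks!"
}

fifo_demo
```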

Using FIFOs as "parachutes"

You can halt processes by redirecting STDIN, STDOUT or STDERR. This has only very few use cases, but when you need it, you will know it and it will come in handy, so don't forget.

Make a process receive your input

philip@jambalaya:~$ cat fifo | process

now use another shell to supply the standard input:

philip@jambalaya:~$ cat file > fifo

process will receive anything you put in the fifo until EOF. You can use this multiple times, once for each blocking read() operation process does on STDIN.

Questions and Solutions

Below are some examples of bash problems that might come up.

Arrays in Bash:

 foo[0]=1;
 foo[1]=2;
 foo[2]=3;
 # loop
 for (( i=0; i<${#foo[@]}; i++ ));
 do
     echo "${foo[$i]}";
 done;
 # more like a foreach (quoted, so elements with spaces stay intact)
 for i in "${foo[@]}";
 do
     echo "$i";
 done;

Variable Variables

 foobar=5;
 bar="foobar";
 echo $foobar;
 echo $bar;
 echo ${!bar};

Variable Variables in Arrays

 foo[0]=1;
 foo[1]=2;
 foo[2]=3;
 foo_tcp[0]="AT";
 foo_udp[0]="AU";
 foo_tcp[1]="BT";
 foo_udp[1]="BU";
 foo_tcp[2]="CT";
 foo_udp[2]="CU";
 # loop
 for (( i=0; i<${#foo[@]}; i++ ));
 do
     echo "${foo[$i]}";
     for k in udp tcp;
     do
         # build the name "foo_udp[0]" etc. and dereference it
         data="foo_${k}[$i]";
         echo "${!data}";
     done;
 done;

More Documentation

The best advanced guide for bash scripting: http://www.tldp.org/LDP/abs/html/
