Bash
Bash Handling
Job Control
Interactive
Key Combinations
I will not use the ^X notation for key combos, since it is a bit misleading in my opinion - after all, we are mostly using lower-case letters, and people who have remapped their Ctrl or Meta keys know how to substitute correctly. And of course the terminal type must be set correctly.
Bash input modes
Bash's key-combo handling can be set to editor-like behaviour; specifically, it can be made to behave like emacs or like vi. You do this with:
set -o emacs
for the emacs mode, obviously, and
set -o vi
for the vi mode.
Most distributions ship GNU bash with the emacs mode preset.
For the purpose of readability, we will give the key combinations as follows:
- Ctrl-a/A means: "Ctrl-a in emacs mode, A in vi mode"
- Ctrl-r/" means: "Ctrl-r in emacs mode, Ctrl-r in vi mode"
- Ctrl-o/- means: "Ctrl-o in emacs mode, nothing equivalent in vi mode (or none known to the author ;)"
A word about the vi mode
The vi mode is only "something like" the vi you know. By default you are in insert mode; you get to command mode (the mode where key combos work) by pressing Esc. If you press Ctrl-c, you are returned to a fresh insert-mode prompt. By pressing "i", you get back to insert mode. Be aware that some vi key combos do not behave exactly as they do in vi itself.
Searching in bash history
Of course you know that bash will pull lines from the history into the buffer with the up and down cursor keys.
Ctrl-r/" allows you to recall-by-typing from the bash history.
Executing from the bash history
Ctrl-o/- does nothing on an empty line, but when pressed on the line of a history entry (recalled with Ctrl-r or the cursor keys), it will execute that history entry and, when it exits, put the next command from history into the shell buffer.
Example:
philip@dinky:~$ vi test.c
philip@dinky:~$ make test
cc test.c -o test
philip@dinky:~$ ./test
Hello World!
philip@dinky:~$ (press Ctrl-r vi here)
(reverse-i-search)`vi': vi test.c
(press enter here)
philip@dinky:~$ vi test.c
philip@dinky:~$ (press Ctrl-r mak here)
(reverse-i-search)`mak': make test
(press Ctrl-o here)
philip@dinky:~$ make test
cc test.c -o test
philip@dinky:~$ ./test
Moving about the prompt/command line
There are some slight differences between the emacs and the vi versions, mostly concerning what is considered to be in the copy buffer, where the cursor offset ends up, and so on.
- ctrl-a/0 move the cursor to the beginning of the bash prompt/line
- ctrl-e/A move the cursor to the end of line
- ctrl-w/- cut the word before the cursor (including intervening whitespace)
- ctrl-u/" cut everything from the beginning of the line to, but not including, the cursor position
- ctrl-k/D cut everything from, and including, the cursor position to the end of line
- ctrl-_/u undo
- alt-b/- move the cursor back one word
- alt-f/- move the cursor forward one word
- ctrl-y/p paste the last thing cut
- -/h move cursor left
- -/l move cursor right
- -/a start inserting after current cursor position
- -/dd delete line (save it for pasting)
- ctrl-c/" reset input buffer (do not save for pasting)
- arrow-up/k move up through history
- arrow-down/j move down through history
Environment
Profile
Your environment in bash is controlled by several files which may be read at login:
- /etc/profile and /etc/bashrc - will be read in any case
- .bash_profile, .bash_login, .profile - only the first one found will be read
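A common pattern - a minimal sketch, assuming your interactive settings live in ~/.bashrc - is to have .bash_profile source .bashrc so login and non-login shells behave the same:
# ~/.bash_profile - read by login shells only
if [ -f ~/.bashrc ]; then
    . ~/.bashrc    # pull in the interactive settings as well
fi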
sudo, su
If you su, do it with su -, which initializes a complete and correct environment for root. Environment variables messed up in your user session will be replaced with sane defaults, and you will be put into the correct $HOME.
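To see the difference (a quick sketch):
su      # root shell, but keeps your current environment and variables
su -    # full login: root's environment, root's $HOME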
Variables controlling funny things
FIXME PS1, PS2, HOME, ... what can you think of?
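One of them worth noting already: PS1 holds the primary prompt string, built from bash's prompt escapes. A minimal sketch:
PS1='\u@\h:\w\$ '    # user@host:current-directory$ - the familiar default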
Bash Programming
Gotchas
A word about POSIX compliancy
Bash is the most common shell on Linux and many other Unix systems. If you are into shell programming, you may know that there is a distinct difference between
#!/bin/sh
and
#!/bin/bash (or anything else, like /bin/ksh for that matter)
Using /bin/sh implies that your shell script is POSIX compliant and uses nothing beyond POSIX-supported commands and constructs. This is a tad difficult to achieve when working in bash, but it can be done. On most systems, /bin/sh may still be a symlink to /bin/bash - but bash assumes POSIX-compliant behaviour when called by that name. On other systems there are efforts to make a smaller, more lightweight but still mostly POSIX-compliant shell the default /bin/sh, like Debian's dash. Even though dash is not fully POSIX compliant, it is a good tool for developing platform-independent shell scripts, since it will not let bashisms slip through.
That said, if you write for portability, always use #!/bin/sh as the file magic for your shell scripts, and always execute your script as ./script so that you also test against the right interpreter.
If you do not choose to be platform independent, make sure you reflect that by making #!/bin/bash the file magic.
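As a quick illustration - a minimal sketch, nothing authoritative - here is a POSIX-safe test next to the bashism it replaces:
#!/bin/sh
# POSIX string comparison uses a single =; the bash-only [[ ... == ... ]]
# construct aborts under a strict /bin/sh like dash with "[[: not found"
answer="yes"
if [ "$answer" = "yes" ]; then
    echo "portable"
fi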
Do I want to do this in shell?
There is usually a lot of ideologically driven discussion about how, when, or why someone should write a shell script instead of a perl/python/ruby/... script.
There are a few things to consider when writing scripts in shell.
Use efficient shell utilities. Efficient shell utilities are normally those that accomplish a complex task which is not as easy to implement in "real" scripting languages like perl or python. Examples are sorting (sort(1), uniq(1)), certain repetitive directory/file operations (xargs(1), find(1)), complete applications (rsync(1), tar(1)) and similar.
Inefficient shell utilities are those whose function is quickly reimplemented using the builtins of scripting languages - the functionality of tools like basename(1), cat(1) and such.
There is also something to be said about how you use those tools. For example, find(1) will not stat() the files it combs through unnecessarily. But if you then use xargs(1) and stat(1) to filter for links, the only effect is more processes and more overhead compared to simply using the right find(1) switches and options. Scenarios like that also exist in favour of scripting languages.
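To make that concrete - a hedged sketch, with /var/log standing in for any directory tree and GNU stat assumed for the --format switch:
# efficient: find filters for symlinks itself, one process total
find /var/log -type l

# wasteful: extra xargs, stat and grep processes for the same result
find /var/log | xargs stat --format '%F %n' | grep 'symbolic link'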
- What good and efficient shell tools (like find, awk, xargs, sort, uniq, grep ...) can you use to accomplish your task?
- How many inefficient shell tools (like basename, cat, ...) would you be required to use?
- How efficient/well can you use your tools? Consider find(1), which can be a powerful tool and can very often be used on its own to solve huge problems - but only if the user knows it *very* well.
- Is it supposed to be platform independent? Then you shouldn't count on bash builtins to speed things up.
- Will you still profit from the job control your shell provides?
- Will you profit from piping? Or will it add lag/overhead?
Those questions are certainly not exhaustive. You should decide on a per-case basis what to use. Loading an interpreter and precompiling bytecode can also cost a lot. Also, not all interpreted languages ship with optimized (machine code) implementations of certain tasks, like string matching as grep does it, sorting, file operations and so on.
Lastly, let me remind you that every execution of a tool like awk or sed involves
- a complete vfork of the bash process
- read operation from the disk for loading the tool
- setting up a linking environment
- processing libc loading, nameservice caching etc
- and a whole lot more
This is the reason why you should avoid find ... | xargs ... constructs with large result sets.
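If you do need to run a tool over a large result set, find(1) can batch the arguments itself - a sketch, with '*.log' and ERROR as mere placeholders:
# pipeline variant: an extra xargs process plus pipe buffering
find . -name '*.log' -print0 | xargs -0 grep -l ERROR

# find batches the arguments itself: no pipe, no xargs
find . -name '*.log' -exec grep -l ERROR {} +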
Plus, the | (pipe) operator implies some performance/resource caveats. Long command | command | command constructs can lead to
- Additional buffers (= bottlenecks)
- Huge execution times during which all of these processes stay in memory
- Processes trying to continue their work without being able to do so
- Receiving the result only when all other operations have finished, as in
du -xm / |sort -g
All those warnings might sound far-fetched to you, but consider a system where several thousand instances of small utilities are started every minute - like a monitoring server. Even modern multicore systems will be "impressed" by this load at some point.
Redirection
Standard Input, Output and Error
Digression for a better understanding of file descriptors
If you are on a system with a /proc filesystem (and may god have mercy on your soul if you are not), you will find that there are interesting symlinks in /proc/<pid>/fd. Let me elaborate.
Consider the following: you are logged on, have an interactive shell like bash, and you are the owner of your terminal (tty). Now, find out your PID:
philip@dinky:~$ w |head -3
 15:15:04 up 72 days, 17 min, 35 users, load average: 0.15, 0.11, 0.09
USER     TTY     FROM    LOGIN@   IDLE    JCPU   PCPU  WHAT
philip   tty2    -       11Feb09  58days  0.10s  0.00s /bin/login --
philip@dinky:~$ ps waux |grep bash |grep tty2
philip   22008  0.0  0.0  21396  1372 tty2  S  Feb11  0:00 -bash
philip@dinky:~$
Now, with that information, let's have a peek at our file descriptors:
philip@dinky:~$ ls -l /proc/22008/fd
total 0
lrwx------ 1 philip philip 64 2009-04-10 15:16 0 -> /dev/tty2
lrwx------ 1 philip philip 64 2009-04-10 15:16 1 -> /dev/tty2
lrwx------ 1 philip philip 64 2009-04-10 15:16 2 -> /dev/tty2
lrwx------ 1 philip philip 64 2009-04-10 15:16 255 -> /dev/tty2
philip@dinky:~$
These point, of course, to your current interactive terminal. This is also why your software can die from a signal (SIGHUP on terminal close, or SIGPIPE when writing to a vanished reader) when you close your terminal while something is running.
So, what do those numbers mean? When software allocates file descriptors from the operating system (that's what it gets when it opens files for reading or writing), it receives integer numbers. Thus, the PID plus the file descriptor uniquely identify an open file system-wide - a bit like an IP address + port pair identifies a TCP connection.
- 0 is standard input
- 1 is standard output
- 2 is standard error
- everything else are filedescriptors received when the software opened other files, like configuration files, log files etc.
- 255 is just there to mess with my explanation (bash keeps a private copy of the controlling tty on fd 255).
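You can inspect the descriptors of your current shell directly, since $$ expands to the shell's own PID:
ls -l /proc/$$/fd    # $$ is the PID of the current interactive shell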
The whole "standard X" metaphor is a bit to wrap ones head around, but after a while using them extensively and with deep understanding, one will be able to prevent a lot of shit from happening (processes dying under certain conditions, missing logfiles, mailbombs from cron etc) and finally lead a happy life.
The standard file descriptors are the fds where software will write and receive their user interaction by default. Yet, software does not necessarily need to do that; while libc provides for every running piece of software with those filedescriptors by standard, the software can close them willingly. Software can indeed decide to "unbind" itself from standard input and output by closing the respective file descriptor. Analog to that, the respective symlinks in the proc fs will disappear, and so will the depdendy on those files (like the interactive shell).
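Bash scripts can remap or close their own descriptors with exec - a minimal sketch, where /tmp/err.log is just a placeholder path:
exec 2>/tmp/err.log   # from here on, all stderr of this script goes to the file
exec 0<&-             # close standard input; the fd 0 symlink in /proc vanishes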
By using redirection, you can remap those standard file descriptors in your own way. Consider doing something like this on one terminal:
philip@dinky:~$ cat > /dev/null
and try finding the PID of this cat process, then do something like the following on another shell (without killing the cat of course):
philip@dinky:~$ ls /proc/10519/fd -l
total 0
lrwx------ 1 philip philip 64 2009-04-10 15:31 0 -> /dev/pts/17
l-wx------ 1 philip philip 64 2009-04-10 15:31 1 -> /dev/null
lrwx------ 1 philip philip 64 2009-04-10 15:31 2 -> /dev/pts/17
So, I hope it starts making sense now: by redirecting (which is a bash/shell feature, by the way), you can remap the file descriptors to an endpoint you choose. This can be useful for logging errors to a file, as in
find . 2>nopermissions.txt
or to pipe all output through a filter, as in
strace echo Hello World 2>&1 |less
The latter redirects standard error into standard output and pipes the resulting combined stream through less.
I encourage you to play with redirection a bit before using it for bigger purposes. At the end of the day, almost all of bash's redirection behaviour makes sense within the greater picture of the "everything is a file" metaphor - you just need to figure it all out ;)
redirect output and error
ls -l 1>normal_out 2>error_out
if you want to have both in one file
ls -l 1>normal_out 2>&1
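Note that redirections are processed left to right, so their order matters - a small sketch:
ls -l >both_out 2>&1     # stdout to the file first, then stderr duplicates it: combined
ls -l 2>&1 >only_stdout  # stderr duplicates the terminal first: NOT combined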
Named Pipes (FIFOs)
Named pipes, or first-in-first-out buffers, are what their names suggest: named pipes - the same thing you use by typing | after a command, but with a name.
Ok, so what gives? A lot, actually. As special files, FIFOs behave a little differently. Consider having output in one program/shell script that you want to use in another shell script, but there simply is no way of piping them together.
Some people would use a temporary file for that, but FIFOs have some advantages and disadvantages over real files which one should be aware of when using them:
- First and foremost, FIFOs block. What does that mean? If you read on one end and nobody is writing on the other end, the read will block - as will a write operation, for that matter, if nobody is reading.
Use two terminals to verify. First terminal:
philip@jambalaya:~$ mkfifo fifo
philip@jambalaya:~$ echo 1 > fifo
Aaand this will block. But open a second terminal and do something along the lines of
philip@jambalaya:~$ cat fifo
1
Aaand there you go! The lock on the first terminal is released and the echo command can exit. The same will happen to normal processes that try to read such files (reads on regular files normally do not block, or not for long anyway), so this is why grepping through a directory with FIFOs in it is a bad idea.
- FIFOs don't use disk space. This eccentricity is the beauty of special files - they are only logical items for the kernel; they use inodes, but no real disk space.
- FIFOs need to have filesystem support. This is a direct result of being a kernel matter and this is also why you are out of luck on (V)FAT partitions, among others.
- FIFOs can not "cache" anything
- One cannot seek in FIFOs - first-in-first-out is all that matters.
- FIFOs are a kernel feature - not a bash or filesystem feature.
So, what can you use FIFOs for?
Using FIFOs as a way of interprocess communication with shellscripts
First, create a fifo:
philip@jambalaya:~$ mkfifo fifo
Put this in a bash shellscript and run it:
result="no" while [ "$break" != "yes" ] do break=`cat fifo`; echo "may i exit? $break." done echo "thanks!"
and try:
philip@jambalaya:~$ echo -n no > fifo
philip@jambalaya:~$ echo -n yes > fifo
philip@jambalaya:~$
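Another classic use - a hedged sketch, where producer_command is a hypothetical stand-in for whatever generates your data - is feeding a long-running consumer through a FIFO:
mkfifo /tmp/logpipe
gzip -c < /tmp/logpipe > out.gz &    # consumer blocks until a writer appears
producer_command > /tmp/logpipe      # hypothetical producer; gzip exits at EOF
rm /tmp/logpipe                      # a FIFO is removed like any regular file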
Questions and Solutions
Below are some examples of bash problems that might come up.
Arrays in Bash:
foo[0]=1; foo[1]=2; foo[2]=3;

# loop
for (( i=0; i<${#foo[@]}; i++ )); do
    echo ${foo[$i]};
done;

# more like a foreach
for i in ${foo[@]}; do
    echo $i;
done;
Variable Variables
foobar=5;
bar="foobar";
echo $foobar;
echo $bar;
echo ${!bar};
Variable Variables in Arrays
foo[0]=1; foo[1]=2; foo[2]=3;
foo_tcp[0]="AT"; foo_udp[0]="AU";
foo_tcp[1]="BT"; foo_udp[1]="BU";
foo_tcp[2]="CT"; foo_udp[2]="CU";

# loop
for (( i=0; i<${#foo[@]}; i++ )); do
    echo ${foo[$i]};
    for k in udp tcp; do
        data=foo_${k}[$i];
        echo ${!data};
    done;
done;
More Documentation
The best advanced guide for bash scripting: http://www.tldp.org/LDP/abs/html/