To illustrate why, consider a file name with spaces.
filename="my file.txt"
cat $filename # Incorrect: Tries to 'cat' two files: 'my' and 'file.txt'
cat "$filename" # Correct: Treats 'my file.txt' as one file
input=$1 # User input, potentially dangerous
rm $input # Incorrect: Risky if input contains something like '*'
rm "$input" # Correct: Safer, treats the user input as a single item
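For comparison, Python's list-based subprocess API sidesteps this class of bug entirely; a small self-contained illustration:

```python
import os
import subprocess
import tempfile

# With an argv list there is no word splitting and no glob expansion,
# so a name containing spaces (or a '*') is always exactly one argument.
tmpdir = tempfile.mkdtemp()
filename = os.path.join(tmpdir, "my file.txt")
subprocess.run(["touch", filename], check=True)
created = os.path.exists(filename)
subprocess.run(["rm", filename], check=True)
removed = not os.path.exists(filename)
```

No quoting is needed because the arguments never pass through a shell at all.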
For writing on the command-line, yes. For anything over a screenful of code and/or using non-trivial math or non-scalar variables, I’ve found the opposite to be true. Python forces multi-line code more, but the functionality is so much richer that it’s easy to end up replacing a hundred-line shell script with half as many lines of easier-to-read code. Literally every time I’ve done that with a mature shell script I’ve also found at least one bug in the process, usually related to error handling or escaping, which the original author knew about but also knew would be painful to handle in shell.
If so, that's entirely orthogonal to the problem. cmd is for writing the interactive "front end" to a command-line interpreter; it doesn't help at all in writing Python scripts to replace shell scripts (which typically run with no interactive interface at all).
The sort of problem I think fragmede is alluding to is that of writing Python code to emulate a shell script construct like:
some-command 2>&1 | LANG=C sort | uniq -c | wc -l
e.g. constructing pipelines of external commands. It's certainly possible to do this in Python, but I'm not aware of any way that'd be anywhere near as concise as the shell syntax.
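It's indeed not concise. A sketch of that pipeline with subprocess.Popen, with printf standing in for some-command:

```python
import os
import subprocess

# Sketch of: some-command 2>&1 | LANG=C sort | uniq -c | wc -l
# ("printf" stands in for some-command purely for illustration.)
p1 = subprocess.Popen(["printf", "b\na\nb\n"],
                      stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
p2 = subprocess.Popen(["sort"], env={**os.environ, "LANG": "C"},
                      stdin=p1.stdout, stdout=subprocess.PIPE)
p1.stdout.close()  # so p1 sees SIGPIPE if p2 exits early
p3 = subprocess.Popen(["uniq", "-c"], stdin=p2.stdout, stdout=subprocess.PIPE)
p2.stdout.close()
p4 = subprocess.Popen(["wc", "-l"], stdin=p3.stdout,
                      stdout=subprocess.PIPE, text=True)
p3.stdout.close()
count = int(p4.communicate()[0])
print(count)  # 2 distinct lines in this toy input
```

Fourteen lines versus one line of shell, which is rather the point being made.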
It's not as natural in some ways for most people because you have to write it right to left instead of left to right as with the pipe syntax. If you split it over multiple lines it's better:
There's also all sorts of helpful stuff you can do like invoking a callback per line or chunk of output, with contexts for running a sequence of commands as sudo, etc etc.
And of course, you don't actually need to shell out to sort/uniq either:
This is also cheaper because it avoids the sort which isn't strictly necessary for determining the number of unique lines (sorting is typically going to be an expensive way to do that for large files compared to a hash set because of the O(nlogn) string comparisons vs O(n) hashes).
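A sketch of the set-based count described above (printf again stands in for the real command):

```python
import subprocess

# Count distinct lines with a hash set: O(n) hashing instead of
# O(n log n) comparison-based sorting.
out = subprocess.run(["printf", "b\na\nb\n"],
                     capture_output=True, text=True, check=True).stdout
unique = len(set(out.splitlines()))
print(unique)  # 2
```

For large inputs this also avoids buffering a fully sorted copy of the data just to count distinct lines.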
It's really quite amazing and way less error prone too when maintaining anything more complicated. Of course, I've found it not as easy to develop muscle memory with it but that's my general experience with libraries.
I really don't find using subprocess all that tedious personally. I usually have one helper function around `subprocess.run` called `runx` that adds a little sugar.
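For illustration, a guess at what such a helper might look like; the name runx comes from the comment, but the defaults here are my own assumptions:

```python
import shlex
import subprocess

def runx(*args, **kwargs):
    """Hypothetical sugar around subprocess.run: echo the command,
    raise on non-zero exit, and capture text output by default."""
    kwargs.setdefault("check", True)
    kwargs.setdefault("text", True)
    kwargs.setdefault("capture_output", True)
    print("+", shlex.join(args))  # echo the command, shell-quoted
    return subprocess.run(list(args), **kwargs)

result = runx("echo", "hello world")
```

A few lines of sugar like this remove most of the per-call boilerplate that makes subprocess feel tedious.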
But if you really want something more ergonomic, there's sh:
YMMV but I found sh to be a step function better ergonomically, especially if you want to do anything remotely complex. I just wish that they would standardize it & clean it up to be a bit more pythonic (like command-line arguments as an array & then positional arguments with normal names instead of magic leading `_` positional arguments).
subprocess doesn't make it easy to construct pipelines -- it's possible, but involves a lot of subprocess.Popen(..., stdin=subprocess.PIPE, stdout=subprocess.PIPE) and cursing. The "sh" module doesn't support it at all; each command runs synchronously.
for file_path in sh.find(".", "-type", "f", _iter=True):
    if "foo" in file_path:
        print(file_path)
Note that "find" is not some special function provided by the sh module - that module just uses convenience magic where it'll map whatever function you call under it as an actual command execution (i.e. construct a sh.Command using "find" as an argument & then execute it).
You can also pipe the output of find to grep if you want to run grep for some reason instead of writing that processing in python (maybe it's faster or just more convenient)
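With plain subprocess rather than sh, that find-to-grep pipe might look like the following sketch (run against a throwaway directory so it's self-contained):

```python
import pathlib
import subprocess
import tempfile

# Sketch of `find DIR -type f | grep needle` with plain subprocess.
tmp = tempfile.mkdtemp()
pathlib.Path(tmp, "needle.txt").write_text("match me")
pathlib.Path(tmp, "other.txt").write_text("not me")

find = subprocess.Popen(["find", tmp, "-type", "f"], stdout=subprocess.PIPE)
grep = subprocess.Popen(["grep", "needle"], stdin=find.stdout,
                        stdout=subprocess.PIPE, text=True)
find.stdout.close()  # let find see SIGPIPE if grep exits early
matches = [line.rstrip("\n") for line in grep.stdout]
grep.wait()
find.wait()
```

Both processes run concurrently here, unlike the synchronous-only behavior noted above for the sh module.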
True, echo * will print all the filenames on one line, with a space between each pair, but for data processing purposes the two snippets are equivalent, because per shell behavior newline, tab, and space are all treated as field separators unless quoted. Like the definition of whitespace in the K&R C book. After all, Unix is written in C.
seq would work in bash too, but it's not part of the language; it's an external tool. C-like syntax is more expressive, I'd say, for simple numerical increments.
“Just remember how to do it” isn’t really the problem.
It’s that the finer details of the syntax are sufficiently different from the other things I do, and I don’t write shell scripts frequently enough to remember them.
zsh makes this so much easier. This doesn't capture $i, but covers most use cases of "I want to run something more than once":
repeat 10; echo "repeat me"
If you do need $i you can use a similar construct as bash, but more convenient:
for ((i = 0; i < 10; i++)); echo $i
for i in {0..10}; echo $i # {n..m} loops also work in bash
for ((i = 0; i < 10; i++)); { echo $i; echo $i } # Multiple commands
Short loops are hugely helpful in interactive use because it's much less muckery for short one-liners (whether you should use them in scripts is probably a bit more controversial).
Also looping over arrays "just works" as you would expect without requiring special incantations:
arr=(a b)
for v in $arr; echo $v
---
Whether zsh or fish is better is a matter of taste, and arguably zsh has too many features, but IMHO bash is absolutely stuck as a "1989 ksh clone" (in more ways than one; it uses K&R C all over the place, still has asm hacks to make it run on 1980s versions of Xenix, and things like that).
Any reason to eschew short_loops in general that you're aware of? I ask because I'd probably use `for i ({0..9}) echo $i` in your for loop example. I've never managed to get my head around the necessity for a narrow short_repeat option when there is - for example - no short_select.
All of the zsh alternate forms feel far superior to me, in both interactive use and within scripts.
JNRowe notes the many off-by-one translations in the other examples (e.g. {0..10} iterates eleven times, not ten) and tips his hat to arp242.
Oh yeah, I forgot about the {n..m} syntax; I almost always use repeat these days or the C-style loop if I need $i, as that's more "in the fingers", so to speak, from other languages, even though {n..m} is easier.
I don't know why you would want to avoid short loops, other than subjective stylistic reasons (which are completely valid of course). The patch that added short_repeat just asserts that "SHORT_LOOPS is bad"[1], but I don't know why the author thinks that.
Also: my previous comment was wrong; you don't need "setopt short_repeat"; you only need it if you explicitly turned off short loops with "setopt no_short_loops".
Bash array syntax is error-prone and hard to read, the expansion of arrays still depends on the field separator, and the result must fit in the environment.
Most of the time you should just rely on that field separator and do it the simple way:
for i in "$f"; do echo $i; done
Much more obvious and more shell-like. That's how all bash functions process their parameters.
Set IFS only if you really need to. But in that case also consider something like xargs, where you are not limited by the environment, have null-terminated fields, and get parallelism.
Arrays are only useful when you need multidimensionality, at which point you should probably look at using data files and process with other tools such as sed. Or start looking at something like Perl or Python.
Your example proves that not using arrays is worse. That loop will only run once and just print every element on one line. The array equivalent works as expected, _isn't_ affected by IFS, and can handle spaces in individual elements:
f='a b c d e'
for i in "$f"; do echo $i; done
# prints a b c d e
f="a b c 'd e'"
for i in $f; do echo $i; done
# prints
# a
# b
# c
# 'd
# e'
f=(a b c 'd e')
for i in "${f[@]}"; do echo $i; done
# prints
# a
# b
# c
# d e
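For what it's worth, this quoted-word splitting is exactly what shlex.split does on the Python side:

```python
import shlex

# shlex.split tokenizes like a POSIX shell, so quoted words stay together.
f = "a b c 'd e'"
parts = shlex.split(f)
print(parts)  # ['a', 'b', 'c', 'd e']
```

That gives the same four elements as the bash array version, without any IFS concerns.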
You could always construct an example to show a specific behaviour. The problem in the example here is that you can't set environment variables to a list of arguments; you can only set them to text strings.
Normally on the command line data comes from somewhere. It might be some sort of text processing such as a grep or sed command. And that works just fine:
for i in "$(grep stuff)"; do
works just fine and as expected. But when you have to slice and dice that output to fit bash arrays, you not only have to take IFS into account, it quickly gets way harder to read than doing it the normal way. That's why I generally tend to view the use of arrays as a warning. Maybe you are doing something unnecessarily complicated, maybe you are writing Python in bash.
Your specific example should be written as the easiest form:
Can't speak for others, but one issue is the "; do".
I can generally cobble together python, and it'll be syntactically correct. I just need to check libs.
With shell I need to stop and make sure my semicolons in a for loop are correct.
If you're a heavy user it probably isn't a problem but all the little warts just make it difficult for an occasional user to keep a good enough model in their head.
Incremental index:
For iterating over an array: