I think with the cmd package it’s not actually that bad and quite ergonomic. Ymm...

duskwuff · on Nov 26, 2023

Do you mean this cmd package?

https://docs.python.org/3/library/cmd.html

If so, that's entirely orthogonal to the problem. cmd is for writing the interactive "front end" to a command-line interpreter; it doesn't help at all in writing Python scripts to replace shell scripts (which typically run with no interactive interface at all).

The sort of problem I think fragmede is alluding to is that of writing Python code to emulate a shell script construct like:

    some-command 2>&1 | LANG=C sort | uniq -c | wc -l

e.g. constructing pipelines of external commands. It's certainly possible to do this in Python, but I'm not aware of any way that'd be anywhere near as concise as the shell syntax.
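For comparison, here is a rough sketch of what such a pipeline looks like with only the standard library's `subprocess`. The helper name `run_pipeline` and its `first_stderr_to_stdout` and `envs` parameters are made up for illustration; they are not part of any library.

```python
import os
import subprocess


def run_pipeline(commands, first_stderr_to_stdout=False, envs=None):
    """Run a list of argv lists as a pipeline; return the last stage's stdout.

    `envs` (hypothetical convenience) maps a stage index to extra
    environment variables for that stage only.
    """
    envs = envs or {}
    procs = []
    prev_stdout = None
    for i, argv in enumerate(commands):
        env = {**os.environ, **envs[i]} if i in envs else None
        proc = subprocess.Popen(
            argv,
            stdin=prev_stdout,
            stdout=subprocess.PIPE,
            # Rough equivalent of `2>&1`, applied to the first stage only.
            stderr=subprocess.STDOUT if (i == 0 and first_stderr_to_stdout) else None,
            env=env,
        )
        if prev_stdout is not None:
            # Close our copy of the pipe so the upstream process can get
            # SIGPIPE if the downstream one exits early.
            prev_stdout.close()
        prev_stdout = proc.stdout
        procs.append(proc)
    output, _ = procs[-1].communicate()
    for proc in procs[:-1]:
        proc.wait()
    return output.decode()
```

Under those assumptions, the shell line above would be spelled roughly as `run_pipeline([["some-command"], ["sort"], ["uniq", "-c"], ["wc", "-l"]], first_stderr_to_stdout=True, envs={1: {"LANG": "C"}})`, which works but is clearly less concise than the shell syntax.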

vlovich123 · on Nov 27, 2023

Shoot. I mixed up the module name. It's the sh module https://sh.readthedocs.io/en/latest/

    import os
    import sh

    sort_env = os.environ.copy()
    sort_env["LANG"] = "C"
    sh.wc("-l", _in=sh.uniq("-c", _piped=True, _in=sh.sort(_env=sort_env, _piped=True, _in=sh.Command("some-command")(_piped=True, _err_to_out=True))))

It's not as natural in some ways for most people because you have to write it right to left instead of left to right as with the pipe syntax. If you split it over multiple lines it's better:

    some_command = sh.Command("some-command")(_piped=True, _err_to_out=True)
    sorted_lines = sh.sort(_env=sort_env, _piped=True, _in=some_command)
    unique = sh.uniq("-c", _piped=True, _in=sorted_lines)
    line_count = sh.wc("-l", _in=unique)

There's also all sorts of helpful stuff you can do like invoking a callback per line or chunk of output, with contexts for running a sequence of commands as sudo, etc etc.

And of course, you don't actually need to shell out to sort/uniq either:

    output_lines = sh.Command("some-command")(_err_to_out=True).splitlines()
    num_lines = len(set(output_lines))

This is also cheaper because it avoids the sort, which isn't strictly necessary for counting unique lines. For large inputs, sorting typically costs O(n log n) string comparisons, whereas a hash set needs only O(n) hash operations on average.
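To make the comparison concrete, here is a small sketch of the two approaches in plain Python. The function names are invented for this example; the first mirrors what `sort | uniq | wc -l` does, the second uses a hash set.

```python
from itertools import groupby


def count_unique_sorted(lines):
    """Mirror `sort | uniq | wc -l`: sort, then count runs of equal lines."""
    return sum(1 for _ in groupby(sorted(lines)))


def count_unique_hashed(lines):
    """Count distinct lines with a hash set; no ordering is needed."""
    return len(set(lines))
```

Both return the same count for the same input; the hashed version just skips the ordering work.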

It's really quite amazing and way less error prone too when maintaining anything more complicated. Of course, I've found it not as easy to develop muscle memory with it but that's my general experience with libraries.

duskwuff · on Nov 27, 2023

Oh neat, I guess I missed the "_piped" arg when I looked. That does make it a lot better.

> And of course, you don't actually need to shell out to sort/uniq either:

Yeah, it's a contrived example. Imagine something more important happening there. :)

js2 · on Nov 27, 2023

I really don't find using subprocess all that tedious personally. I usually have one helper function around `subprocess.run` called `runx` that adds a little sugar.
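The actual helper isn't shown here, so the following is only a guess at what a small wrapper like that might look like. The defaults chosen (raise on failure, text mode, captured and stripped stdout) are assumptions for illustration.

```python
import shlex
import subprocess


def runx(cmd, **kwargs):
    """Run a command, raise on failure, and return its stripped stdout.

    Hypothetical sketch: accepts either an argv list or a single string,
    which is split with shlex rather than handed to a shell.
    """
    if isinstance(cmd, str):
        cmd = shlex.split(cmd)
    kwargs.setdefault("check", True)            # raise CalledProcessError on non-zero exit
    kwargs.setdefault("text", True)             # decode output to str
    kwargs.setdefault("stdout", subprocess.PIPE)
    result = subprocess.run(cmd, **kwargs)
    return result.stdout.strip() if result.stdout is not None else ""
```

With something like this, `runx(["git", "rev-parse", "HEAD"])` would return the commit hash as a string, or raise if the command fails.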

But if you really want something more ergonomic, there's sh:

https://pypi.org/project/sh/

vlovich123 · on Nov 27, 2023

YMMV but I found sh to be a step function better ergonomically, especially if you want to do anything remotely complex. I just wish that they would standardize it & clean it up to be a bit more pythonic (like passing command-line arguments as an array & using normally named keyword arguments instead of the magic leading-`_` ones).

duskwuff · on Nov 27, 2023

subprocess doesn't make it easy to construct pipelines -- it's possible, but involves a lot of subprocess.Popen(..., stdin=subprocess.PIPE, stdout=subprocess.PIPE) and cursing. The "sh" module doesn't support it at all; each command runs synchronously.

vlovich123 · on Nov 27, 2023

Incorrect. https://sh.readthedocs.io/en/latest/sections/piping.html

Just add `_piped=True` to the launch arguments and it'll work as expected where it won't fully buffer the output & wait for the command to complete.

fragmede · on Nov 27, 2023

yeah exactly. eg

    find . -type f | grep foo

is possible to do with os.walk() in Python, but getting there is so unergonomic.
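For reference, a minimal sketch of the os.walk() version. The function name `find_files` is made up, and the symlink check is there to approximate `find -type f`, which does not report symlinks as regular files.

```python
import os


def find_files(root, needle):
    """Yield paths of regular files under `root` whose path contains `needle`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files, roughly like `find -type f`.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if needle in path:
                yield path
```

Calling `for path in find_files(".", "foo"): print(path)` gives roughly the same output as the shell one-liner, at the cost of a nested loop and a helper function.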

vlovich123 · on Nov 27, 2023

    for file_path in sh.find(".", "-type", "f", _iter=True):
        file_path = file_path.rstrip("\n")  # _iter yields lines with their trailing newline
        if "foo" in file_path:
            print(file_path)

Note that "find" is not some special function provided by the sh module - that module just uses convenience magic where it'll map whatever function you call under it as an actual command execution (i.e. construct a sh.Command using "find" as an argument & then execute it).

You can also pipe the output of find to grep if you want to run grep for some reason instead of writing that processing in Python (maybe it's faster or just more convenient).