[Help-bash] Problems with background jobs
R. Diez
2018-08-16 08:52:32 UTC
Hi there:

I am trying to implement a solution to this problem:

https://sourceforge.net/p/openocd/mailman/message/36384989/

But the exact problem is not relevant. It is about coordinating
processes with pipes.

Ideally, I would do it with a Bash script like this:

1) Create an unnamed pipe, like 'coproc' does, but leaving stdin and
stdout untouched.

2) Start the OpenOCD process as a background job.

3) Wait for my OpenOCD script to write to the pipe, so that I know that
OpenOCD is now ready to accept connections.

If OpenOCD dies during initialisation, reading from the pipe will fail,
and the Bash script will exit.

4) Start GDB in the foreground, which connects to OpenOCD. The Bash
script will continue when GDB exits.

5) Wait for the OpenOCD background job to finish. We do not want to
leave zombie processes behind.

6) Exit with OpenOCD's exit code.
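
For illustration, here is a rough sketch of that flow. It assumes that
some mechanism has already opened the "ready" pipe on file descriptor 3
without touching stdin or stdout; the OpenOCD and GDB command lines and
file names are just placeholders (3333 is OpenOCD's default GDB port):

  #!/bin/bash
  # Sketch only: assumes the "ready" pipe is already open on file
  # descriptor 3, and that the OpenOCD script writes one line to it
  # when initialisation is complete.

  openocd -f board.cfg &            # step 2: OpenOCD as a background job
  OPENOCD_PID="$!"

  # Step 3: block until the "ready" line arrives. If OpenOCD dies and the
  # write end of the pipe gets closed, 'read' sees end of file and fails.
  if ! read -r -u 3 READY_LINE; then
    echo "OpenOCD terminated before becoming ready." >&2
    wait "$OPENOCD_PID"             # reap it anyway
    exit 1
  fi

  gdb -ex "target remote :3333" firmware.elf   # step 4: GDB in the foreground

  wait "$OPENOCD_PID"               # step 5: reap the background job
  exit $?                           # step 6: propagate OpenOCD's exit code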


Problem 1: I could not do it like that, because apparently Bash cannot
create an unnamed pipe while leaving stdin and stdout untouched. There
are non-portable, ugly work-arounds, which I would rather avoid.
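
To give one example of what I mean by an ugly work-around: a 'cat'
coprocess can be abused as a relay, so that its two pipe ends together
behave like an anonymous pipe, but that keeps a helper process around
and depends on Bash's coproc quirks:

  # Bash-specific work-around: let a 'cat' coprocess act as a pipe relay.
  coproc ANON_PIPE { cat; }
  # Writing to ${ANON_PIPE[1]} feeds cat's stdin; whatever cat relays can
  # be read back from ${ANON_PIPE[0]}. Together they behave like a pipe,
  # at the cost of the extra 'cat' process and of coproc's restrictions
  # on passing these descriptors to other child processes.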


The next approach I tried was to create a named pipe with mkfifo and
pass its name to my OpenOCD script. But then, if OpenOCD fails, my Bash
script will hang forever reading from the FIFO.
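
In code, this attempt looks roughly like the following; the paths and
command lines are placeholders, and passing the FIFO name via "-c set"
is just one possibility. The final 'read' is where the script hangs if
OpenOCD dies without ever opening the FIFO:

  READY_FIFO="$(mktemp -d)/openocd-ready.fifo"
  mkfifo -- "$READY_FIFO"

  # The OpenOCD script is expected to write one line to $READY_FIFO
  # as soon as it is ready to accept connections.
  openocd -f board.cfg -c "set READY_FIFO $READY_FIFO" &
  OPENOCD_PID="$!"

  # PROBLEM: if OpenOCD dies without ever opening the FIFO for writing,
  # this blocks forever.
  read -r READY_LINE < "$READY_FIFO"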

So I tried writing a loop:

- While true:

Step 1) Read from the FIFO with timeout. If something was read, break
out of the loop.

Step 2) Check whether the OpenOCD child process is still alive. If it is
dead, it will never write to the FIFO.
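
In rough outline, with the FIFO already open read/write on file
descriptor 3 (see the remark about read-only opens right below) and
OPENOCD_PID taken from "$!":

  while true; do
    # Step 1: read from the FIFO with a one-second timeout.
    if read -r -t 1 -u 3 READY_LINE; then
      break
    fi

    # Step 2: check whether the OpenOCD child process is still alive.
    # (But see problem 3 below about why this check is not really safe.)
    if ! kill -0 "$OPENOCD_PID" 2>/dev/null; then
      echo "OpenOCD died before becoming ready." >&2
      exit 1
    fi
  done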


By the way, Bash does not document that opening such a FIFO read-only
hangs (at least on Linux). But I found out that opening it read/write
does work.
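
Concretely, something like this (the descriptor number 3 is arbitrary):

  # Opening the FIFO read-only blocks until some writer opens it:
  #   exec 3< "$READY_FIFO"      # can hang here forever
  # Opening it read/write does not block, because this process then
  # counts as a writer itself (observed on Linux):
  exec 3<> "$READY_FIFO"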

The problem now is how to find out whether the child process is still
alive, and what its exit code was.

Problem 2: There does not seem to be a way to grab the last job ID.

Using %% is dangerous, because something else could start another
background job and overwrite it.

Using a %string is difficult, because these are long commands with
possibly shell-quoted filepaths inside. I am not sure which part of the
command string I should take.
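
To make the gap concrete: "$!" yields the process ID of the last
background job, but there does not seem to be an equivalent for its job
spec:

  openocd -f board.cfg &
  OPENOCD_PID="$!"     # PID of the last background job: this does exist.

  # There seems to be no variable that captures "the job spec of the job
  # I just started", so the remaining options are %% (which a later
  # background job can overwrite), %N (which requires knowing N) and
  # %string (fragile matching against the command text).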

Problem 3: The "kill -0 $pid" work-around is not safe, because process
IDs can be recycled. I guess using a Bash job spec is safer.
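
For reference, this is the check meant here. It is only a liveness
test: it says nothing about the exit status, and the PID could in
principle already belong to an unrelated process:

  if kill -0 "$OPENOCD_PID" 2>/dev/null; then
    echo "Some process with that PID still exists (hopefully still OpenOCD)."
  else
    echo "No such process any more, but its exit status is unknown."
  fi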

Problem 4: "wait %%" hangs, as it has no timeout option, so I cannot use
it in the loop.

Problem 5: I could use "jobs %%" to see if the job is still alive. But
extracting the process exit code is hard. If the process has died in the
meantime, "jobs %%" will fail with "no such job", and then "wait %%"
will also fail with "no such job", instead of returning the exit status.
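
For example, a sequence like this shows the fragility; the exact
behaviour seems to depend on whether the finished job has already been
reported and removed from the job table:

  sleep 1 &             # stand-in for the OpenOCD background job
  sleep 2               # let it finish

  jobs %%               # either reports "[1]+ Done ..." once, or already
                        # fails with "no such job"
  wait %%               # once the job is gone from the table, this fails
                        # with "no such job" too, and the real exit status
  echo "Status: $?"     # of the background job is lost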

I am a little confused by "jobs". When "jobs %%" reports a job as
finished like this:

[1]+ Done myprocess.sh

Has it then removed the job, so that it is no longer available for "wait
%%"?

Is it safe to parse the output from "jobs"? The exact format is not
documented (exit status, killed by signal, ...). Should I assume that
the text format can change in the future?


Thanks in advance for any help in this regard,
rdiez
