[Help-bash] Interactive vs non-interactive behavior
Kazama Seiji
2017-09-11 15:06:44 UTC
Hey guys.


I have the following code sample:

timeout 5 tail -fn0 access.log | awk '{print $1}'


It follows nginx's access.log for 5 seconds and prints the first column
of each new line (the client IP).


The sample behaves the same in an interactive shell and in a standalone
script.


But as soon as I add more stages to the pipeline, it fails in an
interactive shell, as in:

timeout 5 tail -fn0 access.log | awk '{print $1}' | wc -l

or

timeout 5 tail -fn0 access.log | awk '{print $1}' | sort | uniq -c |
sort -k1,1nr -k2,2V


In both cases the only output is a single line, "Terminated". It still
works as a standalone script, though.


I've figured out that the monitor=on bash setting, which is enabled in
interactive mode, triggers the behavior. With "set +m" applied, the
pipeline works in an interactive shell. Another way to work around it in
an interactive shell is this trick:

( timeout 5 tail -fn0 access.log; true ) | awk '{print $1}' | wc -l


The question is: what is so special about pipelines of 3+ stages that
they behave differently from a 2-stage pipeline when monitor=on in bash?
E.g. why does this work
set -m
timeout 5 tail -fn0 access.log | awk '{print $1}'

but this doesn't
set -m
timeout 5 tail -fn0 access.log | awk '{print $1}' | wc -l

(I explicitly set monitor=on above for clarity.)
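
For anyone who wants to try this without a live nginx log, here is a minimal sketch of a self-contained test. The temporary file and the fake log lines are assumptions standing in for access.log. Run as a script (that is, non-interactively), it is expected to print a positive line count rather than "Terminated".

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for nginx's access.log.
log=$(mktemp)

# Background writer: append one fake request line every 0.1 s for about 3 s.
(
  for i in $(seq 1 30); do
    echo "10.0.0.$i - - \"GET / HTTP/1.1\" 200" >> "$log"
    sleep 0.1
  done
) &
writer=$!

# The three-stage pipeline from the question, shortened to 2 seconds.
count=$(timeout 2 tail -fn0 "$log" | awk '{print $1}' | wc -l)

wait "$writer"
rm -f "$log"
echo "lines seen in 2 s: $count"
```

Pasting the same pipeline into an interactive shell with monitor=on is the case where the difference described above shows up.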
DJ Mills
2017-09-11 19:20:25 UTC
http://mywiki.wooledge.org/BashFAQ/009

The output is getting buffered in the pipe
DJ Mills
2017-09-11 20:44:41 UTC
Post by DJ Mills
http://mywiki.wooledge.org/BashFAQ/009
The output is getting buffered in the pipe
Or rather, to phrase it properly, the commands in the middle are buffering
their output because they see that stdout is not a tty
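
The tty check can be illustrated with a small sketch. The helper name describe_stdout is made up for this example. It only shows that a command's stdout stops being a terminal once the command is placed in a pipeline, which is the condition under which stdio-based tools typically switch from line buffering to block buffering.

```shell
#!/usr/bin/env bash
# describe_stdout is a hypothetical helper for this illustration.
describe_stdout() {
  if [ -t 1 ]; then
    echo "stdout is a tty"
  else
    echo "stdout is not a tty"
  fi
}

# Inside a pipeline, stdout of the left-hand command is a pipe,
# not a terminal, regardless of where the script itself is run.
in_pipe=$(describe_stdout | cat)
echo "in a pipeline: $in_pipe"
```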
Kazama Seiji
2017-09-11 22:31:19 UTC
I'm aware of stdbuf and have already tried it the heavy-handed way:

stdbuf -oL timeout 5 tail -fn0 access.log | stdbuf -oL awk '{print $1}' |
stdbuf -oL sort

It changes nothing. The output is still a single "Terminated" line (the
line appears when tail gets SIGTERM, and only in the interactive shell).


I've found that the matter is subtler than it first looks.

Yes, awk buffers its output, so "stdbuf -oL awk" does have an effect. But
buffering is not the issue. The issue is commands like "sort" or "wc",
which consume all of stdin before producing any output.


Let me simplify the sample pipeline to strip out the extra details. Let
the pipeline be

timeout 1 tail -fn0 access.log | wc -l

and let us set

set -o pipefail

in bash so that we can check the pipeline's "true" exit code.


Non-interactive shell:

1. after 1 sec tail exits with exitcode=143 on SIGTERM from timeout

2. timeout exits with exitcode=124 (since we don't pass --preserve-status)

3. wc finishes reading stdin and prints its output.

4. the whole pipeline exits with exitcode=124.


Interactive shell:

1. after 1 sec tail exits with exitcode=143 on SIGTERM from timeout

2. the whole pipeline is discarded and exitcode=143 is reported.
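
The non-interactive exit codes above can be checked without a log file. This is a minimal sketch in which "sleep 5" is an assumed stand-in for "tail -fn0 access.log" (any command that has to be stopped by timeout would do). It must be run as a script, not typed into an interactive shell.

```shell
#!/usr/bin/env bash
set -o pipefail

# `sleep 5` stands in for `tail -fn0 access.log`: it never finishes
# on its own within the 1 s limit and has to be stopped by timeout.

# Default: timeout reports 124 when the time limit is hit, and
# pipefail makes that the status of the whole pipeline.
rc_default=0
count=$(timeout 1 sleep 5 | wc -l) || rc_default=$?

# With --preserve-status, timeout passes on the status of the killed
# command instead: 128 + 15 (SIGTERM) = 143.
rc_preserved=0
count=$(timeout --preserve-status 1 sleep 5 | wc -l) || rc_preserved=$?

echo "lines=$count default=$rc_default preserved=$rc_preserved"
```

The status is captured with "|| rc=$?" so the sketch also survives being run under "set -e".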


Here is a full session:

$ set -o pipefail
$ timeout 1 tail -fn0 access.log | wc -l    # <--- interactive
Terminated
$ echo exitcode=$?
exitcode=143
$ bash -c 'set -o pipefail; timeout 1 tail -fn0 access.log | wc -l; echo exitcode=$?'    # <--- non-interactive
18    # <--- wc output
exitcode=124


The whole pipe is indeed killed in interactive mode. Try

timeout 1 tail -fn0 access.log | less

Output shows up in less as it arrives, but after 1 sec less gets killed.


*Once more*: the weird behavior is triggered by the default "set -m"
(monitor=on) in interactive mode. As soon as I manually disable it with
"set +m", the pipeline works as expected:

$ timeout 1 tail -fn0 access.log | wc -l
Terminated
$ set +m
$ timeout 1 tail -fn0 access.log | wc -l
15
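
To check which mode a given shell is in, the monitor flag can be read from $-. Below is a small sketch, with monitor_state as a made-up helper name. A script normally starts with the flag off, which matches the non-interactive case above, and "set -m" / "set +m" toggle it.

```shell
#!/usr/bin/env bash
# monitor_state is a hypothetical helper. It sets the global variable
# `state` to "on" or "off" depending on whether the monitor option
# (job control, `set -m`) is enabled in the calling shell. It avoids
# a subshell so that the flags inspected are the caller's own.
monitor_state() {
  case $- in
    *m*) state=on ;;
    *)   state=off ;;
  esac
}

monitor_state; before=$state
set -m
monitor_state; during=$state
set +m
monitor_state; after=$state

echo "monitor: initially=$before, after set -m=$during, after set +m=$after"
```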

*Maybe it is a bug?*



-------- A sidenote ----------

Actually, it works the same if I run

set -o pipefail; ( timeout 1 tail -fn0 access.log | wc -l ); echo exitcode=$?

instead of

bash -c 'set -o pipefail; timeout 1 tail -fn0 access.log | wc -l; echo exitcode=$?'

but I'm not sure whether such a subshell can be called non-interactive.