[Help-bash] Bash Trap: How to Get Line Number of a Subprocess with Non-Zero Status
Steve Amerige
2016-12-29 17:20:27 UTC
I've posted a question to StackOverflow, but I'm hoping the experts here
can chime in.

http://stackoverflow.com/questions/41346907/bash-trap-how-to-get-line-number-of-a-subprocess-with-non-zero-status

In essence, I want to know how to get the line number in a function of a
subprocess that exits with a non-zero status (see line 20 below):

     1  #!/bin/bash
     2
     3  trapinfo()
     4  {
     5      echo "=== Trap Info: Status=$? LINENO=$@ A=$A"
     6  }
     7
     8  main()
     9  {
    10      trap 'trapinfo $LINENO -- ${BASH_LINENO[*]}' ERR
    11
    12      set -e
    13      set -o pipefail
    14      set -o errtrace
    15      shopt -s extdebug
    16
    17      local -g A=1
    18
    19      # false # If uncommented, LINENO would be 19
    20      (exit 73) # LINENO is 9. How can I get 20 instead?
    21
    22      A=2
    23  }
    24
    25  main

The above produces the following output:

    === Trap Info: Status=73 LINENO=9 -- 25 0 A=1

The curious thing is that if the line with the non-zero status is not a
subprocess (e.g., the false above), then the trap does indicate the
desired LINENO value (19).
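
For reference, the values printed after `--` in that output come from `${BASH_LINENO[*]}`, which records where each function on the call stack was called from. A minimal sketch of a handler that prints those call sites one per line is shown below; it does not fix the LINENO value itself. The names `stack_report`, `inner`, and `outer` are hypothetical, and errexit is deliberately left off here so the script keeps running after the report.

```bash
#!/bin/bash
# Sketch of an ERR handler that prints the call stack.
# stack_report, inner and outer are hypothetical names.
set +e    # errexit off so the script continues after the report

stack_report() {
    local status=$1 line=$2 i
    echo "status=$status LINENO=$line"
    # Frame 0 is stack_report itself, so start from frame 1.
    for (( i = 1; i < ${#FUNCNAME[@]}; i++ )); do
        # BASH_LINENO[i] is the line from which FUNCNAME[i] was called.
        echo "  ${FUNCNAME[i]} called from line ${BASH_LINENO[i]}"
    done
}

inner() {
    (exit 73)    # fails; the ERR trap fires here
    return 0     # swallow the status so the failure is reported once
}

outer() { inner; }

set -o errtrace                           # let functions inherit the ERR trap
trap 'stack_report "$?" "$LINENO"' ERR
outer
```

The LINENO printed on the first line is whatever bash reports for the failing subshell, so it is subject to the behavior discussed in this thread; the call sites listed below it are taken from BASH_LINENO.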

I am trying to avoid any decoration of code to work around this issue,
but any set, shopt, or anything else that can be done at or around trap
setup (e.g., lines 10 thru 15) is acceptable. What is not acceptable is
decorating the code that is protected by the trap (because I want a
general solution that doesn't require any changes to code after the trap
has been set up). In my example above, anything from line 17 onwards
shouldn't be modified as a solution to this trap problem.

Any thoughts on this puzzle?

Thanks,
Steve Amerige
Eduardo Bustamante
2016-12-30 03:39:00 UTC
Post by Steve Amerige
I've posted a question to StackOverflow, but I'm hoping the experts here can
chime in.
http://stackoverflow.com/questions/41346907/bash-trap-how-to-get-line-number-of-a-subprocess-with-non-zero-status
In essence, I want to know how to get the line number in a function of a
[...]
Thanks,
Steve Amerige
First: Try to do better formatting on your emails next time. The code
block is completely unreadable. I had to follow the stackoverflow link
to understand what you were asking.


I'm adding the bug-bash list, since I think this is actually a bug in
the parse_comsub function, or maybe in execute_command_internal. I
haven't been able to figure it out yet. What I do know is that these
two should behave the same:

***@yaqui:~$ cat -n h
1 #!/bin/bash
2 shopt -s extdebug
3 main() {
4 trap 'echo $LINENO' ERR
5 (exit 17)
6 }
7 main
***@yaqui:~$ ./h
3

--- vs ---

***@yaqui:~$ cat -n i
1 #!/bin/bash
2 shopt -s extdebug
3 main() {
4 trap 'echo $LINENO' ERR
5 `exit 17`
6 }
7 main
***@yaqui:~$ ./i
5

There's no actual reason for these two to be different, and this leads
me to think the behavior you're seeing is a bug somewhere in the
$(...) style command substitution parsing. Perhaps it misses updating
the line_number variable.
Eduardo Bustamante
2016-12-30 17:50:11 UTC
Post by Eduardo Bustamante
I'm adding the bug-bash list, since I think this is actually a bug in
the parse_comsub function, or maybe in execute_command_internal. I
haven't been able to figure it out yet. What I do know is that these
I'm not sure what I was thinking, but these are of course very
different things (command substitution and subshell) :-)

I still think this is a bug, because line number information
(line_number, line_number_for_err_trap and the line attribute in the
different command structures) is handled inconsistently.

The problem is that line information is lost when executing the body
of a function, so what bash does is to store line number information
in the command structures, so that it's able to correctly report to
the ERR trap when it is triggered. But for some reason this is not
handled in a consistent manner for different types of commands, so the
following fail:

- subshells e.g. (exit 17)
- arithmetic commands e.g. (( 0 ))
- conditional commands, e.g. [[ a = b ]]

I don't think this applies to the following types: if, case, for,
arith-for, but I may be wrong.
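
A small probe along these lines can be used to check how a given bash build reports each of the command kinds listed above. This is only a sketch: the function name `probe` is hypothetical, and no expected line numbers are given because they depend on the bash version and on whether the patch has been applied.

```bash
#!/bin/bash
# Hypothetical probe: report the LINENO an ERR trap sees for several command kinds.
set +e    # keep errexit off so every probe below runs

probe() {
    trap 'echo "ERR status=$? LINENO=$LINENO"' ERR
    (exit 17)      # subshell
    (( 0 ))        # arithmetic command
    [[ a = b ]]    # conditional command
    false          # simple command, for comparison
    trap - ERR
}

probe
```

Each failing command produces one line of output. Comparing the reported numbers with the script's own line numbers shows which command kinds are affected.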

The attached err_lineno patch is a proposed fix. The reported line
number will be the closing line in the case of a subshell and the
other multi-line constructs. This seems to match the current behavior
when executing outside a function.
Chet Ramey
2016-12-30 17:50:41 UTC
Post by Eduardo Bustamante
I'm adding the bug-bash list, since I think this is actually a bug in
the parse_comsub function, or maybe in execute_command_internal. I
haven't been able to figure it out yet. What I do know is that these
1 #!/bin/bash
2 shopt -s extdebug
3 main() {
4 trap 'echo $LINENO' ERR
5 (exit 17)
6 }
7 main
3
--- vs ---
1 #!/bin/bash
2 shopt -s extdebug
3 main() {
4 trap 'echo $LINENO' ERR
5 `exit 17`
6 }
7 main
5
There's no actual reason for these two to be different,
It's the difference between a simple command, which has a line number, and
a subshell command, which does not. The subshell command's line number is
derived from the execution context, which is relative to the start of the
function. The OP wants the subshell's line number to be absolute, like the
simple command's, and not relative to the start of the function, as it is
now.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU ***@case.edu http://cnswww.cns.cwru.edu/~chet/
João Eiras
2017-01-01 19:16:27 UTC
Post by Chet Ramey
It's the difference between a simple command, which has a line number, and
a subshell command, which does not. The subshell command's line number is
derived from the execution context, which is relative to the start of the
function. The OP wants the subshell's line number to be absolute, like the
simple command's, and not relative to the start of the function, as it is
now.
I agree that line numbers should be consistent, especially when running
a script. Inconsistent line numbers have been a problem for me now and
then when debugging my scripts.
Steve Amerige
2017-01-02 09:59:49 UTC
On 12/30/2016 12:50 PM, Chet Ramey wrote:

It's the difference between a simple command, which has a line number, and
a subshell command, which does not. The subshell command's line number is
derived from the execution context, which is relative to the start of the
function. The OP wants the subshell's line number to be absolute, like the
simple command's, and not relative to the start of the function, as it is
now.

From an end-user's perspective, it is unexpected that LINENO has the
value of the beginning of the function instead of the line at which the
ERR was caught by trap. While the "inside" of the subshell might have its
own characteristics, the inside must eventually meet the "outside" of the
subshell, at which point an end user can reasonably expect that an error
is reported in a common-sense way with the line number that is seen in
the script file itself. While the current behavior can be explained, that
doesn't mean it is expected by end users or makes sense.

Using any other line number for LINENO also makes it harder to debug
scripts, of course, since the line number being reported isn't as helpful.

Is there any way this can be fixed, or introduced as a feature (e.g.,
with some set -o or shopt flag)? A fix would help out a significant
number of people. As it is, a workaround now is to use:

    $(...)

instead of:

    (...)

to get the expected behavior. But this can mean some significant
rewriting of code.
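
Part of that rewriting cost is that `$(...)` captures standard output, so the two forms are not drop-in replacements for each other. The sketch below puts both forms in one function, assuming the body's output can be redirected to stderr. The function name `report` is hypothetical, and the line numbers printed depend on the bash version in use.

```bash
#!/bin/bash
# Sketch comparing a subshell with the $(...) workaround.
# `report` is a hypothetical name.
set +e    # keep errexit off so both forms run

report() {
    trap 'echo "ERR status=$? LINENO=$LINENO"' ERR
    ( echo "step A"; exit 17 )          # subshell: output goes straight to stdout
    $( echo "step B" >&2; exit 17 )     # command substitution: stdout is captured,
                                        # so the output is sent to stderr instead
    trap - ERR
}

report
```

Without the `>&2` redirection, the text printed inside `$(...)` would be captured and then executed as a command, which is rarely what is wanted.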
Enjoy,
Steve Amerige
Server Science Incorporated
[1]Eggsh: A Powerful and Reusable Bash Scripting Platform, an
Open-Source Project

References

1. https://eggsh.com/
Chet Ramey
2017-01-02 18:09:32 UTC
From an end-user's perspective, it is unexpected that the LINENO has the
value of the beginning of the function instead of the line at which the
ERR was caught by trap.
Sure, it's missing functionality. Eduardo's patch was pretty much spot-on.
It will be fixed in the next version of bash, at least.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU ***@case.edu http://cnswww.cns.cwru.edu/~chet/