[Help-bash] Awkward behavior of empty arrays
Cristian Zoicas
2017-08-31 08:38:49 UTC
Hello all

I want to create arrays and use them (easily). I am especially
interested in empty arrays and I felt very uncomfortable with them
since I found some counterintuitive or undocumented behaviors. I
provide some examples below.

Example 1:

1 set -u;
2 unset A;
3 declare -a A=;
4 echo "s1";
5 echo "size: ${#A[@]}";
6 echo "s2";
7 for e in ${A[@]}; do echo "e: ${e}"; done

Starting with the line 3 I want to create an empty array, so I
declare A to have the array attribute and then I (hope to) assign
null to it. Some people may say that I am not right because the
manual says "An array variable is considered set if a subscript
has been assigned a value." They are somehow right because I did
not assign any value to any subscript, so the array should not be
initialized. However let's see what is going on.

Even if I did not assign any subscript a value, the line 5 says
that the size of the array is 1 and it is possible to reference
element ${A[0]} without receiving any error due to "set -u". Also
${A[@]} expands to nothing.

If the manual is right, then the array should not be defined. It
should not exist and I should receive errors due to 'set -u', but
even 'declare -p A' shows that the array exists and it has a
single element whose value is null.

Is this behavior, in which assigning null to an array yields an array
with a single null element, the desired behavior?

Example 2:

1 set -u;
2 unset A;
3 declare -a A=();
4 echo "s1";
5 echo "size: ${#A[@]}";
6 echo "s2";
7 for e in ${A[@]}; do echo "e: $e"; done

Starting with the line 3 I want to create an empty array, so I
declare A to have the array attribute and then I (hope to) assign
an empty array to it.

The line 5 behaves well and says that the size of the array is 0,
thus I imagine that the array is empty. So according to the manual
${A[@]} should expand to nothing and the for loop in the line 7
should not print anything. Instead the message 'bash: A[@]: unbound
variable' is printed. As if A is not set.

There is nothing in the manual that says that () cannot be used
to create empty arrays. The syntax is also accepted.

It means that if the size of the array is 0 then the array cannot
be expanded. Otherwise it can. Such behavior forces people to
write difficult to understand/maintain code.

Example 3:

1 set -u;
2 unset A;
3 declare -a A=(1); unset A[0];
4 echo "s1";
5 echo "size: ${#A[@]}";
6 echo "s2";
7 for e in ${A[@]}; do echo "e: $e"; done

The behavior of this example is similar to the behavior of
example 2 above.

I would like to know if these behaviors make sense.

regards
Cristian
Andy Chu
2017-08-31 09:51:58 UTC
The problems you point out are indeed subtle, but if you follow a few
rules, you can avoid them:

1. Never use anything that isn't in parentheses on the RHS of an array
initialization.

OK:
declare -a A=()
declare -a A=(1 2 3)

INVALID:
declare -a A= (your example 1)
declare -a A=''
declare -a A='x'

The way you should read this is "I'm assigning a string to an array". Bash
does the non-obvious thing of coercing the string to an array -- by
creating an array with ONE element.
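
A minimal sketch of the coercion described above, assuming bash and no `set -u`. The variable names `size_A`, `size_B` and `size_C` are only illustrative. It compares a scalar assignment to an array, an empty compound assignment, and a bare declaration.

```bash
# Sketch only: three ways of "declaring" an array and how many elements result.
unset A B C

declare -a A=      # scalar (string) assignment to an array: stored in A[0]
declare -a B=()    # compound assignment with no words: no elements
declare -a C       # array attribute only, nothing assigned

size_A=${#A[@]}
size_B=${#B[@]}
size_C=${#C[@]}

printf 'A=%s B=%s C=%s\n' "$size_A" "$size_B" "$size_C"
# prints: A=1 B=0 C=0
```

Only the parenthesized form and the bare declaration leave the array without elements; the plain `=` form always writes element 0.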

2. Never reference an array with anything but "${A[@]}" or "${A[i]}" where
i is an integer (double quotes are mandatory in both cases).

Examples 2 and 3 break this rule -- ${A[@]} doesn't have double quotes.

also invalid:
"${A}" # this is an implicit "${A[@]}" but is confusing
"${A[*]}"
${A[*]}
${A[@]}

3. Don't do anything with arrays except COPY them and SPLICE them into
commands.

See my post for the correct ways to do these two things: "Thirteen
Incorrect Ways and Two Awkward Ways to Use Arrays"

http://www.oilshell.org/blog/2016/11/06.html

In particular, don't expect to compare arrays with [[.

$ declare -a A=(A B C D)
$ declare -a B=('A B' 'C D')

$ echo "${#A[@]}"
4
$ echo "${#B[@]}"
2

# arrays compare equal because they're coerced to strings before comparison
$ [[ "${A[@]}" == "${B[@]}" ]]; echo $?
0

I would say that bash also only has "half an array type". Bash will coerce
arrays to strings, and strings to arrays, seemingly at random. It tries
its best to do conversions for you, which is not how any other programming
language with arrays works. (Although this behavior may have originated
with ksh.)

In fact I was thinking of creating a style guide or mode for my
bash-compatible shell OSH to enforce these rules.

The way you would join or split arrays would be:

declare -a myarray=(a b c)
mystr=${myarray[@]} # join array into string, no quotes

declare mystr='a b c'
declare -a myarray=( $mystr ) # split mystr into myarray, no quotes
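
One caveat with the unquoted split above is that the words also undergo pathname expansion. The sketch below swaps in a different technique, the `read -r -a` builtin, which splits on IFS without globbing. It assumes bash and the default IFS; the sample string is made up for illustration.

```bash
# Sketch only: split a string into an array without pathname expansion.
mystr='a * c'

# An unquoted $mystr inside ( ... ) could replace the * with file names.
# read -a splits on IFS but performs no globbing.
read -r -a myarray <<< "$mystr"

printf '%s\n' "${#myarray[@]}" "${myarray[1]}"
# prints 3, then *
```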

ANY OTHER usage where bash does an implicit conversion would be a fatal
runtime error (if this mode is set). For example, [[ "${A[@]}" ==
"${B[@]}" ]] would be a fatal runtime error.

You could call this subset of bash the "strict array style" or "hygienic
array style", which I believe is what you're asking for. It's a set of
rules that let you use arrays in the natural manner, with strict errors.

Andy
Greg Wooledge
2017-08-31 12:14:00 UTC
Actually, it's even worse than that. "${A}" and "$A" and "${A[0]}"
are all equivalent.

wooledg:~$ A=(an array); echo "${A}"
an
wooledg:~$ unset A; A[1]=an; echo "${A}"

wooledg:~$ unset A; declare -A A; A[zebra]=an A[1]=ugly A[0]=array; echo "${A}"
array

The equivalence of $A == ${A[0]} is one of the most horribly annoying
misfeatures ever.
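
A small sketch of that equivalence, assuming bash without `set -u`. The variable names `plain`, `zeroth` and `sparse` are illustrative only.

```bash
# Sketch only: an unsubscripted reference reads element 0 and nothing else.
A=(first second third)
plain=$A          # same value as ${A[0]}
zeroth=${A[0]}

unset A
A[1]=only         # sparse array: element 0 was never assigned
sparse=$A         # empty, even though the array has one element

printf '[%s] [%s] [%s]\n' "$plain" "$zeroth" "$sparse"
# prints: [first] [first] []
```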
Andy Chu
2017-08-31 16:03:01 UTC
Post by Greg Wooledge
Actually, it's even worse than that. "${A}" and "$A" and "${A[0]}"
are all equivalent.
wooledg:~$ A=(an array); echo "${A}"
an
wooledg:~$ unset A; A[1]=an; echo "${A}"
wooledg:~$ unset A; declare -A A; A[zebra]=an A[1]=ugly A[0]=array; echo "${A}"
array
The equivalence of $A == ${A[0]} is one of the most horribly annoying
misfeatures ever.
Yes sorry, that was a typo. I meant implicit "${A[0]}" not implicit
"${A[@]}".

In the same way that you want to avoid

declare -a A=x

in favor of:

declare -a A=(x)

You also want to avoid "$A" in favor of "${A[0]}". In both cases there is
a confusion between A and A[0] (if you think of arrays the way you would in
any other programming language).

Andy
Chet Ramey
2017-08-31 14:47:43 UTC
Post by Andy Chu
The problems you point out are indeed subtle, but if you follow a few
rules, you can avoid them:
1. Never use anything that isn't in parentheses on the RHS of an array
initialization.
This is essentially an attempt to avoid the implicit use of subscript 0.
Post by Andy Chu
2. Never reference an array with anything but "${A[@]}" or "${A[i]}" where
i is an integer (double quotes are mandatory in both cases).
It's not bad as long as you understand that the difference between ${A[@]}
and "${A[@]}" is the same as the difference between $@ and "$@". If you
are prepared to deal with the different word splitting behavior, both can
be useful.
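
The difference can be seen by counting the words each form produces. This is a sketch that assumes bash and the default IFS; `count_args` is a hypothetical helper, not a bash builtin.

```bash
# count_args is a hypothetical helper: it reports how many words it received.
count_args() { echo "$#"; }

A=('one two' 'three')

quoted_at=$(count_args "${A[@]}")     # one word per element
unquoted_at=$(count_args ${A[@]})     # elements are split again on IFS
quoted_star=$(count_args "${A[*]}")   # all elements joined into one word

printf '%s %s %s\n' "$quoted_at" "$unquoted_at" "$quoted_star"
# prints: 2 3 1
```

The unquoted form would also be subject to pathname expansion if an element contained glob characters.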
Post by Andy Chu
"${A}" # this is an implicit "${A[@]}" but is confusing
It's not; it's an implicit "${A[0]}".
Post by Andy Chu
"${A[*]}"
${A[*]}
These are useful in the same way that $* and "$*" are useful.
Post by Andy Chu
In particular, don't expect to compare arrays with [[.
$ declare -a A=(A B C D)
$ declare -a B=('A B' 'C D')
$ echo "${#A[@]}"
4
$ echo "${#B[@]}"
2
# arrays compare equal because they're coerced to strings before comparison
$ [[ "${A[@]}" == "${B[@]}" ]]; echo $?
0
Because the operands in [[ commands don't undergo word splitting.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU ***@case.edu http://cnswww.cns.cwru.edu/~chet/
Greg Wooledge
2017-08-31 14:57:57 UTC
Post by Chet Ramey
Post by Andy Chu
# arrays compare equal because they're coerced to strings before comparison
0
Because the operands in [[ commands don't undergo word splitting.
Oh, man. I didn't read that far down in Andy's message.

It's "wrong" because it doesn't take into account the indices:

wooledg:~$ unset A B; A=([1]=x [12]=y) B=(x y)
wooledg:~$ [[ "${A[@]}" = "${B[@]}" ]] && echo same!
same!

Are those really the same array? I guess that depends on your
application.

It's also *wrong* (no quotes) because it doesn't take into account
that array elements may contain spaces.

wooledg:~$ unset A B; A=("foo bar") B=(foo bar)
wooledg:~$ [[ "${A[@]}" = "${B[@]}" ]] && echo same!
same!

No matter what your application is, that's not right.

The only way to compare two arrays for equivalence correctly is to write
a loop.
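
A sketch of such a loop, assuming bash 4.3 or later for namerefs. `arrays_equal` is a hypothetical helper, and this version treats the indices as significant, which may or may not suit a given application.

```bash
# arrays_equal NAME1 NAME2 (hypothetical helper; needs bash 4.3+ for namerefs).
# Succeeds only when both arrays have the same indices and the same values.
# The caller's arrays must not be named _lhs or _rhs.
arrays_equal() {
  local -n _lhs=$1 _rhs=$2
  local i
  (( ${#_lhs[@]} == ${#_rhs[@]} )) || return 1
  for i in "${!_lhs[@]}"; do
    # ${_rhs[$i]+set} is empty when index $i is missing on the right side.
    [[ -n ${_rhs[$i]+set} ]] || return 1
    [[ ${_lhs[$i]} == "${_rhs[$i]}" ]] || return 1
  done
  return 0
}

A=("foo bar")
B=(foo bar)
if arrays_equal A B; then echo same; else echo different; fi
# prints: different
```

Because the element counts match and every index on the left must exist on the right, the two index sets are necessarily identical when the function succeeds.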

That said, I've never written a shell script that needed to compare two
arrays for equivalence. I have to wonder what the actual application is.
Andy Chu
2017-08-31 16:27:08 UTC
Post by Greg Wooledge
The only way to compare two arrays for equivalence correctly is to write
a loop.
Yes, that was an unintentional omission from the rules:

"3. Don't do anything with arrays except COPY them and SPLICE them into
commands."

->

"3. Don't do anything with arrays except ITERATE over them, COPY them, or
SPLICE them into commands."

So

for i in "${A[@]}"; do ... done

is valid but these are not:

for i in ${A[@]}; do ... done
for i in ${A[*]}; do ... done
for i in "${A[*]}"; do ... done

The point is that you can express any semantics you want with a restricted
set of syntactic rules. The latter 3 can all be expressed using the
$joined pattern I gave.

If you want to see more type confusion, including between arrays and
associative arrays, look at:

https://github.com/oilshell/oil/blob/master/spec/type-compat.test.sh

This came from a discussion here:

https://github.com/oilshell/oil/issues/26

You can make a test matrix of all combinations:

declare -a declare +a declare -A
=(foo) ='' =([key]=value)

and get a VERY VERY confusing set of semantics. They are not orthogonal at
all, similar to the fact that $@ and $* are not orthogonal with double
quoting.

(Actually, the test matrix is even larger than that, because sometimes
array syntax is parsed inside quoted strings! Also, there are two
different += operators -- one in the command language, and one in the arith
language.)

Here is an excerpt of my reasoning:

"""
The problem is that bash has two ways to express an array -- -a vs +a, and
(a b c) vs 'a b c'. I don't see why these two ways are necessary. In Oil I
just use the presence of array literals to tell me.

That is,

- declare -a myarray can simply be written declare myarray=().
- declare +a mystring can simply be written declare mystring=''

This doesn't work in bash but it has no ambiguity. It's a little more
Python-like, where objects have types, not variables.
"""

Andy
Andy Chu
2017-08-31 16:16:46 UTC
Post by Chet Ramey
Post by Andy Chu
1. Never use anything that isn't in parentheses on the RHS of an array
initialization.
This is essentially an attempt to avoid the implicit use of subscript 0.
Yes I'm suggesting a style where all implicit conversions are avoided (from
the POV of someone coming from another programming language). If you can
see a case where implicit conversions happen using the rules I gave, that's
a bug :)

As I just wrote, avoiding

declare -a A=x

in favor of:

declare -a A=(x)

is analogous to avoiding "$A" in favor of "${A[0]}".
Post by Chet Ramey
Post by Andy Chu
"${A[*]}"
${A[*]}
These are useful in the same way that $* and "$*" are useful.
I actually never use $* "$*" $@, or ${A[*]} "${A[*]}" ${A[@]} in any of my
programs. I believe they can always be expressed using different (IMO
clearer) mechanisms.

Consider this:

$ set -- foo '*.sh' bar

There are only 3 possibilities from the four constructs (AFAICT $* is
identical to $@):

$ argv "$@"
['foo', '*.sh', 'bar']

$ argv $@
['foo', 'bad.sh', 'local.sh', 'setup.sh', 'bar']

$ argv "$*"
['foo *.sh bar']

$ argv $*
['foo', 'bad.sh', 'local.sh', 'setup.sh', 'bar']

These 3 possibilities can always be achieved as follows:

$ argv "$@"
['foo', '*.sh', 'bar']

$ joined="$@" # explicit join

$ argv "$joined"
['foo *.sh bar']

$ argv $joined
['foo', 'bad.sh', 'local.sh', 'setup.sh', 'bar']
Post by Chet Ramey
Post by Andy Chu
In particular, don't expect to compare arrays with [[.
$ declare -a A=(A B C D)
$ declare -a B=('A B' 'C D')
$ echo "${#A[@]}"
4
$ echo "${#B[@]}"
2
# arrays compare equal because they're coerced to strings before comparison
$ [[ "${A[@]}" == "${B[@]}" ]]; echo $?
0
Because the operands in [[ commands don't undergo word splitting.
Yes, it depends how you think about it from the implementation POV. In
OSH, variables are represented by a discriminated union (in ML-like syntax):

value =
Undef
| Str(string s)
| StrArray(string* strs)

So when I do [[ "${A[@]}" == "${B[@]}" ]], I actually have to JOIN the
arrays first to maintain compatibility with bash (there could be a flag to
turn this off).

So it's not just lack of splitting, but ADDING joining (i.e. coerce an
array to string). But if bash stores arrays as FLAT strings internally,
and only does splitting upon splicing into an argv array, then you can
think of it as lack of splitting.

Andy
DJ Mills
2017-08-31 16:23:22 UTC
Use case for "$*" (or similarly "${a[*]}"):

join() {
local IFS=$1
shift
printf '%s\n' "$*"
}

arr=(foo bar baz)
join '+' "${arr[@]}"

With a change in IFS, the construct becomes rather useful
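
Note that "$*" joins with only the first character of IFS. For a separator longer than one character, a loop is one alternative that leaves IFS alone. The following is a sketch assuming bash; `join_by` is a hypothetical helper name, not part of bash or of the function above.

```bash
# join_by SEP [ITEM...] (hypothetical helper): joins the items with SEP,
# which may be longer than one character, without modifying IFS.
join_by() {
  local sep=$1 out= item first=1
  shift
  for item in "$@"; do
    if (( first )); then
      out=$item
      first=0
    else
      out+=$sep$item
    fi
  done
  printf '%s\n' "$out"
}

arr=(foo bar baz)
join_by ', ' "${arr[@]}"
# prints: foo, bar, baz
```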
Andy Chu
2017-08-31 16:32:47 UTC
Post by DJ Mills
Use case for "$*" (or similarly "${a[*]}"):
join() {
local IFS=$1
shift
printf '%s\n' "$*"
}
arr=(foo bar baz)
join '+' "${arr[@]}"
With a change in IFS, the construct becomes rather useful
Yes that's true, those semantics can't be expressed with joined="${A[@]}".
I guess I would add an operation like "ifsjoin" to my restricted set of rules.

(Part of the reason I'm thinking about these rules is as a style guide; the
other reason is to enable automatic and correct OSH to Oil translation,
e.g. http://www.oilshell.org/blog/2017/02/05.html . As mentioned Oil has
more Python-like arrays, where the type is carried along with the value and
not the variable/location.)

thanks,
Andy
Greg Wooledge
2017-08-31 16:57:31 UTC
Post by Andy Chu
There are only 3 possibilities from the four constructs (AFAICT $* is
identical to $@):
At the risk of preaching to the choir, unquoted $* and $@ are dangerous.
They don't even have consistent semantics across different shells!
See <https://lists.gnu.org/archive/html/bug-bash/2017-06/msg00283.html>.

Only use quoted "$@" and "$*". Ever. Always. Obviously the same
applies to the array versions, "${a[@]}" and "${a[*]}".
Chet Ramey
2017-08-31 18:19:30 UTC
Post by Andy Chu
$ set -- foo '*.sh' bar
There are only 3 possibilities from the four constructs (AFAICT $* is
identical to $@):
Unquoted, yes. Quoted, no.
Post by Andy Chu
$ argv "$@"
['foo', '*.sh', 'bar']
$ joined="$@" # explicit join
$ argv "$joined"
['foo *.sh bar']
$ argv $joined
['foo', 'bad.sh', 'local.sh', 'setup.sh', 'bar']
This only works if the first character of $IFS is ' '. There is nothing
in the expansion of $@, quoted or unquoted, that separates the arguments
using the first character of $IFS and leaves the result unsplit ("$*").
Post by Andy Chu
Post by Chet Ramey
Because the operands in [[ commands don't undergo word splitting.
Yes, it depends how you think about it from the implementation POV.
Not really. That's how [[ is defined to behave.
Post by Andy Chu
So it's not just lack of splitting, but ADDING joining (i.e. coerce an
array to string).
Word splitting (in a context where word splitting takes place) is the only
way to get multiple arguments out of an expansion.

Post by Andy Chu
But if bash stores arrays as FLAT strings internally,
It doesn't, but that doesn't matter.
Chet Ramey
2017-08-31 14:34:42 UTC
Post by Cristian Zoicas
Hello all
I want to create arrays and use them (easily). I am especially
interested in empty arrays and I felt very uncomfortable with them
since I found some counterintuitive or undocumented behaviors. I
provide some examples below.
1 set -u;
2 unset A;
3 declare -a A=;
4 echo "s1";
5 echo "size: ${#A[@]}";
6 echo "s2";
7 for e in ${A[@]}; do echo "e: ${e}"; done
Starting with the line 3 I want to create an empty array, so I
declare A to have the array attribute and then I (hope to) assign
null to it. Some people may say that I am not right because the
manual says "An array variable is considered set if a subscript
has been assigned a value." They are somehow right because I did
not assign any value to any subscript, so the array should not be
initialized. However let's see what is going on.
Even if I did not assign any subscript a value, the line 5 says
that the size of the array is 1 and it is possible to reference
element ${A[0]} without receiving any error due to "set -u". Also
${A[@]} expands to nothing.
If the manual is right, then the array should not be defined. It
should not exist and I should receive errors due to 'set -u', but
even 'declare -p A' shows that the array exists and it has a
single element whose value is null.
Is this behavior of assigning null to an array an to have an array
with a single null element the desired behavior?
Yes. The manual also says that assigning to an array without using a
subscript is equivalent to assigning to element 0 (or "0").

The other significant consequence of this is treating unsubscripted
references to an array as references to element 0 (or "0").
Post by Cristian Zoicas
1 set -u;
2 unset A;
3 declare -a A=();
4 echo "s1";
5 echo "size: ${#A[@]}";
6 echo "s2";
7 for e in ${A[@]}; do echo "e: $e"; done
Starting with the line 3 I want to create an empty array, so I
declare A to have the array attribute and then I (hope to) assign
an empty array to it.
The line 5 behaves well and says that the size of the array is 0,
thus I imagine that the array is empty. So according to the manual
${A[@]} should expand to nothing and the for loop in the line 7
should not print anything. Instead the message 'bash: A[@]: unbound
variable' is printed. As if A is not set.
It isn't set because there aren't any subscripts with values. Ultimately,
`declare -a A' and `declare -a A=()' are equivalent.

The real question is whether or not the size/length operator should error
out when there are no elements in the array. The current behavior could be
considered a bug.
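
For scripts that must run under `set -u`, one commonly used workaround is to guard the expansion with the `+` operator. The sketch below assumes bash; the counters `count` and `filled` are illustrative names. As far as I know, bash 4.4 and later no longer report an unbound variable for "${A[@]}" on an empty array, so the guard matters mainly for older releases.

```bash
set -u
declare -a A=()

# ${A[@]+...} expands to nothing while A has no elements, so the inner
# "${A[@]}" is only evaluated when there is something to expand.
count=0
for e in ${A[@]+"${A[@]}"}; do
  count=$((count + 1))
done

A=('one two' 'three')
filled=0
for e in ${A[@]+"${A[@]}"}; do
  filled=$((filled + 1))
done

printf '%s %s\n' "$count" "$filled"
# prints: 0 2
```

The inner double quotes keep each element as a single word, so 'one two' is not split.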
Post by Cristian Zoicas
There is nothing in the manual that says that the () cannot be used
to create empty arrays. The syntax is also accepted
It means that if the size of the array is 0 then the array cannot
be expanded. Otherwise it can. Such behavior forces people to
write difficult to understand/maintain code.
1 set -u;
2 unset A;
3 declare -a A=(1); unset A[0];
4 echo "s1";
5 echo "size: ${#A[@]}";
6 echo "s2";
7 for e in ${A[@]}; do echo "e: $e"; done
The behavior of this example is similar with the behavior of the
example 2 above.
There still aren't any subscripts with values.

Now, "making sense" is in the eye of the beholder. I could see an
array with no elements being `set' in the same sense that a string variable
whose value is null is set. I did not choose that alternative.
Chris F.A. Johnson
2017-08-31 19:12:15 UTC
Post by Cristian Zoicas
Hello all
I want to create arrays and use them (easily). I am especially
interested in empty arrays and I felt very uncomfortable with them
since I found some counterintuitive or undocumented behaviors. I
provide some examples below.
1 set -u;
2 unset A;
3 declare -a A=;
You have set A[0] to the empty string.

If you don't want to set A[0], use:

declare -a A ## No =
--
Chris F.A. Johnson, <http://cfajohnson.com>
Cristian Zoicas
2017-09-05 11:02:07 UTC
Hello all and sorry for the delay.

Now, after reading the answers from all of you, I'll summarize my conclusions. Below you may
find comments related to your previous answers, but to keep this short I have addressed
only the problems related to my original question (empty arrays).
Post by Andy Chu
1. Never use anything that isn't in parentheses on the RHS of an array initialization.
declare -a A=()
declare -a A=(1 2 3)
While the second statement is OK (I mean, the array works well if I want to find
the number of elements or to expand it), the first one is problematic.

Apparently 'declare -a A=()' works, but only until you try to expand "${A[@]}".
Try the following code (essentially my example 2):

set -u
unset A
declare -a A=()
echo "${A[@]}"

The last statement will give you the error "bash: A[@]: unbound variable".

So 'declare -a A=()' is not an effective way of creating empty arrays.
I would even add that bash does not support empty arrays.
Post by Andy Chu
declare -a A= (your example 1)
The way you should read this is "I'm assigning a string to an array". Bash does the
non-obvious thing of coercing the string to an array -- by creating an array
with ONE element.
In a shell context I could understand such a statement being invalid (if it
were impossible or too difficult to implement it in a more user-friendly way).
Normally I would not use it, but I needed to create an empty array, so
I wrote that statement and was surprised to end up with an array
with a single element initialized with null.

Bash did the very-non-obvious thing of creating an array with a single element from null.
Post by Andy Chu
2. Never reference an array with anything but "${A[@]}" or "${A[i]}" where
i is an integer (double quotes are mandatory in both cases).
I wouldn't, but the manual says

"When there are no array members, ${name[@]} expands to nothing."

So "${A[@]}" should be different from ${A[@]}. Look at my example 1.

With the statement 'declare -a A=;' I was hoping to create an empty array. The for loop (by
luck) did what I expected: the array did not contain any elements (I thought), so ${A[@]}
(not "${A[@]}") expanded to nothing, as the documentation says, and the statements inside
the loop were not executed.

On the other hand, if I had written the for loop as

for e in "${A[@]}"; do echo "e: ${e}"; done

the statement "echo" would have been executed.

The for loop worked because A was created as an array with a single null element. And
the sentence "When there are no array members, ${name[@]} expands to nothing." in the
manual is wrong. When an array is empty, it is considered undefined (see my example 2).
Post by Chet Ramey
Yes. The manual also says that assigning to an array without using a subscript is
equivalent to assigning to element 0 (or "0").
I think that the manual should be clearer about what happens when
assigning null to an array without using the syntax

name=(value1 ... valuen)

If the manual says that "declare -a A=" is equivalent to assigning to
element 0, then that should be included in the Arrays section.
Post by Chet Ramey
Example 2
...
The real question is whether or not the size/length operator should error out when there
are no elements in the array. The current behavior could be considered a bug.
If the current behavior could be considered a bug, then it means that bash does not support
empty arrays, or that bash supports them in a very restricted way: you can create
them, but not use them.

This is also in contradiction with the manual statement "When there are no array members, ${name[@]} expands to nothing."


In my opinion bash should support

a) *real* arrays. We should be able to iterate
over array elements whether the array is empty or
not. Currently that is not supported.

b) the possibility of expanding an array to the set of strings
stored in its elements; if the array is empty, we must
get an empty set of strings.

regards
Cristian Zoicas
DJ Mills
2017-09-05 12:41:36 UTC
This is not a comment on what bash should or should not do, but simply a
workaround if you really need to use set -u (without which you wouldn't be
running into any issues).

You can always just put a simple conditional checking "${#array[@]}" for a
non-zero value before iterating.
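
A sketch of that conditional, assuming bash with `set -u` in effect. The names `array`, `item` and `visited` are illustrative only.

```bash
set -u
declare -a array=()
visited=0

# The length check runs first, so "${array[@]}" is never expanded
# while the array has no elements.
if (( ${#array[@]} > 0 )); then
  for item in "${array[@]}"; do
    visited=$((visited + 1))
  done
fi

echo "visited: $visited"
# prints: visited: 0
```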
Cristian Zoicas
2017-09-05 13:34:10 UTC
Post by DJ Mills
This is not a comment on what bash should or should not do, but simply a
workaround if you really need to use set -u (without which you wouldn't be
running into any issues).
You can always just put a simple conditional checking "${#array[@]}" for a
non-zero value before iterating.
It is true that you can add that conditional, but as I mentioned at the end
of the original post, it forces people to write code that is difficult to write and
maintain. Naturally, it works well for small problems, but for bigger scripts I would
avoid writing code like this.

Thank you for your feedback.

Cristian
Post by DJ Mills
Post by Cristian Zoicas
Hello all and sorry for the delay.
Now after reading the answers from all of you I'll resume next my
conclusions. Below you may
find comments related to your previous answer but in order to be short, I have addressed
only problems related to my original question (empty arrays).
Post by Andy Chu
1. Never use anything that isn't in parentheses on the RHS of an array initialization.
declare -a A=()
declare -a A=(1 2 3)
While the second statement is ok (I mean, the array works well if I want to find
the number of elements or to expand it), the first one is problematic.
set -u
unset A
declare -a A=()
So 'declare -a A=()' is not an effective way of creating empty arrays.
I would even add that bash does not support empty arrays.
Post by Andy Chu
declare -a A= (your example 1)
The way you should read this is "I'm assigning a string to an array". Bash does the
non-obvious thing of coercing the string to an array -- by creating an array
with ONE element.
In a shell context I could understand that such a statement is invalid (if it
would be impossible or too difficult to implement it in a more user-friendly way).
Normally I would not use it, but I needed to create an empty array, so
I wrote that statement and in the end I had the surprise of creating an array
with a single element initialized with null.
Bash did the very-non-obvious thing of creating an array with a single
element from null.
where
Post by Andy Chu
i is an integer (double quotes are mandatory in both cases).
I wouldn't but the manual says
With the statement 'declare -a A=;' I was hoping to create an empty array. The for loop (by
luck) did what I expected: the array did not contain any elements and the statements inside
the loop were not executed.
On the other hand, if I had written the for loop as
the statement "echo" would have been executed.
The for loop worked because A was created as an array with a single null element. And
the manual is wrong. When an array is empty, it is considered undefined
(see my example 2).
Post by Andy Chu
Yes. The manual also says that assigning to an array without using a subscript is
equivalent to assigning to element 0 (or "0").
I think that the manual should be more clear about what is going on when
assigning null to an array when not using the syntax
name=(value1 ... valuen)
If the manual says that "declare -a A=" is equivalent with assigning to the
element 0 then it should be included in the Arrays section.
Post by Andy Chu
Example 2
...
The real question is whether or not the size/length operator should error out when there
are no elements in the array. The current behavior could be considered a bug.
If the current behavior could be considered a bug then it means that bash does not support
empty arrays. Or that bash supports them in a very restricted way, that is, only to create
them, but not to use them.
This is also in contradiction with the manual statement "When there are
In my opinion bash should support
a) *real* arrays. We should be able to iterate
over array elements whether or not the array is
empty. Currently that is not supported.
b) the possibility of expanding an array to the set of strings
stored in its elements; if the array is empty, then we must
get an empty set of strings.
regards
Cristian Zoicas
Andy Chu
2017-09-05 16:33:00 UTC
Permalink
Hm, interesting. I think your whole message below can be rephrased as:
empty arrays don't work with set -u. Is that right?

If so I agree, and I would consider that a design problem. I just tested
mksh and it does the same thing as bash. (mksh is a pdksh fork, and pdksh
was a clone of AT&T ksh, so I suspect the behavior originated there.)

set -u is useful with arrays for out-of-bounds checks:

$ set -u
$ declare -a A=(1 2 3)
$ echo "${A[2]}"
3
$ echo "${A[3]}"
-bash: A[3]: unbound variable

But I agree that the empty array check is undesired. It's also
inconsistent with the argv array:

$ set -u
$ set -- # clear argv array
$ echo "$#"
0
$ argv "$@" # gives the desired empty array of arguments, not an error
[]

Horrible workaround:

Copy to the argument array:

set -- "${A[@]}"

Copy from the arguments array:

declare -a A=( "$@" )

Then use "$@" , for i in "$@", etc. everywhere?
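
A minimal sketch of the "use "$@" everywhere" idea above, assuming bash. The helper name show_items is made up for illustration. The list lives in the positional parameters from the start, so nothing ever expands an empty named array under set -u:

```shell
#!/usr/bin/env bash
set -u

# Hypothetical helper: print the number of arguments, then one per line.
show_items() {
  echo "count=$#"
  local item
  for item in "$@"; do
    echo "item=$item"
  done
}

set --                       # start with an empty list; "$@" is safe under set -u
show_items "$@"              # count=0
set -- "$@" "first value"    # append by rebuilding the positional parameters
set -- "$@" "second"
show_items "$@"              # count=2, then the two items
```

The obvious cost is that a script (or function) has only one such list at a time, which is why this is a workaround rather than a replacement for arrays.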

-----

I would also like it if bash would provide a way to safely interpolate
empty arrays with set -u. Syntax idea:

echo "${A[@@]}"

That way there is no extra mode, which is harder to read IMO.


Andy
Andy Chu
2017-09-05 17:51:49 UTC
Permalink
Post by Andy Chu
I would also like it if bash would provide a way to safely interpolate
That way there is no extra mode, which is harder to read IMO.
Given that I already complained that it takes 8 punctuation characters to
interpolate an array and 10 to copy it [1], I should take this suggestion
back.

A simpler and backward compatible syntax would be:

echo "${@A}" # like "${A[@]}" but an empty array OK with set -u

I was thinking of using that syntax for arrays in OSH. And if you have
arrays, you don't really need word splitting, so you can argue that ${@A}
(no quotes) should be the same as "${@A}". It's less consistent, but then
the common case with fewer characters is correct. Or I might just leave
out ${@A} (syntax error) and only provide "${@A}".

Andy


[1] http://www.oilshell.org/blog/2016/11/06.html
Greg Wooledge
2017-09-05 17:59:40 UTC
Permalink
set +u
Cristian Zoicas
2017-09-08 07:39:37 UTC
Permalink
Post by Andy Chu
That way there is no extra mode, which is harder to read IMO.
Given that I already complained that it takes 8 punctuation characters to interpolate an array and 10 to copy it [1], I should take this suggestion back.
Andy
[1] http://www.oilshell.org/blog/2016/11/06.html
My need of using empty arrays was raised exactly from the ideas
mentioned in this document: With arrays you do not have to take
into account or defend against word splitting.

Cristi
Chet Ramey
2017-09-10 18:58:22 UTC
Permalink
Post by Cristian Zoicas
My need of using empty arrays was raised exactly from the ideas
mentioned in this document: With arrays you do not have to take
into account or defend against word splitting.
You do; you just don't have to use word splitting to simulate
individual array elements.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU ***@case.edu http://cnswww.cns.cwru.edu/~chet/
Cristian Zoicas
2017-09-07 07:07:44 UTC
Permalink
Hm, interesting. I think your whole message below can be rephrased as: empty arrays don't work with set -u. Is that right?
If so I agree, and I would consider that a design problem. I just tested mksh and it does the same thing as bash. (mksh is a pdksh fork, and pdksh was a clone of AT&T ksh, so I suspect the behavior originated there.)
$ set -u
$ declare -a A=(1 2 3)
$ echo "${A[2]}"
$ echo "${A[3]}"
-bash: A[3]: unbound variable
$ set -u
$ set -- # clear argv array
$ echo "$#"
0
[]
-----
That way there is no extra mode, which is harder to read IMO.
Andy
Cristian Zoicas
2017-09-07 09:31:01 UTC
Permalink
Hm, interesting. I think your whole message below can be rephrased as: empty arrays don't work with set -u. Is that right?
right...
If so I agree, and I would consider that a design problem.
... but I prefer this sentence containing the expression "design problem" instead
of "empty arrays don't work with set -u". And I would also add the following problems:

* "declare -a A=" creates an array with a single null element;
* the manual should clarify some missing issues.
I just tested mksh and it does the same thing as bash. (mksh is a pdksh fork, and pdksh was a clone of AT&T ksh, so I
suspect the behavior originated there.)
Probably the original designers wanted to allow the use of an empty array (e.g. declare -a A=() )
at least in a minimal way, so they decided that a good compromise was to allow testing the length of the array but not
expanding it. That could make sense if you don't want to allow empty arrays, but I cannot understand
why such arrays should not be allowed.
$ set -u
$ declare -a A=(1 2 3)
$ echo "${A[2]}"
$ echo "${A[3]}"
-bash: A[3]: unbound variable
Excellent example. I had not thought of using set -u for out-of-bounds checks.
$ set -u
$ set -- # clear argv array
$ echo "$#"
0
[]
I would call this inconsistency another "design problem". Here is another awkward example based on it:

set -u;
for p in "$@"; do echo "$p"; done
declare -a A=( "$@" )
for p in "${A[@]}"; do echo "$p"; done
echo ${A[@]}

Everything works well until the @ array is empty.


$@ behaves very well (I would say perfectly) with respect to the for
loop even when it is empty. In the example below

set -u;
set --;
for i in "$@"; do
echo "i=${i}";
done

The statements in the for loop are not executed.

I want to use this occasion to mention another issue related to the expansion of the only empty
array that can be expanded, $@ (because it is not possible to expand other empty arrays under set -u).

Let's see what happens in the following cases:

1) unset A; declare A=; for v in ${A}; do echo "v="; done

# A is declared WITHOUT the array attribute. The statements in the for loop are not executed because $A expands to nothing.

2) unset A; declare A=; for v in "${A}"; do echo "v="; done

# A is declared WITHOUT the array attribute. The statements in the for loop *are* executed (it
# is not a problem). bash sees the first " and considers it the beginning of a string. Inside the
# string, A must be expanded, and it expands to null. After A the string ends, and thus we
# get a zero-length string.

3) set -u; set -- A B C; for s in "X$@Y"; do echo "s=${s}"; done

# It prints : "s=XA s=B s=CY" and it behaves as documented. I imagine that a similar mechanism as above is engaged. When bash
# sees "X$@Y" it considers " the beginning of a string, it finds the 'X' and considers it a string that must be concatenated with
# the first element of the array, and the things go ahead as documented.

4) set -u; set -- A B C; for s in "$@"; do echo "s=${s}"; done

# It prints : "s=A s=B s=C"

# Similar things happen here as above, with the difference that a zero-length string is concatenated with the first element of the
# array.

5) set -u; set --; for s in "$@"; do echo "s=${s}"; done

# It prints nothing. For me (and I hope for others) this works perfectly. But bash does not engage the same mechanisms as above
# because in this case bash does not find a zero-length string between " and $ that is concatenated with the empty expansion
Thank you.
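
The five cases above can be checked with a short sketch, assuming bash. The helper show is hypothetical; it brackets every word it receives so that empty words and missing words can be told apart:

```shell
#!/usr/bin/env bash
set -u

# Hypothetical helper: print each argument in brackets so empty words are visible.
show() {
  local s
  for s in "$@"; do printf '[%s]' "$s"; done
  echo
}

set -- A B C
show "X$@Y"     # [XA][B][CY]
show "$@"       # [A][B][C]

set --          # no positional parameters
show "$@"       # prints an empty line: zero words
show "X$@Y"     # [XY]: the surrounding text keeps one word alive
show ""         # []: an explicit empty string is still one word
```

The difference between the last three calls is the point of case 5: a bare "$@" with no positional parameters produces no word at all, while any literal text inside the same quotes leaves exactly one word.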
Cristi
Greg Wooledge
2017-09-07 12:18:48 UTC
Permalink
Post by Cristian Zoicas
Hm, interesting. I think your whole message below can be rephrased as: empty arrays don't work with set -u. Is that right?
right...
You know, I'd been taking this assertion on faith, but after testing
it, I just don't see any evidence that it's true.

wooledg:~$ set -u
wooledg:~$ unset a
wooledg:~$ for i in "${a[@]}"; do :; done
wooledg:~$ a=()
wooledg:~$ for i in "${a[@]}"; do :; done
wooledg:~$ a="a string"
wooledg:~$ for i in "${a[@]}"; do :; done
wooledg:~$ unset a
wooledg:~$ a=([1]=foo)
wooledg:~$ for i in "${a[@]}"; do :; done
wooledg:~$ echo "${a[0]}"
bash: a[0]: unbound variable
wooledg:~$ set +u

The ONLY way I can get -u to trigger is to reference a specific index
that is not set within the array. In all other cases, an array that
is either unset or empty, or which has an index gap, works just
fine with "${a[@]}" under -u.

Of course, -u is still horrible and I will never use it in my scripts.
But it's not because of how it treats arrays.
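
For the one case that does trigger -u above (a specific index that is not set), a minimal sketch of two guards, assuming bash. The "<unset>" placeholder text is arbitrary:

```shell
#!/usr/bin/env bash
set -u

a=([1]=foo)          # sparse array: index 0 is not set

# ${a[0]-fallback} substitutes a default instead of tripping nounset.
first=${a[0]-"<unset>"}
second=${a[1]-"<unset>"}
echo "a[0]=$first a[1]=$second"

# ${a[0]+x} expands to "x" only if the element is set, so it works as a test.
if [ -n "${a[0]+x}" ]; then
  echo "index 0 is set"
else
  echo "index 0 is not set"
fi
```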
Cristian Zoicas
2017-09-07 12:23:50 UTC
Permalink
Post by Greg Wooledge
Post by Cristian Zoicas
Hm, interesting. I think your whole message below can be rephrased as: empty arrays don't work with set -u. Is that right?
right...
You know, I'd been taking this assertion on faith, but after testing
it, I just don't see any evidence that it's true.
wooledg:~$ set -u
wooledg:~$ unset a
wooledg:~$ a=()
wooledg:~$ a="a string"
wooledg:~$ unset a
wooledg:~$ a=([1]=foo)
wooledg:~$ echo "${a[0]}"
bash: a[0]: unbound variable
wooledg:~$ set +u
The ONLY way I can get -u to trigger is to reference a specific index
that is not set within the array. In all other cases, an array that
is either unset or empty, or which has an index gap, works just
Could you please provide information about your bash version?

Mine is

GNU bash, version 4.3.30(1)-release (i586-pc-linux-gnu)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

Running on Debian GNU/Linux 8.1 (jessie)
Post by Greg Wooledge
Of course, -u is still horrible and I will never use it in my scripts.
But it's not because of how it treats arrays.
Greg Wooledge
2017-09-07 12:31:58 UTC
Permalink
Post by Cristian Zoicas
Post by Greg Wooledge
wooledg:~$ set -u
wooledg:~$ unset a
wooledg:~$ a=()
wooledg:~$ a="a string"
wooledg:~$ unset a
wooledg:~$ a=([1]=foo)
wooledg:~$ echo "${a[0]}"
bash: a[0]: unbound variable
wooledg:~$ set +u
Could you please provide information about your bash version ?
Mine is
GNU bash, version 4.3.30(1)-release (i586-pc-linux-gnu)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
Running on Debian GNU/Linux 8.1 (jessie)
4.4.12(1)-release
As packaged by Debian 9 (stretch), amd64.

ii bash 4.4-5 amd64 GNU Bourne Again SHell
Cristian Zoicas
2017-09-07 12:32:12 UTC
Permalink
Post by Greg Wooledge
Post by Cristian Zoicas
Hm, interesting. I think your whole message below can be rephrased as: empty arrays don't work with set -u. Is that right?
right...
You know, I'd been taking this assertion on faith, but after testing
it, I just don't see any evidence that it's true.
wooledg:~$ set -u
wooledg:~$ unset a
wooledg:~$ a=()
wooledg:~$ a="a string"
wooledg:~$ unset a
wooledg:~$ a=([1]=foo)
wooledg:~$ echo "${a[0]}"
bash: a[0]: unbound variable
wooledg:~$ set +u
The ONLY way I can get -u to trigger is to reference a specific index
that is not set within the array. In all other cases, an array that
is either unset or empty, or which has an index gap, works just
Of course, -u is still horrible and I will never use it in my scripts.
But it's not because of how it treats arrays.
Could you explain why -u is horrible? Does it have some drawback?

thx
Cristi
Greg Wooledge
2017-09-07 12:44:18 UTC
Permalink
Post by Cristian Zoicas
Could you explain why -u is horrible? Does it have some drawback?
http://mywiki.wooledge.org/BashFAQ/112

At least it's not set -e. set -e has no possible justification at
all. set -u is merely... quirky.
Andy Chu
2017-09-07 16:23:05 UTC
Permalink
Post by Greg Wooledge
Post by Cristian Zoicas
Could you explain why -u is horrible? Does it have some drawback?
http://mywiki.wooledge.org/BashFAQ/112
At least it's not set -e. set -e has no possible justification at
all. set -u is merely... quirky.
I agree that both are quirky, but even with quirks they're still useful,
and I use them exclusively. "no possible justification" is hyperbole --
plenty of people use them.

I believe MOST serious shell users use set -e. Look at debootstrap -- it's
a few thousand lines of shell at the foundation of Debian. They could
have stopped using it any time in the last 20 years but didn't. I also
looked at the bash scripts at the foundation of Nix -- same thing.


set -u:

- For environment variables, typically my scripts begin with

readonly SOMEVAR=${SOMEVAR:-}

This has the nice side effect of actually declaring what environment
variables your script uses! Otherwise when you see $SOMEVAR you don't know
if it is an env var or a global in some other module you've sourced.

- I didn't know about the empty array thing, but apparently that's fixed in
bash 4.4 (and there is a known workaround in 4.3 and earlier,
${A+"${A[@]}"} which I just found out about).
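
The environment-variable convention from the first item above can be sketched as follows, assuming bash. DEMO_LOG_LEVEL and DEMO_OUTPUT_DIR are made-up names, and the /tmp default is only an example:

```shell
#!/usr/bin/env bash
set -u

# DEMO_LOG_LEVEL and DEMO_OUTPUT_DIR are hypothetical names for this sketch.
# ${VAR:-default} never trips nounset, so this is safe even if they are not exported.
readonly DEMO_LOG_LEVEL=${DEMO_LOG_LEVEL:-}
readonly DEMO_OUTPUT_DIR=${DEMO_OUTPUT_DIR:-/tmp}

if [ -z "$DEMO_LOG_LEVEL" ]; then
  echo "DEMO_LOG_LEVEL not provided; using quiet mode"
fi
echo "writing to $DEMO_OUTPUT_DIR"
```

After these lines every later reference to the two variables is safe under set -u, and the top of the script doubles as a list of the environment variables it reads.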

set -e:

There are lots of nice examples here: http://mywiki.wooledge.org/BashFAQ/105

In practice I only run into one issue:

local foo=$(echo hi; false)

->

local foo
foo=$(echo hi; false)

(However local foo='bar' is fine because there's no possibility of failure.)
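
A small sketch of why the split form matters, assuming bash. The function names masked and split are hypothetical, and each one is run in its own subshell so that errexit cannot end the demonstration itself:

```shell
#!/usr/bin/env bash

# Hypothetical functions; each runs in a subshell below so set -e cannot end the demo.
masked() {
  set -e
  local foo=$(echo hi; false)   # exit status of 'local' (0) hides the failure
  echo "masked: reached with foo=$foo"
}

split() {
  set -e
  local foo
  foo=$(echo hi; false)         # plain assignment keeps the status of $(...), so errexit fires
  echo "split: not reached"
}

( masked ); echo "masked exit status: $?"
( split );  echo "split exit status: $?"
```

Note that the subshells are called as plain commands. Calling them as part of an if condition or an && / || list would suspend errexit inside them and hide the difference.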

I never really use subshells; those need another set -e at the top. There
is a new inherit_errexit shell option (shopt -s inherit_errexit) but I haven't used it.

I never really used 'echo $(foo)', instead I assign it to a variable first.

I did a very principled implementation of errexit in OSH. I didn't find
any problems with it in THEORY. Yes in practice bash has surprising
behavior, but in theory it's a good idea, and it's not hard to implement.
There are just a handful of places where you turn it off: the condition of
while/until loops, if conditions, all clauses in && and || except the last
clause, etc.

The main thing I disallowed was the weird behavior mutating errexit while
it's disabled, which bash has a somewhat odd rule for. Example:

if { true; set -o errexit; false; }; then
echo hi
fi

FWIW I tried to switch to explicit error handling once, but that just
ruined the entire reason I use bash: because it gets things done fast. If
I want to do explicit detailed error checking I'll use another language.
As I said, MOST serious scripts I see use set -e.

Andy
Chris F.A. Johnson
2017-09-07 19:18:32 UTC
Permalink
Post by Andy Chu
Post by Greg Wooledge
Post by Cristian Zoicas
Could you explain why -u is horrible? Does it have some drawback?
http://mywiki.wooledge.org/BashFAQ/112
At least it's not set -e. set -e has no possible justification at
all. set -u is merely... quirky.
I agree that both are quirky, but even with quirks they're still useful,
and I use them exclusively. "no possible justification" is hyperbole --
plenty of people use them.
I believe MOST serious shell users use set -e.
I have never used either 'set -e' or 'set -u'.
--
Chris F.A. Johnson, <http://cfajohnson.com>
DJ Mills
2017-09-07 19:36:30 UTC
Permalink
Post by Chris F.A. Johnson
I have never used either 'set -e' or 'set -u'.
--
Chris F.A. Johnson, <http://cfajohnson.com>
Seconded.
Chet Ramey
2017-09-07 19:35:59 UTC
Permalink
Post by Andy Chu
I did a very principled implementation of errexit in OSH. I didn't find
any problems with it in THEORY. Yes in practice bash has surprising
behavior, but in theory it's a good idea, and it's not hard to implement.
There are just a handful of places where you turn it off: the condition of
while/until loops, if conditions, all clauses in && and || except the last
clause, etc.
The main thing I disallowed was the weird behavior mutating errexit while
Which Posix requires.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU ***@case.edu http://cnswww.cns.cwru.edu/~chet/
Andy Chu
2017-09-07 20:00:33 UTC
Permalink
Post by Chet Ramey
Post by Andy Chu
I did a very principled implementation of errexit in OSH. I didn't find
any problems with it in THEORY. Yes in practice bash has surprising
behavior, but in theory it's a good idea, and it's not hard to implement.
There are just a handful of places where you turn it off: the condition
of
Post by Andy Chu
while/until loops, if conditions, all clauses in && and || except the
last
Post by Andy Chu
clause, etc.
The main thing I disallowed was the weird behavior mutating errexit while
Which Posix requires.
Yes, fair enough. Bash is the most popular shell by far so it gets "blamed"
for behavior that's necessary for POSIX compatibility and compatibility with
other shells.

I had noticed that some people discourage or complain about "bash-isms"
(this seems to be common in Debian.) As far as I can tell, most of those
things are historically "ksh-isms", but nobody calls them that.

Andy
Chet Ramey
2017-09-07 20:09:12 UTC
Permalink
Post by Andy Chu
I had noticed that some people discourage or complain about "bash-isms"
(this seems to be common in Debian.) As far as I can tell, most of those
things are historically "ksh-isms", but nobody calls them that.
The comparison isn't against ksh. It's mostly made against dash, which is
much "purer", or sometimes Posix. It happens the most on Debian because
Debian uses dash as /bin/sh.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU ***@case.edu http://cnswww.cns.cwru.edu/~chet/
Andy Chu
2017-09-07 23:57:37 UTC
Permalink
Post by Chet Ramey
Post by Andy Chu
I had noticed that some people discourage or complain about "bash-isms"
(this seems to be common in Debian.) As far as I can tell, most of those
things are historically "ksh-isms", but nobody calls them that.
The comparison isn't against ksh. It's mostly made against dash, which is
much "purer", or sometimes Posix. It happens the most on Debian because
Debian uses dash as /bin/sh.
Yes, I've done a lot of tests of dash behavior. I'm pointing out that
certain things are called "bash-isms" that may not have originated with
bash.

For example, I think it's likely the ${a[@]} and set -u behavior originated
with ksh (and I'm glad it was fixed in 4.4). POSIX is of course silent
since it doesn't have arrays.

Andy
Chet Ramey
2017-09-09 19:03:32 UTC
Permalink
Post by Chet Ramey
Post by Andy Chu
I had noticed that some people discourage or complain about "bash-isms"
(this seems to be common in Debian.) As far as I can tell, most of those
things are historically "ksh-isms", but nobody calls them that.
The comparison isn't against ksh. It's mostly made against dash, which is
much "purer", or sometimes Posix. It happens the most on Debian because
Debian uses dash as /bin/sh.
Yes, I've done a lot of tests of dash behavior. I'm pointing out that
certain things are called "bash-isms" that may not have originated with bash.
Sure. But they're called bashisms, regardless of where they originated,
because they exist in bash and don't exist in dash, so in the end it
doesn't matter. Nothing more complicated than that.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU ***@case.edu http://cnswww.cns.cwru.edu/~chet/
Chet Ramey
2017-09-07 14:33:11 UTC
Permalink
Post by Greg Wooledge
Post by Cristian Zoicas
Hm, interesting. I think your whole message below can be rephrased as: empty arrays don't work with set -u. Is that right?
right...
You know, I'd been taking this assertion on faith, but after testing
it, I just don't see any evidence that it's true.
It changed in bash-4.4:

3. New Features in Bash

a. Using ${a[@]} or ${a[*]} with an array without any assigned elements when
the nounset option is enabled no longer throws an unbound variable error.

Originally, the nounset option worked on everything: variables, individual
positional parameters ($1), the set of positional parameters ($@), arrays,
special parameters that haven't gotten a value ($! when there haven't been
any asynchronous processes), etc. Posix decided on the special handling of
$@ in 2009, though treating $@ as a non-fatal error when there are no
positional parameters and nounset is enabled was not historical practice.
I changed the handling of ${A[@]} when A has no assigned elements to match
that behavior.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU ***@case.edu http://cnswww.cns.cwru.edu/~chet/
Andy Chu
2017-09-06 18:53:20 UTC
Permalink
Post by Cristian Zoicas
set -u
unset A
declare -a A=()
Cristian, coincidentally I noticed a solution to your problem in the commit
description here:

https://github.com/NixOS/nixpkgs/commit/81194eef45e2c03018687be60c2c695a1729df36

In short, you can do:

$ argv ${A+"${A[@]}"}

This works with set -u, including with an empty array. (At least with bash
4.3 on my machine.)

You can compare the bad-interpolate vs. good-interpolate functions here
(please test with your bash version):

https://github.com/oilshell/blog-code/blob/master/empty-arrays/demo.sh

It's very similar in form to:

${1+"$@"}

- The outer substitution is NOT quoted
- The inner substitution IS quoted
- Use + to test for unset, not :+ to test for unset or empty. What "unset"
and "empty" mean in terms of arrays is fairly confusing.

It's not obvious to me why this works. But it's also not clear to me why
there is a problem with "${A[@]}" and set -u in the first place, because an
empty array "should be" distinct from an unset variable. Probably the only
explanation is "ksh did it that way."
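
A minimal sketch of the guarded interpolation described above, assuming bash. The helper count_args is hypothetical and only reports how many words it received. The last two lines show one caveat: a bare $A means ${A[0]}, so for an array without an element at index 0 the ${A+...} test reports "unset", while ${A[@]+...} tests whether any element is set:

```shell
#!/usr/bin/env bash
set -u

# Hypothetical helper: report how many arguments it received.
count_args() { echo "$#"; }

empty=()
full=("one" "two words")
sparse=([1]=foo)             # no element at index 0

count_args ${empty+"${empty[@]}"}       # 0
count_args ${full+"${full[@]}"}         # 2
count_args ${sparse+"${sparse[@]}"}     # 0: $sparse means ${sparse[0]}, which is unset
count_args ${sparse[@]+"${sparse[@]}"}  # 1: tests whether any element is set
```

The outer expansion stays unquoted so that an unset array yields no word at all, and the inner "${A[@]}" stays quoted so that elements containing spaces are not split.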

Andy