Discussion:
[Help-bash] slow in function creation
Peng Yu
2018-12-04 04:26:27 UTC
Hi,

I see that creating functions can be much slower than looking them up
with typeset -F when there are many functions. Why can function
creation be so slow? Is there a way to improve its runtime?

I also see that the typeset -F loop slows down about 6x (1.767 s to
10.999 s) going from 100000 to 1000000 functions. Is 100000
approximately the number of buckets in the hash table storing the
functions?

$ ./main.sh
==> 1 <==
0.008
0.000
1.254
==> 10 <==
0.009
0.000
1.291
==> 1000 <==
0.013
0.007
1.280
==> 10000 <==
0.022
0.065
1.296
==> 100000 <==
0.110
1.706
1.767
==> 1000000 <==
1.015
218.159
10.999
$ cat main.sh
#!/usr/bin/env bash
# vim: set noexpandtab tabstop=2:

tmpfile=$(mktemp -u)
TIMEFORMAT=%R
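for n in 1 10 1000 10000 100000 1000000
do
	echo "==> $n <=="
	# Timing 1: generate the definitions f1 .. f(n-1) (-e is a gawk option).
	time awk -v n="$n" -e 'BEGIN { for(i=1;i<n;++i) printf("function f%d { :; }\n", i) }' > "$tmpfile"
	# Timing 2: define all the functions.
	time source "$tmpfile"
	# Timing 3: 100000 lookups with typeset -F.
	time for ((i=0;i<100000;++i))
	do
		typeset -F f0
	done > /dev/null
done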
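
One detail of the script above may matter when reading the third timing: the awk loop starts at i=1, so f0 is never defined and every `typeset -F f0` is a failed lookup. The sketch below only illustrates how `typeset -F` reports a hit versus a miss; the name f1 is chosen for illustration and nothing here is re-measured.

```shell
#!/usr/bin/env bash
# Illustration only: typeset -F prints the name and returns 0 when the
# function exists, and prints nothing and returns non-zero when it does not.
f1() { :; }

typeset -F f1
echo "f1 status=$?"

typeset -F f0
echo "f0 status=$?"
```

If the intent is to time successful lookups, looking up a name that the generated file does define (f1 for any n greater than 1) would be the comparable case.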
--
Regards,
Peng
Peng Yu
2018-12-04 04:58:51 UTC
Filling an associative array has about the same runtime
characteristics. I assume that the function table and associative
arrays use the same underlying hash implementation. Is that so?

$ ./main.sh
==> 1 <==
0.008
0.000
1.211
==> 10 <==
0.010
0.000
1.220
==> 100 <==
0.009
0.001
1.200
==> 1000 <==
0.012
0.006
1.191
==> 10000 <==
0.022
0.065
1.275
==> 100000 <==
0.106
1.996
2.092
==> 1000000 <==
1.186
336.367
15.847

$ cat main.sh
#!/usr/bin/env bash
# vim: set noexpandtab tabstop=2:

#set -v
tmpfile=$(mktemp -u)
TIMEFORMAT=%R
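for n in 1 10 100 1000 10000 100000 1000000
do
	echo "==> $n <=="
	declare -A x
	# Timing 1: generate the assignments x[x0]= .. x[x(n-1)]= (-e is a gawk option).
	time awk -v n="$n" -e 'BEGIN { for(i=0;i<n;++i) printf("x[x%d]=\n", i ) }' > "$tmpfile"
	# Timing 2: insert all the keys.
	time source "$tmpfile"
	# Timing 3: 100000 lookups of an existing key.
	time for ((i=0;i<100000;++i))
	do
		[[ -v x[x0] ]]
	done > /dev/null
	unset x
done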
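
For anyone who wants to repeat the insertion measurement on their own bash build without a temporary file, here is a minimal sketch. It is not the script above: the helper name bench_assoc is hypothetical, and it times an in-process loop rather than `source`, so absolute numbers will differ from the posted ones.

```shell
#!/usr/bin/env bash
# bench_assoc N (hypothetical helper): print the elapsed real time, in
# seconds, of inserting N keys into a fresh associative array.
bench_assoc() {
  local n=$1 TIMEFORMAT=%R
  {
    time {
      local -A x
      local i
      for ((i = 0; i < n; ++i)); do
        x[x$i]=
      done
    }
  } 2>&1   # the time keyword reports on stderr; send it to stdout
}

# Small sizes so that the sketch finishes quickly; raise them to probe scaling.
for n in 1000 2000 4000; do
  printf '%d %s\n' "$n" "$(bench_assoc "$n")"
done
```

Doubling n and comparing successive timings gives a rough idea of whether insertion cost grows linearly or faster on a given bash version.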
--
Regards,
Peng
Peng Yu
2018-12-04 05:24:48 UTC
Here are the runtime results in an easier-to-read format.

type              n   generate (s)   source (s)   100000 lookups (s)
function          1      0.008         0.001          1.113
asso_array        1      0.010         0.001          0.893
function         10      0.011         0.001          1.130
asso_array       10      0.010         0.001          0.883
function        100      0.009         0.001          1.093
asso_array      100      0.011         0.001          0.881
function       1000      0.010         0.007          1.135
asso_array     1000      0.010         0.006          0.900
function      10000      0.019         0.072          1.191
asso_array    10000      0.018         0.060          0.940
function     100000      0.107         2.289          1.122
asso_array   100000      0.103         2.533          0.897
function    1000000      1.133       219.749          0.990
asso_array  1000000      0.861       272.116          0.761
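
To put a number on how fast the `source` time grows, the last two rows of each kind can be turned into an empirical growth exponent, log(t2/t1) / log(n2/n1). The sketch below is plain arithmetic on the figures posted above; the helper name growth_exponent is hypothetical. A value near 1 would mean linear growth and a value near 2 quadratic growth.

```shell
#!/usr/bin/env bash
# growth_exponent N1 T1 N2 T2 (hypothetical helper): estimate k in
# T ~ N^k from two (size, seconds) measurements.
growth_exponent() {
  awk -v n1="$1" -v t1="$2" -v n2="$3" -v t2="$4" \
    'BEGIN { printf "%.2f\n", log(t2 / t1) / log(n2 / n1) }'
}

# source timings for n = 100000 and n = 1000000, taken from the results above
echo "function:   $(growth_exponent 100000 2.289 1000000 219.749)"   # 1.98
echo "asso_array: $(growth_exponent 100000 2.533 1000000 272.116)"   # 2.03
```

Both exponents come out close to 2, which would be consistent with the cost of each insertion growing in proportion to the number of entries already stored, although these timings alone do not show where that cost arises.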
--
Regards,
Peng