Discussion:
[Help-bash] add and expand
Val Krem
2017-06-05 22:00:12 UTC
Hi all,

I have an array and I want to list the array elements with a prefix. In this case my prefix is "Number".


#!/bin/bash

List="one two three"


My desired output should be
"Numberone, Numbertwo, Numberthree"


I tried

echo ${List[@]} and echo ${Name[@/#/Number]}
did not work.

Val
Peter West
2017-06-05 23:26:11 UTC
Post by Val Krem
Hi all,
I have an array and I want to list the array elements with a prefix. In this case my prefix is "Number".
#!/bin/bash
List="one two three"
I don’t think List is an array.

List=(one two three)
is an array.
Post by Val Krem
My desired output should be
"Numberone, Numbertwo, Numberthree"
I tried
did not work.
List=(one two three)
for n in "${List[@]}"; do echo "Number$n"; done

This works for me.
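
For comparison, here is a minimal sketch of the difference between a space-separated string and a real array, and of adding the prefix without a loop. The names as_string, as_array and prefixed are made up for this illustration; only List and the "Number" prefix come from the thread.

```bash
#!/usr/bin/env bash
# A string that happens to contain spaces is still a single value.
as_string="one two three"
# Parentheses create an indexed array with three elements.
as_array=(one two three)

# ${array[@]/#/Number} anchors an empty pattern at the start of each
# element, so every element receives the prefix.
prefixed=("${as_array[@]/#/Number}")

# declare -p shows how bash actually stores each variable.
declare -p as_string as_array prefixed
printf '%s\n' "${prefixed[@]}"
```

The declare -p output makes it easy to see whether a variable is a string or an array before going any further.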

--
Peter West
***@pbw.id.au
This is the disciple who is bearing witness to these things

Val Krem
2017-06-05 23:52:32 UTC
Thank you Peter,
What I did not mention in my previous posting was that after adding a prefix I want to export the expanded variables as one variable. I made a little progress.


#!/bin/bash

set -- one two three

test1="${@/#/Number}"
echo $test1
resulted
Numberone Numbertwo Numberthree

How do I wrap this in quotation mark and treat it as one variable?

test2="Numberone Numbertwo Numberthree"; export test2

echo ${test2}
I should get this:-

Numberone Numbertwo Numberthree
Peter West
2017-06-06 01:06:57 UTC
Post by Val Krem
Thank you Peter,
What I did not mention in my previous posting was that after adding a prefix I want to export the expanded variables as one variable. I made a little progress.
#!/bin/bash
set -- one two three
Not using arrays now.
Post by Val Krem
echo $test1
resulted
Numberone Numbertwo Numberthree
test1 is a single variable. The echo shows the contents of the variable test1. You may have problems with the use of the variable because the shell is reading the contents of test1 as three separate words, thanks to IFS, which treats space, tab and newline as word separators.

As long as you use "$test1" rather than the unquoted $test1, the contents will be treated as a single word. What specific problems are you having?
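
A small sketch of that difference, assuming the default IFS. The helper count_args is hypothetical and only reports how many arguments it received.

```bash
#!/usr/bin/env bash
test1="Numberone Numbertwo Numberthree"

# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

# Unquoted: the shell splits the value on IFS characters.
unquoted=$(count_args $test1)
# Quoted: the value is passed through as one word.
quoted=$(count_args "$test1")

echo "unquoted: $unquoted, quoted: $quoted"
```

The variable itself holds the same single string in both cases; only the expansion differs.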
Post by Val Krem
How do I wrap this in quotation mark and treat it as one variable?
test2="Numberone Numbertwo Numberthree"; export test2
echo ${test2}
I should get this:-
Numberone Numbertwo Numberthree
--
Peter West
***@pbw.id.au
This is the disciple who is bearing witness to these things

Greg Wooledge
2017-06-06 11:57:24 UTC
Post by Val Krem
What I did not mention in my previous posting was that after adding a prefix I want to export the expanded variables as one variable. I made a little progress.
If you want to join an array into a single string, you use the [*]
expansion inside double quotes.

array=(one two three)
string="${array[*]}"
# This uses the first character of IFS, or space if IFS is unset, as
# the separator between array elements when creating the string.
Post by Val Krem
set -- one two three
That's not an array, but it's *closer* than what you started with.
Are you really trying to do this with the positional parameters instead
of an array? Well, if so, you still want to use the * instead of the @,
because you're creating a single string instead of expanding the array
(or pseudo-array of positional parameters) as a list.

string="${*/#/Number}"

Note the difference between @ and *:

imadev:~$ set -- one two "two and a half"
imadev:~$ args "${@/#/Number}"
3 args: <Numberone> <Numbertwo> <Numbertwo and a half>
imadev:~$ args "${*/#/Number}"
1 args: <Numberone Numbertwo Numbertwo and a half>
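
The args command in that transcript is a personal helper, not a standard tool. A rough stand-in, written here as a shell function whose behaviour is assumed from the output shown, lets the comparison be reproduced:

```bash
#!/usr/bin/env bash
# Assumed re-creation of the `args` helper: print the argument count,
# then each argument wrapped in angle brackets.
args() {
    printf '%d args:' "$#"
    (($#)) && printf ' <%s>' "$@"
    echo
}

set -- one two "two and a half"

# "$@" keeps each positional parameter as its own word.
at_out=$(args "${@/#/Number}")
# "$*" joins them into one word using the first character of IFS.
star_out=$(args "${*/#/Number}")

printf '%s\n' "$at_out" "$star_out"
```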
Post by Val Krem
echo $test1
QUOTES!

echo "$test1"
Post by Val Krem
How do I wrap this in quotation mark and treat it as one variable?
It already IS. Though you still should use * and not @, because you
are joining.
Post by Val Krem
echo ${test2}
QUOTES!

USE MORE QUOTES!

echo "$test2"

Curly braces are NOT a substitute for quotes.


array=(one two three)
export JOINED="${array[*]/#/Number}"

That's all you need.


Note that when the child process receives this string, it *cannot*
un-join it back into its constituent elements.

array=(one two "two and a half")
export JOINED="${array[*]/#/Number}"
declare -p JOINED

read -ra array <<< "$JOINED"
declare -p array

You do not get back what you started with. You CANNOT get back what
you started with. The space separator can appear inside an array
element, and then you get ... that.

This is why you do not export arrays through the environment. Not unless
you can serialize them in some reversible way. Simple space-joining is
not such a way.

The easiest way to serialize an array reversibly is to dump it out with
a NUL byte after each element. This data stream may be stored in a
file, or sent through an open file descriptor (e.g. stdout). It CANNOT
be stored in a variable, because bash variables cannot store the NUL
byte. (That's why NUL can be used as a delimiter, after all.)
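
A minimal sketch of that NUL-delimited round trip, assuming a scratch file created with mktemp (the names tmpfile, element and restored are made up for this example):

```bash
#!/usr/bin/env bash
array=(one two "two and a half")

# Scratch file for the serialized data (illustrative only).
tmpfile=$(mktemp)

# Write each element followed by a NUL byte.
printf '%s\0' "${array[@]}" > "$tmpfile"

# Read the elements back. NUL is the delimiter, so spaces inside an
# element survive the trip.
restored=()
while IFS= read -r -d '' element; do
    restored+=("$element")
done < "$tmpfile"
rm -f -- "$tmpfile"

declare -p restored
```

The read loop is used instead of mapfile -d '' because the latter needs a fairly recent bash. One caveat: an empty array would serialize as a single NUL and come back as one empty element, so that case needs separate handling.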

What are you ACTUALLY trying to do?
Val Krem
2017-06-06 20:48:25 UTC
Thank you Greg and Peter,
Here is my issue,

I have two jobs running one after the other (job2 run after job1). Job1 has three jobs running within for loop.

val="Four five six"

Job1.
for a in $val
do
Within this loop three jobs are submitted
done

Job2. This job should run after the three jobs completed.
My issue to extract the three PIDs and export them as one variable


Here is my attempt
for tr in ${val};
do

job[${#tr}]= some process;
tt1="${job[${#tr}]}" #### gets the PIDs for each job
echo $tt1
done

The echo statement within the for loop produced the three PID like

1009
1010
1011

I want this three PIDs to be exported as one variable like,

test2=$1009,$1010,$1011



Any help is highly appreciated.
Greg Wooledge
2017-06-06 21:06:36 UTC
Post by Val Krem
I have two jobs running one after the other (job2 run after job1). Job1 has three jobs running within for loop.
Background processes?
Post by Val Krem
val="Four five six"
Job1.
for a in $val
STOP THIS! Stop putting "lists" in a string variable with spaces between
things. Use an array.
Post by Val Krem
do
Within this loop three jobs are submitted
done
Are you running *background jobs*? Or what? What do you do inside the
loop?
Post by Val Krem
Job2. This job should run after the three jobs completed.
Are they background jobs? Did you capture their PIDs with $! one by
one? Where did you store the PIDs? Or do you simply want to call
"wait" to wait for all the background jobs to complete?
Post by Val Krem
My issue to extract the three PIDs and export them as one variable
EXPORT?! Why?
Post by Val Krem
Here is my attempt
for tr in ${val};
I thought your loop iterator was "a", not "tr". Why did you change it?
What does "a" mean? What does "tr" mean? Why did you choose these
variables?
Post by Val Krem
do
job[${#tr}]= some process;
You are creating an array named job and indexing it by the LENGTH of
the string variable tr? Huh? What?

What is "some process"?

Your assignment has a god damned SPACE after the = sign so it doesn't
even work.
Post by Val Krem
tt1="${job[${#tr}]}" #### gets the PIDs for each job
echo $tt1
done
Can't even bring myself to comment any more. So tired.
Post by Val Krem
The echo statement within the for loop produced the three PID like
1009
1010
1011
"... BY SOME MIRACLE I have an array with three elements."

Let's just assume you got these values somehow. They are in an array
named "job". I can't for the life of me figure out HOW you got them,
because your code isn't even code, and even your *fake* code is a
damned disaster, but let's say you have an array.

job=(1009 1010 1011)
Post by Val Krem
I want this three PIDs to be exported as one variable like,
test2=$1009,$1010,$1011
I told you how to do this already. You use the [*] expansion in
double quotes to join an array into a string variable, using the
first character of IFS.

IFS=,
export test2="${job[*]}"
unset IFS

Are those dollar signs supposed to be part of the string? Then you
could do a fancypants expansion.

IFS=,
export test2="${job[*]/#/\$}"
unset IFS
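
One thing to watch with the pattern above: unset IFS at the end throws away any custom IFS the script had set earlier. A possible alternative, sketched here with a hypothetical helper named join_by, keeps the IFS change local to a function:

```bash
#!/usr/bin/env bash
# Hypothetical helper: join the remaining arguments using the first
# argument (a single character) as the separator.
join_by() {
    local IFS=$1    # only visible inside this function
    shift
    printf '%s\n' "$*"
}

job=(1009 1010 1011)
test2=$(join_by , "${job[@]}")
echo "$test2"
```

Because IFS is declared local, the caller's word-splitting behaviour is untouched after the call.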


WHY are you exporting this crazy string into the environment with a
useless name like "test2" and inscrutable contents? What are you doing
with it?

If you just want to launch 3 background processes and wait for them all
to finish, you don't need ANY of this crap.

for i in 1 2 3; do
./my background job "$i" &
done
wait

That's it!
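
If the exit status of each background process matters, a variation on the loop above can record every PID from $! and wait on them one at a time. This is only a sketch: true stands in for the real background command, and the names pids and failed are made up.

```bash
#!/usr/bin/env bash
pids=()
for i in 1 2 3; do
    # `true` is a placeholder for the real background command.
    true &
    pids+=("$!")    # $! is the PID of the most recent background process
done

failed=0
for pid in "${pids[@]}"; do
    # `wait PID` returns the exit status of that particular process.
    wait "$pid" || failed=1
done

echo "started ${#pids[@]} processes, failed=$failed"
```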
Val Krem
2017-06-08 02:02:20 UTC
Hi Greg and all,

Let me explain and give a practical example again.
I am trying to develop a pipeline procedure in SLURM
(https://hpc.nih.gov/docs/job_dependencies.html).


I have two jobs, job1.sh and job2.sh. job2.sh depends on job1.sh, so I give the job dependency syntax when I submit job2.sh.


In SLURM, when I submit a job the syntax is as follows


JJ1=$(sbatch job1.sh)

JJ1 will give the job id or PID (numeric value) for the job1.sh.
echo ${JJ1} will produce number like this 11254323

when I submit job2.sh

JJ2=$(sbatch --dependency=afterok:${JJ1} job2.sh)


In this case Job2.sh runs only if job1.sh finishes.
For single job submission it is ok.

My issue is when I am submitting in an array.

Three jobs run separately, one for each city, and the final job combines
all the results of the three jobs and does some analysis.


#!/bin/bash
# sbatch is a command in SLURM


city="NY LA DC"

for tr in ${city};
do
dp[${#tr}]=$(sbatch city${tr}.sh);
done

When I put echo statement within the do loop

echo ${dp[${#tr}]} I got
14547
14548
14549


The next job depends on these three job ids.
If I knew these numbers at the time of job submission, then this
comb=$(sbatch --dependency=afterok:14547,14548,14549 combine.sh)

would work fine.

Since I don't know these numbers at the time of submission, I have to pass them as a variable.

Let us assume the variable newvar contains the above numbers, 14547,14548,14549. Then

comb=$(sbatch --dependency=afterok:${newvar} combine.sh)
will do the job.


The complete job submission for the two jobs


#!/bin/bash
# sbatch is a command in SLURM

city="NY LA DC"

# Job1

for tr in ${city};
do
dp[${#tr}]=$(sbatch city${tr}.sh);
done
# Job2

comb=$(sbatch --dependency=afterok:${newvar} combine.sh)


So my question is how to get those three job ids out of the for loop.
Here is my revised attempt



#!/bin/bash
city="NY LA DC"
count=0

# Job1

for tr in ${city};
do
dp[${#tr}]=$(sbatch city${tr}.sh);
array[$count]=${dp[${#tr}]}
((count++))
done

H1="${array[@]}"
newvar=$(echo "$H1" | sed 's/[[:blank:]]/,/g'); export newvar
echo "$newvar"


# Job2

comb=$(sbatch --dependency=afterok:${newvar} combine.sh)

Is there a better and clean way of doing this?
Dennis Williamson
2017-06-08 02:30:26 UTC
What is the purpose of the dp array? You store the output of sbatch, which
I presume is your job numbers, in the *SAME* position in the dp array on
*EACH* iteration of the loop (${#tr} is the length of the city name, which
is 2 every time)! You're thus wasting the array!

You could, if you do it properly, save the job numbers in an array and
convert that to a comma separated string. Or you could just build a string
as you go.

# var is assumed to be an array; foo and bar are placeholders
string=""
separator=""
for i in "${var[@]}"
do
string+=$separator$(foo "$i")
separator=","
done

bar "$string"

I really don't think you need to use export since you're passing variables'
values as positional arguments.
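
Put together for the city example, collecting the IDs in an array and joining them afterwards might look like the sketch below. Everything named here is an assumption made for illustration: submit_city is a stand-in that prints made-up IDs so the script runs without a scheduler, and on a real cluster it would presumably wrap something like sbatch --parsable, which as far as I know prints just the job ID. Also note that the SLURM documentation describes several job IDs after one dependency type as colon-separated (afterok:1:2:3), so the separator may need to be ":" rather than ","; check your site's documentation.

```bash
#!/usr/bin/env bash
# Stand-in for the real submission command; prints a made-up numeric ID.
next_id=14547
submit_city() {
    echo "$next_id"
}

cities=(NY LA DC)
job_ids=()
for city in "${cities[@]}"; do
    job_ids+=("$(submit_city "$city")")
    ((next_id++))    # advance the fake ID for the next submission
done

# Join inside a subshell so the IFS change does not leak into the
# rest of the script.
dependency=$(IFS=,; echo "${job_ids[*]}")
echo "afterok:$dependency"
```

No export is needed: the joined string is simply passed on the command line of the final submission.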

By the way, if you come here and start talking about jobs without
qualifying what you mean by that, we all think you're talking about what
Bash considers to be jobs.
Greg Wooledge
2017-06-08 12:44:41 UTC
Post by Val Krem
I am trying to develop a pipeline procedure in SLURM
(https://hpc.nih.gov/docs/job_dependencies.html).
I have two jobs, job1.sh and job2.sh. job2.sh depends on job1.sh, so I give the job dependency syntax when I submit job2.sh.
In SLURM, when I submit a job the syntax is as follows
JJ1=$(sbatch job1.sh)
This is not a background process. This is a FOREGROUND process. The
sbatch command runs in the foreground, and the command substitution and
variable assignment (and therefore the entire bash script) are "paused",
waiting until the sbatch command's standard output is closed.

Once sbatch closes stdout (usually by terminating), the entire contents
that were written to stdout are grabbed, trailing newlines are trimmed,
NUL bytes are removed (with or without a warning, depending on the version
of bash), and then the result is stored in the JJ1 variable.
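
A tiny sketch of the trimming step, using printf as a stand-in for a command that writes an ID followed by several newlines (the variable name captured is made up):

```bash
#!/usr/bin/env bash
# printf writes the digits followed by three newlines.
captured=$(printf '11254323\n\n\n')

# Command substitution strips the trailing newlines; the digits remain.
echo "length: ${#captured}"
```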

Calling this a "job" is incredibly misleading. The use of the English
word "job" in a shell programming context implies something running
in the background.
Post by Val Krem
JJ1 will give the job id or PID (numeric value) for the job1.sh.
echo ${JJ1} will produce number like this 11254323
Wrong wrong wrong wrong wrong.

...

Unless you're claiming that your "sbatch" command is self-backgrounding.
If "sbatch" is an evil self-backgrounding abomination, which forks and
abandons a child process, and then writes the child's process ID to
stdout, then... what you say may be literally true, but I would run
away, screaming. This is a horror show. It's the worst mistakes of
the 1980s, risen anew.
Post by Val Krem
when I submit job2.sh
JJ2=$(sbatch --dependency=afterok:${JJ1} job2.sh)
In this case Job2.sh runs only if job1.sh finishes.
For single job submission it is ok.
Oh dear god, no.

I'm out. Sorry. I can't even ... no. Just no.

You would be better off throwing this entire infrastructure away, and
building your own from scratch.

Did you know that you can create dependencies among systemd units?
Maybe that would be a better framework for doing this thing of yours.
Set up, say, one systemd service that provides the things your second
service needs. Let's call this first service "dirt", because that is a
BETTER and MORE USEFUL name than "job1.sh". Let's call the other service
"grass", because grass depends on dirt. This is a BETTER and MORE USEFUL
name than "job2.sh".

So, then you ask yourself: "What does dirt do?" Well, we don't
know, because you haven't SAID a single useful thing ANYWHERE. All
we know is that you need it to run. To be running? To have executed
at least one time since boot? Let's assume it has to be continuously
running, like a database engine or something.

To do that, you would figure out whatever command you have to run to
make the dirt service "go". To be up and running, listening on its
unix domain socket, or whatever it does, so that the grass service can
use it.

IF AT ALL POSSIBLE, the command you write to invoke the dirt service
should be a FOREGROUND process. It should not self-background. It should
not fork and abandon a child, like some 1980s BSD abomination. Then you
set up a systemd service that will run this foreground process, as the
correct user, with the correct environment variables, in the correct
directory, etc.

I won't try to show this here.

Then you set up the grass service, and tell systemd that it depends
on dirt. If you also run grass as a foreground process, then it may
be something as simple as this:

==== example /etc/systemd/system/grass.service
[Unit]
Description=Grass service, grows in the dirt
Requires=dirt.service
After=dirt.service

[Service]
ExecStart=/usr/local/bin/grass ...
KillMode=process
Restart=on-failure

[Install]
WantedBy=multi-user.target
====

I'm not saying systemd is great, and I'm not yet ready to attempt to
write real documentation for systemd (although god knows it NEEDS it,
BADLY). But it sounds like it could be a match for your poorly defined,
obfuscated task.

This is quite a bit outside the scope of "help-bash". Setting up system
services with dependencies on each other is a system administration task,
not a bash programming task. Bash isn't even INVOLVED.

If your target system doesn't use systemd, then... well. Go to your
operating system's mailing list, explain what you are actually trying to
do, with *real names*, and actually explain what each of your processes
*does*, and the actual nature of their relationship, and maybe someone
can come up with a solution.
Dennis Williamson
2017-06-08 13:16:43 UTC
My curiosity got the better of me so I took a quick glance at the
documentation of the system the OP is using. It seems to be an hpc cluster.
It's likely that the sbatch program submits what they refer to as "jobs" to
run on a different system altogether and returns immediately. The output
might be a pid on that system or perhaps some sort of task ID. So there's
no issue with backgrounding on the local system.

Words have contextual meaning, but it's the responsibility of a good
communicator to set up the context. The OP failed in that regard.
Greg Wooledge
2017-06-08 13:33:34 UTC
Post by Dennis Williamson
Post by Val Krem
I am trying to develop a pipeline procedure in SLURM
(https://hpc.nih.gov/docs/job_dependencies.html).
JJ1 will give the job id or PID (numeric value) for the job1.sh.
echo ${JJ1} will produce number like this 11254323
Definitely not a PID. Unless he completely pulled the number out of
his ass, which wouldn't surprise me at this point. Presenting ACTUAL
information is clearly not part of Val's skillset.
Post by Dennis Williamson
My curiosity got the better of me so I took a quick glance at the
documentation of the system the OP is using. It seems to be an hpc cluster.
It's likely that the sbatch program submits what they refer to as "jobs" to
run on a different system altogether and returns immediately. The output
might be a pid on that system or perhaps some sort of task ID. So there's
no issue with backgrounding on the local system.
Words have contextual meaning, but it's the responsibility of a good
communicator to set up the context. The OP failed in that regard.
Yeah, it could be a task ID if this sbatch thing is some sort of
"enterprise" "middleware" layer. In which case it's got practically
nothing to do with bash.

Val, try explaining what you are doing. That is, what you want the final
result to be -- NOT the sequence of steps you are trying to perform that
you think will get you there. Do not use any bash commands or syntax.
Just use English words.

Try starting with "I have a ___".

If there are specific external constraints that you MUST follow for
some reason, be sure to include those. "My professor says we can't use
sed or awk" would be an example of a constraint. "My boss says we
have to use SLURM" would be another.

Finally, if your task is highly dependent upon the details of this SLURM
product, you really should be asking on a SLURM mailing list, instead
of a bash one. We don't know anything about it.
