---+ Condor quick start

This page is a guide for users starting to use Condor on our T3 infrastructure. It covers only the essentials; adaptations can be made from there and added to this page.

The submission host is our usual t3-higgs login node. Everything is already set up, so you can submit from your home area. The nodes have passed the usual checklist.

---++ Preparing the job for submission

We recommend creating a dedicated directory for this. In my case, the job is a bash script that prints "alive" and sleeps for X seconds. Here is what my directory looks like:

<verbatim>
-bash-3.2$ ll
-rw-r--r-- 1 samir users     0 May 28 12:59 sleep.err
-rw-r--r-- 1 samir users 31355 May 28 14:53 sleep.log
-rw-r--r-- 1 samir users     5 May 28 14:42 sleep.out.1
-rwxr-xr-x 1 samir users    35 May 28 14:42 sleep.sh
-rw-r--r-- 1 samir users   116 May 28 14:43 submit.sub
</verbatim>

Now we will look at our submit.sub (you can call it anything). This is the file that tells Condor what to do:

<verbatim>
Executable = sleep.sh
Universe = vanilla
Output = sleep.out.$(Process)
Log = sleep.log
Error = sleep.err
Queue
</verbatim>

It couldn't be simpler. Do not change the Universe. The rest is self-explanatory: Executable is the script to run, Output and Error receive the job's stdout and stderr, and Log is where Condor records the job's events. The Queue command tells Condor how many copies of this same job to submit; the default is 1. I could do:

<verbatim>
Queue 4
</verbatim>

And have 4 identical jobs running. That is when the $(Process) variable makes a difference: it expands to the job's number within the cluster, counting from 0, so each job writes its own output file in the same directory:

<verbatim>
-rw-r--r-- 1 samir users 5 May 28 14:42 sleep.out.0
-rw-r--r-- 1 samir users 5 May 28 14:42 sleep.out.1
-rw-r--r-- 1 samir users 5 May 28 14:42 sleep.out.2
-rw-r--r-- 1 samir users 5 May 28 14:42 sleep.out.3
</verbatim>

---++ Submitting the job(s)

Once you are familiar with how to configure your job, it is time to submit it:

<verbatim>
-bash-3.2$ condor_submit submit.sub
Submitting job(s)....
4 job(s) submitted to cluster 32.
</verbatim>

Then you can monitor it with:

<verbatim>
-bash-3.2$ condor_q

-- Submitter: t3-higgs.ultralight.org : <10.4.255.253:43446> : t3-higgs.ultralight.org
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
  15.0   amott           7/2  11:18   0+00:00:00 H  0   17.1 ZeeSelectorApp
  32.0   samir           5/28 15:13   0+00:00:25 R  0   0.0  sleep.sh
  32.1   samir           5/28 15:13   0+00:00:25 R  0   0.0  sleep.sh
  32.2   samir           5/28 15:13   0+00:00:25 R  0   0.0  sleep.sh
  32.3   samir           5/28 15:13   0+00:00:25 R  0   0.0  sleep.sh

5 jobs; 0 completed, 0 removed, 0 idle, 4 running, 1 held, 0 suspended
</verbatim>

If you go do something else and the jobs are gone from the queue when you come back, you can confirm that they actually finished by finding them in the history:

<verbatim>
-bash-3.2$ condor_history
 ID      OWNER            SUBMITTED     RUN_TIME ST   COMPLETED CMD
  31.3   samir           5/28 14:43   0+00:05:05 C   5/28 14:53 /home/samir/condor-test/sleep.sh
  31.2   samir           5/28 14:43   0+00:05:04 C   5/28 14:53 /home/samir/condor-test/sleep.sh
  31.1   samir           5/28 14:43   0+00:05:03 C   5/28 14:48 /home/samir/condor-test/sleep.sh
  31.0   samir           5/28 14:43   0+00:05:03 C   5/28 14:48 /home/samir/condor-test/sleep.sh
  30.3   samir           5/28 14:42   0+00:00:09 C   5/28 14:42 /home/samir/condor-test/sleep.sh
  30.2   samir           5/28 14:42   0+00:00:09 C   5/28 14:42 /home/samir/condor-test/sleep.sh
</verbatim>

With this, you should be able to do the basics. Feel free to edit this page and add more content and tips from your experience.

-- Main.samir - 2014-05-28
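
The sleep.sh test job described under "Preparing the job for submission" is not shown on this page. A minimal sketch of what such a script could look like is given here. The function name =alive_and_sleep=, the optional first argument for the sleep time, and the 1-second default are all hypothetical, chosen for illustration, and are not taken from the original script:

```shell
#!/bin/bash
# Hypothetical stand-in for sleep.sh; the original script is not reproduced on this page.

# Print "alive", then sleep for the number of seconds given as the first argument.
# The default of 1 second is an assumption made for this sketch.
alive_and_sleep() {
    echo "alive"
    sleep "${1:-1}"
}

# Forward any arguments the job was started with to the function.
alive_and_sleep "$@"
```

With a script like this, the sleep time could be set from the submit file by adding an Arguments line (for example, Arguments = 30) before Queue. Remember to make the script executable with chmod +x, as in the directory listing above.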