---+ CE Troubleshoot

This page lists potential pitfalls and ways to debug an OSG GRAM Compute Element 3.2.

---++ Find job submission rate, submission errors

This is a bit tricky because the logging, although verbose, says little about what you most want to see: *job submissions*. Most of the operations logged on a production CE are status queries ("how is job X that I submitted?"), output requests ("give me the output of job Y") and the like. However, I found that when a job submission actually happens, this is what shows up in the log:

<verbatim>
TIME: Wed May 14 18:11:03 2014
PID: 29145 -- Notice: 0: Child 29147 started
JMA 2014/05/14 18:11:08 GATEKEEPER_JM_ID 2014-05-15.01:11:02.0000029146.0000000000 for /DC=ch/DC=cern/OU=computers/CN=cmspilot04/vocms0167.cern.ch on ::ffff:129.79.53.27
JMA 2014/05/14 18:11:08 GATEKEEPER_JM_ID 2014-05-15.01:11:02.0000029146.0000000000 mapped to uscms4257 (20707, 504)
JMA 2014/05/14 18:11:08 GATEKEEPER_JM_ID 2014-05-15.01:11:02.0000029146.0000000000 has GRAM_SCRIPT_JOB_ID 075.000.000 manager type condor
JMA 2014/05/14 18:11:08 GATEKEEPER_JM_ID 2014-05-15.01:11:02.0000029145.0000000000 for /DC=ch/DC=cern/OU=computers/CN=cmspilot05/vocms0167.cern.ch on ::ffff:129.79.53.27
JMA 2014/05/14 18:11:08 GATEKEEPER_JM_ID 2014-05-15.01:11:02.0000029145.0000000000 mapped to uscms4251 (20701, 504)
JMA 2014/05/14 18:11:08 GATEKEEPER_JM_ID 2014-05-15.01:11:02.0000029145.0000000000 has GRAM_SCRIPT_JOB_ID 076.000.000 manager type condor
TIME: Wed May 14 18:14:46 2014
</verbatim>

Each submission produces exactly one "has GRAM_SCRIPT_JOB_ID" line, so a grep like this gives a much better picture of the job submission rate:

<verbatim>
[root@cithep231 ~]# grep 'has GRAM_SCRIPT_JOB_ID' /var/log/globus-gatekeeper.log | grep condor

# or even rate per hour :
[root@cithep231 ~]# grep 'has GRAM_SCRIPT_JOB_ID' /var/log/globus-gatekeeper.log | grep condor | awk -F':' '{print $1}' | sort | uniq -c
      1 JMA 2014/05/14 04
      1 JMA 2014/05/14 05
      7 JMA 2014/05/14 06
      8 JMA 2014/05/14 07
      7 JMA 2014/05/14 08
     30 JMA 2014/05/14 09
     20 JMA 2014/05/14 10
     10 JMA 2014/05/14 11
      6 JMA 2014/05/14 12
      9 JMA 2014/05/14 13
      6 JMA 2014/05/14 14
     24 JMA 2014/05/14 15
     12 JMA 2014/05/14 16
      7 JMA 2014/05/14 17
      4 JMA 2014/05/14 18
</verbatim>

The awk step cuts each line at the first colon, which leaves the date and the hour ("JMA 2014/05/14 18"); =uniq -c= then counts the submissions in each hour.

For general errors, there are always the *user-specific* GRAM logs:

/var/log/globus/gram_$(LOGNAME).log

The log levels can be configured in:

/etc/globus/globus-gram-job-manager.conf

At the default level these logs are noisy, so even a healthy CE will look full of errors there. Watch out!

-- Main.samir - 2014-05-15
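
To see who is submitting, and not only how often, the "mapped to" and "has GRAM_SCRIPT_JOB_ID" lines can be joined on their GATEKEEPER_JM_ID. The following is a minimal sketch, not a tested production tool: it assumes the JMA line layout shown in the sample log above, and the function name =submissions_per_user= is made up for illustration (it is not part of Globus or OSG).

```shell
#!/bin/bash
# Sketch: count job submissions per mapped local user in a gatekeeper log.
# Assumed line layout (taken from the sample log on this page):
#   JMA <date> <time> GATEKEEPER_JM_ID <id> mapped to <user> (<uid>, <gid>)
#   JMA <date> <time> GATEKEEPER_JM_ID <id> has GRAM_SCRIPT_JOB_ID <n> manager type <lrms>
# submissions_per_user is a hypothetical helper name.
submissions_per_user() {
    # $1: path to the gatekeeper log; defaults to the path used on this page.
    local log="${1:-/var/log/globus-gatekeeper.log}"
    awk '
        # Remember which local account each GATEKEEPER_JM_ID was mapped to.
        $1 == "JMA" && $6 == "mapped" && $7 == "to" { user[$5] = $8 }

        # A "has GRAM_SCRIPT_JOB_ID" line marks an actual job submission.
        $1 == "JMA" && $6 == "has" && $7 == "GRAM_SCRIPT_JOB_ID" {
            who = (($5 in user) ? user[$5] : "unknown")
            count[who]++
        }

        # Print "<count> <user>", one line per local account.
        END { for (u in count) print count[u], u }
    ' "$log" | sort -k1,1nr -k2,2
}
```

Calling =submissions_per_user /var/log/globus-gatekeeper.log= prints one "count user" line per local account, busiest first. Submissions whose "mapped to" line is missing from the log (for example after a log rotation) are counted under "unknown".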
Topic revision: r1 - 2014-05-15 - samir