Difference: ComputingT2Transfers (6 vs. 7)

Revision 7 (2014-08-19) - samir

Line: 1 to 1
 

USCMS T2 Transfers

This TWiki page was created to report the latest status of this initiative.

Deleted:
<
<
We will have different sections for each site's notes
 

General status

DashBoard FTS plots

Changed:
<
<

Site notes

Caltech

>
>
Month-long FTS plots
 
Changed:
<
<
On Monday at 18:00 PST we removed the 40 Gbps host for performance tuning. The overhead of recompiling the CentOS 3.x-series kernel means it should come back by ~Tuesday afternoon. For now, sites should expect 20 Gbps from Caltech (10+10 G setup). Things don't look too loaded though: 8 Gbps on average, 12 Gbps peak.
>
>

Site notes

 

Nebraska

Changed:
<
<
Had good rates on Friday, when we started ramping up. Problems with their PhEDEx node interrupted transfers at some points during the day. More stability since then, but no major improvement in rates.

GridFTP issues

Not sure whether this is still up to date:

Error reason: TRANSFER globus_ftp_client: the server responded with an error 500 Command failed. : Unable to extend our file-backed buffers; aborting transfer.

I don't fully understand the cause, but it might be fixable with different (higher) GridFTP buffer settings.

>
>
Had good rates in general, but didn't manage to exceed 1000 MBps from Caltech.
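The numbers above can be sanity-checked with a couple of one-liners. This is a rough sketch: the 1000 MBps figure comes from the observation above, while the 60 ms RTT used for the bandwidth-delay product is an assumed cross-country value, not a measurement from these sites.

```python
def mbps_to_gbps(megabytes_per_s: float) -> float:
    """Convert a transfer rate in MB/s (10^6 bytes) to Gbit/s."""
    return megabytes_per_s * 8 / 1000

def bdp_bytes(link_gbps: float, rtt_ms: float) -> int:
    """Bandwidth-delay product: bytes in flight needed to keep the link full."""
    return int(link_gbps * 1e9 * (rtt_ms / 1e3) / 8)

# The 1000 MBps ceiling seen from Caltech is 8 Gbit/s, i.e. most of one 10 G link:
print(mbps_to_gbps(1000))   # 8.0

# A 10 Gbps path at an assumed 60 ms RTT needs ~75 MB of buffering per stream,
# which is one reason larger GridFTP buffer settings could matter here:
print(bdp_bytes(10, 60))    # 75000000
```

The conversion suggests the observed ceiling is close to single-link saturation rather than a software limit, though that remains to be confirmed.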
 

Purdue

Changed:
<
<
Found on Monday that the problem is related to LHCONE peering: their traffic from Caltech is going through CENIC, not LHCONE, which is sub-optimal. Purdue will contact their network support to improve this.

GridFTP issues:

Error reason: TRANSFER globus_ftp_client: the server responded with an error 500 Command failed. : Allocated all 1500 file-backed buffers on server cms-g004.rcac.purdue.edu; aborting transfer.

It probably ran out of memory buffers and started using disk buffers, which in principle shouldn't happen (Brian will know more). A quick workaround would be to raise the file-buffer limit in the configuration and restart the service, but ideally we should find the root cause of why it needs so many file-backed buffers.

Florida

>
>
Found that there was a problem related to LHCONE peering. Their traffic from Caltech was going through CENIC, not LHCONE, which is sub-optimal. Purdue is following up with local network support.
 
Changed:
<
<
Joined last Friday; transfers ramped up on Monday, but we are not seeing much activity from PhEDEx.
>
>
Even with this problem, we still observed decent rates, above 500 MBps at some distinct moments.
 

Vanderbilt

Changed:
<
<
Joined on the first day; LStore performance varies unpredictably from great to poor. We have seen 600 MBps in the past, but not much right now. Found the contacts and included them in the thread.
>
>
Joined on the first day; LStore performance varies unpredictably from great to poor. We have seen 600 MBps in the past.
  -- Main.samir - 2014-07-29
 