USCMS T2 Transfers

This twiki was created to report the latest status on this initiative.

Each site has its own section of notes below.

General status

DashBoard FTS plots

Site notes

Caltech

Had a power outage affecting 30 nodes at 2 PM PST, which degraded transfers because HDFS was affected. All nodes recovered within 10 minutes, but the optimizer got traumatized and took a while to ramp up again.

Nebraska

Had good rates this Friday, when we started ramping up. Problems with their PhEDEx node interrupted transfers at some points during the day. When I last checked, transfers were bursty because there were not enough pending.

GFTP issues

Error reason: TRANSFER globus_ftp_client: the server responded with an error 500 Command failed. : Unable to extend our file-backed buffers; aborting transfer.

I don't really understand the reason for this, but it might be fixable with a different (higher) GridFTP buffer configuration.

Purdue

It seems that we have network issues in Kansas, even though we get good rates when the optimizer lets us. Manoj thoroughly analyzed the network paths for the sites that were good and bad, cross-checking with perfSONAR. The best clue so far is a problem with a route in Los Angeles, which looks good to Nebraska but bad to Purdue.

Will follow up with iperf testing to check for packet loss, then contact network support.
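As a rough sketch of how the packet-loss check could be scripted: the snippet below assumes iperf3 is the tool used (the note above only says "iperf"), that an iperf3 server is already listening on the far end, and that the UDP JSON report carries the loss figure under `end.sum.lost_percent`. The duration and bandwidth defaults are placeholders, not recommended values.

```python
import json
import subprocess


def build_iperf3_command(server, seconds=30, bandwidth="1G"):
    """Build a UDP iperf3 client command that prints a JSON report.

    The defaults for seconds and bandwidth are illustrative assumptions.
    """
    return ["iperf3", "-c", server, "-u", "-b", bandwidth,
            "-t", str(seconds), "-J"]


def udp_loss_percent(report):
    """Extract the packet-loss percentage from a parsed iperf3 UDP report.

    Assumes the summary lives under end.sum.lost_percent.
    """
    return float(report["end"]["sum"]["lost_percent"])


def measure_loss(server):
    """Run the test against `server` and return the loss percentage."""
    result = subprocess.run(build_iperf3_command(server),
                            capture_output=True, text=True, check=True)
    return udp_loss_percent(json.loads(result.stdout))
```

Running `measure_loss` from the same source host against a good and a bad destination would make the two paths directly comparable.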

GFTP issues

Error reason: TRANSFER globus_ftp_client: the server responded with an error 500 Command failed. : Allocated all 1500 file-backed buffers on server cms-g004.rcac.purdue.edu; aborting transfer.

It probably ran out of memory buffers and started using disk buffers, which in principle shouldn't happen (Brian will know more). A quick workaround would be to raise the number of file buffers in the configuration and restart the service, but ideally we should find the root cause of why it needs so many file buffers.
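To help narrow down the root cause, it may be useful to count how often each server hits the buffer limit. The following is a minimal sketch that classifies FTS error reasons using only the two messages quoted on this page. The function and pattern names are made up for illustration, and how the error reasons are collected (dashboard export, logs) is left open.

```python
import re
from collections import Counter

# Patterns taken from the two error reasons quoted on this page.
BUFFER_PATTERNS = {
    "extend_failed": re.compile(r"Unable to extend our file-backed buffers"),
    "all_allocated": re.compile(
        r"Allocated all (?P<count>\d+) file-backed buffers "
        r"on server (?P<server>[^;\s]+)"
    ),
}


def classify(reason):
    """Return (kind, server, count) for a buffer error, else None.

    server and count are None when the message does not include them.
    """
    for kind, pattern in BUFFER_PATTERNS.items():
        match = pattern.search(reason)
        if match:
            groups = match.groupdict()
            count = groups.get("count")
            return kind, groups.get("server"), int(count) if count else None
    return None


def tally(reasons):
    """Count buffer errors per (kind, server); other errors are ignored."""
    counts = Counter()
    for reason in reasons:
        hit = classify(reason)
        if hit:
            counts[(hit[0], hit[1])] += 1
    return counts
```

If one server dominates the tally, that points to a host-level problem; if the errors are spread evenly, the configured buffer limits are the more likely suspect.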

Florida

Just joined. Thanks, will ramp up transfers.

Vanderbilt

Joined on the first day. LStore performance varies unpredictably, anywhere from great to poor. We have seen 600 MBps in the past, but rates are low right now. Still figuring out whom to contact there for follow-up while they reorganize the team.

-- Main.samir - 2014-07-29

Topic revision: r5 - 2014-08-02 - samir
 