Difference: ComputingT2Transfers (1 vs. 9)

Revision 9 - 2014-09-12 - samir

Line: 1 to 1
 

USCMS T2 Transfers

Line: 14 to 14
  Spreadsheet
Added:
>
>
As of September 12th, the only site that achieved interesting rates was Purdue. Most SE-to-SE links won't go past ~70 concurrent transfers and 500-1000 MBps, which is a bit disappointing. Suggesting to the group that we try a general FTS-optimizer bypass, provided everyone is comfortable with throttling transfers in their download agents. This is important because otherwise the number of active transfers can grow out of control, crashing either the GridFTP servers (site SE) or the PhEDEx host managing them.
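If we do bypass the optimizer, throttling has to live entirely in each site's download agent. Below is a rough sketch only (it reuses the flags from the Config.Debug example further down this page; the numbers are illustrative, not a recommendation): -max-active-files caps the total, -link-active-files caps a given source, and -default-link-active-files covers sources not listed.

### AGENT LABEL=download-debug-t2fts PROGRAM=Toolkit/Transfer/FileDownload DEFAULT=on
 -db              ${PHEDEX_DBPARAM}
 -nodes           ${PHEDEX_NODE}
 -accept          '%T2_US%'
 -backend         FTS
 -batch-files     20
 -max-active-files 300
 -link-active-files   'T2_US_Caltech=70'
 -default-link-active-files 50
 -protocols       srmv2
 -mapfile         ${PHEDEX_FTS_MAP}

The per-link value of 70 is simply where most SE-to-SE links seem to flatten out right now; each site would tune it against what its GridFTPs and storage can actually sustain.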
 

Site notes

Added:
>
>

Caltech

Started having consistent stage-out problems by Sep 11th. Throttled down the transfers to other sites so we can investigate the issue with less load on the GridFTPs.

 

Nebraska

Had good rates in general, but didn't see more than 1000 MBps from Caltech.

Purdue

Changed:
<
<
Found that there was a problem related to LHCONE peering. Their traffic from Caltech was going through CENIC, not LHCONE, which is sub-optimal. Purdue is following up with local network support.
>
>
By Sep 11th the site started achieving very interesting download rates: up to 28 Gbps averaged over 10 minutes.

Plots here

Older notes

Found that there was a problem related to LHCONE peering. Their traffic from Caltech was going through CENIC, not LHCONE, which is sub-optimal. Purdue is following up with local network support.

  Even though there was this problem, we still observed decent rates > 500 MBps at some distinct moments.
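A quick way to confirm which path the traffic actually takes (a sketch; the target hostname is a placeholder and outbound traceroute probes must be allowed) is to traceroute from a Purdue transfer node towards a Caltech SE host and check the reverse DNS / AS of the intermediate hops for CENIC versus LHCONE/Internet2 names:

 # from a Purdue transfer node; replace the target with a real Caltech GridFTP/SE host
 traceroute -w 2 gridftp.example.caltech.edu

 # if the installed Linux traceroute supports AS path lookups, -A prints the AS per hop,
 # which makes CENIC vs. LHCONE/Internet2 segments easy to spot
 traceroute -A -w 2 gridftp.example.caltech.edu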

Revision 8 - 2014-09-09 - samir

Line: 1 to 1
 

USCMS T2 Transfers

Line: 10 to 10
  Month-long FTS plots
Added:
>
>
Created a spreadsheet so sites can post their GridFTP configurations and get comments from the group :

Spreadsheet

 

Site notes

Nebraska

Changed:
<
<
Had good rates in general, but didn't manage to pass 1000 MBps from Caltech.
>
>
Had good rates in general, but didn't see more than 1000 MBps from Caltech.
 

Purdue

Line: 26 to 30
  Joined on the first day; LStore performance is randomly anywhere from great to poor. Have seen 600 MBps in the past.
Added:
>
>

UCSD

Observed ~450 MBps on distinct days; see the monthly plot

 -- Main.samir - 2014-07-29

Revision 7 - 2014-08-19 - samir

Line: 1 to 1
 

USCMS T2 Transfers

This twiki was created to report the latest status on this initiative.

Deleted:
<
<
We will have different sections for each site's notes
 

General status

DashBoard FTS plots

Changed:
<
<

Site notes

Caltech

>
>
Month-long FTS plots
 
Changed:
<
<
Monday 18:00 PST we removed the 40 Gbps host for performance tuning. The overhead of recompiling a CentOS 3.x-series kernel means it should be back by ~Tuesday afternoon. For now sites should expect 20 Gbps from Caltech (10+10 G setup). Things don't look too loaded though: 8 Gbps on average, 12 Gbps peak.
>
>

Site notes

 

Nebraska

Changed:
<
<
Had good rates this Friday, when we started ramping up. Had problems with their PhEDEx node, which interrupted transfers at some points in the day. More stability since then, but no major improvement in rates.

GFTP issues

Not sure if this is still up-to-date :

Error reason: TRANSFER globus_ftp_client: the server responded with an error 500 Command failed. : Unable to extend our file-backed buffers; aborting transfer.

I don't really understand the reason for this, but it might be fixable with different (higher) GridFTP buffer configurations?

>
>
Had good rates in general, but didn't manage to pass 1000 MBps from Caltech.
 

Purdue

Changed:
<
<
Found on Monday that the problem is related to LHCONE peering. Their traffic from Caltech is going through CENIC, not LHCONE, which is sub-optimal. Purdue will contact their Network support to improve this.

GFTP Issues :

Error reason: TRANSFER globus_ftp_client: the server responded with an error 500 Command failed. : Allocated all 1500 file-backed buffers on server cms-g004.rcac.purdue.edu; aborting transfer.

It probably ran out of memory buffers and started using disk buffers, which in principle shouldn't happen (Brian will know more). A quick workaround would be to raise the file-buffer limit in the configuration and restart the service, but ideally we should find the root cause of why it needs so many file buffers.

Florida

>
>
Found that there was a problem related to LHCONE peering. Their traffic from Caltech was going through CENIC, not LHCONE, which is sub-optimal. Purdue is following up with local network support.
 
Changed:
<
<
Joined last Friday, had transfers ramped up on Monday but not seeing a lot of action from PhEDEx.
>
>
Even though there was this problem, we still observed decent rates > 500 MBps at some distinct moments.
 

Vanderbilt

Changed:
<
<
Joined on the first day; LStore performance is randomly anywhere from great to poor. Have seen 600 MBps in the past but not a lot right now. Found the contacts and included them in the thread.
>
>
Joined on the first day; LStore performance is randomly anywhere from great to poor. Have seen 600 MBps in the past.
  -- Main.samir - 2014-07-29

Revision 6 - 2014-08-05 - samir

Line: 1 to 1
 

USCMS T2 Transfers

Line: 14 to 14
 

Caltech

Changed:
<
<
Had a power outage on 30 nodes at 2 PM PST that degraded transfers as HDFS was affected. All nodes recovered within 10 minutes, but the optimizer got traumatized and took a while to ramp up again.
>
>
Monday 18:00 PST we removed the 40 Gbps host for performance tuning. The overhead of recompiling a CentOS 3.x-series kernel means it should be back by ~Tuesday afternoon. For now sites should expect 20 Gbps from Caltech (10+10 G setup). Things don't look too loaded though: 8 Gbps on average, 12 Gbps peak.
 

Nebraska

Changed:
<
<
Had good rates this Friday, when we started ramping up. Had problems with their PhEDEx node, which interrupted transfers at some points in the day. The last time I looked, transfers were bursty -- not enough pending.
>
>
Had good rates this Friday, when we started ramping up. Had problems with their PhEDEx node, which interrupted transfers at some points in the day. More stability since then, but no major improvement in rates.
 

GFTP issues

Added:
>
>
Not sure if this is still up-to-date :
 
Error reason: TRANSFER globus_ftp_client: the server responded with an error 500 Command failed. : Unable to extend our file-backed buffers; aborting transfer.
Line: 30 to 32
 

Purdue

Changed:
<
<
It seems that we have network issues in Kansas, even though we get good rates when the optimizer lets us. Manoj thoroughly analyzed the network paths for sites that performed well and badly, cross-checking with PerfSONAR; the best clue is that there is a problem with a route in Los Angeles, which looks good to Nebraska but bad to Purdue.

Will follow up with Iperf testing to reveal packet loss and contact Network Support.

>
>
Found on Monday that the problem is related to LHCONE peering. Their traffic from Caltech is going through CENIC, not LHCONE, which is sub-optimal. Purdue will contact their Network support to improve this.
 

GFTP Issues :

Line: 44 to 44
 

Florida

Changed:
<
<
Just joined. Thanks, will ramp up transfers.
>
>
Joined last Friday, had transfers ramped up on Monday but not seeing a lot of action from PhEDEx.
 

Vanderbilt

Changed:
<
<
Joined on the first day; LStore performance is randomly anywhere from great to poor. Have seen 600 MBps in the past but not a lot right now. Still figuring out the contacts to follow up with there as they reorganize the team.
>
>
Joined on the first day; LStore performance is randomly anywhere from great to poor. Have seen 600 MBps in the past but not a lot right now. Found the contacts and included them in the thread.
  -- Main.samir - 2014-07-29

Revision 5 - 2014-08-02 - samir

Line: 1 to 1
 

USCMS T2 Transfers

Changed:
<
<
This twiki is intended to aggregate all necessary information for the current effort of improving inter-T2 PhEDEx transfers in the context of USCMS.
>
>
This twiki was created to report the latest status on this initiative.
 
Changed:
<
<
It is known that the network links between the 8 sites have high capacity and availability. However, there seem to be some limitations, at the level of the CMS transfer tools or their configurations, that should be addressed and tested; fixing these could improve overall performance and, in the end, make these systems deliver data faster between sites.
>
>
We will have different sections for each site's notes
 
Changed:
<
<
The general picture on transfers over 20 Gbps and some of these configuration problems are mentioned in Samir's talk at the T2 meeting of 07/29.
>
>

General status

 
Changed:
<
<
So far, the showstopper has been the uplink bandwidth at most sites. Since July 2014 this has started to change.
>
>
DashBoard FTS plots
 
Changed:
<
<
Ideally, even 10 Gbps sites would participate, as it is possible that the current settings are not optimal for fast transfers. We could tune until the 10 Gbps link saturates, and everyone would have exercised how to improve transfer rates in Debug.
>
>

Site notes

 
Changed:
<
<

Plan for the exercise

>
>

Caltech

 
Changed:
<
<
As discussed in the meeting, we would like to use Caltech as the source site, as it managed 25/29 Gbps with its setup, making it a good source for sites optimizing their configurations. Once everyone else optimizes their download configurations and we observe which rates we get to which sites, we can start rotating the source site and see what maximum rates we get from each. It is important to have multiple sink sites: even if some sites have limitations, the others will add up to the total rate.
>
>
Had a power outage on 30 nodes at 2 PM PST that degraded transfers as HDFS was affected. All nodes recovered within 10 minutes, but the optimizer got traumatized and took a while to ramp up again.
 
Changed:
<
<
There are 3 major steps in this exercise; 2 of them will require coordination among sites:
>
>

Nebraska

 
Changed:
<
<
  • Tuning PhEDEx download configurations, so that LoadTest settings correspond better to reality.
  • Observing how transfers behave at the FTS level, noting whether the optimizer algorithm is a limiting factor or actually helps reach the optimal number of active transfers for the bandwidth available at a given moment.
  • With the logical limitations removed, sites can focus on setting their upload rates as they want, optimizing their GridFTP setups, and observing the best rates they can get out of the storage.
When we are done with these, the transfer test framework through PhEDEx will be more responsive and we will be able to run more advanced tests at higher rates, for example coordinating pushes of data from many sites to one.
>
>
Had good rates this Friday, when we started ramping up. Had problems with their PhEDEx node, which interrupted transfers at some points in the day. The last time I looked, transfers were bursty -- not enough pending.
 
Changed:
<
<

Participation of sites

>
>

GFTP issues

 
Changed:
<
<
In order to contact only the interested sites, please fill out the table below :

| Site | Connectivity | Participating | Notes |
| T2_BR_SPRACE | 10G | N/A | |
| T2_US_Caltech | 100G | DONE | |
| T2_US_Florida | 100G | N/A | |
| T2_US_MIT | 10G | N/A | |
| T2_US_Nebraska | 10G | N/A | Upgrading to 100G soon |
| T2_US_Purdue | 100G | N/A | |
| T2_US_UCSD | 10G | N/A | |
| T2_US_Wisconsin | 10G | N/A | |
| T2_US_Vanderbilt | 10G | N/A | |

FTS Notes

>
>
Error reason: TRANSFER globus_ftp_client: the server responded with an error 500 Command failed. : Unable to extend our file-backed buffers; aborting transfer.
 
Changed:
<
<
Currently we have 3 official FTS servers :
>
>
I don't really understand the reason for this, but it might be fixable with different (higher) GridFTP buffer configurations?
 
Changed:
<
<
  • cmsfts3.fnal.gov
  • fts3.cern.ch
  • lcgfts3.gridpp.rl.ac.uk
There is an official recommendation for the most logical distribution of which server to use; however, for this exercise people are encouraged to try other deployments and possibly observe different behaviors. For example, 208 parallel transfers were observed on CERN's FTS, but not more than 50 at FNAL (yet).
>
>

Purdue

 
Changed:
<
<
In the long run, US sites should use FNAL, but it might be worth understanding whether other FTS servers show different optimizer behavior and why.
>
>
It seems that we have network issues in Kansas, even though we get good rates when the optimizer lets us. Manoj thoroughly analyzed the network paths for sites that performed well and badly, cross-checking with PerfSONAR; the best clue is that there is a problem with a route in Los Angeles, which looks good to Nebraska but bad to Purdue.
 
Changed:
<
<

PhEDEx Documentation

>
>
Will follow up with Iperf testing to reveal packet loss and contact Network Support.
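A simple way to quantify the suspected packet loss (a sketch; the hostname is a placeholder, iperf3 has to be installed on both ends with its port open, and classic iperf works similarly with -u) is a UDP iperf3 run between a Purdue node and a Caltech node; the client summary reports jitter and lost/total datagrams:

 # on the Caltech end (placeholder host), start a server
 iperf3 -s

 # on the Purdue end, push UDP at a few Gbps for 30 seconds towards it
 iperf3 -c iperf.example.caltech.edu -u -b 2G -t 30

Repeating the run in both directions, and at different rates, helps tell a congested or lossy segment apart from an end-host limit.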
 
Changed:
<
<
We will mostly be exercising the Download agent, so the most useful documentation for us is this.
>
>

GFTP Issues :

 
Changed:
<
<
However there is also this if you would like to read more.
>
>
Error reason: TRANSFER globus_ftp_client: the server responded with an error 500 Command failed. : Allocated all 1500 file-backed buffers on server cms-g004.rcac.purdue.edu; aborting transfer.
 
Changed:
<
<

PhEDEx configurations

>
>
It probably ran out of memory buffers and started using disk buffers, which in principle shouldn't happen (Brian will know more). A quick workaround would be to raise the file-buffer limit in the configuration and restart the service, but ideally we should find the root cause of why it needs so many file buffers.
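For the HDFS-backed GridFTP servers, these buffer limits normally come from the gridftp-hdfs configuration rather than the core GridFTP server config. A minimal sketch of the workaround, assuming the OSG gridftp-hdfs packaging (the file path and variable names below are assumptions from that packaging; verify them against the gridftp-hdfs documentation for the installed release):

 # /etc/gridftp-hdfs/gridftp-hdfs-local.conf  (path and variable names assumed, see above)
 # in-memory buffers per transfer before spilling to file-backed buffers
 export GRIDFTP_BUFFER_COUNT=500
 # file-backed buffers; the error above suggests the current limit is 1500
 export GRIDFTP_FILE_BUFFER_COUNT=3000

Raising the limits only hides the symptom, though: if transfers routinely need thousands of buffers, the HDFS write path is probably not keeping up, and that is the part worth investigating.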
 
Changed:
<
<
One of the limitations is how much the download site's PhEDEx agent submits to FTS. Caltech was asked in the meeting how they control that. In that case, we have 2 agents: one for general transfers and another exclusively for US transfers; the -ignore and -accept flags do the separation. Note also that one can throttle the number of active transfers for each source site as needed and set a default for the sites not specified. The relevant part of Config.Debug is:
>
>

Florida

 
Changed:
<
<
### AGENT LABEL=download-debug-fts PROGRAM=Toolkit/Transfer/FileDownload DEFAULT=on
 -db              ${PHEDEX_DBPARAM}
 -nodes           ${PHEDEX_NODE}
 -delete          ${PHEDEX_CONF}/FileDownloadDelete
 -validate        ${PHEDEX_CONF}/FileDownloadVerify
 -ignore          '%T2_US%'
 -verbose
 -backend         FTS
 -batch-files     50
 -link-pending-files     200
 -max-active-files 700
 -link-active-files   'T1_CH_CERN_Buffer=50'
 -link-active-files   'T1_DE_KIT_Buffer=10'
 -link-active-files   'T1_DE_KIT_Disk=10'
 -link-active-files   'T1_ES_PIC_Buffer=100'
 -link-active-files   'T2_RU_RRC_KI=2'
 -link-active-files   'T1_FR_CCIN2P3_Buffer=100'
 -link-active-files   'T1_FR_CCIN2P3_Disk=100'
 -link-active-files   'T1_IT_CNAF_Buffer=150'
 -link-active-files   'T1_TW_ASGC_Buffer=100'
 -link-active-files   'T1_UK_RAL_Buffer=50'
 -link-active-files   'T1_US_FNAL_Buffer=100'
 -link-active-files   'T2_DE_RWTH=10'
 -link-active-files   'T2_IT_Pisa=20'
 -default-link-active-files 100
 -protocols       srmv2
 -mapfile         ${PHEDEX_FTS_MAP}


### AGENT LABEL=download-debug-t2fts PROGRAM=Toolkit/Transfer/FileDownload DEFAULT=on
 -db              ${PHEDEX_DBPARAM}
 -nodes           ${PHEDEX_NODE}
 -delete          ${PHEDEX_CONF}/FileDownloadDelete
 -validate        ${PHEDEX_CONF}/FileDownloadVerify
 -accept          '%T2_US%'
 -verbose
 -backend         FTS
 -batch-files     20
 -link-pending-files     300
 -max-active-files 300
 -protocols       srmv2
 -mapfile         ${PHEDEX_FTS_MAP}
>
>
Just joined. Thanks, will ramp up transfers.
 
Added:
>
>

Vanderbilt

 
Changed:
<
<
>
>
Joined on the first day; LStore performance is randomly anywhere from great to poor. Have seen 600 MBps in the past but not a lot right now. Still figuring out the contacts to follow up with there as they reorganize the team.
  -- Main.samir - 2014-07-29

Revision 4 - 2014-07-30 - samir

Line: 1 to 1
 

USCMS T2 Transfers

Line: 8 to 8
  The general picture on transfers over 20 Gbps and some of these configuration problems are mentioned in Samir's talk at the T2 meeting of 07/29.
Added:
>
>
So far, the showstopper has been the uplink bandwidth at most sites. Since July 2014 this has started to change.
Ideally, even 10 Gbps sites would participate, as it is possible that the current settings are not optimal for fast transfers. We could tune until the 10 Gbps link saturates, and everyone would have exercised how to improve transfer rates in Debug.
Changed:
<
<

Current connectivity at sites

>
>

Plan for the exercise

As discussed in the meeting, we would like to use Caltech as the source site, as it managed 25/29 Gbps with its setup, making it a good source for sites optimizing their configurations. Once everyone else optimizes their download configurations and we observe which rates we get to which sites, we can start rotating the source site and see what maximum rates we get from each. It is important to have multiple sink sites: even if some sites have limitations, the others will add up to the total rate.

 
Changed:
<
<
So far, the showstopper has been the uplink bandwidth at most sites. Since July 2014 this has started to change; this list tracks the current state at the different sites.
>
>
There are 3 major steps in this exercise; 2 of them will require coordination among sites:
 
Changed:
<
<
  • MIT - 10 Gbps
  • Caltech - 100 Gbps
  • Nebraska - Could do more than 10 Gbps currently, will have 100 Gbps on the T2 soon.
  • Purdue - 100 Gbps. Running tests with FNAL.
  • Florida - 100 Gbps
  • UCSD - 10 Gbps
  • SPRACE - 10 Gbps
  • Wisconsin - 10 Gbps
  • Vanderbilt - 10 Gbps
>
>
  • Tuning PhEDEx download configurations, so that LoadTest settings correspond better to reality.
  • Observing how transfers behave at the FTS level, noting whether the optimizer algorithm is a limiting factor or actually helps reach the optimal number of active transfers for the bandwidth available at a given moment.
  • With the logical limitations removed, sites can focus on setting their upload rates as they want, optimizing their GridFTP setups, and observing the best rates they can get out of the storage.
When we are done with these, the transfer test framework through PhEDEx will be more responsive and we will be able to run more advanced tests at higher rates, for example coordinating pushes of data from many sites to one.

Participation of sites

In order to contact only the interested sites, please fill out the table below :

| Site | Connectivity | Participating | Notes |
| T2_US_Caltech | 100G | DONE | |
| T2_US_Florida | 100G | N/A | |
| T2_US_Purdue | 100G | N/A | |
| T2_BR_SPRACE | 10G | N/A | |
| T2_US_MIT | 10G | N/A | |
| T2_US_Nebraska | 10G | N/A | Upgrading to 100G soon |
| T2_US_UCSD | 10G | N/A | |
| T2_US_Wisconsin | 10G | N/A | |
| T2_US_Vanderbilt | 10G | N/A | |
 

FTS Notes

Added:
>
>
Currently we have 3 official FTS servers :

  • cmsfts3.fnal.gov
  • fts3.cern.ch
  • lcgfts3.gridpp.rl.ac.uk
There is an official recommendation for the most logical distribution of which server to use; however, for this exercise people are encouraged to try other deployments and possibly observe different behaviors. For example, 208 parallel transfers were observed on CERN's FTS, but not more than 50 at FNAL (yet).

In the long run, US sites should use FNAL, but it might be worth understanding whether other FTS servers show different optimizer behavior and why.

 

PhEDEx Documentation

Changed:
<
<
We will mostly be exercising the Download agent, so the most useful documentation for us is this. However there is also this if you would like to read more.
>
>
We will mostly be exercising the Download agent, so the most useful documentation for us is this.

However there is also this if you would like to read more.

 

PhEDEx configurations

Changed:
<
<
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed condimentum rhoncus ligula, et lobortis ipsum aliquet id. In fringilla felis id venenatis placerat. Vivamus ornare egestas mattis. Donec sapien leo, gravida vitae magna sed, dictum convallis metus. Nulla mattis iaculis diam a gravida. Sed ac sem eget nisi tristique convallis vel sit amet dolor.
>
>
One of the limitations is how much the download site's PhEDEx agent submits to FTS. Caltech was asked in the meeting how they control that. In that case, we have 2 agents: one for general transfers and another exclusively for US transfers; the -ignore and -accept flags do the separation. Note also that one can throttle the number of active transfers for each source site as needed and set a default for the sites not specified. The relevant part of Config.Debug is:
 
Changed:
<
<
. . . -batch-files 500
>
>
### AGENT LABEL=download-debug-fts PROGRAM=Toolkit/Transfer/FileDownload DEFAULT=on
 -db              ${PHEDEX_DBPARAM}
 -nodes           ${PHEDEX_NODE}
 -delete          ${PHEDEX_CONF}/FileDownloadDelete
 -validate        ${PHEDEX_CONF}/FileDownloadVerify
 -ignore          '%T2_US%'
 -verbose
 -backend         FTS
 -batch-files     50
 -link-pending-files     200
 -max-active-files 700
 -link-active-files   'T1_CH_CERN_Buffer=50'
Line: 48 to 78
 -link-active-files   'T2_RU_RRC_KI=2'
 -link-active-files   'T1_FR_CCIN2P3_Buffer=100'
 -link-active-files   'T1_FR_CCIN2P3_Disk=100'
Changed:
<
<
. . .
>
>
 -link-active-files   'T1_IT_CNAF_Buffer=150'
 -link-active-files   'T1_TW_ASGC_Buffer=100'
 -link-active-files   'T1_UK_RAL_Buffer=50'
 -link-active-files   'T1_US_FNAL_Buffer=100'
 -link-active-files   'T2_DE_RWTH=10'
 -link-active-files   'T2_IT_Pisa=20'
 -default-link-active-files 100
 -protocols       srmv2
 -mapfile         ${PHEDEX_FTS_MAP}

### AGENT LABEL=download-debug-t2fts PROGRAM=Toolkit/Transfer/FileDownload DEFAULT=on
 -db              ${PHEDEX_DBPARAM}
 -nodes           ${PHEDEX_NODE}
 -delete          ${PHEDEX_CONF}/FileDownloadDelete
 -validate        ${PHEDEX_CONF}/FileDownloadVerify
 -accept          '%T2_US%'
 -verbose
 -backend         FTS
 -batch-files     20
 -link-pending-files     300
 -max-active-files 300
 -protocols       srmv2
 -mapfile         ${PHEDEX_FTS_MAP}

 

-- Main.samir - 2014-07-29

Revision 3 - 2014-07-30 - samir

Line: 1 to 1
 

USCMS T2 Transfers

This twiki is intended to aggregate all necessary information for the current effort of improving inter-T2 PhEDEx transfers in the context of USCMS.

Changed:
<
<
It is known that the network links between the 7 sites have high capacity and availability. However, there seem to be some limitations, at the level of the CMS transfer tools or their configurations, that should be addressed and tested; fixing these could improve overall performance and, in the end, make these systems deliver data faster between sites.
>
>
It is known that the network links between the 8 sites have high capacity and availability. However, there seem to be some limitations, at the level of the CMS transfer tools or their configurations, that should be addressed and tested; fixing these could improve overall performance and, in the end, make these systems deliver data faster between sites.
  The general picture on transfers over 20 Gbps and some of these configuration problems are mentioned in Samir's talk at the T2 meeting of 07/29.
Line: 15 to 15
 So far, the showstopper has been the uplink bandwidth at most sites. Since July 2014 this has started to change; this list tracks the current state at the different sites.

  • MIT - 10 Gbps
Added:
>
>
  • Caltech - 100 Gbps
 
  • Nebraska - Could do more than 10 Gbps currently, will have 100 Gbps on the T2 soon.
  • Purdue - 100 Gbps. Running tests with FNAL.
  • Florida - 100 Gbps
Changed:
<
<
  • UCSD - 10 Gbps (Samir observed when running tests. To be confirmed by admins)
>
>
  • UCSD - 10 Gbps
 
  • SPRACE - 10 Gbps
Changed:
<
<
  • Wisconsin - 10 Gbps - Saturating from time to time. Would like to avoid tests for now.
>
>
  • Wisconsin - 10 Gbps
  • Vanderbilt - 10 Gbps
 

FTS Notes

Revision 2 - 2014-07-30 - samir

Line: 1 to 1
 

USCMS T2 Transfers

Changed:
<
<
This twiki is intended to aggregate all necessary information for the current effort of improving inter-T2 PhEDEx transfers in the context of USCMS.
>
>
This twiki is intended to aggregate all necessary information for the current effort of improving inter-T2 PhEDEx transfers in the context of USCMS.
  It is known that the network links between the 7 sites have high capacity and availability. However, there seem to be some limitations, at the level of the CMS transfer tools or their configurations, that should be addressed and tested; fixing these could improve overall performance and, in the end, make these systems deliver data faster between sites.
Line: 26 to 26
 

PhEDEx Documentation

Added:
>
>
We will mostly be exercising the Download agent, so the most useful documentation for us is this. However there is also this if you would like to read more.
 

PhEDEx configurations

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed condimentum rhoncus ligula, et lobortis ipsum aliquet id. In fringilla felis id venenatis placerat. Vivamus ornare egestas mattis. Donec sapien leo, gravida vitae magna sed, dictum convallis metus. Nulla mattis iaculis diam a gravida. Sed ac sem eget nisi tristique convallis vel sit amet dolor.

Revision 1 - 2014-07-29 - samir

Line: 1 to 1
Added:
>
>

USCMS T2 Transfers

This twiki is intended to aggregate all necessary information for the current effort of improving inter-T2 PhEDEx transfers in the context of USCMS.

It is known that the network links between the 7 sites have high capacity and availability. However, there seem to be some limitations, at the level of the CMS transfer tools or their configurations, that should be addressed and tested; fixing these could improve overall performance and, in the end, make these systems deliver data faster between sites.

The general picture on transfers over 20 Gbps and some of these configuration problems are mentioned in Samir's talk at the T2 meeting of 07/29.

Ideally, even 10 Gbps sites would participate, as it is possible that the current settings are not optimal for fast transfers. We could tune until the 10 Gbps link saturates, and everyone would have exercised how to improve transfer rates in Debug.

Current connectivity at sites

So far, the showstopper has been the uplink bandwidth at most sites. Since July 2014 this has started to change; this list tracks the current state at the different sites.

  • MIT - 10 Gbps
  • Nebraska - Could do more than 10 Gbps currently, will have 100 Gbps on the T2 soon.
  • Purdue - 100 Gbps. Running tests with FNAL.
  • Florida - 100 Gbps
  • UCSD - 10 Gbps (Samir observed when running tests. To be confirmed by admins)
  • SPRACE - 10 Gbps
  • Wisconsin - 10 Gbps - Saturating from time to time. Would like to avoid tests for now.

FTS Notes

PhEDEx Documentation

PhEDEx configurations

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed condimentum rhoncus ligula, et lobortis ipsum aliquet id. In fringilla felis id venenatis placerat. Vivamus ornare egestas mattis. Donec sapien leo, gravida vitae magna sed, dictum convallis metus. Nulla mattis iaculis diam a gravida. Sed ac sem eget nisi tristique convallis vel sit amet dolor.

.
.
.
 -batch-files 500
 -link-pending-files 200
 -max-active-files 700
 -link-active-files 'T1_CH_CERN_Buffer=50'
 -link-active-files 'T1_DE_KIT_Buffer=10'
 -link-active-files 'T1_DE_KIT_Disk=10'
 -link-active-files 'T1_ES_PIC_Buffer=100'
 -link-active-files 'T2_RU_RRC_KI=2'
 -link-active-files 'T1_FR_CCIN2P3_Buffer=100'
 -link-active-files 'T1_FR_CCIN2P3_Disk=100'
.
.
.

-- Main.samir - 2014-07-29

 