
Advanced Research Computing : Data Den - Support

The User's Process to Request a Data Den Allocation

A Data Den volume is requested through the ITS Service Request System (SRS), following the same steps as creating a new Turbo or Locker volume. A PI needs to visit the ITS Service Request System (SRS) and then follow these steps:

  • Select ‘Request Service’ at the top of the screen.
  • Select either Locker Storage or Turbo Storage from the list under ‘Category’.
  • Choose the type of storage you need, either NFS or CIFS.
  • Once you’ve chosen your storage type (Locker or Turbo and NFS or CIFS) you will be taken to a series of pages that you will need to fill out. Click ‘Next’ to move through each page.
  • Once all pages have been filled out you will be taken to a ‘Review’ page. Here you can verify all of your information is correct before submitting your request.
  • After reviewing all your information to ensure it is correct click on ‘Submit Now’ to submit the request for a new volume.
  • The request will be sent to the ARC-TS Storage team for processing. This can take up to one business day to complete. They will update you as soon as the new volume is ready.

Additional information about the individual fields required for a new volume can be found on this page:

https://arc-ts.umich.edu/ordering-storage/

Relevant Information

  1. Accessing Data Den:
  • Data Den is only accessible through the campus Globus infrastructure.
  • The correct Collection (endpoint) is "umich#flux".
  • If the data to be stored currently resides on a device managed by an IT unit, ARC-TS can mount that storage on an ARC-TS Globus endpoint.
  • Upon request, an NFS mount can be provided to customers who have specific workflows.

  2. Data Den has a unique quota system due to the tape filesystems on the service:

  • Tapes do not work well with small files.

  • For each TB of Data Den capacity we provide, ARC-TS reserves a quota of 10,000 files.

  • Researchers should ensure that their files are as large as possible before moving them into Data Den, to get the most efficient use of the service.

    • A simple way to ensure larger file sizes is to bundle many files into a tarred, bzipped archive.

    • The maximum file size is 8 TB; a minimum file size of 1 TB is suggested.

    • If an archive is larger than 1 TB, the split command can divide it into a series of smaller archives.
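The bundling and splitting workflow above can be sketched in the shell. The file names and the tiny piece size below are purely illustrative (real archives would be split with something like `-b 1T` rather than kilobytes):

```shell
# A minimal sketch of bundling small files before a Data Den transfer.
# All paths and sizes here are illustrative, not Data Den defaults.

# Create some sample small files to stand in for research data.
mkdir -p project_data
for i in 1 2 3; do
    head -c 4096 /dev/urandom > "project_data/file_$i.dat"
done

# Bundle everything into one bzip2-compressed tar archive, since
# Data Den allows only 10,000 files per TB of capacity.
tar -cjf project_data.tar.bz2 project_data/

# split(1) can cut a large archive into fixed-size, numbered pieces.
# 4 KB pieces are used here only for demonstration; for real archives
# you would pick a much larger size to stay under the 8 TB per-file
# maximum.
split -b 4K -d project_data.tar.bz2 project_data.tar.bz2.part

# Reassemble the pieces with cat before extracting.
cat project_data.tar.bz2.part* > reassembled.tar.bz2
tar -tjf reassembled.tar.bz2
```

The reassembled archive is byte-identical to the original, so the pieces can be transferred independently and joined again on the other side.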

Advice about transferring between Globus and Data Den

From ticket 77558: if a user reports a ‘full’ error during a transfer, they may be confusing the direction they are sending data. If scratch is not actually full, they are likely sending data into Data Den and hitting one of its limits.
Data Den has a number of ways it could report ‘full’, the most common being:

  • Data Den is replicated, so the quota is set to 2x the requested capacity; e.g., a 10 TB request shows up as a 20 TB quota.
  • The volume has run out of files. Data Den requires an average file size of 100 MB, since only 10,000 files per TB of capacity are provided, so users need to tar/zip their data; this requirement is often missed in the documentation.
  • There is also the usual chance that they used up all the space they requested, or that the tape migration did not move data from disk (which has a small quota) to tape (the real quota) quickly enough.
  • Lastly, a rebooted transfer node is unlikely to be the cause, since Globus handles node reboots gracefully.
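The quota behavior described above can be checked with quick shell arithmetic. The 10 TB volume below is a hypothetical example; the 2x replication factor and the 10,000-files-per-TB figure come from the notes above:

```shell
# Hypothetical example: a PI requests a 10 TB Data Den volume.
REQUESTED_TB=10

# Data Den is replicated, so the quota is set to 2x the requested
# capacity: a 10 TB request shows up as a 20 TB quota.
DISPLAYED_QUOTA_TB=$((REQUESTED_TB * 2))
echo "Quota shown: ${DISPLAYED_QUOTA_TB} TB"

# The file-count quota is 10,000 files per TB of requested capacity.
FILE_LIMIT=$((REQUESTED_TB * 10000))
echo "File limit: ${FILE_LIMIT} files"

# This works out to the ~100 MB average file size mentioned above:
# 1 TB / 10,000 files = 100 MB per file (decimal units).
AVG_FILE_MB=$((1000000 / 10000))
echo "Implied average file size: ${AVG_FILE_MB} MB"
```

A quick calculation like this can tell a researcher whether their dataset needs bundling before it hits the file-count quota rather than the capacity quota.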