Storage space is available on the TeraGrid, either as companion storage
that automatically accompanies a computation allocation
or as a separate data allocation, independent of
computation allocations. Data storage allocations meet the needs of researchers
for short- and long-term
storage and for staging of data collections in databases or on disk. They are obtained
through the same system (POPS)
that is used to request computation resources.
Each resource provider (RP) site that offers data resources for allocation through
the TeraGrid has its own policies and specifications. In addition to storage resources
available at individual RP sites, a global file system (GPFS-WAN) is mounted on
multiple platforms across the TeraGrid. The tables below describe the storage resources
at each RP site, including details about networks, file space, database availability,
and recommended use, and indicate whether GPFS-WAN is available
at the site.
(For information about storage that is associated with computation allocations,
refer to the computation resources pages of the TeraGrid Resource Catalog.)
SDSC |
| Resource Name | Description & Recommended Use | Specifications | Media Type | Total File Space | Database | Access |
| GPFS-WAN | TeraGrid GPFS-WAN (General Parallel File System-Wide Area Network) is a large-scale storage system mounted on several TeraGrid platforms. Although the system is physically located at SDSC, it appears local to each system on which it is mounted.
Recommended Use
GPFS-WAN is recommended for long- or short-term data storage of high-volume multi-site runs, as well as for large TeraGrid-based data collections. Allocated space is available in the
Long-term Collections Area, a 150-TB partition for data collections.
Status
Available for allocations and in production
What to Choose in POPS
In the Select the Resource Level for Your Request section, select the appropriate request level:
- 0 - 200,000 (Startup and Educational Requests for 0 - 5 TB disk or 0 - 25 TB tape)
- 200,000 - above (Research Requests for > 5 TB disk or > 25 TB tape)
On the Resource Request page, choose GPFS-WAN Disk Space.
More information on GPFS-WAN | | Disk | 700 TB Total capacity
150 TB Long-term collections
475 TB Project Area
75 TB Scratch
How to apply | Not applicable | GPFS-WAN is currently mounted on the following machines:
- IU
tg-login.iu.teragrid.org (Big Red PPC Linux Cluster)
- SDSC
tg-login.sdsc.teragrid.org (IA-64 Linux Cluster)
- UC/ANL
tg-viz-login.uc.teragrid.org (IA-32)
tg-login.uc.teragrid.org (IA-64 Linux Cluster)
|
| SDSC Tape Storage | Development (up to 25 TB), medium (25-200 TB), or large (greater than 100 TB)
allocations are available for long-term archival storage independent of SDSC computation resources.
Recommended Use
SDSC Tape Storage is recommended for sets of data that require high availability but do not require real-time access.
Status
Available for allocations and in production | HPSS | Tape | 25 PB | | HSI |
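The POPS request levels described in the GPFS-WAN entry come down to two size thresholds: 5 TB of disk and 25 TB of tape. As a rough illustration only, the following Python sketch maps a requested amount of storage to the matching level. The function name `pops_request_level` and the label strings it returns are hypothetical and are not part of POPS itself.

```python
# Thresholds taken from the "What to Choose in POPS" text in the catalog.
DISK_THRESHOLD_TB = 5
TAPE_THRESHOLD_TB = 25

# Hypothetical labels that mirror the wording of the two request levels.
STARTUP = "0 - 200,000 (Startup and Educational Requests)"
RESEARCH = "200,000 - above (Research Requests)"


def pops_request_level(disk_tb=0.0, tape_tb=0.0):
    """Return the request level that matches the requested storage sizes.

    A request is a Research request if it exceeds either threshold
    (> 5 TB disk or > 25 TB tape); otherwise it is Startup/Educational.
    """
    if disk_tb < 0 or tape_tb < 0:
        raise ValueError("requested sizes must be non-negative")
    if disk_tb > DISK_THRESHOLD_TB or tape_tb > TAPE_THRESHOLD_TB:
        return RESEARCH
    return STARTUP


if __name__ == "__main__":
    print(pops_request_level(disk_tb=2))
    print(pops_request_level(disk_tb=12))
```

A request of exactly 5 TB of disk or 25 TB of tape is treated here as a Startup and Educational request, following the "0 - 5 TB" and "0 - 25 TB" ranges in the catalog text.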
IU |
| Resource Name | Description & Recommended Use | Specifications | Media Type | Total File Space | Database | Access |
| Dedicated (nonpurged) disk for databases and data collections | IU Data Collections and Database Dedicated (nonpurged) Disk Space
Recommended Use
Storage of persistent data collections on disk in any standard format as well as in Oracle and MySQL databases | | Disk | 100 TB | Oracle
MySQL | GridFTP
from spinning disk storage on Big Red or from the IU Data Capacitor (via Lustre clients)
IU is also in the process of creating a Web portal interface for GridFTP access. |
| GPFS-WAN | TeraGrid GPFS-WAN (General Parallel File System-Wide Area Network) is a large-scale storage system mounted on several TeraGrid platforms. Although the system is physically located at SDSC, it appears local to each system on which it is mounted.
Recommended Use
GPFS-WAN is recommended for long- or short-term data storage of high-volume multi-site runs, as well as for large TeraGrid-based data collections. Allocated space is available in the
Long-term Collections Area, a 150-TB partition for data collections.
Status
Available for allocations and in production
What to Choose in POPS
In the Select the Resource Level for Your Request section, select the appropriate request level:
- Startup/Educational (less than 200,000 SUs) for < 5 TB disk or 25 TB tape
- Research Requests (over 200,000 SUs) for > 5 TB disk or 25 TB tape
On the Resource Request page, choose GPFS-WAN Disk Space.
More information on GPFS-WAN | | Disk | 700 TB Total capacity
150 TB Long-term collections
475 TB Project Area
75 TB Scratch
How to apply | Not Applicable | GPFS-WAN is currently mounted on the following machines:
- IU
tg-login.iu.teragrid.org (Big Red PPC Linux Cluster)
- SDSC
tg-login.sdsc.teragrid.org (IA-64 Linux Cluster)
- UC/ANL
tg-viz-login.uc.teragrid.org (IA-32)
tg-login.uc.teragrid.org (IA-64 Linux Cluster)
|
| IU Archival Storage (replicated or single copy) | Archival storage under control of the HPSS (High Performance Storage System) software, stored on 500 GB tapes. The HPSS installation is a geographically distributed data storage system. Data may be stored as a single copy in one location or replicated in two locations (Bloomington and Indianapolis).
Recommended Use
Storage of very large data sets, including storage of highly valuable data in two geographically distinct locations | HPSS
500 GB tapes
52 tape drives
R/W=100 MB/sec
1- and 4-way stripes | Tape | 2.8 PB | NA | directly from tape or from a front-end cache
GridFTP
HPSS Hierarchical Storage Interface (HSI)
IU recommends use of GridFTP |
| Lustre file space (IU Data Capacitor) | Dedicated (nonpurged) disk storage in support of data collections and data-centric computing. Lustre is offered as an experimental service for researchers who may have a particular interest in Lustre.
Recommended Use
Storage of persistent data collections on disk in any standard format as well as in Oracle and MySQL databases
Status
Available for allocations and in production as a test service for the TeraGrid. | | Disk | 535 TB | Oracle
MySQL | via Lustre from the IU Data Capacitor
IU is also in the process of creating a Web portal interface for GridFTP access. |
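The IU Archival Storage specifications list a read/write rate of 100 MB/sec and 1- and 4-way stripes. For planning purposes, a back-of-the-envelope transfer-time estimate can be sketched as below. This is only an illustration under stated assumptions: that the listed rate applies per stripe, that throughput scales linearly with the number of stripes, and that units are decimal (1 MB = 10^6 bytes). The catalog confirms none of these, and actual throughput also depends on the network, the front-end cache, and tape mount times. The function name `estimate_transfer_seconds` is hypothetical.

```python
# 100 MB/sec from the specifications column; decimal megabytes are assumed.
RATE_BYTES_PER_SEC = 100 * 10**6

# The catalog lists only 1- and 4-way stripes.
SUPPORTED_STRIPES = (1, 4)


def estimate_transfer_seconds(size_bytes, stripes=1):
    """Estimate seconds needed to read or write size_bytes.

    Assumes the listed rate is per stripe and that stripes add up
    linearly; both are simplifying assumptions, not documented facts.
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if stripes not in SUPPORTED_STRIPES:
        raise ValueError("only 1- and 4-way stripes are listed")
    return size_bytes / (RATE_BYTES_PER_SEC * stripes)


if __name__ == "__main__":
    one_tape = 500 * 10**9  # one full 500 GB tape, decimal gigabytes assumed
    for stripes in SUPPORTED_STRIPES:
        seconds = estimate_transfer_seconds(one_tape, stripes)
        print(f"{stripes}-way stripe: about {seconds / 60:.0f} minutes")
```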
NCSA |
| Resource Name | Description & Recommended Use | Specifications | Media Type | Total File Space | Database | Access |
| NCSA Data Resource Services | Recommended Use
Web/FTP access to smaller datasets. Users who need a collection to be accessible through anonymous FTP, or who have web portal applications, can request data storage with simple application help.
| Andrew File System (AFS) storage | Fibre Channel disk | 2 TB shared storage | N/A | Through local NCSA login for data updates; publicly accessible depending on the user's data requirements |
| NCSA Tape Storage | Archival storage managed by EMC's DiskXtender software. Data is stored on tape and disk of various sizes, depending on the size of the data and how long it has been in the archive. NCSA keeps two copies of all stored data to protect against media failure. Open allocations are currently available, but small, medium, large, and extra-large data allocations (<1 TB, <5 TB, <25 TB, and >=25 TB) are under evaluation.
Recommended Use
NCSA Tape Storage is recommended for any data that require high availability but do not require real-time access and that may need long-term storage.
Status
Available for allocations and in production | | Tape | 10 PB | N/A | FTP and SSH access are available. MSS can be accessed both from outside the NCSA domain and from NCSA's production machines at mss.ncsa.uiuc.edu or mss.ncsa.teragrid.org |
UC/ANL |
| Resource Name | Description & Recommended Use | Specifications | Media Type | Total File Space | Database | Access |
| GPFS-WAN | TeraGrid GPFS-WAN (General Parallel File System-Wide Area Network) is a large-scale storage system mounted on several TeraGrid platforms. Although the system is physically located at SDSC, it appears local to each system on which it is mounted.
Recommended Use
GPFS-WAN is recommended for long- or short-term data storage of high-volume multi-site runs, as well as for large TeraGrid-based data collections. Allocated space is available in the
Long-term Collections Area, a 150-TB partition for data collections.
Status
Available for allocations and in production
What to Choose in POPS
In the Select the Resource Level for Your Request section, select the appropriate request level:
- 0 - 200,000 (Startup and Educational Requests for 0 - 5 TB disk or 0 - 25 TB tape)
- 200,000 - above (Research Requests for > 5 TB disk or > 25 TB tape)
On the Resource Request page, choose GPFS-WAN Disk Space.
More information on GPFS-WAN | | Disk | 700 TB Total capacity
150 TB Long-term collections
475 TB Project Area
75 TB Scratch
How to apply | Not applicable | GPFS-WAN is currently mounted on the following machines:
- IU
tg-login.iu.teragrid.org (Big Red PPC Linux Cluster)
- SDSC
tg-login.sdsc.teragrid.org (IA-64 Linux Cluster)
- UC/ANL
tg-viz-login.uc.teragrid.org (IA-32)
tg-login.uc.teragrid.org (IA-64 Linux Cluster)
|
NCAR |
| Resource Name | Description & Recommended Use | Specifications | Media Type | Total File Space | Database | Access |
| GPFS-WAN | TeraGrid GPFS-WAN (General Parallel File System-Wide Area Network) is a large-scale storage system mounted on several TeraGrid platforms. Although the system is physically located at SDSC, it appears local to each system on which it is mounted.
Recommended Use
GPFS-WAN is recommended for long- or short-term data storage of high-volume multi-site runs, as well as for large TeraGrid-based data collections. Allocated space is available in the Long-term Collections Area, a 150-TB partition for data collections.
Status
Available for allocations and in production
What to Choose in POPS
In the Select the Resource Level for Your Request section, select the appropriate request level:
- 0 - 200,000 (Startup and Educational Requests for 0 - 5 TB disk or 0 - 25 TB tape)
- 200,000 - above (Research Requests for > 5 TB disk or > 25 TB tape)
On the Resource Request page, choose GPFS-WAN Disk Space.
More information on GPFS-WAN | | Disk | 700 TB Total capacity
150 TB Long-term collections
475 TB Project Area
75 TB Scratch
How to apply | Not applicable | GPFS-WAN is currently mounted on the following machines:
- IU
tg-login.iu.teragrid.org (Big Red PPC Linux Cluster)
- SDSC
tg-login.sdsc.teragrid.org (IA-64 Linux Cluster)
- UC/ANL
tg-viz-login.uc.teragrid.org (IA-32)
tg-login.uc.teragrid.org (IA-64 Linux Cluster)
|
| NCAR HPSS Storage System |
This archival system features
- a maximum file size of 1 TB,
- initial per-user quota of 5 TB with opportunity for requesting increased allocations,
- ability to choose 1 or 2 copies for a file at creation time, and
- a POSIX-compliant interface.
NCAR HPSS is recommended for frost users who need to store archival data sets, i.e., data sets that require high availability but do not require real-time access. | HPSS | Tape | 1 PB | | HSI (for more information see Archiving files on NCAR's new HPSS Storage System) |
| NCAR Mass Storage System (Single or Double Copy) | Archival storage on the NCAR Mass Storage System (MSS), controlled by the MSS software. Data is stored on 1-terabyte tapes. NCAR MSS is located at NCAR.
Recommended Use
NCAR MSS is recommended for use in archiving data required for running jobs on frost, or for frost output that cannot be moved offsite. MSS files are limited to 100 billion bytes in size as of March 2009.
Status
Available for allocations and in production | NCAR MSS
1-terabyte tapes
The number of tape drives, read/write speed, striping, etc. are irrelevant because this system is for archival rather than file-server purposes | Tape | 10 PB as of March 2009
Limit per user: 5 TB-Years | NA | NCAR MSS DCS Commands (see frost man pages on msrcp command for more information, or check NCAR DCS Information Web Pages). |
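The two NCAR archives have different per-file size limits: 1 TB on NCAR HPSS and 100 billion bytes on NCAR MSS (as of March 2009). The following Python sketch, offered only as an illustration, reports which archive would accept a file of a given size. It assumes that the 1 TB HPSS limit is a decimal terabyte (10^12 bytes), which the catalog does not specify. The function name `archives_accepting` is hypothetical, and the check ignores per-user quotas.

```python
# Per-file limits taken from the NCAR entries in the catalog.
# ASSUMPTION: "1 TB" is treated as a decimal terabyte (10**12 bytes).
HPSS_MAX_FILE_BYTES = 10**12
MSS_MAX_FILE_BYTES = 100 * 10**9  # "100 billion bytes" as of March 2009


def archives_accepting(size_bytes):
    """Return the NCAR archives whose per-file limit allows size_bytes."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    accepted = []
    if size_bytes <= HPSS_MAX_FILE_BYTES:
        accepted.append("NCAR HPSS")
    if size_bytes <= MSS_MAX_FILE_BYTES:
        accepted.append("NCAR MSS")
    return accepted


if __name__ == "__main__":
    for size in (50 * 10**9, 500 * 10**9, 2 * 10**12):
        print(size, archives_accepting(size))
```

A file larger than both limits would need to be split before archiving; how to split it is outside the scope of this sketch.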