TeraGrid '07 Abstracts

Tuesday June 5

Implementing an Inter-institutional Undergraduate Computational Science Program
S. Gordon, K. Carey, I. Vakalis

Ten Ohio higher education institutions are collaborating to create an inter-institutional, competency-based undergraduate minor curriculum in computational science. The program will be available to majors in science, mathematics, and engineering in Autumn 2007. It is part of an NSF Cyberinfrastructure Team Award (SCI-0537405) and the new, statewide virtual Ralph Regula School of Computational Science. Key elements of the program development have included the definition of the interdisciplinary competency requirements for the program, the development of sharable instructional modules, and the inter-institutional agreements that allow students to cross-register for classes at cooperating institutions and a funding mechanism for the program. These are presented along with some example instructional modules and the 2007-2008 academic year curriculum.

The Internet to the Hogan and Dine' Grid Project
M. Trebian, T. Davis, J. Ribble, J. Arviso

The Internet To The Hogan and Dine' Grid is a project designed to fundamentally change the socio-economic realities for the Navajo Nation through the building of a cutting-edge cyberinfrastructure connecting Navajo communities to the global scientific community. The ultimate outcome is to end the digital divide, starting with communities surrounding Navajo Technical College, by building a high-speed wireless backbone with OC3 bandwidth that joins with communities such as the TeraGrid through the Lambda Rail and Internet2 from Albuquerque, NM and the ABQ GigaPop. The OC3 backhaul will connect through communities in the Navajo Nation and the Pueblo Nation over a 120 mile path using Harris radio technologies. Collaborations and agreements between sovereign nations were negotiated and the resources of the global scientific community were brought to bear to make the backbone a reality. After making the connection to Navajo Technical College the build out will continue to 52 chapterhouses and community centers surrounding the college. Motorola Canopy technologies will be installed that will provide broadband wireless coverage at each chapterhouse and community center out to a radius of 30 miles. Another aspect of the cyberinfrastructure implementation is the establishment of supercluster technologies on campus and distributed cluster technologies at the chapterhouses and community centers through the LittleFe project. The Dine' Grid will be more than just a distributed computing network where communities will have the opportunity for direct access to computing resources they would normally never have available. Teaching and learning will occur to provide communities the opportunity to be contributors to the maintenance and expansion of the grid resources being made available to them in this project and to science occurring in and outside their communities. E-learning and telephony services will be made available through the application of open source products.
Collaborations and on-going relationships will be fostered through interactions with member communities of the TeraGrid to create research opportunities that have a direct impact on the communities touched by the project. Finally, a technology transfer model will be designed to create enterprises and entrepreneurships that can successfully compete in niche and national markets as intellectual human resources are built in the Navajo Nation that can construct a new economic structure through well prepared human capital for the high value, high intellect jobs and tasks of the Twenty-first Century.

Education and Outreach at Scale - Lessons from EPIC and EOT-PACI
R. Giles

A significant challenge to broadening participation in cyberinfrastructure (CI) and CI-enabled science is to scale education and outreach activities to match the wide geographic, disciplinary, and institutional scales implicit in cyberinfrastructure itself. In this paper, we describe some of the activities and impacts of two projects that have addressed this challenge and focus on lessons learned that can inform future efforts. EOT-PACI (Education, Outreach and Training Partnerships for Advanced Computational Infrastructure 1997-2004) was the integrated education and outreach effort of the NCSA Alliance and SDSC's NPACI partnerships. EPIC (Engaging People in Cyberinfrastructure 2005-2006) was a national collaboration building on EOT-PACI's efforts led by Boston University and University of Wisconsin. Both these projects engaged about 25 institutional partners in activities involving: education at all levels; inclusion of underrepresented women, minorities and people with disabilities; and participation in the design of technologies that better support the needs of a broader constituency.

Key strategies included: building on opportunities to move programs and ideas of partners from local to national scale; attention to evaluation as a means of understanding successful programs and how to scale them; empowering participants to address their own problems and interests; building a community among partners that allowed for collaboration and activities beyond the scope of the convening project; and attempts to ensure that the needs and interests of educators, students, and people from underrepresented groups are reflected in new cyberinfrastructure being developed.

Federating the TeraGrid with European Grids for Large-scale Molecular Dynamics Investigations of Clay Based Nanoparticles Student
P. Coveney, J. Suter, M. Thyveetil, S. Zasada

In this study, we have combined the resources available on the US TeraGrid with those on the UK National Grid Service and the EU DEISA grid to perform a series of computationally demanding large-scale molecular dynamics (MD) simulations. We define large-scale MD as systems that contain more than 100,000 atoms. The Application Hosting Environment (AHE), a lightweight grid middleware system, provides an easy-to-use and uniform interface to this federation of grids. Using this new technology, we have simulated montmorillonite and layered double hydroxide (LDH) clay systems containing upwards of a million atoms, with dimensions approaching those of a realistic clay platelet. This considerably extends the spatial dimensions of microscopic simulation into a domain normally encountered in mesoscopic simulation. The simulations exhibit emergent behavior with increasing size, manifesting collective thermal motion of clay sheet atoms. The thermal bending fluctuations allow us to calculate material properties, which are hard to obtain experimentally due to the small size of clay platelets. This information is important for determining the materials properties of clay-polymer nanocomposites, a new type of material with enhanced mechanical properties compared to conventional materials.

Cooperative International Simulations with McStas
V. Lynch, M. Hagen, E. Farhi, P. Willendrup, K. Lefmann

McStas is a neutron ray-trace simulation package that simulates neutron scattering instruments. Its developers, from Riso National Laboratory in Denmark and the Institut Laue Langevin in France, are collaborating with beamline scientists from the Spallation Neutron Source and the Neutron Science TeraGrid developers with the goal of producing a McStas simulation portal with access to distributed computing resources on the TeraGrid. Other goals of this collaboration are improved visualization, standardized NeXus output, improved performance, more sample kernels, event mode and histogram interfaces, and an analysis interface.

A New Domain Decomposition Technique for TeraGrid Simulations
L. Grinberg, B. Toonen, N. Karonis, G. Karniadakis

We present a new scalable approach for simulating large computational mechanics problems on the TeraGrid and beyond. Specifically, we consider 3D simulation of blood flow in the human arterial tree using the spectral/hp element method, which provides high-order accuracy while treating the geometric complexity effectively. However, solution of the dense linear systems involved in the entire tree is extremely expensive and does not scale well in cross-site simulations. To this end, we employ a multi-layer hierarchical approach whereby the innermost layers are tightly coupled whereas the outer layers are loosely coupled. This leads to better conditioned linear systems and reduced communication overhead costs. We use MPICH-G2 and the recently developed MPIg to implement this new composition approach and we demonstrate its effectiveness for a smaller prototype problem.

Science Gateways: Progress using the Clarens Toolkit on TeraGrid
J. Bunn, C. Steenberg, F. Kahn, I. Legrand, H. Newman

We describe how we have used the Clarens Grid Portal Toolkit to develop powerful application and browser-level interfaces to ROOT and Pythia - applications which are used widely in the HEP Community. The Clarens Toolkit is a codebase that was initially developed under the auspices of the Grid Analysis Environment project at Caltech, with the goal of enabling users engaged in physics analysis to bring the full power of the Grid to their desktops, while at the same time not altering the look, feel and interface of the chosen analysis tool. By wrapping existing applications, and providing a well documented wrapper API, clients are able to exchange commands, data and results using standard protocols that include XML-RPC, HTTP and HTTPS. In particular, we have implemented a wrapper to the Pythia particle collision simulation code, and developed an encapsulated form of the ROOT environment, called a ROOTlet, that allows convenient access to the power of the TeraGrid for HEP simulation and analysis tasks. The paper describes these developments, and plans for the future.
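As a rough illustration of how commands and results travel over XML-RPC, the sketch below marshals a call and a reply with Python's standard `xmlrpc.client` module. The method name `wrapper.run_simulation`, its arguments, and the reply fields are hypothetical and chosen only for illustration; they are not taken from the Clarens wrapper API, which the abstract does not specify.

```python
import xmlrpc.client

# Hypothetical method name and arguments (not the Clarens API).
# dumps() serializes a tuple of parameters into an XML-RPC request body.
request_xml = xmlrpc.client.dumps(
    ("collision_run", 100),
    methodname="wrapper.run_simulation",
)

# The receiving side decodes the same XML back into parameters and method name.
params, method = xmlrpc.client.loads(request_xml)

# A reply is marshalled the same way, flagged as a method response.
response_xml = xmlrpc.client.dumps(
    ({"status": "ok", "events": 100},),
    methodresponse=True,
)
(result,), _ = xmlrpc.client.loads(response_xml)

print(method, params)
print(result)
```

In a real deployment the request body would be posted over HTTP or HTTPS to the service endpoint; the marshalling step shown here is the same either way.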

TeraGrid's GRAM Auditing & Accounting, & its Integration with the LEAD Science Gateway
S. Martin, P. Lane, I. Foster, M. Christie

Science Gateways have been proposed as a means of lowering the barrier to scientists and their applications using TeraGrid resources. A Science Gateway provides an application- or domain-specific interface that a scientist can easily understand; under the covers, its implementation uses Web Services interfaces (including those provided by the Globus Toolkit) to access computers, storage, and other TeraGrid resources. A single gateway may have thousands of users who may collectively generate many thousands of file transfers and job submissions. Thus, efficiency and scalability are paramount. The GRAM service provides secure, scalable, efficient access to remote computing resources, but some auditing enhancements are needed to allow gateway users to be multiplexed over a common/service credential. We describe how auditing information is produced by Globus Toolkit services and integrated with local accounting systems for use by TeraGrid's resource providers. Use cases focus on the GRAM service, but the design is meant to be extensible for any grid service. The audit system can be (1) leveraged by Grid Service Providers to provide GRAM clients new options for secure access to service audit information in scalable and efficient ways, and (2) integrated with a Grid's accounting system to provide Web Service access via OGSA-DAI to usage information. We also describe how the LEAD Gateway has integrated GRAM auditing capabilities to provide auditing and accounting information for workflows of jobs submitted by users of the LEAD Portal.

Configuration Life-Cycle Management on the TeraGrid
T. Leggett, C. Leuninghoener, N. Desai

Modern computer environments are constantly changing. Experience has shown that using a comprehensive configuration management system has many benefits and can help keep this constant change under control. Configuration management tools have continued to make great progress in the last several years, and the University of Chicago/Argonne National Laboratory TeraGrid cluster administrators have identified many areas throughout the configuration life-cycle that have been positively affected by their migration to Bcfg2 as their configuration management tool. This paper documents their experiences.

Survey of TeraGrid Job Distribution: Toward Specialized Serial Machines as TeraGrid Resources
A. Gopu, R. Repasky, S. McCaulay

As we proceed towards the age of petascale computing, it is important to be aware that even today more than half of national cyberinfrastructure users are serial users who run single processor code; moreover, coarse-grained parallel applications do not necessarily benefit from a high-speed low-latency interconnect. While a majority of compute resources on the TeraGrid today are massively parallel machines with high-speed low-latency interconnects like Myrinet or Infiniband, optimized to run large fine-grained parallel applications that use hundreds of processors/cores in parallel, usage patterns indicate that there is still considerable demand that could be just as effectively met by large computational resources with no special high-speed or low-latency interconnects. Research allocations involving large serial applications or coarse-grained parallel applications could be allocated to these machines, thus possibly leading to decreased wait times for massively parallel and large serial jobs. This change in focus would also lower the financial barrier for potential new resource providers to the national cyberinfrastructure, by allowing them to reallocate funds from the interconnect to additional computational capacity.

Modularized Parallel Neutron Instrument Simulation
M. Chen, J. Cobb, M. Hagen, S. Miller, V. Lynch

In order to build a bridge between the TeraGrid (TG), a national scale cyberinfrastructure resource, and neutron science, the Neutron Science TeraGrid Gateway (NSTG) is focused on introducing productive HPC usage to the neutron science community, primarily the Spallation Neutron Source (SNS) at Oak Ridge National Laboratory (ORNL). Monte Carlo simulations are used as a powerful tool for instrument design and optimization at SNS. One of the successful efforts of a collaboration team composed of NSTG HPC experts and SNS instrument scientists is the development of a software facility named PSoNI, Parallelizing Simulations of Neutron Instruments. Parallelizing the traditional serial instrument simulation on TeraGrid resources, PSoNI quickly computes full instrument simulation at sufficient statistical levels in instrument design. Following the successful commissioning of SNS, three of the five commissioned instruments in the SNS target station will be available to initial users by the end of 2007. Advanced instrument study, proposal feasibility evaluation, and experiment planning are on the immediate schedule of SNS, which pose further requirements such as flexibility and high runtime efficiency on fast instrument simulation. PSoNI has been redesigned to meet the new challenges and a preliminary version has been developed on TeraGrid. This paper explores the motivation and goals of the new design, and the improved software structure. Further, it describes the realized new features seen from MPI parallelized McStas running high resolution design simulations of the SEQUOIA and BSS instruments at SNS. A discussion regarding future work, which is targeted to do fast simulation for automated experiment adjustment and comparing models to data in analysis, is also presented.

VisPort: Web-Based Access to Community-Specific Visualization Functionality
M. Baker, R. Heiland, E. Bachta, M. Das

The VisPort visualization portal is an experiment in providing Web-based access to visualization functionality from any place and at any time. VisPort adopts a service-oriented architecture to encapsulate visualization functionality and to support remote access. Users employ browser-based client applications to choose data and services, set parameters, and launch visualization jobs. Visualization products, typically images or movies, are viewed in the user's standard Web browser. VisPort emphasizes visualization solutions customized for specific application communities. Finally, VisPort relies heavily on XML, and introduces the notion of visualization informatics: the formalization and specialization of information related to the process and products of visualization.

A Scalable Approach to Deploying and Managing Workspaces
R. Bradshaw, N. Desai, T. Freeman, K. Keahey

The use of virtualization in Grid computing has seen a lot of interest lately. However, while much effort has been expended on developing the capabilities of Virtual Machine Monitors (VMMs) and associated tools and services, relatively little has been done to investigate the requirements underlying the scalable production, deployment, and management of VM images. At the same time, a clear understanding of requirements and capabilities in this area is critical to making progress in exploring the applications of virtualization. In this paper, we investigate the issues and propose some of the solutions relevant to this question.

Integrating LEAD Research in Education
E. Meyers, H. Gadde, J. Kurdzo, T. Daley, J. Vogt

Linked Environments for Atmospheric Discovery (LEAD) is making meteorological data, forecast models, and analysis and visualization tools available to anyone who wants to interactively explore the weather as it evolves. The LEAD education and outreach initiative is aimed at bringing new capabilities into the classroom from the middle school level to graduate education and beyond. One of the principal goals of LEAD is to democratize the availability of advanced weather technologies for research and education. The degree of democratization is tied to the growth of student knowledge and skills, and is correlated with education level. This is necessary not only to accommodate differences in knowledge and skills, but also to ensure that the teachable moment is not lost. Undergraduates will have the opportunity to query observation data and model output, explore and discover relationships through concept mapping using an ontology service, select domains of interest based on current weather, and employ an experiment builder within the LEAD portal as an interface to configure and launch the WRF model, monitor the workflow, and visualize results using Unidata's Integrated Data Viewer (IDV), whether it be on a local server or across the TeraGrid. Such a robust and comprehensive suite of tools and services can create new paradigms for embedding students in an authentic, contextualized environment where the knowledge domain is an extension, yet integral supplement, to the classroom experience. The presentation will focus on the development and refinement of LEAD-to-LEARN modules and on collaborative efforts between the LEAD Education and Outreach thrust and the National Center for Supercomputing Applications Cybereducation group, including the development of a basic version of IDV.

A project-centric approach to cyberinfrastructure education
D. Rainey, S. Faulkner, L. Craddock, S. Crammer, B. Tretola

We have developed an interdisciplinary bioinformatics course focused on preparing the future scientific workforce. Central to the course is a project-centric teaching paradigm to engage students in applying the concepts of cyberinfrastructure through the integration of the disciplines of biology, computer science, mathematics, and statistics in the field of bioinformatics. High school and college teachers and their students were introduced to the concepts of cyberinfrastructure (CI) through the incorporation of genomics software tools and data. The cornerstone of the project-centric approach was the development and implementation of educational modules centered on applying a transdisciplinary approach to specific and typical challenges that are faced by current scientists in the area of pathosystems biology (host-pathogen-environment interactions). The course modules were further modified by the institutions deploying the course to tailor to specific needs and objectives and to fit the education level and background of the trainees. We report here the first implementation of the CI course and a summary of our initial observations to aid others in implementing similar courses. Specifically, we discuss some of the materials that were developed, some of the pedagogical considerations important to course implementation and communication requirements needed in the establishment of a virtual community. Additional information and a link to our activities can be found at http://ci.vbi.vt.edu/CITEAM.

SORCERESS: A Scalable, Extensible Real-Time Status Service for TeraGrid Condor Resources
U. Topkara, C. Song, S. Park, J. Woo, P. Smith

Grids serve the growing computational needs of ambitious scientific computing projects in many fields, such as earth sciences and biology. The loosely coupled management of the resources that span many independent institutions enables the growth of computational grids beyond the capabilities of the participating institutions. Self-maintenance features and freedom from a central control facility empower the grids to thrive with their very large sizes. In addition, the users are abstracted away from the inhibiting complexities of dealing with logistic details of the grid by a transparent aggregation of distributed resources. On the other hand, the users are constrained to be content with local views of grid resources (i.e., their institutional boundaries). This limitation of the users' horizons becomes the weak link in the usability of grid systems. We believe that grid systems will benefit from a facility that provides the users with the ability to monitor the global grid activity and to access the archives of past activity. This paper presents SORCERESS, a grid service for the TeraGrid Condor resources. SORCERESS provides real-time information about the global grid activity to enhance the grid usability and overall experience of the grid user. Access to the real-time status information of the grid will enable users and administrators to make intuitive cost evaluations, whereas the archives of this meta-data will provide developers and grid researchers invaluable information to improve grid service quality, and to develop educational tools for grid usage.

SORCERESS monitors the real-time Condor activity beyond institutional boundaries and archives the collected meta-data stream for future use. Our design does not require any changes to the existing middleware at participating remote resource pools, and uses sensor processes that can be run with normal user privileges. We present the details of our design and implementation of SORCERESS, along with sample statistics from the TeraGrid Condor usage. We also demonstrate an application for user-friendly visualization of real-time Condor status.

Visualizing electron microscope 3D reconstructions using TeraGrid resources
C. Gilpin, K. Gaither

A previous study by the current authors described the first use of TeraGrid resources to investigate 3D electron microscope reconstructions using a visualization cluster. This proof of concept study enabled data sets too large for a workstation to be modeled, rendered and manipulated in real-time using a remote connection. It is clear that data sets obtained from electron microscopes are not only large in size but present considerable difficulties for visualization due to poor signal-to-noise ratio and little change in image intensity between adjacent but unrelated structures. We are continuing to test a variety of software packages which can adequately handle our large data sets. In addition we have embarked on the difficult task of feature detection.

Enabling Science through the TeraGrid Visualization Gateway
M. Dahan, J. Insley, M. Papka, T. Uram, K. Gaither

Visualization and data analysis are an essential part of the scientific discovery process. Many researchers, however, do not have access to the resources required to effectively visualize large datasets. This problem has been addressed by the development of the TeraGrid Visualization Gateway, which provides simplified access to resources, both hardware and software, for a broad population of users. The motivation for developing the Gateway is discussed, along with the goals that it aims to achieve. Several usage scenarios are presented to illustrate the intended use of the Gateway. Also described are current and future modes of user access to the Gateway, as well as available services, and additional services under development.

Large Scale Simulations of Nanoelectronic devices with NEMO3-D on the TeraGrid
H. Bae, S. Clark, G. Klimeck, S. Lee, M. Naumov

This paper describes recent progress in large scale numerical simulations for computational nano-electronics using the NEMO3-D package. NEMO3-D is a parallel analysis tool for nano-electronic devices such as quantum dots. The atomistic model used in NEMO3-D leads to large scale computations in two main phases: strain and electronic structure. This paper focuses primarily on the electronic structure phase of the computations. The eigenvalue problem associated with the Hamiltonian matrix is challenging for a number of reasons: (i) the problem is very large, with 100 million to one billion unknowns; (ii) the desired eigenvalues (along with the associated eigenvectors) lie in the interior of the spectrum; and (iii) the eigenvalues are often degenerate. New results on the performance and scalability of NEMO3-D are presented on advanced parallel architectures, including TeraGrid resources. Results presented here were obtained with runs on up to 192 processors, for systems with 40 million atoms. We also report on on-going work to incorporate new advanced algorithms into NEMO3-D. We describe how the NEMO3-D code has been linked to the TeraGrid through the NanoHub.

SCEC Earthworks Science Gateway: Interactive Configuration and Automated Execution of Earthquake Simulations on the TeraGrid
P. Maechling, J. Muench, H. Francoeur, D. Okaya, Y. Cui

The SCEC Earthworks Science Gateway is a portal-based scientific workflow system designed to help members of the SCEC geosciences community perform computationally-intensive, geophysical research using TeraGrid resources. The SCEC Earthworks Science Gateway allows users to configure and execute earthquake wave propagation simulations using well-validated geophysical models and high performance simulation software. The Earthworks system generates a set of data products including surface seismograms and ground motion maps. Users access the SCEC Earthworks system through a web-based portal interface. Users can configure, submit, and monitor wave propagation simulations and they can also search, discover, and access the resulting data sets. They can also submit verification and validation simulations to ensure new versions of the Geoscientific codes work properly. All steps in the wave propagation simulations including mesh generation, wave propagation, and post processing are run using a scientific workflow system based on the Virtual Data System (VDS), the Pegasus meta-scheduler system, and the Globus toolkit. Long term data storage is provided by the Storage Resource Broker (SRB).

Modular Information Provider: A Grid-Independent Approach
E. Shook, S. Wang, R. Briggs, A. Padmanabhan, T. Hansen

Collecting and organizing information from resources distributed across the Grid is a fundamental challenge. Information provider solutions address this challenge, but require significant effort to develop, deploy, and support. A Grid-independent approach can reduce the significant cost associated with information providers by reducing duplicated effort. We propose the Modular Information Provider (MIP) which utilizes a Grid-independent framework to support re-usability. The Grid-independent framework contains several components that utilize an XML data-model to support multiple information modeling schemas and languages. Additional MIP components interact with the framework to support the formatted output and collection of information. Experiments conducted on the Open Science Grid and Australian Partnership for Advanced Computing grid have demonstrated that MIP's Grid-independent approach can easily adapt to different software stacks and the heterogeneous configuration of several sites on a grid.

Building a Grid Portal for TeraGrid's Big Red
M. Nacar, J. Choi, M. Pierce, G. Fox

We describe the Big Red Portal, which builds on the Open Grid Computing Environment (OGCE) portal software. In addition to standard OGCE capabilities, this portal includes MEME job submission and job dashboard portlets that are built using OGCE and related portlet libraries. To simplify the development of such portlets in the future, we introduce an XML tag library approach that encapsulates common Grid operations for rapid development.

MyCluster Ensemble Manager: Ensemble Jobs in the TeraGrid User Portal
M. Dahan, E. Roberts, E. Walker, J. Boisseau

An ensemble allows users to execute many computational tasks with different inputs—typically input data and/or parameter files, but potentially even applications—and thereby explore a wide range of initial conditions, analyze a large number of different data sets, or explore a broad parameter space. MyCluster is a software technology that enables users to execute ensembles on the TeraGrid with high throughput. MyCluster creates a virtual cluster of nodes from different systems for use by the ensemble, leveraging the syntax and capabilities of various resource management systems. This paper describes current efforts and future plans for providing a simple, yet powerful web interface for using MyCluster's provisioning capabilities to simplify the execution of ensembles across TeraGrid. The MyCluster Ensemble Manager, which is being developed as part of the TeraGrid User Portal, will make the availability of terascale distributed ensemble computing easier and thus useful to a broader research community.
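To make the idea of an ensemble concrete, the following minimal sketch expands a set of parameter choices into the list of independent tasks that would then be scheduled. It is not MyCluster's interface; the function `expand_ensemble` and the parameter names are hypothetical and serve only to show how a parameter space turns into many runs of the same application.

```python
from itertools import product


def expand_ensemble(parameters):
    """Return one dict per task, covering every combination of parameter values."""
    names = sorted(parameters)
    value_lists = [parameters[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Hypothetical parameter space: 3 temperatures x 2 random seeds = 6 tasks.
tasks = expand_ensemble({"temperature": [280, 300, 320], "seed": [1, 2]})

for index, task in enumerate(tasks):
    print(index, task)
```

Each resulting dictionary describes one task with its own inputs, which is the unit of work an ensemble manager would hand to a node of the virtual cluster.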

Wide Area Filesystem Performance using Lustre on the TeraGrid
S. Simms, G. Pike, D. Balog

Today's scientific applications demand computational resources that can only be provided by parallel clusters of computers. Storage subsystems have responded to the increased demand for high-throughput disk access by moving to network attached storage. Emerging Cyberinfrastructure strategies are leading to geographically distributed computing resources such as the National Science Foundation's TeraGrid. One feature of the TeraGrid is a dedicated national network with WAN bandwidth on the same scale as machine room bandwidth. A natural next step for storage is to export file systems across wide area networks to be available on diverse resources. In this paper we detail our testing with the Lustre file system across the TeraGrid network. On a single 10 Gbps WAN link we achieved single host performance approaching 700 MB/s for single file writes and 1 GB/s for two simultaneous file writes with minimal tuning.
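For a sense of scale, the reported write rates can be compared with the raw capacity of the link. The short calculation below assumes decimal units (1 Gbps = 125 MB/s) and ignores protocol overhead; since the abstract says single file writes were "approaching" 700 MB/s, the first figure is an approximate upper bound rather than a measured value.

```python
def link_utilization(throughput_mb_s, link_gbps):
    """Fraction of raw link capacity used, assuming 1 Gbps = 125 MB/s."""
    capacity_mb_s = link_gbps * 1000 / 8
    return throughput_mb_s / capacity_mb_s


# Figures quoted in the abstract, on a single 10 Gbps WAN link.
for label, mb_s in [("single file write", 700), ("two simultaneous writes", 1000)]:
    print(f"{label}: about {link_utilization(mb_s, 10):.0%} of raw capacity")
```

Under these assumptions a 10 Gbps link carries at most 1250 MB/s, so the two cases correspond to roughly 56% and 80% of the raw capacity.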

Wednesday June 6

Adapting an Application for Use in a Condor Based Parameter Sweep on TeraGrid
P. Cheeseman, M. Deem, D. Earl, W. Whitson

This paper describes the method by which a single case application was adapted for use as the core of a high throughput parameter sweep. Performance characteristics of the application, which were considered in planning the adaptation, are first described. The steps towards adapting the code to high throughput batch processing systems are then discussed with an emphasis on adaptation to Condor as deployed in the Purdue TeraGrid resources. A summary of the change in performance characteristics of the application is presented followed by concluding remarks.

CyberBridges: A Model Collaborative Educational Infrastructure for e-Science
H. Alvarez, J. Ibarra, K. Kumar, C. Zhang, D. Cox

The CyberBridges pilot project was an innovative model for creating a new generation of scientists and engineers who integrate cyberinfrastructure into their educational, professional, and creative activities. CyberBridges augments graduate student education to include a foundation of understanding in Advanced Networking and Grid Infrastructure for High Performance Computing, and thereby bridges the divide between the information technology community and science and engineering disciplines. CyberBridges is increasing the rate of discovery for science and engineering faculty by empowering them with cyberinfrastructure, fostering inter-disciplinary research collaborations, improving minority graduate education, and institutionalizing this change process. We demonstrate the effectiveness of CyberBridges by providing four case studies with graduate students of physics, bioinformatics, chemistry, and biomedical engineering.

Science Gateways on the TeraGrid
C. Catlett, S. Goasguen, N. Wilkins-Diehr, S. Martin, D. Middleton

Increasingly, the scientific community has been using web portals and desktop applications to organize their work. The TeraGrid team determined that it would be important to create a set of capabilities that would allow TeraGrid services and resources to be integrated, potentially in a transparent way, with these scientific computing environments. This paper outlines the Science Gateways program and provides an overview of key lessons learned in developing mechanisms to allow for such integration.

Software Provider Forum

Interactive briefings and discussions between TeraGrid GIG and RP staff and Coordinated TeraGrid Software and Service "CTSS" providers; and between gateway developers and the providers of gateway development software.

Bridging Teaching, Learning, and Education in Grid-enabled Bioinformatics
G. Rendon, J. Mashi, S. McLean, M. Miller, N. Nicely

This paper describes the present and future states of the Biology Student Workbench project at the National Center for Supercomputing Applications, NCSA, and its relationship to the Next Generation Biology Workbench and the Bioportal projects.

Using Adaptive Fault Tolerance to Improve Application Robustness on the TeraGrid
Y. Li, Z. Lan

Application robustness becomes a major concern with the continued scaling of high performance computing (HPC). In a recent study [8], we developed an adaptive fault management scheme called FT-Pro for improving application robustness by combining the merits of proactive process migration and reactive checkpointing. In this paper, we push forward this study by integrating FT-Pro with a production-level MPI package and investigating its effectiveness across a number of real-world parallel applications on the TeraGrid. Extensive experiments are conducted on the IA32 cluster at TeraGrid/ANL, comparing FT-Pro against periodic checkpointing under a wide range of system parameters and failure behaviors. These preliminary experiments show the potential of using adaptive fault tolerance to improve application performance in the presence of failures.

FermiGrid
D. Yocum, E. Berman, K. Chadwick, G. Garzoglio, P. Canal

As one of the founding members of the Open Science Grid Consortium (OSG), Fermilab enables coherent access to its production resources through the Grid infrastructure system called FermiGrid. This system successfully provides for centrally managed grid services, stakeholder interoperability, development of OSG Interfaces for Fermilab, and an interface to the Fermilab dCache system. FermiGrid supported virtual organizations (VOs) include high energy physics experiments (USCMS, MINOS, D0, CDF, ILC), astrophysics experiments (SDSS, Auger, DES), biology experiments (GADU, Nanohub) and educational activities.

Problem-Solving and Critical Thinking Skills of Experts and Their Importance in Computer Science Education
D. Bushey, S. Stephenson

A documented shortage of technical leadership and top-tier performers in the computational sciences threatens the technological edge, security, and economic well being of the nation. This study posits that these desirable, high-level performers have measurable and observable cognitive traits beyond their subject matter expertise. Past literature identifies important skills and traits in the areas of problem-solving, expertise, and creative and critical thinking important for success as a professional in industry. The study reported here attempts to measure those skills among college freshmen, college seniors, and expert populations. The results suggest a shift in undergraduate curriculum emphasizing the cognitive skills identified.

Scientific Impact of Grid Services on Air Quality Research
V. Sliva

The Community Multiscale Air Quality (CMAQ) modeling system is a set of software components developed by a research partnership between the National Oceanic & Atmospheric Administration (NOAA) and the US Environmental Protection Agency (EPA). The goals of the CMAQ model are: 1) To improve the ability to evaluate the impact of air quality management practices for multiple pollutants at multiple scales, and 2) To provide scientists the ability to better probe, understand, and simulate chemical and physical interactions in the atmosphere. In this paper we present the ideas, and the scientific impact of taking this research to the next level: Maximize productivity and provide seamless collaboration among scientists of different organizations involved in CMAQ by the use of secure Grid Services provided by the Globus Toolkit.

The Case for Risk Based Information Assurance
A. Shelmire, J. Rome, J. Marsteller

Risk Assessment is a necessary component of any operation. Adopting sound risk assessment methods is particularly valuable in the Information Security realm. Many methods of "risk assessment" have been developed by various agencies, including NIST's SP800 method, CERT's OCTAVE and SSA methods, and methods applied by private organizations. During 2006 the TeraGrid performed a Risk Assessment based upon the NIST standard. This paper takes lessons learned from that process, from other risk assessments, and from the industry as a whole to propose a general methodology for a risk-assessment driven Information Security Management LifeCycle.

Computational Literacy: Challenges in K-12 Education
C. Thompson, D. Ching, P. Gray, B. Helland, T. Meade

The Computational Literacy project is a research study that aims to measure how the use of computational modelling can enhance learning in diverse student populations. This paper introduces the motivations behind teaching science through computational science, the implementations of highly scaffolded computational models and lessons learned from the experience of implementing these materials in a large-scale educational research study.

Interweaving Data and Computation for End-to-End Environmental Exploration on the TeraGrid
L. Zhao, C. Song, V. Merwade, Y. Kim, R. Kalyanam

This paper presents the design and implementation of a cyberinfrastructure for End-to-End Environmental Exploration (C4E4). The C4E4 framework addresses the need for an integrated data/computation platform for studying broad environmental impacts by combining heterogeneous data resources with state-of-the-art modeling and visualization tools. With Purdue being a TeraGrid Resource Provider, C4E4 builds on top of the Purdue TeraGrid data management system and Grid resources, and integrates them through a service-oriented workflow system. It allows researchers to construct environmental workflows for data discovery, access, transformation, modeling, and visualization. Using the C4E4 framework, we have implemented an end-to-end SWAT (Soil and Water Assessment Tool) simulation and analysis workflow that connects our TeraGrid data and computation resources. It enables researchers to conduct comprehensive studies on the impact of land management practices in the St. Joseph watershed using data from various sources in hydrologic, water quality, atmospheric, and other related disciplines.

GT4 GRAM: A Functionality and Performance Study
M. Feller, I. Foster, S. Martin

The Globus Toolkit's pre-Web Services GRAM service ("GRAM2") has been widely deployed on grids around the world for many years. Recent work has produced a new, Web Services-based GRAM service ("GRAM4"). We describe and compare the functionality and performance of the GRAM2 and GRAM4 job execution services included in Globus Toolkit version 4 (GT4). GRAM4 provides significant improvements in functionality and scalability over GRAM2 in many areas. GRAM4 is faster in the case of many concurrent submissions, but slower for sequential submissions and when file staging is involved. (Optimizations to address the latter cases are in progress.) This information should be useful when considering an upgrade from GRAM2 to GRAM4, and when comparing GRAM against other job execution services.

Advances in the MSI Community and the MSI CyberInfrastructure Empowerment Coalition (MSI-CIEC)
R. Alo, K. Barnes, D. Baxter, G. Fox, A. Kusliks, A. Ramirez

Building on a very successful NSF CI-TEAM Demonstration project, MSI-CI2 [1, 2, and 11], and the earlier Advanced Networking for Minority Serving Institutions [3], the Minority-Serving Institutions CyberInfrastructure Empowerment Coalition (MSI-CIEC) is laying the foundation for a sustainable and scalable initiative to meaningfully engage MSIs in CyberInfrastructure and the TeraGrid. We discuss some of the challenges and opportunities for MSIs with the emergence of CI for science and engineering. We will briefly illustrate the potential of MSI-CIEC by presenting two prominent success stories within the context of the overall MSI community. We then discuss the approach and current activities of MSI-CIEC including those with the TeraGrid.

A Grid-Computing Framework for Quadratic Programming under Uncertainty
A. Kulkarni, A. Rossi, J. Alameda, U. Shanbhag

Mathematical programming under uncertainty concerns the optimal management of resources when there is stochasticity in the data. Specifically, this requires making a (first-period) decision before the realization of uncertainty. In addition, recourse-based formulations respond to the randomness through (second-period) recourse decisions. Such formulations allow a decomposition into a master-worker framework. We consider the solution of such problems by developing grid-computing extensions of two algorithms for a class of problems with quadratic objective functions and linear constraints. A description of the framework for implementing these algorithms on the TeraGrid is provided. Some preliminary computational experience on a serial platform is reported.
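
For readers unfamiliar with recourse models, the following is a sketch of one common textbook form of a two-stage quadratic program with recourse. It is not taken from the paper; the symbols Q, c, A, b, D, q, W, T, h and the random vector \xi are illustrative assumptions.

```latex
\begin{align}
\min_{x}\quad & \tfrac{1}{2} x^{\top} Q x + c^{\top} x
  + \mathbb{E}_{\xi}\!\left[ \mathcal{Q}(x,\xi) \right]
  \quad \text{s.t.}\quad A x = b,\; x \ge 0, \\
\mathcal{Q}(x,\xi) \;=\; \min_{y}\quad & \tfrac{1}{2} y^{\top} D(\xi)\, y + q(\xi)^{\top} y
  \quad \text{s.t.}\quad W y = h(\xi) - T(\xi)\, x,\; y \ge 0 .
\end{align}
```

Under the further assumption that \xi takes finitely many scenario values, the expectation becomes a weighted sum, and for a fixed first-period decision x each scenario's recourse problem can be solved independently. That independence is what makes a master-worker decomposition natural: a master updates x while workers solve the per-scenario subproblems.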

Accessing TeraGrid Services with Campuses Local Credentials
K. Peacock, B. Johnson, S. Goasguen

Accessing TeraGrid (TG) Resource Providers (RP) requires users to have an X.509 certificate issued by a Certificate Authority (CA) trusted by the TG RPs. This requirement often puts a burden on the users, as some CAs that vouch for the identity of the users require face to face meetings. In this paper we present the different steps needed to put in place an authentication system that will let users access TG with their home institution credentials. Our work, while still in progress, shows that by using Shibboleth and GridshibCA in connection with the institution's identity management, a campus can issue grid proxies for its members. This system integration would allow users to easily access TG services, and with the potential use of additional attributes it could allow TG RPs to do more advanced authorization.

SDSC TeacherTECH Evolution
A. Mason, D. Baxter

The SDSC TeacherTECH program, ranging from technology training to the use of computer based scientific and mathematics tools, provides a comprehensive educator resource that lends technology support to teachers at varying levels of technical skill. The TeacherTECH program of continued learning opportunities and follow up through a variety of workshops may prove to be more scalable than conventional approaches. The focus is on expressed teacher needs. SDSC and TeraGrid have launched a program to expand TeacherTECH resources and programs at Resource Provider sites, with the ultimate goal of expanding beyond the boundaries of these sites to include more EOT partners nationally. By launching the national expansion through established TeraGrid partners, all of whom have experience creating and implementing EOT outreach activities, the chance for success is high.

A Community Climate System Modeling Portal for the TeraGrid
A. Basumallik, L. Zhao, C. Song, R. Sriver, M. Huber

The Community Climate System Model (CCSM) is a coupled climate modelling framework for simulating the earth's climate system that allows researchers to conduct fundamental research into the earth's past, present and future climate states. While the model is well documented, the learning curve for new users is steep and porting the model to a new platform can be difficult and time-consuming. In this paper, we describe an effort to make the CCSM framework more easily accessible and usable to a large class of users by means of a portal. We present a TeraGrid based web portal that allows TeraGrid users to run CCSM simulations on TeraGrid resources without having to individually port and install CCSM and without encountering lower level details such as specifics of batch queues and library locations. We describe the back-end CCSM installation, the front-end user interface for the CCSM portal and present an overview of a post-processing framework for the CCSM simulation results.

Implementing a Reliable File Transfer Service Cluster
J. Basney, P. Duda

As grids move from prototypes to testbeds to production infrastructure, grid resource providers are faced with the challenge of delivering reliable services to enable productive use of available resources. On high performance, distributed grids such as the TeraGrid, moving large data sets to, from, and between supercomputing resources requires reliable data management services. The Reliable File Transfer (RFT) Service in the Globus Toolkit Version 4 (GT4) provides this capability on the TeraGrid and other grids. We present modifications to RFT to support clustering to achieve high availability in the presence of server failures, based on a standard Web service tiered architecture, leveraging the capabilities of modern database systems.

Cyberinfrastructure for Remote Sensing of Ice Sheets
L. Hayden, G. Fox, P. Gogineni

Science and engineering research and education are foundational drivers of cyberinfrastructure. Understanding the relationship between sea level rise and melting ice sheets is the application domain of this project. It is an issue of global importance, especially for the populations living in coastal regions. Scientists are in need of computationally intensive tools and models that will help them measure and predict the response of ice sheets to climate change. To address the Cyberinfrastructure challenges presented immediately by CReSIS and the Polar science community in general, the Cyberinfrastructure Center for Polar Science (CICPS), with experts in Polar Science, Remote Sensing and Cyberinfrastructure, has been established. This center includes the University of Kansas, the lead CReSIS institution; Indiana University, which is internationally known for its broad expertise in research and infrastructure for eScience; and ECSU, a founding member of CReSIS with a center of excellence in remote sensing. CICPS includes CReSIS institutions as collaborators and will drive PolarGrid to meet their goals while using the best-known technologies. CICPS, founded with the vision that Cyberinfrastructure will have a profound impact on polar science, is committed to the effort needed to build the portal, workflow and Grid (Web) services that are required to make PolarGrid real. This paper describes the set of CICPS projects that are being implemented and proposed. The first of these projects is an NSF CI-TEAM project (PI: Hayden, Co-PIs: Fox and Gogineni), "Cyberinfrastructure for Remote Sensing of Ice Sheets," which establishes a virtual classroom environment and a CReSIS Science Gateway for TeraGrid, working with IU, MSI-CIEC (Minority-Serving Institution Cyberinfrastructure Empowerment Coalition) and TeraGrid.
The second project, called PolarGrid, will, as proposed, deploy the cyberinfrastructure (CI) that provides the polar community with a state-of-the-art computing facility to process the large volumes of data to be collected by CReSIS field operations and to support large-scale ice-sheet models. PolarGrid will follow modern open data access standards so that raw, processed and simulated data can be archived outside PolarGrid by and for the full science community.

Performance Analysis and Optimization of the Regional Ocean Model System on TeraGrid
Y. Zuo, X. Wu, V. Taylor

The Regional Ocean Modeling System (ROMS) is a large-scale application code that can be configured for any region of the global ocean, ranging from local to basin scales, and is widely used in oceanography and atmospheric sciences. In this paper, we optimize the ROMS application for efficient execution on a grid environment, the TeraGrid. The strategy used to optimize ROMS entails combining multiple communications and overlapping the communication with the computation as much as possible. In particular, the communication function MP_Exchange is rewritten, and several new communication modules are added. To demonstrate the advantages of making this change to the communication function, we focus on optimizing the function step2d of the ROMS code, which dominates the execution time of the code. Experiments are conducted on the TeraGrid resources at NCSA, UC, and SDSC. The experimental results show that the overall application performance is improved by up to 36%.

GridFTP Pipelining
J. Bresnahan, M. Link, R. Kettimuthu, D. Fraser, I. Foster

GridFTP is an exceptionally fast transfer protocol for large volumes of data. Implementations of it are widely deployed and used on well-connected Grid environments such as those of the TeraGrid because of its ability to scale to network speeds. However, when the data is partitioned into many small files instead of few large files, it suffers from lower transfer rates. The latency between the serialized transfer requests of each file directly detracts from the amount of time data pathways are active, thus lowering achieved throughput. Further, when a data pathway is inactive, the TCP window closes, and TCP must go through the slow-start algorithm. The performance penalty can be severe. This situation is known as the “lots of small files” problem. In this paper we introduce a solution to this problem. This solution, called pipelining, allows many transfer requests to be sent to the server before any one completes. Thus, pipelining hides the latency of each transfer request by sending the requests while a data transfer is in progress. We present an implementation and performance study of the pipelining solution.
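
The latency argument above can be illustrated with a back-of-the-envelope model. The sketch below is not from the paper and does not use any GridFTP API: the function names and parameters are hypothetical, and it assumes a fixed round-trip time per request, a constant link rate, and no TCP slow-start penalty (so it understates the cost of the serialized case).

```python
def serialized_time(n_files, file_bytes, bandwidth_bps, rtt_s):
    """Idealized time when each request is issued only after the
    previous transfer completes: every file pays one round trip."""
    per_file_transfer = file_bytes * 8 / bandwidth_bps
    return n_files * (rtt_s + per_file_transfer)


def pipelined_time(n_files, file_bytes, bandwidth_bps, rtt_s):
    """Idealized time when requests are sent ahead of completion:
    only the first round trip is exposed, then data flows back to back."""
    per_file_transfer = file_bytes * 8 / bandwidth_bps
    return rtt_s + n_files * per_file_transfer


if __name__ == "__main__":
    # Hypothetical workload: 10,000 files of 1 MB over a 10 Gbps path
    # with a 60 ms round-trip time.
    args = (10_000, 1_000_000, 10e9, 0.06)
    print(f"serialized: {serialized_time(*args):.2f} s")
    print(f"pipelined:  {pipelined_time(*args):.2f} s")
```

In this simplified model the serialized cost grows with the number of files times the round-trip time, while the pipelined cost pays the round trip once. Real transfers will differ, since the model ignores slow-start, server-side processing, and disk behavior.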

Thursday June 7

A Partnership to Foster Understanding of Bioinformatics in North Carolina's High Schools
A. Vogel, S. McLean

In recent years the National Science Foundation has been asking principal investigators to make additional efforts to share their research with broad audiences, including those at the precollege level. This new expectation can be both exciting and daunting to research scientists. Though passionate about their projects, expert in the scientific content, and eager to share their knowledge, these men and women are typically not trained or experienced as educators at the precollege level. To address successfully the diverse needs of today's elementary, middle, and high school students and their teachers requires an understanding of precollege education ecosystems that can be significantly different from the learning environments provided in university and research settings. In addition, appropriate and sufficient resources, including personnel to develop and carry out these projects, are also needed. These potential barriers can be overcome when research laboratories partner with existing precollege education programs, bringing together complementary expertise and resources. In this paper, we describe such a partnership focused on fostering high school students' understanding of bioinformatics and, more generally, their appreciation of the computer as a tool for scientific research. We also note an innovative extension to such a partnership: the participation of a third collaborating organization, a bioportal project, to facilitate delivery of relevant precollege teacher training and student learning experiences that will further understanding of, and potentially broaden participation in, the TeraGrid. The project described here involves three partners: DESTINY, UNC-Chapel Hill's Traveling Science Learning Program; Dr. Todd Vision's laboratory in UNC-Chapel Hill's Biology Department; and the Renaissance Computing Institute (RENCI). A goal of this partnership is to extend the workforce pipeline into the precollege arena, which is so often overlooked.

Three-Dimensional Geofluid Simulation using Parallel geofe on the TeraGrid Resources
P. Wang, D. Cohen, M. Person, A. Gopu, C. Gable

Parallel geofe, a parallel, three-dimensional paleohydrologic modeling program, was developed from the serial code with the aim of simulating ground water flow on the Atlantic continental shelf in New England within the past two million years. Its implementation and performance on the TeraGrid systems are discussed.

TACC's Graduate Level Scientific Computing Curriculum
B. Armosky, J. Boisseau, W. Barth, K. Gaither, K. Milfeld

The Texas Advanced Computing Center at The University of Texas at Austin has developed a graduate-level scientific computing curriculum to enable scientists and engineers to use advanced computing technologies with maximum effectiveness. Four graduate-level courses comprise the curriculum: a comprehensive and detailed overview of scientific/technical computing, and three more specialized classes covering parallel computing, distributed/grid computing, and visualization and data analysis. Course materials are focused on developing practical skills while providing a solid theoretical foundation, and include presentations, tests, homework assignments, and programming projects. All content is developed using the deep expertise and extensive experience in working with current researchers of TACC staff, who will continuously improve the curriculum with new and revised materials as technologies evolve. The content is available for download and use by other institutions, and the classes will be taught remotely in 2008.

Enabling Knowledge Discovery in a Virtual Universe
J. Gardner, A. Connolly, C. McBride

Virtual observatories will give astronomers easy access to an unprecedented amount of data. Extracting scientific knowledge from these data will increasingly demand both efficient algorithms and the power of parallel computers. Such machines will range in size from small Beowulf clusters to large massively parallel platforms (MPPs) to collections of MPPs distributed across a Grid, such as the NSF TeraGrid facility. Nearly all efficient analyses of large astronomical datasets use trees as their fundamental data structure. Writing efficient tree-based techniques, a task that is time-consuming even on single-processor computers, is exceedingly cumbersome on parallel or grid-distributed resources. We have developed a library, Ntropy, that provides a flexible, extensible, and easy-to-use way of developing tree-based data analysis algorithms for both serial and parallel platforms. Our experience has shown that not only does our library save development time, it also delivers an increase in serial performance. Furthermore, Ntropy makes it easy for an astronomer with little or no parallel programming experience to quickly scale their application to a distributed multiprocessor environment. By minimizing development time for efficient and scalable data analysis, we enable wide-scale knowledge discovery on massive datasets.
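
As a rough illustration of the kind of tree-based analysis the abstract refers to, the sketch below builds a small serial k-d tree and counts the points within a given radius of a position, a basic step in many spatial statistics. It does not use or imitate the Ntropy API; the names `Node`, `build`, and `count_within` are hypothetical, and the code makes no attempt at parallelism.

```python
import random


class Node:
    """One k-d tree node holding a single point and its split axis."""
    __slots__ = ("point", "axis", "left", "right")

    def __init__(self, point, axis, left, right):
        self.point = point
        self.axis = axis
        self.left = left
        self.right = right


def build(points, depth=0):
    """Recursively split on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(ordered[mid], axis,
                build(ordered[:mid], depth + 1),
                build(ordered[mid + 1:], depth + 1))


def count_within(node, center, radius):
    """Count stored points whose distance to center is <= radius,
    skipping subtrees that the search ball cannot reach."""
    if node is None:
        return 0
    d2 = sum((a - b) ** 2 for a, b in zip(node.point, center))
    count = 1 if d2 <= radius * radius else 0
    diff = center[node.axis] - node.point[node.axis]
    if diff - radius <= 0:   # ball extends to or below the split plane
        count += count_within(node.left, center, radius)
    if diff + radius >= 0:   # ball extends to or above the split plane
        count += count_within(node.right, center, radius)
    return count


if __name__ == "__main__":
    rng = random.Random(0)
    pts = [tuple(rng.random() for _ in range(3)) for _ in range(1000)]
    tree = build(pts)
    print(count_within(tree, (0.5, 0.5, 0.5), 0.1))
```

The pruning tests are what make the tree worthwhile: whole subtrees are skipped when the search ball lies entirely on one side of a split plane, so a query typically touches far fewer points than a brute-force scan.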

A Fault Diagnosis and Prognosis Service for TeraGrid Clusters
Z. Lan, P. Gujrati, Y. Li, Z. Zheng, R. Thakur

In this paper, we present an ongoing research effort on developing an automatic fault diagnosis and prognosis service for large-scale computing systems, such as TeraGrid clusters. By leveraging the research on system health monitoring, the proposed service aims at automatically revealing fault patterns from historical data by applying data mining and machine learning techniques. To address key challenges posed by fault diagnosis and prognosis, two integrated techniques are developed: a knowledge base to accumulate empirical and inferred fault patterns from historical data, and a meta-learning mechanism to optimally combine separately learned classifiers for improved detection and prediction accuracy. We also present preliminary studies using failure logs from the BlueGene/L systems at SDSC and ANL.

The TeraGrid Knowledge Base: What's in it for you?
D. Hart, J. Bolte, M. Hrovat, D. Diehl

In simple terms, a primary goal of support is to deliver the information that people need at the point they need it. This involves identifying the information needed, collecting that information, making the information accessible, and maintaining it to keep it relevant and accurate. Properly done, a knowledge base has the potential to be a key component in meeting this need for the TeraGrid. To realize this potential requires the right toolset, the right processes, and the right partnership with the TeraGrid community. The short answer to the question posed is that the TeraGrid Knowledge Base is the right set of tools and processes to provide the information you need to accomplish your task as a customer of the TeraGrid or a support provider. This will be achieved through the efforts of a committed knowledge management team in collaboration with the TeraGrid community. The longer answer follows.

How to Run a Million Jobs in Six Months on the NSF TeraGrid
E. Walker, D. Earl, M. Deem

In June 2006, a team of researchers began submitting workflows across the distributed multi-user clusters on the NSF TeraGrid. The goal of their scientific study was to discover as many new hypothetical zeolite crystalline structures as possible. In just over six months, the researchers succeeded in completing over a quarter of a million workflows, comprising over a million jobs. In all, the team consumed a million computational hours, harnessing over 200 TFlops of distributed resources, to help populate a database of over three million new crystalline structures. This paper describes how this feat was achieved in such a short period of time.

Production Level Scientific Simulation Management on International Federated Grids
S. Zasada, B. Cheney, R. Saksena, J. Suter, P. Coveney

Key to broadening participation in the grid is the provision of easy to use access mechanisms and user interfaces, to allow a wide range of users with different skill sets to access the computational and data resources on offer. The Application Hosting Environment (AHE) is one such middleware tool, hiding much of the complexity of dealing with grid resources from the user, and allowing them to interact with applications rather than machines. The nature of the AHE means that it can be used as a single interface to a wide variety of resources, ranging from those provided at a departmental or institutional level to international federated grids of supercomputers. The number, range and size of resources made available by federating grids make possible scientific investigations that would previously not have been feasible. In this paper we describe how we have deployed the AHE to offer access to federated resources provided by the TeraGrid, UK National Grid Service and EU DEISA grid. We also present three case studies where the AHE has been used to facilitate production level scientific simulation across these federated resources.

Software Provider Forum

Brief Introduction to GridSphere
E. Roberts

An open-source portlet based Web portal.

Portal-based User Registration Service (PURSe), current and future work
M. Christie, R. Ananthakrishnan

The talk provides an overview of PURSe, which provides a customizable solution for automating user registration and credential management, especially for portal-based systems.

Grid Portlets and Services from the Open Grid Computing Environments (OGCE) Collaboration
M. Pierce

We present an overview of the OGCE Grid portlets, Grid client programming libraries, and services that can be used to build science gateways to the TeraGrid.

TeraGrid's GRAM Auditing and Accounting
S. Martin

The GRAM service provides secure, scalable, efficient access to remote computing resources, but some auditing enhancements are needed to meet the security and usability requirements to allow gateway users to be multiplexed over a common/service credential. I'll describe how auditing information is produced by TG's Pre-WS and WS GRAM services and integrated with TeraGrid's central accounting system. Also, I'll present how a gateway (or any GRAM client) can use the OGSA-DAI client to retrieve usage for an individual job.

MyCluster: Building personal clusters on demand
E. Walker

This talk will describe MyCluster, a system for building personal clusters on demand. The system is currently a production service on the NSF TeraGrid. The current production version of MyCluster builds personal Condor clusters for users, via the deployment of semi-autonomous agents at a host cluster site. These semi-autonomous agents are responsible for submitting and managing job proxies through the local scheduler at a site. Job proxies contribute back to the personal cluster when they eventually run, allowing users to submit jobs into an expanding, and shrinking, personal cluster over time. Running jobs in the job proxies also allows us to provide a system call virtualized environment for executing jobs, where additional cluster-wide virtual services are provided. These additional cluster-wide services include a virtual private network, and a wide-area network distributed filesystem. This talk will focus on the long term vision of the MyCluster system, and discuss our efforts in iterating towards this vision.

Batch Queue Predictor (BQP)
D. Nurmi

Most space-sharing parallel computers presently operated by high-performance computing centers use batch-queuing systems to manage processor allocation. In many cases, users wishing to use these batch-queued resources have accounts at multiple sites and have the option of choosing at which site or sites to submit a parallel job. In such a situation, the amount of time a user's job will wait in a batch queue can significantly affect the overall time a user waits from job submission to job completion. The batch queue prediction (BQP) service offers queuing delay predictions for individual jobs.

Pegasus: Managing Large-Scale Computational Workflows on the TeraGrid
E. Deelman

Pegasus bridges the scientific domain and the execution environment by automatically mapping high-level workflow descriptions onto distributed infrastructures such as the TeraGrid, the Open Science Grid, and others. Pegasus automatically manages data generated during workflow execution and captures their provenance information. Pegasus provides robustness, scalability, and reliability through dynamic workflow remapping and through the use of the Condor DAGMan workflow execution engine. Pegasus is used in a variety of scientific applications, including astronomy, biology, earthquake science, gravitational-wave physics, and others. In particular, the Southern California Earthquake Center (SCEC) project has used Pegasus for the past three years to manage over half a million workflow tasks that used over 2.5 CPU years.
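
A workflow of this kind can be viewed as a directed acyclic graph of tasks, where a task may run only after the tasks it depends on have finished. The sketch below shows that idea with the Python standard library alone; it does not use Pegasus or DAGMan interfaces, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks it depends on.
workflow = {
    "extract": set(),
    "simulate_a": {"extract"},
    "simulate_b": {"extract"},
    "merge": {"simulate_a", "simulate_b"},
}


def execution_waves(dependencies):
    """Group tasks into waves; tasks in the same wave could run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError if the graph is not acyclic
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)                 # mark the whole wave as finished
    return waves


waves = execution_waves(workflow)
print(waves)
```

Here the two simulation tasks land in the same wave, which is the point at which a workflow engine could dispatch them to different resources.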

Building Shared Collections with the Storage Resource Broker
R. Moore

Large scientific data collections are being assembled by cyberinfrastructure projects to organize simulation output, observational data, real-time sensor data, and experimental data. The size of the collections may be hundreds of terabytes and tens of millions of files. The files are typically distributed across disk file systems and tape archives located at multiple sites. The files may be replicated for disaster recovery or to improve access performance. Descriptive and provenance metadata are typically associated with each file to support browsing and discovery. Access controls and audit trails may be associated with each file. Examples of how to build such shared collections will be demonstrated through use of the Storage Resource Broker data grid, software middleware that supports data and trust virtualization. Management virtualization will be demonstrated through the next generation of data grid technology, the integrated Rule-Oriented Data System (iRODS).

GridFTP/RFT/tgcp
D. Frasier, R. Madduri

We will describe the tools provided by the Globus Toolkit to perform high-performance data transfer. GridFTP is the default protocol used in Grid environments to provide this capability. RFT adds a reliable, fault-tolerant layer on top of GridFTP. Both services are wrapped into an easy-to-use tool for TeraGrid, tgcp, which can automatically set network parameters to provide better bandwidth and resource utilization.
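
One network parameter commonly tuned for wide-area transfers is the TCP buffer size, which is usually sized to the bandwidth-delay product of the path. The sketch below shows only that arithmetic. It does not claim to reproduce how tgcp chooses its settings, and the link speed and round-trip time are hypothetical.

```python
def tcp_buffer_bytes(bandwidth_bits_per_s: int, rtt_ms: int) -> int:
    """Bandwidth-delay product of a path, in bytes.

    bits/s * ms gives millibits in flight; dividing by 8000 converts
    milliseconds to seconds and bits to bytes in one step.
    """
    return bandwidth_bits_per_s * rtt_ms // 8000


# Hypothetical path: 1 Gbit/s link with a 60 ms round-trip time.
buffer_size = tcp_buffer_bytes(1_000_000_000, 60)
print(buffer_size)
```

A buffer much smaller than this value would leave the sender idle while it waits for acknowledgements, so the link could not be kept full by a single stream.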

Resource Discovery on TeraGrid with MDS4
L. Pearlman

We will describe the Globus Monitoring and Discovery System (MDS4), the MDS4 services deployed on TeraGrid, the information they provide, and the interfaces available to application developers and end-users to provide and query data.

Monitoring TeraGrid Functionality and Performance with Inca
K. Ericson

Inca automates user-level testing and benchmarking of a Grid's services, software, and environment, and can be customized to show Grid usability from the perspective of a particular user or user community. We'll present Inca's current use cases on TeraGrid and discuss how it might be customized to support science gateway and TeraGrid community applications.

Demo Abstracts

Neutron Science TeraGrid Gateway (NSTG) Demonstration
J. Cobb, S. Miller, M. Reuter, V. Lynch, G. Pike

This demonstration will show the production operation of the Neutron Science portal, including its use by instrument scientists and users at the Spallation Neutron Source (SNS). Features include:

  • Portal interface for user ease of use;
  • Account management authentication, authorization, and access control linked to SNS facility access roles;
  • Data archive, search, and exploration;
  • Meta-data capture; experiment workflow and data stewardship from DAS creation through various phases including conversion to standard format (NeXus), data reduction, analysis, and visualization;
  • TeraGrid simulation capability directly from TeraGrid via community user account notion;
  • Support and deployment of community contributed codes for data reduction and analysis;
  • Presentation of live early SNS data; and
  • Use by other experimental user facilities.

TeraDRE: A distributed rendering resource on the TeraGrid
D. Braun, J. Moreland, J. Woo, C. Song, L. Arns, G. Bertoline

The TeraGrid Distributed Render Environment (TeraDRE) is a render farm implementation that has been created as a service on the TeraGrid. The system uses open source Renderman-compliant software, Pixie, as well as Alias Maya for rendering scientific research and complex animations. The system is integrated with the Condor batch scheduling software from Madison, Wisconsin, and users submit jobs through custom software that we have written. We will demonstrate different ways of submitting rendering jobs using the TeraDRE service. We will also showcase a new drag-and-drop user interface and job/user management middleware being developed at Purdue.

Purdue TeraGrid Environmental Science Portals
L. Zhao, A. Basumallik, R. Kalyanam, P. Taezoon, C. Song

The Purdue Environmental Science Portals are TeraGrid-based web portals that enable easy access to Purdue environmental data resources as well as the CCSM (Community Climate System Model) framework for simulating the earth's climate system. The data portal provides near-real-time access to NEXRAD Level II radar data, PTO satellite data (GOES-12, MODIS, AVHRR, Feng Yung), LARS remote sensing data, and CCSM modeling data. The CCSM portal provides an intuitive front-end interface to a back-end CCSM framework that allows TeraGrid users to easily run CCSM simulations on TeraGrid resources, shielding them from model setup and configuration steps that would otherwise be complex and time-consuming.

Using the TeraGrid Condor Resource
B. Whitson, P. Smith, T. Kesler, P. Cheeseman

Condor is a high-throughput computing environment utilizing the power of large collections of distributively owned computing resources. Condor manages workstations and resources automatically. Purdue provides a large Condor pool, consisting of more than 6000 processors, to the TeraGrid science communities. Many of Purdue's TeraGrid services, including earth observation and weather data processing, distributed rendering and visualization applications, are all supported by the Condor pools. This demonstration will describe the concept of using Condor for high-throughput computing as well as several projects on the TeraGrid that use Condor. We will compare various methods, their benefits and drawbacks; demonstrate the methods of discovering available architectures and resource properties, submitting jobs to the TeraGrid Condor resources, monitoring status, interpreting logs and history, enhancing productivity, etc. We will also demonstrate how to use tools such as DAGMan and Master-Worker to manage workflows, and how to build your personal Condor pool to manage jobs.

GridWay: A metascheduler for Globus-based Grids
I. Llorente, R. Montero, E. Huedo, T. Vaquez, J. Vazquez-Poletti

GridWay is a widely-used metascheduling technology that performs job execution management and resource brokering, allowing unattended, reliable, and efficient execution of jobs, job arrays, and workflows on heterogeneous and dynamic Globus grids. GridWay performs all the job scheduling and submission steps transparently to the end user and adapts job execution to changing Grid conditions by providing dynamic scheduling, fault recovery mechanisms, migration on-request and opportunistic migration. The GridWay metascheduler is a Globus product, released under Apache license v2.0, welcoming code and support contributions from individuals and corporations around the world.

GridWay provides the following benefits to the different stakeholders involved in a Grid environment: (i) for project and infrastructure directors, GridWay is an open-source community project, adhering to Globus philosophy and guidelines for collaborative development; (ii) for system integrators, GridWay is highly modular, allowing adaptation to different grid infrastructures, and supports several OGF standards; (iii) for system managers, GridWay gives a scheduling framework similar to that found on local DRM systems, supporting resource accounting and the definition of scheduling policies; (iv) for application developers, GridWay implements the DRMAA API (C and Java bindings) OGF standard, assuring compatibility of applications with LRM systems that implement the standard, such as SGE, Condor or Torque; and (v) for end users, GridWay provides an LRM-like CLI for submitting, monitoring, synchronizing and controlling jobs, which can be described using the JSDL OGF standard.

A number of commercial and open source workload management and scheduling systems are available today, each suited to different underlying computer infrastructures and execution profiles. GridWay stands out from other metascheduling systems because it has been specifically designed to work on top of Globus, offering the highest functionality, quality of service and reliability on this kind of infrastructure.

The demonstration consists of three parts. The first part is a description of the state of the technology: main benefits and major features, alternatives for scheduling infrastructures, relevant use cases, and project status and roadmap. The presentation will focus on its state-of-the-art functionality, such as the new scheduling policies, which comprise job prioritization policies (fixed priority, urgency, share, deadline and waiting-time) and resource prioritization policies (fixed priority, usage, failure and rank). The second part demonstrates its main functionality on the TeraGrid and EGEE infrastructures, showing how GridWay is able to simultaneously access distinct middleware (GT pre-WS, GT WS and EGEE services), additionally allowing Grid interoperability and supporting the transition to new Globus versions. Finally, the two main ongoing research lines of the GridWay project will be briefly described, namely the management of virtual machines in the Grid and the deployment of federated infrastructures using grid gateways.

Pegasus DAGMan Demonstration
E. Deelman, K. Wenger, K. Vahi, G. Mehta

We propose to hold a Pegasus-DAGMan demonstration that will cover the basics of workflow creation, planning and execution. The presenters will run workflows of increasing complexity across the TeraGrid with a focus on ease of use. The demonstration will cover various techniques in Pegasus and DAGMan that allow for reliable and efficient workflow execution on the grid with minimum user intervention.

Questions? Please contact us!
TeraGrid 2007 • Wisconsin Union Conference Services
800 Langdon Street • Madison, WI 53706 • Phone: (608) 265-8012 • Fax: (608) 265-8299
Email: teragrid@union.wisc.edu

Copyright © 2007
