XSEDE Science Successes
Analyzing data for transportation systems using TACC's Rustler, XSEDE ECSS support
Published on February 13, 2017 by Faith Singer-Villalobos
In the next 10 years you are going to see some form of autonomous or connected vehicles on the streets. Natalia Ruiz-Juri, a research associate with The University of Texas at Austin's Center for Transportation Research (CTR), is fairly certain of this. She is one of many researchers at CTR and The University of Texas at Austin (UT Austin) who are studying the wide range of technical, social and policy aspects of connected and autonomous vehicle (CAV) technologies.
Fully autonomous vehicles or driverless cars are capable of sensing their environment and navigating without human input. They can detect surroundings using a variety of techniques such as radar, lidar, GPS, odometry, and computer vision. Similarly, connected vehicles (CVs) are vehicles that can exchange messages containing location and other safety-related information with other vehicles, and with devices affixed to roadside infrastructure.
CVs share information in the form of Basic Safety Messages (BSMs) with other vehicles and the infrastructure; these include vehicle position, speed and braking status. Such real-time feedback and information exchange between vehicles is expected to greatly enhance safety, and it opens the door to several possibilities in traffic management.
For example, vehicles could talk to other vehicles that are much further ahead and get warned about congestion or dangerous conditions, thereby allowing a driver to make strategic decisions and take a different path.
Vehicles could also talk to infrastructure, such as an intersection light, which might be capable of tracking the number of vehicles passing through and potentially adjusting the signal timing plan accordingly. The advent of CVs therefore holds great promise for improving traffic management and the overall utilization of transportation infrastructure, particularly if vehicle connectivity is considered along with automation.
While the basic goal of CVs, in particular, is safety (experts hypothesize up to 80 percent fewer accidents in the future), the data generated by CVs also has enormous potential to support transportation planning and operations.
THE BIG DATA PROBLEM
At this point researchers are still exploring diverse datasets. A number of connected vehicle test beds and autonomous vehicle test sites have been planned, or are already in place. Texas is part of one of the 10 U.S. Department of Transportation-designated autonomous vehicle proving grounds, and research sponsored by other agencies, such as TxDOT and the North Central Texas Council of Governments, is also happening at UT Austin.
"The volume and complexity of CV data are tremendous and present a big data challenge for the transportation research community," Ruiz-Juri said. While there is uncertainty in the characteristics of the data that will eventually be available, the ability to efficiently explore existing datasets is paramount.
Ruiz-Juri and her colleagues, including Chandra Bhat, James Kuhr and Jackson Archer, were interested in exploring the most comprehensive data set released to date — the Safety Pilot Model Deployment (SPMD) data, produced by a study conducted by The University of Michigan Transportation Research Institute and the National Highway Traffic Safety Administration.
However, to get started they needed help using computational resources. They turned to the Texas Advanced Computing Center (TACC), also at UT Austin, a key partner in the Extreme Science and Engineering Discovery Environment (XSEDE), the most advanced, powerful, and robust collection of integrated advanced digital resources and services in the world. Through XSEDE, Ruiz-Juri and her team took advantage of the Extended Collaborative Support Services (ECSS) program, and the TACC experts within the program, whose purpose is to make these resources easier to use and to help more people use them.
TACC ECSS experts Weijia Xu and Amit Gupta were able to help Ruiz-Juri and her colleagues figure out how to use very large datasets on supercomputers like Rustler, TACC's experimental system for exploring new storage and data compute techniques and technologies.
Ruiz-Juri and her colleagues compared efforts to build scalable solutions for CV data analysis using Hive, an open-source data warehouse application that supports distributed queries. The data included approximately 2,700 cars, trucks and transit buses whose activities were logged through on-board sensors over a two-month period.
"Hive is an ideal choice in this particular use case since it not only offers scalability and performance but also has a SQL-like interface," Xu said, referring to Structured Query Language used to manage data. "It is similar to PostgreSQL which the research team is already familiar with."
According to Ruiz-Juri, using Rustler is a huge time-saver because it lets them play with the data and see what it looks like without spending hours waiting for a query to complete.
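To give a sense of what a SQL-like exploratory query over vehicle records looks like, here is a minimal sketch. It is not the team's code and it does not run on Hive: it uses Python's built-in sqlite3 module as a small stand-in so the example is self-contained. The table name `bsm`, its columns, and the sample rows are all hypothetical, chosen only to mirror the position, speed and braking fields that BSMs carry.

```python
import sqlite3

# Hypothetical sample records: (vehicle_id, timestamp_s, speed_mps, brake_on).
# These values are made up for illustration; they are not SPMD data.
SAMPLE_ROWS = [
    ("veh_a", 0.0, 10.0, 0),
    ("veh_a", 0.1, 12.0, 0),
    ("veh_a", 0.2, 8.0, 1),
    ("veh_b", 0.0, 20.0, 0),
    ("veh_b", 0.1, 20.0, 1),
]


def summarize_by_vehicle(rows):
    """Return per-vehicle sample count, average speed and braking samples.

    Uses an in-memory SQLite database as a stand-in for a distributed
    warehouse; the GROUP BY query is the kind of aggregate a SQL-like
    interface makes easy to express.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bsm ("
        "vehicle_id TEXT, timestamp_s REAL, speed_mps REAL, brake_on INTEGER)"
    )
    conn.executemany("INSERT INTO bsm VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT vehicle_id,
               COUNT(*)       AS n_samples,
               AVG(speed_mps) AS avg_speed_mps,
               SUM(brake_on)  AS brake_samples
        FROM bsm
        GROUP BY vehicle_id
        ORDER BY vehicle_id
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in summarize_by_vehicle(SAMPLE_ROWS):
        print(row)
```

On a few rows this returns instantly on any laptop. The appeal of a distributed engine is that a query of the same shape can stay interactive when the table holds months of records from thousands of vehicles.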
As a researcher, Ruiz-Juri said one of the challenges she faces is not knowing which system to use on a particular model for a particular dataset. This is one of the many ways that Xu and Gupta were able to help. They developed an automated methodology to understand how each system is expected to perform based on the characteristics of the network. For this work they used Rustler, but soon they plan to move the data to Wrangler, an XSEDE-allocated resource.
"Natalia and her colleagues were trying to make sense of the data," Gupta said. "It's unfiltered data from real people capturing their movement patterns across the city. All of this data was sampled at 10 times per second — speed data, when a person used their brakes, when they used their windshield wipers etc — so it's a lot of information and nobody has completely figured out what to do with it. Natalia and her team are trying to validate, and in some cases possibly break through, some of the assumptions that they traditionally made in their field with respect to traffic patterns."
In addition to determining which system to use for which model, Xu and Gupta also helped Ruiz-Juri and colleagues create a friendly user interface to remove some of the hurdles of using a command line. Without an interface, researchers have to come up with something manually, and they may not have the time or funding to do that, especially when in exploration mode. "The interface gave us the opportunity to look at this data now instead of, say, two years down the line in the project," Ruiz-Juri said.
"The XSEDE ECSS program has been great for us," Ruiz-Juri said. "We get together and we talk about projects and research in general. Amit and Weijia have started to understand more about what we are doing, so for me the best part is not when we know what we want and they help us, but when they understand enough of what we're doing and can come up with new ideas on their own. We've been working together for over three years now on different projects."
The goal is to enable the team's research exploration by leveraging HPC tools and infrastructure, according to Gupta. Because of the scale of the resources available at TACC, the researchers can iterate through their analysis cycle much more quickly and converge on conclusions faster. It also lets them attempt new simulation experiments that would otherwise overload their own computational resources or take prohibitively long to run.
"I enjoy working on this project very much," Gupta said. "It's one of my favorite projects. It's a very challenging and interesting application of computer science to a real world problem."
THE CHALLENGE OF TRANSFORMATIVE TECHNOLOGIES
One of the challenges with research in this field is that connected and autonomous vehicles can be disruptive, according to Ruiz-Juri. How do we anticipate what's going to happen in the future when this type of technology can change not only transportation system performance, but also travel choices and behavior?
How are people going to react to this technology? Are they going to purchase more cars, fewer cars? Are they going to travel further? Are they not going to care about travel time any longer so they move further away from downtown?
Researchers want to understand how they need to modify existing models so that they can consider all these complex, interrelated impacts, when assessing the effects of CAV technologies into the future. Advanced models require significant computational resources, and TACC experts have already supported CTR in the use of HPC for their simulations.
"It would have been very hard without the help of Amit or Weijia to be able to have visibility and access to HPC from the interface of a preexisting code that we use for modeling, and which may be central to future research in CVs. They helped us a lot in terms of how to access the systems, how to set up log-in, writing scripts, authentication, creating accounts, and much more," Ruiz-Juri said.
Using models to test all of the hypotheses and questions can transform the way we think about living and travelling.
"I think that vehicle connectivity is going to happen relatively soon and it's going to make travel safer — it's something to look forward to," Ruiz-Juri said. "It also gives us the opportunity to collect a lot of data so we can look at operating transportation systems differently. It has huge potential for safety and traffic operations. Automated vehicles are an exciting possibility that can truly transform how we travel, and lead to major changes in lifestyle choices and decisions."
Natalia Ruiz-Juri of the Center for Transportation Research, The University of Texas at Austin
Preliminary visualization of trip-level data after processing on Rustler.
A web-based interface to run large-scale advanced transportation models on TACC's HPC resources.