Growing in volume by the hour, big data is everywhere. Just reflect for a moment on the many forms of data we use regularly as a modern society:
- Satellite aerial imagery
- Global positioning
- Atmospheric weather data
- Surveillance security video
- Silly cat videos on YouTube
Add to these the content from social media and websites, to name a few. The possibilities and formats are many. Some predict that the quantity of data in the world will increase by 40% every year. According to a reference from IDC Research, the information published within the last 10 years is equal to all of the information published since the dawn of the printing press, which gives a sense of the enormous growth in data.
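To get a feel for what a 40% annual increase implies, the compound-growth arithmetic can be sketched in a few lines of Python. This is only an illustration: the 40% rate is the prediction quoted above, the assumption that it stays constant year after year is ours, and the function names are hypothetical.

```python
import math


def years_to_multiply(factor: float, annual_growth: float) -> float:
    """Years needed for a quantity to grow by `factor` at a constant annual rate."""
    return math.log(factor) / math.log(1.0 + annual_growth)


def projected_volume(initial: float, annual_growth: float, years: int) -> float:
    """Volume after `years` of compound growth, in the same units as `initial`."""
    return initial * (1.0 + annual_growth) ** years


if __name__ == "__main__":
    rate = 0.40  # assumed constant 40% annual growth
    print(f"Doubling time: {years_to_multiply(2, rate):.2f} years")
    print(f"Growth over 10 years: {projected_volume(1.0, rate, 10):.1f}x")
```

Under that assumption, the world's data would double roughly every two years and grow nearly 29-fold in a decade.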
It will come as no surprise that a primary characteristic of big data is the enormous volume of information involved. The threshold is ever-changing: what is considered “big” today at 1,000 exabytes (one zettabyte) of data may be considered an ordinary volume tomorrow. Because data is tracked and recorded indiscriminately, it is generated at a scale and scope beyond the ability of today’s software tools to manage, process or interpret.
The Unique Challenges of Unstructured Data
There are two main players in this “beast” that is big data. What is known as “structured” data has been collected by agencies and corporations for years and is typically stored in databases on site as part of a tightly managed and centralized system. Think of the large mass of information routinely gathered by corporations in the form of client profiles or customer purchasing patterns as a good example. In contrast, unstructured data is widely distributed (note the downloadable data on the websites of USGS, USACE, NASA and others), constantly changing in real time, and managed by a radically different (often Hadoop-based) infrastructure. The greatest challenge for the field of engineering, and any industry for that matter, is to develop the capacity to use big data, particularly unstructured data, in a way that advances the engineering process.
Big Data and New Frontiers in Engineering
In a recent article published in ASFPM’s newsletter (Oct 2015), Emeritus Director Larry Larson points out the need to collect, store and provide engineering practitioners with standardized, usable data. Because of the public safety and health considerations involved in floodplain planning, he argues that we have an ethical duty to incorporate big data in project planning to the greatest extent possible. Decision-making that draws on all possible sources of information would ultimately result in better mitigation plans, more accurate analyses of structural damage, and better cost-benefit forecasts.
We couldn’t agree more with Larson’s views. A project’s utility and benefit to the community will be enhanced when additional data can be integrated into the analysis. But the truth is that while a few global initiatives are underway to capture this data in a usable format, most practicing engineers have no means of integrating big data into a standard engineering design process. Agencies such as USGS and NASA host web portals that allow engineers to manually download data files directly to their desktops. This still leaves the engineer with the problem of figuring out how to incorporate those files into their engineering software and ultimately their project. This is the scenario most engineers encounter: data files that are highly relevant to a project but, for all practical purposes, inaccessible.
A Different Approach
At CivilGEO, we have worked to integrate big data into the engineering process while striving to keep our software methodology streamlined and intuitive. Our software is directly tied into the data centers of NASA, USGS, FEMA and others to draw from their vast storehouses of mapping information. This methodology also allows us to stream high-resolution digital elevation data from USGS for the entire North American continent directly into our software. EPA’s watershed delineation areas, FEMA’s flood data, and cloud-based GIS data are accessible to all users of GeoHECRAS. We are constantly tracking new streams of data. For example, as Canada continues to map floodplains and floodways for different provinces, we will add these data streams as they become available. This data is pulled into an engineer’s project in real time, so the most up-to-date information can be used. This is truly a breakthrough technology for HEC-RAS modeling, ultimately making the engineer’s project better and more accurate.