A square GIS in a round world
Data analysts and geo-specialists who want to perform visual analytics on their geospatial and geotemporal data sets often struggle with the same problem: neither their GIS nor their conventional analytics tools seem to do the job. It is not uncommon to hear professionals say:
“If I don’t know beforehand what I’m looking for, I can’t use my GIS system for quick and interactive analysis to find ‘the needle in the haystack’.”
“My GIS is fine for storing and editing data. But for working interactively it’s way too static: it takes ages to load the data. And my analytics tool is really powerful… but lacks a decent geospatial interface with a map view. So what I typically do is write scripted code in my analytics tool, and it’s from there that I run the analysis. Afterwards I use my GIS just to display the picture of the end result of my analysis. So I first need to run the analysis – and only in a second phase do I get the visuals.”
Statements like these crop up regularly, mostly because traditional systems that were never conceived for interactive analytics on geospatial and geotemporal data are still in use. At their core, traditional GIS systems (as well as GIS extensions to today’s analytics tools) are built around so-called “geodatabases”. The geodatabase is the native data structure of traditional GIS and the primary data format or database model used for editing and data management. Geodatabases typically need a lot of time to pre-process and ingest data (ETL, anyone?). And they have been designed from the outset to handle ‘static’ geospatial data that does not change rapidly over time.
A systems architecture based on a geodatabase is fine for use cases where your geodata remains unchanged for long periods and where the data is not complexly intertwined: a cadaster, for instance, or an inventory of street networks.
Geodatabases are sent into a spin, though, when you want to look for complex relationships between different “layers” of information (for instance background maps, tracks of moving individuals or assets, and weather data). And geodatabases are typically not set up to handle temporality, which is crucial for data analysts.
Does this mean your geodatabase-driven GIS gets sent straight to the dustbin? Not necessarily. Traditional GIS may still have its role in static environments where performance and multilayered analytics are not important. However, innovators and nimble, forward-looking organizations increasingly look elsewhere to make the most of their geospatial data.
I have some advice for GIS users who are dissatisfied with the status quo offered by the conventional geospatial software vendors: those of you who feel you’re sitting on a treasure of valuable geospatial data but lack the “analytical key” to open the treasure box and turn the data into actionable information. Sound familiar? Then read on!
1. Don’t Lock Your Data in a Geodatabase – Open Up Your Data Through In-Memory Access
If you haven’t yet started building your infrastructure, it’s best to avoid the pitfalls of the geodatabase from the outset. Choose a geospatial technology with an in-memory data model instead.
Luciad has explained the pitfalls of the geodatabase in a white paper. In summary, choosing a technology that accesses your geodata in-memory has the following benefits over geodatabases:
- Great support for temporality. In-memory ingestion of data allows you to handle real-time or near-real-time data. More generally, you’ll have great support for time-stamped or temporal data, something that we at Luciad have been doing for over 16 years.
- No data lock-in. Your data resides in its original repositories. In contrast, once you’ve moved your data into a geodatabase it’s very hard to get it back out again, because the data has been pre-processed and conditioned to “fit” the geodatabase architecture.
- Adapt to your overall systems architecture. Because your data resides in its original repositories, you can adapt your geospatial system to your bigger systems architecture. A classical GIS with a built-in geodatabase often forces you to design your whole systems architecture around the GIS.
- No relational wall. In-memory data models are well suited to revealing patterns, relations or anomalies. You won’t hit the so-called “relational wall”. Geodatabases, by contrast, are relational databases (RDBMS), which need many slow “join operations” on tables to answer these queries. That’s why classical GIS systems with internal relational geodatabases are bad at connecting the dots.
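To make the temporality point above more concrete, here is a minimal Python sketch of an in-memory store that keeps time-stamped positions sorted, so that a time-window query becomes a binary search rather than a table join. It is an illustration only, not Luciad’s data model or API: the names `Position`, `InMemoryTrackStore` and `in_window` are hypothetical, and the sample coordinates are made up.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    t: float        # timestamp in seconds (assumed unit)
    lon: float
    lat: float
    track_id: str


class InMemoryTrackStore:
    """Hypothetical store: positions are kept sorted by time in memory."""

    def __init__(self, positions):
        self._positions = sorted(positions, key=lambda p: p.t)
        # Parallel list of timestamps so bisect can search on time alone.
        self._times = [p.t for p in self._positions]

    def in_window(self, t_start, t_end):
        """Return all positions with t_start <= t <= t_end."""
        lo = bisect_left(self._times, t_start)
        hi = bisect_right(self._times, t_end)
        return self._positions[lo:hi]


# Made-up sample data, deliberately supplied out of time order.
store = InMemoryTrackStore([
    Position(30.0, 4.70, 50.88, "a"),
    Position(10.0, 4.69, 50.87, "a"),
    Position(20.0, 4.35, 50.85, "b"),
])
window = store.in_window(15.0, 30.0)
```

A real system would also need a spatial index and a way to append live data, which this sketch leaves out.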
2. Don’t Reinvent The Wheel If You Don’t Need To: Use Components
What if you haven’t read this article in time and you have already embarked on the geodatabase path? Should you start again from scratch? Well, not necessarily. Although it is suboptimal, you could build a web services wrapper around your classical GIS and extract the data. At Luciad, we’ve done that for a number of customers. We’ve connected their traditional GIS with our high-performance LuciadFusion server. LuciadFusion taps all the data, efficiently caches and fuses it together with other data sources, and allows you to “shield” yourself from the inherent slowness and complexity of your underlying GIS.
The reason we at Luciad can help our customers build “on top” of their legacy infrastructure is that our software products are componentized. All Luciad software products are plug-and-play by themselves, but you can also pick and choose exactly those components or building blocks that you’re lacking, and leave the remainder of your infrastructure as is.
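The general idea of shielding users from a slow back end with a caching wrapper can be sketched in a few lines of Python. This is not how LuciadFusion is implemented; `slow_gis_query` is a hypothetical stand-in for a request to a legacy GIS, and the layer name and tile key are invented for the example.

```python
from functools import lru_cache

# Counts how often the (hypothetical) legacy back end is actually hit.
CALLS = {"backend": 0}


def slow_gis_query(layer: str, tile: tuple) -> dict:
    """Stand-in for a slow request to a legacy GIS (assumption, not a real API)."""
    CALLS["backend"] += 1
    return {"layer": layer, "tile": tile, "features": []}


@lru_cache(maxsize=1024)
def cached_query(layer: str, tile: tuple) -> dict:
    """Wrapper: repeated requests for the same layer and tile skip the back end."""
    return slow_gis_query(layer, tile)


first = cached_query("roads", (3, 4, 2))
second = cached_query("roads", (3, 4, 2))   # served from the cache
```

Note that the cache hands back the same dictionary object on every hit, so callers should treat results as read-only. A production wrapper would also need cache invalidation for when the underlying data changes.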
3. It’s All About the Interface
It’s all about the interface. Simple, yet so true.
The geodata in your system may be very precise and very rich. But if the interface, the figurative “lens” you’re looking through, is not good, then your insights will still be blurry.
At Luciad, we’ve done a lot of groundbreaking work to make sure our users can look at their data through very different “lenses” or views. Users have known us for many years for our “hybrid view”, where you can visualize and edit geodata in 2D and 3D within one single application.
Another classic is our “vertical view”, which is very useful if you want to analyze a flight trajectory, for instance. In the blink of an eye you can see which spaces or areas on the map affect the flight, and at what altitude the flight is.
More recently, various customers and partners have asked us for solutions where they want to analyze their data through different “lenses” at the same time.
For SAP, we’ve built a “human geography” interface to analyze a huge set of data around violent events on the African continent. Have a look at this movie where Luciad showcases an application that allows for powerful and intuitive analysis of complex datasets. The user can query the data not only through the map interface, but also through a time bar and through a word cloud. All these different “lenses” are linked to each other. Zoom in on a specific period in time, for instance, and your results on the map view will automatically update and adapt. You can analyze, for instance, not only how Boko Haram developed over time and in space, but also which means of violence the group used.
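The linking of “lenses” described above can be illustrated with a small Python sketch: one shared time filter notifies every subscribed view whenever the selected period changes. This is a simplified illustration under our own assumptions, not the application’s actual code. The `Event` and `LinkedViews` names are hypothetical, and the sample records are invented placeholders rather than real event data.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    year: int
    lon: float
    lat: float
    keyword: str


class LinkedViews:
    """Hypothetical shared filter: the time bar drives all subscribed views."""

    def __init__(self, events):
        self._events = list(events)
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)

    def set_period(self, start, end):
        # Filter once, then push the same subset to every view.
        visible = [e for e in self._events if start <= e.year <= end]
        for callback in self._listeners:
            callback(visible)


state = {}
views = LinkedViews([
    Event(2010, 10.0, 5.0, "alpha"),   # invented placeholder records
    Event(2013, 10.5, 5.5, "beta"),
    Event(2014, 11.0, 6.0, "beta"),
])
# "Map view": the coordinates to draw. "Word cloud": keyword frequencies.
views.subscribe(lambda ev: state.update(map_points=[(e.lon, e.lat) for e in ev]))
views.subscribe(lambda ev: state.update(word_cloud=Counter(e.keyword for e in ev)))
views.set_period(2013, 2014)   # zooming the time bar updates both views
```

The same pattern extends in the other direction: a selection on the map or in the word cloud could narrow the shared filter in the same way.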
Here’s another challenge we got: we were given a map of a theme park along with over 10 million visitor positions. It was up to us to spot unusual visitor behavior, in other words to find the needle in the haystack. What did we find out? Check it out here. For this application we performed “convoy analysis”, in which we group visitor positions that are logically correlated. And we created an entirely new view: the “sequence view” (appearing around 3’30’’ in the movie). The sequence view allows you to drill down into every single one of the visitor positions to see which individuals went to see which attraction in the theme park at a given time.
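As a rough idea of what grouping correlated positions can look like, here is a deliberately simplified Python sketch. It reports pairs of visitors who share the same grid cell for a minimum number of consecutive time steps. This is our own toy definition, not the algorithm used in the application. The function names, the grid size, the threshold and the sample data are all assumptions.

```python
from collections import defaultdict
from itertools import combinations


def cell_of(x, y, cell_size):
    """Map a position to a square grid cell."""
    return (int(x // cell_size), int(y // cell_size))


def find_convoy_pairs(positions, cell_size=10.0, min_steps=3):
    """positions: iterable of (visitor_id, time_step, x, y), integer time steps.

    Returns the set of visitor pairs that share a grid cell for at least
    min_steps consecutive time steps.
    """
    by_step = defaultdict(lambda: defaultdict(set))
    for visitor, step, x, y in positions:
        by_step[step][cell_of(x, y, cell_size)].add(visitor)

    streak = {}          # pair -> current run length of being together
    convoys = set()
    previous_step = None
    for step in sorted(by_step):
        consecutive = previous_step is not None and step == previous_step + 1
        new_streak = {}
        for visitors in by_step[step].values():
            for pair in combinations(sorted(visitors), 2):
                run = (streak.get(pair, 0) if consecutive else 0) + 1
                new_streak[pair] = run
                if run >= min_steps:
                    convoys.add(pair)
        streak = new_streak
        previous_step = step
    return convoys


# Made-up sample: ann and bob walk together; cid only crosses their path once.
positions = [
    ("ann", 0, 1, 1),  ("bob", 0, 2, 2),  ("cid", 0, 55, 55),
    ("ann", 1, 12, 3), ("bob", 1, 14, 4), ("cid", 1, 13, 5),
    ("ann", 2, 25, 8), ("bob", 2, 26, 9), ("cid", 2, 80, 80),
]
convoys = find_convoy_pairs(positions)
```

A fixed grid misses visitors who stand close together on either side of a cell boundary, so a more serious implementation would cluster on actual distance and handle groups larger than two.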
By the time you’re reading this, you may already have come across some online video that is showcasing similar functionality. In this era of fake news, the advice I have for you to separate the original from the copy is to ask for a live demo, with live data if possible. Recording a video after hours of preprocessing is one thing – showing stuff live is another.
We hope the above tips help you make the right choices for your geodata analytics projects. If they spark some ideas along the way, things will become all the more interesting. Maybe you have an idea for another stunning “view” that can help the analyst community gain faster and better insight into the vast datasets that are available today. We’d love to hear from you!