Saturday, November 7, 2009

13.2. What Is a Spatial Database?





















A database is a tool for storing and accessing tables of information. Traditional databases store information in fields and records (columns and rows, or attributes and values). The types of data that fields can hold vary between database products but, generally speaking, fields hold numeric and text data. The main feature of a database is querying: retrieving only the information that meets your specific criteria. Relational databases also let you join information from multiple tables using a common piece of information that appears in both tables.
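As a rough illustration of querying and joining, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for any relational database. The table names, column names, and population figures are made up for the example.

```python
import sqlite3

# In-memory SQLite database, used purely as a stand-in for any relational database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (code TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE cities (name TEXT, country_code TEXT, population INTEGER);
    INSERT INTO countries VALUES ('CA', 'Canada'), ('MX', 'Mexico');
    INSERT INTO cities VALUES
        ('Ottawa', 'CA', 800000),
        ('Toronto', 'CA', 2500000),
        ('Monterrey', 'MX', 1100000);
""")

# Keep only the rows that meet a criterion (WHERE) and join the two tables
# on the value they share (the country code). Figures are illustrative only.
rows = conn.execute("""
    SELECT cities.name, countries.name
    FROM cities
    JOIN countries ON cities.country_code = countries.code
    WHERE cities.population > 1000000
    ORDER BY cities.name
""").fetchall()

print(rows)
```

The WHERE clause is the "specific criteria" part of the query, and the JOIN clause is the link between the two tables.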



To learn more about relational database management systems (RDBMS), see http://en.wikipedia.org/wiki/RDBMS.




A spatial database is much the same, but it can also store geographic data. Several databases and GIS products use the term spatial database to mean slightly different things. For example, ESRI's Spatial Database Engine (SDE) isn't a spatial database itself; it is advertised as an interface between client software and a normal database. It lets you store spatial data in the database, in SDE's own format. To load and manipulate that spatial data, you need an ESRI product or access to an ESRI service.



ESRI's SDE product isn't the only option for spatially enabling a database. Although they receive less marketing, other databases are available that better fit the description of a spatial database. Oracle has an extension called Oracle Spatial, and IBM has a Spatial Extender for the DB2 database. MySQL also has a spatial extension. All of these are similar to PostGIS: they store and access spatial data using database tools, without requiring specialized GIS software to access or modify the data.



PostGIS has several features most commercial spatial databases don't, some of which initially draw people to PostGIS as an enterprise spatial database management system. PostGIS is actively developed and supported. Support for PostGIS is built into an increasing number of applications including MapServer, JUMP, QGIS, FME, and more. There are extensive functions for interacting with spatial data, without needing a GIS client application. This has inspired many to use PostGIS.



PostGIS can be considered an advanced spatial database because it has the ability to both store and manipulate spatial data. PostGIS isn't simply a data storage repository, but also an environment for interacting with spatial data. The OGC has created a specification for storing and querying spatial data in an SQL database: the Simple Features Specification for SQL (SFSQL). OGC specifications aren't just nice to have; they are becoming an integral requirement for geospatial data interoperability. When sharing data or enabling open access to geospatial information is required, these open standards become critical.



PostGIS has one of the most robust implementations of the SFSQL specification, according to the OGC resources page at http://www.opengeospatial.org/resources/?page=products. Because PostGIS implements all SFSQL specifications, you can access standardized functions without proprietary twists and turns over the lifespan of your projects. Some other applications only implement subsets of the SFSQL specification.



The way geographic data is presented in PostGIS (as returned in a query) is very intuitive, as you will see from the examples in this chapter. You can get access to coordinates of spatial features through a variety of methods, depending on your need. You can query from a command-line SQL tool or view them graphically with mapping software. Either way, you aren't bound to using proprietary tools to get access to your information.
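To give a feel for how readable that output can be, the sketch below parses Well-Known Text (WKT), the plain-text geometry representation defined by the Simple Features specification. The sample strings are invented for illustration and are not actual query output, and the hypothetical helper parse_wkt_coords handles only simple two-dimensional POINT and LINESTRING values.

```python
import re

def parse_wkt_coords(wkt):
    """Return (geometry_type, [(x, y), ...]) for simple 2D POINT or LINESTRING WKT."""
    match = re.fullmatch(
        r"\s*(POINT|LINESTRING)\s*\(([^()]*)\)\s*", wkt, re.IGNORECASE
    )
    if match is None:
        raise ValueError(f"unsupported or malformed WKT: {wkt!r}")
    geom_type = match.group(1).upper()
    coords = []
    # Coordinate pairs are separated by commas; x and y by whitespace.
    for pair in match.group(2).split(","):
        x, y = pair.split()
        coords.append((float(x), float(y)))
    return geom_type, coords

# Made-up sample values in the style a spatial query might return as text.
point = parse_wkt_coords("POINT(-93.5 45.25)")
line = parse_wkt_coords("LINESTRING(0 0, 10 5, 20 0)")
print(point)
print(line)
```

Because the coordinates arrive as ordinary text, a few lines of general-purpose code are enough to read them, with no GIS software involved.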





13.2.1. Server-Based GIS



PostgreSQL is a database server product. When requests are made to the database, the server processes the request, prepares the data, and returns the results to your application. All the heavy-duty work is done on the server, and only the results are sent back to you. For many applications, this is critical. Even with the horsepower of modern computers, most PCs aren't designed to handle the intense workload of database queries. If all the data had to be sent across a network to be processed by your application on the user's machine, the network and client program would be a major performance bottleneck.
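The difference between shipping every row to the client and letting the database do the work can be sketched as follows. This uses Python's built-in sqlite3 module only as a stand-in: SQLite runs inside the client process, so no network is involved, and the parcels table and its values are hypothetical. The point is the number of rows that would have to travel in each approach.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (zone TEXT, area REAL)")
conn.executemany(
    "INSERT INTO parcels VALUES (?, ?)",
    [("residential", 1.5), ("residential", 2.0), ("industrial", 10.0)],
)

# Client-side approach: every row is transferred, then summarized locally.
all_rows = conn.execute("SELECT zone, area FROM parcels").fetchall()
client_totals = {}
for zone, area in all_rows:
    client_totals[zone] = client_totals.get(zone, 0.0) + area

# Server-side approach: the database aggregates and returns one row per zone.
server_rows = conn.execute(
    "SELECT zone, SUM(area) FROM parcels GROUP BY zone"
).fetchall()
server_totals = dict(server_rows)

# (rows fetched by the client-side approach, rows fetched by the server-side approach)
rows_transferred = (len(all_rows), len(server_rows))
print(client_totals == server_totals, rows_transferred)
```

With three rows the saving is trivial, but the same pattern applied to millions of records is what keeps the network and the client from becoming the bottleneck.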



The same problem exists for GIS and spatial data management. Many, if not most, GIS desktop applications have a strong reliance on the user's computer. This may be fine for normal mapping processes (though mapping complex features can be very slow), but when you consider doing more advanced spatial analysis, problems appear.



Consider this example. You have a large number of polygons to merge together based on some attribute. The GIS program loads the required polygons into your computer's memory or into a temporary file. This alone can be a major bottleneck, because it pulls large amounts of data over the network. Then, when the process of merging polygons begins, it isn't uncommon to see heavy memory and processor use, not to mention heavy hard-disk activity while the program churns through all the data.



Another issue is that the desktop GIS program may not have the capability to do the analysis you need. Your options are usually to purchase an add-on software component or use another application to process the data. The same process as in the previous example occurs: there's heavy data copying and processing on the PC, and you often need to convert the data to another format.



PostGIS takes advantage of the server-based database by making an extensive set of GIS functions available on the server. One way to think of this is that PostGIS includes the spatial data storage and also spatial data-manipulation capabilities usually found only in desktop GIS products. This significantly reduces the requirements of client applications by taking advantage of the server's capabilities. This is a key strength of PostGIS.



Future GIS desktop applications will be little more than products for visualization, with the GIS functionality happening in the spatial database. There is already some movement toward this model. With the openly accessible capabilities of PostGIS, application developers can build spatial capabilities into their database applications right now.



PostGIS is an extension of the PostgreSQL database, so having a standard PostgreSQL installation is the place to start. A custom compiled version of PostgreSQL isn't required to use PostGIS. PostGIS consists of three components:




PostGIS libraries



The core library is libpostgis.so, or libpostgis.dll on Windows. This library is the interface between PostgreSQL capabilities and the spatial abilities of PostGIS.


PostGIS script for functions and types



There is one main script that loads the hundreds of PostGIS-specific functions and types: postgis.sql. Newer versions of PostGIS don't have a postgis.sql file; instead you use a file called lwpostgis.sql.


Optional script for projection support



An optional script called spatial_ref_sys.sql is often loaded that lets you use spatial reference systems or projections with PostGIS data.



The scripts are platform-independent, so getting the libraries you need is the hardest part. You may want to compile your own PostGIS extension to include custom capabilities, though this is increasingly unnecessary.
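One way to picture loading the two scripts is as a pair of psql invocations, one per file, run in order. The sketch below only builds and prints those command lines and does not run them. The database name my_spatial_db, the helper build_load_commands, and the bare file paths are assumptions for illustration; your installation may keep the scripts elsewhere and may need additional setup steps, so check the PostGIS documentation for your version.

```python
def build_load_commands(database, script_paths):
    """Build one psql command (as an argument list) per SQL script, in order."""
    # -d names the target database and -f names the SQL file to execute.
    return [["psql", "-d", database, "-f", path] for path in script_paths]

# Hypothetical database name; script names are the ones mentioned in the text.
commands = build_load_commands(
    "my_spatial_db", ["postgis.sql", "spatial_ref_sys.sql"]
)
for command in commands:
    print(" ".join(command))
```

Keeping the commands as argument lists makes it straightforward to hand them to a process runner later, once you have confirmed the paths for your own system.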



Documentation from the PostGIS web site is a good starting point for everyone, and walks you through more detail than presented here. Look for the online PostGIS documentation at http://postgis.refractions.net/docs/.



The examples in this chapter are based on PostgreSQL Version 7.4 and PostGIS Version 0.8. At the time of writing, PostgreSQL is at Version 8.0 and PostGIS 1.0 is about to be released. More recent versions are available, but there are still some problems being ironed out. Some of the examples will show slightly different results if you use the newer versions.




















