This week my development improvement focused on PostGIS which is a way to conduct spatial analysis using SQL on data stored in a PostgreSQL database. The advantage of using PostGIS over desktop GIS applications including proprietary software from ESRI Arcmap or open source Quantum GIS is the ability to work at scale and native relational database support. The normal order of operations for spatial analysis is to first retrieve the data using SQL, export as a csv, import into third party software, geocode addresses, and start spatial analysis. With PostGIS every step from start to finish, even the query itself, is built into PostGIS. Desktop GIS software stores spatial data as a flat file called a shapefile. This is great for a single user to access however it does no support multiple users or applications calling this spatial data at the same time. This is where PostgreSQL and PostGIS shine as they allow for concurrent user access using a commercial opern source solution. The key differences between this second and first generation GIS solution is explained well in this PostGIS introduction. In summary PostGIS offers tremendous automation, speed, and consistency in spatial analysis on a large data set in PostgreSQL.
For learning how to begin I went where all the forums pointed to, Boundless: Introduction to PostGIS. I completed the first few examples using the Open Geo Suite and there is a lot to learn. However if you are already familiar with SQL and spatial analysis concepts such as projections, spatial joins and buffers then the real learning curve is a change in approach. Rather than seeing spatial analysis as a single step in a chain of data preparation steps one is now integrating all those steps into a SQL based approach and can be automated using Python functions. A standard database has data types of string, numeric, and Date/Time Stamp. PostgreSQL is a community driven relational database and is also extendable so that additional datatypes such as spatial based entries are allowed. I believe that PostGIS is well suited for dealing with dynamic spatial data stored in a relational database while desktop GIS software is better suited for one off projects where the data sources are already in shapefile format and users are not concurrently accessing this spatial data.