TYPES OF DATABASE MODELS • A variety of different data models exist for the storage and
management of attribute data. The most common are: • Tabular • Hierarchial • Network • Relational • Object Oriented • The tabular model is the manner in which most early GIS software packages stored their attribute data. The next three models are those most commonly implemented in database management systems (DBMS). The object oriented is newer but rapidly gaining in popularity for some applications. Prepared by: Sr Dr Siti Aekbal Salleh
21
Tabular Model • The simple tabular model stores attribute data as
sequential data files with fixed formats (or comma delimited for ASCII data), for the location of attribute values in a predefined record structure.
• This type of data model is outdated in the GIS arena. • It lacks any method of checking data integrity, as well as
being inefficient with respect to data storage, e.g. limited indexing capability for attributes or records, etc.
Prepared by: Sr Dr Siti Aekbal Salleh
22
Hierarchical Model • The hierarchical database organizes data in a tree structure. • Data is structured downward in a hierarchy of tables. Any level in the hierarchy can have unlimited children, but any child can have only one parent. • Hierarchial DBMS have not gained any noticeable acceptance for use within GIS. They are oriented for data sets that are very stable, where primary relationships among the data change infrequently or never at all. • Also, the limitation on the number of parents that an element may have is not always conducive to actual geographic phenomenon. Prepared by: Sr Dr Siti Aekbal Salleh
23
Network Model • The network database organizes data in a network or plex structure. Any column in a plex structure can be linked to any other. Like a tree structure, a plex structure can be described in terms of parents and children.
• This model allows for children to have more than one parent.
Prepared by: Sr Dr Siti Aekbal Salleh
24
• Network DBMS have not found much more
acceptance in GIS than the hierarchical DBMS. • They have the same flexibility limitations as hierarchical databases; however, the more powerful structure for representing data relationships allows a more realistic modelling of geographic phenomenon. • However, network databases tend to become overly complex too easily. In this regard it is easy to lose control and understanding of the relationships between elements.
Prepared by: Sr Dr Siti Aekbal Salleh
25
Relational Model • The relational database organizes data in tables. Each
table, is identified by a unique table name, and is organized by rows and columns. • Each column within a table also has a unique name. Columns store the values for a specific attribute, e.g. cover group, tree height. Rows represent one record in the table. • In a GIS each row is usually linked to a separate spatial feature, e.g. a forestry stand. • Accordingly, each row would be comprised of several columns, each column containing a specific value for that geographic feature. Prepared by: Sr Dr Siti Aekbal Salleh
26
• There are many different designs of DBMSs, but in GIS the relational design
has been the most useful. In the relational design, data are stored conceptually as a collection of tables. Common fields in different tables are used to link them together. This surprisingly simple design has been so widely used primarily because of its flexibility and very wide deployment in applications both within and without GIS
Prepared by: Sr Dr Siti Aekbal Salleh
27
• The relational DBMS is attractive because of its: • Simplicity in organization and data modelling. • Flexibility - data can be manipulated in an ad hoc manner by joining tables.
• Efficiency of storage - by the proper design of data tables redundant data can be minimized; and the
• Non-procedural nature - queries on a relational database do
not need to take into account the internal organization of the data.
Prepared by: Sr Dr Siti Aekbal Salleh
28
Object-Oriented Model • The object-oriented database model manages data through objects.
• An object is a collection of data elements and
operations that together are considered a single entity.
• The object-oriented database is a relatively new model. This approach has the attraction that querying is very natural, as features can be bundled together with attributes at the database administrator's discretion. Prepared by: Sr Dr Siti Aekbal Salleh
29
• To date, only a few GIS packages are promoting the use of this attribute data model.
• However, initial impressions indicate that this
approach may hold many operational benefits with respect to geographic data processing.
• Fulfilment of this promise with a commercial GIS product remains to be seen.
Prepared by: Sr Dr Siti Aekbal Salleh
30
Georelational Data Model • Geographical referenced data: • Spatial and attribute (component)
• Georelational data model: • Stores spatial and attribute data separately but have a link in a split system
• Called RELATIONAL DATA BASE
Prepared by: Sr Dr Siti Aekbal Salleh
31
Object-based Vector Data Model • Stores spatial and attribute data together in a system • Geometry data are stored along with their attributes • Its eliminates the use of split system and the needs for data synchronization
• Model treats spatial data as an object • Object can represents spatial feature (road and lake), layer or the coordinate system on which the layer is based Prepared by: Sr Dr Siti Aekbal Salleh
32
The Evolution/Development • ESRI INC. • ArcINFO – coverages • ArcVIEW – Shapefiles • ArcGIS – Geodatabase
• Georelational data model (ArcINFO and ArcVIEW) • Object-Based data model (ArcGIS) Prepared by: Sr Dr Siti Aekbal Salleh
33
EXAMPLES
Prepared by: Sr Dr Siti Aekbal Salleh
34
Types of DBMS • Three main types of DBMS are available to GIS users today • Relational (RDBMS) • Object (ODBMS) • Object-relational (ORDBMS)
Relational (RDBMS) • Data stored as tuples (tup-el), conceptualized as tables • Table – data about a class of objects • Two-dimensional list (array) • Rows = objects • Columns = object states (properties, attributes)
Column = attribute
Table
Row = object Vector feature
Relational (RDBMS) • Most popular type of DBMS • Over 95% of data in DBMS is in RDBMS • Commercial systems • Microsoft Access • Microsoft SQL Server • Oracle • IBM DB2 • Informix • Sybase
Relational join • Fundamental query operation • Occurs because • Data created/maintained by different users, but integration needed for queries
• Table joins use common keys (column
values) • Table (attribute) join concept has been extended to geographic case
Relational databases
Object (ODBMS) • Initially designed to address weaknesses of RDBMS e.g. • inability to store complete objects directly in the database (both object state and behavior)
• difficult to handle rich data types e.g. geographic objects, sound & video
• poor performance for many types of geographic query • difficult to support geographic data types and processing functions
• ODBMS have not proven to be as commercially successful as some predicted because of the massive installed base of RDBMS
Object-relational (ORDBMS) • RDBMS vendors added ODBMS capabilities into standard RDBMS, thus ORDBMS
• Hybrid object-relational DBMS (ORDBMS) can be thought of as an RDBMS engine with an extensibility framework for handling objects.
• Geographic ORDBMS extended to support geographic object types & functions
7 key capabilities of geographic ORDBMS 1. Query parser – engines used to interpret queries by splitting them up and decoding them have been extended to deal with geographic types and functions
2. Query optimizer – able to handle geographic queries efficiently e.g. household 3km radius, income > $110, 000 attribute query – quickly or geometric query - longer
3. Query language – handle geographic types (e.g. points and areas) and functions (e.g. select area that touch each other)
4. Indexing service – unidimensional DBMS data index service have been extended to support multidimensional (i.e. x, y, z coordinates)
5. Storage management – large volumes of geographic records with different sizes (especially
geometric and topological relationship) have been accommodated through specialized storage structures
6. Transaction services - dealing with long transactions common in many geographic applications 7. Replication - dealing with geographic types and with problem of reconciling changes made by distributed users
Geographic DBMS extension • A number of the commercial DBMS vendors have
released spatial database extensions to their standard ORDBMS products • IBM – DB2 Spatial Extender and Informix Spatial Datablade • spatial capabilities in the core of Microsoft SQL Server • Oracle Spatial • Opensource DBMS PostgreSQL has also been extended with spatial types and functions (PostGIS).
• Focus is on data storage, retrieval and management
Storing data in DBMS Tables • The lowest level of geographic database = object class (also called a layer or feature class)
• Stored in a standard database table, a two-dimensional array of rows and columns
• Rows contain objects (instances of object classes) • Columns contain object properties or attributes
• The data stored at individual row, column = values • Geographic database tables are distinguished from non-
geographic tables by the presence of a geometry column (often called the shape column)
• To save space and improve performance, the actual coordinate values may be stored in a highly compressed binary form
• Tables are joined together using common row/column values or keys
• Following joins, all tables can be treated as a single table
Parts of GIS database tables for U.S states (A) STATES table; (B) POPULATION table
Parts of GIS database tables for U.S states
(C) joined table—COMBINED STATES and POPULATION
Codd’s five principles • Series of rules for the efficient and effective design of database table structure 1. One value in each cell of a row and column 2. All values in a column are the same subject 3. Each row is unique 4. No significance to the sequence of columns 5. No significance to the sequence of rows
The concept of normal forms • Improve the simplicity and stability of a database • Reduce redundancy of tables by splitting them into sub-tables that are re-joined at query time
• Large tables common in geographic applications
leads to tendency for non-normalized table designs in GIS
Tax assessment database
(A) raw data
(B) cleaned data in a GIS DBMS
Tax assessment database
(A) raw data
(B) cleaned data in a GIS DBMS
Tax assessment database (C) data partially normalized into three subtables
(D) joined table
SQL • Structured (Standard) Query Language – (pronounced SEQUEL) • Developed by IBM in 1970s • Standard for accessing relational databases • Three types of usage • Stand alone queries (command line) • High level programming (general purpose programming language) • Embedded in other applications (GUI)
• It has geographic capabilities
Types of SQL Statements • Data Definition Language (DDL) • Create, alter and delete data • CREATE TABLE, CREATE INDEX • Data Manipulation Language (DML) • Retrieve and manipulate data • SELECT, UPDATE, DELETE, INSERT • Data Control Languages (DCL) • Control security of data • GRANT, CREATE USER, DROP USER
Example of SQL Syntax SELECT Table 10_3a. ParcelNum, Table 10_3c.Address, Table 10_3a.AssessedValue FROM (Tab10_3b INNER JOIN Tab10_3a ON Tab10_3b.ZoningCode = Tab10_3a.ZoningCode) INNER JOIN Tab10_3c ON Tab10_3a.OwnersName = Tab10_3c.OwnerName WHERE (((Tab10_3a.AssessedValue) > 300000) AND ((Tab10_3b.ZoningType) = “Residential”));
Results of a SQL query against the tables in Figure 10.3C