Spatial and GIS Glossary of Terms

Warning:
Cockroach Labs will stop providing Assistance Support for v21.2 on May 16, 2023. Prior to that date, upgrade to a more recent version to continue receiving support. For more details, see the Release Support Policy.

This page contains a glossary of terms common to spatial databases and geographic information systems (GIS). Where possible, we provide links to further information.

Note:

This page is provided for reference purposes only. The inclusion of a term in this glossary does not imply that CockroachDB has support for any feature(s) related to that term. For more information about the specific spatial and GIS features supported by CockroachDB, see Working with Spatial Data.

Geometry Terms

  • Bounding box: Given a set of points, a bounding box is the smallest rectangle that encloses all of the points in the set. Due to edge cases in how geographic points are mapped to cartographic projections, a bounding box for a given set of points may be larger than expected.

  • Spheroid: A spheroid (also known as an ellipsoid) is essentially a "slightly squished" sphere. Spheroids are used to represent almost-but-not-quite spherical objects. For example, a spheroid is used to represent the Earth in the World Geodetic System standard.

  • Cartographic projection: A cartographic projection, or map projection, is the process used to represent 3-dimensional (or higher) data on a 2-dimensional surface. This is usually related to how we might display 3-dimensional shapes represented in a database by the GEOGRAPHY data type on the surface of a map, which is a flat plane. For more information, see the GIS Lounge article What is a Map Projection? by Caitlin Dempsey.

  • Covering: The covering of a shape A is a set of locations (in CockroachDB, S2 cell IDs) that comprise another shape B such that no points of A lie outside of B.

  • Nearest-neighbor search: Given a starting point on a map and a set of search criteria, find the specified number of points nearest the starting point that meet the criteria. For example, a nearest-neighbor search can be used to answer the question, "What are the 10 closest Waffle House restaurants to my current location?" This is also sometimes referred to as "k nearest-neighbor" search.

  • SRID: The Spatial Reference System Identifier (SRID) identifies which spatial reference system is used to interpret a spatial object. A commonly used SRID is 4326, which represents spatial data using longitude and latitude coordinates on the Earth's surface as defined in the WGS84 standard.

  • Spatial reference system: Defines how the coordinates of a spatial object are interpreted, that is, what the object "means". For example, a spatial object could use geographic coordinates expressed as latitude and longitude, or a geometry projection using points with X,Y coordinates in a 2-dimensional plane.
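
The bounding box and nearest-neighbor search definitions above can be illustrated with a minimal Python sketch. It assumes points on a flat plane with Euclidean distance, so it ignores the projection edge cases and spheroidal distances mentioned above, and it scans every candidate rather than using a spatial index. The function names `bounding_box` and `k_nearest` are hypothetical and are not part of any CockroachDB API.

```python
import heapq
import math


def bounding_box(points):
    """Return (min_x, min_y, max_x, max_y) for a non-empty list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def k_nearest(origin, candidates, k, predicate=lambda p: True):
    """Return the k candidates closest to origin that satisfy predicate.

    Assumption: planar Euclidean distance; a real GIS would measure
    distance according to the spatial reference system in use.
    """
    matching = (p for p in candidates if predicate(p))
    return heapq.nsmallest(k, matching, key=lambda p: math.dist(origin, p))


# Arbitrary illustrative coordinates.
points = [(2.0, 3.0), (-1.0, 5.0), (4.0, -2.0), (0.5, 0.5)]
box = bounding_box(points)
nearest = k_nearest((0.0, 0.0), points, k=2)
```

The `predicate` argument stands in for the "search criteria" part of the definition: only points that pass it are considered before the k closest are selected.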

Data Types

  • GEOGRAPHY: Used to represent shapes relative to locations on the Earth's spheroidal surface.

Data Formats

  • WKT: The "Well Known Text" data format is a convenient human-readable notation for representing spatial objects. For example, a 2-dimensional point object with x- and y-coordinates is represented in WKT as POINT(123 456); the coordinates are separated by a space, not a comma. This format is defined by the OGC. For more information, see the Well Known Text documentation.

  • EWKT: The "Extended Well Known Text" data format extends WKT by prepending an SRID to the shape's description. For more information, see the Well Known Text documentation.

  • WKB: The "Well Known Binary" data format is a convenient machine-readable binary representation for spatial objects. For efficiency, an application may choose to use this data format, but humans may prefer to read WKT. This format is defined by the OGC. For more information, see Well Known Binary.

  • EWKB: The "Extended Well Known Binary" data format extends WKB by prepending SRID information to the shape's description. For more information, see Well Known Binary.
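
To make the relationship between these four formats concrete, the following sketch encodes a single 2-dimensional point in each of them using only the Python standard library. The helper names are hypothetical. The EWKB layout shown (an SRID flag bit set in the geometry type, followed by a 4-byte SRID) follows the convention used by PostGIS; treat it as an assumption rather than a statement of what any particular database emits.

```python
import struct

WKB_POINT = 1                # OGC geometry type code for Point
EWKB_SRID_FLAG = 0x20000000  # assumption: PostGIS-style "has SRID" flag bit


def point_wkt(x, y):
    """WKT: coordinates are separated by a space."""
    return f"POINT({x} {y})"


def point_ewkt(x, y, srid):
    """EWKT: the SRID is prepended to the WKT text."""
    return f"SRID={srid};{point_wkt(x, y)}"


def point_wkb(x, y):
    """WKB: byte-order marker (1 = little-endian), uint32 type, two float64s."""
    return struct.pack("<BIdd", 1, WKB_POINT, x, y)


def point_ewkb(x, y, srid):
    """EWKB: like WKB, with the SRID flag set and the SRID after the type."""
    return struct.pack("<BIIdd", 1, WKB_POINT | EWKB_SRID_FLAG, srid, x, y)


wkt = point_wkt(1.0, 2.0)            # text form
ewkt = point_ewkt(1.0, 2.0, 4326)    # text form with SRID
wkb_hex = point_wkb(1.0, 2.0).hex()  # 21 bytes, shown as hex
ewkb_hex = point_ewkb(1.0, 2.0, 4326).hex()  # 25 bytes, shown as hex
```

The binary forms are compact and cheap to parse, while the text forms are easier to read and to paste into a SQL statement.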

Organizations

  • OGC: The Open Geospatial Consortium was formerly known as the "Open GIS Consortium". The organization is still referred to colloquially as "OpenGIS" in many places online. The OGC is a consortium of businesses, government agencies, universities, etc., described as "a worldwide community committed to improving access to geospatial (location) information."

  • Mapbox: A company providing a location data platform for mobile and web applications. For more information, see https://www.mapbox.com/.

  • Esri: A company providing "location intelligence" services. Esri develops spatial and GIS software, including the popular ArcGIS package. For more information about Esri, see https://www.esri.com.

Industry Standards

  • DE-9IM: The Dimensionally Extended nine-Intersection Model (DE-9IM) defines a method that uses a 3x3 matrix to determine whether two shapes (1) touch along a boundary, (2) intersect (overlap), or (3) are equal to each other; that is, they are the same shape that covers the same area. This notation is used by the ST_Relate built-in function. Almost all other spatial predicate functions can be logically implemented using this model. However, in practice, most are not, and ST_Relate is reserved for advanced use cases.
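
A DE-9IM matrix is commonly written as a 9-character string, read row by row, in which each character is F (no intersection) or the dimension 0, 1, or 2 of the intersection. A predicate is then a pattern over that string, where T means "any of 0, 1, 2" and * means "do not care". The sketch below shows only this pattern-matching step; it does not compute the matrix from geometries. The function name `relate_match` is hypothetical, and the sample matrices are typical values for pairs of polygons, supplied here by hand.

```python
def relate_match(matrix, pattern):
    """Check a 9-character DE-9IM matrix string against a pattern string."""
    if len(matrix) != 9 or len(pattern) != 9:
        raise ValueError("DE-9IM strings must have exactly 9 characters")
    for cell, want in zip(matrix.upper(), pattern.upper()):
        if want == "*":
            continue  # wildcard: any value is acceptable
        if want == "T":
            if cell not in "012":
                return False  # T requires some intersection
        elif cell != want:
            return False  # F, 0, 1, 2 must match exactly
    return True


EQUALS = "T*F**FFF*"
OVERLAPS_AREA = "T*T***T**"  # overlap pattern for two areas
TOUCHES = ("FT*******", "F**T*****", "F***T****")

# Hand-written matrices for pairs of polygons (assumed inputs).
same_polygon = "2FFF1FFF2"
shared_edge = "FF2F11212"
partial_overlap = "212101212"

is_equal = relate_match(same_polygon, EQUALS)
is_touching = any(relate_match(shared_edge, p) for p in TOUCHES)
is_overlapping = relate_match(partial_overlap, OVERLAPS_AREA)
```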

File Formats

  • Shapefile: A spatial data file format developed by Esri and used by GIS software for storing geospatial data. It can be automatically converted to SQL by tools like shp2pgsql for use by a database that can run spatial queries.

  • Vector file: A file format that uses a non-pixel-based, abstract coordinate representation for geospatial data. Because it is abstract and not tied to pixels, the vector format is scalable. The motivation is similar to that behind the Scalable Vector Graphics (SVG) image format: scaling the image up or down does not reveal any "jaggedness" (due to loss of information) such as might be revealed by a pixel representation. However, vector files are usually much larger in size and more expensive (in terms of CPU, memory, and disk) to work with than raster files.

  • Raster file: A file format that uses a non-scalable, pixel-based representation for geospatial data. Raster files are smaller and generally faster to read, write, or generate than vector files. However, raster files have inferior image quality and/or accuracy when compared to vector files: they can appear "jagged" because they carry less information.

  • GeoJSON: A format for encoding geometric and geographic data as JSON. For more information, see GeoJSON.
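
As a small illustration of the GeoJSON entry above, the following sketch builds a GeoJSON Feature containing one point and round-trips it through the standard `json` module. The helper name, the coordinates, and the property values are arbitrary placeholders. Note that GeoJSON positions list longitude first, then latitude.

```python
import json


def point_feature(lon, lat, properties):
    """Build a GeoJSON Feature dict for a single point (longitude first)."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties,
    }


# Placeholder values for illustration only.
feature = point_feature(-73.9857, 40.7484, {"name": "example location"})
encoded = json.dumps(feature)  # GeoJSON text, ready to store or transmit
decoded = json.loads(encoded)  # back to a Python dict
```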

Software and Code Libraries

  • GIS: A "Geographic Information System" (or GIS) is used to store geographic information in a computer for processing and interaction by humans and/or other software. Some systems provide graphical "point and click" user interfaces, and some are embedded in programming languages or data query languages like SQL. For example, CockroachDB versions 20.2 and later provide support for executing spatial queries from SQL.

  • ArcGIS: A commercial GIS software package developed by the location intelligence company Esri. For more information, see Esri's ArcGIS overview.

  • PostGIS: An extension to the PostgreSQL database that adds support for geospatial queries. For more information, see postgis.net.

  • GEOS: An open source geometry library used by CockroachDB, PostGIS, and other projects to provide the calculations underlying various spatial predicate functions and operators. For more information, see http://trac.osgeo.org/geos/.

  • GeographicLib: A C++ library for performing various geographic and other calculations used by CockroachDB and other projects. For more information, see https://geographiclib.sourceforge.io.

  • CGAL: The computational geometry algorithms library. For more information, see https://www.cgal.org.

  • TIGER: The "Topologically Integrated Geographic Encoding and Referencing" system released by the U.S. Census Bureau.

  • S2: The S2 Geometry Library is a C++ code library for performing spherical geometry computations. It models a sphere using a quadtree "divide the space" approach, and is used by CockroachDB.
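
The quadtree "divide the space" idea mentioned in the S2 entry can be sketched on a flat unit square. This is a deliberately simplified illustration and is not how S2 actually numbers its cells: S2 works on the faces of a cube projected onto the sphere and orders cells along a space-filling curve. The function name `quadtree_path` and the child numbering are assumptions made for this example.

```python
def quadtree_path(x, y, level):
    """Return the child indexes (0-3) leading to the cell that contains (x, y).

    The unit square [0, 1) x [0, 1) is split into four quadrants at each
    level. Child numbering (an arbitrary choice for this sketch):
    0 = lower-left, 1 = lower-right, 2 = upper-left, 3 = upper-right.
    """
    if not (0.0 <= x < 1.0 and 0.0 <= y < 1.0):
        raise ValueError("point must lie inside the unit square")
    x0, y0, size = 0.0, 0.0, 1.0  # lower-left corner and side of current cell
    path = []
    for _ in range(level):
        size /= 2
        right = x >= x0 + size
        top = y >= y0 + size
        path.append(int(right) + 2 * int(top))
        if right:
            x0 += size
        if top:
            y0 += size
    return path


coarse = quadtree_path(0.8, 0.3, 2)
fine = quadtree_path(0.8, 0.3, 3)
```

Each extra level appends one index, so the path of a coarse cell is a prefix of the path of every finer cell inside it. That prefix property is what makes a set of cells usable as a covering of a shape.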

Spatial Objects

This section has information about the representation of geometric and geographic "shapes" according to the SQL/MM standard.

  • Point: A point is a sizeless location identified by its X and Y coordinates. These coordinates are then translated according to the spatial reference system to determine what the point "is", or what it "means" relative to the other geometric objects (if any) in the data set. A point can be created in SQL by the ST_Point() function.

  • LineString: A linestring is a collection of points that are "strung together" into one geometric object, like a necklace. If the "necklace" were "closed", it could also represent a polygon. A linestring with enough points can also be used to approximate an arbitrary curve, such as a Bézier curve.

  • Polygon: A polygon is a closed shape that can be made up of straight or curved lines. It can be thought of as a "closed" linestring. Irregular polygons can take on almost any arbitrary shape. Common regular polygons include equilateral triangles, squares, and regular hexagons. For more information about regular polygons, see the 'Regular polygon' Wikipedia article.

  • GeometryCollection: A geometry collection is a "box" or "bag" used for gathering one or more of the other types of objects defined above (points, linestrings, or polygons) into a collection. In the particular case of SQL, it provides a way of referring to a group of spatial objects as one "thing" so that you can operate on them more conveniently using various SQL functions.
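
The relationships between these object types can be seen in their WKT text forms. The sketch below uses hypothetical helper functions that build WKT strings for each type; it performs no validation beyond closing a polygon ring, and it handles only polygons with straight edges and no holes.

```python
def _coords(points):
    """Format (x, y) pairs as WKT coordinates: 'x y, x y, ...'."""
    return ", ".join(f"{x} {y}" for x, y in points)


def point(x, y):
    return f"POINT({x} {y})"


def linestring(points):
    return f"LINESTRING({_coords(points)})"


def polygon(ring):
    """A polygon ring is a 'closed necklace': the first point is repeated last."""
    ring = list(ring)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # close the ring if the caller did not
    return f"POLYGON(({_coords(ring)}))"


def geometry_collection(*wkts):
    """Group several WKT objects so they can be handled as one value."""
    return f"GEOMETRYCOLLECTION({', '.join(wkts)})"


p = point(1, 2)
line = linestring([(0, 0), (1, 1)])
triangle = polygon([(0, 0), (1, 0), (1, 1)])
bag = geometry_collection(p, line)
```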

Spatial System Tables

  • pg_extension: A table used by the PostgreSQL database to store information about extensions to that database. Provided for compatibility by CockroachDB. For more information, see the PostgreSQL documentation.

  • spatial_ref_sys: A SQL table defined by the OGC that holds the list of SRIDs supported by a database, e.g., SELECT count(*) FROM spatial_ref_sys;

  • geometry_columns: Used to list all of the columns in a database with the GEOMETRY data type, e.g., SELECT * FROM geometry_columns;

  • geography_columns: Used to list all of the columns in a database with the GEOGRAPHY data type, e.g., SELECT * FROM geography_columns;
