The gpkg_contents and gpkg_geometry_columns tables
Introduction
GeoPackage uses a system of metadata tables to describe and organize the spatial and non-spatial data stored within the database. Two of the most important metadata tables are gpkg_contents and gpkg_geometry_columns. These tables work together to provide essential information about the datasets in your GeoPackage and their geometric properties.
Understanding these tables is crucial for working effectively with GeoPackages, whether you're creating new datasets, querying existing ones, or building applications that read GeoPackage files.
The gpkg_contents Table
The gpkg_contents table serves as the central registry for all user data tables in a GeoPackage. Every feature table, tile pyramid table, or attributes table must have a row in this table.
Purpose and Role
Think of gpkg_contents as the table of contents for your GeoPackage. It provides a comprehensive inventory of all datasets, describing what they contain and where they're located spatially. This table allows applications to quickly discover what data is available without having to inspect every table in the database.
Table Structure
The gpkg_contents table contains the following columns:
table_name (TEXT, PRIMARY KEY, NOT NULL): The name of the actual data table in the GeoPackage. This must exactly match the name of an existing table or view.
data_type (TEXT, NOT NULL): Describes the type of data stored in the table. Valid values include:
- features for vector feature data
- tiles for raster tile pyramids
- attributes for non-spatial tabular data
- 2d-gridded-coverage for gridded coverage data (extension)
identifier (TEXT, UNIQUE): A human-readable identifier for the dataset, which may be used in user interfaces. If set, it must be unique across all rows.
description (TEXT): A human-readable description of the table contents.
last_change (DATETIME, NOT NULL): The timestamp of the last change to the table content, stored in ISO 8601 format (e.g., 2025-01-14T10:30:00.000Z).
min_x, min_y, max_x, max_y (DOUBLE): The bounding box coordinates that define the spatial extent of the data. For geographic coordinate systems, these would typically be longitude and latitude values. These columns may be NULL for non-spatial attribute tables.
srs_id (INTEGER, FOREIGN KEY): References the spatial reference system defined in the gpkg_spatial_ref_sys table. This identifies the coordinate system used by the data.
Example Entry
Here's what a typical entry in gpkg_contents might look like for a feature table containing city boundaries:
table_name: cities
data_type: features
identifier: World Cities
description: Major cities worldwide with population > 100,000
last_change: 2025-01-14T08:15:30.000Z
min_x: -180.0
min_y: -90.0
max_x: 180.0
max_y: 90.0
srs_id: 4326
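As a minimal sketch, the example row can be written with Python's built-in sqlite3 module. The table definition below is a simplified one assembled from the column list above, not the full DDL from the specification (the foreign key to gpkg_spatial_ref_sys is left out), and the helper name register_cities is hypothetical.

```python
import sqlite3

# Simplified gpkg_contents definition based on the column list above.
# Assumption: the foreign key to gpkg_spatial_ref_sys is omitted for brevity.
CONTENTS_DDL = """
CREATE TABLE gpkg_contents (
    table_name  TEXT NOT NULL PRIMARY KEY,
    data_type   TEXT NOT NULL,
    identifier  TEXT,
    description TEXT,
    last_change DATETIME NOT NULL,
    min_x DOUBLE,
    min_y DOUBLE,
    max_x DOUBLE,
    max_y DOUBLE,
    srs_id INTEGER
)
"""

def register_cities(conn):
    """Create the metadata table and insert the example 'cities' row."""
    conn.execute(CONTENTS_DDL)
    conn.execute(
        "INSERT INTO gpkg_contents VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
        (
            "cities", "features", "World Cities",
            "Major cities worldwide with population > 100,000",
            "2025-01-14T08:15:30.000Z",
            -180.0, -90.0, 180.0, 90.0,
            4326,
        ),
    )
    conn.commit()

conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
register_cities(conn)
row = conn.execute(
    "SELECT data_type, srs_id, min_x, max_y FROM gpkg_contents WHERE table_name = ?",
    ("cities",),
).fetchone()
print(row)
```

Binding the values as parameters rather than formatting them into the SQL string avoids quoting problems with text such as the description.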
Key Concepts
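The bounding box values (min_x, min_y, max_x, max_y) describe the spatial extent of the table's content. Applications use them to decide whether a dataset is likely to intersect an area of interest without examining individual geometries. The specification treats this extent as informative rather than as an exact minimum bounding rectangle (MBR), so it should not be relied on for precise spatial filtering.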
The last_change timestamp is particularly important for caching and synchronization applications, allowing them to detect when data has been updated.
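The extent check described above can be sketched as a single query. This is an illustration only: the function name datasets_intersecting and the harbours and notes rows are hypothetical, and the demo table holds just the columns the query needs.

```python
import sqlite3

def datasets_intersecting(conn, min_x, min_y, max_x, max_y):
    """Return names of tables whose recorded extent overlaps the query box."""
    sql = """
        SELECT table_name FROM gpkg_contents
        WHERE min_x IS NOT NULL            -- skip rows with no recorded extent
          AND min_x <= ? AND max_x >= ?    -- overlap along x
          AND min_y <= ? AND max_y >= ?    -- overlap along y
        ORDER BY table_name
    """
    # Parameter order follows the placeholders: query max_x, min_x, max_y, min_y.
    return [row[0] for row in conn.execute(sql, (max_x, min_x, max_y, min_y))]

# Demo on a cut-down gpkg_contents holding only the columns the query needs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gpkg_contents (table_name TEXT PRIMARY KEY, "
    "min_x DOUBLE, min_y DOUBLE, max_x DOUBLE, max_y DOUBLE)"
)
conn.executemany(
    "INSERT INTO gpkg_contents VALUES (?, ?, ?, ?, ?)",
    [
        ("cities", -180.0, -90.0, 180.0, 90.0),
        ("harbours", 4.0, 51.0, 5.0, 52.0),   # hypothetical second dataset
        ("notes", None, None, None, None),    # attributes table, no extent
    ],
)
around_rome = datasets_intersecting(conn, 10.0, 40.0, 12.0, 42.0)
around_rotterdam = datasets_intersecting(conn, 4.5, 51.5, 6.0, 53.0)
print(around_rome, around_rotterdam)
```

Because the test only compares rectangles, a match means the dataset may contain relevant features, not that it does.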
The gpkg_geometry_columns Table
While gpkg_contents describes all datasets, gpkg_geometry_columns provides specific metadata about the geometry columns in feature tables. This table is only used for tables where data_type is features in gpkg_contents.
Purpose and Role
The gpkg_geometry_columns table describes the geometric characteristics of spatial feature data. It specifies what type of geometries are stored (points, lines, polygons, etc.), which column contains them, and their dimensional properties.
Table Structure
The gpkg_geometry_columns table contains these columns:
table_name (TEXT, NOT NULL, part of the PRIMARY KEY): The name of the feature table, which must exist in gpkg_contents.
column_name (TEXT, NOT NULL, part of the PRIMARY KEY): The name of the geometry column in the feature table. The primary key spans table_name and column_name together, but the specification allows only one geometry column per feature table, so each feature table has exactly one row here.
geometry_type_name (TEXT, NOT NULL): The name of the geometry type stored in this column. Valid values include:
- GEOMETRY (base type, any geometry)
- POINT
- LINESTRING
- POLYGON
- MULTIPOINT
- MULTILINESTRING
- MULTIPOLYGON
- GEOMETRYCOLLECTION
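Dimensionality is not encoded in the type name: GeoPackage does not use suffixed names such as POINTZ or LINESTRINGM in this column. Whether geometries carry Z or M values is recorded in the separate z and m columns.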
srs_id (INTEGER, NOT NULL, FOREIGN KEY): References the spatial reference system in gpkg_spatial_ref_sys. This must match the srs_id in gpkg_contents for the same table.
z (TINYINT, NOT NULL): Indicates whether geometries include Z (elevation) values:
- 0 = prohibited
- 1 = mandatory
- 2 = optional
m (TINYINT, NOT NULL): Indicates whether geometries include M (measure) values:
- 0 = prohibited
- 1 = mandatory
- 2 = optional
Example Entry
For our cities table, the geometry column metadata might look like this:
table_name: cities
column_name: geom
geometry_type_name: POINT
srs_id: 4326
z: 0
m: 0
This indicates that the cities table has a geometry column named geom containing 2D points without elevation or measure values.
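The same reading of the row can be sketched in code. The table definition is a simplified one built from the column list above (foreign keys omitted), and describe_geometry_column is a hypothetical helper, not part of any GeoPackage library.

```python
import sqlite3

# Simplified definition based on the column list above (foreign keys omitted).
GEOMETRY_COLUMNS_DDL = """
CREATE TABLE gpkg_geometry_columns (
    table_name         TEXT NOT NULL,
    column_name        TEXT NOT NULL,
    geometry_type_name TEXT NOT NULL,
    srs_id             INTEGER NOT NULL,
    z                  TINYINT NOT NULL,
    m                  TINYINT NOT NULL,
    PRIMARY KEY (table_name, column_name)
)
"""

FLAG_MEANING = {0: "prohibited", 1: "mandatory", 2: "optional"}

def describe_geometry_column(conn, table_name):
    """Summarize the geometry metadata for one feature table, or return None."""
    row = conn.execute(
        "SELECT column_name, geometry_type_name, srs_id, z, m "
        "FROM gpkg_geometry_columns WHERE table_name = ?",
        (table_name,),
    ).fetchone()
    if row is None:
        return None
    column, geom_type, srs_id, z, m = row
    return (
        f"{table_name}.{column}: {geom_type} in SRS {srs_id}, "
        f"Z {FLAG_MEANING[z]}, M {FLAG_MEANING[m]}"
    )

conn = sqlite3.connect(":memory:")
conn.execute(GEOMETRY_COLUMNS_DDL)
conn.execute(
    "INSERT INTO gpkg_geometry_columns VALUES (?, ?, ?, ?, ?, ?)",
    ("cities", "geom", "POINT", 4326, 0, 0),
)
summary = describe_geometry_column(conn, "cities")
print(summary)
```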
Geometry Type Hierarchy
The geometry types follow a hierarchy. A column defined as GEOMETRY can contain any geometry type, while a column defined as POINT should only contain point geometries. This type constraint helps applications optimize their handling of the data and provides validation.
The Z and M indicators define dimensionality:
- 2D geometries have only X and Y coordinates
- 3D (Z) geometries add elevation values
- M geometries add measure values (often representing distance along a linear feature)
- 4D (ZM) geometries include both elevation and measure
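A small sketch of how the z and m flags could be checked against a geometry's actual dimensions follows. The function name dimensions_allowed is hypothetical, and the caller is assumed to already know whether the geometry carries Z or M values.

```python
def dimensions_allowed(z_flag, m_flag, has_z, has_m):
    """Check a geometry's dimensions against the z and m flags of its column.

    Flag values follow the table above: 0 = prohibited, 1 = mandatory, 2 = optional.
    """
    def flag_ok(flag, present):
        if flag == 0:
            return not present   # value must be absent
        if flag == 1:
            return present       # value must be present
        if flag == 2:
            return True          # either is acceptable
        raise ValueError(f"unknown flag value: {flag!r}")

    return flag_ok(z_flag, has_z) and flag_ok(m_flag, has_m)

# A strictly 2D column (z = 0, m = 0) accepts XY but rejects XYZ.
print(dimensions_allowed(0, 0, has_z=False, has_m=False))
print(dimensions_allowed(0, 0, has_z=True, has_m=False))
```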
Relationship Between the Tables
The two tables work in tandem to fully describe feature datasets:
- gpkg_contents provides the general metadata applicable to all data types: what the dataset is, when it changed, and where it is spatially located.
- gpkg_geometry_columns adds geometry-specific details: what kind of shapes are stored and how they're structured dimensionally.
Every entry in gpkg_geometry_columns must have a corresponding entry in gpkg_contents where the data_type is features. The table_name and srs_id values must be consistent between the two tables.
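These consistency rules can be checked with one LEFT JOIN. The sketch below is illustrative: find_inconsistencies is a hypothetical name, the demo tables are cut down to the columns involved, and the roads, notes, and ghosts rows are invented to trigger each case.

```python
import sqlite3

def find_inconsistencies(conn):
    """Return (table_name, problem) pairs for rows that break the rules above."""
    sql = """
        SELECT g.table_name,
               CASE
                   WHEN c.table_name IS NULL THEN 'missing from gpkg_contents'
                   WHEN c.data_type <> 'features' THEN 'data_type is not features'
                   ELSE 'srs_id mismatch'
               END
        FROM gpkg_geometry_columns AS g
        LEFT JOIN gpkg_contents AS c ON c.table_name = g.table_name
        WHERE c.table_name IS NULL
           OR c.data_type <> 'features'
           OR c.srs_id IS NOT g.srs_id   -- IS NOT also catches a NULL srs_id
        ORDER BY g.table_name
    """
    return conn.execute(sql).fetchall()

# Demo with cut-down tables that keep only the columns involved in the check.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gpkg_contents "
    "(table_name TEXT PRIMARY KEY, data_type TEXT NOT NULL, srs_id INTEGER)"
)
conn.execute(
    "CREATE TABLE gpkg_geometry_columns "
    "(table_name TEXT NOT NULL, column_name TEXT NOT NULL, srs_id INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO gpkg_contents VALUES (?, ?, ?)",
    [("cities", "features", 4326), ("roads", "features", 3857), ("notes", "attributes", None)],
)
conn.executemany(
    "INSERT INTO gpkg_geometry_columns VALUES (?, ?, ?)",
    [("cities", "geom", 4326), ("roads", "geom", 4326),
     ("notes", "geom", 4326), ("ghosts", "geom", 4326)],
)
problems = find_inconsistencies(conn)
for table_name, problem in problems:
    print(table_name, "->", problem)
```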
Practical Usage
Discovering Available Datasets
When an application opens a GeoPackage, it typically queries gpkg_contents first to discover what data is available:
SELECT table_name, identifier, description, data_type
FROM gpkg_contents
WHERE data_type = 'features';
Getting Geometry Details
For each feature table discovered, the application can then query gpkg_geometry_columns to understand the geometry structure:
SELECT column_name, geometry_type_name, z, m
FROM gpkg_geometry_columns
WHERE table_name = 'cities';
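When an application wants both pieces of information at once, the two lookups can be combined into a single JOIN. The sketch below assumes cut-down versions of both tables, and list_feature_layers is a hypothetical helper name.

```python
import sqlite3

def list_feature_layers(conn):
    """Return one dict per feature table, combining both metadata tables."""
    sql = """
        SELECT c.table_name, c.identifier,
               g.column_name, g.geometry_type_name, g.z, g.m
        FROM gpkg_contents AS c
        JOIN gpkg_geometry_columns AS g ON g.table_name = c.table_name
        WHERE c.data_type = 'features'
        ORDER BY c.table_name
    """
    keys = ("table_name", "identifier", "column_name", "geometry_type_name", "z", "m")
    return [dict(zip(keys, row)) for row in conn.execute(sql)]

# Demo with cut-down tables that keep only the columns used by the query.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gpkg_contents "
    "(table_name TEXT PRIMARY KEY, data_type TEXT NOT NULL, identifier TEXT)"
)
conn.execute(
    "CREATE TABLE gpkg_geometry_columns (table_name TEXT NOT NULL, "
    "column_name TEXT NOT NULL, geometry_type_name TEXT NOT NULL, "
    "z TINYINT NOT NULL, m TINYINT NOT NULL)"
)
conn.executemany(
    "INSERT INTO gpkg_contents VALUES (?, ?, ?)",
    [("cities", "features", "World Cities"), ("notes", "attributes", "Field notes")],
)
conn.execute(
    "INSERT INTO gpkg_geometry_columns VALUES (?, ?, ?, ?, ?)",
    ("cities", "geom", "POINT", 0, 0),
)
layers = list_feature_layers(conn)
print(layers)
```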
Validating Data
These tables also support validation. Before inserting features into a table, an application can check the required geometry type and ensure new features conform to the specification.
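One way to sketch such a check is to walk up the geometry type hierarchy from the incoming geometry's type until the declared type is found. The parent map below is a simplification limited to the eight core type names and skips intermediate abstract types. The function name is_assignable is hypothetical, and the declared type is assumed to have been read from geometry_type_name.

```python
# Simplified parent links for the eight core geometry type names.
# Assumption: intermediate abstract types (curves, surfaces) are left out.
PARENT = {
    "POINT": "GEOMETRY",
    "LINESTRING": "GEOMETRY",
    "POLYGON": "GEOMETRY",
    "GEOMETRYCOLLECTION": "GEOMETRY",
    "MULTIPOINT": "GEOMETRYCOLLECTION",
    "MULTILINESTRING": "GEOMETRYCOLLECTION",
    "MULTIPOLYGON": "GEOMETRYCOLLECTION",
}

def is_assignable(declared, actual):
    """Return True if a geometry of type `actual` fits a column declared as `declared`."""
    declared = declared.upper()
    current = actual.upper()
    while current is not None:
        if current == declared:
            return True
        current = PARENT.get(current)  # move one step up; None once past GEOMETRY
    return False

print(is_assignable("GEOMETRY", "POINT"))    # generic column accepts any type
print(is_assignable("POINT", "LINESTRING"))  # specific column rejects other types
```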
Creating New Feature Tables
When creating a new feature table, you must update both metadata tables. The typical sequence is:
- Create the feature table with its geometry column
- Insert a row into gpkg_contents describing the table
- Insert a row into gpkg_geometry_columns describing the geometry column
- Create the spatial index for the geometry column (if using R-tree indexes)
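The first three steps of this sequence can be sketched as follows. The metadata table definitions are simplified (foreign keys omitted), the function create_feature_table and the rivers table with its fid and name columns are hypothetical, and the R-tree step is left out because it depends on the spatial index extension.

```python
import sqlite3
from datetime import datetime, timezone

def create_feature_table(conn, table, geom_column, geom_type, srs_id):
    """Create a feature table and register it in both metadata tables."""
    # ISO 8601 UTC timestamp with millisecond precision.
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    with conn:  # commits on success; rolls back the pending inserts on error
        # Step 1: the feature table itself. Identifiers cannot be bound as
        # parameters, so only trusted names should be passed in.
        conn.execute(
            f'CREATE TABLE "{table}" ('
            f'fid INTEGER PRIMARY KEY AUTOINCREMENT, "{geom_column}" BLOB, name TEXT)'
        )
        # Step 2: register the table in gpkg_contents.
        conn.execute(
            "INSERT INTO gpkg_contents "
            "(table_name, data_type, identifier, last_change, srs_id) "
            "VALUES (?, 'features', ?, ?, ?)",
            (table, table, now, srs_id),
        )
        # Step 3: describe the geometry column (2D only in this sketch).
        conn.execute(
            "INSERT INTO gpkg_geometry_columns "
            "(table_name, column_name, geometry_type_name, srs_id, z, m) "
            "VALUES (?, ?, ?, ?, 0, 0)",
            (table, geom_column, geom_type, srs_id),
        )

# Demo with simplified metadata tables (foreign keys omitted).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gpkg_contents (table_name TEXT NOT NULL PRIMARY KEY, "
    "data_type TEXT NOT NULL, identifier TEXT, description TEXT, "
    "last_change DATETIME NOT NULL, min_x DOUBLE, min_y DOUBLE, "
    "max_x DOUBLE, max_y DOUBLE, srs_id INTEGER)"
)
conn.execute(
    "CREATE TABLE gpkg_geometry_columns (table_name TEXT NOT NULL, "
    "column_name TEXT NOT NULL, geometry_type_name TEXT NOT NULL, "
    "srs_id INTEGER NOT NULL, z TINYINT NOT NULL, m TINYINT NOT NULL, "
    "PRIMARY KEY (table_name, column_name))"
)
create_feature_table(conn, "rivers", "geom", "LINESTRING", 4326)
```

The bounding box columns are left NULL here because the new table has no features yet.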
Common Patterns and Best Practices
Consistent Naming: Use clear, descriptive names for tables and geometry columns. A common convention is to name the geometry column geom or geometry, though any valid SQL column name is allowed.
Bounding Box Maintenance: Keep the bounding box in gpkg_contents up to date. When features are added, modified, or deleted, the spatial extent may change and should be recalculated.
Timestamp Updates: Update the last_change timestamp whenever table content changes. This is critical for synchronization and caching systems.
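Both maintenance tasks can be handled in one UPDATE, sketched below. The function record_change is hypothetical, the new extent is assumed to have been computed by the application beforehand, and the timestamp comes from SQLite's own strftime function.

```python
import re
import sqlite3

def record_change(conn, table_name, extent):
    """Store a new extent and refresh last_change for one gpkg_contents row.

    `extent` is (min_x, min_y, max_x, max_y), computed elsewhere by the caller.
    Returns True if a matching row was updated.
    """
    min_x, min_y, max_x, max_y = extent
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            """
            UPDATE gpkg_contents
            SET min_x = ?, min_y = ?, max_x = ?, max_y = ?,
                last_change = strftime('%Y-%m-%dT%H:%M:%fZ', 'now')
            WHERE table_name = ?
            """,
            (min_x, min_y, max_x, max_y, table_name),
        )
    return cur.rowcount == 1

# Demo on a cut-down gpkg_contents holding only the columns being updated.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gpkg_contents (table_name TEXT PRIMARY KEY, "
    "last_change DATETIME NOT NULL, "
    "min_x DOUBLE, min_y DOUBLE, max_x DOUBLE, max_y DOUBLE)"
)
conn.execute(
    "INSERT INTO gpkg_contents VALUES (?, ?, ?, ?, ?, ?)",
    ("cities", "2025-01-14T08:15:30.000Z", -180.0, -90.0, 180.0, 90.0),
)
updated = record_change(conn, "cities", (-10.5, 35.0, 40.0, 71.2))
stamp, *bbox = conn.execute(
    "SELECT last_change, min_x, min_y, max_x, max_y "
    "FROM gpkg_contents WHERE table_name = 'cities'"
).fetchone()
print(updated, stamp, bbox)
```

Updating the extent and the timestamp in the same statement keeps the two from drifting apart.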
Specific Geometry Types: When possible, use specific geometry types (POINT, LINESTRING, etc.) rather than the generic GEOMETRY type. This provides better validation and allows applications to optimize their handling.
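Single Coordinate System: While the GeoPackage standard allows different tables to use different srs_id values, it's simpler and more efficient to use the same coordinate system throughout your GeoPackage when possible. Within a single feature table, the srs_id in gpkg_geometry_columns must still match the one in gpkg_contents.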
Summary
The gpkg_contents and gpkg_geometry_columns tables form the metadata foundation of a GeoPackage. They provide essential information that allows applications to discover, understand, and properly handle the spatial data within the package. Understanding their structure and relationship is fundamental to working effectively with the GeoPackage format, whether you're building tools to create, read, or analyze GeoPackage data.