The gpkg_metadata and gpkg_metadata_reference tables
Metadata — data about data — is essential for understanding, managing, and preserving geospatial information. For Danish municipalities, proper metadata ensures compliance with INSPIRE directives, facilitates data sharing between departments and external partners, and maintains institutional knowledge about spatial datasets.
GeoPackage provides a robust, standardized mechanism for storing metadata directly within the database file through two core tables: gpkg_metadata and gpkg_metadata_reference. This approach keeps metadata and data together, preventing the common problem of "orphaned" datasets where documentation is lost or separated from the actual data.
In this section, you'll learn how these tables work, how to populate them, and why they're crucial for municipal GIS data management.
Why Metadata Matters for Danish Municipalities
Before diving into the technical details, consider these practical scenarios:
Scenario 1: Data Handover
A consultant delivers a GeoPackage containing updated utility network data. Without metadata, your staff must figure out: What coordinate system is this in? When was it collected? What accuracy can we expect? Who can we contact with questions?
Scenario 2: INSPIRE Compliance
Danish municipalities must provide metadata for spatial datasets according to INSPIRE requirements. Embedding this metadata in the GeoPackage ensures it's always available and correctly associated with the data.
Scenario 3: Long-term Archival
A planning project from 2020 needs to be reviewed in 2025. With proper metadata, future staff can understand the data's context, limitations, and original purpose without tracking down the original creators.
Scenario 4: Data Discovery
Your municipality maintains dozens of GeoPackages. Rich metadata enables staff to search and discover relevant datasets without opening each file individually.
Understanding the Two-Table Structure
GeoPackage separates metadata into two complementary tables:
The gpkg_metadata Table
This table stores the actual metadata content. Think of it as a library of metadata documents, where each row contains one complete metadata record.
Table structure:
| Column | Type | Description |
|---|---|---|
id |
INTEGER PRIMARY KEY | Unique identifier for each metadata record |
md_scope |
TEXT NOT NULL | The scope of what this metadata describes |
md_standard_uri |
TEXT NOT NULL | The metadata standard being used |
mime_type |
TEXT NOT NULL | Format of the metadata (usually 'text/xml') |
metadata |
TEXT NOT NULL | The actual metadata content |
The gpkg_metadata_reference Table
This table creates links between metadata records and the data they describe. It answers the question: "Which metadata applies to which data?"
Table structure:
| Column | Type | Description |
|---|---|---|
reference_scope |
TEXT NOT NULL | What type of thing is being described |
table_name |
TEXT | Name of the table being described (if applicable) |
column_name |
TEXT | Name of the column being described (if applicable) |
row_id_value |
INTEGER | Specific row being described (if applicable) |
timestamp |
DATETIME NOT NULL | When this reference was created |
md_file_id |
INTEGER NOT NULL | Foreign key to gpkg_metadata.id |
md_parent_id |
INTEGER | Reference to parent metadata record |
This separation allows:
- Reuse: One metadata record can describe multiple tables or datasets
- Hierarchy: Metadata can inherit from parent records
- Flexibility: Different granularity levels (whole GeoPackage, specific table, single row)
Metadata Scope Values
The md_scope column in gpkg_metadata defines what level of information the metadata describes. Understanding these values is crucial for organizing your metadata correctly.
Standard Scope Values
dataset
Describes an entire collection of related data. Use for metadata that applies to a complete GeoPackage or a major dataset within it.
Example: Metadata describing all cadastral parcels in Brøndby Kommune stored in a GeoPackage.
featureType (or "series")
Describes a category or type of features. Use for metadata about a specific layer or table.
Example: Metadata describing the "buildings" table structure and attributes.
attribute
Describes a specific attribute or column. Use for detailed documentation of individual fields.
Example: Metadata explaining what the "byg_anvend" (building use) code values mean.
attributeType
Describes a type of attribute used across multiple tables.
Example: Metadata defining the CVR number format used in various business-related tables.
feature
Describes an individual geographic feature or row.
Example: Metadata for a specific protected building with historical significance.
fieldSession
Describes data collection activities during a specific time period.
Example: Metadata documenting a week-long field survey of street furniture.
model
Describes a data model or schema.
Example: Metadata documenting your municipality's custom asset management data structure.
catalog
Describes a collection of datasets.
Example: Metadata for a GeoPackage containing multiple environmental monitoring datasets.
Metadata Standards and Formats
Common Metadata Standards
The md_standard_uri column identifies which metadata standard is being used. For Danish municipalities, the most relevant standards are:
ISO 19115:2014 (Geographic Information — Metadata)
The international standard for geographic metadata. URI: http://www.isotc211.org/2005/gmd
Use when: Creating comprehensive metadata for INSPIRE compliance or formal data exchange.
ISO 19139 (XML schema implementation of ISO 19115)
The XML encoding of ISO 19115. URI: http://www.isotc211.org/2005/gmd
Use when: You need machine-readable XML metadata that follows international standards.
Dublin Core
A simpler, more general metadata standard. URI: http://purl.org/dc/elements/1.1/
Use when: You need basic, easy-to-create metadata for internal use.
Custom/Internal
Your municipality's own metadata schema. URI: http://municipality.dk/metadata/v1
Use when: You have specific internal requirements not covered by standards.
MIME Types
The mime_type column specifies the format of the metadata content:
- text/xml: XML-formatted metadata (most common for ISO 19115/19139)
- text/plain: Plain text metadata
- application/json: JSON-formatted metadata
- text/html: HTML-formatted metadata (less common but allowed)
Practical Examples: Creating Metadata
Example 1: Basic Dataset-Level Metadata (Dublin Core)
Let's create simple metadata for a GeoPackage containing playground equipment locations in Brøndby Kommune.
-- Insert metadata record
INSERT INTO gpkg_metadata (
md_scope,
md_standard_uri,
mime_type,
metadata
) VALUES (
'dataset',
'http://purl.org/dc/elements/1.1/',
'text/xml',
'<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title>Playground Equipment Inventory - Brøndby Kommune</dc:title>
<dc:creator>Park and Recreation Department</dc:creator>
<dc:subject>Playground equipment, Parks, Public facilities</dc:subject>
<dc:description>Complete inventory of playground equipment across all municipal parks in Brøndby Kommune. Includes equipment type, installation date, last inspection date, and condition assessment.</dc:description>
<dc:date>2024-01-15</dc:date>
<dc:type>Dataset</dc:type>
<dc:format>GeoPackage 1.3</dc:format>
<dc:identifier>broendby-playground-2024</dc:identifier>
<dc:language>da</dc:language>
<dc:coverage>Brøndby Kommune, Denmark</dc:coverage>
<dc:rights>Internal use only - Brøndby Kommune</dc:rights>
</metadata>'
);
-- Create reference linking metadata to the GeoPackage
INSERT INTO gpkg_metadata_reference (
reference_scope,
table_name,
timestamp,
md_file_id
) VALUES (
'geopackage',
NULL,
datetime('now'),
(SELECT MAX(id) FROM gpkg_metadata)
);
Explanation:
- The metadata record uses Dublin Core, a simple standard perfect for basic documentation
md_scopeis 'dataset' because this describes the entire collectionreference_scopeis 'geopackage' to apply metadata to the entire filetable_nameis NULL because this applies to the whole GeoPackage, not a specific table
Example 2: Table-Level Metadata
More specific metadata for just the playground equipment table:
-- Insert table-specific metadata
INSERT INTO gpkg_metadata (
md_scope,
md_standard_uri,
mime_type,
metadata
) VALUES (
'featureType',
'http://purl.org/dc/elements/1.1/',
'text/xml',
'<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title>Playground Equipment Points</dc:title>
<dc:description>Point features representing individual pieces of playground equipment. Each point represents one play structure, swing set, or other equipment item. Collected via field survey with GPS accuracy better than 2 meters.</dc:description>
<dc:date>2024-01-15</dc:date>
<dc:spatial>EPSG:25832 (ETRS89 / UTM zone 32N)</dc:spatial>
<dc:source>Field survey January 2024, GPS Trimble R2</dc:source>
</metadata>'
);
-- Link metadata to specific table
INSERT INTO gpkg_metadata_reference (
reference_scope,
table_name,
timestamp,
md_file_id
) VALUES (
'table',
'playground_equipment',
datetime('now'),
(SELECT MAX(id) FROM gpkg_metadata)
);
Example 3: Column/Attribute Metadata
Documenting what a specific column means:
-- Insert column-specific metadata
INSERT INTO gpkg_metadata (
md_scope,
md_standard_uri,
mime_type,
metadata
) VALUES (
'attribute',
'http://purl.org/dc/elements/1.1/',
'text/plain',
'Equipment Type Classification
Values:
- SWING: Swing sets (single or multiple seats)
- SLIDE: Slides of any configuration
- CLIMB: Climbing structures (walls, frames, nets)
- SPRING: Spring-mounted riders
- SPIN: Merry-go-rounds and spinning equipment
- SANDBOX: Sand play areas
- OTHER: Miscellaneous equipment
This classification follows Danish playground safety standard DS/EN 1176.
Last updated: 2024-01-15'
);
-- Link to specific column
INSERT INTO gpkg_metadata_reference (
reference_scope,
table_name,
column_name,
timestamp,
md_file_id
) VALUES (
'column',
'playground_equipment',
'equipment_type',
datetime('now'),
(SELECT MAX(id) FROM gpkg_metadata)
);
Note: Plain text is used here for simplicity, but you could also use XML for structured documentation.
Example 4: ISO 19115/19139 Metadata for INSPIRE
For INSPIRE compliance, you need more comprehensive metadata following ISO 19115:
INSERT INTO gpkg_metadata (
md_scope,
md_standard_uri,
mime_type,
metadata
) VALUES (
'dataset',
'http://www.isotc211.org/2005/gmd',
'text/xml',
'<?xml version="1.0" encoding="UTF-8"?>
<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
xmlns:gco="http://www.isotc211.org/2005/gco">
<gmd:fileIdentifier>
<gco:CharacterString>broendby-cadastral-2024-v1</gco:CharacterString>
</gmd:fileIdentifier>
<gmd:language>
<gco:CharacterString>dan</gco:CharacterString>
</gmd:language>
<gmd:characterSet>
<gmd:MD_CharacterSetCode codeListValue="utf8"
codeList="http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#MD_CharacterSetCode">
utf8
</gmd:MD_CharacterSetCode>
</gmd:characterSet>
<gmd:contact>
<gmd:CI_ResponsibleParty>
<gmd:organisationName>
<gco:CharacterString>Brøndby Kommune, Teknik og Miljø</gco:CharacterString>
</gmd:organisationName>
<gmd:contactInfo>
<gmd:CI_Contact>
<gmd:address>
<gmd:CI_Address>
<gmd:electronicMailAddress>
<gco:CharacterString>[email protected]</gco:CharacterString>
</gmd:electronicMailAddress>
</gmd:CI_Address>
</gmd:address>
</gmd:CI_Contact>
</gmd:contactInfo>
<gmd:role>
<gmd:CI_RoleCode codeList="http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_RoleCode"
codeListValue="pointOfContact">pointOfContact</gmd:CI_RoleCode>
</gmd:role>
</gmd:CI_ResponsibleParty>
</gmd:contact>
<gmd:dateStamp>
<gco:Date>2024-01-15</gco:Date>
</gmd:dateStamp>
<gmd:identificationInfo>
<gmd:MD_DataIdentification>
<gmd:citation>
<gmd:CI_Citation>
<gmd:title>
<gco:CharacterString>Cadastral Parcels - Brøndby Kommune</gco:CharacterString>
</gmd:title>
<gmd:date>
<gmd:CI_Date>
<gmd:date>
<gco:Date>2024-01-15</gco:Date>
</gmd:date>
<gmd:dateType>
<gmd:CI_DateTypeCode codeList="http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_DateTypeCode"
codeListValue="publication">publication</gmd:CI_DateTypeCode>
</gmd:dateType>
</gmd:CI_Date>
</gmd:date>
</gmd:CI_Citation>
</gmd:citation>
<gmd:abstract>
<gco:CharacterString>Complete cadastral parcel dataset for Brøndby Kommune including matrikel numbers, ownership information, and parcel boundaries. Updated quarterly from national cadastral register.</gco:CharacterString>
</gmd:abstract>
<gmd:language>
<gco:CharacterString>dan</gco:CharacterString>
</gmd:language>
<gmd:topicCategory>
<gmd:MD_TopicCategoryCode>planningCadastre</gmd:MD_TopicCategoryCode>
</gmd:topicCategory>
</gmd:MD_DataIdentification>
</gmd:identificationInfo>
</gmd:MD_Metadata>'
);
INSERT INTO gpkg_metadata_reference (
reference_scope,
table_name,
timestamp,
md_file_id
) VALUES (
'table',
'cadastral_parcels',
datetime('now'),
(SELECT MAX(id) FROM gpkg_metadata)
);
Note: ISO 19115 XML can be quite verbose. Many organizations use metadata creation tools that generate this XML automatically.
Working with Metadata in QGIS
Viewing Metadata
QGIS provides built-in tools for viewing GeoPackage metadata:
Method 1: Layer Properties
- Right-click a layer from a GeoPackage
- Select Properties
- Go to the Metadata tab
- QGIS will display any metadata associated with that layer
Method 2: DB Manager
- Open Database → DB Manager
- Expand GeoPackage → Select your GeoPackage
- Navigate to the Info tab
- Click Metadata to view all metadata tables
Method 3: Direct SQL Query
In DB Manager, execute SQL to view metadata:
-- View all metadata records
SELECT
m.id,
m.md_scope,
m.md_standard_uri,
substr(m.metadata, 1, 100) as metadata_preview
FROM gpkg_metadata m;
-- View metadata references
SELECT
mr.reference_scope,
mr.table_name,
mr.column_name,
mr.timestamp,
substr(m.metadata, 1, 100) as metadata_preview
FROM gpkg_metadata_reference mr
JOIN gpkg_metadata m ON mr.md_file_id = m.id;
Creating Metadata in QGIS
Using the Metadata Editor:
- Right-click layer → Properties → Metadata tab
- Fill in the metadata form:
- Identification: Title, abstract, keywords
- Categories: Select relevant ISO topic categories
- Contacts: Add responsible organizations/individuals
- Links: Related resources and documentation
- History: Data lineage and processing steps
- Extent: Spatial and temporal coverage
- Click OK to save
QGIS will automatically create entries in gpkg_metadata and gpkg_metadata_reference tables.
Important limitation: QGIS's metadata editor primarily works with ISO 19115/19139 format. For simpler Dublin Core metadata or custom formats, you'll need to use SQL directly.
Exporting Metadata
Export metadata to XML files for external use:
- Copy the XML content
- Save to a
.xmlfile - Use for uploading to metadata catalogs or INSPIRE geoportals
In DB Manager, query the metadata:
SELECT metadata FROM gpkg_metadata WHERE id = 1;
Hierarchical Metadata with md_parent_id
The md_parent_id column in gpkg_metadata_reference enables metadata inheritance, reducing redundancy and maintaining consistency.
Concept
Child metadata can reference a parent metadata record, inheriting its properties while adding or overriding specific details.
Use case: You have general metadata for all road data, but specific tables (primary_roads, secondary_roads) need additional details.
Example: Hierarchical Road Network Metadata
-- Parent metadata: General road network information
INSERT INTO gpkg_metadata (
md_scope,
md_standard_uri,
mime_type,
metadata
) VALUES (
'dataset',
'http://purl.org/dc/elements/1.1/',
'text/xml',
'<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title>Road Network - Brøndby Kommune</dc:title>
<dc:creator>Roads and Infrastructure Department</dc:creator>
<dc:description>Complete road network data including classification, surface type, and maintenance responsibility.</dc:description>
<dc:date>2024-01-20</dc:date>
<dc:spatial>EPSG:25832</dc:spatial>
<dc:rights>Internal use - Brøndby Kommune</dc:rights>
</metadata>'
);
-- Store the parent ID for reference
-- In practice, you'd do this programmatically
-- Let's say this gets id = 10
-- Child metadata: Specific to primary roads
INSERT INTO gpkg_metadata (
md_scope,
md_standard_uri,
mime_type,
metadata
) VALUES (
'featureType',
'http://purl.org/dc/elements/1.1/',
'text/xml',
'<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title>Primary Roads</dc:title>
<dc:description>Major arterial roads maintained by the municipality. Width >= 6 meters. Updated quarterly from road inspection database.</dc:description>
<dc:source>Road inspection database, quarterly export</dc:source>
</metadata>'
);
-- Link child metadata to primary_roads table with parent reference
INSERT INTO gpkg_metadata_reference (
reference_scope,
table_name,
timestamp,
md_file_id,
md_parent_id
) VALUES (
'table',
'primary_roads',
datetime('now'),
(SELECT MAX(id) FROM gpkg_metadata), -- Child metadata
10 -- Parent metadata id
);
Benefit: Applications can reconstruct complete metadata by combining parent and child records, avoiding duplication of common information.
Best Practices for Municipal GeoPackage Metadata
1. Establish Metadata Standards
Create a municipal metadata policy:
- Which metadata standard to use (recommend ISO 19115 for external sharing, Dublin Core for internal)
- Required fields for different data types
- Who is responsible for creating and updating metadata
- Review and update frequency
Template approach: Create metadata templates for common dataset types:
- Cadastral data template
- Infrastructure asset template
- Environmental monitoring template
- Planning data template
2. Metadata at Multiple Levels
Provide metadata at appropriate granularity:
GeoPackage Level
├── Dataset metadata (overall project or collection)
│
├── Table Level
│ ├── Layer 1 metadata (structure, purpose, source)
│ └── Layer 2 metadata
│
└── Column Level
├── Attribute 1 documentation (code lists, definitions)
└── Attribute 2 documentation
Don't over-document: Not every column needs metadata. Focus on:
- Non-obvious field names
- Coded values or classifications
- Fields with specific formatting requirements
- Fields with complex calculation methods
3. Minimum Required Metadata
Every GeoPackage should include at least:
Identification:
- Title and description
- Creation date
- Creator/responsible party
- Contact information
Spatial Reference:
- Coordinate system (EPSG code)
- Geographic extent
- Positional accuracy
Data Quality:
- Source of data
- Collection method
- Accuracy/precision
- Known limitations
Access and Use:
- Usage restrictions
- License or terms of use
- Update frequency
4. Use Danish Language Appropriately
For internal municipal use:
- Write metadata in Danish
- Use
<dc:language>da</dc:language>or<gmd:language>dan</gmd:language> - Include Danish keywords and terminology
For INSPIRE or international sharing:
- Consider bilingual metadata (Danish and English)
- Use standardized INSPIRE keywords in English
- Maintain Danish-specific content separately
5. Keep Metadata Current
Establish update triggers:
- Update metadata when data is updated
- Include
last_modifiedtimestamps - Document version history
- Note major changes in data structure or content
Example versioning in metadata:
<dc:description>
Version 2.1 - Updated 2024-01-20
Changes: Added equipment_manufacturer field, corrected 12 location errors from GPS re-survey.
Version 2.0 - Updated 2023-06-15
Changes: Complete re-survey of all playgrounds, new safety inspection fields added.
Version 1.0 - Initial version 2023-01-10
</dc:description>
6. Automate When Possible
Use scripts to populate standard metadata fields:
# Python example using sqlite3
import sqlite3
from datetime import datetime
def add_basic_metadata(gpkg_path, title, description, creator):
"""Add basic Dublin Core metadata to GeoPackage"""
conn = sqlite3.connect(gpkg_path)
cursor = conn.cursor()
metadata_xml = f"""<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title>{title}</dc:title>
<dc:creator>{creator}</dc:creator>
<dc:description>{description}</dc:description>
<dc:date>{datetime.now().strftime('%Y-%m-%d')}</dc:date>
<dc:format>GeoPackage 1.3</dc:format>
<dc:language>da</dc:language>
</metadata>"""
# Insert metadata
cursor.execute("""
INSERT INTO gpkg_metadata (md_scope, md_standard_uri, mime_type, metadata)
VALUES (?, ?, ?, ?)
""", ('dataset', 'http://purl.org/dc/elements/1.1/', 'text/xml', metadata_xml))
md_id = cursor.lastrowid
# Create reference
cursor.execute("""
INSERT INTO gpkg_metadata_reference
(reference_scope, table_name, timestamp, md_file_id)
VALUES (?, ?, ?, ?)
""", ('geopackage', None, datetime.now().isoformat(), md_id))
conn.commit()
conn.close()
# Usage
add_basic_metadata(
'road_network.gpkg',
'Road Network - Brøndby Kommune',
'Municipal road network with classification and maintenance data',
'Brøndby Kommune, Roads Department'
)
7. Validate Metadata
Check for completeness:
-- Find tables without metadata
SELECT table_name
FROM gpkg_contents
WHERE table_name NOT IN (
SELECT DISTINCT table_name
FROM gpkg_metadata_reference
WHERE table_name IS NOT NULL
);
-- Find metadata records not referenced anywhere
SELECT m.id, m.md_scope, substr(m.metadata, 1, 50)
FROM gpkg_metadata m
WHERE m.id NOT IN (
SELECT md_file_id FROM gpkg_metadata_reference
);
Validate XML structure: Use XML validators to ensure ISO 19115/19139 metadata is well-formed and valid against the schema.
Common Pitfalls and How to Avoid Them
Pitfall 1: Orphaned Metadata
Problem: Metadata records exist but aren't linked via gpkg_metadata_reference.
Solution: Always create both the metadata record AND the reference. Use transactions:
BEGIN TRANSACTION;
INSERT INTO gpkg_metadata (...) VALUES (...);
INSERT INTO gpkg_metadata_reference (...) VALUES (..., last_insert_rowid());
COMMIT;
Pitfall 2: Incorrect reference_scope
Problem: Using wrong scope value (e.g., 'table' when it should be 'geopackage').
Valid values:
geopackage: Entire GeoPackage filetable: Specific tablecolumn: Specific columnrow: Specific rowrow/col: Specific cell
Check your references:
-- This should return no rows if all scopes are valid
SELECT * FROM gpkg_metadata_reference
WHERE reference_scope NOT IN ('geopackage', 'table', 'column', 'row', 'row/col');
Pitfall 3: Metadata Encoding Issues
Problem: Danish characters (æ, ø, å) appear corrupted.
Solution:
- Always use UTF-8 encoding
- Include proper XML declaration:
<?xml version="1.0" encoding="UTF-8"?> - Verify GeoPackage application_id is set correctly
Pitfall 4: Massive XML Blobs
Problem: Very large ISO 19115 XML makes the GeoPackage slow.
Solution:
- For detailed documentation, link to external resources instead of embedding everything
- Use
md_parent_idto avoid repetition - Consider storing extensive metadata externally and keeping only essential metadata in GeoPackage
Pitfall 5: Stale Metadata
Problem: Metadata becomes outdated as data changes.
Solution:
- Implement update triggers or workflows
- Include "last verified" dates
- Automate metadata updates where possible
- Regular metadata audits (quarterly or annually)
Advanced: Programmatic Metadata Management
For municipalities managing many GeoPackages, programmatic metadata management is essential.
Python Script: Bulk Metadata Extraction
import sqlite3
import json
from pathlib import Path
def extract_all_metadata(gpkg_path):
"""Extract all metadata from a GeoPackage as JSON"""
conn = sqlite3.connect(gpkg_path)
conn.row_factory = sqlite3.Row
cursor = conn.cursor()
# Get all metadata with references
cursor.execute("""
SELECT
m.id,
m.md_scope,
m.md_standard_uri,
m.mime_type,
m.metadata,
mr.reference_scope,
mr.table_name,
mr.column_name,
mr.row_id_value,
mr.timestamp,
mr.md_parent_id
FROM gpkg_metadata m
LEFT JOIN gpkg_metadata_reference mr ON m.id = mr.md_file_id
""")
metadata_list = []
for row in cursor.fetchall():
metadata_list.append({
'id': row['id'],
'scope': row['md_scope'],
'standard': row['md_standard_uri'],
'mime_type': row['mime_type'],
'content': row['metadata'],
'applies_to': {
'scope': row['reference_scope'],
'table': row['table_name'],
'column': row['column_name'],
'row_id': row['row_id_value'],
'parent_id': row['md_parent_id']
},
'timestamp': row['timestamp']
})
conn.close()
return metadata_list
def create_metadata_report(gpkg_directory):
"""Create a summary report of metadata across multiple GeoPackages"""
report = []
for gpkg_file in Path(gpkg_directory).glob('*.gpkg'):
try:
metadata = extract_all_metadata(gpkg_file)
report.append({
'file': gpkg_file.name,
'metadata_count': len(metadata),
'has_dataset_metadata': any(m['scope'] == 'dataset' for m in metadata),
'tables_documented': len(set(
m['applies_to']['table']
for m in metadata
if m['applies_to']['table']
))
})
except Exception as e:
report.append({
'file': gpkg_file.name,
'error': str(e)
})
return report
# Usage
report = create_metadata_report('/path/to/municipality/geopackages')
print(json.dumps(report, indent=2))
Python Script: Metadata Template Application
def apply_metadata_template(gpkg_path, template_type='infrastructure'):
"""Apply standard municipal metadata template"""
templates = {
'infrastructure': {
'md_scope': 'dataset',
'md_standard_uri': 'http://purl.org/dc/elements/1.1/',
'creator': 'Brøndby Kommune, Infrastructure Department',
'rights': 'Internal use only',
'language': 'da'
},
'planning': {
'md_scope': 'dataset',
'md_standard_uri': 'http://www.isotc211.org/2005/gmd',
'creator': 'Brøndby Kommune, Planning Department',
'rights': 'Public data - INSPIRE compliant',
'language': 'da'
}
}
# Implementation would apply the template...
pass
# Usage
apply_metadata_template('road_network.gpkg', 'infrastructure')
Knowledge Check Quiz
Question 1: Understanding the Two-Table Structure
Why does GeoPackage use two separate tables (gpkg_metadata and gpkg_metadata_reference) instead of storing metadata directly with each feature table?
A) To save disk space by avoiding duplication
B) To allow one metadata record to describe multiple tables and enable hierarchical metadata
C) Because SQLite requires separate tables for XML content
D) To comply with INSPIRE directives
<details> <summary>Click for answer</summary>
Answer: B) To allow one metadata record to describe multiple tables and enable hierarchical metadata
Explanation: The separation enables metadata reuse (one record can describe multiple tables), hierarchical relationships (parent-child metadata), and flexible granularity (describing entire GeoPackage, specific tables, columns, or even individual rows). This design is more flexible than embedding metadata directly in feature tables.
</details>
Question 2: Metadata Scope Values
You're documenting a GeoPackage that contains a table of building polygons. You want to create metadata explaining what the "byg_anvend" (building use) column values mean. Which md_scope value should you use?
A) dataset
B) featureType
C) attribute
D) column
<details> <summary>Click for answer</summary>
Answer: C) attribute
Explanation: The attribute scope is used for documenting specific attributes or columns. While you might think "column" would be the scope value, in GeoPackage metadata terminology, attribute is the correct scope for column-level documentation. The reference_scope in gpkg_metadata_reference would be set to 'column'.
</details>
Question 3: Metadata Standards
Your municipality needs to publish cadastral data that complies with INSPIRE requirements. Which metadata standard should you use?
A) Dublin Core (simple and easy)
B) ISO 19115/19139 (comprehensive geographic metadata)
C) Custom municipal schema
D) Plain text documentation
<details> <summary>Click for answer</summary>
Answer: B) ISO 19115/19139
Explanation: INSPIRE requires ISO 19115-compliant metadata. While Dublin Core is simpler, it doesn't meet INSPIRE's comprehensive metadata requirements. ISO 19139 is the XML schema implementation of ISO 19115, which is what INSPIRE geoportals expect.
</details>
Question 4: MIME Types
You're creating ISO 19115 metadata for a dataset. Which MIME type should you specify in the mime_type column?
A) application/xml
B) text/xml
C) text/html
D) application/json
<details> <summary>Click for answer</summary>
Answer: B) text/xml
Explanation: ISO 19115/19139 metadata is stored as XML, and the correct MIME type for XML in GeoPackage is text/xml. While application/xml is technically valid for XML in general, GeoPackage specification recommends text/xml for metadata content.
</details>
Question 5: Creating Metadata References
You've just inserted a metadata record into gpkg_metadata and it was assigned id = 15. You want this metadata to describe the entire GeoPackage, not just one table. How should you insert the reference?
A) INSERT INTO gpkg_metadata_reference (reference_scope, table_name, md_file_id) VALUES ('geopackage', NULL, 15)
B) INSERT INTO gpkg_metadata_reference (reference_scope, table_name, md_file_id) VALUES ('dataset', '*', 15)
C) INSERT INTO gpkg_metadata_reference (reference_scope, table_name, md_file_id) VALUES ('file', 'gpkg', 15)
D) INSERT INTO gpkg_metadata_reference (reference_scope, md_file_id) VALUES ('all', 15)
<details> <summary>Click for answer</summary>
Answer: A) reference_scope='geopackage', table_name=NULL, md_file_id=15
Explanation: For GeoPackage-level metadata, use reference_scope='geopackage' and set table_name to NULL (since it applies to the entire file, not a specific table). You must also include a timestamp value (not shown in options for simplicity). Option B incorrectly uses 'dataset' as scope and '*' for table name, which is not standard.
</details>
Question 6: Hierarchical Metadata
Your GeoPackage contains three road tables: primary_roads, secondary_roads, and local_roads. All share common metadata about coordinate system and data source, but each has specific information. What's the best approach?
A) Create separate, complete metadata for each table
B) Create one metadata record that references all three tables
C) Create parent metadata with common info, then child metadata for each table using md_parent_id
D) Store metadata only for the primary_roads table
<details> <summary>Click for answer</summary>
Answer: C) Create parent metadata with common info, then child metadata using md_parent_id
Explanation: Hierarchical metadata using md_parent_id avoids duplication while maintaining specific information for each table. The parent contains common information (CRS, source, creator), and children contain table-specific details. This makes maintenance easier—update common info once in the parent, and it applies to all children.
</details>
Question 7: Finding Undocumented Tables
You want to identify which tables in your GeoPackage lack metadata. Which SQL query correctly finds undocumented tables?
A) SELECT * FROM gpkg_contents WHERE metadata IS NULL
B) SELECT table_name FROM gpkg_contents WHERE table_name NOT IN (SELECT table_name FROM gpkg_metadata)
C) SELECT table_name FROM gpkg_contents WHERE table_name NOT IN (SELECT DISTINCT table_name FROM gpkg_metadata_reference WHERE table_name IS NOT NULL)
D) SELECT * FROM gpkg_metadata_reference WHERE table_name IS NULL
<details> <summary>Click for answer</summary>
Answer: C) Check gpkg_contents against gpkg_metadata_reference
Explanation: Metadata is linked to tables through gpkg_metadata_reference, not stored directly in gpkg_contents or gpkg_metadata. You need to find tables listed in gpkg_contents (which lists all feature and tile tables) that don't appear in gpkg_metadata_reference. The WHERE table_name IS NOT NULL is important because some metadata references apply to the entire GeoPackage (table_name=NULL).
</details>
Question 8: Metadata Best Practices
A colleague creates very detailed ISO 19115 metadata for every table, resulting in a GeoPackage that's 50% metadata by file size and loads slowly. What's the best advice?
A) This is correct—more metadata is always better
B) Switch to Dublin Core which creates smaller files
C) Keep essential metadata in GeoPackage, link to external detailed documentation
D) Remove all metadata to improve performance
<details> <summary>Click for answer</summary>
Answer: C) Keep essential metadata in GeoPackage, link to external detailed documentation
Explanation: While comprehensive metadata is valuable, excessively large metadata can impact performance. Best practice is to store essential identification, contact, and quality information in the GeoPackage, and link to external resources for extensive documentation. This balances self-documentation with performance. Option B (Dublin Core) would reduce size but might not meet compliance requirements.
</details>
Question 9: Updating Metadata
Your playground equipment data has been updated with new inspections. The data changed, but the metadata (coordinate system, collection method, contact info) remains valid. What should you do?
A) Delete and recreate all metadata
B) Update the timestamp and add a note about the data update in the description
C) Nothing—metadata doesn't need updating if structure hasn't changed
D) Create new metadata with a different md_scope value
<details> <summary>Click for answer</summary>
Answer: B) Update the timestamp and add a note about the data update
Explanation: Even when structural metadata remains valid, you should document when the data was updated. Update the metadata's date field and add version notes to the description. This maintains an audit trail and helps users understand data currency. Simply doing nothing (option C) means users won't know when the data was last updated.
</details>
Question 10: QGIS Metadata Workflow
You've added comprehensive metadata to a GeoPackage table using QGIS's metadata editor (Layer Properties → Metadata tab). Where is this metadata actually stored?
A) In a separate .xml file next to the GeoPackage
B) In the gpkg_metadata and gpkg_metadata_reference tables
C) In a hidden metadata column in the feature table
D) In the QGIS project file (.qgz), not the GeoPackage
<details> <summary>Click for answer</summary>
Answer: B) In the gpkg_metadata and gpkg_metadata_reference tables
Explanation: QGIS writes metadata directly into the GeoPackage's gpkg_metadata and gpkg_metadata_reference tables, following the GeoPackage specification. This ensures the metadata travels with the data file. The metadata is embedded in the GeoPackage itself, not in external files or QGIS project files, making the GeoPackage self-documenting.
</details>
Question 11: Reference Scope for Specific Features
You have a table of protected buildings, and one particular building (row_id=42) has extensive historical documentation that should be stored as metadata. What should you use for reference_scope?
A) feature
B) row
C) record
D) individual
<details> <summary>Click for answer</summary>
Answer: B) row
Explanation: To attach metadata to a specific feature (row) in a table, use reference_scope='row' and specify the row_id_value=42. The GeoPackage specification uses 'row' rather than 'feature' or 'record' for this purpose. You would also specify the table_name and optionally column_name if the metadata applies to a specific cell.
</details>
Question 12: Metadata for Coded Values
Your GeoPackage contains a "road_surface" column with codes: 1=Asphalt, 2=Concrete, 3=Gravel, 4=Unpaved. Where should you document what these codes mean?
A) In a separate lookup table within the GeoPackage
B) In attribute-level metadata stored in gpkg_metadata
C) In comments within the SQL table definition
D) Both A and B are recommended
<details> <summary>Click for answer</summary>
Answer: D) Both a lookup table and metadata are recommended
Explanation: Best practice is to create both: (1) a lookup/domain table within the GeoPackage that can be joined programmatically, and (2) human-readable metadata documenting the codes in gpkg_metadata with md_scope='attribute'. This supports both automated validation/queries and human understanding. The lookup table enables foreign key constraints and dropdown lists in editing tools, while metadata provides documentation.
</details>
Quiz Results Interpretation
11-12 correct: Excellent! You have a thorough understanding of GeoPackage metadata structures and best practices. You're ready to implement comprehensive metadata management in municipal workflows.
9-10 correct: Very good! You understand the core concepts. Review the sections on hierarchical metadata and best practices to strengthen your expertise.
7-8 correct: Good foundation. Revisit the examples of creating metadata references and the different scope values. Practice with the SQL examples.
5-6 correct: Fair understanding. Work through the practical examples again, paying attention to the relationship between gpkg_metadata and gpkg_metadata_reference tables.
0-4 correct: Additional study needed. Review the table structures section carefully and practice creating simple metadata records using the provided SQL examples.
Practical Exercise
To reinforce your learning, complete this hands-on exercise:
Exercise: Create Comprehensive Metadata for a Municipal Dataset
- Add three levels of metadata:
- GeoPackage-level: Overall project description
- Table-level: Describe the playground_equipment table
- Column-level: Document the equipment_type codes
- View in QGIS: Open the GeoPackage and verify the metadata appears in Layer Properties → Metadata tab.
Verify your metadata:
-- Check your work
SELECT * FROM gpkg_metadata;
SELECT * FROM gpkg_metadata_reference;
Create a test GeoPackage with a playground equipment table:
-- Create sample table
CREATE TABLE playground_equipment (
fid INTEGER PRIMARY KEY,
geom BLOB NOT NULL,
equipment_type TEXT,
install_date TEXT,
condition TEXT
);
Success criteria:
- All metadata records have proper
md_scopevalues - All references use correct
reference_scopevalues - Metadata is visible in QGIS
- No orphaned metadata (all records have references)
Summary
The gpkg_metadata and gpkg_metadata_reference tables provide a powerful, standardized mechanism for keeping data and documentation together. For Danish municipalities, proper use of these tables:
✅ Ensures INSPIRE compliance
✅ Facilitates data sharing and handover
✅ Maintains institutional knowledge
✅ Improves data discoverability
✅ Supports long-term data preservation
✅ Reduces confusion and data misuse
Remember the key principles:
- Two tables work together:
gpkg_metadatastores content,gpkg_metadata_referencecreates links - Multiple levels: Metadata can describe the GeoPackage, tables, columns, or individual features
- Standards matter: Use ISO 19115 for formal compliance, Dublin Core for simplicity
- Hierarchy reduces duplication: Use
md_parent_idfor shared metadata - Keep it current: Update metadata when data changes
By incorporating metadata into your GeoPackage workflow, you create self-documenting datasets that remain valuable and understandable for years to come.