SQLite as the foundation — databases, tables, and data integrity
What is SQLite?
SQLite is a C-language library that implements a small, fast, self-contained SQL database engine. Unlike traditional database systems like PostgreSQL or Oracle, SQLite is not a client-server system. Instead, it reads and writes directly to ordinary disk files.
Key Characteristics
Self-contained - The entire database system is contained in a single library with no external dependencies. A complete SQLite database is stored in a single cross-platform file.
Serverless - No separate database server process is required. Applications read and write directly to the database file on disk.
Zero-configuration - No installation or setup procedures are needed. SQLite databases work out of the box.
Transactional - All changes within a single transaction are completely atomic, consistent, isolated, and durable (ACID-compliant), even during system crashes or power failures.
Cross-platform - Database files can be freely copied between 32-bit and 64-bit systems or between big-endian and little-endian architectures.
Example: A GeoPackage file created on a Windows desktop can be copied to a Linux server or Mac laptop and opened without any conversion or compatibility issues.
SQLite in the Context of GeoPackage
GeoPackage leverages SQLite's features while adding geospatial capabilities:
- Standard SQL support - GeoPackage uses standard SQL for querying and manipulating data
- Extension mechanism - SQLite's extension architecture allows GeoPackage to add spatial functions
- File-based storage - Single-file databases make GeoPackages highly portable
- Broad tool support - Many GIS applications and programming libraries support SQLite/GeoPackage
SQLite Database Structure
Understanding SQLite's internal structure helps you work more effectively with GeoPackage.
Database Files
A SQLite database consists of one primary file with optional auxiliary files:
Main database file (.gpkg) - Contains all database content including tables, indexes, and data.
Write-Ahead Log (.gpkg-wal) - Used during transactions in WAL mode for improved concurrency.
Shared memory file (.gpkg-shm) - Index for the WAL file, used in WAL mode.
Example: When you create aarhus_municipality.gpkg, you might see these files during active editing:
aarhus_municipality.gpkg(main file)aarhus_municipality.gpkg-wal(temporary, during editing)aarhus_municipality.gpkg-shm(temporary, during editing)
The WAL and SHM files are automatically removed when all database connections are closed.
Pages and Storage
SQLite organizes data into fixed-size pages, typically 4096 bytes (4KB). The database file is a collection of these pages.
-- Check the page size of a GeoPackage
PRAGMA page_size;
-- Returns: 4096 (typical default)
-- View database statistics
PRAGMA page_count; -- Total number of pages
PRAGMA freelist_count; -- Number of unused pages
Understanding page size implications:
- Larger page sizes (8KB, 16KB) can improve performance for large geometries
- Smaller page sizes (1KB, 2KB) reduce overhead for small features
- Default 4KB is a good balance for most GeoPackage applications
Schema Objects
A SQLite database contains several types of schema objects:
Tables - Store data in rows and columns.
Indexes - Improve query performance by creating sorted lookup structures.
Views - Virtual tables defined by queries.
Triggers - Automated actions that execute in response to data changes.
Example GeoPackage schema objects:
-- System tables (required by GeoPackage)
gpkg_contents
gpkg_geometry_columns
gpkg_spatial_ref_sys
gpkg_extensions
-- User feature tables
buildings
roads
parcels
zoning_districts
-- Spatial indexes (R-tree)
rtree_buildings_geom
rtree_roads_geom
-- Views (optional, user-defined)
current_zoning_view
infrastructure_summary_view
Creating and Managing Tables
SQLite tables are the fundamental structures for storing data in GeoPackage.
Basic Table Creation
-- Simple feature table
CREATE TABLE street_lights (
fid INTEGER PRIMARY KEY AUTOINCREMENT,
geom POINT NOT NULL,
light_id TEXT UNIQUE NOT NULL,
street_name TEXT,
installation_date DATE,
lamp_type TEXT,
wattage INTEGER,
operational INTEGER DEFAULT 1
);
Key elements:
CREATE TABLEdefines a new table- Column definitions specify name and data type
- Constraints (PRIMARY KEY, NOT NULL, UNIQUE, DEFAULT) enforce data rules
- AUTOINCREMENT ensures IDs are never reused
Primary Keys
Every table should have a primary key - a column (or combination of columns) that uniquely identifies each row.
-- Single-column integer primary key (recommended for GeoPackage)
CREATE TABLE buildings (
fid INTEGER PRIMARY KEY AUTOINCREMENT,
building_id TEXT UNIQUE NOT NULL,
-- other columns...
);
-- Composite primary key (multiple columns)
CREATE TABLE inspection_records (
asset_id TEXT NOT NULL,
inspection_date DATE NOT NULL,
inspector_id TEXT,
result TEXT,
PRIMARY KEY (asset_id, inspection_date)
);
-- Named primary key constraint
CREATE TABLE parcels (
fid INTEGER CONSTRAINT pk_parcels PRIMARY KEY AUTOINCREMENT,
parcel_number TEXT
);
Best practices for GeoPackage:
- Use INTEGER PRIMARY KEY AUTOINCREMENT for feature tables
- Column name
fid(feature ID) is conventional but not required - Never manually assign values to AUTOINCREMENT columns
- Composite keys can be useful for junction tables but are less common in GeoPackage
Column Constraints
Constraints enforce data integrity rules at the database level.
NOT NULL - Requires a value:
CREATE TABLE assets (
fid INTEGER PRIMARY KEY AUTOINCREMENT,
geom POINT NOT NULL, -- Geometry must always be present
asset_id TEXT NOT NULL, -- ID is required
asset_name TEXT, -- Name is optional (NULL allowed)
installation_date DATE NOT NULL
);
UNIQUE - Ensures values are distinct:
CREATE TABLE sensors (
fid INTEGER PRIMARY KEY AUTOINCREMENT,
sensor_id TEXT UNIQUE NOT NULL, -- Each sensor has unique ID
serial_number TEXT UNIQUE, -- Serial numbers must be unique if provided
location_code TEXT -- Multiple sensors can share location codes
);
CHECK - Validates values against conditions:
CREATE TABLE roads (
fid INTEGER PRIMARY KEY AUTOINCREMENT,
geom LINESTRING NOT NULL,
-- Simple value check
lane_count INTEGER CHECK(lane_count > 0),
-- Range check
speed_limit INTEGER CHECK(speed_limit BETWEEN 30 AND 130),
-- List of valid values
surface_type TEXT CHECK(surface_type IN ('Asphalt', 'Concrete', 'Gravel', 'Dirt')),
-- Complex condition
width_m REAL CHECK(width_m > 0 AND width_m < 50),
-- Named constraint
CONSTRAINT valid_priority CHECK(priority BETWEEN 1 AND 10)
);
DEFAULT - Provides automatic values:
CREATE TABLE maintenance_logs (
fid INTEGER PRIMARY KEY AUTOINCREMENT,
asset_fid INTEGER NOT NULL,
-- Current timestamp
log_date DATETIME DEFAULT CURRENT_TIMESTAMP,
-- Current date
scheduled_date DATE DEFAULT CURRENT_DATE,
-- Literal values
status TEXT DEFAULT 'Pending',
active INTEGER DEFAULT 1,
priority INTEGER DEFAULT 5,
-- Expression
fiscal_year INTEGER DEFAULT (CAST(strftime('%Y', 'now') AS INTEGER))
);
FOREIGN KEY - Establishes relationships between tables:
-- Enable foreign key support (must be set for each connection)
PRAGMA foreign_keys = ON;
CREATE TABLE buildings (
fid INTEGER PRIMARY KEY AUTOINCREMENT,
geom POLYGON NOT NULL,
building_id TEXT UNIQUE NOT NULL
);
CREATE TABLE building_permits (
permit_id INTEGER PRIMARY KEY AUTOINCREMENT,
building_fid INTEGER NOT NULL,
permit_number TEXT UNIQUE NOT NULL,
issue_date DATE,
expiry_date DATE,
-- Foreign key relationship
FOREIGN KEY (building_fid) REFERENCES buildings(fid)
);
-- Foreign key with actions
CREATE TABLE inspections (
inspection_id INTEGER PRIMARY KEY AUTOINCREMENT,
building_fid INTEGER NOT NULL,
inspection_date DATE,
-- ON DELETE CASCADE: delete inspections when building is deleted
FOREIGN KEY (building_fid) REFERENCES buildings(fid) ON DELETE CASCADE
);
-- Multiple foreign keys
CREATE TABLE building_ownership (
ownership_id INTEGER PRIMARY KEY AUTOINCREMENT,
building_fid INTEGER NOT NULL,
owner_fid INTEGER NOT NULL,
ownership_percentage REAL,
FOREIGN KEY (building_fid) REFERENCES buildings(fid),
FOREIGN KEY (owner_fid) REFERENCES owners(fid)
);
Important: Foreign key constraints are NOT enforced by default in SQLite. You must enable them:
-- Enable for current connection
PRAGMA foreign_keys = ON;
-- Verify it's enabled
PRAGMA foreign_keys;
-- Returns: 1 (enabled) or 0 (disabled)
Modifying Tables
SQLite has limited ALTER TABLE capabilities compared to other databases.
Supported modifications:
-- Rename table
ALTER TABLE old_name RENAME TO new_name;
-- Rename column (SQLite 3.25.0+)
ALTER TABLE buildings RENAME COLUMN old_column TO new_column;
-- Add column
ALTER TABLE buildings ADD COLUMN energy_label TEXT;
-- Add column with default
ALTER TABLE buildings ADD COLUMN created_date DATE DEFAULT CURRENT_DATE;
NOT supported directly:
- Dropping columns (workaround: create new table, copy data, drop old table)
- Modifying column types
- Adding constraints to existing columns
- Reordering columns
Workaround for unsupported changes:
-- Example: Remove a column (not directly supported)
-- 1. Create new table with desired structure
CREATE TABLE buildings_new (
fid INTEGER PRIMARY KEY AUTOINCREMENT,
geom POLYGON NOT NULL,
building_id TEXT UNIQUE NOT NULL,
address TEXT
-- Removed: obsolete_column
);
-- 2. Copy data
INSERT INTO buildings_new (fid, geom, building_id, address)
SELECT fid, geom, building_id, address FROM buildings;
-- 3. Drop old table
DROP TABLE buildings;
-- 4. Rename new table
ALTER TABLE buildings_new RENAME TO buildings;
-- 5. Recreate indexes and triggers as needed
Indexes for Performance
Indexes dramatically improve query performance but add overhead for insertions and updates.
B-Tree Indexes
Standard SQLite indexes use B-tree structures for fast lookups.
-- Single-column index
CREATE INDEX idx_buildings_type ON buildings(building_type);
-- Query can use this index
SELECT * FROM buildings WHERE building_type = 'Residential';
-- Multi-column index
CREATE INDEX idx_buildings_type_year ON buildings(building_type, construction_year);
-- Can use index for these queries:
SELECT * FROM buildings WHERE building_type = 'Residential';
SELECT * FROM buildings WHERE building_type = 'Residential' AND construction_year > 2000;
-- Cannot fully use index (only uses building_type):
SELECT * FROM buildings WHERE construction_year > 2000;
-- Unique index (enforces uniqueness)
CREATE UNIQUE INDEX idx_buildings_id ON buildings(building_id);
-- Expression index
CREATE INDEX idx_buildings_upper_name ON buildings(UPPER(building_name));
-- Partial index (only indexes subset of rows)
CREATE INDEX idx_active_buildings ON buildings(building_type)
WHERE active = 1;
When to create indexes:
- Columns frequently used in WHERE clauses
- Columns used in JOIN conditions
- Columns used in ORDER BY clauses
- Foreign key columns
When NOT to create indexes:
- Small tables (< 1000 rows)
- Columns rarely used in queries
- Columns with very low selectivity (e.g., boolean columns with skewed distribution)
- Tables with very frequent inserts/updates
Spatial Indexes (R-tree)
GeoPackage uses SQLite's R-tree module for spatial indexing.
-- R-tree spatial index (created automatically by most GIS tools)
CREATE VIRTUAL TABLE rtree_buildings_geom USING rtree(
id, -- Feature ID
minx, maxx, -- X extent
miny, maxy -- Y extent
);
-- Populate spatial index (usually done by triggers)
INSERT INTO rtree_buildings_geom
SELECT fid,
ST_MinX(geom), ST_MaxX(geom),
ST_MinY(geom), ST_MaxY(geom)
FROM buildings;
-- Query using spatial index (GIS tools do this automatically)
SELECT b.*
FROM buildings b
WHERE b.fid IN (
SELECT id FROM rtree_buildings_geom
WHERE minx <= 571000 AND maxx >= 570000
AND miny <= 6221000 AND maxy >= 6220000
);
How spatial indexes work:
- Store minimum bounding rectangles (MBR) for each feature
- Organize MBRs in hierarchical tree structure
- Quickly eliminate features that don't intersect query area
- Return candidate features for detailed geometry testing
Performance impact:
- Spatial queries can be 100-1000x faster with indexes
- Index size typically 5-10% of geometry size
- Minimal overhead for insertions/updates
Index Maintenance
-- List all indexes
SELECT name, tbl_name, sql FROM sqlite_master WHERE type = 'index';
-- Drop an index
DROP INDEX IF EXISTS idx_buildings_type;
-- Rebuild indexes (analyze for query optimization)
ANALYZE;
-- Reindex everything
REINDEX;
-- Reindex specific table
REINDEX buildings;
Transactions and Data Integrity
Transactions ensure data consistency and enable safe concurrent access.
ACID Properties
SQLite transactions are ACID-compliant:
Atomicity - All changes in a transaction succeed or fail together.
Consistency - Transactions move the database from one valid state to another.
Isolation - Concurrent transactions don't interfere with each other.
Durability - Committed changes persist even after system crashes.
Basic Transaction Commands
-- Begin transaction
BEGIN TRANSACTION;
-- Make changes
INSERT INTO buildings (geom, building_id) VALUES (...);
UPDATE buildings SET owner = 'New Owner' WHERE fid = 123;
DELETE FROM buildings WHERE demolition_date < '2020-01-01';
-- Commit (save changes)
COMMIT;
-- Or rollback (discard changes)
ROLLBACK;
Implicit transactions:
- Each SQL statement outside an explicit transaction runs in its own transaction
- This is slower for bulk operations
Explicit transactions for performance:
-- Slow: Each insert is a separate transaction
INSERT INTO street_trees (...) VALUES (...); -- Transaction 1
INSERT INTO street_trees (...) VALUES (...); -- Transaction 2
-- ... 10,000 more inserts
-- Fast: Single transaction for all inserts
BEGIN TRANSACTION;
INSERT INTO street_trees (...) VALUES (...);
INSERT INTO street_trees (...) VALUES (...);
-- ... 10,000 more inserts
COMMIT;
-- Can be 100x faster!
Transaction Modes
SQLite supports different transaction modes:
DEFERRED (default) - Lock acquired on first read or write:
BEGIN; -- or BEGIN DEFERRED;
IMMEDIATE - Write lock acquired immediately:
BEGIN IMMEDIATE;
-- Useful when you know you'll be writing
EXCLUSIVE - Exclusive lock acquired immediately:
BEGIN EXCLUSIVE;
-- Blocks all other connections
Savepoints
Savepoints allow partial rollbacks within transactions:
BEGIN TRANSACTION;
INSERT INTO buildings (...) VALUES (...); -- Change 1
SAVEPOINT sp1;
UPDATE buildings SET owner = 'Test' WHERE fid = 100; -- Change 2
DELETE FROM buildings WHERE fid = 200; -- Change 3
-- Oops, changes 2 and 3 were wrong
ROLLBACK TO SAVEPOINT sp1;
-- Changes 2 and 3 are undone, but change 1 remains
INSERT INTO buildings (...) VALUES (...); -- Change 4
COMMIT;
-- Changes 1 and 4 are saved
Views for Data Access
Views are virtual tables defined by queries, useful for simplifying complex queries and providing controlled access to data.
Creating Views
-- Simple view: current zoning districts
CREATE VIEW current_zoning AS
SELECT
fid,
geom,
zone_code,
zone_name,
max_height_m
FROM zoning_districts
WHERE effective_date <= CURRENT_DATE
AND (expiry_date IS NULL OR expiry_date > CURRENT_DATE);
-- Use view like a table
SELECT * FROM current_zoning WHERE zone_code = 'RES_LOW';
-- Complex view: building summary with ownership
CREATE VIEW building_summary AS
SELECT
b.fid,
b.geom,
b.building_id,
b.address,
b.construction_year,
b.area_m2,
o.owner_name,
o.owner_type,
COUNT(u.unit_id) AS unit_count
FROM buildings b
LEFT JOIN building_owners o ON b.fid = o.building_fid
LEFT JOIN units u ON b.fid = u.building_fid
GROUP BY b.fid;
-- View with calculations
CREATE VIEW infrastructure_age AS
SELECT
asset_id,
asset_type,
installation_year,
CAST(strftime('%Y', 'now') AS INTEGER) - installation_year AS age_years,
CASE
WHEN CAST(strftime('%Y', 'now') AS INTEGER) - installation_year < 10 THEN 'New'
WHEN CAST(strftime('%Y', 'now') AS INTEGER) - installation_year < 30 THEN 'Moderate'
ELSE 'Old'
END AS age_category
FROM infrastructure;
View Limitations
Views are read-only in standard SQLite (cannot INSERT, UPDATE, DELETE directly):
-- This will fail
INSERT INTO current_zoning (...) VALUES (...);
-- Error: cannot modify view
Workaround: Use INSTEAD OF triggers:
CREATE VIEW editable_view AS SELECT * FROM buildings WHERE active = 1;
CREATE TRIGGER update_editable_view
INSTEAD OF UPDATE ON editable_view
BEGIN
UPDATE buildings SET
building_name = NEW.building_name,
updated_date = CURRENT_TIMESTAMP
WHERE fid = OLD.fid;
END;
Benefits of Views
Simplify complex queries:
-- Instead of writing this repeatedly:
SELECT b.*, z.zone_name, p.parcel_number
FROM buildings b
JOIN parcels p ON ST_Within(ST_Centroid(b.geom), p.geom)
JOIN zoning z ON p.zone_code = z.zone_code;
-- Create a view:
CREATE VIEW buildings_with_zoning AS
SELECT b.*, z.zone_name, p.parcel_number
FROM buildings b
JOIN parcels p ON ST_Within(ST_Centroid(b.geom), p.geom)
JOIN zoning z ON p.zone_code = z.zone_code;
-- Then simply:
SELECT * FROM buildings_with_zoning WHERE zone_name = 'Residential';
Provide controlled access:
-- View exposes only public information
CREATE VIEW public_buildings AS
SELECT
fid,
geom,
building_id,
address,
construction_year,
building_type
-- Excludes: owner_name, tax_value, etc.
FROM buildings;
Triggers for Automation
Triggers automatically execute SQL statements in response to data changes.
Trigger Types
BEFORE triggers - Execute before the data change:
-- Validate data before insert
CREATE TRIGGER validate_building_before_insert
BEFORE INSERT ON buildings
FOR EACH ROW
WHEN NEW.construction_year > CAST(strftime('%Y', 'now') AS INTEGER)
BEGIN
SELECT RAISE(ABORT, 'Construction year cannot be in the future');
END;
AFTER triggers - Execute after the data change:
-- Update modification timestamp
CREATE TRIGGER update_building_timestamp
AFTER UPDATE ON buildings
FOR EACH ROW
BEGIN
UPDATE buildings SET updated_date = CURRENT_TIMESTAMP
WHERE fid = NEW.fid;
END;
INSTEAD OF triggers - Replace the operation (for views):
CREATE TRIGGER insert_into_view
INSTEAD OF INSERT ON building_summary
BEGIN
INSERT INTO buildings (geom, building_id, address)
VALUES (NEW.geom, NEW.building_id, NEW.address);
END;
Common Trigger Uses in GeoPackage
Maintain spatial index:
-- Update spatial index on insert
CREATE TRIGGER buildings_geom_insert
AFTER INSERT ON buildings
FOR EACH ROW
BEGIN
INSERT INTO rtree_buildings_geom VALUES (
NEW.fid,
ST_MinX(NEW.geom), ST_MaxX(NEW.geom),
ST_MinY(NEW.geom), ST_MaxY(NEW.geom)
);
END;
-- Update spatial index on update
CREATE TRIGGER buildings_geom_update
AFTER UPDATE OF geom ON buildings
FOR EACH ROW
BEGIN
DELETE FROM rtree_buildings_geom WHERE id = OLD.fid;
INSERT INTO rtree_buildings_geom VALUES (
NEW.fid,
ST_MinX(NEW.geom), ST_MaxX(NEW.geom),
ST_MinY(NEW.geom), ST_MaxY(NEW.geom)
);
END;
-- Update spatial index on delete
CREATE TRIGGER buildings_geom_delete
AFTER DELETE ON buildings
FOR EACH ROW
BEGIN
DELETE FROM rtree_buildings_geom WHERE id = OLD.fid;
END;
Audit logging:
CREATE TABLE audit_log (
log_id INTEGER PRIMARY KEY AUTOINCREMENT,
table_name TEXT,
operation TEXT,
record_id INTEGER,
changed_by TEXT,
changed_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
CREATE TRIGGER audit_building_changes
AFTER UPDATE ON buildings
FOR EACH ROW
BEGIN
INSERT INTO audit_log (table_name, operation, record_id)
VALUES ('buildings', 'UPDATE', NEW.fid);
END;
Cascade updates:
-- When building is demolished, update related records
CREATE TRIGGER building_demolished
AFTER UPDATE OF demolition_date ON buildings
FOR EACH ROW
WHEN NEW.demolition_date IS NOT NULL AND OLD.demolition_date IS NULL
BEGIN
UPDATE building_permits
SET status = 'Void - Building Demolished'
WHERE building_fid = NEW.fid AND status = 'Active';
DELETE FROM building_inspections
WHERE building_fid = NEW.fid AND scheduled_date > NEW.demolition_date;
END;
Data validation:
-- Ensure expiry date is after effective date
CREATE TRIGGER validate_permit_dates
BEFORE INSERT ON permits
FOR EACH ROW
WHEN NEW.expiry_date IS NOT NULL AND NEW.expiry_date <= NEW.effective_date
BEGIN
SELECT RAISE(ABORT, 'Expiry date must be after effective date');
END;
Managing Triggers
-- List all triggers
SELECT name, tbl_name, sql FROM sqlite_master WHERE type = 'trigger';
-- Drop a trigger
DROP TRIGGER IF EXISTS update_building_timestamp;
-- Temporarily disable all triggers
PRAGMA recursive_triggers = OFF;
-- Re-enable triggers
PRAGMA recursive_triggers = ON;
Data Integrity Best Practices
1. Use Constraints Liberally
CREATE TABLE infrastructure (
fid INTEGER PRIMARY KEY AUTOINCREMENT,
geom POINT NOT NULL, -- Geometry required
asset_id TEXT UNIQUE NOT NULL, -- Unique identifier required
asset_type TEXT NOT NULL CHECK(asset_type IN ('Water', 'Sewer', 'Electric', 'Gas')),
installation_year INTEGER CHECK(installation_year >= 1800 AND installation_year <= CAST(strftime('%Y', 'now') AS INTEGER)),
operational INTEGER DEFAULT 1 CHECK(operational IN (0, 1)),
inspection_interval_days INTEGER CHECK(inspection_interval_days > 0)
);
2. Enable Foreign Keys
-- Enable foreign key enforcement
PRAGMA foreign_keys = ON;
-- Verify enforcement
PRAGMA foreign_key_check; -- Lists any violations
3. Use Transactions for Batch Operations
-- Wrap multiple related changes
BEGIN TRANSACTION;
-- Insert building
INSERT INTO buildings (geom, building_id) VALUES (...);
SET @building_fid = last_insert_rowid();
-- Insert related units
INSERT INTO units (building_fid, unit_number) VALUES (@building_fid, '1A');
INSERT INTO units (building_fid, unit_number) VALUES (@building_fid, '1B');
-- Update summary statistics
UPDATE building_stats SET total_units = total_units + 2;
COMMIT;
4. Validate Data Regularly
-- Check for orphaned records
SELECT * FROM units
WHERE building_fid NOT IN (SELECT fid FROM buildings);
-- Check for invalid geometries
SELECT fid, building_id FROM buildings
WHERE ST_IsValid(geom) = 0;
-- Check constraint violations
PRAGMA integrity_check;
5. Implement Audit Trails
CREATE TABLE data_changes (
change_id INTEGER PRIMARY KEY AUTOINCREMENT,
table_name TEXT NOT NULL,
record_id INTEGER NOT NULL,
operation TEXT NOT NULL, -- INSERT, UPDATE, DELETE
changed_at DATETIME DEFAULT CURRENT_TIMESTAMP,
changed_by TEXT,
old_values TEXT, -- JSON representation
new_values TEXT -- JSON representation
);
-- Trigger to populate audit trail
CREATE TRIGGER audit_building_updates
AFTER UPDATE ON buildings
FOR EACH ROW
BEGIN
INSERT INTO data_changes (table_name, record_id, operation, old_values, new_values)
VALUES (
'buildings',
NEW.fid,
'UPDATE',
json_object('address', OLD.address, 'owner', OLD.owner_name),
json_object('address', NEW.address, 'owner', NEW.owner_name)
);
END;
SQLite Pragmas for GeoPackage
Pragmas are special commands that control SQLite behavior.
Important Pragmas for GeoPackage
-- View application ID (should be 0x47504B47 for GeoPackage)
PRAGMA application_id;
-- Returns: 1196444487
-- View user version (GeoPackage spec version)
PRAGMA user_version;
-- Enable foreign key constraints
PRAGMA foreign_keys = ON;
-- Set journal mode (WAL recommended for concurrent access)
PRAGMA journal_mode = WAL;
-- Set synchronous mode (NORMAL balances safety and performance)
PRAGMA synchronous = NORMAL;
-- View database encoding
PRAGMA encoding;
-- Should return: UTF-8
-- Analyze database for query optimization
PRAGMA optimize;
-- Check database integrity
PRAGMA integrity_check;
-- Returns: ok (if no problems)
-- Quick integrity check (faster)
PRAGMA quick_check;
-- View table information
PRAGMA table_info(buildings);
-- Returns: column definitions
-- View index information
PRAGMA index_list(buildings);
-- View foreign keys for a table
PRAGMA foreign_key_list(building_permits);
Performance Tuning Pragmas
-- Increase cache size (in pages, default ~2MB)
PRAGMA cache_size = 10000; -- ~40MB with 4KB pages
-- Set temp storage to memory
PRAGMA temp_store = MEMORY;
-- Set page size (must be done on empty database)
PRAGMA page_size = 4096;
-- Enable memory-mapped I/O (can improve performance)
PRAGMA mmap_size = 268435456; -- 256MB
Common SQLite/GeoPackage Patterns
Pattern 1: Lookup Tables
-- Code lookup table
CREATE TABLE road_surface_codes (
code TEXT PRIMARY KEY,
description_da TEXT NOT NULL,
description_en TEXT,
typical_lifespan_years INTEGER
);
INSERT INTO road_surface_codes VALUES
('ASP', 'Asfalt', 'Asphalt', 20),
('CON', 'Beton', 'Concrete', 30),
('GRV', 'Grus', 'Gravel', 5);
-- Feature table references codes
CREATE TABLE roads (
fid INTEGER PRIMARY KEY AUTOINCREMENT,
geom LINESTRING NOT NULL,
surface_code TEXT,
FOREIGN KEY (surface_code) REFERENCES road_surface_codes(code)
);
Pattern 2: Many-to-Many Relationships
-- Two entity tables
CREATE TABLE buildings (
fid INTEGER PRIMARY KEY AUTOINCREMENT,
geom POLYGON NOT NULL,
building_id TEXT UNIQUE NOT NULL
);
CREATE TABLE owners (
owner_id INTEGER PRIMARY KEY AUTOINCREMENT,
owner_name TEXT NOT NULL,
contact_email TEXT
);
-- Junction table
CREATE TABLE building_ownership (
building_fid INTEGER NOT NULL,
owner_id INTEGER NOT NULL,
ownership_percentage REAL,
start_date DATE,
end_date DATE,
PRIMARY KEY (building_fid, owner_id, start_date),
FOREIGN KEY (building_fid) REFERENCES buildings(fid),
FOREIGN KEY (owner_id) REFERENCES owners(owner_id)
);
Pattern 3: Temporal Validity
CREATE TABLE land_use_history (
fid INTEGER PRIMARY KEY AUTOINCREMENT,
geom POLYGON NOT NULL,
parcel_id TEXT NOT NULL,
land_use_code TEXT NOT NULL,
valid_from DATE NOT NULL,
valid_to DATE, -- NULL means current
recorded_by TEXT,
UNIQUE(parcel_id, valid_from)
);
-- Query current land use
SELECT * FROM land_use_history
WHERE valid_from <= CURRENT_DATE
AND (valid_to IS NULL OR valid_to > CURRENT_DATE);
-- Query land use at specific date
SELECT * FROM land_use_history
WHERE parcel_id = 'P12345'
AND valid_from <= '2020-06-15'
AND (valid_to IS NULL OR valid_to > '2020-06-15');
Pattern 4: Computed Columns with Triggers
CREATE TABLE parcels (
fid INTEGER PRIMARY KEY AUTOINCREMENT,
geom POLYGON NOT NULL,
parcel_id TEXT UNIQUE NOT NULL,
area_m2 REAL, -- Will be auto-calculated
perimeter_m REAL -- Will be auto-calculated
);
-- Auto-calculate area and perimeter
CREATE TRIGGER calculate_parcel_metrics
AFTER INSERT ON parcels
FOR EACH ROW
BEGIN
UPDATE parcels SET
area_m2 = ST_Area(geom),
perimeter_m = ST_Perimeter(geom)
WHERE fid = NEW.fid;
END;
CREATE TRIGGER recalculate_parcel_metrics
AFTER UPDATE OF geom ON parcels
FOR EACH ROW
BEGIN
UPDATE parcels SET
area_m2 = ST_Area(NEW.geom),
perimeter_m = ST_Perimeter(NEW.geom)
WHERE fid = NEW.fid;
END;
Summary
SQLite provides a robust, efficient, and portable foundation for GeoPackage. Understanding SQLite's architecture, transaction model, and data integrity features enables you to create well-designed, performant GeoPackage databases. Key takeaways include: using appropriate constraints to enforce data quality, leveraging indexes for performance, implementing transactions for data consistency, and utilizing triggers and views for automation and simplified data access. By mastering these SQLite fundamentals, you'll be well-equipped to build and maintain high-quality geospatial databases for Danish municipal applications.
Knowledge Check Quiz
Test your understanding of SQLite as the foundation of GeoPackage:
Question 1: What is the primary difference between SQLite and traditional database systems like PostgreSQL?
a) SQLite is faster than PostgreSQL
b) SQLite is serverless and stores everything in a single file
c) SQLite supports more data types
d) SQLite requires more configuration
Question 2: What are the three file types you might see when working with a GeoPackage in WAL mode?
a) .gpkg, .gpkg-log, .gpkg-backup
b) .gpkg, .gpkg-wal, .gpkg-shm
c) .gpkg, .gpkg-journal, .gpkg-index
d) .gpkg, .gpkg-tmp, .gpkg-lock
Question 3: What does AUTOINCREMENT ensure for primary key values?
a) Values are assigned sequentially without gaps
b) Values are never reused, even after deletion
c) Values start at 1 and increment by 1
d) Values are randomly generated
Question 4: Which constraint would you use to ensure a column never contains NULL values?
a) UNIQUE
b) CHECK
c) NOT NULL
d) DEFAULT
Question 5: Foreign key constraints are enforced by default in SQLite. True or False?
a) True
b) False - you must explicitly enable them with PRAGMA foreign_keys = ON
Question 6: What is the purpose of an R-tree spatial index in GeoPackage?
a) To store geometry data more efficiently
b) To quickly eliminate features that don't intersect a query area
c) To validate geometry correctness
d) To convert between coordinate systems
Question 7: Which transaction mode acquires a write lock immediately when the transaction begins?
a) DEFERRED
b) IMMEDIATE
c) EXCLUSIVE
d) CONCURRENT
Question 8: Why is wrapping multiple INSERT statements in a single transaction faster than executing them individually?
a) SQLite processes multiple statements in parallel
b) Each individual statement runs in its own transaction with overhead
c) Transactions use compression
d) The database file is locked more efficiently
Question 9: What is a view in SQLite?
a) A copy of a table with filtered data
b) A virtual table defined by a query
c) An index on multiple columns
d) A backup of table data
Question 10: What are the three types of triggers in SQLite?
a) INSERT, UPDATE, DELETE
b) BEFORE, AFTER, INSTEAD OF
c) PRIMARY, FOREIGN, CHECK
d) IMMEDIATE, DEFERRED, EXCLUSIVE
Question 11: When should you create an index on a column?
a) Always - indexes always improve performance
b) When the column is frequently used in WHERE, JOIN, or ORDER BY clauses
c) Only for TEXT columns
d) Never - indexes slow down the database
Question 12: What does the CHECK constraint do?
a) Ensures values are unique
b) Validates values against a condition
c) Prevents NULL values
d) Creates a foreign key relationship
Question 13: Which PRAGMA command enables foreign key constraint enforcement?
a) PRAGMA foreign_keys = true;
b) PRAGMA foreign_keys = ON;
c) PRAGMA enforce_foreign_keys;
d) PRAGMA constraints = foreign_keys;
Question 14: What is the purpose of a savepoint in a transaction?
a) To create a backup of the database
b) To allow partial rollback within a transaction
c) To improve transaction performance
d) To enable concurrent transactions
Question 15: Which statement about SQLite ALTER TABLE is correct?
a) You can drop columns directly
b) You can modify column types directly
c) You can rename tables and add columns, but not drop columns directly
d) ALTER TABLE is not supported in SQLite
Answer Key
- b) SQLite is serverless and stores everything in a single file - Unlike client-server databases, SQLite reads/writes directly to a single database file with no separate server process.
- b) .gpkg, .gpkg-wal, .gpkg-shm - In WAL (Write-Ahead Logging) mode, SQLite creates a write-ahead log file (.gpkg-wal) and shared memory file (.gpkg-shm) alongside the main database file.
- b) Values are never reused, even after deletion - AUTOINCREMENT ensures that once an ID is used, it's never assigned again, even if that row is deleted.
- c) NOT NULL - The NOT NULL constraint requires that a column always contains a value and cannot be NULL.
- b) False - you must explicitly enable them with
PRAGMA foreign_keys = ON- Foreign key enforcement is disabled by default in SQLite for backwards compatibility. - b) To quickly eliminate features that don't intersect a query area - R-tree indexes use bounding boxes to rapidly filter out features that definitely don't match a spatial query.
- b) IMMEDIATE - BEGIN IMMEDIATE acquires a write lock immediately, while DEFERRED waits until the first write operation.
- b) Each individual statement runs in its own transaction with overhead - Without an explicit transaction, each INSERT creates its own transaction with significant overhead. A single transaction eliminates this repeated overhead.
- b) A virtual table defined by a query - Views are virtual tables that don't store data themselves but present data from underlying tables according to a query.
- b) BEFORE, AFTER, INSTEAD OF - These are the three trigger types: BEFORE (executes before the operation), AFTER (executes after), and INSTEAD OF (replaces the operation, used for views).
- b) When the column is frequently used in WHERE, JOIN, or ORDER BY clauses - Indexes improve query performance but add overhead for inserts/updates, so create them strategically for frequently queried columns.
- b) Validates values against a condition - CHECK constraints enforce rules like value ranges, allowed values, or complex conditions.
- b)
PRAGMA foreign_keys = ON;- This is the correct syntax to enable foreign key constraint enforcement in SQLite. - b) To allow partial rollback within a transaction - Savepoints let you roll back to a specific point within a transaction without abandoning all changes.
- c) You can rename tables and add columns, but not drop columns directly - SQLite's ALTER TABLE is limited; dropping columns requires creating a new table, copying data, and dropping the old table.