Summary
Luontovahti is a web-based service that monitors Finnish environmental open data sources and alerts subscribers when changes occur in their areas of interest. It aggregates forest cutting notices, protected area designations, endangered species observations, water body health assessments, biodiversity analyses, and cultural heritage data into a single interactive map. Users draw monitoring areas, choose which data sources to track, and receive email notifications enriched with cross-references from dozens of other environmental datasets. The map and all data layers are freely browsable without registration.
The project is a solo effort, currently through its third development milestone with 109 integrated data sources, a fully operational notification pipeline (with email delivery stubbed pending deployment), and a React frontend rendering environmental data on Finnish national coordinate system base maps.
Motivation
Forest cutting notices in Finland are public record, published twice daily by the Finnish Forest Centre (Metsäkeskus). However, no existing service proactively alerts interested parties when new notices appear in specific areas, and no tool cross-references those notices against conservation data to flag potentially sensitive operations. A new clear-cutting notice adjacent to a Natura 2000 site, overlapping a salmonid stream, or within a high-biodiversity Zonation cell would go unnoticed unless someone happened to check the right map service at the right time. Luontovahti closes that gap by continuously monitoring the data and pushing enriched notifications to subscribers.
Architecture
The system runs as three logical components on a single VPS: a FastAPI async web server, an APScheduler-based change monitor, and a PostgreSQL/PostGIS database. All three share a single Python process by default, though the scheduler can be separated for independent scaling.
The backend serves a REST API with 12 endpoints covering health checks, data proxying, spatial analysis, geocoding, raster value lookups, and email subscription management. All external data access is proxied through the backend, keeping API keys off the client and enabling caching, rate limiting, and circuit breaker protection on every external call. The frontend never talks directly to any external service.
The notification pipeline runs on a schedule aligned with the Metsäkeskus data publication times. It fetches all valid cutting notices via WFS, upserts them into a local table with SHA-256 change detection, scores each new notice against Zonation biodiversity and drainage rasters, spatially matches notices to verified subscriptions using PostGIS ST_DWithin, enriches each match with cross-references from up to 36 environmental sources concurrently, composes per-subscriber HTML emails, and logs them (real email delivery is the next milestone). Each phase is a separate database transaction, and enrichment failures are non-fatal: a notification still sends even if some cross-reference sources time out.
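The SHA-256 change detection step can be illustrated with a minimal sketch. The function names, the choice of fields that go into the hash, and the JSON canonicalization are assumptions made for illustration; the source only states that upserts use SHA-256 change detection.

```python
import hashlib
import json


def notice_fingerprint(properties: dict, geometry_wkt: str) -> str:
    """Return a stable SHA-256 hex digest for one notice (hypothetical helper).

    Keys are sorted so that property order in the WFS response does not
    change the fingerprint.
    """
    payload = json.dumps(
        {"properties": properties, "geometry": geometry_wkt},
        sort_keys=True,
        ensure_ascii=False,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def classify_notices(
    stored: dict[str, str], incoming: dict[str, str]
) -> dict[str, list[str]]:
    """Compare {notice_id: fingerprint} maps from the database and a fresh fetch."""
    new = sorted(i for i in incoming if i not in stored)
    changed = sorted(
        i for i in incoming if i in stored and stored[i] != incoming[i]
    )
    unchanged = sorted(
        i for i in incoming if i in stored and stored[i] == incoming[i]
    )
    return {"new": new, "changed": changed, "unchanged": unchanged}
```

Under these assumptions, only the "new" and "changed" notices would need to proceed to scoring and matching, which keeps each scheduled run cheap when most notices are unchanged.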
Data Integration
The 109 registered data sources span four distinct access patterns.
Locally stored data includes cutting notices (fetched and upserted into cached_notices with full deduplication), Zonation and drainage rasters (imported as PostGIS raster tiles from SYKE GeoTIFFs), and BirdLife FINIBA/IBA bird area shapefiles.
Mirrored overlay data covers sources where live queries are impractical, such as the Tukes mining registry and Metsähallitus ArcGIS protected area layers. A FeatureSyncer periodically upserts these into a cached_features table backed by a composite GiST index on (source_id, geometry).
Pass-through proxy sources are queried live when the user pans the map, with per-source bounded caching (500 entries per source, LRU eviction across up to 200 source caches) and per-source circuit breakers that open after three consecutive transient failures and retry after five minutes.
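The per-source circuit breaker behaviour described above (open after three consecutive transient failures, retry after five minutes) might look roughly like the following. This is a sketch under stated assumptions, not the project's code: the class name, method names, and the injectable clock are hypothetical.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class CircuitBreaker:
    """Hypothetical per-source breaker; one instance per external source."""

    failure_threshold: int = 3           # consecutive transient failures
    retry_after_seconds: float = 300.0   # five-minute cool-down
    clock: Callable[[], float] = time.monotonic  # injectable for testing
    _failures: int = field(default=0, init=False)
    _opened_at: Optional[float] = field(default=None, init=False)

    def allow_request(self) -> bool:
        """Return True if a call to the source may be attempted now."""
        if self._opened_at is None:
            return True
        # Once the cool-down has elapsed, calls are allowed again until the
        # next outcome is recorded.
        return self.clock() - self._opened_at >= self.retry_after_seconds

    def record_success(self) -> None:
        """Close the circuit and reset the failure count."""
        self._failures = 0
        self._opened_at = None

    def record_transient_failure(self) -> None:
        """Count a timeout or 5xx; open (or re-open) at the threshold."""
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self.clock()
```

Permanent errors would simply not be passed to `record_transient_failure`, so only retriable failures can open the circuit.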
A fourth pattern, direct query with lightweight change tracking via feature ID hashing, is designed for milestone 5.
The data sources themselves are diverse in protocol and structure. SYKE provides dozens of WFS 2.0 endpoints across multiple GeoServer workspaces. Metsähallitus publishes protected area polygons via an ArcGIS FeatureServer. FinBIF exposes species observations through a REST API with IUCN red list status linkages across nine taxon groups. GTK and Tukes host mining registry data on ArcGIS MapServer instances. Two sources (SYKE Luonnonmuistomerkit nature monuments and GTK surface soil types) are WMS-only, requiring a dedicated GetFeatureInfo proxy. All of these are normalized behind a DataProvider protocol so that the proxy router and enrichment service operate source-agnostically.
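To make the source-agnostic idea concrete, here is a small sketch of what a DataProvider protocol could look like using Python's structural typing. Only the name DataProvider comes from the text; the `fetch_features` method, the `BBox` alias, the `StaticProvider` stand-in, and `query_all` are hypothetical.

```python
import asyncio
from typing import Any, Protocol

# (min_x, min_y, max_x, max_y); assumed here to be in EPSG:3067
BBox = tuple[float, float, float, float]


class DataProvider(Protocol):
    """Anything with a source_id and an async fetch_features satisfies this."""

    source_id: str

    async def fetch_features(self, bbox: BBox) -> list[dict[str, Any]]: ...


class StaticProvider:
    """Hypothetical in-memory provider standing in for a WFS or ArcGIS client."""

    def __init__(self, source_id: str, features: list[dict[str, Any]]) -> None:
        self.source_id = source_id
        self._features = features

    async def fetch_features(self, bbox: BBox) -> list[dict[str, Any]]:
        min_x, min_y, max_x, max_y = bbox
        return [
            f for f in self._features
            if min_x <= f["x"] <= max_x and min_y <= f["y"] <= max_y
        ]


async def query_all(
    providers: list[DataProvider], bbox: BBox
) -> dict[str, list[dict[str, Any]]]:
    """Query every provider concurrently without knowing its protocol."""
    results = await asyncio.gather(*(p.fetch_features(bbox) for p in providers))
    return {p.source_id: feats for p, feats in zip(providers, results)}
```

The caller depends only on the protocol, so a WFS client, an ArcGIS client, and a WMS GetFeatureInfo proxy could all be passed to `query_all` interchangeably.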
Spatial Processing
All geometries are stored and processed in EPSG:3067 (ETRS-TM35FIN), the standard Finnish projected coordinate system. The frontend sends and receives WGS84 GeoJSON; conversion happens via PostGIS ST_Transform on the backend. The map renders MML national topographic and orthophoto tiles via proj4leaflet with a 20-level resolution matrix, falling back to the community mirror at kapsi.fi when no MML API key is configured.
The area analysis endpoint (POST /api/area/analyze) accepts a user-drawn polygon and returns cutting notice statistics broken down by purpose, protection coverage from multiple source categories, conflict indicators (notices overlapping protected areas, groundwater zones, or mining claims), Zonation zonal statistics (max, mean, and percentage of area in the national top 10%), and drainage class distribution. All of this is computed server-side through PostGIS spatial joins and raster clip operations.
Zonation scoring deserves specific mention. SYKE’s 2018 national Zonation analysis assigns a 0.0 to 1.0 biodiversity priority value to every 96-meter cell across Finland, incorporating deadwood potential, Red List forest species connectivity, Forest Act habitat sites, and protected area connectivity. Luontovahti imports this as a PostGIS raster and automatically scores every new cutting notice polygon against it, computing area-weighted statistics and classifying risk as CRITICAL, HIGH, MODERATE, or LOW. This means a subscriber can be told not just that a new cutting notice appeared near their area, but that 40% of the notice polygon falls within the top 10% of nationally important biodiversity cells and overlaps a forestry-sensitive water body with known salmonid populations.
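As an illustration of the area-weighted statistics, the sketch below works on a plain list of cell overlaps rather than on PostGIS rasters. The `CellOverlap` type, the assumption that a priority value of 0.9 or higher marks the national top 10%, and the risk cut-offs in `classify_risk` are all hypothetical; the source does not state the actual thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellOverlap:
    value: float    # Zonation priority of the cell, 0.0 to 1.0
    area_m2: float  # area of the cell that lies inside the notice polygon


def zonation_stats(
    cells: list[CellOverlap], top_threshold: float = 0.9
) -> dict[str, float]:
    """Area-weighted mean, max, and share of area at or above the threshold."""
    total = sum(c.area_m2 for c in cells)
    if total <= 0:
        raise ValueError("notice polygon does not overlap any raster cell")
    mean = sum(c.value * c.area_m2 for c in cells) / total
    top_area = sum(c.area_m2 for c in cells if c.value >= top_threshold)
    return {
        "max": max(c.value for c in cells),
        "mean": mean,
        "top10_pct": 100.0 * top_area / total,
    }


def classify_risk(top10_pct: float) -> str:
    """Map the top-10% share to a risk class (cut-offs are assumptions)."""
    if top10_pct >= 50.0:
        return "CRITICAL"
    if top10_pct >= 25.0:
        return "HIGH"
    if top10_pct > 0.0:
        return "MODERATE"
    return "LOW"
```

Weighting by overlap area matters because cells on the polygon boundary are only partially covered; an unweighted mean would overstate their contribution.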
Frontend
The frontend is a single-page React application with TypeScript, built with Vite. The map fills the full viewport with no sidebar; all interaction happens through map controls, popups, and an overlay layer panel. Clicking any feature opens a combined multi-tab popup with three views: feature metadata from all matching sources at the click point, FinBIF species observations grouped by taxon and annotated with IUCN threat status badges, and a spatial analysis tab showing endangered species summaries and nearby overlay conflicts for cutting notices. The popup tabs are implemented as CSS-only radio input selectors to avoid React hydration overhead inside Leaflet’s DOM.
State management uses zustand stores for map viewport, overlay visibility, notices data, drawn polygons, and subscription email state. URL hash synchronization persists the map center, zoom, and active base layer across page loads. Overlay data fetching is debounced at 300ms on viewport changes and skipped below a minimum zoom level.
Subscription Model
The subscription system is deliberately minimal. No user accounts, no passwords, no management dashboard. A subscriber provides an email address, draws one or more monitoring areas, and selects which data sources to track. Each subscription is an independent row (email + geometry + sources array). Verification uses a shared token so that one click confirms all areas from a single request. Each subscription gets its own permanent unsubscribe token, and unsubscribing hard-deletes the row for GDPR simplicity. Email addresses are the only personal data collected; they are normalized, validated, and logged only as SHA-256 hashes in audit trails.
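The token and hashing scheme can be sketched as follows. The function names, the row layout, and the specific normalization rules (trim and lowercase, plus a very loose plausibility check) are assumptions for illustration; the source only says that addresses are normalized, validated, and logged as SHA-256 hashes.

```python
import hashlib
import secrets


def normalize_email(raw: str) -> str:
    """Trim and lowercase an address; reject obviously malformed input."""
    email = raw.strip().lower()
    local, sep, domain = email.partition("@")
    if not sep or not local or "@" in domain or "." not in domain or " " in email:
        raise ValueError("not a plausible email address")
    return email


def email_audit_hash(raw: str) -> str:
    """SHA-256 hex digest of the normalized address, for audit logs."""
    return hashlib.sha256(normalize_email(raw).encode("utf-8")).hexdigest()


def build_subscriptions(
    raw_email: str, areas_wkt: list[str], sources: list[str]
) -> list[dict]:
    """One row per area: shared verification token, unique unsubscribe token."""
    email = normalize_email(raw_email)
    verify_token = secrets.token_urlsafe(32)  # shared across this request
    return [
        {
            "email": email,
            "geometry": area,
            "sources": list(sources),
            "verify_token": verify_token,
            "unsubscribe_token": secrets.token_urlsafe(32),  # per row
            "verified": False,
        }
        for area in areas_wkt
    ]
```

With this layout, confirming by `verify_token` marks every area from the request as verified in one step, while deleting by `unsubscribe_token` removes exactly one row.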
Resilience
External API reliability varies significantly across Finnish government services. The system handles this through several layers. Circuit breakers track per-source failure counts, opening after three consecutive transient errors (timeouts, 5xx responses) and retrying after five minutes. The overlay proxy returns HTTP 503 with a Retry-After header when a circuit is open. The enrichment service runs all source queries concurrently with an overall timeout using asyncio.wait; when the timeout fires, completed results are preserved rather than discarded. The WFS client retries transient failures with exponential backoff and enforces a configurable overall paging timeout (default 600 seconds) to prevent indefinite fetch loops. Error classification distinguishes transient from permanent failures at the type level so that only retriable errors trip circuit breakers.
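The "preserve completed results on timeout" behaviour can be shown with asyncio.wait directly. This is a minimal sketch, assuming each source is represented by a zero-argument coroutine function; the name `enrich_with_timeout` and the return shape are hypothetical.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def enrich_with_timeout(
    sources: dict[str, Callable[[], Awaitable[Any]]], timeout: float
) -> tuple[dict[str, Any], list[str]]:
    """Run all source queries concurrently; keep whatever finished in time.

    Returns (results by source name, sorted names that failed or timed out).
    """
    if not sources:
        return {}, []  # asyncio.wait rejects an empty set

    tasks = {asyncio.create_task(fn()): name for name, fn in sources.items()}
    done, pending = await asyncio.wait(set(tasks), timeout=timeout)

    # Cancel stragglers and wait for the cancellations to settle.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)

    results: dict[str, Any] = {}
    failed: list[str] = [tasks[t] for t in pending]
    for task in done:
        name = tasks[task]
        if task.exception() is not None:
            failed.append(name)  # source raised; non-fatal for the caller
        else:
            results[name] = task.result()
    return results, sorted(failed)
```

Unlike wrapping a single asyncio.gather in asyncio.wait_for, which discards everything when the deadline passes, asyncio.wait returns the done and pending sets separately, so partial enrichment survives a slow source.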
Technology Choices
- Python 3.12 + FastAPI with fully async I/O, including async SQLAlchemy 2.0 sessions and httpx for external HTTP calls. A single shared httpx.AsyncClient with 100 max connections serves all providers, eliminating per-request connection setup.
- PostgreSQL 16 + PostGIS with PostGIS raster support for Zonation and drainage analysis. Ten Alembic migrations manage schema evolution through three milestones. Raster tables are loaded separately via scripts since they come from external GeoTIFF downloads.
- React 19 + TypeScript + Leaflet with proj4leaflet for Finnish national CRS rendering. No heavyweight mapping framework; the CRS configuration, tile URL construction, and layer management are implemented directly.
- APScheduler 3.x as an in-process AsyncIOScheduler, with a guard that prevents startup if multiple uvicorn workers are detected (which would duplicate scheduled jobs and fragment process-global caches).
Current Status and Roadmap
Three milestones are complete. M1 delivered the cutting notice map, WFS integration, and subscription infrastructure. M2 added 109 overlay sources across seven provider types, multi-tab click popups, species observation display with IUCN status, and the enrichment pipeline. M3 integrated Zonation and drainage raster analysis, automated notice scoring, the area analysis endpoint, and geocoding.
The next milestone (M4) covers production deployment on a Hetzner VPS in Helsinki, Nginx reverse proxying with TLS, and wiring up a real email provider. Subsequent milestones add general multi-source change monitoring with daily digest emails (M5), a regional statistics dashboard (M6), geotagged photo uploads with moderation (M7), and PDF report export (M8).
What I Learned
Working with Finnish government spatial data services required navigating significant heterogeneity. The Metsäkeskus WFS does not support CQL filters, requiring OGC FES 2.0 XML filter construction. SYKE GeoServer workspaces sometimes have WFS disabled while WMS remains available, forcing different access strategies for the same logical dataset. ArcGIS FeatureServer area calculations are computed in Web Mercator and arrive inflated by 4 to 8.5 times at Finnish latitudes. Some endpoints return null properties unless you explicitly list property names in the request. Three planned data sources turned out to have their endpoints disabled, unreachable, or misconfigured at the time of integration. Building a resilient multi-source system means treating every external service as potentially unreliable and designing the notification pipeline so that partial failures degrade gracefully rather than blocking delivery.
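The 4 to 8.5 times inflation is consistent with the area distortion of spherical Web Mercator, which scales areas by sec²(latitude). The short sketch below computes that factor; the function names are hypothetical, and the correction is only an approximation for small features where latitude is roughly constant across the polygon.

```python
import math


def web_mercator_area_inflation(lat_deg: float) -> float:
    """Area scale factor of spherical Web Mercator at a latitude: sec^2(lat)."""
    return 1.0 / math.cos(math.radians(lat_deg)) ** 2


def corrected_area_m2(mercator_area_m2: float, lat_deg: float) -> float:
    """Approximate true area from an area measured in Web Mercator."""
    return mercator_area_m2 / web_mercator_area_inflation(lat_deg)


if __name__ == "__main__":
    # Roughly the southern and northern extremes of mainland Finland.
    for lat in (60.0, 70.0):
        print(f"{lat:.0f} deg N: x{web_mercator_area_inflation(lat):.2f}")
```

In practice, recomputing areas from the geometry after reprojecting to EPSG:3067 avoids relying on server-supplied area attributes at all.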
The PostGIS raster integration for Zonation scoring was particularly rewarding. Being able to clip a national-scale biodiversity raster against an arbitrary polygon and compute zonal statistics entirely within the database, then attach those results automatically to every new cutting notice, turns what would otherwise require desktop GIS expertise into an automated background process that runs twice a day.