Innovatiespotter Technical Documentation

Data Pipeline & System Architecture

Overview

Projects Overview

All system components grouped by purpose and function.

Glossary

Business Glossary

Business terminology, platform features, and website categories.

Data

Data Sources

KVK API, Mutatieservice, Handelsdata, BAG, and IBIS integrations.

Pipeline

Data Acquisition Pipeline

Mutatieservice, bedrijf_nieuw queue, and cronjob schedules.

Scraping

Datahunter

Website scraping with dh_full and dh_fast instances.

WGP

DFE / WGP

Website guessing and event-driven web orchestration architecture.

ML

ML Pipeline

Machine learning for website classification.

Ecosystem

ref_scrapers

Innovation ecosystem netlists and company matching.

Search

Spotbot Indexer

Solr indexing, language priority system, and field specifications.

Infrastructure

Architecture Diagram

System topology, database tables, server roles, and CRUD operations.

Achievements

October 2025

Database completeness, SBI 2025 migration, and infrastructure costs.