🔄 Data Acquisition Pipeline

Regular data processing, Mutatieservice, and cronjobs

1. KVK Mutatieservice (Signal Processing)

Signal Polling

Script: mutatieservice.php

Schedule: 4-week lookback period, polled in 1-week intervals
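
The schedule above can be read as four consecutive one-week windows covering the last four weeks. The following is a minimal Python sketch of that reading, not the code in mutatieservice.php; the function name polling_windows and the idea that each poll takes a (start, end) date range are assumptions made for illustration.

```python
from datetime import date, timedelta


def polling_windows(today: date, lookback_weeks: int = 4, step_weeks: int = 1):
    """Split the lookback period into consecutive (start, end) windows, oldest first."""
    step = timedelta(weeks=step_weeks)
    start = today - timedelta(weeks=lookback_weeks)
    windows = []
    while start < today:
        # The last window is clipped so it never extends past today.
        end = min(start + step, today)
        windows.append((start, end))
        start = end
    return windows


windows = polling_windows(date(2025, 10, 29))
for start, end in windows:
    print(start.isoformat(), "->", end.isoformat())
```

Re-polling the full lookback on every run means a signal missed in one run can still be picked up in a later one, at the cost of seeing the same signal more than once.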

Signals monitored:

  • SignaalNieuweInschrijving - New company registered with KVK
  • SignaalBeeindiging_2025_01 - Company closed or bankrupt
  • SignaalVoortzettingEnOverdracht_2025_01 - Company transferred (KVK number changed)
  • SignaalRechtsvormwijziging - Legal form conversion (rechtsvorm change/omzetting)
  • SignaalGewijzigdeInschrijving - Changes to company registration data (will be used to track employee count and SBI code changes)
  • SignaalGewijzigdeVestiging - Changes to establishment data (not used for address tracking because it produces too many updates; address changes are handled via cronjobs)

Processing:

  • All signals saved to kvk_mutatieservice table
  • Known signals added to bedrijf_nieuw queue for processing

Example Signal: SignaalVoortzettingEnOverdracht (Company Transfer)

{
  "signaal": {
    "signaalType": "SignaalVoortzettingEnOverdracht_2025_01",
    "heeftBetrekkingOp": {
      "kvkNummer": "31009215",
      "totaalWerkzamePersonen": 4,
      "wordtUitgeoefendIn": [
        {
          "vestigingsnummer": "000015317528",
          "isHoofdvestiging": true,
          "totaalWerkzamePersonen": 0
        },
        {
          "vestigingsnummer": "000031756611",
          "isHoofdvestiging": false,
          "totaalWerkzamePersonen": 4
        }
      ]
    },
    "voortzettingEnOverdracht": {
      "betrokkenVestigingen": {
        "vestigingsnummers": ["000031756611", "000015317528"]
      },
      "rolBijVoortzettingOverdracht": {
        "code": "VG",
        "omschrijving": "Voortgezet door"
      },
      "kvkNummer": "98568493",
      "persoonRechtsvorm": "Eenmanszaak"
    }
  }
}

Analysis: Company KVK 31009215 transferred to new KVK 98568493. Both vestigingen (branches) moved. Legal form changed from "Vennootschap Onder Firma" to "Eenmanszaak".
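
The fields behind that analysis can be pulled out of the payload mechanically. The sketch below is illustrative Python, assuming the payload has already been decoded into a dict with exactly the structure of the example; summarize_transfer and the keys of its return value are hypothetical names, not part of the pipeline.

```python
def summarize_transfer(payload: dict) -> dict:
    """Extract the old/new KVK numbers and the branches involved in a transfer."""
    signaal = payload["signaal"]
    subject = signaal["heeftBetrekkingOp"]
    transfer = signaal["voortzettingEnOverdracht"]

    registered = {v["vestigingsnummer"] for v in subject["wordtUitgeoefendIn"]}
    moved = set(transfer["betrokkenVestigingen"]["vestigingsnummers"])

    return {
        "old_kvk": subject["kvkNummer"],
        "new_kvk": transfer["kvkNummer"],
        "role": transfer["rolBijVoortzettingOverdracht"]["code"],
        "new_legal_form": transfer["persoonRechtsvorm"],
        "moved_vestigingen": sorted(moved),
        # True when every branch listed for the old KVK number is part of the transfer.
        "all_moved": registered <= moved,
    }


# Trimmed copy of the example signal above.
example = {
    "signaal": {
        "signaalType": "SignaalVoortzettingEnOverdracht_2025_01",
        "heeftBetrekkingOp": {
            "kvkNummer": "31009215",
            "wordtUitgeoefendIn": [
                {"vestigingsnummer": "000015317528", "isHoofdvestiging": True},
                {"vestigingsnummer": "000031756611", "isHoofdvestiging": False},
            ],
        },
        "voortzettingEnOverdracht": {
            "betrokkenVestigingen": {
                "vestigingsnummers": ["000031756611", "000015317528"]
            },
            "rolBijVoortzettingOverdracht": {"code": "VG", "omschrijving": "Voortgezet door"},
            "kvkNummer": "98568493",
            "persoonRechtsvorm": "Eenmanszaak",
        },
    }
}

summary = summarize_transfer(example)
print(summary)
```

Note that the payload only carries the legal form of the continuing party; the previous legal form has to come from data already stored for the old KVK number.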

2. Update Priority Strategy

Priority 1: Innovative Companies (2-month update cycle)

Criteria:

  • Companies with public innovation labels (im_public_*)
  • Active companies only
  • Not updated in last 2 months

Sorting: Oldest update first → Newest founding date → Fewest employees → KVK number descending

Batch size: 1200 companies per run
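
The four-level sort and the batch cut-off can be expressed as a single composite sort key. This is a minimal Python sketch rather than the production query; the Company record and its field names are assumptions, and it assumes every company has a last update date and a founding date.

```python
from datetime import date
from typing import NamedTuple


class Company(NamedTuple):
    kvk: str
    last_update: date
    founded: date
    employees: int


def priority_sort_key(c: Company):
    # Ascending on last_update and employees; descending on founding date
    # and KVK number, achieved by negating their numeric forms.
    return (c.last_update, -c.founded.toordinal(), c.employees, -int(c.kvk))


def next_batch(companies, batch_size: int = 1200):
    """Return the first batch_size companies in priority order."""
    return sorted(companies, key=priority_sort_key)[:batch_size]


companies = [
    Company("11111111", date(2025, 3, 1), date(2010, 1, 1), 10),
    Company("22222222", date(2025, 1, 1), date(2015, 1, 1), 50),
    Company("33333333", date(2025, 1, 1), date(2020, 1, 1), 50),
    Company("44444444", date(2025, 1, 1), date(2020, 1, 1), 5),
    Company("55555555", date(2025, 1, 1), date(2020, 1, 1), 5),
]

batch = next_batch(companies, batch_size=4)
print([c.kvk for c in batch])
```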

Priority 2: Useful Companies (6-month update cycle)

Criteria (applied only when the Priority 1 batch contains fewer than 200 companies):

  • Active vestiging addresses
  • Excluded rechtsvormen: IDs 2794, 2774, 914, 2776, 2777, 942
  • Must meet ONE of:
    • Founded in last 2 years
    • 5-250 employees
    • economischactief = true
    • Has Industrie SBI codes (1*, 2*, 30-33*)
  • Not updated in last 6 months

Sorting: Same as Priority 1
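
The Priority 2 criteria combine hard requirements with a "one of" group, which is easy to get wrong in a query. The Python sketch below shows one way to structure that logic. All field names are hypothetical; 6 months and 2 years are approximated as 182 and 730 days; and the SBI pattern "1*, 2*, 30-33*" is read as "code starts with 1, 2, 30, 31, 32, or 33", which is an interpretation.

```python
from datetime import date, timedelta

EXCLUDED_RECHTSVORM_IDS = {2794, 2774, 914, 2776, 2777, 942}
INDUSTRIE_SBI_PREFIXES = ("1", "2", "30", "31", "32", "33")


def is_priority2_candidate(company: dict, today: date) -> bool:
    # Hard requirements: all of these must hold.
    if not company["active_address"]:
        return False
    if company["rechtsvorm_id"] in EXCLUDED_RECHTSVORM_IDS:
        return False
    if company["last_update"] > today - timedelta(days=182):
        return False  # updated within roughly the last 6 months

    # "Must meet ONE of": any single condition is enough.
    recently_founded = company["founded"] >= today - timedelta(days=730)
    mid_sized = 5 <= company["employees"] <= 250
    industrie = any(
        code.startswith(INDUSTRIE_SBI_PREFIXES) for code in company["sbi_codes"]
    )
    return recently_founded or mid_sized or company["economischactief"] or industrie


today = date(2025, 10, 1)
base = {
    "active_address": True,
    "rechtsvorm_id": 1,
    "last_update": date(2025, 1, 1),
    "founded": date(2000, 1, 1),
    "employees": 1,
    "economischactief": False,
    "sbi_codes": ["6201"],
}

print(is_priority2_candidate(base, today))
print(is_priority2_candidate({**base, "sbi_codes": ["2511"]}, today))
```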

3. bedrijf_nieuw Queue Processing

Queue Structure

Purpose: Central queue for new and updated company data

Sources:

  • Mutatieservice signals
  • KVK dump processing (one-time, see October 2025 page)
  • Manual additions

Key fields: source (kvk/handelsdata), vestigingsnummer (determines endpoint routing)

Processing Scripts

Three parallel processors based on source and vestigingsnummer:

  • insert_from_handelsdata.php - Process entries where source='handelsdata' AND vestigingsnummer IS NULL
  • insert_from_vestigingsprofiel.php - Process entries where source='kvk' AND vestigingsnummer IS NOT NULL
  • insert_new_companies.php - Process entries where source='kvk' AND vestigingsnummer IS NULL (legacy KVK API)

Smart detection: Calls basisprofiel first for brand new KVK numbers, then vestigingsprofiel for branches
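
The routing can be pictured as a small dispatch function over the two key fields. The Python sketch below is illustrative only: it assumes that entries with source='handelsdata' are the ones handled by insert_from_handelsdata.php (in line with the Handelsdata update check, which looks for pending source='handelsdata' records), and route_queue_entry is a hypothetical name.

```python
from typing import Optional


def route_queue_entry(source: str, vestigingsnummer: Optional[str]) -> Optional[str]:
    """Return the script expected to pick up a bedrijf_nieuw entry, or None."""
    has_branch = vestigingsnummer is not None
    if source == "handelsdata" and not has_branch:
        return "insert_from_handelsdata.php"
    if source == "kvk" and has_branch:
        return "insert_from_vestigingsprofiel.php"
    if source == "kvk":
        return "insert_new_companies.php"
    # Combination not covered by the rules listed above.
    return None


print(route_queue_entry("kvk", None))
print(route_queue_entry("kvk", "000015317528"))
print(route_queue_entry("handelsdata", None))
```

Because each processor selects on a disjoint combination of source and vestigingsnummer, the scripts can run in parallel without picking up the same queue row.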

4. Handelsdata API Integration

Purpose

A cheaper alternative to the KVK API for periodically updating branches (target: once per year), needed mainly to track address changes.

October 2025 usage: Tested quality and filled gap of missing registrations (see October 2025 page)

Ongoing usage: Periodic updates, combined with Mutatieservice signals or Handelsdata exportSelectie for targeted monthly updates

Implementation

Class: handelsdataRequest.class.php

Features: Multi-token system, sticky token strategy, auto-rotation on rate limits
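
A sticky token strategy with rotation on rate limits could look roughly like the Python sketch below. It is not a port of handelsdataRequest.class.php: StickyTokenPool, RateLimited, and the request callable are hypothetical, and how the real class detects a rate limit is not described here.

```python
class RateLimited(Exception):
    """Raised by a request when the current token has hit its rate limit."""


class StickyTokenPool:
    def __init__(self, tokens):
        if not tokens:
            raise ValueError("at least one token is required")
        self._tokens = list(tokens)
        self._index = 0

    @property
    def current(self) -> str:
        return self._tokens[self._index]

    def rotate(self) -> str:
        self._index = (self._index + 1) % len(self._tokens)
        return self.current

    def call(self, request):
        """Run request(token), staying on the current token until it is rate limited."""
        for _ in range(len(self._tokens)):
            try:
                return request(self.current)
            except RateLimited:
                self.rotate()
        raise RateLimited("all tokens are rate limited")


def fake_request(token: str) -> str:
    # Stand-in for an API call: token "A" is exhausted, the others work.
    if token == "A":
        raise RateLimited(token)
    return f"ok:{token}"


pool = StickyTokenPool(["A", "B", "C"])
first = pool.call(fake_request)   # rotates from A to B
second = pool.call(fake_request)  # sticks with B
print(first, second, pool.current)
```

Staying on one token until it fails, rather than round-robining, uses up one subscription's credits before touching the next.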

Scripts:

  • insert_from_handelsdata.php - Process new registrations
  • update_handelsdata.php - Periodic updates (--force to bypass bedrijf_nieuw check)

Data fields updated: werknemers, websites, oprichting, email, telefoon, nmi (non-mailing-indicator)

Setup: 3 subscriptions at €40/month each (€120/month, 300k credits/month in total), configured in core.ini

Fallback: Failures are handed over to the KVK endpoints, either via the kvk_requests table or by changing the source in bedrijf_nieuw

Handelsdata Update Priority

Blocks update if: Pending records exist in bedrijf_nieuw with source='handelsdata'

Reason: Give priority to inserting new companies first to optimize API credit usage

Override: Use --force flag to bypass check
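
The check and its override reduce to a small guard at the start of the update run. The following Python sketch only illustrates that logic; may_run_update and the pending-row count passed to it are assumptions, and only the --force flag comes from the description above.

```python
import argparse


def parse_args(argv):
    parser = argparse.ArgumentParser(description="Periodic Handelsdata update (sketch)")
    parser.add_argument("--force", action="store_true",
                        help="run even if bedrijf_nieuw has pending handelsdata rows")
    return parser.parse_args(argv)


def may_run_update(pending_handelsdata_rows: int, force: bool = False) -> bool:
    """Updates wait until the insert queue is empty, unless forced."""
    return force or pending_handelsdata_rows == 0


args = parse_args(["--force"])
print(may_run_update(pending_handelsdata_rows=3, force=args.force))
print(may_run_update(pending_handelsdata_rows=3))
```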

5. KVK API Integration

Vestigingsprofiel Integration

Purpose: Detailed branch profiles with werknemers/sbi/address information per branch

Class: kvkRequest.class.php (extended to support vestigingsprofiel endpoint)

Script: insert_from_vestigingsprofiel.php

Features:

  • force_paid vs allow_paid parameter support for controlled paid API usage
  • TEMPORARILY_UNAVAILABLE exception handling (IPD1002 KVK error - 503 responses)
  • LARGE_COMPANIES_KVK documentation support (companies >1000 branches)
  • Enhanced error handling and retry logic with max_timeout for critical errors
  • Logic fix: Fixed missing vestigingsnummer matching when updating inactive branches
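
As an illustration of the retry behaviour for temporary outages, the Python sketch below retries a request until a maximum total timeout is reached. It is an assumption-laden sketch, not the logic in kvkRequest.class.php: the exponential backoff, the parameter names, and the TemporarilyUnavailable exception class are all chosen for the example.

```python
import time


class TemporarilyUnavailable(Exception):
    """Stand-in for the TEMPORARILY_UNAVAILABLE case (KVK error IPD1002, HTTP 503)."""


def call_with_retry(request, max_timeout: float = 60.0, base_delay: float = 1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Retry request() with doubling delays until max_timeout would be exceeded."""
    deadline = clock() + max_timeout
    delay = base_delay
    while True:
        try:
            return request()
        except TemporarilyUnavailable:
            # Give up (re-raise) if waiting again would pass the deadline.
            if clock() + delay > deadline:
                raise
            sleep(delay)
            delay *= 2


attempts = []


def flaky_request():
    # Fails twice, then succeeds, to exercise the retry path.
    attempts.append(1)
    if len(attempts) < 3:
        raise TemporarilyUnavailable("IPD1002")
    return "profile"


delays = []
result = call_with_retry(flaky_request, sleep=delays.append)  # record delays instead of sleeping
print(result, delays)
```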

Supporting Classes

taxonomie.class.php fixes:

  • getSBIOfHoofdvestiging() - Fixed incorrect SBI code retrieval for main branch (include only proper taxonomie type_id)
  • getCodeByName() - Fixed SBI code lookup by name matching

company.class.php enhancements:

  • estimateData() - Call before saving a company so that economischactief and any missing oprichting, sbi, and rechtsvorm fields are populated

6. Periodic Updates

Update Scripts

Scripts:

  • update_kvk.php [--force] - Periodic updates with priority check from KVK API
  • update_handelsdata.php [--force] - Periodic updates with priority check from Handelsdata API

Cronjob Schedule

Three Times per Hour

  • :05, :25, :45 - insert_new_companies.php force
  • :07, :22, :37 - insert_from_handelsdata.php --batch-size=1000 --debug
  • :15, :35, :55 - insert_from_vestigingsprofiel.php --limit=600 --allow-update --process-missing-kvk --debug

Hourly

  • :10 - update_kvk.php force
  • :52 - update_handelsdata.php --debug --batch-size=1000 --from-db

Weekly (Monday)

  • 06:58 - reset_kvk_instellingen.php (Reset KVK counters)
  • 07:01 - reset_handelsdata_instellingen.php (Reset Handelsdata counters)