From Event Streams to NeuroSymbolic Programming

A Research Journey Across Semantic Web, ISO 25964, and Modern NeuroSymbolic AI

Mar 12, 2026

Over the course of this research exploration, we investigated a surprisingly coherent trajectory across several areas of computer science:

Event sourcing / Kappa architectures
Semantic Web and stream reasoning
ISO 25964-2 vocabulary interoperability
NeuroSymbolic AI
Modern neurosymbolic programming frameworks like Scallop

The original goal was simple: identify the most serious academic projects that connect these domains.
The result was a much deeper picture of how several research traditions have evolved and partially converged.

This article reconstructs the entire exploration, from early Semantic Web infrastructure projects to the most actively developed NeuroSymbolic AI frameworks today.

1. The Original Question

The starting point of the investigation was:

What are the top-tier academic projects connecting event sourcing (event-driven/Kappa architectures) with Semantic Web technologies and ISO 25964-2 thesaurus interoperability?

The short answer was immediately surprising:

There is no single canonical project combining all three.

Instead, the research landscape splits into two major clusters:

Cluster A — Semantic Event & Stream Processing

RDF stream processing
semantic complex event processing
real-time reasoning over evolving knowledge graphs

Cluster B — Vocabulary Interoperability

SKOS and ISO 25964
thesaurus mapping
controlled vocabulary infrastructure

The overlap between these clusters remains thin, but several projects approach it from different directions.

2. The Foundational Semantic Web Projects

LarKC — The Large Knowledge Collider

One of the most ambitious projects in this space was LarKC, funded under the EU FP7 program from 2008 to 2011.

LarKC attempted to solve a fundamental challenge:

How do we perform reasoning over web-scale knowledge when the data is noisy, incomplete, heterogeneous, and constantly changing?

Instead of building a single massive reasoner, LarKC proposed a pipeline architecture:

retrieve → abstract → select → reason → decide

This approach introduced several ideas that later became mainstream:

incomplete reasoning
workflow-based reasoning pipelines
pluggable reasoning components
distributed semantic computation

LarKC produced a service-oriented platform allowing reasoning plugins to be composed into distributed workflows.
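
The pipeline idea can be pictured as a chain of interchangeable plugins. The following is a minimal Python sketch of that pattern only; it is not LarKC's actual API, and every name in it (compose, select_predicate, reason_transitive, decide_limit) is hypothetical.

```python
from typing import Callable

Triple = tuple[str, str, str]
# A plugin maps a list of triples to a (usually smaller or richer) list of triples.
Plugin = Callable[[list[Triple]], list[Triple]]


def compose(*plugins: Plugin) -> Plugin:
    """Chain plugins left to right into a single workflow."""
    def workflow(triples: list[Triple]) -> list[Triple]:
        for plugin in plugins:
            triples = plugin(triples)
        return triples
    return workflow


def select_predicate(predicate: str) -> Plugin:
    """'Select' step: keep only the statements worth reasoning over."""
    return lambda triples: [t for t in triples if t[1] == predicate]


def reason_transitive(triples: list[Triple]) -> list[Triple]:
    """'Reason' step: transitive closure over the selected statements."""
    closure = set(triples)
    changed = True
    while changed:
        changed = False
        for (a, p, b) in list(closure):
            for (c, q, d) in list(closure):
                if p == q and b == c and (a, p, d) not in closure:
                    closure.add((a, p, d))
                    changed = True
    return sorted(closure)


def decide_limit(max_results: int) -> Plugin:
    """'Decide' step: stop with a bounded, possibly incomplete answer."""
    return lambda triples: triples[:max_results]


# Stand-in for the output of a 'retrieve' step.
retrieved = [
    ("Leuven", "locatedIn", "Belgium"),
    ("Belgium", "locatedIn", "Europe"),
    ("Leuven", "label", "Leuven"),
]
workflow = compose(select_predicate("locatedIn"), reason_transitive, decide_limit(10))
result = workflow(retrieved)
```

Because each step has the same signature, a step can be swapped or dropped without touching the others, which is the property the plugin design was after.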

It also inspired research in:

stream reasoning
semantic middleware
hybrid AI architectures

However, LarKC itself did not survive as a production platform.

Its legacy instead lived on in several subfields.

3. The Rise of Semantic Stream Reasoning

One of the most influential lines of research to grow out of LarKC was stream reasoning.

This field attempts to answer a central question:

How can machines reason over continuous streams of data rather than static knowledge bases?

Important systems included:

C-SPARQL

Continuous SPARQL queries over RDF streams.

ETALIS

A semantic complex event processing engine.

CityPulse

A large EU project for real-time semantic processing of smart-city data streams.

These systems introduced key ideas:

RDF streams
continuous queries
semantic event detection
real-time reasoning over knowledge graphs
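
To make "semantic event detection" concrete, here is a small Python sketch of a sequence pattern over timestamped triples. It does not use C-SPARQL or ETALIS syntax; the StreamTriple type, the "observed" predicate, and the sensor data are assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class StreamTriple:
    subject: str
    predicate: str
    obj: str
    timestamp: float  # seconds


def detect_sequence(
    stream: Iterable[StreamTriple],
    first: str,
    then: str,
    within: float,
) -> Iterator[tuple[str, float]]:
    """Yield (subject, time) whenever a subject's `first` observation is
    followed by its `then` observation no more than `within` seconds later."""
    last_seen: dict[str, float] = {}  # subject -> time of latest `first` observation
    for t in stream:  # assumed to arrive in timestamp order
        if t.predicate != "observed":
            continue
        if t.obj == first:
            last_seen[t.subject] = t.timestamp
        elif t.obj == then:
            started = last_seen.get(t.subject)
            if started is not None and t.timestamp - started <= within:
                yield (t.subject, t.timestamp)
                del last_seen[t.subject]  # each `first` matches at most once


stream = [
    StreamTriple("room1", "observed", "HighTemperature", 0.0),
    StreamTriple("room2", "observed", "Smoke", 5.0),
    StreamTriple("room1", "observed", "Smoke", 20.0),
    StreamTriple("room2", "observed", "HighTemperature", 30.0),
    StreamTriple("room2", "observed", "Smoke", 120.0),
]
alerts = list(detect_sequence(stream, "HighTemperature", "Smoke", within=60.0))
```

Only room1 triggers an alert here: room2's smoke reading arrives 90 seconds after its temperature reading, outside the 60-second bound.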

Although influential academically, many of these systems eventually slowed down or stopped active development.

4. The ISO 25964 Vocabulary Ecosystem

Parallel to the stream reasoning world, another ecosystem evolved around controlled vocabularies.

ISO 25964-2 focuses on interoperability between thesauri and other knowledge organization systems.

Several major infrastructure projects emerged.

Skosmos

A widely used vocabulary publication platform.

Features include:

browsing SKOS vocabularies
SPARQL integration
multilingual thesaurus support

Skosmos remains one of the most actively maintained projects in this space.

JSKOS Server

A backend infrastructure for vocabulary storage and mapping.

Key capabilities include:

concept management
mapping between vocabularies
concordances and annotations
streaming change notifications via WebSockets

This makes JSKOS Server one of the rare systems combining vocabulary infrastructure with event-style updates.
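
The link to event sourcing can be sketched as follows: if every mapping change is an event, the current set of mappings is just a fold over the event log. This Python sketch does not reproduce JSKOS Server's actual message format or API. The MappingEvent type and the example.org URIs are hypothetical; only the SKOS mapping relation URIs are real.

```python
from dataclasses import dataclass

SKOS = "http://www.w3.org/2004/02/skos/core#"


@dataclass(frozen=True)
class MappingEvent:
    kind: str            # "create", "update", or "delete"
    mapping_id: str
    source: str = ""     # concept URI in the source vocabulary
    target: str = ""     # concept URI in the target vocabulary
    relation: str = ""   # SKOS mapping relation URI


def replay(events: list[MappingEvent]) -> dict[str, tuple[str, str, str]]:
    """Fold an append-only event log into the current set of mappings."""
    state: dict[str, tuple[str, str, str]] = {}
    for e in events:
        if e.kind in ("create", "update"):
            state[e.mapping_id] = (e.source, e.relation, e.target)
        elif e.kind == "delete":
            state.pop(e.mapping_id, None)
        else:
            raise ValueError(f"unknown event kind: {e.kind}")
    return state


log = [
    MappingEvent("create", "m1", "http://example.org/a/dogs",
                 "http://example.org/b/canines", SKOS + "closeMatch"),
    MappingEvent("create", "m2", "http://example.org/a/cats",
                 "http://example.org/b/felines", SKOS + "exactMatch"),
    MappingEvent("update", "m1", "http://example.org/a/dogs",
                 "http://example.org/b/canines", SKOS + "exactMatch"),
    MappingEvent("delete", "m2"),
]
current = replay(log)
```

Replaying a prefix of the log, for example replay(log[:2]), reconstructs the mappings as they stood at that point, which is what makes the log rather than the table the source of truth.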

Cocoda

A collaborative tool for creating mappings between knowledge organization systems.

It enables the practical work required by ISO 25964-2:

vocabulary alignment
mapping creation
cross-domain terminology integration

Together, Skosmos, JSKOS Server, and Cocoda form the most active current ecosystem for semantic vocabulary interoperability.

5. The Current Semantic Streaming Landscape

Most early stream-reasoning systems are no longer the center of innovation.

The modern ecosystem includes newer projects such as:

RDF-Connect

A framework for RDF-based streaming data pipelines.

OntopStream

A system enabling streaming virtual knowledge graphs using Apache Flink.

Kolibrie

A modern Rust-based RDF stream reasoning engine.

I-DLV-sr

A logic-based system for reasoning over streaming data using Apache Flink.

Among these, Kolibrie is particularly interesting because it pairs stream reasoning with machine learning integration.

6. Kolibrie — A Modern Stream Reasoning Engine

Kolibrie is developed by the Stream Intelligence Lab at KU Leuven.

It focuses on:

continuous SPARQL queries
reasoning over timestamped triples
sliding and tumbling windows
integration with machine learning
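
The difference between tumbling and sliding windows is easiest to see in how a single timestamp is assigned to windows. The sketch below is plain Python and says nothing about Kolibrie's own Rust API; the function names and the sample readings are assumptions for illustration.

```python
from collections import defaultdict


def tumbling_window(timestamp: float, size: float) -> tuple[float, float]:
    """Each item belongs to exactly one window [start, start + size)."""
    start = (timestamp // size) * size
    return (start, start + size)


def sliding_windows(timestamp: float, size: float, slide: float) -> list[tuple[float, float]]:
    """Each item belongs to every window [start, start + size) whose start is a
    non-negative multiple of `slide`; with slide < size the windows overlap."""
    windows = []
    start = (timestamp // slide) * slide  # latest window start <= timestamp
    while start > timestamp - size and start >= 0:
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)


# Timestamped triples: (subject, predicate, object, timestamp in seconds).
readings = [
    ("s1", "temp", "21", 1.0),
    ("s1", "temp", "22", 7.0),
    ("s1", "temp", "25", 12.0),
]

# Group triples into 10-second tumbling windows.
by_window = defaultdict(list)
for s, p, o, ts in readings:
    by_window[tumbling_window(ts, 10.0)].append((s, p, o))
```

A continuous query is then re-evaluated once per window over the triples that fall inside it. With a 10-second window sliding every 5 seconds, the reading at 12.0 contributes to two overlapping windows instead of one.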

Kolibrie is currently used primarily for:

benchmark evaluation
research experiments
neurosymbolic stream reasoning research

It does not yet power large production systems but is actively used in academic research.

7. The Shift Toward NeuroSymbolic AI

While exploring these systems, the investigation naturally expanded into NeuroSymbolic AI.

NeuroSymbolic systems aim to combine:

neural learning
symbolic reasoning
probabilistic logic
structured knowledge representations

Several major frameworks dominate the academic landscape.

8. The Most Active NeuroSymbolic AI GitHub Projects

The most active academic projects today include:

Scallop

A neurosymbolic programming language based on Datalog.

DeepProbLog

A probabilistic logic framework integrating neural networks.

PyNeuraLogic / NeuraLogic

A differentiable logic programming framework.

DeepSeaProbLog

An extension of DeepProbLog that supports both discrete and continuous random variables.

DeepStochLog

A framework combining grammars, probabilities, and neural networks.

PEIRCE

An emerging framework combining LLMs with symbolic reasoning.

These systems represent different approaches to the same core problem:

How can symbolic reasoning and neural learning be integrated into a single framework?

Among them, Scallop stands out as one of the most ambitious projects.

9. The History of Scallop

Scallop evolved through several distinct phases.

Phase 1 — Prehistory

Scallop builds on earlier systems like:

TensorLog
DeepProbLog
probabilistic deductive databases

Scalability was a recurring obstacle for these systems, particularly where they relied on exact probabilistic inference.

Phase 2 — Scallop v1 (2021)

The first Scallop system appeared in a NeurIPS 2021 paper.

Its main innovation was:

scalable differentiable reasoning using Datalog

Instead of enumerating every proof of a derived fact, Scallop keeps only the k most probable proofs.

This dramatically improved scalability.
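
The truncation idea can be illustrated in a few lines of Python. This is a simplified sketch, not Scallop's implementation: it ignores differentiability, assumes independent input facts, and uses a brute-force model count that only works for tiny examples. The fact names and probabilities are made up.

```python
from itertools import product
from math import prod

Proof = frozenset[str]  # a proof is the set of input facts it relies on

# Hypothetical input facts with independent probabilities.
FACT_PROB = {"e_ab": 0.9, "e_bc": 0.8, "e_ac": 0.3}


def fact(name: str) -> set[Proof]:
    """An input fact has exactly one proof: itself."""
    return {frozenset([name])}


def proof_prob(proof: Proof) -> float:
    return prod(FACT_PROB[f] for f in proof)


def top_k(proofs: set[Proof], k: int) -> set[Proof]:
    """Keep only the k most probable proofs."""
    return set(sorted(proofs, key=proof_prob, reverse=True)[:k])


def disjoin(a: set[Proof], b: set[Proof], k: int) -> set[Proof]:
    """Alternative derivations of the same fact: pool the proofs, then truncate."""
    return top_k(a | b, k)


def conjoin(a: set[Proof], b: set[Proof], k: int) -> set[Proof]:
    """Joint premises: every pair of proofs merges into one, then truncate."""
    return top_k({pa | pb for pa in a for pb in b}, k)


def probability(proofs: set[Proof]) -> float:
    """Brute-force weighted model count: P(at least one kept proof holds)."""
    facts = sorted(set().union(*proofs)) if proofs else []
    total = 0.0
    for world in product([True, False], repeat=len(facts)):
        truth = dict(zip(facts, world))
        if any(all(truth[f] for f in p) for p in proofs):
            total += prod(FACT_PROB[f] if truth[f] else 1 - FACT_PROB[f] for f in facts)
    return total


def path_ac(k: int) -> set[Proof]:
    """path(a,c) holds via edge a-c directly, or via a-b and b-c together."""
    return disjoin(conjoin(fact("e_ab"), fact("e_bc"), k), fact("e_ac"), k)


approx = probability(path_ac(k=1))  # only the best proof is kept
exact = probability(path_ac(k=2))   # both proofs are kept
```

With k=1 the result is a lower bound on the probability obtained with both proofs. The trade is accuracy for a proof set whose size no longer grows with the number of derivations.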

Phase 3 — Scallop as a Programming Language (2023)

The PLDI 2023 paper reframed Scallop as a full language.

Key ideas:

relations as the core data model
Datalog as the reasoning language
provenance semirings for differentiable reasoning
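
The role of a provenance semiring can be sketched as follows: the Datalog rules stay fixed, and the choice of semiring decides what the tags on derived facts mean. This Python sketch is an illustration under assumptions, not Scallop's API. The Semiring class, the two example semirings, and the edge data are hypothetical, and Scallop's differentiable provenances are considerably more involved.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Semiring(Generic[T]):
    zero: T                      # tag of a fact that is not derivable
    one: T                       # tag of a fact that holds unconditionally
    add: Callable[[T, T], T]     # combines alternative derivations ("or")
    mul: Callable[[T, T], T]     # combines joint premises ("and")


BOOLEAN = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
MAX_PROD = Semiring(0.0, 1.0, max, lambda a, b: a * b)


def paths(edges: dict[tuple[str, str], T], sr: Semiring[T]) -> dict[tuple[str, str], T]:
    """Evaluate the rules
         path(x, y) :- edge(x, y).
         path(x, z) :- path(x, y), edge(y, z).
    to a fixpoint, tagging each derived fact with a semiring value."""
    path = dict(edges)
    changed = True
    while changed:
        changed = False
        for (x, y), t1 in list(path.items()):
            for (y2, z), t2 in edges.items():
                if y != y2:
                    continue
                old = path.get((x, z), sr.zero)
                new = sr.add(old, sr.mul(t1, t2))
                if new != old:
                    path[(x, z)] = new
                    changed = True
    return path


# Same rules, two interpretations.
reachable = paths({("a", "b"): True, ("b", "c"): True}, BOOLEAN)
best = paths({("a", "b"): 0.9, ("b", "c"): 0.8, ("a", "c"): 0.3}, MAX_PROD)
```

Under the Boolean semiring the program computes plain reachability. Under max-product it computes the weight of the single best derivation, so path(a, c) is tagged with 0.9 × 0.8 rather than the direct edge's 0.3.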

At this stage Scallop became a full programming ecosystem including:

compiler
interpreter
REPL
Python/PyTorch bindings

Phase 4 — Education and Community

Scallop became widely used for teaching neurosymbolic programming.

The project hosts tutorials and materials for:

LOG22
SSFT22
PLDI23
SSNP24 summer school

This transformed Scallop from a research prototype into a learning platform for the field.

Phase 5 — Integration with Foundation Models

Recent work extends Scallop to integrate with:

vision models
language models
multimodal foundation models

Plugins such as:

scallop-gpt
scallop-clip

show how symbolic reasoning can be combined with modern AI systems.

10. Where the Field Stands Today

The investigation revealed a fascinating landscape.

Three major ecosystems coexist:

Semantic Stream Reasoning

Focus on real-time reasoning over data streams.

Vocabulary Interoperability

Focus on controlled vocabularies and thesaurus mapping.

NeuroSymbolic AI

Focus on integrating neural learning with symbolic reasoning.

The biggest gap lies where these three ecosystems meet event-driven architecture.

There is still no flagship system combining:

event sourcing / Kappa architectures
semantic stream reasoning
ISO 25964 vocabulary interoperability
neurosymbolic reasoning

But the pieces now exist.

11. A Possible Future Architecture

A realistic modern architecture might look like this:

Event Log (Kafka / Kappa architecture)
        ↓
Semantic Stream Layer
(RDF-Connect / Kolibrie)
        ↓
NeuroSymbolic Reasoning
(Scallop / DeepProbLog)
        ↓
Vocabulary & Mapping Layer
(JSKOS Server / Skosmos / Cocoda)

Such a stack could support:

real-time knowledge graph reasoning
semantic interoperability
explainable AI decisions
controlled vocabulary governance
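
As a toy illustration of how the layers could fit together, the Python sketch below replaces each layer with a single function. Everything in it is hypothetical: the event shape, the hand-written rule standing in for a neurosymbolic engine, and the example.org thesaurus URI. None of the named systems is actually called.

```python
# Append-only log, standing in for a Kafka topic in a Kappa architecture.
EVENT_LOG = [
    {"offset": 0, "sensor": "s1", "reading": "temp_high"},
    {"offset": 1, "sensor": "s2", "reading": "temp_ok"},
    {"offset": 2, "sensor": "s1", "reading": "smoke"},
]

# Vocabulary layer: local terms mapped to concept URIs in a shared thesaurus.
CONCEPT_FOR = {"fire_risk": "http://example.org/thesaurus/FireRisk"}


def to_triple(event):
    """Semantic stream layer: lift a raw event to a timestamped triple."""
    return (event["sensor"], "observed", event["reading"], event["offset"])


def infer(triples):
    """Reasoning layer: one hand-written rule in place of a real engine.
    A sensor that observed both temp_high and smoke is at risk of fire."""
    seen = {}
    for s, _, o, _ in triples:
        seen.setdefault(s, set()).add(o)
    return [
        (s, "hasRisk", "fire_risk")
        for s in sorted(seen)
        if {"temp_high", "smoke"} <= seen[s]
    ]


def normalize(facts):
    """Vocabulary layer: replace local terms with shared concept URIs, if mapped."""
    return [(s, p, CONCEPT_FOR.get(o, o)) for s, p, o in facts]


def replay(log):
    """Kappa-style: every view is derived by replaying the log from offset 0."""
    return normalize(infer(to_triple(e) for e in log))


view = replay(EVENT_LOG)
```

Since the view is a pure function of the log, changing a rule or a vocabulary mapping only requires replaying the log through the new pipeline, with no separate batch path to maintain.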

12. Final Thoughts

This exploration began with a simple question about event-driven semantic architectures.

It ended by mapping an entire research ecosystem spanning:

Semantic Web infrastructure
stream reasoning
controlled vocabulary interoperability
neurosymbolic programming

What emerged is a clear lesson:

The future of intelligent systems is likely hybrid.

Not purely neural.
Not purely symbolic.
Not purely streaming.

But a combination of all three.

And the pieces of that architecture are already being built today.

Wondrous Machines
