
Knowledge Engineering
and the 'Shortcomings' of SQL

Where SQL fits, where it doesn't, and a practical path forward
By G. Sawatzky, embedded-commerce.com

What is Knowledge Representation?

Knowledge Engineering Value Proposition

Six Technical Challenges with SQL for KR

The Three-Layer Problem

The Problem: Traditional SQL design forces blending these layers. Choosing VARCHAR size (physical) impacts how "name" (conceptual) is perceived. KE requires clean separation with loosely coupled mappings.
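
One partial way to loosen this coupling in plain SQL is to put a view between the conceptual name and the physical table. The sketch below is illustrative only: the table names `person_store_v1` and `person_store_v2`, the column names, and the sample row are hypothetical and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Physical layer (hypothetical): storage-oriented names and a sized VARCHAR.
conn.execute("CREATE TABLE person_store_v1 (pid INTEGER PRIMARY KEY, nm VARCHAR(40))")
conn.execute("INSERT INTO person_store_v1 VALUES (1, 'Ada Lovelace')")

# Mapping: the conceptual 'person' with a 'name' is exposed through a view.
conn.execute(
    "CREATE VIEW person AS SELECT pid AS person_id, nm AS name FROM person_store_v1"
)
before = conn.execute("SELECT person_id, name FROM person").fetchall()

# Physical change: a new table with a different column name and type.
conn.execute("CREATE TABLE person_store_v2 (pid INTEGER PRIMARY KEY, full_name TEXT)")
conn.execute("INSERT INTO person_store_v2 SELECT pid, nm FROM person_store_v1")

# Only the mapping is rewritten; queries against 'person' stay the same.
conn.execute("DROP VIEW person")
conn.execute(
    "CREATE VIEW person AS SELECT pid AS person_id, full_name AS name FROM person_store_v2"
)
after = conn.execute("SELECT person_id, name FROM person").fetchall()

print(before == after)  # True
```

The view is the only object that knows both layers, so it acts as the loosely coupled mapping. It still lives inside the same SQL schema, which is part of the blending the slide describes.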

CWA vs. OWA: A Fundamental Divide

Impact: CWA simplifies querying but struggles with incomplete knowledge. OWA is better suited to dynamic, real-world knowledge. SQL can simulate OWA with LEFT JOIN or NOT EXISTS over explicitly stored positive and negative facts, but OWA is not native to it.
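
A minimal sketch of the difference, using SQLite from Python. The table names and the three sample fruits are assumptions made for illustration. The first query reads absence as "false" (CWA). The second keeps separate tables for asserted-true and asserted-false facts and uses LEFT JOIN to report "unknown" when neither is present.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fruit (name TEXT PRIMARY KEY);
CREATE TABLE is_sweet (name TEXT PRIMARY KEY);      -- asserted true
CREATE TABLE is_not_sweet (name TEXT PRIMARY KEY);  -- asserted false
INSERT INTO fruit VALUES ('orange'), ('kiwi'), ('lemon');
INSERT INTO is_sweet VALUES ('orange');
INSERT INTO is_not_sweet VALUES ('lemon');
""")

# CWA reading: any fruit not recorded as sweet is treated as not sweet.
cwa_not_sweet = [row[0] for row in conn.execute("""
    SELECT f.name FROM fruit f
    WHERE NOT EXISTS (SELECT 1 FROM is_sweet s WHERE s.name = f.name)
    ORDER BY f.name
""")]

# Simulated OWA: 'unknown' is the case where neither assertion exists.
owa_status = dict(conn.execute("""
    SELECT f.name,
           CASE WHEN s.name IS NOT NULL THEN 'sweet'
                WHEN n.name IS NOT NULL THEN 'not sweet'
                ELSE 'unknown' END
    FROM fruit f
    LEFT JOIN is_sweet s ON s.name = f.name
    LEFT JOIN is_not_sweet n ON n.name = f.name
""").fetchall())

print(cwa_not_sweet)       # ['kiwi', 'lemon']
print(owa_status["kiwi"])  # unknown
```

Under the CWA query, kiwi is lumped in with lemon. The simulated OWA only works because the negative facts are stored explicitly, which is the extra modelling effort the slide refers to.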

SQL ≠ The Relational Model

The formal RM provides robust knowledge representation through declarative constraints. Each "relvar" (relation variable) carries a predicate, and every tuple in it asserts a true proposition about the world.
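
The sketch below shows how far SQL's own declarative constraints go toward this idea, using SQLite from Python. The `person` and `enrollment` tables, the credit range, and the sample rows are hypothetical. Each accepted row is read as a true instance of the predicate in the comment, and rows that would make it false are rejected by the constraints rather than by application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Predicate: "person <person_id> is enrolled in <course> for <credits> credits"
CREATE TABLE enrollment (
    person_id INTEGER NOT NULL REFERENCES person(person_id),
    course    TEXT    NOT NULL,
    credits   INTEGER NOT NULL CHECK (credits BETWEEN 1 AND 6),
    PRIMARY KEY (person_id, course)
);

INSERT INTO person VALUES (123, 'Ada');
INSERT INTO enrollment VALUES (123, 'Logic', 3);
""")


def rejected(sql):
    """Return True if the statement violates a declared constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


bad_credits = rejected("INSERT INTO enrollment VALUES (123, 'Sets', 9)")      # CHECK
unknown_person = rejected("INSERT INTO enrollment VALUES (999, 'Logic', 3)")  # FOREIGN KEY
duplicate_fact = rejected("INSERT INTO enrollment VALUES (123, 'Logic', 3)")  # PRIMARY KEY

print(bad_credits, unknown_person, duplicate_fact)  # True True True
```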

Date & Darwen: SQL's Deviations from RM
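
The body of this slide is not reproduced here. As a hedged illustration, duplicate rows and NULLs are two deviations that Date and Darwen criticize in their writing. The table `t` and its values below are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (x INTEGER);             -- no key declared, so duplicates are allowed
INSERT INTO t VALUES (1), (1), (NULL);
""")


def count(where=""):
    """Count rows of t, optionally filtered by a WHERE clause."""
    return conn.execute(f"SELECT COUNT(*) FROM t {where}").fetchone()[0]


# Duplicates: a SQL table is a bag of rows, not a set of tuples.
ones = count("WHERE x = 1")

# Three-valued logic: for the NULL row, 'x = x' is neither true nor false,
# so that row satisfies neither the condition nor its negation.
equal_to_itself = count("WHERE x = x")
not_equal_to_itself = count("WHERE NOT (x = x)")
total = count()

print(ones, equal_to_itself, not_equal_to_itself, total)  # 2 2 0 3
```

In the formal model a relation is a set and every proposition is either true or false, so neither result could occur.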

The Composability Challenge

When SQL's declarative features fall short, developers resort to procedural code, scattering logical definitions.

Use SQL Where It Shines

Strategy: Use SQL for what it does best, not as the sole KR solution.

SQL is Evolving (Stonebraker: "What Goes Around Comes Around")

SQL absorbs good ideas from alternative data models, extending capabilities within the relational paradigm.

Declarative Alternatives for KR

These languages offer the declarative power and composability KE practitioners need.

Logica Example: Class Hierarchy & Inference

@Engine("duckdb");

# Define graduate students
GraduateStudent(person_id: 123);
GraduateStudent(person_id: 456);

# Rule: GraduateStudent implies Student
Student(person_id:) :- GraduateStudent(person_id:);

# Define undergraduates
Undergraduate(person_id: 789);
Student(person_id:) :- Undergraduate(person_id:);

# Rule: GraduateStudents have library access
HasLibraryAccess(person_id:) :- GraduateStudent(person_id:);
Result: Query Student returns 123, 456, 789. Query HasLibraryAccess returns 123, 456. Rules compose naturally.
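
For comparison, a rough SQL counterpart of the same rules, run through SQLite from Python. The snake_case table and view names are assumptions; the person IDs are the ones from the Logica example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE graduate_student (person_id INTEGER PRIMARY KEY);
CREATE TABLE undergraduate (person_id INTEGER PRIMARY KEY);
INSERT INTO graduate_student VALUES (123), (456);
INSERT INTO undergraduate VALUES (789);

-- Both 'implies Student' rules have to live in one view definition.
CREATE VIEW student AS
    SELECT person_id FROM graduate_student
    UNION
    SELECT person_id FROM undergraduate;

-- Rule: graduate students have library access.
CREATE VIEW has_library_access AS
    SELECT person_id FROM graduate_student;
""")

students = sorted(row[0] for row in conn.execute("SELECT person_id FROM student"))
library = sorted(row[0] for row in conn.execute("SELECT person_id FROM has_library_access"))

print(students)  # [123, 456, 789]
print(library)   # [123, 456]
```

The results match, but adding a third kind of student means editing the existing `student` view, whereas the Logica version only needs one more rule next to the others.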

Logica Example: Simulating Open World Assumption

@Engine("duckdb");

IsSweet("orange");
IsSweet("apple");
IsNotSweet("lemon");
IsNotSweet("lime");

IsFruit("orange");
IsFruit("kiwi");
IsFruit("lemon");
IsFruit("apple");
IsFruit("lime");

# Find fruits with unknown sweetness
UnknownSweetness(fruit:) :- 
  IsFruit(fruit), 
  ~IsSweet(fruit), 
  ~IsNotSweet(fruit);
Result: Returns "kiwi", the only fruit with neither an IsSweet nor an IsNotSweet fact. Its sweetness is reported as unknown rather than assumed false.

Logica Example: Taxonomy with Transitive Closure

@Engine("duckdb");

SubclassOf("citrus", "fruits");
SubclassOf("fruits", "foods");
SubclassOf("foods", "entity");

HasProperty("foods", "is_perishable");
HasProperty("fruits", "is_sweet");
HasProperty("citrus", "is_zesty");

# Direct subclass
TransitiveSubclass(x, y) :- SubclassOf(x, y);

# Indirect subclass (recursive)
TransitiveSubclass(x, y) :- 
  TransitiveSubclass(x, z), 
  TransitiveSubclass(z, y);

# Inherit properties from superclasses
HasAllProperties(class, property) :- 
  HasProperty(class, property);
HasAllProperties(class, property) :-
  SubclassOf(class, superclass),
  HasAllProperties(superclass, property);
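
The same taxonomy can be approximated in SQL with a recursive common table expression. This is a sketch using SQLite from Python; the table and column names are assumptions, and the facts are the ones listed in the Logica example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subclass_of (subclass TEXT, superclass TEXT);
CREATE TABLE has_property (class_name TEXT, property TEXT);
INSERT INTO subclass_of VALUES
    ('citrus', 'fruits'), ('fruits', 'foods'), ('foods', 'entity');
INSERT INTO has_property VALUES
    ('foods', 'is_perishable'), ('fruits', 'is_sweet'), ('citrus', 'is_zesty');
""")

# Transitive closure: direct edges, then repeatedly extend by one more edge.
CLOSURE = """
WITH RECURSIVE transitive_subclass(subclass, superclass) AS (
    SELECT subclass, superclass FROM subclass_of
    UNION
    SELECT t.subclass, s.superclass
    FROM transitive_subclass t
    JOIN subclass_of s ON s.subclass = t.superclass
)
"""

pairs = conn.execute(
    CLOSURE + "SELECT subclass, superclass FROM transitive_subclass ORDER BY 1, 2"
).fetchall()

# Properties of 'citrus': its own plus those of every (indirect) superclass.
citrus_properties = [row[0] for row in conn.execute(CLOSURE + """
    SELECT property FROM has_property WHERE class_name = 'citrus'
    UNION
    SELECT p.property
    FROM transitive_subclass t
    JOIN has_property p ON p.class_name = t.superclass
    WHERE t.subclass = 'citrus'
    ORDER BY property
""")]

print(len(pairs))          # 6
print(citrus_properties)   # ['is_perishable', 'is_sweet', 'is_zesty']
```

The recursion has to be restated inside every query that needs it (or wrapped in a view), while the Logica rules are named once and reused.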

KE Toolkit Evaluation Checklist (Part 1)

Criteria | Description
Declarative Expression | Express domain rules as logical statements (what is true), not procedural steps (how to compute).
Automated Inference | Native support for deriving new facts from existing knowledge and rules.
Semantic Richness | Capture complex semantics: hierarchies, part-whole relationships, N-ary associations.
Incomplete Knowledge (OWA) | Distinguish unknown from false; handle missing information robustly.

KE Toolkit Evaluation Checklist (Part 2)

Criteria | Description
Schema Flexibility | Evolve conceptual model without significant overhead or disruption.
Layer Separation | Clear separation of conceptual, logical, and physical concerns.
Explainability | Provide transparent reasoning traces for conclusions.
Composability | Combine smaller knowledge modules to build larger, complex systems.
Model Querying | Query schema, concepts, and relationships independent of data instances.
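
As a small illustration of the Model Querying criterion, the sketch below asks SQLite for its schema without touching any data rows. The `person` table and `student` view are hypothetical. Note that the catalog describes tables, views, and columns, not the conceptual model behind them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE VIEW student AS SELECT person_id FROM person;
""")

# Which objects exist? sqlite_master is SQLite's built-in schema catalog.
objects = conn.execute("SELECT type, name FROM sqlite_master ORDER BY name").fetchall()

# Which columns does 'person' have? PRAGMA table_info returns rows of
# (cid, name, type, notnull, dflt_value, pk); keep only name and type.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(person)")]

print(objects)  # [('table', 'person'), ('view', 'student')]
print(columns)  # [('person_id', 'INTEGER'), ('name', 'TEXT')]
```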

Model-First, Engine-Right Architecture

[Diagram: ORM (conceptual authority); KR Rules; Execution engines: SQL • PGQ • Views, Prolog • Datalog, LNN/LTN]

Orchestration Strategy for the ORM Toolkit

Key principle: Don't replace—orchestrate. Use each technology for its strengths.

Next Steps

More background in the article: Knowledge Engineering and the "Shortcomings" of SQL