# What is the Prediction Engine?
The prediction engine is a GPU-accelerated system for understanding relationships between tokens (words, concepts, symbols). It uses three evaluation channels, generates semantic variants, and learns from experience via a neural policy head and a VAE.
## Quick Start
```
prediction.enable(32, 0.15)  # 32-dim embeddings, 0.15 epsilon

# Record relations
prediction.record("cat", "animal", 1.0)
prediction.record("dog", "animal", 1.0)
prediction.record("cat", "pet", 0.8)

# Evaluate similarity
let result = prediction.evaluate("cat", "dog")
print(result.combined)  # > 0.0

# Find most related tokens
let related = prediction.evaluate_all("cat", 5)
for each r in related
    print(f"{r.token}: {r.combined}")
```
## 3-Layer Evaluation
Every call to `prediction.evaluate(from, to)` returns scores from three independent channels:
| Channel | What it measures | Weight |
|---------|-----------------|--------|
| Linear | Direct relation strength (normalized) | 0.4 |
| Echo | Presence-count resonance (log-scaled) | 0.2 |
| Relational | Cosine similarity in embedding space | 0.4 |
The combined score is the weighted blend of the three channels.
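The blend can be written out directly. The weights come from the table above; the function name `combined_score` is illustrative, not part of the engine's API:

```python
def combined_score(linear: float, echo: float, relational: float) -> float:
    """Blend the three channel scores using the documented weights."""
    return 0.4 * linear + 0.2 * echo + 0.4 * relational

# Example: strong direct relation, modest echo, high embedding similarity
score = combined_score(0.9, 0.3, 0.8)  # 0.4*0.9 + 0.2*0.3 + 0.4*0.8 = 0.74
```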
## Multi-Variant Generation
Generate three to four semantically distinct interpretations of a relation:
```
let variants = prediction.generate_variants("cat", "dog", "+")
# [{mode: "co-presence", score: 0.42, prob: 0.28, chosen: true},
#  {mode: "aggregation", score: 0.38, prob: 0.25, chosen: false},
#  {mode: "union",       score: 0.35, prob: 0.23, chosen: false},
#  {mode: "resonance",   score: 0.31, prob: 0.22, chosen: false}]
```
Each operator (`+`, `-`, `*`, `/`, `^`) generates different semantic modes. For example, `-` generates: withdrawal, differentiation, removal, contrast.
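One plausible way the `prob` field could be derived from the `score` field, not confirmed by these docs, is a temperature-scaled softmax over the variant scores:

```python
import math

def softmax(scores, temperature=1.0):
    """Convert raw variant scores into a probability distribution."""
    exps = [math.exp(s / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Scores from the example above; the temperature value is an assumption
probs = softmax([0.42, 0.38, 0.35, 0.31], temperature=0.5)
# probs sum to 1.0 and preserve the ordering of the scores
```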
## Neural Policy Head
A 2-layer MLP that learns which variant to select:
```
prediction.enable_neural_policy(16)  # 16-dim hidden layer
# The policy observes variant scores + token embeddings
# and learns via REINFORCE gradient updates
prediction.update_policy("context", 0, 1.0)  # reward variant 0
```
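A minimal sketch of what such a policy head might look like, assuming a tanh hidden layer and vanilla REINFORCE. The class name, layer sizes, activation, and learning rate here are illustrative; only "2-layer MLP" and "REINFORCE" come from the docs:

```python
import numpy as np

rng = np.random.default_rng(0)

class TinyPolicy:
    """Illustrative 2-layer MLP policy trained with REINFORCE."""

    def __init__(self, in_dim, hidden, n_actions, lr=0.1):
        self.W1 = rng.normal(0, 0.1, (hidden, in_dim))
        self.W2 = rng.normal(0, 0.1, (n_actions, hidden))
        self.lr = lr

    def probs(self, x):
        """Forward pass: softmax over action logits."""
        h = np.tanh(self.W1 @ x)
        logits = self.W2 @ h
        e = np.exp(logits - logits.max())
        return e / e.sum(), h

    def update(self, x, action, reward):
        """REINFORCE: ascend reward * grad log pi(action | x)."""
        p, h = self.probs(x)
        g_logits = -p
        g_logits[action] += 1.0      # grad of log-softmax: one-hot - p
        g_logits *= reward
        # Backprop through both layers before touching the weights
        dW2 = np.outer(g_logits, h)
        g_h = (self.W2.T @ g_logits) * (1.0 - h ** 2)
        dW1 = np.outer(g_h, x)
        self.W2 += self.lr * dW2
        self.W1 += self.lr * dW1

# Rewarding one variant repeatedly should raise its selection probability
policy = TinyPolicy(in_dim=4, hidden=8, n_actions=3)
```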
## Exploration Modes
```
prediction.set_exploration("epsilon")    # default: random with probability epsilon
prediction.set_exploration("gumbel")     # Gumbel-softmax noise
prediction.set_exploration("boltzmann")  # sample from softmax(scores/temp)
prediction.set_temperature(0.5)          # lower = more greedy
```
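The three modes can be sketched as below. `select_variant` is a hypothetical helper, and the Gumbel branch uses the hard Gumbel-max trick as a stand-in for Gumbel-softmax selection:

```python
import math
import random

random.seed(0)

def select_variant(scores, mode="epsilon", epsilon=0.15, temperature=0.5):
    """Pick a variant index under one of the three exploration modes (sketch)."""
    n = len(scores)
    if mode == "epsilon":
        # With probability epsilon explore uniformly, otherwise be greedy
        if random.random() < epsilon:
            return random.randrange(n)
        return max(range(n), key=lambda i: scores[i])
    if mode == "gumbel":
        # Gumbel-max trick: argmax of scaled score + Gumbel(0, 1) noise
        noisy = [s / temperature - math.log(-math.log(random.random()))
                 for s in scores]
        return max(range(n), key=lambda i: noisy[i])
    if mode == "boltzmann":
        # Sample proportionally to softmax(scores / temperature)
        weights = [math.exp(s / temperature) for s in scores]
        return random.choices(range(n), weights=weights)[0]
    raise ValueError(f"unknown mode: {mode}")
```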
## VAE Training
Train a variational autoencoder on token embeddings:
```
prediction.train_vae("cat", 0.001)
# Loss = MSE reconstruction + 0.01 * KL divergence
# Uses proper reparameterization trick
```
Iterative training improves the latent space structure over time.
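The documented loss can be sketched as below, with the decoder stubbed out as the identity to keep the example short. Only the MSE term, the 0.01 KL weight, and the reparameterization step come from the docs; everything else is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def vae_loss_terms(x, mu, log_var):
    """MSE reconstruction + 0.01 * KL(N(mu, sigma^2) || N(0, 1)).

    Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, 1),
    so gradients can flow through the sampling step.
    """
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * log_var) * eps   # reparameterized latent sample
    x_hat = z                              # decoder stub for this sketch
    recon = np.mean((x - x_hat) ** 2)
    kl = -0.5 * np.mean(1.0 + log_var - mu ** 2 - np.exp(log_var))
    return recon + 0.01 * kl, z
```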
## Prototype Memory
Store up to 64 cluster centroids in latent space:
```
prediction.store_prototype("cat")
prediction.store_prototype("dog")
print(prediction.prototype_count())  # merges if cosine > 0.8
```
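A sketch of centroid storage with cosine-based merging. The class and the running-mean merge rule are illustrative; only the 64-slot capacity and the 0.8 threshold come from the docs:

```python
import numpy as np

class PrototypeMemory:
    """Illustrative centroid store that merges near-duplicates."""

    def __init__(self, capacity=64, merge_threshold=0.8):
        self.capacity = capacity
        self.merge_threshold = merge_threshold
        self.protos = []   # list of (centroid, member_count)

    def store(self, vec):
        vec = np.asarray(vec, dtype=float)
        for i, (c, n) in enumerate(self.protos):
            cos = c @ vec / (np.linalg.norm(c) * np.linalg.norm(vec) + 1e-12)
            if cos > self.merge_threshold:
                # Running mean keeps the centroid stable as members accumulate
                self.protos[i] = ((c * n + vec) / (n + 1), n + 1)
                return
        if len(self.protos) < self.capacity:
            self.protos.append((vec, 1))

    def count(self):
        return len(self.protos)

mem = PrototypeMemory()
mem.store([1.0, 0.0])
mem.store([0.99, 0.01])   # nearly parallel, so it merges into one centroid
```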
## Bayesian Confidence
```
let conf = prediction.confidence("weather", ["sunny", "warm", "clear"])
print(conf)  # 0.0 to 1.0
```
Used by `revise` statements to blend the engine posterior with user updates.
## Auto-Feed from Cognitive Statements
When enabled, cognitive primitives automatically feed the engine:
- `belief`/`observe`/`revise` statements feed tokens
- Sequences buffer and flush at size 8 with 2-token overlap
- `revise` blends 70% user + 30% engine Bayesian posterior
- `observe` nudges related beliefs by `combined_score * 0.05`
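The two numeric rules above can be written out directly; the function names are illustrative:

```python
def blend_revise(user_value: float, engine_posterior: float) -> float:
    """revise: 70% user update + 30% engine Bayesian posterior."""
    return 0.7 * user_value + 0.3 * engine_posterior

def nudge_belief(belief: float, combined_score: float) -> float:
    """observe: shift a related belief by combined_score * 0.05."""
    return belief + combined_score * 0.05

# A user revision of 1.0 against an engine posterior of 0.0 lands at 0.7
blended = blend_revise(1.0, 0.0)
```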
## Persistence
```
prediction.save("model.npme")     # NPME v3 binary format
prediction.load("model.npme")     # loads v2 or v3
prediction.save_ncf("model.ncf")  # NCF format
prediction.load_ncf("model.ncf")
```
## GPU Acceleration
`evaluate_all` automatically uses TF32 tensor-core SGEMM for batch cosine similarity once the vocabulary reaches 256 tokens.
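Batch cosine similarity reduces to a single matrix product over row-normalized embeddings, which is exactly the shape of work a GEMM kernel accelerates. A CPU sketch with NumPy (the function name and epsilon guard are illustrative):

```python
import numpy as np

def batch_cosine(query, vocab):
    """Cosine similarity of one query against every vocabulary embedding.

    Normalizing rows first turns cosine similarity into one
    matrix-vector product, the operation tensor-core SGEMM speeds up.
    """
    q = query / (np.linalg.norm(query) + 1e-12)
    V = vocab / (np.linalg.norm(vocab, axis=1, keepdims=True) + 1e-12)
    return V @ q   # one similarity score per vocabulary token
```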