Why a Memory Module?
Most languages hide memory management. NSL exposes it as a first-class module with 74 functions across 11 layers, ranging from raw virtual memory (VirtualAlloc) through GPU device memory to semantic tagging backed by the prediction engine.
The 11 Layers
| Layer | What | Key Functions |
|-------|------|---------------|
| 1. Virtual Memory | VirtualAlloc/Free/Protect | reserve, commit, release, protect, query |
| 2. Slab Allocator | 256 MB arenas, 14 size classes | heap_create, heap_alloc, heap_free |
| 3. NUMA | Node-aware allocation | numa_info, numa_alloc, numa_free |
| 4. Advanced VM | Prefetch, offer, reclaim | prefetch, offer, discard |
| 5. File Mapping | mmap + ring buffers | map_file, ring_buffer |
| 6. GPU Bridge | CUDA device memory | gpu_alloc, gpu_upload, gpu_download |
| 7. CPU Tensor | AVX-512 aligned memory | cpu_alloc, cpu_free |
| 8. Semantic | Tags + prediction engine | tag, relate, query_related, embed |
| 9. Pool Allocator | Fixed-size O(1) pools | pool_create, pool_alloc, pool_free |
| 10. Async GPU | Non-blocking transfers | gpu_upload_async, gpu_sync |
| 11. Crypto/Mining | SHA-256, proof-of-work | sha256, mine, verify_pow |
Virtual Memory
let region = memory.reserve(1048576, "buffer") # reserve 1 MB of virtual address space
memory.commit(region["addr"], 65536) # commit the first 64 KB
memory.poke(region["addr"], [72, 101, 108]) # write three bytes ("Hel")
let bytes = memory.peek(region["addr"], 3) # read the three bytes back
memory.release(region["addr"]) # release the whole region
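For readers without an NSL runtime, the poke, peek, and release steps above can be approximated with an anonymous memory mapping in Python. This is only an analogy: Python's `mmap` module does not expose separate reserve and commit steps, and the variable names are illustrative rather than part of NSL.

```python
import mmap

# Anonymous 1 MB mapping (fileno -1 means "not backed by a file").
# Unlike memory.reserve followed by memory.commit, mmap does this in one step.
region = mmap.mmap(-1, 1048576)

# Counterpart of memory.poke: write three bytes at offset 0.
region[0:3] = bytes([72, 101, 108])

# Counterpart of memory.peek: read the same three bytes back.
data = region[0:3]
print(data)  # b'Hel'

# Counterpart of memory.release: unmap the region.
region.close()
```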
Slab Allocator
14 size classes (8 B to 64 KB). Alloc and free are both O(1) via address arithmetic:
let arena = memory.heap_create("pool")
let addrs = memory.heap_alloc_batch(arena["arenaId"], 256, 10000)
print(f"Allocated {len(addrs)} blocks")
memory.heap_free_batch(arena["arenaId"], addrs, 256)
memory.heap_defrag(arena["arenaId"]) # decommit empty slabs
memory.heap_destroy(arena["arenaId"])
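The "14 size classes" and "address arithmetic" claims can be sketched in Python. The sketch assumes power-of-two classes, which is consistent with the stated range (2^3 = 8 B up to 2^16 = 64 KB gives exactly 14 classes) but is not confirmed by the text. The function names and the slab base address are hypothetical.

```python
MIN_SHIFT = 3    # smallest class: 2**3  = 8 bytes
MAX_SHIFT = 16   # largest class:  2**16 = 65536 bytes (64 KB)

# Assumption: classes are consecutive powers of two.
SIZE_CLASSES = [1 << s for s in range(MIN_SHIFT, MAX_SHIFT + 1)]


def size_class_index(nbytes: int) -> int:
    """Index of the smallest class that fits nbytes, computed without a search."""
    if nbytes < 1 or nbytes > SIZE_CLASSES[-1]:
        raise ValueError("size outside the slab range")
    # (nbytes - 1).bit_length() equals ceil(log2(nbytes)) for nbytes >= 1.
    return max((nbytes - 1).bit_length(), MIN_SHIFT) - MIN_SHIFT


def block_index(addr: int, slab_base: int, block_size: int) -> int:
    """Recover a block's slot within its slab from the address alone."""
    return (addr - slab_base) // block_size


print(len(SIZE_CLASSES))                              # 14
print(SIZE_CLASSES[size_class_index(256)])            # 256
print(SIZE_CLASSES[size_class_index(257)])            # 512
print(block_index(0x10000 + 3 * 256, 0x10000, 256))   # 3
```

Because both lookups are plain arithmetic, neither alloc nor free needs to walk a list or tree to find where a block belongs.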
Data Integrity
# addr, addr1, addr2 and size refer to previously committed regions
memory.zero(addr, 4096) # SecureZeroMemory (NIST SP 800-88)
memory.fill(addr, 4096, 0xAA) # pattern fill
let hash = memory.checksum(addr, size) # xxHash64 (~28 GB/s)
let ok = memory.verify(addr, size, hash)
let h = memory.entropy(addr, 4096) # Shannon entropy [0.0, 8.0]
let d = memory.diff(addr1, addr2, size) # byte-by-byte comparison
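The [0.0, 8.0] range of `memory.entropy` follows from the definition of Shannon entropy over byte values, H = sum of p * log2(1/p). The Python sketch below shows that formula; it is not NSL's implementation, and the function name is illustrative.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: sum of p * log2(1/p) over byte values."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value present in data
    return sum((c / total) * math.log2(total / c) for c in counts.values())


print(shannon_entropy(bytes(4096)))             # 0.0 (a single repeated value)
print(shannon_entropy(bytes(range(256)) * 16))  # 8.0 (all 256 values equally likely)
```

A zeroed page scores 0.0, while compressed or encrypted data typically lands close to 8.0, which is what makes the measure useful as a quick sanity check on a region's contents.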
Typed Access
memory.typed_poke(addr, "float32", [3.14, 2.71, 1.41])
let vals = memory.typed_peek(addr, "float32", 3)
# Supports: float32, float64, int32, int64, uint32
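What typed access does at the byte level can be modeled with Python's `struct` module. The sketch assumes little-endian layout, which the text does not state, and the helper names mirror the NSL calls only for readability.

```python
import struct

# struct format codes for the element types listed above.
TYPE_CODES = {"float32": "f", "float64": "d", "int32": "i", "int64": "q", "uint32": "I"}


def typed_poke(buf: bytearray, offset: int, type_name: str, values) -> None:
    """Pack values into buf at offset (little-endian is an assumption here)."""
    fmt = f"<{len(values)}{TYPE_CODES[type_name]}"
    struct.pack_into(fmt, buf, offset, *values)


def typed_peek(buf: bytearray, offset: int, type_name: str, count: int) -> list:
    """Unpack count values of the given type from buf at offset."""
    fmt = f"<{count}{TYPE_CODES[type_name]}"
    return list(struct.unpack_from(fmt, buf, offset))


buf = bytearray(12)  # room for three float32 values
typed_poke(buf, 0, "float32", [3.14, 2.71, 1.41])
vals = typed_peek(buf, 0, "float32", 3)
print(vals)  # close to [3.14, 2.71, 1.41], but rounded to float32 precision
```

Values read back as float32 are close to, but not exactly equal to, the literals that were written, because each one is rounded to 32 bits on the way in.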
GPU Bridge
let gpu = memory.gpu_alloc(4096, "weights")
let cpu = memory.cpu_alloc(4096, 64) # 64-byte aligned
memory.poke(cpu["addr"], [1, 2, 3, 4])
memory.gpu_upload(cpu["addr"], gpu["addr"], 4) # host -> device, 4 bytes
memory.gpu_download(gpu["addr"], cpu["addr"], 4) # device -> host, 4 bytes
memory.gpu_free(gpu["addr"])
memory.cpu_free(cpu["addr"])
Ring Buffers
Zero-copy circular buffers via VirtualAlloc2 double-mapping:
let ring = memory.ring_buffer(65536)
# Two adjacent virtual mappings of the same physical pages
# Writes past the end appear at the start
memory.ring_buffer_free(ring["ringId"])
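The observable effect of double-mapping can be simulated in Python. A real double mapping gets the aliasing from the MMU at no copy cost; this sketch imitates it by writing every byte into both halves of a buffer twice the ring's size. The class and method names are hypothetical.

```python
class MirroredRing:
    """Simulates a double-mapped ring: offsets i and i + size alias each other."""

    def __init__(self, size: int):
        self.size = size
        self.view = bytearray(2 * size)  # stands in for the two adjacent mappings

    def write(self, offset: int, data: bytes) -> None:
        if len(data) > self.size:
            raise ValueError("write longer than the ring")
        for i, b in enumerate(data):
            pos = (offset + i) % self.size
            self.view[pos] = b               # first mapping
            self.view[pos + self.size] = b   # mirrored second mapping

    def read(self, offset: int, length: int) -> bytes:
        if length > self.size:
            raise ValueError("read longer than the ring")
        start = offset % self.size
        # One contiguous slice: no wrap-around branch is needed.
        return bytes(self.view[start:start + length])


ring = MirroredRing(8)
ring.write(6, b"ABCD")   # crosses the end of the 8-byte ring
print(ring.read(6, 4))   # b'ABCD'
print(ring.read(0, 2))   # b'CD'
```

The point of the technique is the `read` path: a consumer can treat any window of up to `size` bytes as contiguous memory, even when it straddles the end of the ring.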
Semantic Memory
Tag regions and query by semantic similarity:
prediction.enable()
let a = memory.cpu_alloc(4096)
memory.tag(a["addr"], "type", "weights")
memory.tag(a["addr"], "layer", "encoder")
let b = memory.cpu_alloc(4096)
memory.tag(b["addr"], "type", "gradients")
memory.relate(a["addr"], b["addr"], 0.8, "+")
let related = memory.query_related(a["addr"]) # finds b
let emb = memory.embed(a["addr"]) # 32-float vector
let conf = memory.confidence(a["addr"], ["gradients", "loss"])
SHA-256 and Mining
let hash = memory.sha256(addr, size) # FIPS 180-4 compliant
# Proof-of-work (Bitcoin-style double SHA-256)
let result = memory.mine(addr, size, 20) # difficulty = 20 leading zero bits
if result["found"]
    print(f"Nonce: {result.nonce}, Hash: {result.hash}")
let ok = memory.verify_pow(addr, size, result["nonce"], 20)
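The mining loop can be sketched in Python with `hashlib`. How `memory.mine` combines the data with the nonce is not stated, so appending the nonce as 8 little-endian bytes is an assumption, as is counting leading zero bits on the big-endian digest. The function names mirror the NSL calls for readability only.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, as in Bitcoin-style proof-of-work."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def leading_zero_bits(digest: bytes) -> int:
    """Number of zero bits before the first one bit in the digest."""
    return len(digest) * 8 - int.from_bytes(digest, "big").bit_length()


def mine(payload: bytes, difficulty: int, max_nonce: int = 1 << 32) -> dict:
    """Try nonces in order until the hash has enough leading zero bits."""
    for nonce in range(max_nonce):
        # Assumption: the nonce is appended as 8 little-endian bytes.
        digest = double_sha256(payload + nonce.to_bytes(8, "little"))
        if leading_zero_bits(digest) >= difficulty:
            return {"found": True, "nonce": nonce, "hash": digest.hex()}
    return {"found": False, "nonce": None, "hash": None}


def verify_pow(payload: bytes, nonce: int, difficulty: int) -> bool:
    """Recompute a single hash and check it against the difficulty."""
    digest = double_sha256(payload + nonce.to_bytes(8, "little"))
    return leading_zero_bits(digest) >= difficulty


result = mine(b"hello", 12)  # 12 bits keeps the demo fast in pure Python
if result["found"]:
    print(result["nonce"], result["hash"])
    print(verify_pow(b"hello", result["nonce"], 12))  # True
```

Each extra bit of difficulty doubles the expected number of hashes, so difficulty 20 needs about a million attempts on average, while verification always costs one double hash.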
Pool Allocator
Faster than the slab allocator when every allocation has the same size:
let pool = memory.pool_create(64, 1000, "particles")
let elem = memory.pool_alloc(pool["poolId"]) # O(1)
memory.pool_free(pool["poolId"], elem["addr"]) # O(1)
memory.pool_destroy(pool["poolId"])
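A fixed-size pool can reach O(1) alloc and free by keeping a free list of slots. The Python sketch below models that idea with slot indices and a made-up base address. It is not NSL's implementation, and the class name, method names, and LIFO reuse order are assumptions.

```python
class FixedPool:
    """Fixed-size element pool backed by a LIFO free list of slot indices."""

    def __init__(self, elem_size: int, count: int, base_addr: int = 0x10000):
        self.elem_size = elem_size
        self.base = base_addr  # hypothetical base address of the pool
        # Reversed so that pop() hands out slot 0 first.
        self._free_slots = list(range(count - 1, -1, -1))
        self._in_use = set()

    def alloc(self):
        """Pop a free slot and return its address, or None when exhausted."""
        if not self._free_slots:
            return None
        slot = self._free_slots.pop()      # O(1)
        self._in_use.add(slot)
        return self.base + slot * self.elem_size

    def free(self, addr: int) -> None:
        """Return an element to the pool; rejects stray or double frees."""
        slot, rem = divmod(addr - self.base, self.elem_size)
        if rem or slot not in self._in_use:
            raise ValueError("address is not a live pool element")
        self._in_use.remove(slot)
        self._free_slots.append(slot)      # O(1)


pool = FixedPool(64, 1000)
a = pool.alloc()
b = pool.alloc()
print(hex(a), hex(b))   # 0x10000 0x10040
pool.free(a)
print(pool.alloc() == a)  # True: the most recently freed slot is reused first
```

With a single element size there is no size-class lookup at all, which is one plausible reason a pool can outperform a general slab allocator for uniform workloads.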