Architecture & Design

This page provides an in-depth look at the technical architecture and design decisions that make OctoFHIR FHIRPath a high-performance, reliable FHIRPath implementation.

System Architecture

OctoFHIR FHIRPath follows a modular architecture designed for performance, maintainability, and extensibility.

Core Components

1. Lexical Analyzer (Lexer)

Purpose: Tokenizes FHIRPath expressions into a stream of tokens
Implementation: Hand-written lexer optimized for FHIRPath syntax
Key Features:
- Zero-copy tokenization where possible
- Comprehensive error reporting with position information
- Support for all FHIRPath operators and literals
- Unicode-aware string processing

2. Parser

Purpose: Converts token stream into an Abstract Syntax Tree (AST)
Implementation: Recursive descent parser using the nom parser combinator library
Key Features:
- Operator precedence handling
- Left-associativity for operators
- Comprehensive syntax error reporting
- Memory-efficient AST representation

3. Evaluator

Purpose: Executes the parsed AST against FHIR resources
Implementation: Tree-walking interpreter with optimization passes
Key Features:
- Lazy evaluation for performance
- Type-safe value representation
- Comprehensive function library
- Context-aware evaluation

4. Value System

Purpose: Represents FHIRPath values and types
Implementation: Rust enum-based type system
Key Features:
- Zero-cost type conversions where possible
- Comprehensive type checking
- Support for all FHIRPath data types
- Memory-efficient representation

Design Principles

1. Performance First

Every design decision prioritizes performance without sacrificing correctness:

Zero-Copy Operations: Minimize memory allocations and copying
Lazy Evaluation: Only compute values when needed
Efficient Data Structures: Use appropriate data structures for each use case
Compile-Time Optimizations: Leverage Rust’s zero-cost abstractions

2. Memory Safety

Rust’s ownership system ensures memory safety throughout the codebase:

No Null Pointer Dereferences: Option types for nullable values
No Buffer Overflows: Bounds checking on all array accesses
No Use-After-Free: Ownership system prevents dangling pointers
No Memory Leaks: Automatic memory management without garbage collection

3. Error Handling

Comprehensive error handling provides clear feedback to users:

Structured Error Types: Different error types for different failure modes
Position Information: Precise location of syntax and evaluation errors
Error Recovery: Continue processing when possible after errors
User-Friendly Messages: Clear, actionable error descriptions

4. Extensibility

The architecture supports future extensions and modifications:

Modular Design: Clear separation of concerns between components
Plugin Architecture: Support for custom functions and operators
Version Compatibility: Forward and backward compatibility considerations
API Stability: Stable public APIs with semantic versioning

Data Flow

Expression Evaluation Pipeline

Input: FHIRPath expression string and FHIR resource JSON
Lexical Analysis: Convert expression to token stream
Parsing: Build AST from tokens
Optimization: Apply optimization passes to AST (future)
Evaluation: Execute AST against FHIR resource
Output: Return results as FHIRPath values

Memory Management

Stack Allocation: Most temporary values allocated on stack
Reference Counting: Shared data uses Rc<T> for reference counting
Copy-on-Write: Efficient string handling with Cow<str>
Arena Allocation: Large ASTs use arena allocation for efficiency

Performance Optimizations

1. Parsing Optimizations

Interned Strings: Common identifiers are interned to reduce memory usage
Compact AST: Minimal memory footprint for AST nodes
Fast Path Parsing: Optimized parsing for common expression patterns

2. Evaluation Optimizations

Short-Circuit Evaluation: Boolean operations short-circuit when possible
Memoization: Cache results of expensive operations
Vectorization: Process collections efficiently using iterators
Constant Folding: Evaluate constant expressions at parse time

3. Memory Optimizations

Object Pooling: Reuse objects for repeated evaluations
Lazy Loading: Load FHIR resource data on demand
Streaming: Process large resources without loading entirely into memory
Compression: Compress cached data to reduce memory usage

Concurrency Model

Thread Safety

Immutable Data: Most data structures are immutable after creation
Send + Sync: Core types implement Send and Sync for thread safety
Lock-Free Operations: Avoid locks where possible using atomic operations
Parallel Processing: Support for parallel evaluation of multiple expressions

Async Support

Future Integration: Compatible with async/await patterns
Non-Blocking I/O: Support for async resource loading
Backpressure: Handle slow consumers gracefully
Cancellation: Support for cancelling long-running operations

Language Bindings Architecture

WebAssembly (WASM)

Minimal Runtime: Lightweight WASM module for browsers
JavaScript Interop: Seamless integration with JavaScript
Memory Management: Efficient memory sharing between WASM and JS
Error Handling: Proper error propagation across language boundaries

Node.js (NAPI)

Native Performance: Direct access to Rust code from Node.js
Type Safety: TypeScript definitions for all APIs
Async Support: Non-blocking operations using Node.js event loop
Memory Efficiency: Minimal copying between Rust and JavaScript

Command Line Interface

Streaming I/O: Process large files without loading into memory
Shell Integration: Proper exit codes and signal handling
Configuration: Support for configuration files and environment variables
Logging: Structured logging with configurable levels

Testing Architecture

Unit Testing

Property-Based Testing: Use QuickCheck-style testing for edge cases
Fuzzing: Automated fuzzing to find parsing and evaluation bugs
Benchmark Testing: Performance regression testing
Memory Testing: Valgrind and AddressSanitizer integration

Integration Testing

Official Test Suite: Compliance testing against FHIRPath specification
Cross-Platform Testing: Automated testing on multiple platforms
Language Binding Testing: Test all language bindings
Performance Testing: Automated performance benchmarking

Security Considerations

Input Validation

Expression Validation: Validate FHIRPath expressions before evaluation
Resource Validation: Validate FHIR resources against schema
Size Limits: Prevent denial-of-service through large inputs
Timeout Protection: Limit evaluation time for complex expressions

Memory Safety

Buffer Overflow Protection: Rust’s memory safety prevents buffer overflows
Integer Overflow Protection: Checked arithmetic operations
Stack Overflow Protection: Recursion depth limits
Heap Exhaustion Protection: Memory usage limits and monitoring

Future Architecture Considerations

Planned Enhancements

JIT Compilation: Just-in-time compilation for frequently used expressions
Query Optimization: Advanced query optimization techniques
Distributed Evaluation: Support for distributed FHIRPath evaluation
GPU Acceleration: Leverage GPU for parallel processing of large datasets

Scalability Improvements

Horizontal Scaling: Support for distributed processing
Caching Layer: Intelligent caching of parsed expressions and results
Load Balancing: Built-in load balancing for high-throughput scenarios
Resource Pooling: Efficient resource pooling for multi-tenant scenarios

This architecture provides a solid foundation for high-performance FHIRPath evaluation while maintaining the flexibility to evolve with changing requirements and new optimization opportunities.