Architecture & Design
This page provides an in-depth look at the technical architecture and design decisions behind OctoFHIR FHIRPath, a high-performance, reliable FHIRPath implementation.
System Architecture
OctoFHIR FHIRPath follows a modular architecture designed for performance, maintainability, and extensibility.
Core Components
1. Lexical Analyzer (Lexer)
- Purpose: Tokenizes FHIRPath expressions into a stream of tokens
- Implementation: Hand-written lexer optimized for FHIRPath syntax
- Key Features:
  - Zero-copy tokenization where possible
  - Comprehensive error reporting with position information
  - Support for all FHIRPath operators and literals
  - Unicode-aware string processing
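As a rough illustration of zero-copy tokenization with position information, the sketch below borrows each token's text directly from the input string and records the byte offset where it starts. The names `Token`, `Spanned`, `LexError`, and `tokenize` are hypothetical and are not taken from the OctoFHIR codebase, and only a handful of token kinds are covered.

```rust
/// Hypothetical token kinds; a real FHIRPath lexer has many more.
#[derive(Debug, PartialEq)]
pub enum Token<'a> {
    Ident(&'a str),  // borrowed slice of the input, no copy
    Number(&'a str), // kept as text; conversion happens later
    Dot,
    Equals,
}

/// A token together with the byte offset where it starts.
#[derive(Debug, PartialEq)]
pub struct Spanned<'a> {
    pub token: Token<'a>,
    pub offset: usize,
}

/// Error carrying the position of the offending character.
#[derive(Debug, PartialEq)]
pub struct LexError {
    pub offset: usize,
    pub found: char,
}

pub fn tokenize(input: &str) -> Result<Vec<Spanned<'_>>, LexError> {
    let mut tokens = Vec::new();
    let mut chars = input.char_indices().peekable();
    while let Some(&(start, c)) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
            continue;
        }
        let token = if c.is_alphabetic() || c == '_' {
            // `is_alphabetic` is Unicode-aware, so non-ASCII identifiers work.
            let end = scan_while(&mut chars, input, |ch| ch.is_alphanumeric() || ch == '_');
            Token::Ident(&input[start..end])
        } else if c.is_ascii_digit() {
            let end = scan_while(&mut chars, input, |ch| ch.is_ascii_digit());
            Token::Number(&input[start..end])
        } else {
            chars.next();
            match c {
                '.' => Token::Dot,
                '=' => Token::Equals,
                other => return Err(LexError { offset: start, found: other }),
            }
        };
        tokens.push(Spanned { token, offset: start });
    }
    Ok(tokens)
}

/// Advances while `pred` holds and returns the byte offset just past the run.
fn scan_while(
    chars: &mut std::iter::Peekable<std::str::CharIndices<'_>>,
    input: &str,
    pred: impl Fn(char) -> bool,
) -> usize {
    while let Some(&(_, ch)) = chars.peek() {
        if !pred(ch) {
            break;
        }
        chars.next();
    }
    chars.peek().map_or(input.len(), |&(i, _)| i)
}
```

Because every `Ident` and `Number` is a slice of `input`, the token vector cannot outlive the expression string; the borrow checker enforces this at compile time.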
2. Parser
- Purpose: Converts token stream into an Abstract Syntax Tree (AST)
- Implementation: Recursive descent parser using the `nom` parser combinator library
- Key Features:
  - Operator precedence handling
  - Left-associativity for operators
  - Comprehensive syntax error reporting
  - Memory-efficient AST representation
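To make operator precedence and left-associativity concrete, here is a minimal precedence-climbing sketch. It deliberately uses only the standard library rather than `nom`, so it is not the project's parser; `Tok`, `Expr`, `parse`, and the two precedence levels are assumptions chosen for illustration.

```rust
/// Simplified tokens: a number or a single-character operator.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Tok {
    Num(i64),
    Op(char),
}

#[derive(Debug, PartialEq)]
pub enum Expr {
    Num(i64),
    Binary(Box<Expr>, char, Box<Expr>),
}

/// Binding power for the few operators this sketch knows.
fn precedence(op: char) -> Option<u8> {
    match op {
        '+' | '-' => Some(1),
        '*' | '/' => Some(2),
        _ => None,
    }
}

/// Parses the whole token slice; returns `None` on any syntax error.
pub fn parse(tokens: &[Tok]) -> Option<Expr> {
    let mut pos = 0;
    let expr = parse_expr(tokens, &mut pos, 1)?;
    // Reject trailing tokens that were not consumed.
    if pos == tokens.len() {
        Some(expr)
    } else {
        None
    }
}

fn parse_expr(tokens: &[Tok], pos: &mut usize, min_prec: u8) -> Option<Expr> {
    let mut lhs = match tokens.get(*pos)? {
        Tok::Num(n) => Expr::Num(*n),
        Tok::Op(_) => return None,
    };
    *pos += 1;
    while let Some(Tok::Op(op)) = tokens.get(*pos) {
        let prec = match precedence(*op) {
            Some(p) if p >= min_prec => p,
            _ => break,
        };
        *pos += 1;
        // Parsing the right side at `prec + 1` makes operators of equal
        // precedence group to the left: 1 - 2 - 3 becomes (1 - 2) - 3.
        let rhs = parse_expr(tokens, pos, prec + 1)?;
        lhs = Expr::Binary(Box::new(lhs), *op, Box::new(rhs));
    }
    Some(lhs)
}
```

With these rules, `1 + 2 * 3` groups the multiplication first, while `1 - 2 - 3` nests the first subtraction inside the second.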
3. Evaluator
- Purpose: Executes the parsed AST against FHIR resources
- Implementation: Tree-walking interpreter with optimization passes
- Key Features:
  - Lazy evaluation for performance
  - Type-safe value representation
  - Comprehensive function library
  - Context-aware evaluation
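One way to picture a lazy tree-walking evaluator is to have each AST node return an iterator instead of a fully built collection. The sketch below assumes a toy resource model (`Node`) and a three-variant `Expr`; both are hypothetical stand-ins, not the crate's types.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a parsed FHIR resource.
#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    Str(String),
    List(Vec<Node>),
    Object(BTreeMap<String, Node>),
}

/// Toy AST covering path navigation only.
pub enum Expr {
    This,                      // the context item
    Member(Box<Expr>, String), // `base.name`
    First(Box<Expr>),          // `base.first()`
}

/// Walks the AST and yields matching nodes on demand.
pub fn eval<'a>(expr: &'a Expr, context: &'a Node) -> Box<dyn Iterator<Item = &'a Node> + 'a> {
    match expr {
        Expr::This => Box::new(std::iter::once(context)),
        Expr::Member(base, name) => {
            Box::new(eval(base, context).flat_map(move |node| children(node, name)))
        }
        // `take(1)` stops pulling from the inner iterator after one item,
        // so the rest of the collection is never visited.
        Expr::First(base) => Box::new(eval(base, context).take(1)),
    }
}

/// Returns the values of field `name`, flattening lists into items.
fn children<'a>(node: &'a Node, name: &str) -> Box<dyn Iterator<Item = &'a Node> + 'a> {
    match node {
        Node::Object(fields) => match fields.get(name) {
            Some(Node::List(items)) => Box::new(items.iter()),
            Some(other) => Box::new(std::iter::once(other)),
            None => Box::new(std::iter::empty()),
        },
        _ => Box::new(std::iter::empty()),
    }
}
```

Evaluating `name.given.first()` with this design touches only as much of the resource as is needed to produce the first match.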
4. Value System
- Purpose: Represents FHIRPath values and types
- Implementation: Rust enum-based type system
- Key Features:
  - Zero-cost type conversions where possible
  - Comprehensive type checking
  - Support for all FHIRPath data types
  - Memory-efficient representation
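An enum-based value type might look roughly like the sketch below. The variant set is a small, assumed subset of the FHIRPath types, `Decimal` is simplified to `f64` (a real implementation would need exact decimal arithmetic), and the `add` helper is hypothetical. It shows type checking plus the implicit Integer-to-Decimal conversion that FHIRPath defines.

```rust
/// Assumed subset of FHIRPath value types.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Boolean(bool),
    Integer(i64),
    Decimal(f64), // simplification: not an exact decimal
    String(String),
}

impl Value {
    pub fn type_name(&self) -> &'static str {
        match self {
            Value::Boolean(_) => "Boolean",
            Value::Integer(_) => "Integer",
            Value::Decimal(_) => "Decimal",
            Value::String(_) => "String",
        }
    }

    /// Implicit Integer -> Decimal promotion; `None` for non-numeric values.
    pub fn as_decimal(&self) -> Option<f64> {
        match self {
            Value::Integer(i) => Some(*i as f64),
            Value::Decimal(d) => Some(*d),
            _ => None,
        }
    }
}

/// Type-checked `+`: integers add, strings concatenate, mixed numerics are
/// promoted to Decimal, and anything else is reported as a type error.
pub fn add(lhs: &Value, rhs: &Value) -> Result<Value, String> {
    match (lhs, rhs) {
        (Value::Integer(a), Value::Integer(b)) => a
            .checked_add(*b)
            .map(Value::Integer)
            .ok_or_else(|| "integer overflow".to_string()),
        (Value::String(a), Value::String(b)) => Ok(Value::String(format!("{a}{b}"))),
        _ => match (lhs.as_decimal(), rhs.as_decimal()) {
            (Some(a), Some(b)) => Ok(Value::Decimal(a + b)),
            _ => Err(format!(
                "cannot add {} and {}",
                lhs.type_name(),
                rhs.type_name()
            )),
        },
    }
}
```

Matching on the enum lets the compiler check that every type combination is handled, which is the main appeal of this representation.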
Design Principles
1. Performance First
Every design decision prioritizes performance without sacrificing correctness:
- Zero-Copy Operations: Minimize memory allocations and copying
- Lazy Evaluation: Only compute values when needed
- Efficient Data Structures: Use appropriate data structures for each use case
- Compile-Time Optimizations: Leverage Rust’s zero-cost abstractions
2. Memory Safety
Rust’s ownership system ensures memory safety throughout the codebase:
- No Null Pointer Dereferences: Option types for nullable values
- No Buffer Overflows: Bounds checking on all array accesses
- No Use-After-Free: Ownership system prevents dangling pointers
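- Leak Resistance: Automatic, ownership-based memory management without garbage collection (safe Rust can still leak through `Rc` reference cycles, so shared structures need care)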
3. Error Handling
Comprehensive error handling provides clear feedback to users:
- Structured Error Types: Different error types for different failure modes
- Position Information: Precise location of syntax and evaluation errors
- Error Recovery: Continue processing when possible after errors
- User-Friendly Messages: Clear, actionable error descriptions
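Structured errors with position information could be modelled along these lines. `FhirPathError`, its variants, and `position_of` are hypothetical names used only for this sketch; the crate's actual error types may differ.

```rust
use std::fmt;

/// 1-based line and column of a location in the expression source.
#[derive(Debug, Clone, PartialEq)]
pub struct Position {
    pub line: usize,
    pub column: usize,
}

/// One variant per failure mode, so callers can match on the cause.
#[derive(Debug, Clone, PartialEq)]
pub enum FhirPathError {
    Syntax { message: String, position: Position },
    Type { expected: &'static str, found: &'static str },
    UnknownFunction { name: String },
}

impl fmt::Display for FhirPathError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::Syntax { message, position } => write!(
                f,
                "syntax error at {}:{}: {}",
                position.line, position.column, message
            ),
            Self::Type { expected, found } => {
                write!(f, "type error: expected {expected}, found {found}")
            }
            Self::UnknownFunction { name } => write!(f, "unknown function `{name}`"),
        }
    }
}

impl std::error::Error for FhirPathError {}

/// Converts a byte offset into a line/column pair by scanning the source.
pub fn position_of(source: &str, offset: usize) -> Position {
    let mut line = 1;
    let mut column = 1;
    for (i, ch) in source.char_indices() {
        if i >= offset {
            break;
        }
        if ch == '\n' {
            line += 1;
            column = 1;
        } else {
            column += 1;
        }
    }
    Position { line, column }
}
```

Keeping the byte offset in the lexer and converting it to line and column only when an error is reported avoids tracking line numbers on the hot path.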
4. Extensibility
The architecture supports future extensions and modifications:
- Modular Design: Clear separation of concerns between components
- Plugin Architecture: Support for custom functions and operators
- Version Compatibility: Forward and backward compatibility considerations
- API Stability: Stable public APIs with semantic versioning
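A plugin mechanism for custom functions can be as simple as a name-to-closure registry. The sketch below is an assumption about how such a registry might look, not the crate's API: `FunctionRegistry`, `register`, and `call` are hypothetical, and collections are simplified to `Vec<String>`.

```rust
use std::collections::HashMap;

/// A custom function maps an input collection to an output collection.
/// Collections are simplified to `Vec<String>` for this sketch.
pub type CustomFn = Box<dyn Fn(&[String]) -> Result<Vec<String>, String> + Send + Sync>;

/// Hypothetical registry of user-supplied functions, keyed by name.
#[derive(Default)]
pub struct FunctionRegistry {
    functions: HashMap<String, CustomFn>,
}

impl FunctionRegistry {
    /// Registers `f` under `name`, replacing any previous entry.
    pub fn register<F>(&mut self, name: &str, f: F)
    where
        F: Fn(&[String]) -> Result<Vec<String>, String> + Send + Sync + 'static,
    {
        self.functions.insert(name.to_string(), Box::new(f));
    }

    /// Looks up and invokes a function; unknown names are an error.
    pub fn call(&self, name: &str, input: &[String]) -> Result<Vec<String>, String> {
        match self.functions.get(name) {
            Some(f) => f(input),
            None => Err(format!("unknown function `{name}`")),
        }
    }
}
```

A caller would register a closure once, for example `registry.register("shout", |input| Ok(input.iter().map(|s| s.to_uppercase()).collect()))`, and the evaluator would dispatch to it by name. The `Send + Sync` bounds keep the registry usable from multiple threads.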
Data Flow
Expression Evaluation Pipeline
- Input: FHIRPath expression string and FHIR resource JSON
- Lexical Analysis: Convert expression to token stream
- Parsing: Build AST from tokens
- Optimization: Apply optimization passes to AST (future)
- Evaluation: Execute AST against FHIR resource
- Output: Return results as FHIRPath values
Memory Management
- Stack Allocation: Most temporary values allocated on stack
- Reference Counting: Shared data uses `Rc<T>` for reference counting
- Copy-on-Write: Efficient string handling with `Cow<str>`
- Arena Allocation: Large ASTs use arena allocation for efficiency
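The two standard-library types named above behave as follows. The `unescape` and `share` helpers and the `Resource` struct are hypothetical examples written for this page; only `Cow` and `Rc` themselves come from the source.

```rust
use std::borrow::Cow;
use std::rc::Rc;

/// Returns the input unchanged (borrowed) unless it contains an escaped
/// quote, in which case a new string is allocated.
pub fn unescape(raw: &str) -> Cow<'_, str> {
    if raw.contains("\\'") {
        Cow::Owned(raw.replace("\\'", "'"))
    } else {
        Cow::Borrowed(raw)
    }
}

/// Hypothetical stand-in for a parsed resource that is expensive to copy.
pub struct Resource {
    pub json: String,
}

/// Hands out two handles to the same resource without cloning its data.
/// Note: `Rc` is single-threaded; it implements neither `Send` nor `Sync`.
pub fn share(resource: Resource) -> (Rc<Resource>, Rc<Resource>) {
    let first = Rc::new(resource);
    let second = Rc::clone(&first); // bumps a counter, copies nothing else
    (first, second)
}
```

String literals without escapes, which are the common case, take the `Cow::Borrowed` path and never allocate.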
Performance Optimizations
1. Parsing Optimizations
- Interned Strings: Common identifiers are interned to reduce memory usage
- Compact AST: Minimal memory footprint for AST nodes
- Fast Path Parsing: Optimized parsing for common expression patterns
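String interning replaces repeated identifiers with small integer handles. The following is a minimal sketch of the idea; `Interner` and `Symbol` are hypothetical names, and this simple version stores each string twice, which a production interner would avoid.

```rust
use std::collections::HashMap;

/// Cheap, copyable handle standing in for an interned string.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Symbol(u32);

/// Maps each distinct identifier to one `Symbol`.
#[derive(Default)]
pub struct Interner {
    ids: HashMap<String, Symbol>,
    names: Vec<String>,
}

impl Interner {
    /// Returns the existing symbol for `name`, or allocates a new one.
    pub fn intern(&mut self, name: &str) -> Symbol {
        if let Some(&sym) = self.ids.get(name) {
            return sym;
        }
        let sym = Symbol(self.names.len() as u32);
        self.names.push(name.to_string());
        self.ids.insert(name.to_string(), sym);
        sym
    }

    /// Looks up the text behind a symbol produced by this interner.
    pub fn resolve(&self, sym: Symbol) -> &str {
        &self.names[sym.0 as usize]
    }
}
```

AST nodes can then hold a 4-byte `Symbol` instead of an owned `String`, and comparing two identifiers becomes an integer comparison.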
2. Evaluation Optimizations
- Short-Circuit Evaluation: Boolean operations short-circuit when possible
- Memoization: Cache results of expensive operations
- Vectorization: Process collections efficiently using iterators
- Constant Folding: Evaluate constant expressions at parse time
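Constant folding and short-circuit evaluation can both be shown on a tiny boolean AST. This is a simplified sketch: `Expr`, `fold`, and `eval` are hypothetical, only `and` is modelled, and FHIRPath's empty-collection (three-valued) logic is ignored by `eval`.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Bool(bool),
    Var(String), // value known only at evaluation time
    And(Box<Expr>, Box<Expr>),
}

/// Rewrites the tree once, before evaluation, simplifying constant operands.
pub fn fold(expr: Expr) -> Expr {
    match expr {
        Expr::And(lhs, rhs) => match (fold(*lhs), fold(*rhs)) {
            // `false and x` is false regardless of x.
            (Expr::Bool(false), _) | (_, Expr::Bool(false)) => Expr::Bool(false),
            // `true and x` reduces to x.
            (Expr::Bool(true), other) | (other, Expr::Bool(true)) => other,
            (l, r) => Expr::And(Box::new(l), Box::new(r)),
        },
        other => other,
    }
}

/// Evaluates the tree; `lookup` supplies variable values at run time.
pub fn eval(expr: &Expr, lookup: &dyn Fn(&str) -> bool) -> bool {
    match expr {
        Expr::Bool(b) => *b,
        Expr::Var(name) => lookup(name),
        // `&&` skips the right operand entirely when the left is false.
        Expr::And(l, r) => eval(l, lookup) && eval(r, lookup),
    }
}
```

Folding pays off when the same parsed expression is evaluated against many resources, since the simplification happens once rather than per evaluation.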
3. Memory Optimizations
- Object Pooling: Reuse objects for repeated evaluations
- Lazy Loading: Load FHIR resource data on demand
- Streaming: Process large resources without loading entirely into memory
- Compression: Compress cached data to reduce memory usage
Concurrency Model
Thread Safety
- Immutable Data: Most data structures are immutable after creation
- Send + Sync: Core types implement Send and Sync for thread safety
- Lock-Free Operations: Avoid locks where possible using atomic operations
- Parallel Processing: Support for parallel evaluation of multiple expressions
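The sketch below shows how immutable, `Send + Sync` data allows several expressions to be evaluated in parallel with scoped threads from the standard library. `CompiledExpression` and its `evaluate` method are hypothetical placeholders (the "evaluation" just counts substring matches). Note that any type containing `Rc<T>` would fail the compile-time check at the end, so data shared across threads would need `Arc<T>` instead.

```rust
use std::thread;

/// Hypothetical immutable compiled expression.
#[derive(Debug)]
pub struct CompiledExpression {
    pub source: String,
}

impl CompiledExpression {
    /// Placeholder for real evaluation: counts occurrences of the
    /// expression text in the resource.
    pub fn evaluate(&self, resource: &str) -> usize {
        resource.matches(self.source.as_str()).count()
    }
}

/// Evaluates every expression on its own thread against one shared,
/// read-only resource and returns the results in input order.
pub fn evaluate_all(expressions: &[CompiledExpression], resource: &str) -> Vec<usize> {
    thread::scope(|scope| {
        // Spawn all workers first so they actually run concurrently.
        let handles: Vec<_> = expressions
            .iter()
            .map(|expr| scope.spawn(move || expr.evaluate(resource)))
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}

// Compile-time check: this line fails to build if the type is ever
// changed so that it is no longer `Send + Sync`.
fn assert_send_sync<T: Send + Sync>() {}
const _: fn() = assert_send_sync::<CompiledExpression>;
```

Scoped threads can borrow `expressions` and `resource` directly, so no cloning or locking is needed as long as the data stays immutable.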
Async Support
- Future Integration: Compatible with async/await patterns
- Non-Blocking I/O: Support for async resource loading
- Backpressure: Handle slow consumers gracefully
- Cancellation: Support for cancelling long-running operations
Language Bindings Architecture
WebAssembly (WASM)
- Minimal Runtime: Lightweight WASM module for browsers
- JavaScript Interop: Seamless integration with JavaScript
- Memory Management: Efficient memory sharing between WASM and JS
- Error Handling: Proper error propagation across language boundaries
Node.js (NAPI)
- Native Performance: Direct access to Rust code from Node.js
- Type Safety: TypeScript definitions for all APIs
- Async Support: Non-blocking operations using Node.js event loop
- Memory Efficiency: Minimal copying between Rust and JavaScript
Command Line Interface
- Streaming I/O: Process large files without loading into memory
- Shell Integration: Proper exit codes and signal handling
- Configuration: Support for configuration files and environment variables
- Logging: Structured logging with configurable levels
Testing Architecture
Unit Testing
- Property-Based Testing: Use QuickCheck-style testing for edge cases
- Fuzzing: Automated fuzzing to find parsing and evaluation bugs
- Benchmark Testing: Performance regression testing
- Memory Testing: Valgrind and AddressSanitizer integration
Integration Testing
- Official Test Suite: Compliance testing against FHIRPath specification
- Cross-Platform Testing: Automated testing on multiple platforms
- Language Binding Testing: Test all language bindings
- Performance Testing: Automated performance benchmarking
Security Considerations
Input Validation
- Expression Validation: Validate FHIRPath expressions before evaluation
- Resource Validation: Validate FHIR resources against schema
- Size Limits: Prevent denial-of-service through large inputs
- Timeout Protection: Limit evaluation time for complex expressions
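Size limits and timeout protection can be sketched with a pre-check on the expression length and a deadline that the evaluator polls periodically. `Limits`, `Deadline`, `LimitError`, and `check_size` are hypothetical names, and the specific limits are left to the caller.

```rust
use std::time::{Duration, Instant};

/// Hypothetical resource limits supplied by the caller.
pub struct Limits {
    pub max_expression_bytes: usize,
    pub max_duration: Duration,
}

#[derive(Debug, PartialEq)]
pub enum LimitError {
    ExpressionTooLarge { len: usize, max: usize },
    TimedOut,
}

/// Rejects oversized expressions before any parsing work is done.
pub fn check_size(expression: &str, limits: &Limits) -> Result<(), LimitError> {
    if expression.len() > limits.max_expression_bytes {
        return Err(LimitError::ExpressionTooLarge {
            len: expression.len(),
            max: limits.max_expression_bytes,
        });
    }
    Ok(())
}

/// A point in time after which evaluation should give up.
pub struct Deadline {
    expires_at: Instant,
}

impl Deadline {
    pub fn start(limits: &Limits) -> Self {
        Self {
            expires_at: Instant::now() + limits.max_duration,
        }
    }

    /// Intended to be called between evaluation steps, for example once
    /// per visited AST node or per collection item.
    pub fn check(&self) -> Result<(), LimitError> {
        if Instant::now() >= self.expires_at {
            Err(LimitError::TimedOut)
        } else {
            Ok(())
        }
    }
}
```

This is cooperative cancellation: a timeout is only noticed at the points where the evaluator calls `check`, so those calls need to sit inside any loop that can run for a long time.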
Memory Safety
- Buffer Overflow Protection: Rust’s memory safety prevents buffer overflows
- Integer Overflow Protection: Checked arithmetic operations
- Stack Overflow Protection: Recursion depth limits
- Heap Exhaustion Protection: Memory usage limits and monitoring
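Checked arithmetic and a recursion depth limit fit naturally into a recursive evaluator. In this sketch the `Expr` type, the `eval` function, and the `MAX_DEPTH` value of 64 are all assumptions made for illustration.

```rust
#[derive(Debug, PartialEq)]
pub enum EvalError {
    DepthExceeded { limit: usize },
    Overflow,
}

pub enum Expr {
    Num(i64),
    Add(Box<Expr>, Box<Expr>),
}

/// Hypothetical nesting limit; a real value would be tuned or configurable.
pub const MAX_DEPTH: usize = 64;

/// Evaluates `expr`, failing cleanly instead of overflowing the stack or
/// wrapping around on integer overflow. Callers start with `depth` 0.
pub fn eval(expr: &Expr, depth: usize) -> Result<i64, EvalError> {
    if depth > MAX_DEPTH {
        return Err(EvalError::DepthExceeded { limit: MAX_DEPTH });
    }
    match expr {
        Expr::Num(n) => Ok(*n),
        Expr::Add(lhs, rhs) => {
            let l = eval(lhs, depth + 1)?;
            let r = eval(rhs, depth + 1)?;
            // `checked_add` returns `None` on overflow instead of wrapping.
            l.checked_add(r).ok_or(EvalError::Overflow)
        }
    }
}
```

Both failure modes surface as ordinary `Result` errors, so a hostile or malformed expression produces an error value rather than a crash.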
Future Architecture Considerations
Planned Enhancements
- JIT Compilation: Just-in-time compilation for frequently used expressions
- Query Optimization: Advanced query optimization techniques
- Distributed Evaluation: Support for distributed FHIRPath evaluation
- GPU Acceleration: Leverage GPU for parallel processing of large datasets
Scalability Improvements
- Horizontal Scaling: Support for distributed processing
- Caching Layer: Intelligent caching of parsed expressions and results
- Load Balancing: Built-in load balancing for high-throughput scenarios
- Resource Pooling: Efficient resource pooling for multi-tenant scenarios
This architecture provides a solid foundation for high-performance FHIRPath evaluation while maintaining the flexibility to evolve with changing requirements and new optimization opportunities.