Architecture Overview
Modular design with separate lexer, parser, executor, and expansion engines
System Architecture
User Input
Interactive commands, scripts, or command strings
↓
Lexer
- Token recognition
- Quote handling
- Alias expansion
- Pattern detection
↓
Expansions
- Brace expansion
- Variable expansion
- Command substitution
- Arithmetic expansion
↓
Parser
- AST construction
- Control structures
- Pipeline construction
- Redirection parsing
↓
Executor
- Command execution
- Pipeline management
- Redirection handling
- Error propagation
↓
Output
Command results, exit codes, and shell state
Shell State
- Variable management
- Environment integration
- Directory stack
- Alias management
Built-ins
- 20 built-in commands
- Optimized execution
- No process spawning
- Direct state access
Core Components
Lexer (src/lexer.rs)
Responsibilities
- Token Recognition: Identifies commands, operators, keywords, and special tokens
- Quote Handling: Processes single/double quotes and escape sequences
- Variable Detection: Identifies variable patterns for deferred expansion
- Alias Expansion: Expands command aliases before parsing
- Command Substitution: Preserves $(...) and `...` syntax for runtime expansion
Key Features
- Context-aware tokenization
- Efficient pattern matching
- Comprehensive error reporting
- Performance-optimized parsing
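As a rough illustration of the quote handling described above, the following sketch splits a command line into words and a few operators. The `Token` enum and `tokenize` function are hypothetical names chosen for this example, not the types in src/lexer.rs; alias expansion, variable detection, and command substitution are left out, and backslash handling inside double quotes is simplified.

```rust
/// Hypothetical token type for illustration; a real lexer needs many more variants.
#[derive(Debug, PartialEq, Clone)]
pub enum Token {
    Word(String),
    Pipe,
    RedirOut,
    Semicolon,
}

/// Split `input` into tokens, honouring single quotes, double quotes,
/// and backslash escapes. Returns an error for unterminated quotes.
pub fn tokenize(input: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut word = String::new();
    let mut in_word = false; // true once a word has started, even if it is empty ('')
    let mut chars = input.chars();

    while let Some(c) = chars.next() {
        match c {
            '\'' | '"' => {
                in_word = true;
                let mut closed = false;
                while let Some(q) = chars.next() {
                    if q == c {
                        closed = true;
                        break;
                    }
                    if q == '\\' && c == '"' {
                        // Simplification: inside double quotes a backslash escapes any character.
                        match chars.next() {
                            Some(escaped) => word.push(escaped),
                            None => return Err("unterminated escape".to_string()),
                        }
                    } else {
                        word.push(q);
                    }
                }
                if !closed {
                    return Err(format!("unterminated {} quote", c));
                }
            }
            '\\' => match chars.next() {
                Some(escaped) => {
                    word.push(escaped);
                    in_word = true;
                }
                None => return Err("trailing backslash".to_string()),
            },
            ' ' | '\t' | '|' | '>' | ';' => {
                // Any separator ends the current word.
                if in_word {
                    tokens.push(Token::Word(std::mem::take(&mut word)));
                    in_word = false;
                }
                match c {
                    '|' => tokens.push(Token::Pipe),
                    '>' => tokens.push(Token::RedirOut),
                    ';' => tokens.push(Token::Semicolon),
                    _ => {} // whitespace produces no token
                }
            }
            _ => {
                word.push(c);
                in_word = true;
            }
        }
    }
    if in_word {
        tokens.push(Token::Word(word));
    }
    Ok(tokens)
}
```

With this sketch, `tokenize("echo 'a b' | wc")` yields three words and one `Pipe` token, with `a b` kept as a single word.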
Parser (src/parser.rs)
Responsibilities
- AST Construction: Builds Abstract Syntax Tree from token stream
- Control Structures: Handles if/elif/else, case, for, while, functions
- Pipeline Construction: Creates pipeline structures for command chaining
- Redirection Parsing: Processes I/O redirection operators
Key Features
- Recursive descent parsing
- Comprehensive error recovery
- Context-aware structure building
- Memory-efficient AST representation
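To make the recursive descent approach concrete, here is a minimal sketch with one function per grammar rule. The `Tok` and `Ast` types and the tiny grammar (sequences, pipelines, subshells, simple commands) are assumptions made for this example; the real parser also covers control structures, functions, and redirections.

```rust
/// Hypothetical token and AST types for illustration only.
#[derive(Debug, PartialEq, Clone)]
pub enum Tok { Word(String), Pipe, Semi, LParen, RParen }

#[derive(Debug, PartialEq)]
pub enum Ast {
    Command(Vec<String>),
    Pipeline(Vec<Ast>),
    Sequence(Vec<Ast>),
    Subshell(Box<Ast>),
}

struct Parser { toks: Vec<Tok>, pos: usize }

impl Parser {
    fn peek(&self) -> Option<&Tok> { self.toks.get(self.pos) }

    /// sequence := pipeline (';' pipeline)* [';']
    fn sequence(&mut self) -> Result<Ast, String> {
        let mut items = vec![self.pipeline()?];
        while self.peek() == Some(&Tok::Semi) {
            self.pos += 1;
            // A trailing ';' is allowed at end of input or before ')'.
            if matches!(self.peek(), None | Some(Tok::RParen)) { break; }
            items.push(self.pipeline()?);
        }
        Ok(if items.len() == 1 { items.remove(0) } else { Ast::Sequence(items) })
    }

    /// pipeline := command ('|' command)*
    fn pipeline(&mut self) -> Result<Ast, String> {
        let mut stages = vec![self.command()?];
        while self.peek() == Some(&Tok::Pipe) {
            self.pos += 1;
            stages.push(self.command()?);
        }
        Ok(if stages.len() == 1 { stages.remove(0) } else { Ast::Pipeline(stages) })
    }

    /// command := '(' sequence ')' | WORD+
    fn command(&mut self) -> Result<Ast, String> {
        if self.peek() == Some(&Tok::LParen) {
            self.pos += 1;
            let inner = self.sequence()?; // recursion: a subshell holds a full sequence
            if self.peek() != Some(&Tok::RParen) {
                return Err("expected ')'".to_string());
            }
            self.pos += 1;
            return Ok(Ast::Subshell(Box::new(inner)));
        }
        let mut words = Vec::new();
        while let Some(Tok::Word(w)) = self.peek() {
            words.push(w.clone());
            self.pos += 1;
        }
        if words.is_empty() {
            Err(format!("expected a command at token {}", self.pos))
        } else {
            Ok(Ast::Command(words))
        }
    }
}

/// Parse a whole token stream, rejecting leftover tokens.
pub fn parse(toks: Vec<Tok>) -> Result<Ast, String> {
    let mut parser = Parser { toks, pos: 0 };
    let ast = parser.sequence()?;
    if parser.pos != parser.toks.len() {
        return Err(format!("unexpected token at position {}", parser.pos));
    }
    Ok(ast)
}
```

Collapsing single-element pipelines and sequences keeps the tree small for the common case of one simple command.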
Executor (src/executor.rs)
Responsibilities
- Command Execution: Runs external commands and built-in functions
- Variable Expansion: Runtime expansion of variables and parameters
- Pipeline Management: Coordinates data flow between pipeline stages
- Redirection Handling: Manages file descriptors and I/O redirection
- Error Propagation: Handles exit codes and error conditions
Key Features
- Efficient process management
- Robust error handling
- Streaming I/O operations
- Cross-platform compatibility
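One plausible shape for the command execution step is a lookup table that is checked before any process is spawned. The sketch below is an assumption about how such dispatch could look, not the code in src/executor.rs: `State`, `Builtin`, `builtins`, and `run` are hypothetical names, and only three toy built-ins are registered. The external fallback uses `std::process::Command` from the standard library.

```rust
use std::collections::HashMap;
use std::process::Command;

/// Hypothetical minimal shell state: just a variable table.
#[derive(Default)]
pub struct State {
    pub vars: HashMap<String, String>,
}

/// A built-in receives mutable state and its arguments, and returns an exit code.
type Builtin = fn(&mut State, &[&str]) -> i32;

fn builtin_export(state: &mut State, args: &[&str]) -> i32 {
    for arg in args {
        match arg.split_once('=') {
            Some((name, value)) if !name.is_empty() => {
                // Direct state access: no child process is involved.
                state.vars.insert(name.to_string(), value.to_string());
            }
            _ => return 1,
        }
    }
    0
}

fn builtin_true(_: &mut State, _: &[&str]) -> i32 { 0 }
fn builtin_false(_: &mut State, _: &[&str]) -> i32 { 1 }

fn builtins() -> HashMap<&'static str, Builtin> {
    let mut table: HashMap<&'static str, Builtin> = HashMap::new();
    table.insert("export", builtin_export);
    table.insert("true", builtin_true);
    table.insert("false", builtin_false);
    table
}

/// Run one simple command: built-ins first, then fall back to an external process.
pub fn run(state: &mut State, argv: &[&str]) -> Result<i32, String> {
    let (name, args) = argv
        .split_first()
        .ok_or_else(|| "empty command".to_string())?;

    let table = builtins();
    if let Some(builtin) = table.get(*name) {
        return Ok(builtin(state, args));
    }

    // External command: spawn, wait, and report its exit code.
    let status = Command::new(name)
        .args(args)
        .status()
        .map_err(|e| format!("{}: {}", name, e))?;
    // Simplification: a process killed by a signal has no exit code; report 1.
    Ok(status.code().unwrap_or(1))
}
```

Rebuilding the table on every call keeps the example short; a longer-lived table would avoid that repeated work.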
Shell State (src/state.rs)
Responsibilities
- Variable Scoping: Global and local variable management with proper scoping
- Environment Integration: Coordination with system environment variables
- Function Context: Function call stack and local variable scoping
- Directory Stack: pushd/popd/dirs functionality
- Alias Management: Command alias storage and expansion
Key Features
- Thread-safe state management
- Efficient variable lookups
- Proper scoping rules
- Environment synchronization
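The scoping responsibilities above can be pictured as a stack of local frames sitting on top of a global table. This is a minimal sketch under that assumption; apart from `ShellState` and `get_var`, every field and method name here is hypothetical, the lookup order (locals, then globals, then process environment) is a guess at sensible behavior, and `pushd` only records the directory instead of changing it.

```rust
use std::collections::HashMap;

#[derive(Default)]
pub struct ShellState {
    globals: HashMap<String, String>,
    /// One frame of local variables per active function call (innermost last).
    frames: Vec<HashMap<String, String>>,
    dir_stack: Vec<String>,
}

impl ShellState {
    /// Look up a variable: innermost local frame first, then globals,
    /// then the process environment.
    pub fn get_var(&self, name: &str) -> Option<String> {
        self.frames
            .iter()
            .rev()
            .find_map(|frame| frame.get(name))
            .or_else(|| self.globals.get(name))
            .cloned()
            .or_else(|| std::env::var(name).ok())
    }

    /// Assign to an existing local if one is visible, otherwise to the global table.
    pub fn set_var(&mut self, name: &str, value: &str) {
        for frame in self.frames.iter_mut().rev() {
            if let Some(slot) = frame.get_mut(name) {
                *slot = value.to_string();
                return;
            }
        }
        self.globals.insert(name.to_string(), value.to_string());
    }

    /// Declare a variable local to the current function (global if no function is active).
    pub fn set_local(&mut self, name: &str, value: &str) {
        match self.frames.last_mut() {
            Some(frame) => {
                frame.insert(name.to_string(), value.to_string());
            }
            None => self.set_var(name, value),
        }
    }

    pub fn enter_function(&mut self) {
        self.frames.push(HashMap::new());
    }

    /// Leaving a function discards its locals.
    pub fn exit_function(&mut self) {
        self.frames.pop();
    }

    /// Record a directory on the stack (a real pushd would also change directory).
    pub fn pushd(&mut self, dir: &str) {
        self.dir_stack.push(dir.to_string());
    }

    pub fn popd(&mut self) -> Option<String> {
        self.dir_stack.pop()
    }
}
```

Because `get_var` takes `&self`, a local that shadows a global disappears automatically once `exit_function` pops its frame, and the global value becomes visible again.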
Expansion Engines
Arithmetic Engine
src/arithmetic.rs
Features
- Shunting-yard Algorithm: Proper operator precedence and associativity
- Token-based Parsing: Converts infix expressions to Reverse Polish Notation
- Variable Integration: Seamlessly accesses shell variables during evaluation
- Comprehensive Operators: Arithmetic, comparison, bitwise, and logical operations
- Error Handling: Robust error reporting for syntax errors and division by zero
Supported Operations
- Arithmetic: + - * / % **
- Comparison: == != < <= > >=
- Bitwise: & | ^ << >> ~
- Logical: && || !
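The shunting-yard approach can be sketched in a few dozen lines. The version below is a reduced illustration, not the contents of src/arithmetic.rs: it handles only `+ - * / % **`, parentheses, and variables, omits unary operators, comparison, bitwise, and logical operators, and represents `**` internally as `'^'`. All names (`Tok`, `prec`, `to_rpn`, `eval`) are hypothetical, and treating unset variables as 0 is an assumption borrowed from common shell behavior.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Tok { Num(i64), Op(char), LParen, RParen }

/// (precedence, right-associative?). '^' stands in for the shell's `**`.
fn prec(op: char) -> (u8, bool) {
    match op {
        '^' => (3, true),
        '*' | '/' | '%' => (2, false),
        _ => (1, false), // '+' and '-'
    }
}

fn tokenize(expr: &str, vars: &HashMap<String, i64>) -> Result<Vec<Tok>, String> {
    let chars: Vec<char> = expr.chars().collect();
    let (mut i, mut out) = (0, Vec::new());
    while i < chars.len() {
        let c = chars[i];
        if c.is_alphanumeric() || c == '_' {
            let start = i;
            while i < chars.len() && (chars[i].is_alphanumeric() || chars[i] == '_') { i += 1; }
            let word: String = chars[start..i].iter().collect();
            let value = match word.parse::<i64>() {
                Ok(n) => n,
                Err(_) if c.is_ascii_digit() => return Err(format!("bad number: {}", word)),
                Err(_) => *vars.get(&word).unwrap_or(&0), // assumption: unset variables are 0
            };
            out.push(Tok::Num(value));
            continue;
        }
        match c {
            ' ' | '\t' => {}
            '(' => out.push(Tok::LParen),
            ')' => out.push(Tok::RParen),
            '*' if chars.get(i + 1) == Some(&'*') => { out.push(Tok::Op('^')); i += 1; }
            '+' | '-' | '*' | '/' | '%' => out.push(Tok::Op(c)),
            _ => return Err(format!("unexpected character '{}'", c)),
        }
        i += 1;
    }
    Ok(out)
}

/// Shunting-yard: reorder infix tokens into Reverse Polish Notation.
fn to_rpn(tokens: &[Tok]) -> Result<Vec<Tok>, String> {
    let (mut output, mut stack) = (Vec::new(), Vec::<Tok>::new());
    for &tok in tokens {
        match tok {
            Tok::Num(_) => output.push(tok),
            Tok::LParen => stack.push(tok),
            Tok::Op(op) => {
                let (p, right_assoc) = prec(op);
                // Pop operators that bind tighter (or equally, when left-associative).
                while let Some(&Tok::Op(top)) = stack.last() {
                    let (tp, _) = prec(top);
                    if tp > p || (tp == p && !right_assoc) {
                        output.push(Tok::Op(top));
                        stack.pop();
                    } else {
                        break;
                    }
                }
                stack.push(tok);
            }
            Tok::RParen => loop {
                match stack.pop() {
                    Some(Tok::LParen) => break,
                    Some(t) => output.push(t),
                    None => return Err("unbalanced ')'".to_string()),
                }
            },
        }
    }
    while let Some(t) = stack.pop() {
        if t == Tok::LParen { return Err("unbalanced '('".to_string()); }
        output.push(t);
    }
    Ok(output)
}

fn eval_rpn(rpn: &[Tok]) -> Result<i64, String> {
    let mut st: Vec<i64> = Vec::new();
    for &tok in rpn {
        match tok {
            Tok::Num(n) => st.push(n),
            Tok::Op(op) => {
                let (b, a) = match (st.pop(), st.pop()) {
                    (Some(b), Some(a)) => (b, a),
                    _ => return Err("missing operand".to_string()),
                };
                let result = match op {
                    '+' => a.checked_add(b),
                    '-' => a.checked_sub(b),
                    '*' => a.checked_mul(b),
                    '/' | '%' if b == 0 => return Err("division by zero".to_string()),
                    '/' => a.checked_div(b),
                    '%' => a.checked_rem(b),
                    _ => if (0..=u32::MAX as i64).contains(&b) { a.checked_pow(b as u32) } else { None },
                };
                st.push(result.ok_or_else(|| "overflow or invalid exponent".to_string())?);
            }
            _ => return Err("unexpected parenthesis".to_string()),
        }
    }
    match st.as_slice() { [v] => Ok(*v), _ => Err("malformed expression".to_string()) }
}

pub fn eval(expr: &str, vars: &HashMap<String, i64>) -> Result<i64, String> {
    eval_rpn(&to_rpn(&tokenize(expr, vars)?)?)
}
```

The associativity flag is what makes `2 ** 3 ** 2` group as `2 ** (3 ** 2)` while `10 - 4 - 3` groups as `(10 - 4) - 3`.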
Parameter Expansion Engine
src/parameter_expansion.rs
Features
- Modifier Parsing: Parses POSIX sh parameter expansion modifiers, plus bash-style extensions such as substring extraction, substitution, and indirect expansion
- String Operations: Default values, substring extraction, pattern removal, and substitution
- Indirect Expansion: Dynamic variable access (${!name}) and variable name listing (${!prefix*})
- Error Handling: Robust error reporting for malformed expressions and edge cases
- Performance: Efficient string manipulation with minimal memory allocation
Supported Modifiers
- Defaults: ${VAR:-default} ${VAR:=default} ${VAR:+replacement} ${VAR:?error}
- Substrings: ${VAR:offset} ${VAR:offset:length}
- Patterns: ${VAR#pattern} ${VAR##pattern} ${VAR%pattern} ${VAR%%pattern}
- Substitution: ${VAR/pattern/replacement} ${VAR//pattern/replacement}
- Indirect (bash extension): ${!name} ${!prefix*} ${!prefix@}
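A small sketch may help show how the modifiers are distinguished once the variable name has been split off. The function below is illustrative only, not the code in src/parameter_expansion.rs: `expand_param` is a hypothetical name, it receives the text between `${` and `}`, patterns are treated as literal strings rather than globs (so `#` and `##` behave identically here), offsets must be non-negative, and substitution and indirect expansion are not covered.

```rust
use std::collections::HashMap;

/// Expand the body of a `${...}` expression, e.g. "VAR:-default".
/// `vars` is mutable because `:=` assigns the default back to the variable.
pub fn expand_param(body: &str, vars: &mut HashMap<String, String>) -> Result<String, String> {
    // The variable name is the leading run of [A-Za-z0-9_].
    let name_len = body
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(body.len());
    let (name, rest) = body.split_at(name_len);
    if name.is_empty() {
        return Err(format!("bad substitution: ${{{}}}", body));
    }
    let val = vars.get(name).cloned().unwrap_or_default();
    let has_value = !val.is_empty(); // the ':' forms treat unset and empty alike

    if rest.is_empty() {
        return Ok(val);
    }
    if let Some(word) = rest.strip_prefix(":-") {
        return Ok(if has_value { val } else { word.to_string() });
    }
    if let Some(word) = rest.strip_prefix(":=") {
        if has_value {
            return Ok(val);
        }
        vars.insert(name.to_string(), word.to_string());
        return Ok(word.to_string());
    }
    if let Some(word) = rest.strip_prefix(":+") {
        return Ok(if has_value { word.to_string() } else { String::new() });
    }
    if let Some(msg) = rest.strip_prefix(":?") {
        return if has_value { Ok(val) } else { Err(format!("{}: {}", name, msg)) };
    }
    // Prefix and suffix removal, with the pattern taken literally.
    if let Some(pat) = rest.strip_prefix("##").or_else(|| rest.strip_prefix('#')) {
        return Ok(val.strip_prefix(pat).unwrap_or(val.as_str()).to_string());
    }
    if let Some(pat) = rest.strip_prefix("%%").or_else(|| rest.strip_prefix('%')) {
        return Ok(val.strip_suffix(pat).unwrap_or(val.as_str()).to_string());
    }
    // ${VAR:offset} and ${VAR:offset:length}, counted in characters.
    if let Some(spec) = rest.strip_prefix(':') {
        let mut parts = spec.splitn(2, ':');
        let offset = parts
            .next()
            .unwrap_or("")
            .parse::<usize>()
            .map_err(|_| format!("bad offset in ${{{}}}", body))?;
        let tail = val.chars().skip(offset);
        return match parts.next() {
            Some(len) => {
                let len = len
                    .parse::<usize>()
                    .map_err(|_| format!("bad length in ${{{}}}", body))?;
                Ok(tail.take(len).collect())
            }
            None => Ok(tail.collect()),
        };
    }
    Err(format!("unsupported modifier in ${{{}}}", body))
}
```

The two-character operators are tested before the bare `:` case so that `${VAR:-x}` is never mistaken for a substring request.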
Brace Expansion Engine
src/brace_expansion.rs
Features
- Pattern Detection: Identifies brace patterns during lexing with nested brace support
- Comma-Separated Lists: Expands {a,b,c} into multiple alternatives
- Range Expansion: Numeric ({1..10}) and alphabetic ({a..z}) range generation
- Prefix/Suffix Handling: Combines braces with surrounding text for complex patterns
- Error Handling: Graceful handling of malformed patterns with clear error messages
Expansion Types
- Lists: {a,b,c} → a b c
- Ranges: {1..5} → 1 2 3 4 5
- Nested: {{a,b},{c,d}} → a b c d
- Complex: file{1,2}.txt → file1.txt file2.txt
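The expansion types listed above can be reproduced with a short recursive function. This is a sketch under simplifying assumptions rather than the code in src/brace_expansion.rs: all function names are hypothetical, quoting and escaping are ignored, ranges have no step syntax, and a brace group that is neither a list nor a range is kept literally.

```rust
/// Index of the '}' that closes the '{' at `open`, if any.
fn matching_close(chars: &[char], open: usize) -> Option<usize> {
    let mut depth = 0;
    for (i, &c) in chars.iter().enumerate().skip(open) {
        match c {
            '{' => depth += 1,
            '}' => {
                depth -= 1;
                if depth == 0 {
                    return Some(i);
                }
            }
            _ => {}
        }
    }
    None
}

/// Split on commas that are not inside nested braces.
fn split_top_level(body: &str) -> Vec<String> {
    let mut parts = vec![String::new()];
    let mut depth = 0;
    for c in body.chars() {
        if c == ',' && depth == 0 {
            parts.push(String::new());
            continue;
        }
        if c == '{' { depth += 1; } else if c == '}' { depth -= 1; }
        parts.last_mut().unwrap().push(c);
    }
    parts
}

/// Inclusive run from `a` to `b`, ascending or descending.
fn steps(a: i64, b: i64) -> Vec<i64> {
    if a <= b { (a..=b).collect() } else { (b..=a).rev().collect() }
}

/// `1..5` or `a..e` style ranges; anything else is not a range.
fn expand_range(body: &str) -> Option<Vec<String>> {
    let (lo, hi) = body.split_once("..")?;
    if let (Ok(a), Ok(b)) = (lo.parse::<i64>(), hi.parse::<i64>()) {
        return Some(steps(a, b).iter().map(|n| n.to_string()).collect());
    }
    let (mut l, mut h) = (lo.chars(), hi.chars());
    match (l.next(), l.next(), h.next(), h.next()) {
        (Some(a), None, Some(b), None) if a.is_ascii_alphabetic() && b.is_ascii_alphabetic() => {
            let run = steps(a as u8 as i64, b as u8 as i64);
            Some(run.iter().map(|&n| (n as u8 as char).to_string()).collect())
        }
        _ => None,
    }
}

pub fn expand_braces(input: &str) -> Vec<String> {
    let chars: Vec<char> = input.chars().collect();
    // Locate the first '{' that has a matching '}'.
    let found = (0..chars.len())
        .filter(|&i| chars[i] == '{')
        .find_map(|i| matching_close(&chars, i).map(|close| (i, close)));
    let (open, close) = match found {
        Some(pair) => pair,
        None => return vec![input.to_string()],
    };
    let prefix: String = chars[..open].iter().collect();
    let body: String = chars[open + 1..close].iter().collect();
    let suffix: String = chars[close + 1..].iter().collect();

    let parts = split_top_level(&body);
    let alternatives: Vec<String> = if parts.len() > 1 {
        parts.iter().flat_map(|p| expand_braces(p)).collect()
    } else if let Some(range) = expand_range(&body) {
        range
    } else {
        // Not a list or range: keep the braces, but still expand anything nested.
        expand_braces(&body).into_iter().map(|b| format!("{{{}}}", b)).collect()
    };

    // Combine every alternative with every expansion of the remaining text.
    let tails = expand_braces(&suffix);
    let mut out = Vec::new();
    for alt in &alternatives {
        for tail in &tails {
            out.push(format!("{}{}{}", prefix, alt, tail));
        }
    }
    out
}
```

Expanding the suffix recursively is what lets several groups in one word, such as `{a,b}{1,2}`, multiply out to `a1 a2 b1 b2`.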
Module Dependencies
Primary Flow
lexer → parser → executor
Main command processing pipeline
State Management
state → All modules
Shared state management across all components
Expansion Integration
expansion engines → executor
Runtime expansion during command execution
Built-in Commands
builtins → executor
Optimized command implementation
Design Principles
Single Responsibility
Each module has a clearly defined purpose with minimal cross-cutting concerns
- Lexer: Only handles tokenization and lexical analysis
- Parser: Only handles AST construction and parsing
- Executor: Only handles command execution and evaluation
- State: Only manages shell state and variables
Immutable by Default
Functions prefer immutable references to prevent accidental state mutation
// Functions prefer immutable references
pub fn get_var(&self, name: &str) -> Option<String> {
    // Implementation avoids mutation
}
Explicit Error Propagation
Clear Result types for error handling throughout the codebase
// Clear Result types for error handling
pub fn lex(input: &str, shell_state: &ShellState) -> Result<Vec<Token>, String>
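To show how `Result` values might flow between stages, here is a minimal sketch that chains three stand-in stages with the `?` operator. The stage functions are deliberately trivial and hypothetical: they do not share signatures with the real `lex`, parser, or executor, and `SimpleCommand` and `run_line` are names invented for this example.

```rust
/// Hypothetical parsed command for illustration.
#[derive(Debug, PartialEq)]
pub struct SimpleCommand {
    pub argv: Vec<String>,
}

/// Stand-in lexer: splits on whitespace and rejects an odd number of single quotes.
fn lex(input: &str) -> Result<Vec<String>, String> {
    if input.matches('\'').count() % 2 != 0 {
        return Err("lex error: unterminated single quote".to_string());
    }
    Ok(input.split_whitespace().map(str::to_string).collect())
}

/// Stand-in parser: requires at least one word.
fn parse(tokens: Vec<String>) -> Result<SimpleCommand, String> {
    if tokens.is_empty() {
        return Err("parse error: empty command".to_string());
    }
    Ok(SimpleCommand { argv: tokens })
}

/// Stand-in executor: knows two commands and reports everything else as missing.
fn execute(cmd: &SimpleCommand) -> Result<i32, String> {
    match cmd.argv[0].as_str() {
        "true" => Ok(0),
        "false" => Ok(1),
        other => Err(format!("{}: command not found", other)),
    }
}

/// Each `?` returns early with the stage's error, so the caller sees
/// exactly which stage failed without any stage panicking.
pub fn run_line(input: &str) -> Result<i32, String> {
    let tokens = lex(input)?;
    let command = parse(tokens)?;
    execute(&command)
}
```

A non-zero exit code such as the one from `false` is still an `Ok` value here; `Err` is reserved for failures of the shell's own processing.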
Comprehensive Testing
323+ test cases ensuring reliability and correctness across all components
- Unit tests for individual components
- Integration tests for end-to-end workflows
- Error handling and edge case coverage
- Performance regression testing
Performance Considerations
Memory Management
- String Reuse: Reuse String instances where possible to reduce allocations
- Minimal Copying: Avoid unnecessary allocations in hot paths
- Efficient Data Structures: Use appropriate data structures for each use case
- Streaming Operations: Process large inputs in chunks when possible
Execution Optimization
- Built-in Optimization: Built-in commands execute without process spawning
- In-Context Execution: Command substitution uses in-process execution for built-ins
- Efficient Token Processing: Minimal copying during token processing pipeline
- Lazy Evaluation: Expansions are only executed when actually encountered
Caching Strategy
- Alias Expansion: Results are not cached (by design for freshness)
- Variable Lookups: O(1) access with HashMap implementation
- Function Definitions: Stored efficiently in HashMap for fast lookup
- Path Resolution: PATH searching optimized for external commands
External Dependencies
- rustyline: Interactive line editing and history with signal handling support
- nix: Unix system interactions and terminal detection
- glob: Pattern matching for case statements and wildcard expansion
- clap: Command-line argument parsing with derive macros
- signal-hook: Robust signal handling (SIGINT, SIGTERM)
- libc: Low-level C library bindings and process management