Architecture Overview
Modular design with separate lexer, parser, executor, and expansion engines
System Architecture
User Input
Interactive commands, scripts, or command strings
↓
Lexer
- Token recognition
- Quote handling
- Alias expansion
- Pattern detection
↓
Expansions
- Brace expansion
- Variable expansion
- Command substitution
- Arithmetic expansion
↓
Parser
- AST construction
- Control structures
- Pipeline construction
- Redirection parsing
↓
Executor
- Command execution
- Pipeline management
- Redirection handling
- Error propagation
↓
Output
Command results, exit codes, and shell state
Shell State
- Variable management
- Environment integration
- Directory stack
- Alias management
Built-ins
- 32 built-in commands
- Optimized execution
- No process spawning
- Direct state access
Core Components
Lexer (src/lexer/)
Module Structure
- Main Module (mod.rs): Token recognition, quote handling, variable detection, alias expansion, command substitution preservation, here-document tokenization, and FD redirection parsing
- Token Types (token.rs): Token enum definitions and helper methods
- Test Organization (tests/): 7 focused test modules covering basic tokenization, alias expansion, quote handling, expansion patterns, redirection operators, tilde expansion, and edge cases
Key Features
- Modular organization with clear separation of concerns
- Context-aware tokenization
- Efficient pattern matching
- Comprehensive error reporting
- Performance-optimized parsing
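To make quote handling during tokenization concrete, here is a minimal sketch. It is not the project's lexer: `Token` and `tokenize` are hypothetical names, only words and `|` are recognized, and both quote styles are taken literally (no variable expansion inside double quotes, no aliases or here-documents).

```rust
/// Hypothetical token type for illustration; the real enum lives in token.rs.
#[derive(Debug, PartialEq)]
pub enum Token {
    Word(String),
    Pipe,
}

/// Splits input into words and pipes, honoring single and double quotes.
pub fn tokenize(input: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut word = String::new();
    let mut in_word = false; // true once a word has started, even if it is empty ("")
    let mut quote: Option<char> = None; // the quote character that is currently open

    for c in input.chars() {
        // Inside quotes every character is literal until the matching quote.
        if let Some(q) = quote {
            if c == q {
                quote = None;
            } else {
                word.push(c);
            }
            continue;
        }
        match c {
            '\'' | '"' => {
                quote = Some(c);
                in_word = true;
            }
            '|' => {
                if in_word {
                    tokens.push(Token::Word(std::mem::take(&mut word)));
                    in_word = false;
                }
                tokens.push(Token::Pipe);
            }
            c if c.is_whitespace() => {
                if in_word {
                    tokens.push(Token::Word(std::mem::take(&mut word)));
                    in_word = false;
                }
            }
            _ => {
                word.push(c);
                in_word = true;
            }
        }
    }
    if quote.is_some() {
        return Err("unterminated quote".to_string());
    }
    if in_word {
        tokens.push(Token::Word(word));
    }
    Ok(tokens)
}
```

With this sketch, `tokenize("echo 'a | b' | wc")` yields the words `echo` and `a | b`, a pipe, and the word `wc`: the quoted `|` stays inside the word, while the unquoted one becomes its own token.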
Parser (src/parser/)
Module Structure
- Main Module (mod.rs): AST construction, pipeline building, redirection parsing, subshell parsing, and FD redirection AST nodes
- AST Definitions (ast.rs): Complete AST node type definitions for commands, pipelines, and control flow structures
- Control Flow (control_flow.rs): Specialized parsers for if/elif/else, case, for, while, until, and function definitions
- Test Organization (tests/): 6 focused test modules covering basic parsing, control flow, compound commands, pipelines, operators, and redirections
Key Features
- Modular organization with specialized submodules
- Recursive descent parsing
- Comprehensive error recovery
- Context-aware structure building
- Memory-efficient AST representation
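Recursive descent parsing, as listed above, assigns one function per grammar rule. The sketch below illustrates the idea on a tiny grammar of commands, pipelines, and `&&`/`||` lists. The `Ast` and `Parser` types here are hypothetical and much smaller than whatever ast.rs actually defines; tokens are plain string slices for brevity.

```rust
/// Hypothetical, simplified AST for illustration only.
#[derive(Debug, PartialEq)]
pub enum Ast {
    Command(Vec<String>),
    Pipeline(Vec<Ast>),
    And(Box<Ast>, Box<Ast>),
    Or(Box<Ast>, Box<Ast>),
}

pub struct Parser<'a> {
    tokens: &'a [&'a str],
    pos: usize,
}

impl<'a> Parser<'a> {
    pub fn new(tokens: &'a [&'a str]) -> Self {
        Parser { tokens, pos: 0 }
    }

    fn peek(&self) -> Option<&'a str> {
        self.tokens.get(self.pos).copied()
    }

    /// list := pipeline (("&&" | "||") pipeline)*
    pub fn parse_list(&mut self) -> Result<Ast, String> {
        let mut left = self.parse_pipeline()?;
        while let Some(op @ ("&&" | "||")) = self.peek() {
            self.pos += 1;
            let right = self.parse_pipeline()?;
            left = if op == "&&" {
                Ast::And(Box::new(left), Box::new(right))
            } else {
                Ast::Or(Box::new(left), Box::new(right))
            };
        }
        Ok(left)
    }

    /// pipeline := command ("|" command)*
    fn parse_pipeline(&mut self) -> Result<Ast, String> {
        let mut stages = vec![self.parse_command()?];
        while self.peek() == Some("|") {
            self.pos += 1;
            stages.push(self.parse_command()?);
        }
        // A single stage is just a command, not a pipeline.
        Ok(if stages.len() == 1 {
            stages.pop().unwrap()
        } else {
            Ast::Pipeline(stages)
        })
    }

    /// command := WORD+
    fn parse_command(&mut self) -> Result<Ast, String> {
        let mut words = Vec::new();
        while let Some(tok) = self.peek() {
            if matches!(tok, "|" | "&&" | "||") {
                break;
            }
            words.push(tok.to_string());
            self.pos += 1;
        }
        if words.is_empty() {
            Err(format!("expected a command at token {}", self.pos))
        } else {
            Ok(Ast::Command(words))
        }
    }
}
```

Because `parse_list` calls `parse_pipeline`, which calls `parse_command`, `|` binds tighter than `&&` and `||` without any explicit precedence table. Errors surface as `Err` values rather than panics; real error recovery is out of scope for this sketch.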
Executor (src/executor/)
Module Structure
- Main Module (mod.rs): Command execution engine and error propagation
- Expansion Engine (expansion.rs): Variable expansion, wildcard expansion, and command substitution
- Redirection Handler (redirection.rs): FD table management, redirection handling, and here-document processing
- Command Executor (command.rs): Single command execution, pipeline management, and built-in integration
- Subshell Handler (subshell.rs): Subshell execution with state isolation, compound commands, and trap inheritance
- Test Organization (tests/): 5 focused test modules covering execution, expansion, redirection, commands, and subshells
Key Features
- Highly modular architecture with focused submodules
- Efficient process management
- Robust error handling
- Streaming I/O operations
- Cross-platform compatibility
Shell State (src/state/)
Module Structure
- Main Module (mod.rs): Variable scoping, environment integration, function context, directory stack, alias management, and loop control state
- FD Table (fd_table.rs): File descriptor table with save/restore capabilities and FD operations
- Shell Options (options.rs): Shell option flags (errexit, nounset, xtrace, etc.) and option display
- Signal Handling (signals.rs): Trap management, signal normalization, and trap display
- Job Control (jobs.rs): Job table management, background job tracking, job state transitions, and jobspec resolution
- Test Organization (tests/): 5 focused test modules covering state management, variable scoping, FD operations, shell options, and job control
Key Features
- Well-organized modular structure
- Thread-safe state management
- Efficient variable lookups
- Proper scoping rules
- Environment synchronization
- Comprehensive job control with smart jobspec matching
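One common way to get scoping together with HashMap-speed lookups is a global table plus a stack of function-local tables. The sketch below shows that approach under stated assumptions: the `Vars` type, its fields, and `push_scope`/`pop_scope`/`set_local`/`set_global` are hypothetical names, and the actual layout in src/state/ may differ.

```rust
use std::collections::HashMap;

/// Hypothetical state type: globals plus a stack of function-local scopes.
#[derive(Default)]
pub struct Vars {
    globals: HashMap<String, String>,
    locals: Vec<HashMap<String, String>>,
}

impl Vars {
    /// Called on function entry.
    pub fn push_scope(&mut self) {
        self.locals.push(HashMap::new());
    }

    /// Called on function exit; drops that function's locals.
    pub fn pop_scope(&mut self) {
        self.locals.pop();
    }

    /// Read-only lookup: the innermost local scope wins, globals are the fallback.
    pub fn get_var(&self, name: &str) -> Option<String> {
        self.locals
            .iter()
            .rev()
            .find_map(|scope| scope.get(name))
            .or_else(|| self.globals.get(name))
            .cloned()
    }

    pub fn set_global(&mut self, name: &str, value: &str) {
        self.globals.insert(name.to_string(), value.to_string());
    }

    /// Sets a local in the current scope, or a global when no function is active.
    pub fn set_local(&mut self, name: &str, value: &str) {
        if let Some(scope) = self.locals.last_mut() {
            scope.insert(name.to_string(), value.to_string());
        } else {
            self.globals.insert(name.to_string(), value.to_string());
        }
    }
}
```

A local set after `push_scope` shadows a global of the same name, and the global becomes visible again after `pop_scope`. Lookup cost is one hash probe per active scope, so it stays close to O(1) for shallow call stacks.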
Expansion Engines
Arithmetic Engine
src/arithmetic.rs
Features
- Shunting-yard Algorithm: Proper operator precedence and associativity
- Token-based Parsing: Converts infix expressions to Reverse Polish Notation
- Variable Integration: Seamlessly accesses shell variables during evaluation
- Comprehensive Operators: Arithmetic, comparison, bitwise, and logical operations
- Error Handling: Robust error reporting for syntax errors and division by zero
Supported Operations
Arithmetic
+ - * / % **
Comparison
== != < <= > >=
Bitwise
& | ^ << >> ~
Logical
&& || !
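The shunting-yard approach described above can be sketched in a few dozen lines. This is an illustration only, not the code in src/arithmetic.rs: `Tok`, `lex`, `to_rpn`, `eval_rpn`, and `eval` are hypothetical names, only the binary arithmetic operators and parentheses are covered, unset variables are assumed to evaluate to 0, and overflow simply wraps.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Tok {
    Num(i64),
    Op(&'static str),
    LParen,
    RParen,
}

/// Returns (precedence, is_right_associative) for a binary operator.
fn prec(op: &str) -> (u8, bool) {
    match op {
        "**" => (3, true),
        "*" | "/" | "%" => (2, false),
        _ => (1, false), // "+" and "-"
    }
}

/// Tokenizes the expression; variable names are resolved through `vars`.
fn lex(src: &str, vars: &HashMap<String, i64>) -> Result<Vec<Tok>, String> {
    let chars: Vec<char> = src.chars().collect();
    let (mut out, mut i) = (Vec::new(), 0);
    while i < chars.len() {
        let (c, start) = (chars[i], i);
        if c.is_whitespace() {
            i += 1;
        } else if c.is_ascii_digit() {
            while i < chars.len() && chars[i].is_ascii_digit() { i += 1; }
            let text: String = chars[start..i].iter().collect();
            out.push(Tok::Num(text.parse::<i64>().map_err(|e| e.to_string())?));
        } else if c.is_ascii_alphabetic() || c == '_' {
            while i < chars.len() && (chars[i].is_ascii_alphanumeric() || chars[i] == '_') { i += 1; }
            let name: String = chars[start..i].iter().collect();
            // Assumption of this sketch: unset variables evaluate to 0.
            out.push(Tok::Num(vars.get(&name).copied().unwrap_or(0)));
        } else {
            i += 1;
            out.push(match c {
                '*' if chars.get(i) == Some(&'*') => { i += 1; Tok::Op("**") }
                '+' => Tok::Op("+"),
                '-' => Tok::Op("-"),
                '*' => Tok::Op("*"),
                '/' => Tok::Op("/"),
                '%' => Tok::Op("%"),
                '(' => Tok::LParen,
                ')' => Tok::RParen,
                other => return Err(format!("unexpected character: {other}")),
            });
        }
    }
    Ok(out)
}

/// Shunting-yard: reorders infix tokens into Reverse Polish Notation.
fn to_rpn(tokens: Vec<Tok>) -> Result<Vec<Tok>, String> {
    let (mut output, mut stack) = (Vec::new(), Vec::new());
    for tok in tokens {
        match tok {
            Tok::Num(_) => output.push(tok),
            Tok::Op(op) => {
                // Pop operators that bind tighter, or equally tight when `op`
                // is left-associative.
                while let Some(&Tok::Op(top)) = stack.last() {
                    let (p, right) = prec(op);
                    if prec(top).0 > p || (prec(top).0 == p && !right) {
                        output.push(Tok::Op(top));
                        stack.pop();
                    } else {
                        break;
                    }
                }
                stack.push(tok);
            }
            Tok::LParen => stack.push(tok),
            Tok::RParen => loop {
                match stack.pop() {
                    Some(Tok::LParen) => break,
                    Some(t) => output.push(t),
                    None => return Err("unbalanced ')'".to_string()),
                }
            },
        }
    }
    while let Some(t) = stack.pop() {
        if t == Tok::LParen { return Err("unbalanced '('".to_string()); }
        output.push(t);
    }
    Ok(output)
}

/// Evaluates an RPN token sequence with an operand stack.
fn eval_rpn(rpn: &[Tok]) -> Result<i64, String> {
    let mut st: Vec<i64> = Vec::new();
    for tok in rpn {
        match *tok {
            Tok::Num(n) => st.push(n),
            Tok::Op(op) => {
                let (b, a) = match (st.pop(), st.pop()) {
                    (Some(b), Some(a)) => (b, a),
                    _ => return Err("missing operand".to_string()),
                };
                st.push(match op {
                    "+" => a.wrapping_add(b),
                    "-" => a.wrapping_sub(b),
                    "*" => a.wrapping_mul(b),
                    "/" | "%" if b == 0 => return Err("division by zero".to_string()),
                    "/" => a.wrapping_div(b),
                    "%" => a.wrapping_rem(b),
                    _ if b < 0 || b > u32::MAX as i64 => return Err("bad exponent".to_string()),
                    _ => a.wrapping_pow(b as u32), // "**"
                });
            }
            _ => return Err("unexpected parenthesis".to_string()),
        }
    }
    match st.as_slice() {
        [v] => Ok(*v),
        _ => Err("malformed expression".to_string()),
    }
}

pub fn eval(src: &str, vars: &HashMap<String, i64>) -> Result<i64, String> {
    eval_rpn(&to_rpn(lex(src, vars)?)?)
}
```

For example, `2 ** 3 ** 2` is reordered to `2 3 2 ** **` because `**` is right-associative, so it evaluates to 512 rather than 64, while `10 - 4 - 3` becomes `10 4 - 3 -` and evaluates to 3.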
Parameter Expansion Engine
src/parameter_expansion.rs
Features
- Modifier Parsing: Sophisticated parsing of POSIX sh parameter expansion modifiers
- String Operations: Default values, substring extraction, pattern removal, and substitution
- Indirect Expansion: Dynamic variable access (${!name}) and variable name listing (${!prefix*})
- Error Handling: Robust error reporting for malformed expressions and edge cases
- Performance: Efficient string manipulation with minimal memory allocation
Supported Modifiers
Defaults
${VAR:-default} ${VAR:=default} ${VAR:+replacement} ${VAR:?error}
Substrings
${VAR:offset} ${VAR:offset:length}
Patterns
${VAR#pattern} ${VAR##pattern} ${VAR%pattern} ${VAR%%pattern}
Substitution
${VAR/pattern/replacement} ${VAR//pattern/replacement}
Indirect (bash extension)
${!name} ${!prefix*} ${!prefix@}
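As a rough illustration of modifier parsing, the sketch below handles a small subset of the forms listed above: plain lookup, `:-`, `:+`, `:?`, and single `#`/`%` removal. The function name `expand` and its signature are hypothetical, the input is the text between `${` and `}`, and patterns are matched as literal strings rather than globs to keep the example short.

```rust
use std::collections::HashMap;

/// Expands the text between `${` and `}` for a small subset of modifiers.
/// Patterns for `#` and `%` are treated as literal strings here, not globs.
pub fn expand(expr: &str, vars: &HashMap<String, String>) -> Result<String, String> {
    // The variable name is the leading run of [A-Za-z0-9_].
    let name_len = expr
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(expr.len());
    let (name, rest) = expr.split_at(name_len);
    let value: Option<&str> = vars.get(name).map(String::as_str);
    // The colon forms treat an empty value the same as an unset one.
    let set_and_nonempty = value.filter(|v| !v.is_empty());

    if rest.is_empty() {
        Ok(value.unwrap_or("").to_string())
    } else if let Some(default) = rest.strip_prefix(":-") {
        Ok(set_and_nonempty.unwrap_or(default).to_string())
    } else if let Some(alt) = rest.strip_prefix(":+") {
        Ok(set_and_nonempty.map_or("", |_| alt).to_string())
    } else if let Some(msg) = rest.strip_prefix(":?") {
        set_and_nonempty
            .map(str::to_string)
            .ok_or_else(|| format!("{name}: {msg}"))
    } else if let Some(prefix) = rest.strip_prefix('#') {
        let v = value.unwrap_or("");
        Ok(v.strip_prefix(prefix).unwrap_or(v).to_string())
    } else if let Some(suffix) = rest.strip_prefix('%') {
        let v = value.unwrap_or("");
        Ok(v.strip_suffix(suffix).unwrap_or(v).to_string())
    } else {
        Err(format!("unsupported modifier: {rest}"))
    }
}
```

With `FILE=report.txt`, `expand("FILE%.txt", &vars)` returns `report`, and `expand("MISSING:-fallback", &vars)` returns `fallback`. The `:=` form is left out because it needs mutable access to the variable table, which this read-only signature deliberately avoids.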
Brace Expansion Engine
src/brace_expansion.rs
Features
- Pattern Detection: Identifies brace patterns during lexing with nested brace support
- Comma-Separated Lists: Expands {a,b,c} into multiple alternatives
- Range Expansion: Numeric ({1..10}) and alphabetic ({a..z}) range generation
- Prefix/Suffix Handling: Combines braces with surrounding text for complex patterns
- Error Handling: Graceful handling of malformed patterns with clear error messages
Expansion Types
Lists
{a,b,c} → a b c
Ranges
{1..5} → 1 2 3 4 5
Nested
{{a,b},{c,d}} → a b c d
Complex
file{1,2}.txt → file1.txt file2.txt
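The four expansion types above can be reproduced with a short recursive function. This is a sketch under simplifying assumptions, not the contents of src/brace_expansion.rs: `expand_braces` and `expand_range` are hypothetical names, escapes and quoting are ignored, and a group that is neither a list nor a range is kept as literal text.

```rust
/// Expands the first brace group in `word`, then recurses on each result.
pub fn expand_braces(word: &str) -> Vec<String> {
    let Some(open) = word.find('{') else {
        return vec![word.to_string()];
    };
    // Find the matching '}' and record commas at nesting depth 1.
    let mut depth = 0;
    let mut close = None;
    let mut commas = Vec::new();
    for (i, c) in word.char_indices().skip_while(|&(i, _)| i < open) {
        match c {
            '{' => depth += 1,
            '}' => {
                depth -= 1;
                if depth == 0 {
                    close = Some(i);
                    break;
                }
            }
            ',' if depth == 1 => commas.push(i),
            _ => {}
        }
    }
    let Some(close) = close else {
        return vec![word.to_string()]; // unbalanced: keep the word literally
    };
    let (prefix, body, suffix) = (&word[..open], &word[open + 1..close], &word[close + 1..]);

    let alternatives: Vec<String> = if !commas.is_empty() {
        // Split the body at its top-level commas.
        let mut parts = Vec::new();
        let mut start = open + 1;
        for &comma in &commas {
            parts.push(word[start..comma].to_string());
            start = comma + 1;
        }
        parts.push(word[start..close].to_string());
        parts
    } else if let Some(range) = expand_range(body) {
        range
    } else {
        // Neither list nor range: keep the group as text, expand only the suffix.
        return expand_braces(suffix)
            .into_iter()
            .map(|s| format!("{prefix}{{{body}}}{s}"))
            .collect();
    };

    let mut out = Vec::new();
    for alt in alternatives {
        // Recursing on the recombined word handles nested and later groups.
        out.extend(expand_braces(&format!("{prefix}{alt}{suffix}")));
    }
    out
}

/// Handles `1..5` style numeric ranges and `a..e` style letter ranges.
fn expand_range(body: &str) -> Option<Vec<String>> {
    let (lo, hi) = body.split_once("..")?;
    if let (Ok(a), Ok(b)) = (lo.parse::<i64>(), hi.parse::<i64>()) {
        return Some(if a <= b {
            (a..=b).map(|n| n.to_string()).collect()
        } else {
            (b..=a).rev().map(|n| n.to_string()).collect()
        });
    }
    match (lo.as_bytes(), hi.as_bytes()) {
        ([a], [b]) if a.is_ascii_alphabetic() && b.is_ascii_alphabetic() => {
            let (a, b) = (*a, *b);
            Some(if a <= b {
                (a..=b).map(|c| (c as char).to_string()).collect()
            } else {
                (b..=a).rev().map(|c| (c as char).to_string()).collect()
            })
        }
        _ => None,
    }
}
```

Each call removes one brace pair, so the recursion terminates, and expanding the leftmost group first gives the familiar ordering: `{a,b}{1,2}` produces `a1 a2 b1 b2`.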
Module Dependencies
Primary Flow
lexer → parser → executor
Main command processing pipeline
State Management
state → All modules
Shared state management across all components
Expansion Integration
expansion engines → executor
Runtime expansion during command execution
Built-in Commands
builtins → executor
Optimized command implementation
Design Principles
Single Responsibility
Each module has a clearly defined purpose with minimal cross-cutting concerns
- Lexer: Only handles tokenization and lexical analysis
- Parser: Only handles AST construction and parsing
- Executor: Only handles command execution and evaluation
- State: Only manages shell state and variables
Immutable by Default
Functions prefer immutable references to prevent accidental state mutation
// Functions prefer immutable references
pub fn get_var(&self, name: &str) -> Option<String> {
    // Implementation avoids mutation
}
Explicit Error Propagation
Clear Result types for error handling throughout the codebase
// Clear Result types for error handling
pub fn lex(input: &str, shell_state: &ShellState) -> Result<Vec<Token>, String>
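To show how `Result` return types compose across stages, here is a minimal sketch of a lex, parse, execute chain using the `?` operator. The three stage functions are hypothetical stand-ins with simplified signatures (no shell state, string tokens, three toy commands); only the propagation pattern is the point.

```rust
/// Hypothetical lexer stage: fails on an odd number of double quotes.
fn lex(input: &str) -> Result<Vec<String>, String> {
    if input.matches('"').count() % 2 != 0 {
        return Err("lex error: unterminated quote".to_string());
    }
    Ok(input.split_whitespace().map(str::to_string).collect())
}

/// Hypothetical parser stage: splits tokens into a command name and arguments.
fn parse(tokens: Vec<String>) -> Result<(String, Vec<String>), String> {
    let mut iter = tokens.into_iter();
    let name = iter.next().ok_or("parse error: empty command")?;
    Ok((name, iter.collect()))
}

/// Hypothetical executor stage: returns an exit code for a few toy commands.
fn execute(command: &(String, Vec<String>)) -> Result<i32, String> {
    match command.0.as_str() {
        "true" => Ok(0),
        "false" => Ok(1),
        "echo" => {
            println!("{}", command.1.join(" "));
            Ok(0)
        }
        other => Err(format!("exec error: {other}: command not found")),
    }
}

/// Each `?` returns early with the first error; no stage has to inspect
/// or re-wrap failures from the stage before it.
pub fn run(input: &str) -> Result<i32, String> {
    let tokens = lex(input)?;
    let command = parse(tokens)?;
    execute(&command)
}
```

A non-zero exit code such as the 1 from `false` is still an `Ok` value: it is a normal command result, whereas `Err` is reserved for failures of the shell machinery itself, such as `run("")` returning `Err("parse error: empty command")`.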
Comprehensive Testing
499+ test functions ensuring reliability and correctness across all components
- Unit tests for individual components
- Integration tests for end-to-end workflows
- Error handling and edge case coverage
- Performance regression testing
Performance Considerations
Memory Management
- String Reuse: Reuse String instances where possible to reduce allocations
- Minimal Copying: Avoid unnecessary allocations in hot paths
- Efficient Data Structures: Use appropriate data structures for each use case
- Streaming Operations: Process large inputs in chunks when possible
Execution Optimization
- Built-in Optimization: Built-in commands execute without process spawning
- In-Context Execution: Command substitution uses in-process execution for built-ins
- Efficient Token Processing: Minimal copying during token processing pipeline
- Lazy Evaluation: Expansions are only executed when actually encountered
Caching Strategy
- Alias Expansion: Results are not cached (by design for freshness)
- Variable Lookups: O(1) access with HashMap implementation
- Function Definitions: Stored efficiently in HashMap for fast lookup
- Path Resolution: PATH searching optimized for external commands
External Dependencies
rustyline
Interactive line editing and history with signal handling support
signal-hook
Robust signal handling (SIGINT, SIGTERM)
glob
Pattern matching for case statements and wildcard expansion
clap
Command-line argument parsing with derive macros
lazy_static
Global state management in tab completion