Architecture Overview

Modular design with separate lexer, parser, executor, and expansion engines

System Architecture

User Input

Interactive commands, scripts, or command strings

↓

Lexer

  • Token recognition
  • Quote handling
  • Alias expansion
  • Pattern detection
↓

Expansions

  • Brace expansion
  • Variable expansion
  • Command substitution
  • Arithmetic expansion
↓

Parser

  • AST construction
  • Control structures
  • Pipeline construction
  • Redirection parsing
↓

Executor

  • Command execution
  • Pipeline management
  • Redirection handling
  • Error propagation
↓

Output

Command results, exit codes, and shell state

Shell State

  • Variable management
  • Environment integration
  • Directory stack
  • Alias management

Built-ins

  • 32 built-in commands
  • Optimized execution
  • No process spawning
  • Direct state access
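
One way to picture "no process spawning" and "direct state access" is a dispatch table that maps a command name to an ordinary function taking mutable shell state. The sketch below is only an illustration: `ShellState`, `Builtin`, `setvar`, `pushd`, and `run_builtin` are hypothetical names, not the project's actual API.

```rust
use std::collections::HashMap;

/// Hypothetical, pared-down shell state used only for this illustration.
#[derive(Default)]
pub struct ShellState {
    pub vars: HashMap<String, String>,
    pub dir_stack: Vec<String>,
}

/// A built-in receives mutable state plus its arguments and returns an exit code.
type Builtin = fn(&mut ShellState, &[&str]) -> i32;

fn builtin_setvar(state: &mut ShellState, args: &[&str]) -> i32 {
    match args {
        [name, value] => {
            state.vars.insert(name.to_string(), value.to_string());
            0
        }
        _ => 2, // usage error
    }
}

fn builtin_pushd(state: &mut ShellState, args: &[&str]) -> i32 {
    match args.first() {
        Some(dir) => {
            state.dir_stack.push(dir.to_string());
            0
        }
        None => 1,
    }
}

/// Build the name -> function table (two entries here, for brevity).
pub fn builtin_table() -> HashMap<&'static str, Builtin> {
    let mut table: HashMap<&'static str, Builtin> = HashMap::new();
    table.insert("setvar", builtin_setvar);
    table.insert("pushd", builtin_pushd);
    table
}

/// Returns Some(exit_code) when `name` is a built-in, or None so that the
/// caller can fall back to spawning an external process.
pub fn run_builtin(state: &mut ShellState, name: &str, args: &[&str]) -> Option<i32> {
    builtin_table().get(name).map(|f| f(state, args))
}
```

Because the built-in is a plain function call, any change it makes to the state is visible to the next command without inter-process communication.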

Core Components

Lexer (src/lexer/)

Module Structure

  • Main Module (mod.rs): Token recognition, quote handling, variable detection, alias expansion, command substitution preservation, here-document tokenization, and FD redirection parsing
  • Token Types (token.rs): Token enum definitions and helper methods
  • Test Organization (tests/): 7 focused test modules covering basic tokenization, alias expansion, quote handling, expansion patterns, redirection operators, tilde expansion, and edge cases

Key Features

  • Modular organization with clear separation of concerns
  • Context-aware tokenization
  • Efficient pattern matching
  • Comprehensive error reporting
  • Performance-optimized parsing
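
As a rough illustration of token recognition with quote handling, the following sketch splits a command line into words, pipes, and output redirections. It is not the project's lexer: the `Tok` enum and `tokenize` function are hypothetical, both quote styles are treated as plain grouping, and alias expansion, variables, and here-documents are left out.

```rust
/// Hypothetical token type; the real `Token` enum lives in token.rs.
#[derive(Debug, PartialEq, Clone)]
pub enum Tok {
    Word(String),
    Pipe,
    RedirOut,
}

/// Split a command line into tokens, keeping quoted text inside one word.
pub fn tokenize(input: &str) -> Result<Vec<Tok>, String> {
    let mut tokens = Vec::new();
    let mut word = String::new();
    let mut in_word = false; // true once a word has started, even an empty ""
    let mut chars = input.chars();

    while let Some(c) = chars.next() {
        match c {
            '\'' | '"' => {
                in_word = true;
                let mut closed = false;
                // Consume everything up to the matching quote character.
                for q in chars.by_ref() {
                    if q == c {
                        closed = true;
                        break;
                    }
                    word.push(q);
                }
                if !closed {
                    return Err(format!("unterminated {} quote", c));
                }
            }
            ' ' | '\t' | '|' | '>' => {
                // A separator or operator ends the current word, if any.
                if in_word {
                    tokens.push(Tok::Word(std::mem::take(&mut word)));
                    in_word = false;
                }
                match c {
                    '|' => tokens.push(Tok::Pipe),
                    '>' => tokens.push(Tok::RedirOut),
                    _ => {}
                }
            }
            _ => {
                in_word = true;
                word.push(c);
            }
        }
    }
    if in_word {
        tokens.push(Tok::Word(word));
    }
    Ok(tokens)
}
```

With this sketch, `echo 'a b' | wc` yields four tokens, and the quoted `a b` stays a single `Word`.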

Parser (src/parser/)

Module Structure

  • Main Module (mod.rs): AST construction, pipeline building, redirection parsing, subshell parsing, and FD redirection AST nodes
  • AST Definitions (ast.rs): Complete AST node type definitions for commands, pipelines, and control flow structures
  • Control Flow (control_flow.rs): Specialized parsers for if/elif/else, case, for, while, until, and function definitions
  • Test Organization (tests/): 6 focused test modules covering basic parsing, control flow, compound commands, pipelines, operators, and redirections

Key Features

  • Modular organization with specialized submodules
  • Recursive descent parsing
  • Comprehensive error recovery
  • Context-aware structure building
  • Memory-efficient AST representation
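
Recursive descent means one function per grammar rule, each calling the rules beneath it. The sketch below assumes a tiny grammar (`list := pipeline ("&&" pipeline)*`, `pipeline := command ("|" command)*`, `command := WORD+`) and works on pre-split string tokens. The `Ast` and `Parser` types are hypothetical stand-ins for the definitions in ast.rs.

```rust
/// Hypothetical AST for illustration only.
#[derive(Debug, PartialEq)]
pub enum Ast {
    Command(Vec<String>),
    Pipeline(Vec<Ast>),
    And(Box<Ast>, Box<Ast>),
}

pub struct Parser<'a> {
    tokens: &'a [&'a str],
    pos: usize,
}

impl<'a> Parser<'a> {
    pub fn new(tokens: &'a [&'a str]) -> Self {
        Parser { tokens, pos: 0 }
    }

    fn peek(&self) -> Option<&'a str> {
        self.tokens.get(self.pos).copied()
    }

    /// list := pipeline ( "&&" pipeline )*
    pub fn parse_list(&mut self) -> Result<Ast, String> {
        let mut left = self.parse_pipeline()?;
        while self.peek() == Some("&&") {
            self.pos += 1;
            let right = self.parse_pipeline()?;
            left = Ast::And(Box::new(left), Box::new(right));
        }
        Ok(left)
    }

    /// pipeline := command ( "|" command )*
    fn parse_pipeline(&mut self) -> Result<Ast, String> {
        let mut stages = vec![self.parse_command()?];
        while self.peek() == Some("|") {
            self.pos += 1;
            stages.push(self.parse_command()?);
        }
        if stages.len() == 1 {
            Ok(stages.remove(0))
        } else {
            Ok(Ast::Pipeline(stages))
        }
    }

    /// command := WORD+
    fn parse_command(&mut self) -> Result<Ast, String> {
        let mut words = Vec::new();
        while let Some(tok) = self.peek() {
            if tok == "|" || tok == "&&" {
                break;
            }
            words.push(tok.to_string());
            self.pos += 1;
        }
        if words.is_empty() {
            Err(format!("expected a command at token {}", self.pos))
        } else {
            Ok(Ast::Command(words))
        }
    }
}
```

Because `parse_list` calls `parse_pipeline`, `|` binds tighter than `&&`, so operator precedence follows directly from the call structure.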

Executor (src/executor/)

Module Structure

  • Main Module (mod.rs): Command execution engine and error propagation
  • Expansion Engine (expansion.rs): Variable expansion, wildcard expansion, and command substitution
  • Redirection Handler (redirection.rs): FD table management, redirection handling, and here-document processing
  • Command Executor (command.rs): Single command execution, pipeline management, and built-in integration
  • Subshell Handler (subshell.rs): Subshell execution with state isolation, compound commands, and trap inheritance
  • Test Organization (tests/): 5 focused test modules covering execution, expansion, redirection, commands, and subshells

Key Features

  • Highly modular architecture with focused submodules
  • Efficient process management
  • Robust error handling
  • Streaming I/O operations
  • Cross-platform compatibility

Shell State (src/state/)

Module Structure

  • Main Module (mod.rs): Variable scoping, environment integration, function context, directory stack, alias management, and loop control state
  • FD Table (fd_table.rs): File descriptor table with save/restore capabilities and FD operations
  • Shell Options (options.rs): Shell option flags (errexit, nounset, xtrace, etc.) and option display
  • Signal Handling (signals.rs): Trap management, signal normalization, and trap display
  • Job Control (jobs.rs): Job table management, background job tracking, job state transitions, and jobspec resolution
  • Test Organization (tests/): 5 focused test modules covering state management, variable scoping, FD operations, shell options, and job control

Key Features

  • Well-organized modular structure
  • Thread-safe state management
  • Efficient variable lookups
  • Proper scoping rules
  • Environment synchronization
  • Comprehensive job control with smart jobspec matching
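
Variable scoping with function context is commonly modeled as a stack of frames, where lookups walk from the innermost frame outward. The following is a minimal sketch under that assumption; `Scopes` and its methods are hypothetical and do not reflect the actual layout of the state module.

```rust
use std::collections::HashMap;

/// A stack of variable frames; `frames[0]` is the global scope.
pub struct Scopes {
    frames: Vec<HashMap<String, String>>,
}

impl Scopes {
    pub fn new() -> Self {
        Scopes { frames: vec![HashMap::new()] }
    }

    /// Enter a function body: locals set from now on shadow outer variables.
    pub fn push_scope(&mut self) {
        self.frames.push(HashMap::new());
    }

    /// Leave a function body; the global frame is never popped.
    pub fn pop_scope(&mut self) {
        if self.frames.len() > 1 {
            self.frames.pop();
        }
    }

    /// Set a variable in the innermost frame.
    pub fn set_local(&mut self, name: &str, value: &str) {
        if let Some(frame) = self.frames.last_mut() {
            frame.insert(name.to_string(), value.to_string());
        }
    }

    /// Look a variable up from the innermost frame outward.
    pub fn get(&self, name: &str) -> Option<&str> {
        self.frames
            .iter()
            .rev()
            .find_map(|frame| frame.get(name))
            .map(String::as_str)
    }
}
```

Note that `get` takes `&self` while the mutating operations take `&mut self`, so the borrow checker separates readers from writers.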

Expansion Engines

Arithmetic Engine

src/arithmetic.rs

Features

  • Shunting-yard Algorithm: Proper operator precedence and associativity
  • Token-based Parsing: Converts infix expressions to Reverse Polish Notation
  • Variable Integration: Seamlessly accesses shell variables during evaluation
  • Comprehensive Operators: Arithmetic, comparison, bitwise, and logical operations
  • Error Handling: Robust error reporting for syntax errors and division by zero
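
To make the infix-to-RPN step concrete, here is a compact shunting-yard sketch followed by an RPN evaluator. It rests on simplifying assumptions that are not taken from src/arithmetic.rs: tokens must be separated by whitespace, only `+ - * / % **` and parentheses are handled, there is no unary minus or overflow checking, and variables come from a hypothetical `HashMap<String, i64>`.

```rust
use std::collections::HashMap;

/// Returns (precedence, is_right_associative), or None for non-operators.
fn op_info(tok: &str) -> Option<(u8, bool)> {
    match tok {
        "**" => Some((3, true)),
        "*" | "/" | "%" => Some((2, false)),
        "+" | "-" => Some((1, false)),
        _ => None,
    }
}

/// Shunting-yard: reorder whitespace-separated infix tokens into RPN.
pub fn to_rpn(expr: &str) -> Result<Vec<&str>, String> {
    let mut output = Vec::new();
    let mut stack: Vec<&str> = Vec::new();
    for tok in expr.split_whitespace() {
        if let Some((prec, right_assoc)) = op_info(tok) {
            // Pop operators that bind tighter, or equally if left-associative.
            while let Some((top_prec, _)) = stack.last().and_then(|top| op_info(top)) {
                if top_prec > prec || (top_prec == prec && !right_assoc) {
                    output.push(stack.pop().unwrap());
                } else {
                    break;
                }
            }
            stack.push(tok);
        } else if tok == "(" {
            stack.push(tok);
        } else if tok == ")" {
            loop {
                match stack.pop() {
                    Some("(") => break,
                    Some(op) => output.push(op),
                    None => return Err("unbalanced ')'".to_string()),
                }
            }
        } else {
            output.push(tok); // operand: number or variable name
        }
    }
    while let Some(op) = stack.pop() {
        if op == "(" {
            return Err("unbalanced '('".to_string());
        }
        output.push(op);
    }
    Ok(output)
}

/// Evaluate RPN with a value stack, resolving names through `vars`.
pub fn eval_rpn(rpn: &[&str], vars: &HashMap<String, i64>) -> Result<i64, String> {
    let mut stack: Vec<i64> = Vec::new();
    for &tok in rpn {
        if op_info(tok).is_none() {
            let value = match tok.parse::<i64>() {
                Ok(n) => n,
                Err(_) => *vars.get(tok).ok_or(format!("unknown operand '{tok}'"))?,
            };
            stack.push(value);
            continue;
        }
        let b = stack.pop().ok_or("missing operand")?;
        let a = stack.pop().ok_or("missing operand")?;
        let value = match tok {
            "+" => a + b,
            "-" => a - b,
            "*" => a * b,
            "/" | "%" if b == 0 => return Err("division by zero".to_string()),
            "/" => a / b,
            "%" => a % b,
            _ => a.pow(u32::try_from(b).map_err(|_| "negative exponent")?), // "**"
        };
        stack.push(value);
    }
    match stack.as_slice() {
        [value] => Ok(*value),
        _ => Err("malformed expression".to_string()),
    }
}

pub fn eval(expr: &str, vars: &HashMap<String, i64>) -> Result<i64, String> {
    eval_rpn(&to_rpn(expr)?, vars)
}
```

For example, `2 + 3 * 4` becomes the RPN sequence `2 3 4 * +`, and the right-associative `2 ** 3 ** 2` is evaluated as `2 ** (3 ** 2)`.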

Supported Operations

Arithmetic
+ - * / % **
Comparison
== != < <= > >=
Bitwise
& | ^ << >> ~
Logical
&& || !

Parameter Expansion Engine

src/parameter_expansion.rs

Features

  • Modifier Parsing: Parses the POSIX sh parameter expansion modifiers, plus bash-style extensions such as substring extraction, substitution, and indirect expansion
  • String Operations: Default values, substring extraction, pattern removal, and substitution
  • Indirect Expansion: Dynamic variable access (${!name}) and variable name listing (${!prefix*})
  • Error Handling: Robust error reporting for malformed expressions and edge cases
  • Performance: Efficient string manipulation with minimal memory allocation

Supported Modifiers

Defaults
${VAR:-default} ${VAR:=default} ${VAR:+replacement} ${VAR:?error}
Substrings
${VAR:offset} ${VAR:offset:length}
Patterns
${VAR#pattern} ${VAR##pattern} ${VAR%pattern} ${VAR%%pattern}
Substitution
${VAR/pattern/replacement} ${VAR//pattern/replacement}
Indirect (bash extension)
${!name} ${!prefix*} ${!prefix@}
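
The modifiers above can be handled by splitting the text inside `${...}` into a variable name and a modifier, then dispatching on the modifier's prefix. The sketch below covers only a subset, under stated assumptions: `expand_param` is a hypothetical function, `#` and `%` match literal text rather than glob patterns (so `##` and `%%` behave the same as the single forms), offsets are non-negative character counts, and `:=`, substitution, and indirect expansion are omitted.

```rust
use std::collections::HashMap;

/// Expand the text between `${` and `}` against `vars`.
pub fn expand_param(body: &str, vars: &HashMap<String, String>) -> Result<String, String> {
    // The variable name is the leading run of [A-Za-z0-9_].
    let name_len = body
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(body.len());
    let (name, modifier) = body.split_at(name_len);
    if name.is_empty() {
        return Err(format!("bad substitution: ${{{body}}}"));
    }
    let current = vars.get(name).map(String::as_str).unwrap_or("");
    let is_set = !current.is_empty(); // the ':' forms treat empty like unset

    if modifier.is_empty() {
        return Ok(current.to_string());
    }
    if let Some(default) = modifier.strip_prefix(":-") {
        return Ok(if is_set { current } else { default }.to_string());
    }
    if let Some(alternative) = modifier.strip_prefix(":+") {
        return Ok(if is_set { alternative } else { "" }.to_string());
    }
    if let Some(message) = modifier.strip_prefix(":?") {
        return if is_set {
            Ok(current.to_string())
        } else {
            Err(format!("{name}: {message}"))
        };
    }
    // Literal prefix removal; `##` and `#` coincide for literal text.
    if let Some(prefix) = modifier.strip_prefix("##").or_else(|| modifier.strip_prefix('#')) {
        return Ok(current.strip_prefix(prefix).unwrap_or(current).to_string());
    }
    // Literal suffix removal; `%%` and `%` coincide for literal text.
    if let Some(suffix) = modifier.strip_prefix("%%").or_else(|| modifier.strip_prefix('%')) {
        return Ok(current.strip_suffix(suffix).unwrap_or(current).to_string());
    }
    if let Some(spec) = modifier.strip_prefix(':') {
        // ${VAR:offset} or ${VAR:offset:length}, counted in characters.
        let mut parts = spec.splitn(2, ':');
        let offset = parts
            .next()
            .unwrap_or("")
            .parse::<usize>()
            .map_err(|_| format!("bad offset in ${{{body}}}"))?;
        let rest = current.chars().skip(offset);
        return match parts.next() {
            Some(len) => {
                let len = len
                    .parse::<usize>()
                    .map_err(|_| format!("bad length in ${{{body}}}"))?;
                Ok(rest.take(len).collect())
            }
            None => Ok(rest.collect()),
        };
    }
    Err(format!("unsupported modifier in ${{{body}}}"))
}
```

The order of the checks matters: the two-character forms (`:-`, `:+`, `:?`) must be tested before the bare `:` that introduces a substring offset.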

Brace Expansion Engine

src/brace_expansion.rs

Features

  • Pattern Detection: Identifies brace patterns during lexing with nested brace support
  • Comma-Separated Lists: Expands {a,b,c} into multiple alternatives
  • Range Expansion: Numeric ({1..10}) and alphabetic ({a..z}) range generation
  • Prefix/Suffix Handling: Combines braces with surrounding text for complex patterns
  • Error Handling: Graceful handling of malformed patterns with clear error messages

Expansion Types

Lists
{a,b,c} → a b c
Ranges
{1..5} → 1 2 3 4 5
Nested
{{a,b},{c,d}} → a b c d
Complex
file{1,2}.txt → file1.txt file2.txt
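
The expansion types above can be reproduced with a short recursive routine: find the first balanced brace group, split its body on top-level commas (or treat it as a range), and recurse on each `prefix + alternative + suffix`. This is a sketch under assumptions, not the code in src/brace_expansion.rs; all function names are hypothetical, and step values such as `{1..10..2}` are not handled.

```rust
/// Expand the first brace group in `word`, then recurse on each result.
pub fn expand_braces(word: &str) -> Vec<String> {
    let Some((open, close)) = find_group(word) else {
        return vec![word.to_string()];
    };
    let (prefix, body, suffix) = (&word[..open], &word[open + 1..close], &word[close + 1..]);
    let alternatives = match split_top_level(body) {
        parts if parts.len() > 1 => parts,
        _ => match expand_range(body) {
            Some(range) => range,
            None => {
                // Neither a list nor a range: keep these braces literally.
                let mut out = Vec::new();
                for b in expand_braces(body) {
                    for s in expand_braces(suffix) {
                        out.push(format!("{prefix}{{{b}}}{s}"));
                    }
                }
                return out;
            }
        },
    };
    alternatives
        .iter()
        .flat_map(|alt| expand_braces(&format!("{prefix}{alt}{suffix}")))
        .collect()
}

/// Byte offsets of the first `{` that has a matching `}`.
fn find_group(word: &str) -> Option<(usize, usize)> {
    for (open, _) in word.match_indices('{') {
        let mut depth = 0usize;
        for (i, c) in word[open..].char_indices() {
            match c {
                '{' => depth += 1,
                '}' => {
                    depth -= 1;
                    if depth == 0 {
                        return Some((open, open + i));
                    }
                }
                _ => {}
            }
        }
    }
    None
}

/// Split on commas that are not inside a nested brace group.
fn split_top_level(body: &str) -> Vec<String> {
    let (mut parts, mut depth, mut current) = (Vec::new(), 0usize, String::new());
    for c in body.chars() {
        match c {
            ',' if depth == 0 => parts.push(std::mem::take(&mut current)),
            '{' => { depth += 1; current.push(c); }
            '}' => { depth = depth.saturating_sub(1); current.push(c); }
            _ => current.push(c),
        }
    }
    parts.push(current);
    parts
}

/// `1..5` or `a..e`, ascending or descending; None if `body` is not a range.
fn expand_range(body: &str) -> Option<Vec<String>> {
    let (start, end) = body.split_once("..")?;
    if let (Ok(a), Ok(b)) = (start.parse::<i64>(), end.parse::<i64>()) {
        let items: Vec<String> = if a <= b {
            (a..=b).map(|n| n.to_string()).collect()
        } else {
            (b..=a).rev().map(|n| n.to_string()).collect()
        };
        return Some(items);
    }
    let (mut s, mut e) = (start.chars(), end.chars());
    match (s.next(), s.next(), e.next(), e.next()) {
        (Some(a), None, Some(b), None) if a.is_ascii_alphabetic() && b.is_ascii_alphabetic() => {
            let (a, b) = (a as u8, b as u8);
            let letter = |c: u8| (c as char).to_string();
            Some(if a <= b {
                (a..=b).map(letter).collect()
            } else {
                (b..=a).rev().map(letter).collect()
            })
        }
        _ => None,
    }
}
```

Nesting falls out of the recursion: `{{a,b},{c,d}}` first splits into `{a,b}` and `{c,d}`, and each of those is expanded again by the recursive call.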

Module Dependencies

Primary Flow

lexer → parser → executor

Main command processing pipeline

State Management

state ↔ All modules

Shared state management across all components

Expansion Integration

expansion engines → executor

Runtime expansion during command execution

Built-in Commands

builtins → executor

Optimized command implementation

Design Principles

Single Responsibility

Each module has a clearly defined purpose with minimal cross-cutting concerns

  • Lexer: Only handles tokenization and lexical analysis
  • Parser: Only handles AST construction and parsing
  • Executor: Only handles command execution and evaluation
  • State: Only manages shell state and variables

Immutable by Default

Functions prefer immutable references to prevent accidental state mutation

// Functions prefer immutable references
pub fn get_var(&self, name: &str) -> Option<String> {
    // Implementation avoids mutation
}

Explicit Error Propagation

Clear Result types for error handling throughout the codebase

// Clear Result types for error handling
pub fn lex(input: &str, shell_state: &ShellState) -> Result<Vec<Token>, String>
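
When every stage returns a `Result`, the `?` operator lets the first failure propagate to the caller without nested matching. The sketch below shows that pattern with simplified, hypothetical stage functions; their signatures and error messages are illustrative and are not the project's real `lex`, parser, or executor.

```rust
/// Hypothetical lexing stage: rejects an odd number of double quotes.
fn lex_stage(input: &str) -> Result<Vec<String>, String> {
    if input.matches('"').count() % 2 != 0 {
        return Err("lex error: unterminated double quote".to_string());
    }
    Ok(input.split_whitespace().map(str::to_string).collect())
}

/// Hypothetical parsing stage: a command needs at least one word.
fn parse_stage(tokens: Vec<String>) -> Result<Vec<String>, String> {
    if tokens.is_empty() {
        Err("parse error: empty command".to_string())
    } else {
        Ok(tokens)
    }
}

/// Hypothetical execution stage: knows only `true` and `false`.
fn execute_stage(argv: &[String]) -> Result<i32, String> {
    match argv[0].as_str() {
        "true" => Ok(0),
        "false" => Ok(1),
        other => Err(format!("{other}: command not found")),
    }
}

/// Each `?` returns early with that stage's error, so the caller always
/// sees the first failure and later stages never run on bad input.
pub fn run_line(input: &str) -> Result<i32, String> {
    let tokens = lex_stage(input)?;
    let argv = parse_stage(tokens)?;
    execute_stage(&argv)
}
```

A non-zero exit code such as the one from `false` is still an `Ok` value here; `Err` is reserved for failures of the shell's own processing.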

Comprehensive Testing

499+ test functions ensuring reliability and correctness across all components

  • Unit tests for individual components
  • Integration tests for end-to-end workflows
  • Error handling and edge case coverage
  • Performance regression testing

Performance Considerations

Memory Management

  • String Reuse: Reuse String instances where possible to reduce allocations
  • Minimal Copying: Avoid unnecessary allocations in hot paths
  • Efficient Data Structures: Use appropriate data structures for each use case
  • Streaming Operations: Process large inputs in chunks when possible

Execution Optimization

  • Built-in Optimization: Built-in commands execute without process spawning
  • In-Context Execution: Command substitution uses in-process execution for built-ins
  • Efficient Token Processing: Minimal copying during token processing pipeline
  • Lazy Evaluation: Expansions are only executed when actually encountered

Caching Strategy

  • Alias Expansion: Results are not cached (by design for freshness)
  • Variable Lookups: Average-case O(1) access via a HashMap
  • Function Definitions: Stored efficiently in HashMap for fast lookup
  • Path Resolution: PATH searching optimized for external commands

External Dependencies

rustyline
Interactive line editing and history with signal handling support
signal-hook
Robust signal handling (SIGINT, SIGTERM)
glob
Pattern matching for case statements and wildcard expansion
clap
Command-line argument parsing with derive macros
lazy_static
Global state management in tab completion