Skip to main content
Glama
CODE_REVIEW.md15.1 kB
# Code Review: MCP SSH Session Server **Focus**: Enable AI agents to interact with network devices/servers over SSH in persistent sessions like humans **Review Date**: 2025-01-01 --- ## 🔴 Critical Issues ### 1. Race Condition in Command Submission (High Priority) ✅ FIXED **Status**: Addressed by extending lock scope to cover check and registration. **Location**: `command_executor.py:112-121` **Issue**: The check for existing running commands and registration of new commands is not atomic. ```python # Thread A checks - no command running ✓ with self._lock: for cmd in self._commands.values(): if cmd.session_key == session_key and cmd.status in (RUNNING, AWAITING_INPUT): raise Exception("Command already running") # Thread B checks at the same time - no command running ✓ # Thread A registers command # Thread B registers command - NOW TWO COMMANDS ON SAME SESSION! ❌ ``` **Impact**: Multiple commands could be submitted to the same session simultaneously, causing command pile-up. **Fix**: Extend the lock to cover both check AND registration: ```python with self._lock: # Check for existing commands for cmd in self._commands.values(): if cmd.session_key == session_key and ... raise Exception(...) # Create and register command (atomic) running_cmd = RunningCommand(...) self._commands[command_id] = running_cmd ``` --- ### 2. Background Monitor Accessing Deleted Resources (High Priority) ✅ FIXED **Status**: Addressed by adding `monitoring_cancelled` event and checking it in background threads. **Location**: `command_executor.py:343-441` (timeout monitor) **Issue**: When a session is closed, background monitoring tasks might still be running and trying to access deleted shells/sessions. ```python # Main thread: close_session() -> clears cmd from _commands # Background thread: still running _continue_monitoring_timeout_background() # tries to access cmd.shell -> might crash or access wrong shell ``` **Impact**: Race condition causing crashes or undefined behavior. **Fix**: Add proper shutdown coordination: ```python # In RunningCommand dataclass, add: monitoring_cancelled: threading.Event = field(default_factory=threading.Event) # In background monitor: while not cmd.monitoring_cancelled.is_set(): # ... monitoring loop # In clear_session_commands: cmd.monitoring_cancelled.set() # Signal monitors to stop ``` --- ### 3. Unbounded Output Accumulation in Timeout Monitor (Medium Priority) ✅ FIXED **Status**: Addressed by adding `OutputLimiter` to background monitors. **Location**: `command_executor.py:375` **Issue**: During background monitoring after timeout, output keeps accumulating without limits. ```python cmd.stdout += chunk # No size limit! Could grow to gigabytes ``` **Impact**: Memory exhaustion for long-running commands that produce lots of output. **Fix**: Add OutputLimiter to background monitoring. --- ### 4. Prompt Detection False Positives During Command Echo (Medium Priority) ✅ FIXED **Status**: Addressed by verifying command echo (newline) before prompt detection. **Location**: `session_manager.py:1203-1208` **Issue**: The code tries to avoid false positives by waiting 0.5s after receiving data, but this is fragile. ```python # Command: echo "user@host:~$" # Output immediately includes: user@host:~$ echo "user@host:~$" # ^^^^^^^^^ MATCHES PROMPT PATTERN! # But we haven't seen the actual command output yet! ``` **Impact**: Commands complete prematurely before output is received. **Fix**: Better approach - track whether we've seen a newline after sending the command: ```python seen_command_echo = False while ...: if '\n' in raw_output and not seen_command_echo: seen_command_echo = True continue # Don't check for prompt until after first newline ``` --- ## 🟡 Design Issues ### 5. Exit Code Lost After Timeout (Medium Priority) **Location**: `command_executor.py:196-203` **Issue**: When a command times out, we set `exit_code = None` and lose information. ```python running_cmd.exit_code = None # Clear exit code since it's still running ``` **Problem**: When the command actually completes in background monitoring, we always set `exit_code = 0` (line 400), even if the command failed! **Fix**: Extract actual exit code from the sentinel marker or from $? check. --- ### 6. No Protection Against Prompt Injection (Medium Priority) ✅ FIXED **Status**: Addressed by disabling overly broad prompt generalization pattern. **Location**: `session_manager.py:575-579` **Issue**: The captured prompt is used as a literal string with wildcards, but isn't validated. ```python # Malicious output: jon@host:~$ cd /tmp; rm -rf * # Generalized prompt: jon@host:*$ # This would match: "Deleting everything:*$" <- FALSE POSITIVE! ``` **Impact**: Commands could complete prematurely if output accidentally matches the generalized prompt pattern. **Fix**: Be more conservative with prompt generalization - require @ symbol, specific structure. --- ### 7. Idle Timeout Can Cause Data Loss (Medium Priority) ✅ FIXED **Status**: Addressed by requiring prompt detection for idle timeout completion. **Location**: `session_manager.py:1231-1262` **Issue**: The 2-second idle timeout could cause premature completion for commands with bursty output. **Example**: ```bash # A progress bar that updates every 3 seconds Processing... [=====> ] 50% # Output # 3 second pause Processing... [==========>] 100% # More output ``` The command would complete after the first line due to 2s idle timeout. **Fix**: Only use idle timeout as a completion signal if we also detect a prompt. Otherwise, keep waiting. --- ### 8. Context-Changing Command Detection Incomplete (Low Priority) **Location**: `session_manager.py:1094-1132` **Issue**: Detection only covers common cases, missing many edge cases: - `chroot` - `python`, `node`, `irb` (entering REPLs) - `mysql`, `psql` (database shells) - Network device config modes (`conf t` on Cisco) **Impact**: Prompt detection breaks when entering these contexts. **Fix**: Expand detection or use a more robust approach (try to recapture prompt on any command that changes the shell state). --- ### 9. Enable Mode State Not Validated (Medium Priority) **Location**: `session_manager.py:217-230` **Issue**: Once enable mode is entered, we assume it stays enabled forever. **Problem**: Many commands can drop you out of enable mode: - `exit` or `disable` commands - Timeout (some devices) - Configuration errors **Impact**: Subsequent commands might fail because we're no longer in enable mode. **Fix**: Validate enable mode before each command by checking if prompt ends with `#`. --- ### 10. Session Key Ambiguity (Low Priority) **Location**: `session_manager.py:85` **Issue**: Sessions are keyed by `username@resolved_host:port`, but SSH config aliases might resolve to the same underlying host. **Example**: ``` # ~/.ssh/config Host server1 HostName 192.168.1.100 Host server2 HostName 192.168.1.100 # Same IP! ``` These would create two separate sessions when they could share one. **Impact**: Resource waste, potential confusion. **Fix**: Use resolved hostname+port as key, or make it explicit that aliases create separate sessions. --- ## 🟢 Code Quality Issues ### 11. Inconsistent Error Handling **Location**: Throughout `command_executor.py` **Issue**: Some functions return tuples `(stdout, stderr, exit_code)`, others return strings, others raise exceptions. **Fix**: Standardize on one approach, preferably returning structured result objects. --- ### 12. Magic Numbers Without Constants **Location**: Multiple files **Examples**: - `4096` - buffer size (session_manager.py:418, 509, etc.) - `65535` - large buffer (command_executor.py:314, 371) - `0.3`, `0.5`, `2.0` - various timeouts - `100`, `300` - max commands/timeouts **Fix**: Extract to named constants at class level. --- ### 13. Duplicate Code in Background Monitors **Location**: `command_executor.py:343-494` **Issue**: `_continue_monitoring_timeout_background` and `_continue_monitoring_shell_background` have very similar logic. **Fix**: Extract common monitoring logic to shared method. --- ### 14. Logging Verbosity in Tight Loops **Location**: `command_executor.py:377, 467` **Issue**: Debug logs inside tight polling loops could fill up disk. ```python while time.time() - start_time < max_timeout: logger.debug(f"[TIMEOUT_MONITOR_RECV] Received {len(chunk)} bytes") # Every 0.1s! ``` **Fix**: Use rate-limited logging or only log on significant events. --- ### 15. Shell Command Escaping Vulnerabilities **Location**: `file_manager.py:237` **Issue**: Content escaping for shell commands is done manually and might miss edge cases. ```python escaped_content = content.replace('\\', '\\\\').replace('"', '\"').replace('$', r'\$').replace('`', r'\`') ``` **Problem**: Doesn't handle newlines, null bytes, or other special characters. **Fix**: Use a more robust approach - write to temp file then move, or use base64 encoding: ```python # Base64 approach encoded = base64.b64encode(content.encode()).decode('ascii') cmd = f'echo {encoded} | base64 -d | sudo tee {shlex.quote(remote_path)} > /dev/null' ``` --- ## 🔵 Enhancement Opportunities ### 16. Add Command Timeout Extension **Suggestion**: Allow extending timeout for running commands. **Use Case**: An AI agent realizes a command needs more time and extends the timeout before it expires. **Implementation**: ```python def extend_command_timeout(command_id: str, additional_seconds: int) -> bool: """Extend timeout for a running command.""" ``` --- ### 17. Add Session Health Checks **Suggestion**: Periodically verify session connectivity. **Implementation**: ```python def is_session_healthy(session_key: str) -> bool: """Check if session is still connected and responsive.""" shell = self._session_shells.get(session_key) if not shell: return False try: shell.send('\n') time.sleep(0.1) return shell.recv_ready() or shell.get_transport().is_active() except: return False ``` --- ### 18. Add Command Queuing Per Session **Suggestion**: Instead of rejecting commands when one is running, queue them. **Benefits**: - AI agents don't need to poll to know when to send next command - More natural workflow for multi-step operations **Implementation**: Add command queue per session in `RunningCommand` dataclass. --- ### 19. Add Streaming Output API **Suggestion**: Allow reading command output as it arrives, not just at completion. **Use Case**: AI agent monitors progress of long-running command. **Implementation**: ```python def get_command_output_stream(command_id: str, from_byte: int = 0) -> str: """Get command output from specified byte position.""" ``` --- ### 20. Better Network Device Mode Detection **Suggestion**: For network devices, track current mode (user exec, privileged exec, config mode). **Benefits**: - Better prompt detection - Automatic mode management (auto-enter enable before privileged commands) **Implementation**: Enhance `_session_shell_types` to include mode state. --- ## 📊 Testing Recommendations ### Test Scenarios to Add 1. **Race Condition Tests** - Submit multiple commands to same session from different threads - Close session while command is running - Close session while background monitor is active 2. **Timeout Tests** - Command that times out then completes successfully - Command that times out then fails - Command that times out and never completes 3. **Prompt Detection Tests** - Custom PS1 with ANSI codes - Prompts that change (cd to different directory) - Commands that output fake prompts in their data - Multi-line prompts 4. **Edge Case Tests** - Very large output (>10MB) - Commands with no output - Commands that only output on stderr - Binary output - UTF-8/Unicode in various locales 5. **Network Device Tests** - Enable mode entry and persistence - Config mode entry/exit - Pager handling (more, less) - Commands that change prompt format 6. **Interactive Command Tests** - Password prompts - Yes/no prompts - Pagers (less, more) - sudo with NOPASSWD vs password --- ## 📝 Documentation Improvements ### Missing Documentation 1. **Behavior Guarantees** - What happens when session is closed while command is running? - What happens to background monitors? - Command serialization guarantees 2. **Timeout Behavior** - Initial timeout vs background monitoring timeout - When does a command truly "fail" vs continue running? 3. **Prompt Detection Algorithm** - How prompts are captured and generalized - Failure modes and workarounds - When to recapture prompts 4. **Threading Model** - Which operations are thread-safe? - Locking strategy - Background task lifecycle --- ## 🎯 Priority Recommendations ### Immediate (Before Next Release) 1. Fix race condition in command submission (#1) 2. Fix background monitor resource access (#2) 3. Add output limiting to timeout monitor (#3) 4. Fix prompt detection false positives (#4) ### Short Term (Next Sprint) 5. Preserve and extract actual exit codes (#5) 6. Improve idle timeout logic (#7) 7. Validate enable mode state (#9) 8. Fix shell escaping vulnerabilities (#15) ### Medium Term (Next Quarter) 9. Expand context-changing command detection (#8) 10. Standardize error handling (#11) 11. Extract magic numbers to constants (#12) 12. Deduplicate background monitor code (#13) 13. Add comprehensive test suite ### Long Term (Future Enhancements) 14. Session health checks (#17) 15. Command queuing (#18) 16. Streaming output API (#19) 17. Advanced network device mode tracking (#20) --- ## 🏆 What's Done Well 1. **Persistent Shell Sessions**: Excellent design choice for stateful interactions 2. **Automatic Prompt Detection**: Smart approach to handle various shell types 3. **Enable Mode Automation**: Great for network device management 4. **Sudo Integration**: Well-designed automatic password handling 5. **Output Limiting**: Good protection against memory exhaustion 6. **Background Monitoring**: Innovative solution for timeout handling 7. **File Operations**: Comprehensive SFTP + sudo fallback approach 8. **Logging**: Detailed logging helps debugging --- ## Summary This is a solid foundation for SSH session management for AI agents. The main areas for improvement are: 1. **Thread Safety**: Fix race conditions and resource access issues 2. **Robustness**: Handle edge cases in prompt detection and command completion 3. **State Management**: Better tracking of session state (enable mode, context changes) 4. **Error Handling**: More consistent and predictable error behavior The architecture is sound and the design patterns are appropriate for the use case. With the fixes outlined above, this will be production-ready for AI agent interactions.

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/devnullvoid/mcp-ssh-session'

If you have feedback or need assistance with the MCP directory API, please join our Discord server