# RHEL Troubleshooting on AWS
Common issues and solutions when using Kryden STIG-aligned RHEL AMIs on AWS.
## SSH Connection Issues

### Cannot Connect via SSH
Symptoms: Connection timeout or connection refused
Possible Causes:
- Security Group - Ensure port 22 is open from your IP
- Network ACL - Check VPC network ACLs allow SSH
- Instance not running - Verify instance state in EC2 console
Solutions:
```bash
# Verify security group allows your IP
aws ec2 describe-security-groups --group-ids sg-xxxxx

# Check instance status
aws ec2 describe-instance-status --instance-ids i-xxxxx
```
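If the security group and instance state look right, a quick TCP probe from your workstation confirms whether port 22 is reachable at all. A minimal sketch using bash's `/dev/tcp` (the `check_port` helper and the example IP are illustrative, not part of any AWS tooling):

```bash
# check_port HOST PORT - prints whether a TCP connection succeeds (hypothetical helper)
check_port() {
  if timeout 3 bash -c "cat < /dev/null > /dev/tcp/$1/$2" 2>/dev/null; then
    echo "reachable"
  else
    echo "unreachable (check security group, NACL, and instance state)"
  fi
}

check_port 203.0.113.10 22   # replace with your instance's public IP
```

If this prints `unreachable` but the instance is running, the block is almost always in the security group or network ACL rather than on the instance itself.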
### Permission Denied (publickey)
Symptoms: `Permission denied (publickey)` error
Possible Causes:
- Wrong SSH key
- Wrong username (must use `ec2-user`)
- Incorrect key permissions
Solutions:
```bash
# Ensure correct permissions on your key
chmod 600 /path/to/your-key.pem

# Connect with verbose output
ssh -v -i /path/to/your-key.pem ec2-user@<ip>
```
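SSH silently ignores private keys that are group- or world-readable. The sketch below reproduces that check against a throwaway file so it is runnable anywhere; substitute your real `.pem` path in practice:

```bash
# Create a throwaway "key" for demonstration; use your real .pem path in practice
key=$(mktemp)
chmod 644 "$key"                 # deliberately too open, like a freshly downloaded key

perms=$(stat -c '%a' "$key")
if [ "$perms" != "600" ] && [ "$perms" != "400" ]; then
  echo "permissions $perms are too open; tightening to 600"
  chmod 600 "$key"
fi

stat -c '%a' "$key"              # now 600
```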
## Boot Issues

### Instance Stuck Waiting for Devices (NVMe/Xen Mismatch)
Symptoms:
- Instance never passes status checks
- System log shows `A start job is running for dev-nvme0n1p2.device`
- System log shows `Timed out waiting for device dev-nvme0n1p2.device`
Cause:
You launched the AMI on a Xen-based instance type (t2, m4, c4, r4). These AMIs require Nitro-based instances (t3, m5, c5, r5, etc.).
The AMI is built with NVMe storage drivers. Nitro instances present storage as /dev/nvme* devices, while Xen instances use /dev/xvd* devices. When launched on Xen, the expected NVMe devices don't exist, causing the boot to hang.
Solution:
1. Terminate the stuck instance
2. Launch a new instance using a Nitro-based instance type:
    - t3.micro, t3.small, t3.medium (burstable)
    - m5.large, m5.xlarge (general purpose)
    - c5.large, c5.xlarge (compute optimized)

See Supported Instance Types for the full list.
#### How to Identify the Instance Type

In the system log, look for `Hypervisor detected`:

| System log line | Meaning |
| --- | --- |
| `Hypervisor detected: Xen HVM` | Wrong instance type (Xen-based) |
| `Hypervisor detected: KVM` | Correct instance type (Nitro-based) |
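You can also sanity-check the instance family name before launching. A rough sketch — the family lists below are an assumption covering only common types, not an exhaustive or authoritative mapping:

```bash
# hypervisor_for FAMILY.SIZE - rough Xen/Nitro classifier (illustrative only;
# covers common families, not the full AWS catalog)
hypervisor_for() {
  case "${1%%.*}" in
    t2|m4|c4|r4|m3|c3|r3)                   echo "xen" ;;
    t3|t3a|m5|m5a|c5|c5n|r5|m6i|c6i|r6i)    echo "nitro" ;;
    *)                                       echo "unknown" ;;
  esac
}

hypervisor_for t2.micro    # xen - would hang with this AMI
hypervisor_for m5.large    # nitro - OK
```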
### Instance Fails to Start
Check System Logs:
- EC2 Console → Select Instance → Actions → Monitor and troubleshoot → Get system log
Common Issues:
- Wrong instance type - See "Instance Stuck Waiting for Devices" above
- Disk full - Check EBS volume size
- Kernel panic - May need to launch from snapshot
### Cloud-init Failures
Check cloud-init logs:
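On RHEL, cloud-init writes to two files under `/var/log` (the paths below are cloud-init's defaults). A sketch that tails whichever of them is present:

```bash
# Show the tail of the standard cloud-init logs, if present
show_cloud_init_logs() {
  for f in /var/log/cloud-init.log /var/log/cloud-init-output.log; do
    if [ -r "$f" ]; then
      echo "== $f =="
      tail -n 20 "$f"
    else
      echo "$f: not readable or absent"
    fi
  done
}

show_cloud_init_logs
# On instances with cloud-init installed, `cloud-init status --long` gives a summary
```

`cloud-init.log` holds cloud-init's own internal log; `cloud-init-output.log` captures the stdout/stderr of your user-data scripts, which is usually where script failures show up.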
## Service Issues

### Service Won't Start
```bash
# Check service status
sudo systemctl status <service>

# View service logs
sudo journalctl -u <service> -n 50

# Check SELinux denials
sudo ausearch -m AVC -ts recent | grep <service>
```
### SELinux Blocking Application
```bash
# Find the denial
sudo ausearch -m AVC -ts recent

# Generate a policy module (if appropriate)
sudo ausearch -m AVC -ts recent | audit2allow -M myapp
sudo semodule -i myapp.pp
```
> **Warning:** Only create custom SELinux policies if you understand the security implications.
## AWS-Specific Issues

### Instance Metadata Service
If applications cannot access instance metadata:
```bash
# Request an IMDSv2 token, then use it to query metadata
TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
curl -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/
```
### EBS Volume Issues
```bash
# Check disk space
df -h

# List block devices
lsblk

# Check for filesystem errors (requires unmount for non-root)
# On Nitro instances, devices are /dev/nvme*
sudo xfs_repair -n /dev/nvme1n1p1
```
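A nearly full volume is easiest to catch by scripting the `df` check above. A minimal sketch — the `check_df` helper and the 90% threshold are choices made here, not a standard tool:

```bash
# check_df [THRESHOLD] - reads `df -P` output on stdin and prints any
# filesystem at or above the threshold (default 90%)
check_df() {
  threshold=${1:-90}
  awk -v t="$threshold" 'NR > 1 {
    use = $5
    sub("%", "", use)
    if (use + 0 >= t) print $6 " is " $5 " full"
  }'
}

df -P | check_df 90
```

Dropping this into cron or a monitoring agent gives early warning before a full root volume starts breaking services.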
#### Device Names on Nitro

On Nitro-based instances, EBS volumes appear as NVMe devices:

- Root volume: `/dev/nvme0n1`
- Additional volumes: `/dev/nvme1n1`, `/dev/nvme2n1`, etc.