# RHEL Troubleshooting on AWS
Common issues and solutions when using Kryden STIG-aligned RHEL AMIs on AWS.
## SSH Connection Issues

### Cannot Connect via SSH
Symptoms: Connection timeout or connection refused
Possible Causes:
- Security Group - Ensure port 22 is open from your IP
- Network ACL - Check VPC network ACLs allow SSH
- Instance not running - Verify instance state in EC2 console
Solutions:
```bash
# Verify security group allows your IP
aws ec2 describe-security-groups --group-ids sg-xxxxx

# Check instance status
aws ec2 describe-instance-status --instance-ids i-xxxxx
```
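If the security group and instance state look right, a quick TCP probe from your workstation confirms whether port 22 is reachable at all. A minimal sketch using bash's `/dev/tcp` (the `check_port` helper and the example IP are illustrative, not part of any AWS tooling):

```bash
# check_port HOST PORT - prints whether a TCP connection succeeds (hypothetical helper)
check_port() {
  if timeout 3 bash -c "cat < /dev/null > /dev/tcp/$1/$2" 2>/dev/null; then
    echo "reachable"
  else
    echo "unreachable (check security group, NACL, and instance state)"
  fi
}

check_port 203.0.113.10 22   # replace with your instance's public IP
```

If this prints `unreachable` but the instance is running, the block is almost always in the security group or network ACL rather than on the instance itself.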
### Permission Denied (publickey)
Symptoms: `Permission denied (publickey)` error
Possible Causes:
- Wrong SSH key
- Wrong username (must use `ec2-user`)
- Incorrect key permissions
Solutions:
```bash
# Ensure correct permissions on your key
chmod 600 /path/to/your-key.pem

# Connect with verbose output
ssh -v -i /path/to/your-key.pem ec2-user@<ip>
```
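SSH silently ignores private keys that are group- or world-readable. The sketch below reproduces that check against a throwaway file so it is runnable anywhere; substitute your real `.pem` path in practice:

```bash
# Create a throwaway "key" for demonstration; use your real .pem path in practice
key=$(mktemp)
chmod 644 "$key"                 # deliberately too open, like a freshly downloaded key

perms=$(stat -c '%a' "$key")
if [ "$perms" != "600" ] && [ "$perms" != "400" ]; then
  echo "permissions $perms are too open; tightening to 600"
  chmod 600 "$key"
fi

stat -c '%a' "$key"              # now 600
```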
## Boot Issues

### Instance Stuck Waiting for Devices (NVMe/Xen Mismatch)
Symptoms:
- Instance never passes status checks
- System log shows `A start job is running for dev-nvme0n1p2.device`
- System log shows `Timed out waiting for device dev-nvme0n1p2.device`
Cause:
You launched the AMI on a Xen-based instance type (t2, m4, c4, r4). These AMIs require Nitro-based instances (t3, m5, c5, r5, etc.).
The AMI is built with NVMe storage drivers. Nitro instances present storage as /dev/nvme* devices, while Xen instances use /dev/xvd* devices. When launched on Xen, the expected NVMe devices don't exist, causing the boot to hang.
Solution:
1. Terminate the stuck instance
2. Launch a new instance using a Nitro-based instance type:
    - t3.micro, t3.small, t3.medium (burstable)
    - m5.large, m5.xlarge (general purpose)
    - c5.large, c5.xlarge (compute optimized)

See Supported Instance Types for the full list.
#### How to Identify the Instance Type

In the system log, look for `Hypervisor detected`:

| System log line | Meaning |
| --- | --- |
| `Hypervisor detected: Xen HVM` | Wrong instance type (Xen-based) |
| `Hypervisor detected: KVM` | Correct instance type (Nitro-based) |
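You can also sanity-check the instance family name before launching. A rough sketch — the family lists below are an assumption covering only common types, not an exhaustive or authoritative mapping:

```bash
# hypervisor_for FAMILY.SIZE - rough Xen/Nitro classifier (illustrative only;
# covers common families, not the full AWS catalog)
hypervisor_for() {
  case "${1%%.*}" in
    t2|m4|c4|r4|m3|c3|r3)                   echo "xen" ;;
    t3|t3a|m5|m5a|c5|c5n|r5|m6i|c6i|r6i)    echo "nitro" ;;
    *)                                       echo "unknown" ;;
  esac
}

hypervisor_for t2.micro    # xen - would hang with this AMI
hypervisor_for m5.large    # nitro - OK
```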
### Instance Fails to Start
Check System Logs:
- EC2 Console → Select Instance → Actions → Monitor and troubleshoot → Get system log
Common Issues:
- Wrong instance type - See "Instance Stuck Waiting for Devices" above
- Disk full - Check EBS volume size
- Kernel panic - May need to launch from snapshot
### Cloud-init Failures
Check cloud-init logs:
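On RHEL, cloud-init writes to two files under `/var/log` (the paths below are cloud-init's defaults). A sketch that tails whichever of them is present:

```bash
# Show the tail of the standard cloud-init logs, if present
show_cloud_init_logs() {
  for f in /var/log/cloud-init.log /var/log/cloud-init-output.log; do
    if [ -r "$f" ]; then
      echo "== $f =="
      tail -n 20 "$f"
    else
      echo "$f: not readable or absent"
    fi
  done
}

show_cloud_init_logs
# On instances with cloud-init installed, `cloud-init status --long` gives a summary
```

`cloud-init.log` holds cloud-init's own internal log; `cloud-init-output.log` captures the stdout/stderr of your user-data scripts, which is usually where script failures show up.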
## Service Issues

### Service Won't Start
```bash
# Check service status
sudo systemctl status <service>

# View service logs
sudo journalctl -u <service> -n 50

# Check SELinux denials
sudo ausearch -m AVC -ts recent | grep <service>
```
### SELinux Blocking Application
```bash
# Find the denial
sudo ausearch -m AVC -ts recent

# Generate a policy module (if appropriate)
sudo ausearch -m AVC -ts recent | audit2allow -M myapp
sudo semodule -i myapp.pp
```
> **Warning:** Only create custom SELinux policies if you understand the security implications.
## AWS-Specific Issues

### Instance Metadata Service
If applications cannot access instance metadata:
```bash
# Request an IMDSv2 token, then use it to query metadata
TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
curl -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/
```
### EBS Volume Issues
```bash
# Check disk space
df -h

# List block devices
lsblk

# Check for filesystem errors (requires unmount for non-root)
# On Nitro instances, devices are /dev/nvme*
sudo xfs_repair -n /dev/nvme1n1p1
```
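A nearly full volume is easiest to catch by scripting the `df` check above. A minimal sketch — the `check_df` helper and the 90% threshold are choices made here, not a standard tool:

```bash
# check_df [THRESHOLD] - reads `df -P` output on stdin and prints any
# filesystem at or above the threshold (default 90%)
check_df() {
  threshold=${1:-90}
  awk -v t="$threshold" 'NR > 1 {
    use = $5
    sub("%", "", use)
    if (use + 0 >= t) print $6 " is " $5 " full"
  }'
}

df -P | check_df 90
```

Dropping this into cron or a monitoring agent gives early warning before a full root volume starts breaking services.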
#### Device Names on Nitro

On Nitro-based instances, EBS volumes appear as NVMe devices:

- Root volume: `/dev/nvme0n1`
- Additional volumes: `/dev/nvme1n1`, `/dev/nvme2n1`, etc.