RHEL Troubleshooting on AWS

Common issues and solutions when using Kryden STIG-aligned RHEL AMIs on AWS.

SSH Connection Issues

Cannot Connect via SSH

Symptoms: The connection times out or is refused

Possible Causes:

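  1. Security group - The inbound rules must allow TCP port 22 from your IP address
  2. Network ACL - The subnet's network ACL must allow inbound SSH and, because network ACLs are stateless, the return traffic on ephemeral ports
  3. Instance not running - Verify the instance state in the EC2 console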

Solutions:

# Verify security group allows your IP
aws ec2 describe-security-groups --group-ids sg-xxxxx

# Check instance status
aws ec2 describe-instance-status --instance-ids i-xxxxx
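
A timeout and a refusal usually point to different layers. Security groups and network ACLs drop blocked packets silently, so the client tends to time out, while a refusal generally means the packet reached a host but nothing accepted it on port 22. The sketch below tells the two apart from your workstation. The function names classify_probe_status and probe_ssh_port are illustrative, and it assumes bash and the coreutils timeout command are installed locally.

```shell
# Map the exit status of the probe to a rough diagnosis.
classify_probe_status() {
  case "$1" in
    0)   echo "open" ;;
    124) echo "timeout" ;;   # coreutils timeout exits with 124 when the limit is hit
    *)   echo "refused-or-unreachable" ;;
  esac
}

# Try a TCP connection to <host> on port 22 (or the port given as $2),
# waiting at most 5 seconds. Uses bash's /dev/tcp pseudo-device.
probe_ssh_port() {
  status=0
  timeout 5 bash -c 'exec 3<>/dev/tcp/$1/$2' _ "$1" "${2:-22}" 2>/dev/null || status=$?
  classify_probe_status "$status"
}

# Usage (replace <ip> with the instance's public IP or DNS name):
#   probe_ssh_port <ip>
```

A result of "timeout" suggests starting with the security group, network ACL, and instance state; "refused-or-unreachable" suggests checking the instance itself.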

Permission Denied (publickey)

Symptoms: Permission denied (publickey) error

Possible Causes:

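  1. Wrong SSH key (not the private key for the key pair selected at launch)
  2. Wrong username (must use ec2-user)
  3. Key file permissions too open (ssh refuses private keys that are readable by group or others)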

Solutions:

# Ensure correct permissions on your key
chmod 600 /path/to/your-key.pem

# Connect with verbose output
ssh -v -i /path/to/your-key.pem ec2-user@<ip>
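
To check the key file before connecting, a small helper can report its mode. This is a minimal sketch: check_key_perms is a hypothetical name, it assumes GNU stat (as found on Linux), and it conservatively accepts only modes 400 and 600.

```shell
# Succeed only if the private key has mode 400 or 600.
# Assumes GNU stat; the -c option is not available on BSD/macOS stat.
check_key_perms() {
  mode=$(stat -c '%a' "$1") || return 2
  case "$mode" in
    400|600) echo "ok: $1 has mode $mode" ;;
    *)       echo "too open: $1 has mode $mode, run: chmod 600 $1"
             return 1 ;;
  esac
}

# Usage:
#   check_key_perms /path/to/your-key.pem
```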

Boot Issues

Instance Stuck Waiting for Devices (NVMe/Xen Mismatch)

Symptoms:

  • Instance never passes status checks
  • System log shows: A start job is running for dev-nvme0n1p2.device
  • System log shows: Timed out waiting for device dev-nvme0n1p2.device

Cause:

You launched the AMI on a Xen-based instance type (t2, m4, c4, r4). These AMIs require Nitro-based instances (t3, m5, c5, r5, etc.).

The AMI is built with NVMe storage drivers. Nitro instances present storage as /dev/nvme* devices, while Xen instances use /dev/xvd* devices. When launched on Xen, the expected NVMe devices don't exist, causing the boot to hang.

Solution:

  1. Terminate the stuck instance
  2. Launch a new instance using a Nitro-based instance type:
    • t3.micro, t3.small, t3.medium (burstable)
    • m5.large, m5.xlarge (general purpose)
    • c5.large, c5.xlarge (compute optimized)
  3. See Supported Instance Types for the full list
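
Before launching, the hypervisor behind an instance type can be checked from the AWS CLI, since describe-instance-types reports a Hypervisor value of nitro or xen. The following is a sketch only: the function names are hypothetical, and it assumes the AWS CLI is configured with credentials and a region.

```shell
# Print the hypervisor EC2 reports for an instance type ("nitro" or "xen").
instance_type_hypervisor() {
  aws ec2 describe-instance-types \
    --instance-types "$1" \
    --query 'InstanceTypes[0].Hypervisor' \
    --output text
}

# Succeed only when the instance type is reported as Nitro-based.
# Bare-metal types report no hypervisor, so this sketch treats them as unknown.
require_nitro() {
  hv=$(instance_type_hypervisor "$1") || return 2
  if [ "$hv" = "nitro" ]; then
    echo "$1: nitro, suitable for this AMI"
  else
    echo "$1: ${hv:-unknown}, choose a Nitro-based type instead" >&2
    return 1
  fi
}

# Usage:
#   require_nitro t3.micro
```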

How to Identify Instance Type

In the system log, look for the line beginning with "Hypervisor detected:":

  • Hypervisor detected: Xen HVM → Wrong instance type (Xen-based)
  • Hypervisor detected: KVM → Correct instance type (Nitro-based)
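
The same check can be scripted against the system log fetched with the AWS CLI. This is a sketch under the assumption that the log contains one of the two lines listed above; classify_hypervisor is an illustrative name, not part of any tool.

```shell
# Read a system log on stdin and report the platform implied by the
# first "Hypervisor detected:" line.
classify_hypervisor() {
  line=$(grep -m 1 -o 'Hypervisor detected: .*' || true)
  case "$line" in
    *Xen*) echo "xen" ;;      # wrong instance type for this AMI
    *KVM*) echo "nitro" ;;    # correct instance type
    *)     echo "unknown" ;;  # line not found (log may be empty or truncated)
  esac
}

# Usage:
#   aws ec2 get-console-output --instance-id i-xxxxx --output text | classify_hypervisor
```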

Instance Fails to Start

Check System Logs:

  1. EC2 Console → Select Instance → Actions → Monitor and troubleshoot → Get system log

Common Issues:

  • Wrong instance type - See "Instance Stuck Waiting for Devices" above
  • Disk full - Check the EBS volume size and how much of it is in use
  • Kernel panic - You may need to launch a replacement instance from a known-good snapshot

Cloud-init Failures

Check cloud-init logs:

sudo cat /var/log/cloud-init-output.log
sudo cat /var/log/cloud-init.log
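
These logs can be long, so it may help to filter them down to the lines that usually matter. The helper below is a rough heuristic, not an official cloud-init tool: cloud_init_problems is a hypothetical name, and the keywords it matches are an assumption about what is worth reading first.

```shell
# Read a cloud-init log on stdin and print, with line numbers, the lines
# that mention warnings, errors, or Python tracebacks.
cloud_init_problems() {
  grep -n -E 'WARNING|ERROR|CRITICAL|Traceback' || echo "no warnings or errors found"
}

# Usage on the instance:
#   sudo cloud-init status --long
#   sudo cat /var/log/cloud-init.log | cloud_init_problems
```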

Service Issues

Service Won't Start

# Check service status
sudo systemctl status <service>

# View service logs
sudo journalctl -u <service> -n 50

# Check SELinux denials
sudo ausearch -m AVC -ts recent | grep <service>

SELinux Blocking Application

# Find the denial
sudo ausearch -m AVC -ts recent

# Generate a policy module (if appropriate)
sudo ausearch -m AVC -ts recent | audit2allow -M myapp
sudo semodule -i myapp.pp

Warning

Only create custom SELinux policies if you understand the security implications.
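
One way to act on this warning is to read the generated rules before loading the module. audit2allow -M myapp writes a human-readable myapp.te next to the compiled myapp.pp; the sketch below prints just the allow rules from that file. list_allow_rules is a hypothetical helper name.

```shell
# Print only the allow rules from a generated type-enforcement (.te) file
# so they can be reviewed before the compiled module is installed.
list_allow_rules() {
  grep -E '^[[:space:]]*allow[[:space:]]' "$1"
}

# Usage, after audit2allow -M myapp has written myapp.te and myapp.pp:
#   list_allow_rules myapp.te
#   sudo semodule -i myapp.pp    # only once every rule is understood
#   sudo semodule -r myapp       # removes the module again if needed
```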

AWS-Specific Issues

Instance Metadata Service

If applications cannot access instance metadata:

# Request an IMDSv2 session token, then use it to query the metadata service
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/
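
When the token request itself fails, the second command fails in a less obvious way. The following sketch wraps both steps and stops early with a message if no token is returned. imds_get is an illustrative name, the 2-second timeout and 60-second token lifetime are arbitrary choices, and the causes named in the error message are common possibilities rather than a diagnosis.

```shell
IMDS_BASE="http://169.254.169.254/latest"

# Fetch one metadata path using IMDSv2, failing loudly if the token
# request does not succeed.
imds_get() {
  token=$(curl -s -f -m 2 -X PUT "$IMDS_BASE/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60") || {
    echo "IMDS token request failed (endpoint disabled, blocked, or hop limit too low?)" >&2
    return 1
  }
  curl -s -f -m 2 -H "X-aws-ec2-metadata-token: $token" "$IMDS_BASE/meta-data/$1"
}

# Usage on the instance:
#   imds_get instance-id
```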

EBS Volume Issues

# Check disk space
df -h

# List block devices
lsblk

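# Check an XFS filesystem for errors. The filesystem must be unmounted first;
# -n reports problems without modifying anything.
# On Nitro instances, devices are /dev/nvme* (adjust the device to match lsblk)
sudo xfs_repair -n /dev/nvme1n1p1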

Device Names on Nitro

On Nitro-based instances, EBS volumes appear as NVMe devices:

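  • Root volume: typically /dev/nvme0n1
  • Additional volumes: /dev/nvme1n1, /dev/nvme2n1, etc. (the numbering follows detection order and may not match the device name chosen when the volume was attached)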
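
Because the NVMe numbering is not a reliable way to tell volumes apart, it can help to map each device back to its EBS volume ID. On Nitro instances the NVMe serial number of an EBS device is the volume ID without the hyphen. The sketch below restores the hyphen; serial_to_volume_id is a hypothetical helper name.

```shell
# Convert an NVMe serial such as "vol0123456789abcdef0" into the
# EBS volume ID form "vol-0123456789abcdef0".
serial_to_volume_id() {
  case "$1" in
    vol-*) echo "$1" ;;
    vol*)  echo "vol-${1#vol}" ;;
    *)     echo "not an EBS serial: $1" >&2
           return 1 ;;
  esac
}

# Usage on the instance (prints each disk with its volume ID):
#   lsblk -d -n -o NAME,SERIAL | while read -r name serial; do
#     echo "/dev/$name $(serial_to_volume_id "$serial")"
#   done
```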