Cloud Server Resource Monitoring and Optimization Guide

Monitor CPU, RAM, disk, and network usage on your cloud server with monitoring tools and optimization techniques. Detect performance issues with htop, vmstat, and iostat.

Elif Demir

Cloud Solutions Architect

March 20, 2026 | 14 min read

Is your cloud server's CPU stuck at 95%, but you don't know which process is responsible? Is RAM running out and the OOM Killer terminating services seemingly at random? Resource monitoring is what lets you solve performance issues proactively rather than reactively. In this guide, we cover how to monitor CPU, RAM, disk I/O, and network traffic with Linux command-line tools, how to detect bottlenecks, and which concrete optimization steps to take.

Why Is Resource Monitoring Critical?

Managing a server without resource monitoring is like driving a car without a dashboard. You only notice problems when it crashes. With regular monitoring:

  • You can plan capacity: By tracking resource consumption trends, you can determine in advance when you need to scale.
  • You can quickly identify performance bottlenecks: You can determine within minutes whether the slowdown is caused by CPU, RAM, disk I/O, or network.
  • You can optimize costs: Allocating more resources than needed is a waste of money. Based on actual usage data, you can choose the right-sized cloud server.

Monitoring and Optimizing CPU Usage

Real-Time CPU Monitoring with htop

htop is an enhanced, colorful alternative to the top command. It shows each CPU core's usage separately, allows sorting processes by CPU or RAM, and lets you terminate processes directly.

terminal - htop installation and usage
# Install htop (Ubuntu/Debian)
sudo apt install htop -y

# Run
htop

# Filter processes by a specific user
htop -u www-data

# Sort by CPU usage: inside htop press F6 → PERCENT_CPU
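
htop is interactive, which makes it awkward to use from a script or when you want a record of what was running during a spike. As a minimal sketch, the function below filters ps output for processes above a CPU threshold. The name cpu_hogs and the default threshold of 50% are illustrative assumptions, not part of htop.

```bash
# cpu_hogs [THRESHOLD]: read `ps` output on stdin and print PID, %CPU, and
# command for every process above THRESHOLD percent CPU (default 50).
# Assumes the column order produced by: ps -eo pid,user,pcpu,pmem,comm
cpu_hogs() {
  awk -v limit="${1:-50}" 'NR > 1 && $3 + 0 > limit { print $1, $3, $5 }'
}

# Live usage (sorted so the heaviest process comes first):
#   ps -eo pid,user,pcpu,pmem,comm --sort=-pcpu | cpu_hogs 50
```

Note that ps reports %CPU averaged over the lifetime of each process, so a short spike that is visible in htop may not show up here.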

Per-Core Analysis with mpstat

mpstat (part of the sysstat package) shows the usage distribution of each CPU core. It's a critical tool for workloads concentrated on a single core (single-threaded applications).

terminal - mpstat
# Install sysstat
sudo apt install sysstat -y

# Show all cores' usage at 2-second intervals
mpstat -P ALL 2

# Key metrics to watch:
# %usr  → User-space CPU (your applications)
# %sys  → Kernel-space CPU (system calls)
# %iowait → CPU waiting for disk I/O (high = disk bottleneck)
# %idle → Idle CPU (low = at capacity limit)

💡 Tip: If the %iowait value is consistently above 20%, you're experiencing a disk I/O bottleneck, not a CPU issue. In this case, switching to NVMe SSD or disk I/O optimization should be your priority.
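
To make the percentages above less of a black box, here is a minimal sketch of how they can be derived from two snapshots of the aggregate cpu line in /proc/stat. The function name cpu_breakdown is hypothetical, and the sketch assumes the standard field order (user, nice, system, idle, iowait, irq, softirq, steal); it is not how mpstat itself is implemented.

```bash
# cpu_breakdown PREV CUR: given two "cpu ..." lines from /proc/stat, print the
# share of elapsed CPU time spent in user, system, iowait, and idle states.
cpu_breakdown() {
  local prev=$1 cur=$2
  awk -v p="$prev" -v c="$cur" 'BEGIN {
    split(p, a, " "); split(c, b, " ")
    # Index 1 is the "cpu" label; 2..9 are user nice system idle iowait irq softirq steal
    for (i = 2; i <= 9; i++) { d[i] = b[i] - a[i]; total += d[i] }
    if (total <= 0) { print "no ticks elapsed"; exit 1 }
    # nice, irq, softirq, and steal count toward the total but are not printed
    printf "usr=%.1f sys=%.1f iowait=%.1f idle=%.1f\n",
      100 * d[2] / total, 100 * d[4] / total, 100 * d[6] / total, 100 * d[5] / total
  }'
}

# Live usage: sample twice, two seconds apart
#   prev=$(grep '^cpu ' /proc/stat); sleep 2; cur=$(grep '^cpu ' /proc/stat)
#   cpu_breakdown "$prev" "$cur"
```

If iowait dominates while usr and sys stay low, the CPU is mostly waiting on storage, which is the situation the tip above describes.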

Monitoring and Optimizing RAM Usage

Memory Analysis with free and vmstat

The free -h command shows the current memory status. However, the "free" value in Linux can be misleading because the kernel uses otherwise idle RAM as disk cache (buff/cache) and releases it when applications need it. For the memory actually available to applications, look at the "available" column.

terminal - memory monitoring
# Current memory status
free -h

# Example output:
#               total   used   free   shared  buff/cache  available
# Mem:          4.0Gi   2.1Gi  0.3Gi  128Mi   1.6Gi       1.5Gi
# Swap:         2.0Gi   0.5Gi  1.5Gi

# Monitor memory and swap at 5-second intervals with vmstat
vmstat 5

# Note: if si (swap in) and so (swap out) values
# are consistently greater than zero, RAM is insufficient
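
Watching the si and so columns by eye works for a quick check. For a scripted check, a rough sketch is to count how many samples showed swap activity. The function name swap_samples is hypothetical, and the column positions (si in column 7, so in column 8) assume the default vmstat layout.

```bash
# swap_samples: read `vmstat` output on stdin and count the samples in which
# pages were swapped in (si, column 7) or out (so, column 8).
# Skips the two header lines and the first data line, which reports
# averages since boot rather than current activity.
swap_samples() {
  awk 'NR > 3 && ($7 + 0 > 0 || $8 + 0 > 0) { n++ } END { print n + 0 }'
}

# Live usage: 13 samples 5 seconds apart; -n prints the header only once
#   vmstat -n 5 13 | swap_samples
```

A result near zero suggests swap is only holding cold pages; a result close to the number of samples suggests the server is actively short on RAM.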

RAM Optimization Steps

Follow these steps to reduce RAM consumption:

1. Adjust MySQL/MariaDB buffer pool size: Set the innodb_buffer_pool_size value to 50-70% of total RAM. On a server with 4 GB RAM, 2-2.5 GB is a reasonable starting point; go lower if the web server and PHP-FPM share the same machine.

2. Limit PHP-FPM worker count: Each PHP-FPM worker consumes an average of 30-50 MB RAM. On 4 GB RAM, keep the pm.max_children value between 20 and 30.

3. Disable unused services: List running services with systemctl list-units --type=service --state=running and disable unnecessary ones.
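
The pm.max_children figure in step 2 comes from simple arithmetic: the RAM left after everything else, divided by the size of one worker. The sketch below works through it. The function name max_children and the 2.5 GB reserved for the database and operating system are illustrative assumptions; measure your own worker size before relying on the result.

```bash
# max_children TOTAL_MB RESERVED_MB WORKER_MB
# Prints how many PHP-FPM workers fit in the RAM left after reserving memory
# for everything else (database buffer pool, OS, disk cache).
max_children() {
  local total_mb=$1 reserved_mb=$2 worker_mb=$3
  if [ "$worker_mb" -le 0 ] || [ "$reserved_mb" -ge "$total_mb" ]; then
    echo "invalid input" >&2
    return 1
  fi
  echo $(( (total_mb - reserved_mb) / worker_mb ))   # integer division rounds down
}

# 4 GB server, 2.5 GB reserved, 50 MB per worker: (4096 - 2560) / 50
max_children 4096 2560 50   # prints 30
```

With these inputs the result lands at the upper end of the 20-30 range given in step 2.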

SWAP space configuration is also a critical step when RAM is insufficient. Check out our VPS SWAP Space Configuration guide for more details.

Disk I/O Monitoring and Bottleneck Detection

Disk I/O bottlenecks are among the most common performance issues, especially in database-intensive applications. You can monitor disk performance with the iostat and iotop tools.

terminal - disk I/O monitoring
# Disk performance with iostat (2-second intervals)
iostat -xz 2

# Critical metrics:
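# await  → Average I/O wait time (ms). Should be <1ms for NVMe, <5ms for SSD
#          (newer sysstat versions split this into r_await and w_await)
# %util  → Disk utilization rate. If 80%+, there's likely a bottleneck
#          (less reliable on SSD/NVMe, which serve requests in parallel)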
# r/s, w/s → Read/write operations per second

# To see which process is doing the most disk I/O
sudo iotop -oP

⚠️ Important Warning: If the await value in iostat output is consistently above 10 ms and you're using HDD, switching to NVMe SSD will be the most effective optimization. Software optimizations cannot exceed the physical limits of disk hardware.
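
For a scripted version of the %util check, one possible sketch is to locate the %util column by its header name, since the column order in iostat -x output differs between sysstat versions. The function name busy_disks and the default threshold of 80 are assumptions for illustration.

```bash
# busy_disks [THRESHOLD]: read `iostat -x` output on stdin and print the name
# and %util of every device at or above THRESHOLD percent (default 80).
busy_disks() {
  awk -v limit="${1:-80}" '
    # Header row: remember which column holds %util in this sysstat version
    $1 ~ /^Device/ { for (i = 1; i <= NF; i++) if ($i == "%util") col = i; next }
    # Device rows: only lines wide enough to contain the %util column
    col && NF >= col && $col + 0 >= limit { print $1, $col }
  '
}

# Live usage: two reports one second apart (the first covers time since boot)
#   iostat -xz 1 2 | busy_disks 80
```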

Monitoring Network Traffic

Network performance is critical, especially for API servers and media services. You can monitor real-time bandwidth usage and connection states with nload and ss commands.

terminal - network monitoring
# Real-time bandwidth usage
sudo apt install nload -y
# Replace eth0 with your interface name (list interfaces with: ip -br link)
nload eth0

# Active connection count and states
ss -s

# Find remote IPs with the most connections
# (column 6 is the peer address; sed strips the trailing :port)
ss -ntu | awk 'NR>1 {print $6}' | sed 's/:[^:]*$//' | sort | uniq -c | sort -rn | head -20

# Connection state distribution (TIME_WAIT, ESTABLISHED, etc.)
ss -ant | awk 'NR>1 {print $1}' | sort | uniq -c | sort -rn

For more detailed information on network optimization, check out our VPS Network Performance Optimization guide.

Monitoring Tools Comparison

| Tool | Monitored Resource | Use Case | Installation |
| --- | --- | --- | --- |
| htop | CPU, RAM, processes | Real-time process monitoring and management | apt install htop |
| vmstat | RAM, swap, CPU, I/O | Memory and swap activity monitoring | Pre-installed |
| iostat | Disk I/O | Disk performance and bottleneck detection | apt install sysstat |
| iotop | Disk I/O (per process) | Finding which process is using disk | apt install iotop |
| nload | Network bandwidth | Real-time incoming/outgoing traffic monitoring | apt install nload |
| mpstat | CPU (per core) | Per-core CPU distribution | apt install sysstat |
| Prometheus + Grafana | All (time series) | Long-term monitoring, dashboards, alerts | Separate installation required |

Automated Alerting and Monitoring Setup

Command-line tools are excellent for real-time monitoring, but for 24/7 monitoring you need to set up an automated alerting system. You can monitor critical thresholds with a simple bash script:

monitor-alert.sh
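#!/bin/bash
# Simple resource monitoring and alerting script

CPU_THRESHOLD=85
RAM_THRESHOLD=90
DISK_THRESHOLD=85
# Placeholder recipient: replace with the address that should receive alerts
ALERT_EMAIL="admin@example.com"

# CPU check: extract the idle percentage from top and subtract it from 100.
# Matching on "id" instead of a fixed field number keeps this working when
# top prints fused fields such as "ni,100.0 id".
CPU_IDLE=$(LC_ALL=C top -bn1 | grep "Cpu(s)" | sed 's/.*, *\([0-9.]*\)[ %]*id.*/\1/')
CPU_USAGE=$(awk -v idle="$CPU_IDLE" 'BEGIN {printf "%.0f", 100 - idle}')
if [ "$CPU_USAGE" -gt "$CPU_THRESHOLD" ]; then
  echo "WARNING: CPU usage at ${CPU_USAGE}%" | mail -s "Server Alert" "$ALERT_EMAIL"
fi

# RAM check: used memory as a percentage of total
RAM_USAGE=$(free | awk '/^Mem/ {printf "%.0f", $3/$2*100}')
if [ "$RAM_USAGE" -gt "$RAM_THRESHOLD" ]; then
  echo "WARNING: RAM usage at ${RAM_USAGE}%" | mail -s "Server Alert" "$ALERT_EMAIL"
fi

# Disk check: usage of the root filesystem
DISK_USAGE=$(df -P / | tail -1 | awk '{print $5}' | tr -d '%')
if [ "$DISK_USAGE" -gt "$DISK_THRESHOLD" ]; then
  echo "WARNING: Disk usage at ${DISK_USAGE}%" | mail -s "Server Alert" "$ALERT_EMAIL"
fi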

You can run this script every 5 minutes with cron: */5 * * * * /opt/scripts/monitor-alert.sh. For more comprehensive monitoring, consider the Prometheus and Grafana combination.
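
The three checks in the script share the same shape, so one way to keep it maintainable as you add metrics is to factor the comparison into a helper. This is only a sketch: the function name over_threshold and its return codes are assumptions, not part of the script above.

```bash
# over_threshold NAME VALUE LIMIT
# Prints a warning and returns 0 when VALUE exceeds LIMIT, returns 1 when it
# does not, and returns 2 when VALUE is empty or not a whole number (for
# example because the command that produced it failed).
over_threshold() {
  local name=$1 value=$2 limit=$3
  case $value in
    ''|*[!0-9]*)
      echo "ERROR: $name value '$value' is not an integer" >&2
      return 2 ;;
  esac
  if [ "$value" -gt "$limit" ]; then
    echo "WARNING: $name usage at ${value}%"
    return 0
  fi
  return 1
}

# Usage inside the script (mail recipient as configured there):
#   if msg=$(over_threshold "Disk" "$DISK_USAGE" "$DISK_THRESHOLD"); then
#     echo "$msg" | mail -s "Server Alert" "$recipient"
#   fi
```

The explicit check for non-numeric input matters under cron, where a failed command would otherwise leave an empty variable and make the comparison error out without sending any alert.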

Frequently Asked Questions

My server is slow but CPU and RAM look normal, what could be the issue?

You're most likely experiencing a disk I/O bottleneck. Check the await and %util values with iostat -xz 2. Also look at the wa (I/O wait) column in vmstat 2 output.

The free command shows all RAM is used, should I panic?

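No. The Linux kernel uses otherwise idle RAM as disk cache, so the "free" column is often close to zero (older versions of free also counted the cache as "used"). This cache is released as soon as applications need the memory. For actual available memory, look at the available column. If this value is low (below 10% of total RAM), then you should consider increasing RAM.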
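
The 10% rule of thumb can be checked directly from /proc/meminfo, which is where free gets its numbers. A minimal sketch follows; the function name available_pct is hypothetical, and it assumes a kernel that exposes the MemAvailable field.

```bash
# available_pct: read /proc/meminfo-style text on stdin and print
# MemAvailable as a whole-number percentage of MemTotal (rounded down).
available_pct() {
  awk '/^MemTotal:/     { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END {
         if (total <= 0) { print "MemTotal not found" > "/dev/stderr"; exit 1 }
         printf "%d\n", avail * 100 / total
       }'
}

# Live usage:
#   pct=$(available_pct < /proc/meminfo)
#   [ "$pct" -lt 10 ] && echo "Low memory: only ${pct}% available"
```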

Which monitoring tool should I install?

If you're managing a single server, htop + sysstat (iostat, mpstat, sar) is sufficient. For multiple servers or long-term trend analysis, set up the Prometheus + Grafana stack.

What is OOM Killer and how can it be prevented?

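OOM (Out of Memory) Killer is the Linux kernel's mechanism for forcefully terminating a process when memory is exhausted and cannot be reclaimed. It picks its victim by a per-process score (oom_score) that is driven mainly by memory consumption, so the largest process is usually the one killed. To prevent it: allocate sufficient SWAP space, configure application memory limits, and regularly monitor RAM usage.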
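
To confirm whether the OOM Killer was actually responsible for a service disappearing, you can search the kernel log. The sketch below extracts the victims from log text. The function name oom_victims is hypothetical, and the "Killed process PID (name)" wording is an assumption about the kernel message format, which can vary between kernel versions.

```bash
# oom_victims: read kernel log text on stdin and print one line per process
# terminated by the OOM Killer, in the form "PID NAME".
oom_victims() {
  grep -oE 'Killed process [0-9]+ \([^)]+\)' |
    sed -E 's/Killed process ([0-9]+) \(([^)]+)\)/\1 \2/'
}

# Live usage (either source may need root, depending on system settings):
#   sudo dmesg | oom_victims
#   sudo journalctl -k --no-pager | oom_victims
```

If the same service keeps appearing in the output, that is a strong hint to revisit its memory limits or the RAM sizing of the server.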

Conclusion

Resource monitoring on your cloud server is critical not only for solving performance issues but also for preventing them. Regularly monitor processes with htop, disk performance with iostat, and memory status with vmstat. Set up automated alerts for metrics exceeding threshold values and adopt a proactive rather than reactive management approach.


Elif Demir

Cloud Solutions Architect

Specializing in enterprise cloud migration projects and hybrid infrastructure design with 8 years of experience in AWS, Azure, and private cloud environments.
