Lesson 28 of 60 intermediate

Linux Processes, Services & Logs

Where Linux incidents become visible


Real-world analogy

Processes are the workers, services are their managers, and logs are the diary every worker keeps. When something goes wrong, you don’t interrogate every worker — you read the diaries, check the manager, and decide who to restart.

What is it?

Linux service administration means watching the right logs, issuing the right systemctl commands, and resisting the temptation to reboot everything. It’s quiet work that heads off a large share of Linux outages.

Real-world relevance

A production API server runs fine all day, then fails every night at 2 AM. journalctl shows the kernel OOM-killing the app each night. You correlate the kills with a nightly backup that bloats RAM. The fix is to tune the backup, add memory, or move the backup to a quieter window, not to ‘restart the service.’
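One way to spot a pattern like "every night at 2 AM" is to count OOM kills by hour of day. The sketch below is a minimal illustration, not a standard tool: `oom_by_hour` is a hypothetical helper name, and it assumes the input is journalctl output in `-o short-iso` format, where the first field is a timestamp such as `2024-05-01T02:00:13+0000`.

```shell
# Hypothetical helper: reads kernel log lines on stdin (journalctl -o short-iso
# format assumed) and prints "HOUR COUNT" for every hour with OOM kills.
oom_by_hour() {
  awk 'tolower($0) ~ /killed process/ {
         split($1, t, "T")            # t[2] holds the time part, e.g. 02:00:13+0000
         hour = substr(t[2], 1, 2)    # keep only the two-digit hour
         count[hour]++
       }
       END { for (h in count) print h, count[h] }' | sort
}
```

A possible invocation is `journalctl -k -o short-iso --since "7 days ago" | oom_by_hour`. If almost every kill lands in the same hour, the next step is to look for a scheduled job that runs in that window.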

Key points

Code example

# Linux triage cheat-script

# Is the service alive?
systemctl status myapp
journalctl -u myapp -e --since "30 min ago"

# Is the system healthy?
uptime        # load average
free -h       # memory
df -h         # disk
top -b -n 1   # snapshot of CPU/memory

# Is a specific process misbehaving?
pgrep -a myapp
ps -o pid,pcpu,pmem,etime,cmd -p "$(pgrep -d, myapp)"   # -d, joins multiple PIDs with commas
kill -15 <pid>       # graceful
kill -9  <pid>       # last resort

# Did the kernel kill a process?
dmesg | grep -i "killed process"
journalctl -k --since today | grep -i oom

Line-by-line walkthrough

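  1. Title line for the triage cheat-script.
  2. Comment opening the "is the service alive?" block.
  3. systemctl status myapp: shows whether the unit is active, its main PID, and its most recent log lines.
  4. journalctl -u myapp -e --since "30 min ago": the unit's logs from the last 30 minutes, with the pager jumped to the newest entries.
  5. Blank separator.
  6. Comment opening the system health block.
  7. uptime: load averages over the last 1, 5, and 15 minutes.
  8. free -h: memory and swap usage in human-readable units.
  9. df -h: disk usage per mounted filesystem.
  10. top -b -n 1: a one-shot, batch-mode snapshot of CPU and memory use by process.
  11. Blank separator.
  12. Comment opening the process-specific block.
  13. pgrep -a myapp: lists matching PIDs with their full command lines.
  14. ps -o pid,pcpu,pmem,etime,cmd: CPU %, memory %, elapsed run time, and command for those PIDs.
  15. kill -15 <pid>: sends SIGTERM so the process can shut down cleanly.
  16. kill -9 <pid>: sends SIGKILL, which the process cannot catch or clean up after; use it only as a last resort.
  17. Blank separator.
  18. Comment opening the kernel-kill check.
  19. dmesg | grep -i "killed process": searches the kernel ring buffer for OOM-killer messages.
  20. journalctl -k --since today | grep -i oom: the same check against today's kernel messages in the journal.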
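The disk check in the system health block can be turned into a pass/fail test rather than something you eyeball. The following is a rough sketch under stated assumptions: `full_filesystems` is a hypothetical helper, it expects POSIX-format `df -P` output on stdin, and it assumes mount points contain no spaces.

```shell
# Hypothetical helper: reads `df -P` output on stdin and prints the mount point
# and usage of every filesystem at or above the given percentage (default 90).
full_filesystems() {
  awk -v limit="${1:-90}" '
    NR > 1 {                       # skip the header row
      use = $5
      sub(/%/, "", use)            # "98%" becomes "98"
      if (use + 0 >= limit + 0) print $6, $5
    }'
}
```

A possible invocation is `df -P | full_filesystems 85`. Empty output means no filesystem has crossed the threshold.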

Spot the bug

Ticket: 'API slow at 2 AM.' A junior engineer reboots the VM every morning at 8 AM and closes the ticket.
Hint: Which logs would tell you what actually breaks at 2 AM?
Answer: Check journalctl from the previous day, kernel messages for OOM kills, systemctl status for crashes, and system metrics for CPU, memory, and disk peaks. Because the reboot clears the dmesg ring buffer, read the previous boot's kernel log with journalctl -k -b -1 (this needs a persistent journal). Often a nightly job is colliding with the app. Reboots mask the cause; the real fix is finding the 2 AM trigger and then tuning capacity or scheduling.

Explain like I'm 5

Every program is a worker. Services are trustworthy workers with a manager. Logs are their diary. When something breaks, you read the diary instead of shouting at the worker.

Fun fact

systemd is so central to modern Linux that ‘systemctl status’ has become the first command many engineers type when something feels wrong, even before they check CPU or disk.

Hands-on challenge

On an Ubuntu VM: install nginx, verify via systemctl status, break it by editing nginx.conf to invalid syntax, try to restart, read the error via journalctl, fix, restart successfully. Save the command history.
