In part 3 of this blog series (after VMware Photon OS on Azure #Part 1 - introduction and VMware Photon OS on Azure #Part 2 – create an Azure Photon OS virtual machine), I will deliberately ignore for a while that we are looking at a Photon OS virtual machine on Azure. Think of it as a trip thirty years back in time: no external connectivity, no external help. Let's also assume we no longer know any Linux commands. Leaving aside all the basic improvements since then, what would the troubleshooting experience look like today? That is the mindset for this part.
I remember when my parents bought me a Commodore Amiga computer with a 3.5" floppy drive, no hard disk, 512 KB RAM, and a 5.25" floppy drive with an expansion card that let the Amiga emulate a 286 PC at full speed. I started with no knowledge of English, did a lot of trial and error on my own, and it was horrible. After a day or two I deleted a file called startup-sequence. I didn't know it was a system file, and I had to reinstall the whole computer. The first weeks flew by, and the best second chance was the help command. So let's apply that first.
After logging in, we land directly in the GNU bash console.
localadminuser [ ~ ]$ help
GNU bash, version 4.4.18(1)-release (x86_64-unknown-linux-gnu)
These shell commands are defined internally. Type `help' to see this list.
Type `help name' to find out more about the function `name'.
Use `info bash' to find out more about the shell in general.
Use `man -k' or `info' to find out more about commands not in this list.
The text contains release information, instructions on how to use help, and pointers to further information. The release shown isn't the same as the one in the login prompt, so there is no obvious correlation between the system release and the release reported by help.
Help gives us a hook to start with. Let's discover some commands. As this is a computer, let's see if we can find some mathematical functions! Indeed there are some, so let's try expr or let.
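A minimal sketch of what trying expr and let could look like. The numbers and the variable names (sum, product, power) are only illustrative and do not come from the original session.

```shell
# expr is an external program: operators must be separate arguments,
# and * must be escaped so the shell does not expand it as a glob.
sum=$(expr 3 + 4)
product=$(expr 6 \* 7)

# let is a bash builtin; it evaluates the expression and assigns the result.
let "power = 2 ** 10"

echo "3 + 4 = $sum"
echo "6 * 7 = $product"
echo "2 ** 10 = $power"
```

Note that let stores its result in a shell variable, which already hints at the next discovery.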
I found out that it is possible to define variables as well. Nice! Let's go back to other commands.
I discovered that the command "cd" means change directory, and that there is a difference between the commands dirs and dir. While exploring the filesystem content, I suddenly landed in the /bin (and /sbin) directory and discovered another, much bigger list of new commands. Boom! This wasn't easy at all. Starting alphabetically at awk, I noticed that the apps there usually use --help to explain their syntax.
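The exploration described above might look roughly like the following sketch. It assumes a GNU userland where ls accepts --help. The variable name count is only illustrative.

```shell
# dirs is a bash builtin that prints the shell's directory stack;
# dir is a separate program that lists directory contents, like ls.
cd /bin
dirs
cd - >/dev/null

# Count the entries in /bin and show the first few names.
count=$(ls /bin | wc -l | tr -d ' ')
echo "/bin contains $count entries"
ls /bin | head -n 5

# External tools usually print a usage summary with --help,
# while shell builtins are documented through `help name`.
ls --help 2>/dev/null | head -n 3
help cd 2>/dev/null | head -n 3
```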
I can't remember exactly when, but I started using ctrl-c because it helps when trying to exit the commands dc or bc. By the way, bc is yet another math app!
Another command, df, shows the free capacity of the filesystems.
In the /dev directory the same names appear. These are all (virtual) devices. Time passed, and I realized that /dev/sda is a 16GB disk and /dev/sdb is a 64GB disk, both with one or more partitions.
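As a rough sketch, the disk discovery could be repeated like this. The use of lsblk is an assumption, since the original text only mentions df and the /dev directory. The variable name root_fs_line is illustrative.

```shell
# Free capacity of the root filesystem in portable (POSIX) format.
root_fs_line=$(df -P / | tail -n 1)
echo "$root_fs_line"

# Block devices with their partitions and sizes, if lsblk is available.
if command -v lsblk >/dev/null 2>&1; then
    lsblk || true
fi

# The device nodes themselves; on the system described here these would
# include /dev/sda and /dev/sdb.
ls -l /dev/sd* 2>/dev/null || echo "no /dev/sd* devices on this machine"
```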
From a discovery perspective there are interesting apps beginning with ls or ending with ctl. ls, as a substitute for the dir command, uses colors to display different types of content. These ls* and *ctl apps seem to be quite useful when discovering the hardware:
hostnamectl exposes some system identifiers.
lscpu shows the CPU information, exposed as a 4x Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz system.
lsmem shows the RAM in online or offline state.
lspci shows information about PCI (peripheral component interconnect) expansion devices. As the hardware is only accessible remotely, I have to trust the system, which in this example shows seven components. Two ethernet controllers are from Mellanox Technologies, all other adapters are Microsoft devices.
networkctl shows the status of the network devices. Some ethernet adapters already seem to be configured.
There is one loopback adapter, one docker bridge, two configured network adapters and two adapters in the state configuring. The two configured network adapters make use of a driver called hv_netvsc. The adapters in the state configuring are the ethernet controllers exposed in lspci. The command systemctl displays the status of units, including devices. The result resembles a list that follows conventions for path-like names, states, etc.
timedatectl shows the system time.
hwclock, as an alternative, shows the time of the hardware real-time clock. Since Linux time counts from 1.1.1970, the value in this case corresponds to 50 years and 150 days, 12 leap years included. It is much easier to use timedatectl, but it is good to know that most commands have an alternative.
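The arithmetic behind that counter can be sketched as follows. The three input values are taken from the text above. The variable names are illustrative, and converting the counter back to a date assumes GNU date.

```shell
# Values from the text: 50 years, 150 days, 12 leap years since 1.1.1970.
years=50
extra_days=150
leap_days=12

# Each leap year contributes one additional day.
days=$(( years * 365 + leap_days + extra_days ))
seconds=$(( days * 86400 ))
echo "days since 1970-01-01: $days"
echo "seconds since 1970-01-01: $seconds"

# GNU date can convert the counter back to a calendar date (UTC).
date -u -d "@$seconds" +%Y-%m-%d 2>/dev/null \
    || echo "this date implementation does not support -d @SECONDS"
```

Under these assumptions the counter is 18412 days, or 1590796800 seconds, which GNU date prints as 2020-05-30.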
Let's take another view of the system. busctl exposes users and (background) service units. So far, the system shows five user contexts:
- me (localadminuser) with one active session firing up busctl
- root with service units init.scope, systemd-machined.service and systemd-logind.service
- systemd-network
- systemd-resolve
- systemd-timesync
The logs of those services can be viewed by using the command journalctl.
journalctl -u systemd-networkd
journalctl -u systemd-machined
journalctl -u systemd-logind
journalctl -u systemd-timesyncd
To get an idea of which user contexts are active, the command who shows the users who are currently logged in. When you log out and log in again with the same user, you might want to look up commands from the last session. The command history is quite useful for that.
The directory /var/log is helpful, too, as it contains some further log files.
So far, all commands were used interactively. To run a command in the background you can add an & at the end. A process ID is shown, and the command jobs displays the state of the registered processes.
In help, this behaviour isn't explicitly mentioned as a possibility for a command but rather for a job specification. A job could be a single command but also a script.
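A small sketch of a background job, using sleep as a stand-in for any longer-running command. The variable names bg_pid and bg_status are illustrative.

```shell
# Start a command in the background; the shell continues immediately.
sleep 1 &
bg_pid=$!
echo "started background process with PID $bg_pid"

# jobs lists the background jobs registered in this shell.
jobs

# wait blocks until the process ends and returns its exit status.
wait "$bg_pid"
bg_status=$?
echo "background job finished with status $bg_status"
```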
The system has a text editor called vi. It has more than 100 features built in. Entering :version lists all features by name, and the tool also shows the date of its last compilation. However, getting further help by pressing F1, as proposed, does not work intuitively.
The more I investigated, the more I suspected that something on the OS might be broken. How can we check that? How can I check whether tools are not okay?
It took a while, but I found the tool rpmverify which can verify the system.
The tool throws out some weird output, but it seems to be right about the missing files! /usr/bin/wget is missing, and the six other files are missing, too. How to fix the issues? This remains unclear.
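To make that output easier to read, the missing files can be filtered out. The following is only a sketch: the sample text imitates the line format printed by rpm verification, the wget path is from the text, and the second sample line as well as the helper name list_missing are hypothetical.

```shell
# Sample lines in the format printed by `rpm -Va` or `rpmverify -a`.
# The second line is a made-up example of a changed config file.
sample_output='missing     /usr/bin/wget
S.5....T.  c /etc/hypothetical.conf'

# Keep only the lines that report missing files and print the path.
list_missing() {
    awk '$1 == "missing" { print $NF }'
}

missing_files=$(printf '%s\n' "$sample_output" | list_missing)
echo "$missing_files"

# On a real RPM-based system the same filter can read live output:
#   rpm -Va 2>/dev/null | list_missing
```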
From the VMware tools perspective the system is not running as a VMware virtual machine.
Let's leave the system.
Back to the present - findings
With only the built-in help command you can learn quite a bunch of commands. Step by step you find out that most of them come from the Free Software Foundation (https://www.gnu.org). Help usually works with --help, and most of the commands support --version as well. After a while you learn more about the underlying hardware and the filesystem structure, and that VMware Photon OS runs on top of that hardware.
Photon OS is a security hardened system. One finding from a learning perspective is the need for some sort of system check recipe. For an administrator it is not easy to get along with only the internal system help. Luckily, most logs are retrievable and there are tools like rpmverify. There is a similarity to the short Microsoft DOS era in the 1980s and after, when you had to learn hundreds of commands one by one without a centralized help system.
Without any help from outside, it would have been impossible to repair the detected issues even after learning more and more commands. The missing wget and top can easily be reinstalled by using the package manager tdnf (tiny dandified yum). But it isn't always obvious which program belongs to which package.
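A hedged sketch of how the repair might look with tdnf. It assumes that tdnf offers the provides and reinstall subcommands and that the package name wget matches the missing binary. The wrapper run_tdnf and the DRY_RUN switch are hypothetical helpers that only print the commands by default.

```shell
# By default only print the commands; set DRY_RUN=0 to really run them
# (which requires tdnf and root privileges).
DRY_RUN=${DRY_RUN:-1}

run_tdnf() {
    if [ "$DRY_RUN" = "0" ] && command -v tdnf >/dev/null 2>&1; then
        tdnf "$@"
    else
        echo "would run: tdnf $*"
    fi
}

# Ask which package provides the missing file, then reinstall it.
run_tdnf provides /usr/bin/wget
run_tdnf -y reinstall wget
```

The first step addresses the question of which program belongs to which package before anything is reinstalled.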
If you're new to Photon OS, it helps to gain some experience in basic troubleshooting techniques. A good starting point is the Photon OS Troubleshooting Guide https://github.com/vmware/photon/tree/master/docs/photon_troubleshoot .