Sunday, April 16, 2023

A Step-by-Step Guide to Troubleshooting OCI Compute Instances Using Serial Console

The serial console gives you access to the system console of a compute instance. Instance console connections should be used for troubleshooting only, for example when the instance does not boot successfully, when it has malfunctioned, or when someone has inadvertently tampered with the opc user's keys. More details can be found at https://docs.oracle.com/en-us/iaas/Content/Compute/References/serialconsole.htm. In my case, one fine day I was not able to log in to the instance as the opc user. SSH was not working, and I had no choice other than to create a serial console connection for Linux. I verified all the security list rules and they all seemed fine, yet I kept getting network error messages.
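Before reaching for the serial console, it can help to separate a network problem from a key problem. The sketch below is only an illustration and was not part of my original troubleshooting. port_open is a hypothetical helper name, it assumes bash and the coreutils timeout command exist on the machine you connect from, and the values in angle brackets are placeholders.

```shell
# Hypothetical helper: succeeds (exit 0) if a TCP connection to host:port
# can be opened within 5 seconds, fails otherwise.
# Assumes bash (for /dev/tcp) and coreutils timeout are installed.
port_open() {
  # $1 = host, $2 = port; they reach the inner bash as $0 and $1
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example usage (placeholders, not real values):
#   port_open <public-ip> 22 && echo "port 22 reachable"
#   ssh -v -i <private-key> opc@<public-ip>   # verbose output shows which keys are offered
```

If the port is reachable but the login is still refused, the keys on the instance are the more likely suspect, which is what turned out to be the case here.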


So I went ahead and created a serial console connection for the malfunctioning instance.


 




Next, I logged in to another working instance in the same subnet as the malfunctioning instance. In the ~ directory, I pasted the contents of the key used earlier for the console connection into the id_rsa key file. (The key had been converted with PuTTYgen: Conversions > Export OpenSSH key (force new file format).)
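For reference, the connection to the serial console is an SSH command that tunnels through the OCI console endpoint on port 443. The OCI console lets you copy the exact string for your instance, so the sketch below only shows its general shape. build_console_ssh_cmd is a hypothetical helper name, and the key path, OCIDs and region are placeholders you would have to supply yourself.

```shell
# Hypothetical helper: prints the general shape of the serial console SSH command.
# $1 = private key path, $2 = console connection OCID,
# $3 = instance OCID,    $4 = region identifier
build_console_ssh_cmd() {
  # %%h and %%p become the literal %h:%p that ssh expands inside ProxyCommand
  printf "ssh -i %s -o ProxyCommand='ssh -i %s -W %%h:%%p -p 443 %s@instance-console.%s.oci.oraclecloud.com' %s\n" \
    "$1" "$1" "$2" "$4" "$3"
}

# The private key should be readable by its owner only, otherwise ssh refuses it:
#   chmod 600 <path-to>/id_rsa
# Example (placeholders):
#   build_console_ssh_cmd <path-to>/id_rsa <console-connection-ocid> <instance-ocid> <region>
```

Printing the command instead of running it makes it easy to compare with the string copied from the OCI console before using it.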




Now, keeping that shell open, reboot the malfunctioning instance. The reboot progress will be visible in the shell. Immediately press ESC or F5 repeatedly until a menu appears.



Press Enter to get to the next screen.


Press Enter again and immediately press ESC repeatedly until the next screen appears.



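Press e to edit the boot entry and, in the section at the bottom, append init=/bin/bash to the end of the kernel line (typically the line starting with linux or linuxefi).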



Press Ctrl-X to boot with the modified entry.


At this point we are in a root shell and have superpowers. :)

[root@localhost .ssh]# /usr/sbin/load_policy -i
[root@localhost .ssh]# /bin/mount -o remount,rw /
[root@localhost .ssh]# cd /home/opc/.ssh
[root@localhost .ssh]# ls -lrt
total 4
-rw-------. 1 opc opc 402 Jan  7 15:25 authorized_keys
[root@localhost .ssh]# cp authorized_keys authorized_keys_bkup
[root@localhost .ssh]#


We could see that someone had modified the opc user's authorized_keys file, which is why we were not able to log in. Luckily we had a backup; otherwise we would have had to regenerate the key contents.
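One way to confirm that a key is missing from authorized_keys is to compare it against the public key you expect to log in with. This is a minimal sketch, not something I ran at the time. key_is_authorized is a hypothetical helper name, and it assumes the public key file uses the usual one-line OpenSSH format ("type base64-blob comment"). If you only have the private key, ssh-keygen -y -f <private-key> prints the matching public key.

```shell
# Hypothetical helper: exit 0 if the key in a .pub file appears in an
# authorized_keys file, 1 if it does not, 2 if the .pub file has no key blob.
# $1 = public key file, $2 = authorized_keys file
key_is_authorized() {
  # Field 2 of an OpenSSH public key line is the base64 key blob
  blob=$(awk '{print $2; exit}' "$1")
  [ -n "$blob" ] || return 2
  # Fixed-string match, so options or comments around the key do not matter
  grep -qF -- "$blob" "$2"
}

# Example (placeholders):
#   key_is_authorized <my-key>.pub /home/opc/.ssh/authorized_keys \
#     && echo "key present" || echo "key missing"
```

Matching on the key blob rather than the whole line avoids false negatives when only the trailing comment differs.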

I reverted to the old key file:


bash-4.4# cp authorized_keys_bkup authorized_keys
bash-4.4# ls -lrt
total 8
-rw-------. 1 root root 402 Apr 15 11:58 authorized_keys_bkup
-rw-------. 1 opc  opc  402 Apr 15 12:32 authorized_keys
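After restoring the file it is worth checking its permissions, because sshd with its default StrictModes setting ignores an authorized_keys file whose modes or ownership are too open. The listing above already shows the right mode and owner, so the following is only a sketch for cases where they are wrong. fix_ssh_perms is a hypothetical helper name, and the chown and restorecon lines in the comments are assumptions about what may additionally be needed on an SELinux-enabled system.

```shell
# Hypothetical helper: tighten permissions on an SSH directory and its
# authorized_keys file, then list the result for review.
# $1 = path to the .ssh directory
fix_ssh_perms() {
  chmod 700 "$1" || return 1
  if [ -f "$1/authorized_keys" ]; then
    chmod 600 "$1/authorized_keys" || return 1
  fi
  ls -la "$1"
}

# Example, run as root on the instance (assumptions, adjust as needed):
#   fix_ssh_perms /home/opc/.ssh
#   chown -R opc:opc /home/opc/.ssh     # files must belong to the login user
#   restorecon -R -v /home/opc/.ssh     # reset SELinux labels if they were lost
```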

Now let us reboot the server and try to log in again as the opc user.

And here we are, logged in again.


We can now delete the console connection from the OCI console.

The lesson from this exercise is that we should always keep a backup of the opc keys. As a best practice, also always keep a backup of the boot volume.
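Keeping a key backup can be as simple as a timestamped copy stored outside the .ssh directory. The sketch below is one possible way to do it, not a prescribed procedure. backup_authorized_keys is a hypothetical helper name and the destination directory is a placeholder. The boot volume backup line in the comments assumes the OCI CLI is installed and configured.

```shell
# Hypothetical helper: copy an authorized_keys file to a timestamped backup
# and print the path of the copy.
# $1 = source authorized_keys file, $2 = backup directory
backup_authorized_keys() {
  mkdir -p "$2" || return 1
  dest="$2/authorized_keys.$(date +%Y%m%d-%H%M%S)"
  # -p keeps the original mode and timestamps on the copy
  cp -p "$1" "$dest" || return 1
  echo "$dest"
}

# Example (placeholders):
#   backup_authorized_keys /home/opc/.ssh/authorized_keys <backup-dir>
# Boot volume backup with the OCI CLI (assumes a configured CLI):
#   oci bv boot-volume-backup create --boot-volume-id <boot-volume-ocid>
```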

I hope this post helps someone. Till then, keep learning cloud.


References: https://docs.oracle.com/en-us/iaas/Content/Compute/References/serialconsole.htm#four__maintenancemode
