QuickFix: ESXi 7 and broken vmsyslog

I encountered a situation where ESXi 7 Update 3g (build 20328353) stopped sending logs to a remote syslog server. Upon further investigation, it turned out that it had also stopped writing logs locally: the logs in /scratch/log were no longer being updated. Free disk space was not the problem.

During diagnostics, errors were detected in the /var/log/.vmsyslogd.err log:

vmsyslog.main            : CRITICAL] Dropping messages due to log stress (qsize = 25000)
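To get a rough idea of how often the daemon hit this condition, the matching lines can be counted. This is a minimal sketch in POSIX shell; the function name count_log_stress is made up for illustration, and the log path is the one mentioned above.

```shell
# Count the lines that report dropped messages in a vmsyslogd error log.
# $1: path to the log file (in this article: /var/log/.vmsyslogd.err)
count_log_stress() {
    # grep -c prints the number of matching lines (0 if there are none)
    grep -c 'Dropping messages due to log stress' "$1"
}
```

On the host this would be called as count_log_stress /var/log/.vmsyslogd.err.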

I did not find an adequate KB article on this topic for version 7. There was only a KB article for versions 6.5/6.7 that mentioned this error and stated that “the problem has been fixed”.

The esxcli system syslog config get command displays the syslog settings correctly, but esxcli system syslog reload does not help: logs are still neither written locally nor sent to the remote server.

Restarting the service with the Restart button in the host management interface does not help either. The only thing that appears in the log is:

vmsyslog.main            : ERROR   ] reloading (3200395)

This is similar to the result of esxcli system syslog reload.

Stopping and restarting the service from the ESXi interface fails because:

This service with 'vmsyslogd' is marked as 'required' and cannot be stopped.

All that remains is to stop it forcibly directly from the host:

ps -cC | grep vmsyslog
3418096  3418096  vmsyslogd             /bin/python /usr/lib/vmware/vmsyslog/bin/vmsyslogd.pyc 1

We determine the PID of vmsyslog, in this case 3418096, and kill it:

kill -9 3418096

The vmsyslog log will show that the process was killed and then automatically restarted:

vmsyslog.main            : ERROR   ] Watchdog 3418095 fired (child 3418096 died with status 9)!
vmsyslog.main            : ERROR   ] Watchdog 3418095 exiting
vmsyslog                 : CRITICAL] vmsyslogd daemon starting (3418940)

After restarting, logs start to be written locally and sent to the remote server.
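The PID lookup above can be wrapped in a small helper so that the value does not have to be picked out by eye. This is only a sketch, assuming the ps -cC output keeps the column layout shown above (the ID in the first column); the function name vmsyslogd_pids is hypothetical.

```shell
# Read `ps -cC` output on stdin and print the first column of every line
# whose command is the vmsyslogd Python daemon.
# Assumption: the column layout matches the sample output in this article.
vmsyslogd_pids() {
    awk '/vmsyslogd\.pyc/ { print $1 }'
}

# Example use on the host (not executed here):
#   ps -cC | vmsyslogd_pids
# Review the printed PID, then kill it as described above:
#   kill -9 <PID>
```

The kill itself is left as a manual step on purpose, so that the PID can be checked before a required service is terminated.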


VMware vSphere 8.0 Update 3 is out

Today, a new version of VMware vSphere 8.0 has been released. It is a major update that contains tons of new features in different areas, including live patch management, partial maintenance mode, embedded vCLS, and more.

I do not want to copy all the well-written information here, so I will just share a few links.

What’s New in vSphere 8 Update 3?

What’s New with vSphere 8 Core Storage

What’s New in vSphere Update 3 for vSphere IaaS control plane?

VMware ESXi 8.0 Update 3 Release Notes

VMware vCenter Server 8.0 Update 3 Release Notes


Quick fix: VMware. Some of the disks of the virtual machine failed to load.

I faced an issue with one of the VMs running on VMware ESXi 7.0.3, build 20328353.

Symptoms:

1. The VM is running. There are no complaints from users;

2. vMotion fails with an error:

The object or item referred to could not be found.

3. After the vMotion attempt, hostd.log contains the following:

Failed to find file size for /vmfs/volumes/.../VM_NAME.nvram: No such file or directory

4. In the vCenter UI, the following message is displayed for the VM:

Some of the disks of the virtual machine VM_NAME failed to load. The information present for them in the virtual machine configuration may be incomplete.

5. There are no issues with the storage layer. All of the VM’s files are located on the datastore;

6. Other VMs on the host and datastore work fine;

7. Recommendations like “Rescan Datastore” don’t work.

Solution.

Before you begin, make sure that you have a backup.

The solution for me was simple, but it required downtime:

  1. Power off the VM;
  2. After that, the VM will be in an inaccessible state;
  3. Remove the VM from the vCenter inventory;
  4. Locate the VM’s files on the datastore and find the .vmx file;
  5. Register the VM;
  6. Power on the VM.

After that, the VM should be up and running without issues.
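For step 4, the .vmx file can also be located from the ESXi shell instead of the datastore browser. This is a minimal sketch, assuming shell access to the host and a known VM folder; the function name find_vmx is hypothetical, and the folder path is deliberately left as a placeholder.

```shell
# Print every .vmx file below the given directory.
# $1: the VM's folder on the datastore (somewhere under /vmfs/volumes/)
find_vmx() {
    # -type f skips directories; the pattern does not match .vmxf files
    find "$1" -type f -name '*.vmx'
}

# Example use on the host (placeholder path, not executed here):
#   find_vmx /vmfs/volumes/<datastore>/<vm-folder>
```

The printed path is the file to select when registering the VM again.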


VMware ESXi 8.0 Update 2b is out

VMware ESXi 8.0 Update 2b is out and contains a lot of bug fixes. One of the fixes I want to mention is a bug in CBT:

Changed Block Tracking (CBT) might not work as expected on a hot extended virtual disk:

In vSphere 8.0 Update 2, to optimize the open and close process of virtual disks during hot extension, the disk remains open during hot extend operations. Due to this change, incremental backup of virtual disks with CBT enabled might be incomplete, because the CBT in-memory bitmap does not resize, and CBT cannot record the changes to the extended disk block. As a result, when you try to restore a VM from an incremental backup of virtual disks with CBT, the VM might fail to start.

As a workaround, there were two options: avoid hot extend and perform disk extend operations only when the VM is powered off, or periodically create full backups to reset the CBT.

So, if you’re running ESXi 8.0 Update 2, you should consider updating to 8.0 Update 2b as soon as possible.

You can read about the other fixes in the release notes here.


Protecting vCenter Server using vCenter HA functionality

In some cases, when we need a highly available vCenter Server, we can use vCenter HA functionality. In short, it is a second copy of your vCenter VM (plus a witness node), with replication configured between the active vCenter node and the passive vCenter node.

If something happens to the active node, the passive node takes over, which reduces the overall downtime of the vCenter Server.

Let’s look at how to enable vCenter HA, and what we need to do.

Continue reading “Protecting vCenter Server using vCenter HA functionality”


Backing up and restoring VMware vCenter Server. Part 2 – Veeam Backup and Replication

In the previous article, we talked about how to restore vCenter using native backup. In this part, we will talk about how to restore VMware vCenter Server using Veeam Backup and Replication.

Although restoring a VM using Veeam is a simple task, a few points should be considered when it comes to vCenter Server.

Let’s get started.

Continue reading “Backing up and restoring VMware vCenter Server. Part 2 – Veeam Backup and Replication”


Quick Tip: How to change the MAC address on a vSphere VM by editing the VMX file

You may know that a vSphere 8.0 Update 2 bug prevents you from setting a static MAC address for a VM (KB 95189).

The symptom is simple – you change the MAC address in the VM’s network interface settings, but after you click OK, nothing changes.

As a workaround, you can do the same using vSphere Host Client (the ESXi web interface). In my case, however, this workaround did not help, and I received an error:

Failed to reconfigure virtual machine pleasechangemymac. Invalid configuration for device '4'.

If you are in this situation and you need to change a VM MAC address, one good old hack still works – edit the VM’s VMX file.

Next – how to change the MAC address.

Continue reading “Quick Tip: How to change the MAC address on a vSphere VM by editing the VMX file”


Backing up and restoring VMware vCenter Server. Part 1 – Native backup

vCenter Server is a critical part of the VMware infrastructure stack, and most components and third-party solutions depend on it. Although vCenter downtime may not break the overall infrastructure and will not cause VM downtime, it will affect the provisioning of new resources, management, backups, and so on. So, keeping your vCenter up and running is a priority task in most cases.

In the next few articles, we will look at how to back up and restore vCenter Server if something goes wrong. There are a few strategies for protecting vCenter Server, and the right one depends on the required availability of the service. It can be backup, replication, vCenter HA functionality, or even deploying a new vCenter and connecting hosts manually.

We will look at two options: backing up and restoring vCenter using the native backup function, and doing the same using third-party backup software.

In this article, we will take a closer look at how to back up vCSA using the native backup available in VAMI.

Continue reading “Backing up and restoring VMware vCenter Server. Part 1 – Native backup”


Updating the VMware vCenter server from version 8.0 U1 to 8.0 U2 using the Reduced Downtime Upgrade procedure

With the announcement of vSphere 8.0 Update 2, an interesting new feature called vCenter Reduced Downtime Upgrade (RDU) was introduced.

This feature can reduce overall vCenter Server downtime during updates and upgrades.

In a nutshell, RDU is a vLCM feature that creates a new, already updated vCenter Server virtual machine and copies all data from the running vCenter to it. After the data is copied, all we need to do is switch over to the updated copy of vCenter. The switchover takes less time than the full upgrade procedure and minimizes vCenter downtime.

In this article, we will look at how to update the vCenter server from version 8.0 Update 1 to version 8.0 Update 2 using RDU.

Continue reading “Updating the VMware vCenter server from version 8.0 U1 to 8.0 U2 using the Reduced Downtime Upgrade procedure”


Deploying Nutanix Community Edition 2.0 on VMware vSphere 8

This year Nutanix Community Edition 2.0 (CE 2.0) was released, based on Nutanix AOS 6.5, the latest LTS version at the time of writing.

In this article, we’ll take a closer look at how to deploy a three-node Nutanix CE 2.0 cluster on a VMware vSphere 8 infrastructure.

Continue reading “Deploying Nutanix Community Edition 2.0 on VMware vSphere 8”
