Post date: Oct 31, 2014 10:1:46 AM
Symptoms
You cannot observe the progress when you delete a snapshot.
It is difficult to determine the status of snapshot deletion.
The snapshot removal task stops at 95% or 99% and does not appear to proceed.
The hostd process does not respond.
Storage vMotion migration stays at 18% for a long time.
You need commands to monitor snapshot deletion in ESXi/ESX.
Purpose
This article provides information on monitoring directories using the watch command and waiting for the snapshot deletion operations to complete in ESX and ESXi.
For more information on how snapshots work, see Understanding virtual machine snapshots in VMware ESXi and ESX (1015180).
Resolution
Note: The time required to commit snapshots varies from one environment to another and cannot be predicted precisely.
To monitor directories during snapshot deletion in ESX 3.5/4.x and ESXi 4.1/5.x:
Note: This method does not work if the base disks are virtual-mode RDMs. You see the delta files being read and their "touch" times updating, but the time stamps of the RDM pointer file are not updated. If this method does not work, complete the Alternate Workaround described below.
Log in as root to the ESX host using SSH. For more information, see Connecting to an ESX host using a SSH client (1019852) or Using Tech Support Mode in ESXi 4.1 and ESXi 5.x (1017910).
Navigate to the virtual machine directory containing vmdk virtual disk files.
List files in the directory by executing:
# ls -al
Identify any VM_NAME-00000#.vmdk or VM_NAME-00000#-delta.vmdk snapshot files by looking for the number that follows the hyphen (-) in the file name. In ESXi 5.5, if the vmdk is larger than 2 TB, the snapshot file uses the VM_NAME-00000#-sesparse.vmdk format.
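The naming rules above can be sketched as a small shell helper. This is an illustrative sketch only, not part of the original procedure: the function name classify_vmdk and the labels it prints are hypothetical, and it assumes the number after the hyphen is always six digits, as in the VM_NAME-00000# pattern.

```shell
# Classify a file name according to the snapshot naming patterns.
# classify_vmdk and its output labels are hypothetical names.
classify_vmdk() {
    case "$1" in
        *-[0-9][0-9][0-9][0-9][0-9][0-9]-delta.vmdk)    echo "snapshot-delta" ;;
        *-[0-9][0-9][0-9][0-9][0-9][0-9]-sesparse.vmdk) echo "snapshot-sesparse" ;;
        *-[0-9][0-9][0-9][0-9][0-9][0-9].vmdk)          echo "snapshot-descriptor" ;;
        *-flat.vmdk)                                    echo "base-flat" ;;
        *.vmdk)                                         echo "other-vmdk" ;;
        *)                                              echo "not-vmdk" ;;
    esac
}

# Label every vmdk file in the current directory (prints nothing if none exist).
for f in *.vmdk; do
    [ -e "$f" ] || continue
    printf '%-20s %s\n' "$(classify_vmdk "$f")" "$f"
done
```

Run it from the virtual machine directory. Any line labeled snapshot-delta, snapshot-sesparse, or snapshot-descriptor corresponds to a snapshot file that the later steps monitor.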
To monitor the snapshot and base disk files that are currently being updated, use the following watch command:
# watch -d 'ls -luth | grep -E "delta|flat|sesparse"'
where:
-d (watch option) highlights the differences between successive updates
-l shows a long listing, which displays additional file information
-u, combined with -l and -t, shows and sorts by access time instead of modification time
-t sorts by time, newest first
-h prints sizes in a human-readable format such as 1K, 234M, 2G
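The difference between sorting by modification time (-t) and by access time (-ut) can be checked with a small experiment on any shell. The following is a minimal sketch, not part of the original procedure: the file names a.vmdk and b.vmdk and the helpers make_demo_dir and newest are hypothetical, and it assumes a touch that supports -a, -m, and -t and an mktemp that supports -d (some BusyBox builds may lack these).

```shell
# Build a scratch directory with two empty files whose access and
# modification times disagree (hypothetical names, for illustration only).
make_demo_dir() {
    d=$(mktemp -d) || return 1
    touch -a -t 202001010000 "$d/a.vmdk"   # a: accessed in 2020 ...
    touch -m -t 202201010000 "$d/a.vmdk"   # ... but modified in 2022
    touch -a -t 202301010000 "$d/b.vmdk"   # b: accessed in 2023 ...
    touch -m -t 202101010000 "$d/b.vmdk"   # ... but modified in 2021
    echo "$d"
}

# Print the first (newest) entry of directory $2 using the ls sort flags in $1.
newest() {
    ls "$1" "$2" | head -n 1
}

demo=$(make_demo_dir)
echo "newest by modification time (-t):  $(newest -t "$demo")"
echo "newest by access time (-ut):       $(newest -ut "$demo")"
rm -rf "$demo"
```

With -t alone, a.vmdk is listed first because it was modified most recently. With -ut, b.vmdk is listed first because it was accessed most recently. The watch command above uses -u, so it lists files in access-time order.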
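You can also run the following command to check whether the time stamps of the base disks are updating, which confirms whether the process is working:
# ls -lrt | grep -E "flat|delta|sesparse"
This command lists the contents of the directory sorted by modification time, oldest first, so the most recently modified files appear at the bottom. It runs once, so repeat it to see whether the time stamps change.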
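Comparing two listings by eye can be error-prone, so the check can be scripted. The following is a rough sketch under stated assumptions, not a VMware-provided tool: the function names vmdk_listing and vmdk_changed are hypothetical, and because ls -l typically shows times to the minute, a change is detected only if a size or a displayed time differs between the two samples.

```shell
# Long listing of the delta, flat, and sesparse files in directory $1
# (default: current directory). Hypothetical helper name.
vmdk_listing() {
    ls -l "${1:-.}" | grep -E "delta|flat|sesparse"
}

# Take two listings $2 seconds apart (default 60) and succeed (exit 0)
# only if they differ. Runs in a subshell so no variables leak out.
vmdk_changed() (
    dir=${1:-.}
    interval=${2:-60}
    before=$(vmdk_listing "$dir")
    sleep "$interval"
    after=$(vmdk_listing "$dir")
    [ "$before" != "$after" ]
)

# Example use from the virtual machine directory:
#   if vmdk_changed . 60; then echo "files are changing"; else echo "no change seen"; fi
```

A result of "no change seen" over one interval does not prove that the operation has stopped. Repeat the check over a longer period before drawing a conclusion.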
Note: In ESX 3.5 and 4.0 (before Update 2), each snapshot delta file is written to the previous snapshot delta file in turn, and the data is finally written to the base disk (flat). In ESX/ESXi 4.0 Update 2 and later, the process works differently: the data in the snapshots (deltas) is written directly to the base disk (flat). For more information on the snapshot process, see Understanding virtual machine snapshots in VMware ESXi and ESX (1015180) and Consolidating snapshots in ESX/ESXi 3.x and 4.x (1007849).
If there are more than 10 snapshots, use this command to monitor the snapshot commit process and to prevent the screen from filling with too many files:
# while true;do date;ls -lht *vmdk|head -10;echo ________;sleep 3;done
Note: You can quit the consolidation process monitoring by pressing Ctrl + C.
Alternate Workaround
If the time stamps are not updating, complete this alternate workaround to check whether the consolidation is still making progress (this does not apply if the vmdks are on an NFS datastore):
Log in as root to the ESX host using SSH. For more information, see Connecting to an ESX host using a SSH client (1019852), or Using Tech Support Mode in ESXi 4.1 and ESXi 5.x (1017910).
Run esxtop.
Press V to see only running virtual machines.
Note: This is not the same as using the "v" option.
Find the virtual machine running the consolidation.
Type e to expand.
Enter the Group World ID (value from GID column).
Press Enter.
Make a note of the World ID (ID column) of the snapshot consolidation process:
In ESX/ESXi 3.x and 4.x, the process is called SnapshotVMXCombiner
In ESXi 5.x, the process is called vmx-SnapshotVMX
Type u to display the disk device statistics.
Type e to expand, then enter the device to which the snapshot consolidation process is writing.
For example: naa.xxx value
Note: For a regular vmdk file, the device is the datastore on which the flat file is located. For an RDM, the device is the RDM device itself. For a flat vmdk, you can identify the datastore device ID by running esxcfg-scsidevs -m. For an RDM, running vmkfstools -q against the pointer file reveals the vml ID, which must be correlated with the output of ls -l /vmfs/devices/disks/ to get the device ID. For more information, see Identifying disks when working with VMware ESX/ESXi (1014953).
Identify the Group World ID noted in step 6.
Look at the IOPS and throughput for the consolidation process (WRITES/s and MBWRTN/s columns) to ensure that there is activity and that the process is actually doing work.
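The datastore-to-device lookup mentioned in the note above can be sketched as a filter over the output of esxcfg-scsidevs -m. This is a hedged sketch only: the function name device_for_datastore is hypothetical, and the parsing assumes that each output line starts with the device name followed by :partition and ends with the datastore label, which should be confirmed against the actual output on the host. Labels that contain spaces are not handled.

```shell
# Print the device ID backing the datastore label given in $1.
# Reads "esxcfg-scsidevs -m" style output on stdin.
# ASSUMPTION: field 1 is "<device>:<partition>" and the last field is the
# datastore label. Verify this layout on the host before relying on it.
device_for_datastore() {
    awk -v ds="$1" '
        $NF == ds {
            sub(/:[0-9]+$/, "", $1)   # drop the trailing ":partition"
            print $1
        }'
}

# Example use on the host ("datastore1" is a placeholder label):
#   esxcfg-scsidevs -m | device_for_datastore datastore1
```

The value printed is the device to enter in the esxtop disk device view when expanding the device in the steps above.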
To monitor directories during snapshot deletion in ESXi 3.5/4.0:
Log in as root to the ESXi host using the Tech Support mode. For more information, see Tech Support Mode for Emergency Support (1003677).
Change to the virtual machine directory /vmfs/volumes/datastore/VM.
Open a text editor such as vi or nano and create the file snapmon. For more/related information, see Editing files on an ESX host using vi or nano (1020302).
Add this script to the snapmon file:
#!/bin/sh
# Refresh a listing of the vmdk files, sorted by access time, every 2 seconds.
clear
while true
do
date; ls -luth *.vmdk
sleep 2
clear
done
Save and close the file.
Run this command to make the file executable:
# chmod ug+x snapmon
Commit the snapshot from the Snapshot Manager.
Execute the snapmon script to monitor the snapshot:
# ./snapmon
Note: This process can be sent to the background and brought back to the foreground using the fg command.
For ESXi 4.x and ESXi 5.x, you also have the option of using vim-cmd.
For more detailed monitoring, use these commands to confirm that the consolidation or deletion task is active.
In an SSH shell session on the host that is performing the consolidation or snapshot deletion task, run:
# vim-cmd vimsvc/task_list
You see a task similar to:
(ManagedObjectReference) [
'vim.Task:haTask-9-vim.VirtualMachine.removeAllSnapshots-304060994'
]
In the same SSH shell session, run task_info with the task listed by the first command:
vim-cmd vimsvc/task_info <task listed by the first command>
# vim-cmd vimsvc/task_info haTask-9-vim.VirtualMachine.removeAllSnapshots-304060994
This is the output of the above command:
(vim.TaskInfo) {
dynamicType = <unset>,
key = "haTask-9-vim.VirtualMachine.removeAllSnapshots-304060994",
task = 'vim.Task:haTask-9-vim.VirtualMachine.removeAllSnapshots-304060994',
description = (vmodl.LocalizableMessage) null,
name = "vim.VirtualMachine.removeAllSnapshots",
descriptionId = "VirtualMachine.removeAllSnapshots", <--running process>
entity = 'vim.VirtualMachine:9', <---VIM ID>
entityName = "SvC5sql01", <---Virtual Machine name>
state = "running", <---make sure the status is running and is not in an error state>
cancelled = false,
cancelable = false,
error = (vmodl.MethodFault) null,
result = <unset>,
progress = 33, <---progress of task>
reason = (vim.TaskReasonUser) {
dynamicType = <unset>,
userName = "root",
},
queueTime = "2013-10-02T07:22:02.224526Z",
startTime = "2013-10-02T07:22:02.225526Z",
completeTime = <unset>,
eventChainId = 304060994,
changeTag = <unset>,
parentTaskKey = <unset>,
rootTaskKey = <unset>,
}
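The two vim-cmd steps above can be combined into a short script that pulls the task identifier and the fields of interest out of the output. This is a minimal sketch under stated assumptions, not a VMware-provided tool: the function names task_ids, task_field, and show_snapshot_tasks are hypothetical, and the parsing assumes the output format shown above (one quoted 'vim.Task:...' entry per line, and "name = value," lines whose values contain no spaces).

```shell
# Extract task identifiers from "vim-cmd vimsvc/task_list" output on stdin,
# one per line, by stripping the surrounding 'vim.Task:...' wrapper.
task_ids() {
    sed -n "s/.*'vim\.Task:\([^']*\)'.*/\1/p"
}

# Print the value of field $1 (for example state or progress) from
# "vim-cmd vimsvc/task_info" output on stdin. Only the first match is used.
task_field() {
    awk -v key="$1" '
        $1 == key && $2 == "=" {
            val = $3
            gsub(/[",]/, "", val)   # strip quotes and the trailing comma
            print val
            exit
        }'
}

# Print state and progress for every task whose identifier mentions
# "snapshot". Requires vim-cmd, so it only works on the ESXi host itself.
show_snapshot_tasks() {
    vim-cmd vimsvc/task_list | task_ids | grep -i snapshot | while read -r id; do
        info=$(vim-cmd vimsvc/task_info "$id")
        printf '%s state=%s progress=%s\n' "$id" \
            "$(printf '%s\n' "$info" | task_field state)" \
            "$(printf '%s\n' "$info" | task_field progress)"
    done
}
```

On the host, calling show_snapshot_tasks repeatedly (for example every few minutes) shows whether the progress value is increasing while the state remains running.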