

   




VM-sickbay



Welcome - looks like something is wrong with your virtual machine.

Before we go into details ...
relax ...
drink a coffee and follow the VM-sickbay-rules

 

1. DON'T PANIC
2. do not try to start the VM again
3. do not mount disks from other VMs
4. do not use vmware-mount or vdiskmanager
5. do not use vdk in read/write mode
6. make a copy of all vmware-log files
7. relax


Are you currently running a VM that seems to be in an older state than expected?
Shut it down now - do not run checkdisk and do not defragment.

 

In case you need to ask for help at VMTN or VMware-forum it is important that you provide sufficient information.
Without exact details the person trying to help you may misjudge the cause of the disease
and so prescribe inadequate treatment.
- Anamnesis: describe symptoms and conditions - post at VMTN or vmware-forum.de
- Therapy: fix or repair
- Typical symptoms
- Post mortem: recover data
- Plan B: for the desperate



DISCLAIMER:

The procedures I suggest for some typical problems often require advanced skills.
I can't take any responsibility if you mess up your data while following these tips.
This just lists what I would do in such a case.
Please also note that official VMware support will very likely claim that the cases listed as "restore" are lost.
Don't give up too early - you may still be able to rescue some data.

For in-depth discussion of the listed procedures use the search function at VMTN, the German forum and the sanbarrow forum.
Search for username "continuum" and terms from the disaster description table.

Ulli Hankeln



 





Anamnesis

 

 


In most cases this data is sufficient for a first analysis:

- a detailed file-listing of all files in the VM-directory - this list should mention date, size and permissions
- all files smaller than 100 kb in the VM-directory
- screenshot of error message - if possible
- screenshot of datastorebrowser - if ESX
- screenshot of snapshotmanager

Please download the VM-sickbay-help-request.txt
and answer all further questions listed there.
Please simply edit the text-file.

Then pack all the files mentioned above along with your edited help-request
into a single zip-archive. Give that zip-file a reasonable name -
I already have more than enough vmware.zips and log.zips ...
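
The collection step above can be sketched in Python. This is only a rough illustration, not an official tool: the function names and the name "file-listing.txt" are made up for this example. It only reads from the VM-directory; write the zip-file somewhere outside of it.

```python
import stat
import time
import zipfile
from pathlib import Path

MAX_SMALL_FILE = 100 * 1024  # "smaller than 100 kb" from the list above


def file_listing(vm_dir: Path) -> str:
    """Return one line per file: permissions, size in bytes, date, name."""
    lines = []
    for path in sorted(vm_dir.iterdir()):
        if not path.is_file():
            continue  # lock directories and other folders are skipped
        info = path.stat()
        mtime = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(info.st_mtime))
        lines.append(
            f"{stat.filemode(info.st_mode)} {info.st_size:>14} {mtime} {path.name}"
        )
    return "\n".join(lines) + "\n"


def build_report(vm_dir: Path, zip_path: Path) -> list[str]:
    """Pack the listing plus every small file into zip_path; return names added."""
    added = ["file-listing.txt"]
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("file-listing.txt", file_listing(vm_dir))
        for path in sorted(vm_dir.iterdir()):
            if path.is_file() and path.stat().st_size < MAX_SMALL_FILE:
                archive.write(path, arcname=path.name)
                added.append(path.name)
    return added
```

Screenshots and the edited help-request still have to be added by hand.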

IMPORTANT:
If you decide to ask someone for help, first power down the VM.
Then create the report-zip and send it.

Do NOT power on the VM before you have a good theory of what went wrong and a fix.

 


Therapy

 



often used procedures ...
disktypes-table ...
the CID-chain ...

In case you want to try to fix the problem on your own - the next table lists some common diseases and the therapy I would prescribe.


| Disease | Platform | Chances | Suggested procedure |
| --- | --- | --- | --- |
| VMware complains - can not resume (Display-messages) | hosted | unknown | read vmware.log - find out expected screen resolution - change host resolution and try again; if that does not work remove *.vmss to force a fresh start |
| VMware complains - can not resume (CPU-messages) | hosted | like hard reboot | disable cpu-compat-test in the vmx-file; if that does not work remove *.vmss to force a fresh start |
| VMware complains - a file is in use, cannot start (in-use or locked messages) | | good | remove stale lockdirs/lockfiles |
| VMware complains - file too large, "msg.disklib.tooBigForFS" in vmware.log | hosted | unknown | in case you tried to copy a vmdk to FAT32, try to add this line to the vmx-file: diskLib.sparseMaxFileSizeCheck = "FALSE" |
| only *-flat.vmdk exists | all | good | use this windows-tool and read |
| descriptor vmdk is lost | all | good | use sample from list; restore entries from last log |
| starting a Linked Clone messed up original VM | all | bad | restore still usable branches |
| vmx is lost or blank | all | good | restore from last log |
| VMware complains - disk needs repair (monolithicSparse) | hosted | ??? | try -R parameter of recent vmware-vdiskmanager: vmware-vdiskmanager.exe -R corrupt.vmdk; if that does not work try to mount the vmdk with vdk; if that does not work either use Plan B |
| VMware complains - file is not a virtual disk (monolithicSparse) | hosted | ??? | try -R parameter of recent vmware-vdiskmanager: vmware-vdiskmanager.exe -R corrupt.vmdk |
| VMware complains - file is not a virtual disk (descriptor vmdk is blank) | all | good | use sample from list; restore entries from last log |
| VMware complains - file is not a virtual disk (monolithicSparse, embedded descriptor looks ok) | hosted | ??? | try -R parameter of recent vmware-vdiskmanager: vmware-vdiskmanager.exe -R corrupt.vmdk |
| VMware complains - file is not a virtual disk (monolithicSparse, embedded descriptor is blank or corrupt) | hosted | unknown | use sample from list; restore entries from log; see howto; inject with dsfi.exe |
| VMware complains - parent has been changed | all | depends | fix CID-chain |
| VMware complains - parent has been changed (embedded descriptor) | hosted | depends | extract with dsfo.exe; fix CID-chain in the embedded descriptor; inject with dsfi.exe |
| parent has been expanded - monolithicFlat | all | depends on skills of admin | cut vmdk; fix geometry; fix CID-chain |
| parent has been expanded - split vmdk | all | | delete chunks added during expand; fix geometry; fix CID-chain |
| parent has been expanded - monolithicSparse, complex snapshot-tree | hosted | | extract descriptor of basedisk with dsfo.exe; fix geometry to the size before expand; fix CID-chain; inject with dsfi.exe |
| parent has been expanded - monolithicSparse, single snapshot | hosted | | extract descriptor of snapshot with dsfo.exe; fix Extent description to the size after expand; fix CID-chain; inject with dsfi.exe |
       



Typical symptoms


In the following I list some typical error-messages along with the suggested fix ...


Could not open virtual machine: N:\test\test.vmx.
"N:\test\test.vmx" is not a valid virtual machine configuration file.

Check for missing files failed.
Snapshots are not allowed on this virtual machine.


Sometimes a crash of the host - no matter if Windows, Linux or ESX - results in a blank vmx.
The error message is not always useful.
If the last vmware.log is available, restore the vmx-file from it.
If you rebuild the vmx from scratch you may easily assign the wrong snapshot and so do more harm than good.
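
A minimal Python sketch of pulling the vmx entries out of a vmware.log. It rests on an assumption about the log layout that you should check against your own log: the configuration is dumped as lines containing `DICT key = "value"` after a `DICT --- CONFIGURATION` marker, and the section ends at the next `DICT ---` marker. The function names are hypothetical.

```python
import re

# Assumed log shape: '... vmx| ... DICT   key = "value"'.
DICT_ENTRY = re.compile(r'DICT\s+(?P<key>[^\s=]+)\s*=\s*(?P<value>.*)$')


def vmx_entries_from_log(log_text: str) -> dict[str, str]:
    """Collect key/value pairs from the CONFIGURATION section of the log."""
    entries: dict[str, str] = {}
    in_config = False
    for line in log_text.splitlines():
        if "DICT ---" in line:
            # A section marker: only the CONFIGURATION section is wanted.
            in_config = "CONFIGURATION" in line
            continue
        if not in_config:
            continue
        match = DICT_ENTRY.search(line)
        if match:
            entries[match.group("key")] = match.group("value").strip().strip('"')
    return entries


def render_vmx(entries: dict[str, str]) -> str:
    """Write the entries back in vmx syntax, one 'key = "value"' per line."""
    return "".join(f'{key} = "{value}"\n' for key, value in entries.items())
```

Review the result by hand before using it, especially the fileName entries for the disks - they show which snapshot vmdk was in use when the log was written.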

| Disease | Platform | Chances | Suggested procedure |
| --- | --- | --- | --- |
| vmx is lost or blank | all | good | restore from last log |

Cannot open the disk N:\test\test-000001.vmdk or one of the snapshot disks it depends on.
Reason: The specified virtual disk needs repair.


This may happen with sparse disks on hosted platforms.
In the vmware.log this may appear as "Grain #* @* is orphaned. "
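
A small Python sketch for checking a copy of vmware.log for the signatures quoted on this page. The `*` wildcards are matched loosely because the exact number format is an assumption; the labels and the function name are made up for this example.

```python
import re

# Signatures quoted on this page; adjust the patterns to your own log.
SIGNATURES = {
    "orphaned grain": re.compile(r"Grain #\S+ @\S+ is orphaned"),
    "file too large": re.compile(r"msg\.disklib\.tooBigForFS"),
}


def scan_log(log_text: str) -> dict[str, int]:
    """Count how many log lines match each known signature."""
    counts = {name: 0 for name in SIGNATURES}
    for line in log_text.splitlines():
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```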

| Disease | Platform | Chances | Suggested procedure |
| --- | --- | --- | --- |
| VMware complains - disk needs repair (monolithicSparse) | hosted | ??? | try -R parameter of recent vmware-vdiskmanager: vmware-vdiskmanager.exe -R corrupt.vmdk; if that does not work try to mount the vmdk with vdk; if that does not work either use Plan B |

Check for missing files failed.
The file specified is not a virtual disk.

Sometimes a crash of the host - no matter if Windows, Linux or ESX - results in a blank vmdk descriptor.

| Disease | Platform | Chances | Suggested procedure |
| --- | --- | --- | --- |
| VMware complains - file is not a virtual disk (monolithicSparse) | hosted | ??? | try -R parameter of recent vmware-vdiskmanager: vmware-vdiskmanager.exe -R corrupt.vmdk |
| VMware complains - file is not a virtual disk (descriptor vmdk is blank) | all | good | use sample from list; restore entries from last log |
| VMware complains - file is not a virtual disk (monolithicSparse, embedded descriptor looks ok) | hosted | ??? | try -R parameter of recent vmware-vdiskmanager: vmware-vdiskmanager.exe -R corrupt.vmdk |
| VMware complains - file is not a virtual disk (monolithicSparse, embedded descriptor is blank or corrupt) | hosted | unknown | use sample from list; restore entries from log; see howto; inject with dsfi.exe |

Cannot open the disk N:\test\test-000001.vmdk or one of the snapshot disks it depends on.
Reason: The parent virtual disk has been modified since the child was created.

There are several possible reasons for this message:
- a vmdk was used with a second VM
- unwise manual edit of the vmx-file
- unwise operations with vmware-vdiskmanager or vmkfstools
- failed operations with snapshotmanager
- bugs, host crashes ....

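| Disease | Platform | Chances | Suggested procedure |
| --- | --- | --- | --- |
| VMware complains - parent has been changed | all | depends | fix CID-chain |
| VMware complains - parent has been changed (embedded descriptor) | hosted | depends | extract with dsfo.exe; fix CID-chain in the embedded descriptor; inject with dsfi.exe |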

In case a VM with snapshots does not start anymore because a basedisk was expanded you have two options:
- fix the snapshots
- fix the basedisk
The first option only makes sense with VMs that only have one snapshot.
In all other cases it is very probably easier to undo the expand of the basedisk.

| Disease | Platform | Chances | Suggested procedure |
| --- | --- | --- | --- |
| parent has been expanded - monolithicFlat | all | depends on skills of admin | cut vmdk; fix geometry; fix CID-chain |
| parent has been expanded - split vmdk | all | | delete chunks added during expand; fix geometry; fix CID-chain |
| parent has been expanded - monolithicSparse | hosted | | either: extract descriptor of basedisk with dsfo.exe; fix geometry to the size before expand; fix CID-chain; inject with dsfi.exe. Or: extract descriptor of snapshot with dsfo.exe; fix Extent description to the size after expand; fix CID-chain; inject with dsfi.exe |
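
For "fix geometry", the cylinder count has to match the nominal size in sectors. A minimal Python sketch of that arithmetic, assuming a geometry of 255 heads and 63 sectors per track - these two values are an assumption, so check ddb.geometry.heads and ddb.geometry.sectors in your own descriptor. With the 104857600 sectors from the sample descriptors in the CID-chain section it gives the 6527 cylinders shown there.

```python
SECTOR_SIZE = 512        # bytes per sector
HEADS = 255              # assumed, compare with ddb.geometry.heads
SECTORS_PER_TRACK = 63   # assumed, compare with ddb.geometry.sectors


def cylinders_for(sectors: int) -> int:
    """Cylinder count for a disk with the given nominal size in sectors."""
    return sectors // (HEADS * SECTORS_PER_TRACK)


def size_in_gib(sectors: int) -> float:
    """Nominal disk size in GiB for the given number of sectors."""
    return sectors * SECTOR_SIZE / 2**30


print(cylinders_for(104857600))  # 6527
print(size_in_gib(104857600))    # 50.0
```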

 



The destination file system does not support large files

Did you move your VM to FAT32?
Are you using NTFS with Linux or EXT2 with Windows?
Obviously it is a bad idea to copy vmdk-files larger than 2 GB to FAT32.
If you think your filesystem should support the files you use,
you may try this entry in the vmx-file:
diskLib.sparseMaxFileSizeCheck = "false"



The version of the virtual disk is newer than the version supported by this program

Open the descriptor of the vmdk and comment out the offending lines.



The SVGA mode stored in the snapshot cannot be restored on this display ...

Read vmware.log - find out expected screen resolution - change host resolution and try again.
If that does not work remove *.vmss to force a fresh start.


The suspended image contains a virtual machine that uses floating point features that do not match the supported features on the real machine ...

Disable the cpu-compatibility-test in the vmx-file by adding these lines for the next start:
checkpoint.overrideVersionCheck = "true"

checkpoint.disableCpuCheck = "true"

If that does not work remove *.vmss to force a fresh start.
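
Adding such entries to a vmx-file can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, and treating vmx keys as case-insensitive is an assumption. Run it on a copy of the vmx-file while the VM is powered off.

```python
def set_vmx_options(vmx_text: str, options: dict[str, str]) -> str:
    """Return vmx_text with every option present exactly once."""
    # Keys are compared case-insensitively (assumption).
    wanted = {key.lower(): (key, value) for key, value in options.items()}
    kept = []
    for line in vmx_text.splitlines():
        key = line.split("=", 1)[0].strip().lower()
        if "=" in line and key in wanted:
            continue  # drop the old value; the new one is appended below
        kept.append(line)
    kept.extend(f'{key} = "{value}"' for key, value in wanted.values())
    return "\n".join(kept) + "\n"


cpu_check_off = {
    "checkpoint.overrideVersionCheck": "true",
    "checkpoint.disableCpuCheck": "true",
}
print(set_vmx_options('config.version = "8"\n', cpu_check_off))
```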







 


Post mortem





In the following cases there is little hope of recovering the vmdk as is.
So the first thing to try is to repair the vmdk so that it can be mounted with vdk or a helper VM.
If that fails - and the data is important - try commercial tools - I have had good results with UFS-explorer.
If that does not work there is always Plan B as the ultima ratio.


| Post mortem | Platform | Chances | Trickiness | Suggested procedure |
| --- | --- | --- | --- | --- |
| basedisk is lost - only snapshots are left | ESX | recover single files | advanced | fake basedisk; fix CID-chain; restore data with LiveCD |
| basedisk is lost - only snapshots are left | Windows | recover single files | advanced | fake basedisk; fix CID-chain; mount with vdk; restore data with LiveCD |
| basedisk is lost - only snapshots are left | Linux | recover single files | advanced | fake basedisk; fix CID-chain; restore data with Helix LiveCD |
| one or more flat chunks of a "twoGbMaxExtentFlat" disk are missing | hosted | restore | average | fake chunks; mount with vdk |
| one or more sparse chunks of a "twoGbMaxExtentSparse" disk are missing | hosted | restore | advanced | fake chunks; mount with vdk |
| first chunk of a "twoGbMaxExtentSparse" disk is missing | hosted | very bad | advanced | Plan B |
| "monolithicSparse" is too small after running out of disk-space | hosted | restore | advanced | copy to location with disk-space; Plan B; restore data |
| one or more chunks of a "twoGbMaxExtentFlat" are too small after running out of disk-space | hosted | restore | advanced | copy to location with disk-space; expand; mount with vdk; restore data |
| "monolithicFlat" is too small after running out of disk-space | all | restore | advanced | copy to location with disk-space; expand; mount with vdk; restore data |
| one or more chunks of a "twoGbMaxExtentSparse" are too small after running out of disk-space | hosted | restore | advanced | copy to location with disk-space; Plan B; restore data |
| disk has holes (needs repair) | hosted | restore | advanced | mount with vdk; restore data |
| starting a Linked Clone messed up original VM | all | bad | very advanced | restore still usable branches |



 


Plan B


Cases:
Rescue files from a virtual disk that is corrupted beyond repair - no matter what platform.
Rescue files from a standalone snapshot.
Rescue files from single chunks of split vmdks

Overview of the procedure:
create large new monolithicFlat vmdk
wipe disk with zeros using a helper VM or a LiveCD like Knoppix, UBCD4Win or MOA
format with filesystem used by the corrupted chunks
copy corrupted chunks into the disk
analyse with helper VM or LiveCD - using tools like UFS-explorer, GetDataBack ....
search for lost, deleted files

Problem:
Folks may tell you this can't work.
Ignore them. You are desperate - aren't you?
I have had good results rescuing files from NTFS-formatted corrupt vmdks.
The procedure is worth a try no matter what the original platform was.
I have used it with broken vmdks from ESX as well as from hosted platforms.

Limitations:
This technique can only recover files that were newly created and saved during the timeframe the given chunk was active.
You cannot restore files that were created in snapshot1 and last saved in snapshot2.

 




the CID-chain

VMware uses CID-values to verify if the snapshot chain is clean before starting a VM.
If a snapshot chain is corrupt the CID-chain is broken.

Snapshot chains may get corrupted when
- snapshots were deleted manually
- the host crashed
- the system ran out of diskspace
- unwise manual edits of vmdk or vmx-files
- a virtual disk was attached to a different VM
- the basedisk was expanded
- ....
All the cases mentioned will be noticed by VMware at startup because of a break in the CID-chain.
As starting the VM in these conditions would further corrupt the virtual disks, the VM will not be started.


In this simplified listing of one basedisk and its two snapshots in all
cases the child references the CID-value of its parent correctly.

###################### Windows Vista.txt ########################
CID=9a1f1a1f
parentCID=ffffffff
RW 104857600 SPARSE "Windows Vista.vmdk"
ddb.geometry.cylinders = "6527"

###################### Windows Vista-000004.txt ########################
CID=5cdd6af0
parentCID=9a1f1a1f
parentFileNameHint="Windows Vista.vmdk"
RW 104857600 SPARSE "Windows Vista-000004.vmdk"

###################### Windows Vista-000002.txt ########################
CID=c750afeb
parentCID=5cdd6af0
parentFileNameHint="Windows Vista-000004.vmdk"
RW 104857600 SPARSE "Windows Vista-000002.vmdk"

 


In this simplified listing of one basedisk and its two snapshots
the parentCID in the first snapshot does NOT reference the correct CID-value of its parent.
This means that during its last use the "Windows Vista.vmdk" probably was used
by another VM, or it was expanded or ...


###################### Windows Vista.txt ########################
CID=a123b123
parentCID=ffffffff
RW 104857600 SPARSE "Windows Vista.vmdk"

###################### Windows Vista-000004.txt ########################
CID=5cdd6af0
parentCID=9a1f1a1f
parentFileNameHint="Windows Vista.vmdk"
RW 104857600 SPARSE "Windows Vista-000004.vmdk"

###################### Windows Vista-000002.txt ########################
CID=c750afeb
parentCID=5cdd6af0
parentFileNameHint="Windows Vista-000004.vmdk"
RW 104857600 SPARSE "Windows Vista-000002.vmdk"
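
The check illustrated by the two listings above can be sketched in Python. This is a simplified illustration, not VMware's own logic: it only reads CID, parentCID and parentFileNameHint from descriptor text you have already extracted, and the function names are made up for this example.

```python
import re

FIELD = re.compile(
    r'^(CID|parentCID|parentFileNameHint)\s*=\s*"?([^"\n]*)"?\s*$', re.MULTILINE
)


def parse_descriptor(text: str) -> dict[str, str]:
    """Pick CID, parentCID and parentFileNameHint out of a descriptor."""
    return {key: value.strip() for key, value in FIELD.findall(text)}


def find_cid_breaks(descriptors: dict[str, str]) -> list[str]:
    """descriptors maps vmdk name -> descriptor text; returns the breaks found."""
    parsed = {name: parse_descriptor(text) for name, text in descriptors.items()}
    problems = []
    for name, fields in parsed.items():
        parent_cid = fields.get("parentCID", "").lower()
        if parent_cid == "ffffffff":
            continue  # basedisk: there is no parent to check
        parent_name = fields.get("parentFileNameHint")
        parent = parsed.get(parent_name)
        if parent is None:
            problems.append(f"{name}: parent {parent_name!r} not found")
        elif parent.get("CID", "").lower() != parent_cid:
            problems.append(
                f"{name}: parentCID={parent_cid} but "
                f"{parent_name} has CID={parent.get('CID')}"
            )
    return problems
```

Fed with the second listing it reports one break, at "Windows Vista-000004.vmdk"; fed with the first listing it reports none.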



CID=fffffffe
parentCID=ffffffff
This vmdk is a newly created basedisk.

CID=********
parentCID=ffffffff
This vmdk is a basedisk.

CID=12345678
parentCID=12345678
When the parentCID-value matches the CID-value this snapshot may be an orphan. There is something very wrong with your snapshot chain.




Therapy : fake basedisk

 


Cases:
Rescue files from a standalone snapshot.
Rescue files from a snapshot chain

find out what type of vmdk you need
find out what nominal size you need
find out which name the fake basedisk must use
create new disk with vmware-vdiskmanager.
find out which filesystem you need
format vmdk using a helper VM or a LiveCD
copy the vmdk into right path
find out which CID value is needed
attach snapshot
mount snapshot with helper VM or LiveCD
use recovery tools like UFS-explorer, GetDataBack ...
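
Several of the "find out" steps above can be answered from the descriptor of the oldest snapshot that is left. A minimal Python sketch, assuming the snapshot still has the same nominal size as the lost basedisk (that is, the basedisk was never expanded) and that the extent lines start with RW, RDONLY or NOACCESS; the function name is hypothetical.

```python
import re

SECTOR_SIZE = 512  # bytes per sector


def fake_basedisk_requirements(snapshot_descriptor: str) -> dict[str, object]:
    """Derive name, CID and nominal size the fake basedisk must have."""
    hint = re.search(
        r'^parentFileNameHint\s*=\s*"([^"]*)"', snapshot_descriptor, re.MULTILINE
    )
    cid = re.search(
        r'^parentCID\s*=\s*([0-9a-fA-F]{8})', snapshot_descriptor, re.MULTILINE
    )
    extents = re.findall(
        r'^(?:RW|RDONLY|NOACCESS)\s+(\d+)\s+\S+', snapshot_descriptor, re.MULTILINE
    )
    sectors = sum(int(count) for count in extents)
    return {
        "name": hint.group(1) if hint else None,      # file name the snapshot expects
        "cid": cid.group(1).lower() if cid else None,  # CID the fake basedisk needs
        "sectors": sectors,                            # nominal size in sectors
        "bytes": sectors * SECTOR_SIZE,                # nominal size in bytes
    }
```

For the "Windows Vista-000004.vmdk" descriptor shown in the CID-chain section this gives the name "Windows Vista.vmdk", the CID 9a1f1a1f and 104857600 sectors.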


Chances:
There is a good chance to recover files which were unfragmented inside the snapshot



 


Disktypes with external descriptor


vmfs
vmfsSparse
monolithicFlat
twoGbMaxExtentSparse
twoGbMaxExtentFlat

fullDevice
partitionedDevice
custom

 

 


Disktypes with embedded descriptor



monolithicSparse

streamOptimized

 

to extract the embedded descriptor run

dsfo.exe monolithicSparse.vmdk 512 800 descriptor.bin



to inject the descriptor again run

dsfi.exe monolithicSparse.vmdk 512 800 descriptor.bin


When working with embedded descriptors make sure you never inject more than a full sector = 512 bytes.
The range 512 800 should be safe.
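
If dsfo.exe and dsfi.exe are not at hand, the same two operations can be sketched in Python. This assumes that the two numbers in the commands above mean "offset in bytes" and "size in bytes"; the function names are made up for this example. Always work on a copy of the vmdk.

```python
from pathlib import Path


def extract_bytes(disk: Path, offset: int, size: int) -> bytes:
    """Read size bytes starting at offset, like the dsfo.exe call above."""
    with disk.open("rb") as handle:
        handle.seek(offset)
        return handle.read(size)


def inject_bytes(disk: Path, offset: int, size: int, data: bytes) -> None:
    """Overwrite exactly size bytes at offset, like the dsfi.exe call above."""
    if len(data) != size:
        # Refuse to write a block of the wrong length into the disk.
        raise ValueError(f"expected {size} bytes, got {len(data)}")
    with disk.open("r+b") as handle:  # r+b: overwrite in place, never truncate
        handle.seek(offset)
        handle.write(data)
```

A usage example with the values from above: `descriptor = extract_bytes(Path("copy-of-disk.vmdk"), 512, 800)`. After editing, keep the length at exactly 800 bytes before calling `inject_bytes`.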

 



 

