Unknown Error

Jul 1, 2009 at 3:36 PM
Edited Jul 1, 2009 at 3:38 PM

Hi dblock,

Can you tell me in what circumstances I might see the following error?

7/1/2009 3:24:02 PM: Vestris.VMWareLib.VMWareException: Unknown error
   at Vestris.VMWareLib.VMWareInterop.Check(UInt64 errCode)
   at Vestris.VMWareLib.VMWareJob.Wait[T](Object[] properties)
   at Vestris.VMWareLib.VMWareJob.Wait[T](Object[] properties, Int32 timeoutInSeconds)
   at Vestris.VMWareLib.VMWareVirtualMachine.RunProgramInGuest(String guestProgramName, String commandLineArgs, Int32 options, Int32 timeoutInSeconds)

I have built an automation system which runs sanity checks on builds coming into QA, but I have been fighting with the bug that affects multi-threading in VIX for a few weeks now :(

http://communities.vmware.com/thread/196429

http://communities.vmware.com/message/1252959

 I've tried two things to get around this:

 - In each automation request, I keep all references to VIX until that request has completed.

 - When an automation request completes, it will wait until all other requests are complete before exiting, thus keeping references to VIX

But I still see the above error on occasion. Can you tell me under what circumstances it is raised?

Thanks,

Jimmy

 

Coordinator
Jul 1, 2009 at 8:07 PM

I wouldn't be 100% sure that your problem is the VIX bug you're describing. Maybe the machine just didn't have time to power up properly. We found that delaying execution of anything after WaitForToolsInGuest by about 10 seconds improves reliability 10-fold.
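A rough sketch of that delay, not part of the library: the helper takes whatever call blocks until VMware Tools responds as a delegate, then sleeps for a settle period. `GuestReadiness` and `WaitThenSettle` are hypothetical names, and the 10 seconds is only the figure that worked for us.

```csharp
using System;
using System.Threading;

// Hypothetical helper: wait for VMware Tools, then give the guest extra time to settle.
static class GuestReadiness
{
    // waitForTools: any call that blocks until Tools responds in the guest.
    // settleDelay:  extra pause before running anything else in the guest.
    public static void WaitThenSettle(Action waitForTools, TimeSpan settleDelay)
    {
        if (waitForTools == null) throw new ArgumentNullException("waitForTools");

        waitForTools();            // blocks until Tools is up (or throws)
        Thread.Sleep(settleDelay); // services inside the guest may still be starting
    }
}
```

You would call it as `GuestReadiness.WaitThenSettle(() => vm.WaitForToolsInGuest(), TimeSpan.FromSeconds(10));`, where `vm` is your own virtual machine object (assuming your version of the wrapper has a parameterless overload of that method).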

On the crashes: I think you'll continue fighting this bug until either a) it's fixed, which is definitely happening in VixCOM 1.6.3 (nobody will tell you when that ships, but it's definitely this year because they have to support vSphere 4.0 ASAP :)), or b) you find a way to properly count references and GC at the right time. The way we've worked around this in a similar scenario is by making sure that all objects are IDisposable and using the using (...) paradigm for those IDisposable objects. In addition, we never make more than one connection to a remote host at a time, and we always call GC.Collect() and GC.WaitForPendingFinalizers() after an explicit Disconnect. That seems to work.
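A minimal sketch of that disconnect-then-collect pattern, assuming you wrap your own host connection. `HostSession` is a hypothetical class, not something the library ships; the only real calls in it are the .NET ones, `GC.Collect()` and `GC.WaitForPendingFinalizers()`.

```csharp
using System;

// Hypothetical wrapper: ties "disconnect, then force a GC" to a using block.
sealed class HostSession : IDisposable
{
    private readonly Action _disconnect;
    private bool _disposed;

    // disconnect: whatever explicitly closes your connection to the host.
    public HostSession(Action disconnect)
    {
        if (disconnect == null) throw new ArgumentNullException("disconnect");
        _disconnect = disconnect;
    }

    public void Dispose()
    {
        if (_disposed) return;         // safe to dispose more than once
        _disposed = true;

        _disconnect();                 // explicit disconnect first
        GC.Collect();                  // then collect unreachable wrapper objects
        GC.WaitForPendingFinalizers(); // and let their finalizers release COM handles
    }
}
```

Usage would look something like `using (new HostSession(() => host.Disconnect())) { ... }`, with `host` being your own connected host object. Forcing a collection is normally discouraged in .NET; here it is a workaround so the handles are released at a predictable time.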

Jul 2, 2009 at 10:24 AM

Do I need to call WaitForToolsInGuest after I revert to a snapshot? I am using a pre-configured snapshot on all my images. The first thing I do is revert to this. I then copy in a number of script files which perform the actual testing of my builds.

These scripts are .exe files, so I'm just using RunProgramInGuest to call them. These run for around 30-40 minutes.

The 'Unknown Error' I describe above seems to happen intermittently. I just lose the handle to the virtual machine completely. The scripts continue to run in the guest OS, but I never know whether they have completed because I don't get any return value. I don't think it's a timeout issue, as I have separately seen the specific 'the operation has timed out' error on occasion.

Coordinator
Jul 2, 2009 at 11:32 AM

If the snapshot is powered down, then you need to PowerOn and WaitForToolsInGuest. If it's live I think you don't have to do it, but maybe you do. I personally never use live snapshots since they take way too much space.

I think you lose the handle because the virtual machine internal structures are disposed. This is a bug and happens in VixCOM when you have 2 connections to the same virtual machine and one is released. VixCOM does some (not so) clever handle sharing.

How many VirtualMachine objects do you create?

  1. If it's more than one simultaneously, get rid of the second one and just share an object.
  2. If it's more than one in a sequence, call GC.Collect() and GC.WaitForPendingFinalizers() after you release it.
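For the first case, one way to share an object is a small cache that opens each virtual machine once and hands the same instance to every caller. This is only a sketch: `SharedVmHandles` and its `open` delegate are hypothetical, and you would plug in however you open a VM today.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cache: at most one handle object per VM path, shared by all callers.
sealed class SharedVmHandles<T> where T : class
{
    private readonly object _gate = new object();
    private readonly Dictionary<string, T> _handles = new Dictionary<string, T>();
    private readonly Func<string, T> _open;

    // open: your own code that opens the VM at the given path.
    public SharedVmHandles(Func<string, T> open)
    {
        if (open == null) throw new ArgumentNullException("open");
        _open = open;
    }

    public T Get(string vmPath)
    {
        lock (_gate) // serialize so two threads never open the same VM twice
        {
            T handle;
            if (!_handles.TryGetValue(vmPath, out handle))
            {
                handle = _open(vmPath);
                _handles.Add(vmPath, handle);
            }
            return handle;
        }
    }
}
```

The cache keeps every handle alive for its own lifetime, so nothing gets released while another thread is still using it.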

 

Jul 2, 2009 at 2:24 PM

To clarify, I'm using a number of different connections (maximum of six) to a number of different virtual machines on the same ESX host.

Each connection is managing an automation job on a particular distinct virtual machine. I never call Disconnect() because I was seeing that VixCOM bug if I did (all handles to other VMs would become invalid if I closed one). When an 'automation request' completes, I ensure that it does not finish execution until all other requests are done.

So I may have six different VirtualMachine objects connected at the same time. I haven't seen any problems with this approach except the intermittent 'Unknown Error' that I described above.

Jul 7, 2009 at 5:14 PM
Edited Jul 7, 2009 at 5:52 PM

Hi dblock,

I still see this 'Unknown Error'. It happens in this scenario, (for example):

I have multiple connections to different virtual machines on the same ESX server. While there are a few VMs running, the tasks on one VM complete, so it can be closed. I don't explicitly disconnect, just log out and power down the VM.

The tasks that are running take around 40 minutes. It seems the next time the API checks that the task hasn't timed out, I get this error:

7/7/2009 1:26:56 PM: Vestris.VMWareLib.VMWareException: Unknown error
   at Vestris.VMWareLib.VMWareInterop.Check(UInt64 errCode)
   at Vestris.VMWareLib.VMWareJob.Wait[T](Object[] properties)
   at Vestris.VMWareLib.VMWareJob.Wait[T](Object[] properties, Int32 timeoutInSeconds)
   at Vestris.VMWareLib.VMWareVirtualMachine.RunProgramInGuest(String guestProgramName, String commandLineArgs, Int32 options, Int32 timeoutInSeconds)

Is there any more information I can give you to get a clearer picture of what's causing this error?

Update: If I call GC.SuppressFinalize(myObject), this doesn't seem to happen (as often?). I haven't seen it yet, with 4 jobs completed.

Thanks,

jbc1

Coordinator
Jul 8, 2009 at 1:40 PM

I still think it's a VixCOM bug. Maybe you should describe this on the VixCOM API forum?

Jul 9, 2009 at 2:34 PM

Will do - Could you give me some tips on the best logs to post?

Does your DLL log to somewhere?

Thanks again.

Coordinator
Jul 9, 2009 at 3:57 PM

On the client side, the vix logs will land in C:\Documents and Settings\<username>\Local Settings\Temp\vmware-<username>\vix-<pid>.log.

In C:\Documents and Settings\All Users\Application Data\VMware\VMware Server\config.ini, add the line

vix.debugLevel = "9"

The DLL is just a wrapper, so there's no real need for additional logging.