
Error experienced, causing VM instability, on Warewolf 2.8.6.16

Wynand Vermaak 6 months ago in Server

Hello Team.

I'd like to bring an error to your attention. We are currently seeing it on Warewolf 2.8.6.16, and it is causing the VM that hosts the microservice to become unstable.

Context:

On Monday 2024-07-01 we deployed code to our LeadsServices UAT environment. Because secure.config was changed (a workflow was set to public), we had to restart the service for the change to take effect. The VM restart was performed at around 16:15. After the restart, Sai from DevOps observed a couple of red flags in PRTG monitoring. One of them was the number of concurrent Warewolf connections.

Image 1143

Because this is a relatively new microservice in a UAT environment, there weren't any processes running in the background. I'm attaching the Warewolf log to support this and for your review. The red flags only disappeared after 15 minutes, even though no transactions or workflows were being processed on the microservice.

Attachment: wareWolf-Server-Leads-UAT.zip

However, we have found that when we restart a Production VM that is busy executing many workflows at the same time, these red flags take much longer to disappear. We also observed that workflows and APIs take much longer to execute during this time. For example, a Post Tool that calls an API that normally takes around 3 seconds would take far longer, often to the point of timing out. When this happens, Architecture and DevOps decide to switch off the VM as a precautionary measure, which leaves us without a service in Production.


If we take the time at which we observed the red flags in PRTG and look at the corresponding time in the Warewolf Server logs, it appears that the following error may be the culprit:


2024-07-01 16:15:02,101 ERROR - [Warewolf Error] - Dev2.ServerLifecycleManager
System.Runtime.InteropServices.COMException (0x8007045B): A system shutdown is in progress. (Exception from HRESULT: 0x8007045B)
   at System.Runtime.InteropServices.Marshal.ThrowExceptionForHRInternal(Int32 errorCode, IntPtr errorInfo)
   at System.Management.ManagementScope.InitializeGuts(Object o)
   at System.Management.ManagementScope.Initialize()
   at System.Management.ManagementObjectSearcher.Initialize()
   at System.Management.ManagementObjectSearcher.Get()
   at Dev2.ServerLifecycleManager.GetNumberOfCores()
   at Dev2.ServerLifecycleManager.TrackUsage(UsageType usageType, IExecutionLogPublisher logger)
   at Dev2.ServerLifecycleManager.CleanupServer()
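
For anyone who wants to repeat the correlation between the PRTG alert window and the server log, here is a minimal Python sketch of how we lined the two up. It is not part of Warewolf. The function name `errors_in_window` is hypothetical, the window times in the usage example are placeholders around the 16:15 restart, and it assumes every log entry starts with the same timestamp and level layout as the excerpt above.

```python
import re
from datetime import datetime, timedelta

# Assumed entry header layout, taken from the single excerpt above, e.g.
# "2024-07-01 16:15:02,101 ERROR - [Warewolf Error] - Dev2.ServerLifecycleManager"
ENTRY_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) (\w+) - (.*)$"
)


def errors_in_window(lines, start, end, level="ERROR"):
    """Return (timestamp, header, first_detail_line) tuples for log
    entries of the given level whose timestamp falls in [start, end]."""
    hits = []
    current = None  # the in-window entry still waiting for its detail line
    for raw in lines:
        line = raw.rstrip("\n")
        match = ENTRY_RE.match(line)
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            ts += timedelta(milliseconds=int(match.group(2)))
            current = None
            if match.group(3) == level and start <= ts <= end:
                current = [ts, match.group(4), ""]
                hits.append(current)
        elif current is not None and not current[2] and line.strip():
            # First non-empty line after the header, usually the exception message.
            current[2] = line.strip()
    return [tuple(hit) for hit in hits]


if __name__ == "__main__":
    sample = [
        "2024-07-01 16:15:02,101 ERROR - [Warewolf Error] - Dev2.ServerLifecycleManager",
        "System.Runtime.InteropServices.COMException (0x8007045B): "
        "A system shutdown is in progress. (Exception from HRESULT: 0x8007045B)",
        "at System.Management.ManagementObjectSearcher.Get()",
    ]
    # Placeholder window around the restart; adjust to the PRTG alert times.
    window_start = datetime(2024, 7, 1, 16, 10)
    window_end = datetime(2024, 7, 1, 16, 30)
    for ts, header, detail in errors_in_window(sample, window_start, window_end):
        print(ts.isoformat(), header, "|", detail)
```

To run it against the attached log, pass the opened log file in place of `sample`. The file name inside the zip is not stated here, so substitute the actual one.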


Please feel free to reach out should you need more information. Let's understand and solve this together.


Regards,


Wynand

execution server