Problem with checking if another process is still running.

I have a long-standing issue with finding out whether another process is still running. It seems simple, but there are special conditions that cause things to work improperly. I have set up a tester to help me deal with this, and something interesting happened.

Let me first explain the setup I am using to test.

Client.exe - This is the program I want to watch. There may be multiple instances of this at once, so I watch its PID. I need to know if a particular instance goes away.

Server.exe - This gets a hello message from Client.exe which contains its PID. From this point on, once every second it validates the clients by checking whether a process still exists for each PID it knows about. If a process is missing, it removes that client from the list.
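The validation pass described for Server.exe could be sketched roughly as follows. This is only an illustration: the names prune_dead_clients, is_alive_fn, fake_is_alive and fake_dead_pid are hypothetical and not taken from the real Server.exe, and the liveness check is passed in as a callback so that any of the techniques discussed in this thread can be plugged in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: returns nonzero while the PID is still running. */
typedef int (*is_alive_fn)(unsigned long pid);

/*
 * One validation pass: compact the array in place so that only live PIDs
 * remain. Returns the new number of tracked clients.
 */
size_t prune_dead_clients(unsigned long *pids, size_t count, is_alive_fn is_alive)
{
    assert(is_alive != NULL);
    assert(pids != NULL || count == 0);

    size_t kept = 0;
    for (size_t i = 0; i < count; ++i) {
        if (is_alive(pids[i])) {
            pids[kept++] = pids[i];   /* still running: keep it */
        }
        /* otherwise the PID is dropped from the list */
    }
    return kept;
}

/*
 * Stand-in checker for trying the loop without any Win32 call:
 * it pretends that exactly one PID (fake_dead_pid) has exited.
 */
unsigned long fake_dead_pid = 0;

int fake_is_alive(unsigned long pid)
{
    return pid != fake_dead_pid;
}
```

Server.exe would run such a pass once per second, passing whichever real check it settles on in place of fake_is_alive.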

Tester.exe - Tester launches and kills the Client.exe at set intervals, like every 10 seconds.

Task Manager - Running so I can watch the process table.

Scenario 1:

Server.exe is using CreateToolhelp32Snapshot() to look up the PID.

A. When Tester.exe kills Client.exe, CreateToolhelp32Snapshot() suddenly starts to fail for Server.exe with error 8 (ERROR_NOT_ENOUGH_MEMORY), that is "Out of Memory". Client.exe disappears from Task Manager.

B. If, after A happens, I then exit Tester.exe, things suddenly clear up for Server.exe and it can now see that Client.exe with PID XXXX is gone.

C. When Task Manager kills Client.exe, Server.exe can detect the process disappearing without issue.

Scenario 2:

Server.exe is using OpenProcess(SYNCHRONIZE, FALSE, PID) to see if the PID is running. I am using SYNCHRONIZE because it is the least privileged access AFAIK.

A. When Tester.exe kills Client.exe, OpenProcess() SUCCEEDS long after Client.exe stops running. Again, Task Manager accurately shows that Client.exe is no longer running.

B. After A happens, quitting Tester.exe once again suddenly allows Server.exe to resolve all of its PIDs correctly: it now reports that every PID it collected from Tester.exe's launch/kill cycles is no longer running.

C. If Task Manager kills Client.exe, Server.exe can correctly tell that the PID is no longer there.
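One possible reading of Scenario 2: Windows keeps a process object around until every handle to it has been closed, so OpenProcess() on that PID can keep succeeding after the process has exited if some other program still holds a handle. A process handle becomes signaled when the process terminates, so a zero-timeout wait distinguishes "running" from "exited but still referenced". The sketch below assumes that this is what is happening here. The names proc_state, state_from_wait and query_process are hypothetical, and the two constants in the #else branch are copied from the Windows headers only so that the pure helper also compiles off Windows.

```c
#include <assert.h>

/* Result of a liveness check. */
enum proc_state { PROC_UNKNOWN = -1, PROC_EXITED = 0, PROC_RUNNING = 1 };

#ifdef _WIN32
#include <windows.h>
#else
/* Values as defined in the Windows headers, repeated for non-Windows builds. */
#define WAIT_OBJECT_0 0x00000000UL
#define WAIT_TIMEOUT  258UL
#endif

/* Map the result of a zero-timeout WaitForSingleObject on a process handle. */
enum proc_state state_from_wait(unsigned long wait_result)
{
    if (wait_result == WAIT_TIMEOUT)  return PROC_RUNNING; /* not signaled yet */
    if (wait_result == WAIT_OBJECT_0) return PROC_EXITED;  /* signaled: process ended */
    return PROC_UNKNOWN;                                   /* e.g. WAIT_FAILED */
}

#ifdef _WIN32
enum proc_state query_process(DWORD pid)
{
    assert(pid != 0); /* PID 0 is the System Idle Process, never a client */

    HANDLE h = OpenProcess(SYNCHRONIZE, FALSE, pid);
    if (h == NULL) {
        /* ERROR_INVALID_PARAMETER means no process object has this PID.
         * Any other error (such as access denied) says nothing about
         * whether the process is alive. */
        return GetLastError() == ERROR_INVALID_PARAMETER ? PROC_EXITED
                                                         : PROC_UNKNOWN;
    }

    DWORD wait_result = WaitForSingleObject(h, 0); /* poll, do not block */
    CloseHandle(h);   /* a leaked handle would itself keep the object alive */
    return state_from_wait(wait_result);
}
#endif
```

With this check, a successful OpenProcess() alone is no longer taken as proof that the process is running.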

I know that Tester.exe is probably bugged in some way (maybe it isn't closing a handle). In a way this is a good thing because it is giving me an unusual 'bug' that I would otherwise not be able to see.

The real problem is, how can Server.exe correctly find out if a PID is running? Both the toolhelp snapshot and OpenProcess() do not work right under these circumstances.
[2616 byte] By [DeepT] at [2007-11-20 11:40:05]
# 1 Re: Problem with checking if another process is still running.
Not sure what is going on. Here is a third technique to try: use a named kernel object (say, a mutex).

client.exe

Create a named mutex, say "Clientexe1234", where 1234 is the PID of the exe.
Send the hello message to the server.
Close the handle on the mutex when exiting.

server.exe

Receive the hello from the client. Cache the PID.
To check for existence, create a mutex again with the same name, "Clientexe1234", where 1234 is the PID of the exe you are checking.
If GetLastError returns ERROR_ALREADY_EXISTS after CreateMutex, conclude the process is still running; otherwise it is dead.
In either case, dead or alive, call CloseHandle on the mutex.
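The server-side steps above might look something like this. It is a sketch, not tested against the poster's setup: client_mutex_name and client_mutex_exists are hypothetical helper names, and the "Clientexe" prefix is simply the example name used above. Only the name-building helper is portable; the probe itself is Win32-only.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Build the per-client mutex name, e.g. "Clientexe1234" for PID 1234.
 * Returns the name length, or -1 if the buffer is too small.
 */
int client_mutex_name(unsigned long pid, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    int n = snprintf(buf, size, "Clientexe%lu", pid);
    return (n > 0 && (size_t)n < size) ? n : -1;
}

#ifdef _WIN32
#include <windows.h>

/* Returns 1 if the client's mutex exists, 0 if it does not, -1 if unknown. */
int client_mutex_exists(DWORD pid)
{
    char name[64];
    if (client_mutex_name(pid, name, sizeof name) < 0) {
        return -1;
    }

    HANDLE h = CreateMutexA(NULL, FALSE, name);
    if (h == NULL) {
        return -1;                 /* e.g. access denied: cannot tell */
    }

    DWORD err = GetLastError();    /* read before CloseHandle can change it */
    CloseHandle(h);                /* dead or alive, release our handle */
    return err == ERROR_ALREADY_EXISTS ? 1 : 0;
}
#endif
```

The client would call CreateMutexA with the same name at startup and keep the handle open for its lifetime. If the server runs in a different session from the client (for example as a Windows service), the name would probably need the "Global\" prefix on both sides so that both processes see the same object.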
kirants at 2007-11-10 22:24:34 >
# 2 Re: Problem with checking if another process is still running.
"The real problem is, how can Server.exe correctly find out if a PID is running? Both the toolhelp snapshot and OpenProcess() do not work right under these circumstances."

As a test, pull your process monitoring code out of Server.exe (actually, disable it rather than pulling it out).

Temporarily create a small project and put the process monitoring code in there.

I suspect that something is going wrong in the Server.exe app that makes it appear that the toolhelp APIs are failing for lack of memory. By moving the monitoring code out of the main app, you'll be able to find out if the monitoring code still functions. If it does, you'll know that something else is causing the issue.

Btw, I've used the toolhelp APIs quite a bit before in test automation harnesses where I snap the system process list every two seconds. Not once did I ever have a failure where the API ran out of memory.
Arjay at 2007-11-10 22:25:39 >