Silent Client Errors -- lost over 100 credits -- Now what?

Anonymous
Topic 13817

Throttling enabled? Please try that (even if no throttling). We've been hunting this "Can't acquire lockfile" problem for quite a while now.

BM

Bernd Machenschalk
Joined: 15 Oct 04
Posts: 2,684
Credit: 25,950,161
RAC: 34,820

Silent Client Errors -- lost over 100 credits -- Now what?

Hi Mike!

Quote:
If it would be of help I would be willing to go back to my previous preferences to see if the original problems re-occur. If you or Nils would like me to do this, what settings should I use and what version should I install that would give us the maximum amount of useful information?

It would help us (and potentially other participants, possibly of all BOINC projects) to track down the problem if you could
- install a recent development BOINC Core Client (any client from 6.10.29 on should have the necessary features),
- add the cc_config.xml file described in the other thread (I think you need to restart the client to recognize it) and
- restore your original computing settings (70% CPU usage)

I'm monitoring both threads, so feel free to post in either one.

BM

Bernd Machenschalk
Joined: 15 Oct 04
Posts: 2,684
Credit: 25,950,161
RAC: 34,820

Thanks a lot!

I need to analyze this a little bit more to find the actual problem, but it's definitely helpful!

(it looks like the process id that I thought should be in the logs is missing, I need to find out why)

If you want, you can restore your previous compute settings (100% CPU overnight) to avoid the errors. The Client version and config file shouldn't do any harm. Please leave them in place; we may ask you for one or two more experiments like this again.

Thanks a lot for your help!

BM

Bernd Machenschalk
Joined: 15 Oct 04
Posts: 2,684
Credit: 25,950,161
RAC: 34,820

Thanks again!

For the techs: It looks like at some point there are multiple instances of the same application running in the same slot (i.e. writing to the same stderr file). One possible cause is that quitting an application (when BOINC suspends a task) may not work while the application has threads suspended (for throttling / CPU usage).
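
To illustrate why a second instance in the same slot would run into a "Can't acquire lockfile" error, here is a minimal sketch in Python. It is not BOINC's actual code: the `SlotLock` helper, the lockfile name, and the use of exclusive file creation are all assumptions made for illustration. It only shows the general idea that the first process to take the lock in a slot directory wins, and a later one fails until the first has really exited and released it.

```python
import os
import tempfile

# Illustrative lockfile name, assumed for this sketch only.
LOCKFILE_NAME = "boinc_lockfile"


class SlotLock:
    """Hypothetical helper: exclusive lock on a slot directory via a lockfile."""

    def __init__(self, slot_dir):
        self.path = os.path.join(slot_dir, LOCKFILE_NAME)
        self.fd = None

    def acquire(self):
        """Return True if we got the lock, False if another holder has it."""
        try:
            # O_EXCL makes creation fail if the lockfile already exists.
            self.fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            return False
        # Record the owner's process id so a stale lock can be traced.
        os.write(self.fd, str(os.getpid()).encode())
        return True

    def release(self):
        """Close and remove the lockfile if we hold it."""
        if self.fd is not None:
            os.close(self.fd)
            os.remove(self.path)
            self.fd = None


def demo():
    """Simulate two instances of one application using the same slot."""
    with tempfile.TemporaryDirectory() as slot_dir:
        first = SlotLock(slot_dir)
        second = SlotLock(slot_dir)
        got_first = first.acquire()           # first instance takes the slot
        got_second = second.acquire()         # second instance is refused
        first.release()                       # first instance actually quits
        got_after_release = second.acquire()  # now the slot is free
        second.release()
        return got_first, got_second, got_after_release
```

In this sketch, if the first instance never gets as far as `release()` (for example because it was asked to quit while its threads were suspended), any new instance started in the same slot keeps failing to acquire the lock.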

A fix for at least this possible issue was checked into BOINC last night. I need to build new Apps with it, and then we'll see if that actually fixes the problem. I'll do this ASAP, but there are a couple of higher-priority things on my table that will probably occupy the rest of this week.

BM
