A new Linux App is available from our Beta Test page.
This App was built with a newer version of the BOINC library, which I hope will fix some of the segfault client errors (exit status 11).
In addition, it will stop trying to immediately sync the checkpoint file after five successive failures (which should help e.g. on XFS). And in contrast to 4.14, it will still keep the checkpoint if syncing failed.
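To illustrate the checkpoint behaviour described above, here is a minimal sketch in Python. It is not the App's actual code; the class name `CheckpointWriter`, the constant `MAX_SYNC_FAILURES` and the injectable `sync` parameter are made up for illustration. It only shows the idea: try to sync after each write, stop trying after five successive failures, and keep the checkpoint file even when the sync failed.

```python
import os

MAX_SYNC_FAILURES = 5  # from the post: give up after five successive failures


class CheckpointWriter:
    """Hypothetical helper: writes a checkpoint and tries to sync it to disk."""

    def __init__(self, path, sync=os.fsync):
        self.path = path
        self._sync = sync    # injectable so the failure path can be exercised
        self.failures = 0    # number of successive sync failures so far

    @property
    def sync_enabled(self):
        # Once the limit is reached, no further sync attempts are made.
        return self.failures < MAX_SYNC_FAILURES

    def write(self, data: bytes) -> bool:
        """Write the checkpoint; return True only if it was also synced."""
        with open(self.path, "wb") as f:
            f.write(data)
            f.flush()
            if not self.sync_enabled:
                return False
            try:
                self._sync(f.fileno())
            except OSError:
                self.failures += 1
                return False  # the checkpoint file is kept despite the failed sync
            self.failures = 0  # a success resets the "successive" counter
            return True
```

The counter is reset on every successful sync, so only an unbroken run of five failures disables syncing; whether the real App resets it the same way is an assumption.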
I intend to make this one the "official" Linux App soon; it should at least give us some more information about the computing errors we still get from the 4.02 App.
BM

GNU/Linux S5R3 App 4.16 available for Beta test
)
Thank you for the report!
I can see that this is bad for you, but maybe it will help us anyway.
Do you have the ddd debugger installed on the machine? If so, you could create a file "EAH_DEBUG_DDD" in the BOINC directory; the next time a task is started, it should fire up ddd attached to it. Hitting the "Cont" button will let the App run under the debugger. It should catch the signal and list where it occurred.
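Creating the marker file can be sketched like this. The function name `create_debug_marker` is hypothetical, the location of the BOINC directory depends on your installation and has to be passed in, and an empty file is assumed to be sufficient (the post only asks for the file to exist).

```python
from pathlib import Path


def create_debug_marker(boinc_dir):
    """Create an empty EAH_DEBUG_DDD file in the given BOINC directory.

    boinc_dir is whatever directory your BOINC client runs from; this
    sketch does not try to guess it.
    """
    marker = Path(boinc_dir) / "EAH_DEBUG_DDD"
    marker.touch(exist_ok=True)  # no error if the file is already there
    return marker
```

Deleting the file again should restore the normal behaviour for tasks started afterwards.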
Thanks a lot!
I'll try to reproduce this on the same system.
BM
RE: I'll try to reproduce
)
Yep, seen it:
update_app_progress (cpu_t=1.008062, cp_cpu_t=0) at boinc_api.C:265
Apparently this BOINC library version makes things rather worse than better.
Thanks a lot again for the report!
All others: be careful when using this Beta App. Would be nice to get some reports of systems where it works. I hope to issue a new, fixed version soon.
BM
Interesting. I'm pretty sure it hasn't anything to do with the BOINC Core client version. Is this App actually running on any system at all?
BM
We got a fix for BOINC from David Anderson. I'm currently rebuilding and updating the App. The 4.16 Beta Test package has been removed until it has been updated.
BM
Ok, I updated the 4.16 App with the new BOINC library (as of today).
Please download the new package with the old name and replace the files
(new md5 is 4a13337ab423e80cabacc1e14fdf1866, old was dc0867738e712a71ca1ec458c0eec185).
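If you want to check which package you actually have, a sketch along these lines compares the file's MD5 sum with the two values above. The function names `md5_of_file` and `classify_package` are made up for illustration, and the package file name is deliberately left as a parameter.

```python
import hashlib

NEW_MD5 = "4a13337ab423e80cabacc1e14fdf1866"  # updated package, from the post
OLD_MD5 = "dc0867738e712a71ca1ec458c0eec185"  # removed package, from the post


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify_package(path):
    """Return "new", "old" or "unknown" depending on the file's checksum."""
    checksum = md5_of_file(path)
    if checksum == NEW_MD5:
        return "new"
    if checksum == OLD_MD5:
        return "old"
    return "unknown"
```

A result of "old" means the download is still the first 4.16 package and should be replaced; "unknown" suggests a damaged or different file.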
BM
RE: The other issue I've
)
What system (glibc, Kernel) is this?
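One way to collect this information is sketched below, using Python's standard `platform` module. The function name `system_report` is hypothetical; note that `platform.libc_ver()` may return empty strings on systems where it cannot detect glibc.

```python
import platform


def system_report():
    """Gather kernel and C library details for a bug report."""
    libc_name, libc_version = platform.libc_ver()
    return {
        "kernel": platform.release(),    # kernel release string
        "machine": platform.machine(),   # hardware architecture
        "libc": f"{libc_name} {libc_version}".strip(),
    }


if __name__ == "__main__":
    for key, value in system_report().items():
        print(f"{key}: {value}")
```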
I don't want to go into too much detail here, but recent changes to BOINC affected the handling of CPU time. The old way was violating standards and caused some trouble (e.g. the "hang" problem on MacOS), while the new one may not work correctly on ancient systems (that have a non-standard behavior of the pthread library).
I think that with the next generation of Apps & Clients (major version number 6) there is a way around this, but for now we had to make a decision between inconvenience on old and showstoppers on new systems.
BM
RE: So, if it doesn't
)
What's your setting of "Leave applications in memory while suspended" (general preferences, possibly own venue)? If the App is to be left in memory, the client will still load the App and suspend it shortly after. Maybe that's what's causing the problem.
BM
RE: Yes, that feature is
)
No, that's not what I meant. I don't think that this setting is causing the problem. It might be that the short time between starting and suspending the App causes trouble for either the App or the OS. For now it's just good to know and may help in further tracking down the problem.
BM
RE: RE: In addition it
)
Are you sure you don't have an EAH_NO_SYNC file left in the BOINC directory?
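A quick way to check for leftover marker files is sketched here. The function name `find_eah_markers` is hypothetical, and the `EAH_*` pattern is only an assumption based on the two file names mentioned in this thread (EAH_NO_SYNC and EAH_DEBUG_DDD); the BOINC directory has to be supplied by you.

```python
from pathlib import Path


def find_eah_markers(boinc_dir):
    """Return the sorted names of regular files matching EAH_* in boinc_dir."""
    return sorted(p.name for p in Path(boinc_dir).glob("EAH_*") if p.is_file())
```

If "EAH_NO_SYNC" shows up in the returned list, the file is still there.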
BM
Yes, just minutes ago.
The message "process got signal 11" is actually not from the App, but from the BOINC Core Client. Either the reason for the segfault is in the Core Client itself, or it is catching the signal meant for the App, in which case it at least prevents any further diagnosis output that might be helpful.
Please try an old Core Client. On my test machines, a 5.4.11 seems to work reliably. At least we should get a better idea of where the segfault comes from.
BM