GNU/Linux S5R3 App 4.20 available for Beta test

Bernd Machenschalk
Joined: 15 Oct 04
Posts: 2,684
Credit: 25,950,161
RAC: 34,820
Topic 13628

A new Linux App is available from our Beta Test page.

This App should fix the bug that caused a SEGV (signal 11) when the App couldn't write a checkpoint. Speed should be comparable to that of 4.16.

Happy testing!

BM

Bernd Machenschalk

This App fixes at least one cause of the "signal 11" error, is faster than 4.02, and has no really obvious bugs, so I made it official. If it really makes things worse than they are with 4.02, I'll deprecate it tomorrow.

BM

Bernd Machenschalk

1) Could it be that all the people observing "signal 11" are running BOINC (and the Apps) as root? In other words, is anyone seeing this who is running it under a different account?

2) If you're not using the "official" BOINC Client, the availability and location of tools, init scripts, BOINC data, project directories etc. depend on your Linux distribution. You can download the BOINC Core Client package from http://boinc.berkeley.edu/download.php, install it in your HOME directory and run BOINC under your own account. If you have a system-wide BOINC installation, you should stop or disable it first (usually something like "/etc/init.d/boinc stop" as root should do this).

In general, running BOINC as root is not a good idea, even though some Linux distributions do this by default.
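If you are not sure which account your client runs under, here is a minimal sketch of one way to check on Linux. It reads the `Uid:` line of `/proc/<pid>/status`; the function names are my own, and you have to supply the PID of the BOINC Client process yourself, since its name differs between installations.

```python
import os


def effective_uid_of(pid):
    """Read the effective UID of a process from /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith("Uid:"):
                # Fields after the label: real, effective, saved-set, filesystem UID
                return int(line.split()[2])
    raise RuntimeError(f"no Uid line found for pid {pid}")


def runs_as_root(pid):
    """True if the process with this PID has effective UID 0 (root)."""
    return effective_uid_of(pid) == 0


if __name__ == "__main__":
    # Demo on the current process; pass the BOINC Client's PID instead.
    print(runs_as_root(os.getpid()))
```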

BM

Bernd Machenschalk

Quote:
Interesting point, Bernd. To be honest, I kinda liked the idea of running BOINC as a daemon- simply because it's so convenient and you never forget to start it.


Running BOINC as a daemon is a good and surely more convenient way to run it, but it still doesn't need to run under the root account. You (or the install script of the distribution) could create a dedicated user and run BOINC under that account automatically, controlled by a runlevel script (which usually is run as root).

Root has privileges that BOINC doesn't need and that could damage your system more than an ordinary user process can. E.g. there is usually some amount of space on the system disk that is reserved for root, so that even when the disk looks full to normal users' processes, the system remains operational. A BOINC Client or App running as root can literally fill up your filesystem to the last bit, and in bad cases you won't even be able to boot it anymore.
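To see how much space is held back for root on a given filesystem, a small sketch along these lines may help. It uses `os.statvfs`, where `f_bfree` counts all free blocks and `f_bavail` counts only those available to unprivileged users; the function name and the default path are just illustrative choices.

```python
import os


def reserved_for_root(path="/"):
    """Return (free_for_root, free_for_users, reserved) in bytes
    for the filesystem that contains `path`."""
    st = os.statvfs(path)
    free_for_root = st.f_bfree * st.f_frsize    # all free blocks
    free_for_users = st.f_bavail * st.f_frsize  # free blocks for non-root users
    return free_for_root, free_for_users, free_for_root - free_for_users


if __name__ == "__main__":
    root_free, user_free, reserved = reserved_for_root("/")
    print(f"free for root:  {root_free} bytes")
    print(f"free for users: {user_free} bytes")
    print(f"reserved:       {reserved} bytes")
```

A process running as root can keep writing until the first number reaches zero, while a normal user process is stopped when the second one does.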

For the one cause of 'signal 11' that I found and fixed in this App, running as root was at least what prevented it from reporting a useful stack trace, and I haven't seen this failure at all when running under a normal user account.

Quote:
Of course, under those circumstances, I'm more than willing to do an "old style" installation. Just one question before I proceed: How can I be sure to keep my WUs and metadata? Will just copying my "projects" folder and chowning it do the trick? Thanks in advance.
Annika

I would be glad if you could at least try the old-fashioned approach until we have found and fixed the current 'signal 11' problem.

If you want to keep the work in progress, I'd suggest copying the whole BOINC directory from wherever it is on your machine into your home directory (depending on how you do this, run something like 'chown -R <user> BOINC' afterwards so that all files are writable by you). Temporary files and checkpoints are kept in the slots/ directories; references to them are in client_state.xml.
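As a rough sketch of that copy step, assuming you run it under your own account with the Client stopped: the helper names are hypothetical, and the source and destination paths are whatever applies on your machine. The second function only reports what you still cannot write to, so you know whether a chown is needed.

```python
import os
import shutil


def copy_boinc_dir(src, dst):
    """Copy a whole BOINC directory tree to dst (dst must not exist yet)."""
    shutil.copytree(src, dst, symlinks=True)
    return dst


def unwritable_entries(root):
    """List files and directories under root that the current user cannot write to."""
    blocked = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks: os.access would test their target instead.
            if not os.path.islink(path) and not os.access(path, os.W_OK):
                blocked.append(path)
    return blocked
```

If `unwritable_entries` returns an empty list for the copied directory, everything including slots/ and client_state.xml should be writable by you; otherwise fix the ownership of the listed paths as root.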

BM

Bernd Machenschalk

Quote:
it just seems to me that with the current 4.20 application, no one seems too concerned about the signal 11 issue at the moment.


This is not true. Actually it's the problem that causes the highest failure rate of all, and consequently it's at the very top of my list of things to fix.

BM

Bernd Machenschalk

It looks like the "signal 11" problem has finally been found and fixed in the new Beta Test App. Many many thanks to Bikeman and Kathryn, and everyone who helped with reports!

BM

Bernd Machenschalk

Quote:
Does that mean the 4.24 application ought not to exhibit that same destructive behaviour?


Due to a bug in BOINC on Linux, the "no heartbeat from core client" condition led to a segfault ("signal 11") and a Client Error for the task. Last week we found and fixed the bug in BOINC, and the fix went into 4.24. Instead of producing a client error, the App should now simply be restarted by the Core Client, issuing just a "no finished file" message.

BM

Bernd Machenschalk

Quote:
I don't think date or time changes have any influence to boinc or the science apps.


They do. The time when the next "heartbeat" from the Core Client is expected by the App is determined by the system time. For example, this leads to more frequent App restarts due to "no heartbeat" on systems that "stretch" local time to synchronize with an external time server. With Linux Apps prior to 4.24 this actually led to a segfault (signal 11).
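To illustrate the effect, here is a simplified sketch of a heartbeat timeout driven by a clock. This is not the actual BOINC code; the class name, the injectable `clock` parameter and any timeout value you pass are assumptions made for the illustration.

```python
import time


class HeartbeatWatch:
    """Track the time since the last heartbeat using an injectable clock."""

    def __init__(self, timeout, clock=time.time):
        self.timeout = timeout    # seconds without a heartbeat before giving up
        self.clock = clock        # time.time follows the system (wall-clock) time
        self.last_beat = clock()

    def beat(self):
        """Record that a heartbeat arrived just now."""
        self.last_beat = self.clock()

    def expired(self):
        """True if more than `timeout` seconds have passed on the chosen clock."""
        return self.clock() - self.last_beat > self.timeout
```

With the default `time.time`, a forward adjustment of the system time larger than the timeout makes `expired()` return True even though the heartbeats arrive on schedule. Passing `clock=time.monotonic` would show how a clock that is not set by system time updates avoids that particular jump.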

BM
