GNU/Linux S5R3 App 4.49 available for Beta test

Bernd Machenschalk
Joined: 15 Oct 04
Posts: 2,684
Credit: 25,950,161
RAC: 34,820
Topic 13670

A new Linux App is available from our Beta Test page.

This is a "switching" App as you already know from the 4.38 App.

If something goes wrong with the automatic switching (e.g. you get "signal 4" / "illegal instruction" errors), placing a file named "CPU_TYPE_0" in the BOINC directory should force the generic (non-SSE) App to run, even if the wrapper detects SSE.
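
As a rough illustration of the switching logic described above: this is not the actual wrapper code. The function names, the variant labels and the use of /proc/cpuinfo-style flags are assumptions made for this sketch only.

```python
import os
import tempfile

def parse_cpu_flags(cpuinfo_text):
    """Extract the CPU flag set from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()

def choose_variant(cpu_flags, boinc_dir):
    """Return "generic" or "sse" (hypothetical labels for this sketch)."""
    # Override from the post: a file named CPU_TYPE_0 in the BOINC
    # directory forces the generic (non-SSE) App.
    if os.path.exists(os.path.join(boinc_dir, "CPU_TYPE_0")):
        return "generic"
    # Otherwise use the SSE App only if the CPU reports the "sse" flag.
    return "sse" if "sse" in cpu_flags else "generic"

if __name__ == "__main__":
    sample = "model name\t: example CPU\nflags\t\t: fpu mmx sse 3dnow\n"
    flags = parse_cpu_flags(sample)
    with tempfile.TemporaryDirectory() as boinc_dir:
        print(choose_variant(flags, boinc_dir))   # no override file present
        open(os.path.join(boinc_dir, "CPU_TYPE_0"), "w").close()
        print(choose_variant(flags, boinc_dir))   # override file present
```

The point of the sketch is only that the override file is checked before any CPU detection, so it wins even on a machine that reports SSE.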

The SSE App was built with compiler settings that should use the SSE unit for most arithmetic (-mfpmath=sse). Overall there might be a difference in speed in one direction or the other, but mainly it should serve two purposes: first, we try to avoid the FPU exceptions by avoiding the FPU, and second, it should improve the prefetching of the Hough code (actually enable parts of it). This means that the App should run faster and further reduce the variance in run-times between workunits compared to previous versions.

In addition, the App was built with the BOINC API as of May 7, which means that it should work properly with the latest development clients.

The app_info.xml has entries for

420
421
424
427
431
435
438
449

If your current App version is not listed here, you'll have to add it manually.
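
To check whether your current App version is already covered, something like the following sketch could be used. The sample XML only imitates the general layout of a BOINC app_info.xml (app_version elements carrying a version_num); the app name and the exact contents of the file in the Beta package are assumptions and are not reproduced here.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only; a real app_info.xml has more elements.
SAMPLE = """<app_info>
  <app_version>
    <app_name>einstein_S5R3</app_name>
    <version_num>438</version_num>
  </app_version>
  <app_version>
    <app_name>einstein_S5R3</app_name>
    <version_num>449</version_num>
  </app_version>
</app_info>"""

def listed_versions(xml_text):
    """Return the set of version numbers found in <version_num> elements."""
    root = ET.fromstring(xml_text)
    return {int(v.text) for v in root.iter("version_num")}

def is_listed(xml_text, version):
    """True if the given version (e.g. 435 for App 4.35) has an entry."""
    return version in listed_versions(xml_text)

if __name__ == "__main__":
    print(sorted(listed_versions(SAMPLE)))
    print(is_listed(SAMPLE, 442))
```

If the check comes back False for your version, that is the case where an entry has to be added by hand.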

BM

Bernd Machenschalk

Quote:

My AMD X2 5000 switched fine, but my AMD XP 3000 wrecked all WUs.

From stderr.txt:

393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, Detected CPU type 1
SIGILL: illegal instruction
Stack trace (5 frames):
../../projects/einstein.phys.uwm.edu/einstein_S5R3_4.49_i686-pc-linux-gnu_1[0x81946a1]
[0xffffe420]
../../projects/einstein.phys.uwm.edu/einstein_S5R3_4.49_i686-pc-linux-gnu_1[0x806b368]
/lib/libc.so.6(__libc_start_main+0xe0)[0xb7d8dfe0]
../../projects/einstein.phys.uwm.edu/einstein_S5R3_4.49_i686-pc-linux-gnu_1(shmat+0x55)[0x804b5c1]

Exiting...

cu,
Michael

Too bad. Are there SSE2 instructions in the executable?

Does the CPU_TYPE_0 workaround work for the moment?

BM

Bernd Machenschalk

It indeed did contain SSE2 instructions. I built a new SSE App and updated the archive (and the md5sum on the page).
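
Since the archive was replaced, it may be worth re-checking a download against the md5sum given on the page. A minimal sketch of such a check; the helper names are made up for this example, and the file name and checksum to pass in are whatever the Beta Test page currently shows (no real values are given here).

```python
import hashlib

def md5_of_file(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches(path, expected_hex):
    """Compare case-insensitively against the checksum from the page."""
    return md5_of_file(path) == expected_hex.strip().lower()

# Usage (placeholders, to be filled in from the Beta Test page):
#   matches("<downloaded archive>", "<md5sum from the page>")
```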

Thanks for the report.

BM

Bernd Machenschalk

Quote:
1. Is it possible that the first version (with some SSE2 instructions) might be faster than the new one?


Yes, it is possible, but I simply don't know.

That reminds me that I wanted to post some "reference workunit" for direct speed comparison. Will pick this up soon.

Quote:
2. Does the new beta app have any advantages for non SSE CPUs?


Not in speed. I still want to test the new BOINC API version that's in both Apps, though, in particular with the latest Core Clients.

BM

Bernd Machenschalk

Quote:
Quote:
Switched from 4.35 to 4.49 on my AMD Opteron 1210 cpu running SuSE Linux 10.3 and BOINC 5.10.45. Looks definitely faster but graphics is not working. It used to work in 4.35.
Tullio

Graphics did not work when I switched from 4.35 to 4.49 during a WU run. Now that I started one with 4.49 it works.


Yep. Switching App versions in the middle of a Task is not supported in BOINC. In the case of the "separate graphics" Apps this means that the "graphics_app" link in the slot directory is not updated and points to a file that doesn't exist anymore after installing a new App version. It is only set up anew when a new Task is started.
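
For anyone who wants to check for this situation, here is a small sketch. It assumes the "graphics_app" entry in the slot directory is an ordinary symbolic link, which may not match how BOINC actually implements its links; the function name is made up for this example.

```python
import os
import tempfile

def is_dangling_link(path):
    """True if path is a symlink whose target no longer exists."""
    # os.path.exists() follows symlinks, so it is False for a broken link.
    return os.path.islink(path) and not os.path.exists(path)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as slot_dir:
        old_app = os.path.join(slot_dir, "old_graphics_binary")
        link = os.path.join(slot_dir, "graphics_app")
        open(old_app, "w").close()
        os.symlink(old_app, link)
        print(is_dangling_link(link))   # target still present
        os.remove(old_app)              # simulates the old App being replaced
        print(is_dangling_link(link))   # link now points at nothing
```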

BM

Bernd Machenschalk

Quote:
Quote:
Quote:
1. Is it possible that the first version (with some SSE2 instructions) might be faster than the new one?

Yes, it is possible, but I simply don't know.

Hey Bernd,

How about you make that SSE2 version available as a "power user" app so that those of us who weren't quick enough off the mark can at least test it a little?

That'll save us having to bribe Michael or th3 who are the two who have so far admitted to having it :-).


For now I put a copy of the SSE2 App executable in http://einstein.phys.uwm.edu/power_apps/einstein_S5R3_4.49_1_i686-pc-linux-gnu.gz. This isn't a full-featured App package; you'll have to replace the file "einstein_S5R3_4.49_i686-pc-linux-gnu_1" in the 4.49 Beta Test App package with that (expanded) file.
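
The expand-and-replace step could look roughly like the sketch below. The helper name is made up for this example, and where the files live on your machine (the project directory) is an assumption you have to fill in yourself; only the two file names come from the post above.

```python
import gzip
import os
import shutil

def install_gz_executable(gz_path, dest_path):
    """Expand a .gz file to dest_path and mark the result executable."""
    with gzip.open(gz_path, "rb") as src, open(dest_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    # The expanded file must be executable for BOINC to start it.
    os.chmod(dest_path, 0o755)

# Usage, run from the directory holding the 4.49 Beta Test App files
# (stop BOINC first; the directory itself is not specified here):
#   install_gz_executable(
#       "einstein_S5R3_4.49_1_i686-pc-linux-gnu.gz",
#       "einstein_S5R3_4.49_i686-pc-linux-gnu_1",
#   )
```

Writing directly to the destination name also takes care of the renaming that a plain unpack would otherwise leave to you.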

BM

Bernd Machenschalk

Quote:
Quote:

Hi all!

I would be surprised to see a significant (if at all measurable) speedup for the initial app version that contains SSE2 instructions.

The app code consists of parts that are really important to performance, and those have now been converted to hand-optimized assembly code (SSE).

The rest of the code is in C but not that crucial for performance. Only in those parts of the code there will be a difference in the two app versions, mostly by scalar double precision code being compiled to x87 or SSE2 instructions, respectively. To make optimal use of SSE2, one would have to generate SSE2 versions of the handcoded sections.

So, I would not hold my breath wrt. the SSE2 app variant.

CU
Bikeman


In Bernd's initial posting I do not read anything about hand-coded SSE instructions, but about using a compiler switch.

I do not doubt what you are writing, but this app clearly is at least 10% faster, so there might be a chance that the SSE2 version is even a little faster.

Anyway, it will be fun to prove you are right. ;-)
Or in other words: Let's see if practice can prove theory. :-)

cu,
Michael


The overall speedup in the App compared to 4.38 mainly comes from prefetch compiler intrinsics placed in the Hough code that require these switches. Another bit of speedup arises from changes in the Assembler-coded "Kernel loop" (the "interleaving" of SSE and FPU commands we have in the 4.42 MacOS Intel App), but I think that this effect will be larger on modern Intel CPUs (Core2) than on AMDs.

My guess is that the SSE2 App will be slightly faster if you measure it against the SSE one, but you'll only notice the difference if you run the same workunit side by side, e.g. on a dual core machine. It shouldn't be worth another case distinction in a "switching App".

BM

Bernd Machenschalk

Quote:
Quote:
For now I put a copy of the SSE2 App executable in http://einstein.phys.uwm.edu/power_apps/einstein_S5R3_4.49_1_i686-pc-linux-gnu.gz.

Thanks for doing that.

One other point I'd like to mention. The instructions on the beta test page talk about adding the "five files" from the package ... Now I imagine the first version of the archive probably did contain five files but the current "fixed" version now contains seven files. Are these seven files all necessary?


Nope. I fixed the archive.

BM

Bernd Machenschalk

Message 3535 in response to message 3532

Quote:
For now I put a copy of the SSE2 App executable in http://einstein.phys.uwm.edu/power_apps/einstein_S5R3_4.49_1_i686-pc-linux-gnu.gz. This isn't a full-featured App package; you'll have to replace the file "einstein_S5R3_4.49_i686-pc-linux-gnu_1" in the 4.49 Beta Test App package with that (expanded) file.


Yes, depending on the unpacking procedure this usually requires renaming the file.

BM

Bernd Machenschalk

Quote:
Great stuff, Bernd.


Thanks, but most kudos should go to Bikeman and Akos for their great work!

BM

Bernd Machenschalk

Quote:

I just installed the beta app (not the SSE2 version yet) on my Fedora 7 -32 bit host.

The result that was in progress picked up fine and is crunching away.

The one that just downloaded and that is branded with 4.49 has an estimated time to completion of 40:26:20. It should take about 9.

DCF (as grabbed from client_state.xml) is 0.922172

Oh wise people, what did I do wrong?


Switching Apps in the middle of a task usually confuses the run-time estimation, but I don't precisely know why.

BM
