Work Unit not finishing

Anonymous
Topic 12320

Sorry, a bug in our code is triggered by the data in this Workunit. It ends up in an endless loop at the final calculation, so the counter will stay at 100% until the result reaches the "maximum number of floating point operations" defined in the workunit; the client will then continue with the next one. So you don't have to do anything special, just wait...
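To get a feel for how long "just wait" might be, here is a rough sketch. It assumes the client turns the operation limit into a CPU time limit by dividing it by the host's benchmarked speed; the names `fpops_bound` and `host_flops` are made up for illustration and the numbers are not taken from any real workunit.

```python
def seconds_until_abort(fpops_bound: float, host_flops: float,
                        cpu_seconds_used: float = 0.0) -> float:
    """Estimate the CPU seconds left before the operation limit is reached.

    Assumption: limit in seconds = fpops_bound / host_flops.
    Both parameter names are hypothetical, not actual client fields.
    """
    if host_flops <= 0:
        raise ValueError("host_flops must be positive")
    limit_seconds = fpops_bound / host_flops
    # Never report a negative remaining time.
    return max(0.0, limit_seconds - cpu_seconds_used)


if __name__ == "__main__":
    # Illustrative numbers only: 1e15 operations on a 1 GFLOPS host.
    remaining = seconds_until_abort(1e15, 1e9, cpu_seconds_used=3600.0)
    print(f"about {remaining / 3600.0:.1f} CPU hours left")
```

With a high limit and a slow host the wait can easily run to many days, so the figure is only meant as an order of magnitude.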

We found and fixed the bug, but the new app we built is still going through some internal tests. Sorry for the inconvenience, this is an alpha test.

BM

Bernd Machenschalk
Joined: 15 Oct 04
Posts: 2,684
Credit: 25,950,161
RAC: 34,820

Work Unit not finishing

The maximum FLOPS are, I think, set quite high, so you'll have to wait quite a while and still won't get any credit as the result file isn't valid. So if you get bored, "update" the project to upload the Results that you have finished so far, then "reset" the project (and hope you get a different WU...).

BM

Bernd Machenschalk

Well, I was thinking about the poor users who use the stock 4.13 client. If, however, you have a newer one, you have better options there.

BM

Bernd Machenschalk

The new set of apps we just put on the server (4.69-4.71) should have this bug fixed. Please report if not.

BM

Bernd Machenschalk

Ouch! Thanks for the reports, keep them coming, we are working on it...

There's always one bug left...

BM

Bernd Machenschalk

Sorry folks, I know this is annoying. But we always stated that this is an alpha test with no guarantees of anything, and we are working on it.

BM

Bernd Machenschalk

Honza:

> Same here for result H1_0073.4__0073.8_0.1_T02_Test02_4
> using einstein version 4.72, Boinc 4.16.

Hm, it seems that this result finished correctly... did you terminate it, or was it just sitting at 100% for a long time?

BM

Bernd Machenschalk

Thanks.

Actually the current WUs crunch in three steps: two that do the analysis of the detector data (shown as 0-50% and 50.01-100%), and then a step that does a comparison of some sort. We thought the last step would not take enough time to give it a full % of the progress, but apparently there are cases in which it takes considerably longer than we expected.
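To illustrate why the counter can sit at 100% while work is still going on, here is a minimal sketch of such a three-step progress mapping. This is not the actual app code; the function name and the way the stages are numbered are made up for illustration.

```python
def displayed_progress(stage: int, fraction_done: float) -> float:
    """Map (stage, fraction of that stage done) to the percentage shown.

    Stages 1 and 2 (the detector data analysis) each get half of the bar.
    Stage 3 (the comparison) gets no share of its own, so the display
    stays at 100% for as long as that stage runs.
    """
    # Clamp the per-stage fraction to the range [0, 1].
    fraction_done = min(max(fraction_done, 0.0), 1.0)
    if stage == 1:
        return 50.0 * fraction_done
    if stage == 2:
        return 50.0 + 50.0 * fraction_done
    if stage == 3:
        return 100.0
    raise ValueError(f"unknown stage: {stage}")


if __name__ == "__main__":
    for stage, frac in [(1, 0.5), (2, 0.5), (3, 0.1), (3, 0.9)]:
        print(stage, frac, displayed_progress(stage, frac))
```

In this sketch the third stage reports 100% whether it is 10% or 90% done, which is the behavior described above.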

In addition to that, there was a bug in that comparison code that really did keep the Result from finishing properly; this, however, should be fixed in the apps >= 4.72.

I am currently taking a look at the parameter sets involved and will try to further track this.

For now, be aware that in rare cases the progress counter may stay at 100% for quite a while. And please continue to inform us of WUs causing that behavior.

BM

Bernd Machenschalk

If you see such a result and data transfer size is not much of an issue for you, it would help us if you could zip (or tar/gzip) your BOINC directory (or just the projects/einstein directory and the appropriate slots directory) and make it available to us (before aborting the result or resetting the project).
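For anyone who prefers a script over doing this by hand, here is a small sketch using Python's standard `tarfile` module. The function name, the `project_dir_name` default, and the slot number are assumptions for illustration; check what the project and slot directories are actually called on your machine before running it.

```python
import tarfile
from pathlib import Path


def archive_for_report(boinc_dir, slot, out_path, project_dir_name="einstein"):
    """Pack the project directory and one slot directory into a .tar.gz.

    Assumptions: the BOINC directory contains "projects/<project_dir_name>"
    and "slots/<slot>". Directories that do not exist are skipped.
    """
    boinc_dir = Path(boinc_dir)
    targets = [
        boinc_dir / "projects" / project_dir_name,
        boinc_dir / "slots" / str(slot),
    ]
    with tarfile.open(out_path, "w:gz") as tar:
        for target in targets:
            if target.exists():
                # Store paths relative to the BOINC directory.
                tar.add(target, arcname=str(target.relative_to(boinc_dir)))
    return Path(out_path)


if __name__ == "__main__":
    # Hypothetical paths; adjust to your own installation.
    archive_for_report("BOINC", 0, "einstein_report.tar.gz")
```

Write the archive somewhere outside the two directories being packed, and create it before aborting the result or resetting the project, since those actions may remove the files.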

BM

Bernd Machenschalk

THANKS TOBY!

A big HUG from all people in the e@h projects team!!

Edit: The Clicky2 gives a 404 - not found...
Edit#2: Not anymore - was a bit too fast?

BM

Bernd Machenschalk

In case this helps anyone:

The problem appears to be in the data, not in the code. The WU will eventually finish, but it may take quite some time (even more than we allowed for in the max CPU time value, and beyond the deadline), and maybe also more memory than we expected (possibly causing more problems). We'll try to avoid such WUs in the future.

BM
