"Stuck" unit

Anonymous
Topic 10655

Thanks for the report.

Actually, the new pt workunits crunch in three steps: two runs similar to the previous ft runs (0-50%, 50-100%), then a "comparison" run that we thought would be short enough not to need checkpointing or even a single percent of the done counter. Apparently this last step takes somewhat longer on your machine. I'll take a look at the WU and see whether the reason is in the data.
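To illustrate why the counter can sit at 100% while CPU time keeps accumulating, here is a minimal sketch of the progress mapping described above. It is not the actual science code: the function name `reported_progress` and the phase labels `run1`, `run2` and `comparison` are hypothetical, chosen only for this example.

```python
def reported_progress(phase: str, phase_fraction: float) -> float:
    """Map a phase and the fraction done within it to the overall done fraction.

    Hypothetical model: each of the two main runs owns half of the counter,
    and the final comparison step owns none of it.
    """
    # Clamp the within-phase fraction to the valid range [0, 1].
    f = min(max(phase_fraction, 0.0), 1.0)
    if phase == "run1":
        return 0.5 * f          # first run covers 0-50%
    if phase == "run2":
        return 0.5 + 0.5 * f    # second run covers 50-100%
    if phase == "comparison":
        return 1.0              # no share of the counter: stays at 100%
    raise ValueError(f"unknown phase: {phase!r}")


if __name__ == "__main__":
    for phase, frac in [("run1", 0.5), ("run2", 0.5), ("comparison", 0.1)]:
        print(phase, frac, reported_progress(phase, frac))
```

Under this model, however long the comparison step runs, the displayed value never moves past 100%, which would match a unit that looks finished but keeps counting CPU time.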

BM

Bruce Allen
Joined: 15 Oct 04
Posts: 958
Credit: 170,849,008
RAC: 0

"Stuck" unit

> Under version 4.65, unit pt11_I12_f59.998_b0.104_0 has been sitting on 100%
> completed for several hours now and not rolling over (still counting CPU time,
> nothing under "total time"). This is on machine 1525. When I restart BOINC
> it starts at 50%, quickly jumps to 100%, then repeats the process.
> Might want to see if this is a reproducible error. I'll keep a copy of the
> current file state, but I'm going to let it crunch on some other units to see
> if it's a corruption in my software.

I think we have now fixed this bug. The new science code was acting badly if it didn't find any source candidates in one of the data sets.

[Added Jan 8th] We're now testing a revised app version that hopefully will fix this problem. If our testing goes well it should be available in a day or so and will revive this stuck WU.

Bruce

Bernd Machenschalk
Joined: 15 Oct 04
Posts: 2,684
Credit: 25,950,161
RAC: 34,820

> Same problem here with unit pt12_I12_f59.998_b0.104_9, crunching with einstein
> 4.71 on machine 1538, starts at just under 50%, gets to 50%, then up to 100%,
> and sits there.

Ooops - I thought we had this fixed... thanks for the report.

BM

