Anonymous
26 Feb 2005 17:29:18 UTC
Topic 12736
When you click on the Result that was granted 0.00 credit you'll see that the "Validate state" is "invalid". Something's wrong with the output file that was produced.
granted credit zero
> but why Outcome=Success, i thought outcome would not be success in such situations
Outcome=success means that the program (the Application) ran without crashing. It says nothing about the result file itself; checking that is the job of the validator on the server side.
We get a lot of Results back that are more or less numerical garbage, possibly due to overclocking affecting the FPU, even though the program didn't actually crash.
I currently can't say what's wrong with your particular Result; if it was related to the problem in the validator, it should not happen again.
BM
> Isn't it strange that the einstein@home WUs return bad results from machines where seti@home WUs doesn't ...
I don't know the SETI code, neither the App's nor the validator's, but I can imagine at least two reasons for this:
1. SETI's validator isn't as picky as Einstein's. Thus they also get bad results back, but don't recognize them, at least not at the level of analysis the users get to see.
2. SETI's App might rely more on memory and integer operations, while Einstein is definitely FPU bound. The CPU chip gets hot at the spot where the most energy is consumed. When it gets too hot, it first corrupts the results of the unit located there. If an integer unit gives false results, this will soon end in a crash of the program or the OS, e.g. because of wrong memory address calculations. If it's the FPU that gets too hot, you will notice nothing while the program runs, until you take a close look at the results.
Carl Cristensen told me that CPDN gets all kinds of weird and obviously wrong results from overclocked machines. However, I don't know how the CPDN validator handles them.
BM
@wijata.com: Please post the Result IDs or names, or at least the ID of the machine you ran this on. I'll take a look at it.
BM
> > I guess it's time for developers to say something here. Our work (and
> CPU
> > power and energy power) is wasted this way.
> >
> developers please do something. this is an obvious bug.
We are looking into this: it appears that our validator may be setting the agreement threshold slightly too tight in some cases. This is hard to 'tune in advance' without access to the actual results. So please be patient: one of our developers is working on it now.
Bruce
> On some other thread i read, that core has different version on linux than
> windows (4.80 vs 4.79).
> Maybe they just do different computation? Maybe difference is too big?
For database reasons we currently keep a separate minor version number for each architecture. However, the Apps 4.78/79/80 are built from exactly the same (science) code.
A stunningly simple example of an architecture and rounding issue is (int)(10*0.3) (described here).
BM
> Actually, it seems they are aware of the problem and looking at it.
We are. It turned out that it's not just a matter of adjusting some parameters; we need to change the validator as a whole, which basically means rewriting it. We are working on this.
BM
There are people already working on it. I'd expect this to be ready about next week, but as usual everything that can go wrong will go wrong, so don't bet on it. Let's hope that it goes wrong before we put it on the public server.
BM