Too fast

Anonymous
Topic 13245

Quote:

I have a dual-core Athlon 64 computer that has been stopped by E@H for a 6-hour enforced delay because it completed its daily allotment of 32 units in less than 24 hours. It has been running some "albert" units in about 4000 sec, or 1.1 hours, for each unit on each core. That means in 24 hours it should complete 2*(24/1.1) = 43.6 units with both cores running.

I think there might be some problem with the allotment calculation at E@H that assigned 32 units max per day.

I can delete the installation & re-install if that is the only fix.

[This computer is named XBOX...]

ADDMP

Looking at the scheduler logs (available by following on-line links) I see that your computer has:

2006-01-21 23:40:03.0533 [PID=19068] [debug   ] CONTENT_LENGTH=4514 
2006-01-21 23:40:03.1788 [PID=19068] [normal  ] Handling request:   host 522041, platform i686-pc-linux-gnu, version 5.2.13, RSF 1.000000
2006-01-21 23:40:03.1788 [PID=19068] [normal  ] OS version Linux 2.6.13-15-smp
2006-01-21 23:40:03.1876 [PID=19068] [debug   ] Request [HOST#522041] Database [HOST#522041] Request [RPC#0] Database [RPC#0]
2006-01-21 23:40:03.1884 [PID=19068] [normal  ] Processing request  [HOST#522041]  [RPC#0] core client version 5.2.13
2006-01-21 23:40:03.5179 [PID=19068] [debug   ]   Result is on [HOST#522041]: r1_0148.5__190_S4R2a_2
2006-01-21 23:40:03.5180 [PID=19068] [debug   ]   Result is on [HOST#522041]: r1_0148.5__189_S4R2a_1
2006-01-21 23:40:03.5180 [PID=19068] [debug   ]   Result is on [HOST#522041]: r1_0148.5__188_S4R2a_1
2006-01-21 23:40:03.5190 [PID=19068] [normal  ]   [HOST#522041] got request for 1831.035555 seconds of work; available disk 16.098672 GB


So really the question in my mind is, why doesn't this machine have a LOT more results on it? And why isn't it requesting more than 1800 seconds of work?
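To put the size of that request in perspective, here is a minimal sketch that pulls the requested work out of the last log line above and compares it with the roughly 4000-second runtime per unit reported in the quoted post. The regular expression and the helper name `parse_work_request` are illustrative assumptions based only on the line format shown in this excerpt; they are not part of BOINC.

```python
import re

# Pattern for the "got request" line as it appears in the log excerpt above.
# This is an assumption based on that single line, not an official format.
REQUEST_RE = re.compile(
    r"got request for (?P<seconds>\d+(?:\.\d+)?) seconds of work; "
    r"available disk (?P<disk_gb>\d+(?:\.\d+)?) GB"
)


def parse_work_request(line):
    """Return (requested_seconds, disk_gb), or None if the line does not match."""
    match = REQUEST_RE.search(line)
    if match is None:
        return None
    return float(match.group("seconds")), float(match.group("disk_gb"))


line = (
    "2006-01-21 23:40:03.5190 [PID=19068] [normal  ]   [HOST#522041] "
    "got request for 1831.035555 seconds of work; available disk 16.098672 GB"
)

requested, disk_gb = parse_work_request(line)

# Runtime per unit as reported by the original poster (about 4000 s).
SECONDS_PER_UNIT = 4000
units_requested = requested / SECONDS_PER_UNIT

print(f"{requested:.0f} s requested = {units_requested:.2f} units of ~{SECONDS_PER_UNIT} s")
```

On these numbers the host is asking for less than half of one unit's worth of work per request, which is why the small request size stands out.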

PS: one of your Intel boxes is reporting results with zero CPU time. Consider updating BOINC to fix this problem.

Bruce Allen
Joined: 15 Oct 04
Posts: 958
Credit: 170,849,008
RAC: 0

Too fast

Quote:
But nevertheless, I think if you check its completed results, they were running in about 4000 seconds, & that is about 43 units a day, but it was restricted to receiving only 32 units a day.


I've bumped up the per-CPU quotas by another factor of two. Let's see if that fixes this problem.
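The arithmetic behind the complaint can be checked with a short sketch. It uses the figures from the quoted post (about 4000 seconds per unit, two cores, a limit of 32 units per day). The value 64 for the raised limit is only an assumption for illustration, obtained by doubling 32; the thread does not state the new number, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def units_per_day(seconds_per_unit, cores):
    """Units a host can finish in 24 hours if every core runs continuously."""
    return cores * SECONDS_PER_DAY / seconds_per_unit


def is_starved(seconds_per_unit, cores, daily_quota):
    """True if the host can finish more units per day than the quota allows."""
    return units_per_day(seconds_per_unit, cores) > daily_quota


# Figures from the quoted post: ~4000 s per unit, dual core.
throughput = units_per_day(4000, 2)

OLD_QUOTA = 32             # daily limit reported by the poster
ASSUMED_NEW_QUOTA = 2 * 32  # assumption: the old limit doubled

print(f"throughput: {throughput:.1f} units/day")
print("starved at old quota:", is_starved(4000, 2, OLD_QUOTA))
print("starved at assumed new quota:", is_starved(4000, 2, ASSUMED_NEW_QUOTA))
```

Using exactly 4000 seconds gives a slightly lower figure than the poster's 43.6, which came from rounding the runtime to 1.1 hours. Either way the host outruns a limit of 32 units per day.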

Bruce Allen

Hi Michael,

I've been watching the database rather closely (e.g., every four to six hours) for the past month, since I started distributing Albert jobs on December 24th. I have watched it adjust to a number of changes: the decreased length of typical jobs, the fact that we had a target of 3 rather than 4 results, changes in internal scheduler parameters that govern how old an unsent result can be before it gets sent off to the first possible host, and a number of other small changes.

The major effects of changing the maximum daily quota per CPU are that (1) machines that are misconfigured and error out work will wipe out a few more results before being shut down by the punishment algorithm and (2) very fast machines won't be starved for work. But currently only 4% of Albert results fail on the host machines, so the effect of (1) will be small.

I am not overly worried about the 'time to credit' issue. The average age in the E@H database of workunits WITHOUT a canonical result is 404302 seconds (4.7 days). The standard deviation is 4.3 days. This means that the average workunit is FINISHED (credit granted) in 4.7 days. All but about 25% are finished in 9 days. I think this is a reasonable length of time. If I see this change significantly due to the modification in the maximum daily quotas, then I'll revert it.
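As a quick check of the unit conversion, the sketch below turns the quoted average age from seconds into days and adds one standard deviation. The variable names are illustrative. That the 9-day figure matches the mean plus one standard deviation is an observation about the numbers, not a statement of how it was derived.

```python
SECONDS_PER_DAY = 86400

mean_age_seconds = 404302  # average age quoted in the post
std_days = 4.3             # standard deviation quoted in the post

mean_age_days = mean_age_seconds / SECONDS_PER_DAY
mean_plus_one_std = mean_age_days + std_days

print(round(mean_age_days, 1))      # average age in days
print(round(mean_plus_one_std, 1))  # mean plus one standard deviation, in days
```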

Note: the BOINCSTATS project graph showing credit per day for E@H during the past 60 days indicates that we are starting to do more work than we have historically accomplished. So from the scientific point of view the project is working better than it has in the past, not worse.

Cheers,
Bruce

Bruce Allen

Quote:

As a charter member of the "relative handful" (since December 29th), I want to thank the Einstein staff for their work with the MDQ. The practice I got in running dry, detaching and reattaching to avoid the dead box club will not be missed. Yes, tweakster is warm and happy. My house is warm and happy. Our mild winter (so far) here in the Midwest has meant that my air conditioning has come on briefly at times to maintain my 78 degree environment. My Einstein farm is literally heating my home. For those of you who want to tinker with the screen saver to deliver messages of doom, just remember, for the pros, it's the first thing we disable.

Regards-tweakster

Happy to see your RAC numbers climbing.
Also happy not to be getting your electric bill.

Bruce
