Jan, Walt, here's the view from the scheduler side today...
2005-08-01 09:49:00 [PID=28131] [normal ] OS version Microsoft Windows 2000 Professional Edition, Service Pack 4, (05.00.2195.00)
2005-08-01 09:49:01 [PID=28131] [debug ] Request [HOST#103595] Database [HOST#103595] Request [RPC#1412] Database [RPC#1411]
2005-08-01 09:49:01 [PID=28131] [normal ] Processing request from [USER#70690] [HOST#103595] [IP 193.3.142.123] [RPC#1412] core client version 4.45
2005-08-01 09:49:01 [PID=28131] [debug ] [HOST#103595] Resetting nresults_today
2005-08-01 09:49:01 [PID=28131] [normal ] [HOST#103595] [RESULT#6941996 w1_0662.0__0662.4_0.1_T11_S4hA_3] got result
2005-08-01 09:49:01 [PID=28131] [debug ] cpu 26717.562500 cpcs 0.002347, cc 62.712644
2005-08-01 09:49:01 [PID=28131] [debug ] [RESULT#6941996 w1_0662.0__0662.4_0.1_T11_S4hA_3]: client_state 3 exit_status -1073741819; setting outcome ERROR
2005-08-01 09:49:01 [PID=28131] [normal ] [HOST#103595] got request for 8640.000000 seconds of work; available disk 20.007354 GB
2005-08-01 09:49:01 [PID=28131] [debug ] [HOST#103595]: has file w1_0662.0
2005-08-01 09:49:01 [PID=28131] [debug ] send_old_work() no feasible result older than 168.0 hours
2005-08-01 09:49:01 [PID=28131] [debug ] in_send_results_for_file(w1_0662.0, 0) prev_result.id=6941996
2005-08-01 09:49:01 [PID=28131] [debug ] Sorted list of URLs follows [host timezone: UTC+3600]
2005-08-01 09:49:01 [PID=28131] [debug ] zone=+03600 url=http://einstein.aei.mpg.de
2005-08-01 09:49:01 [PID=28131] [debug ] zone=+00000 url=http://einstein.astro.gla.ac.uk
2005-08-01 09:49:01 [PID=28131] [debug ] zone=-18000 url=http://einstein.aset.psu.edu
2005-08-01 09:49:01 [PID=28131] [debug ] zone=-18000 url=http://morel.mit.edu
2005-08-01 09:49:01 [PID=28131] [debug ] zone=-21600 url=http://einstein.phys.uwm.edu
2005-08-01 09:49:01 [PID=28131] [debug ] [HOST#103595] Sending app_version einstein windows_intelx86 479
2005-08-01 09:49:01 [PID=28131] [debug ] [HOST#103595] Already has file w1_0662.0
2005-08-01 09:49:01 [PID=28131] [debug ] [HOST#103595] reducing disk needed for WU by 8350400 bytes (length of w1_0662.0)
2005-08-01 09:49:01 [PID=28131] [debug ] est cpu dur 29038.461538; running_frac 0.251879; rsf 1.000000; est 115287.346458
2005-08-01 09:49:01 [PID=28131] [normal ] [HOST#103595] Sending [RESULT#7065325 w1_0662.0__0662.3_0.1_T13_S4hA_2] (fills 115287.35 seconds)
2005-08-01 09:49:01 [PID=28131] [normal ] [HOST#103595] Sent 1 results [scheduler ran 1 seconds]
2005-08-01 11:06:29 [PID=11044] [debug ] REQUEST_METHOD=POST CONTENT_TYPE=application/octet-stream HTTP_ACCEPT= HTTP_USER_AGENT=BOINC client (windows_intelx86 4.45)
2005-08-01 11:06:29 [PID=11044] [debug ] CONTENT_LENGTH=3802 from 193.3.142.123
2005-08-01 11:06:29 [PID=11044] [normal ] Handling request: IP 193.3.142.123, auth host 103595, platform windows_intelx86, version 4.45
2005-08-01 11:06:29 [PID=11044] [normal ] OS version Microsoft Windows 2000 Professional Edition, Service Pack 4, (05.00.2195.00)
2005-08-01 11:06:29 [PID=11044] [debug ] Request [HOST#103595] Database [HOST#103595] Request [RPC#1413] Database [RPC#1412]
2005-08-01 11:06:29 [PID=11044] [normal ] Processing request from [USER#70690] [HOST#103595] [IP 193.3.142.123] [RPC#1413] core client version 4.45
2005-08-01 11:06:29 [PID=11044] [normal ] [HOST#103595] [RESULT#7065325 w1_0662.0__0662.3_0.1_T13_S4hA_2] got result
2005-08-01 11:06:29 [PID=11044] [debug ] cpu 3917.656300 cpcs 0.002347, cc 9.195696
2005-08-01 11:06:29 [PID=11044] [debug ] [RESULT#7065325 w1_0662.0__0662.3_0.1_T13_S4hA_2]: client_state 3 exit_status -1073741819; setting outcome ERROR
2005-08-01 11:06:29 [PID=11044] [normal ] [HOST#103595] got request for 8640.000000 seconds of work; available disk 20.007354 GB
2005-08-01 11:06:29 [PID=11044] [debug ] [HOST#103595]: has file w1_0662.0
2005-08-01 11:06:29 [PID=11044] [debug ] send_old_work() no feasible result older than 168.0 hours
2005-08-01 11:06:29 [PID=11044] [debug ] in_send_results_for_file(w1_0662.0, 0) prev_result.id=7065325
2005-08-01 11:06:29 [PID=11044] [debug ] Sorted list of URLs follows [host timezone: UTC+3600]
2005-08-01 11:06:29 [PID=11044] [debug ] zone=+03600 url=http://einstein.aei.mpg.de
2005-08-01 11:06:29 [PID=11044] [debug ] zone=+00000 url=http://einstein.astro.gla.ac.uk
2005-08-01 11:06:29 [PID=11044] [debug ] zone=-18000 url=http://einstein.aset.psu.edu
2005-08-01 11:06:29 [PID=11044] [debug ] zone=-18000 url=http://morel.mit.edu
2005-08-01 11:06:29 [PID=11044] [debug ] zone=-21600 url=http://einstein.phys.uwm.edu
2005-08-01 11:06:29 [PID=11044] [debug ] [HOST#103595] Sending app_version einstein windows_intelx86 479
2005-08-01 11:06:29 [PID=11044] [debug ] [HOST#103595] Already has file w1_0662.0
2005-08-01 11:06:29 [PID=11044] [debug ] [HOST#103595] reducing disk needed for WU by 8350400 bytes (length of w1_0662.0)
2005-08-01 11:06:29 [PID=11044] [debug ] est cpu dur 29038.461538; running_frac 0.253218; rsf 1.000000; est 114677.714611
2005-08-01 11:06:29 [PID=11044] [normal ] [HOST#103595] Sending [RESULT#7096890 w1_0662.0__0662.4_0.1_T13_S4hA_1] (fills 114677.71 seconds)
2005-08-01 11:06:29 [PID=11044] [normal ] [HOST#103595] Sent 1 results [scheduler ran 0 seconds]

Whats up with thi`s
)
No Walt, but I have stuff from an hour later. I think that the host may be ahead of UTC by one hour.
2005-07-27 11:18:48 [PID=27892] [normal ] OS version Microsoft Windows 2000 Professional Edition, Service Pack 4, (05.00.2195.00)
2005-07-27 11:18:48 [PID=27892] [debug ] Request [HOST#103595] Database [HOST#103595] Request [RPC#1408] Database [RPC#48]
2005-07-27 11:18:48 [PID=27892] [normal ] Processing request from [USER#70690] [HOST#103595] [IP 193.3.142.123] [RPC#1408] core client version 4.45
2005-07-27 11:18:48 [PID=27892] [debug ] [HOST#103595] Resetting nresults_today
2005-07-27 11:18:48 [PID=27892] [normal ] [HOST#103595] [RESULT#6673270 w1_0662.0__0662.1_0.1_T08_S4hA_2] got result
2005-07-27 11:18:48 [PID=27892] [debug ] cpu 19641.531300 cpcs 0.002449, cc 48.106584
2005-07-27 11:18:48 [PID=27892] [debug ] [RESULT#6673270 w1_0662.0__0662.1_0.1_T08_S4hA_2]: client_state 5 exit_status -164; setting outcome ERROR
2005-07-27 11:18:53 [PID=27911] [normal ] OS version Microsoft Windows 2000 Professional Edition, Service Pack 4, (05.00.2195.00)
2005-07-27 11:18:53 [PID=27911] [debug ] Request [HOST#103595] Database [HOST#103595] Request [RPC#1409] Database [RPC#1408]
2005-07-27 11:18:53 [PID=27911] [normal ] Processing request from [USER#70690] [HOST#103595] [IP 193.3.142.123] [RPC#1409] core client version 4.45
2005-07-27 11:18:53 [PID=27911] [normal ] Not sending work - last RPC too recent: 5.076002
2005-07-27 11:18:53 [PID=27911] [debug ] [HOST#103595] MSG( low) Not sending work - last RPC too recent: 5 sec
2005-07-27 11:18:53 [PID=27911] [normal ] sending delay request 60.600000
2005-07-27 11:19:55 [PID=28170] [normal ] OS version Microsoft Windows 2000 Professional Edition, Service Pack 4, (05.00.2195.00)
2005-07-27 11:19:55 [PID=28170] [debug ] Request [HOST#103595] Database [HOST#103595] Request [RPC#1410] Database [RPC#1409]
2005-07-27 11:19:55 [PID=28170] [normal ] Processing request from [USER#70690] [HOST#103595] [IP 193.3.142.123] [RPC#1410] core client version 4.45
2005-07-27 11:19:55 [PID=28170] [normal ] [HOST#103595] got request for 8640.000000 seconds of work; available disk 20.007354 GB
2005-07-27 11:19:55 [PID=28170] [debug ] [HOST#103595]: has file w1_0662.0
2005-07-27 11:19:55 [PID=28170] [debug ] send_old_work() no feasible result older than 168.0 hours
2005-07-27 11:19:55 [PID=28170] [debug ] in_send_results_for_file(w1_0662.0, 0) prev_result.id=6673270
2005-07-27 11:19:55 [PID=28170] [debug ] Sorted list of URLs follows [host timezone: UTC+3600]
2005-07-27 11:19:55 [PID=28170] [debug ] zone=+03600 url=http://einstein.aei.mpg.de
2005-07-27 11:19:55 [PID=28170] [debug ] zone=+00000 url=http://einstein.astro.gla.ac.uk
2005-07-27 11:19:55 [PID=28170] [debug ] zone=-18000 url=http://einstein.aset.psu.edu
2005-07-27 11:19:55 [PID=28170] [debug ] zone=-18000 url=http://morel.mit.edu
2005-07-27 11:19:55 [PID=28170] [debug ] zone=-21600 url=http://einstein.phys.uwm.edu
2005-07-27 11:19:55 [PID=28170] [debug ] [HOST#103595] Sending app_version einstein windows_intelx86 479
2005-07-27 11:19:55 [PID=28170] [debug ] [HOST#103595] Already has file w1_0662.0
2005-07-27 11:19:55 [PID=28170] [debug ] [HOST#103595] reducing disk needed for WU by 8350400 bytes (length of w1_0662.0)
2005-07-27 11:19:55 [PID=28170] [debug ] est cpu dur 28169.642857; running_frac 0.270034; rsf 1.000000; est 104318.874131
2005-07-27 11:19:55 [PID=28170] [normal ] [HOST#103595] Sending [RESULT#6839349 w1_0662.0__0662.2_0.1_T11_S4hA_2] (fills 104318.87 seconds)
2005-07-27 11:19:55 [PID=28170] [normal ] [HOST#103595] Sent 1 results [scheduler ran 0 seconds]
RE: The computer summary
)
Indeed this seems to be a very common problem which I simply don't understand. It's why I added the 'retransmission of lost work' functionality to the scheduler. Something that to me seems very odd is that many of these 'no scheduler responded' messages occur when the server is up and functioning. As far as I can tell from the apache logs and cgi logs, the scheduler gets the connection but then the host at the other end 'goes silent'. Finally after some time (up to a couple of minutes) the connection is dropped.
I suspect that ISPs may be doing very odd things with the http transactions. But I don't know enough to figure this out. The 'retransmission of lost work' is meant to work around (at least some of) these problems.
Bruce
RE: I wonder how many of
)
So if for example the ISP delivered only the first 4kB of an 8kB scheduler reply, the client would simply say 'no scheduler responded' rather than some kind of 'incomplete response' message? This is something that I could ask Rom or David to add and/or fix. With no feedback to users suggesting that something is wrong, there's not much chance of identifying the cause.
Bruce