I rest my case

Anonymous
Topic 12667

The problem with a project like SETI@home is that much of the hardware is partly or fully donated rather than purchased. And it seems to be much easier to get a new server as a donation than a UPS.

The second problem is that there are UPSes in the server room, but because of the migration they have to manage two projects at the moment, and not all of the hardware fits in the server room. I am sure it is not possible, with the known small budget, to buy more UPS capacity just for the interim until everything is back to one project and one server room.

All in all, this is typical for such a migration scenario. I work in the IT business and have done a lot of these projects in the past. You always live with the risk that you cannot fully protect all systems all of the time until all of the hardware is in its final place.

One more problem, and the report sounds like this is what happened, is that it is not enough to have a UPS. You must configure a safe automatic shutdown of all applications and the operating system before the UPS runs out, but not for every small outage. With the different systems, some of them new and unfamiliar and having issues, it is difficult to get this working and then test it carefully.

I am sure the system administrators of the SETI@home project are not having a nice time at the moment. As others have posted in this thread, it is a very special situation to migrate a 24/7 project with 500,000 active users to new hardware and software on an extremely limited budget. I believe I am really good at my job, but I would not dare to say I could do it better.

Hopefully they will have a bit more luck in the future. :)

Doris and Jens
Joined: 30 Oct 04
Posts: 34
Credit: 366,238
RAC: 597



> The way I read this was that they had the UPSes, had the shutdown software,
> but the batteries just didn't last long enough for things to come down
> gracefully.

Yes, that's what I was talking about. You have to configure a safe shutdown (possibly synchronized between servers) that needs no user action. Then you have to test how long it takes to shut down all operations, test how long the UPS holds the power, decide how large the safety buffer should be, and from that decide how long to wait before starting the shutdown. If you start the shutdown too early, every small outage of a few minutes may stop the project for a much longer time. It takes time to shut down, and it may be difficult to start this combined network of integrated services up again automatically. That possibly means waiting for the operators to begin work in the morning.
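
To make the timing above concrete, here is a rough sketch of the calculation in Python. The class name, the field names and all the numbers are made up for illustration; they are not measurements from the SETI@home servers, and a real setup would take them from its own tests.

```python
from dataclasses import dataclass


@dataclass
class UpsPlan:
    """Hypothetical timing plan; every value comes from your own tests."""
    runtime_s: int   # how long the UPS held the load in the last test
    shutdown_s: int  # how long all servers needed to shut down cleanly
    buffer_s: int    # safety margin you decided to keep

    def wait_before_shutdown(self) -> int:
        """Seconds an outage may last before the shutdown has to start."""
        wait = self.runtime_s - self.shutdown_s - self.buffer_s
        if wait < 0:
            # The batteries cannot cover shutdown time plus safety buffer.
            raise ValueError("UPS runtime too short for a clean shutdown")
        return wait

    def must_shut_down(self, outage_s: int) -> bool:
        """True once the outage has lasted as long as we can afford to wait."""
        return outage_s >= self.wait_before_shutdown()


# Assumed example values: 15 min runtime, 7 min shutdown, 3 min buffer.
plan = UpsPlan(runtime_s=900, shutdown_s=420, buffer_s=180)
print(plan.wait_before_shutdown())  # 300
print(plan.must_shut_down(120))     # False: a 2 minute outage is ridden out
```

With these assumed values, an outage of up to five minutes is simply ridden out. If the measured runtime drops below shutdown time plus buffer, the calculation fails loudly instead of returning a wait time that can no longer be met.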

And when this is all done well, you have to check and test it again after every change. The batteries get older and do not last as long as before, one server gets a new hard disk, another gets more RAM and a new fan. The chance that your plan fails when it meets reality is not small. ;)

Many words, short meaning. ;) It is not just a matter of spending 100 or 1,000 dollars to protect a server system like the one currently running SETI@home. You have to be lucky too. ;)

Greetings from Bremen/Germany

Jens Seidler (TheBigJens)

http://www.boinc.de/

Doris and Jens

> I'm thinking that they could have easily done everything right, tested
> everything, and after everything was all set and everyone was happy, but that
> was a few months ago and the batteries simply quit holding a charge.....

It doesn't really matter whether they did everything the way we think is right. It is not possible to do everything right in such a migration. They are working hard and under difficult conditions. Personally, I am sure they did everything possible. We can all only hope that the situation becomes more stable very quickly once the migration is finished. That should reduce the risk in the future.

Greetings from Bremen/Germany

Jens Seidler (TheBigJens)

http://www.boinc.de/
