Seen on the BURP forum here:
An unwritten rule in computer systems administration seems to be the
following: "When you are sitting right next to the system it will work
for years without an issue, but the moment you are away from it, it is
bound to crash, and there's nothing you can do about it".
What happened during week 7 (or more precisely from Monday the 11th of
Feb at 06:30 UTC to this Saturday morning) was that the server
experienced 4 different issues:
1) Monday morning (one day after my departure on a week-long trip to
Finland) the network DNS server was down for a few minutes, causing the
script which regularly checks for network connectivity to fail. The
script detected this failure and attempted to re-establish the network
connection automatically.
2) The script, however, failed to correctly set up the routing
information for outbound packets in the case where only the DNS server
has been down and not the physical network link. Any attempts to
contact the BURP server from the internet resulted in a timeout because
all outgoing packets were being dropped.
While I was in Finland there was simply not much I could do about the problem.
3) Because the network interfaces did not have the right setup, the
system had no internet connectivity, which in turn triggered the script
again, effectively running it in an infinite loop.
4) At some point during the week the BOINC feeder lost its connection
to the database (because no DNS information was available). As a
consequence, between Saturday morning and now, the project returned
"No work" when clients connected to it.
As of the timestamp of this post all of these issues should be fixed and work should once again be flowing steadily.
I'm very sorry about this kind of downtime, especially because the
timing was very unfortunate - allowing a relatively simple problem like
this to cause the entire server to be unavailable for a whole week.
The script has been corrected in order to hopefully prevent a similar thing from happening in the future.
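For anyone curious how a check script can avoid this trap, here is a rough sketch in Python. It is not the actual BURP script (which was not posted); it only illustrates the idea of telling a DNS outage apart from a dead link, and of capping the number of automatic resets so the script cannot loop forever. The probe address `192.0.2.1`, the hostname `example.org`, the state names and the action names are all placeholders I made up for the example.

```python
import socket
from typing import Callable


def can_reach_ip(ip: str = "192.0.2.1", port: int = 53, timeout: float = 3.0) -> bool:
    """True if a TCP connection to a literal IP succeeds (no DNS lookup involved).

    The default address is a documentation placeholder; a real script would
    use an address that is known to be reachable from the server.
    """
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def can_resolve(hostname: str = "example.org") -> bool:
    """True if the system resolver can translate a hostname into an address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def classify(link_ok: Callable[[], bool], dns_ok: Callable[[], bool]) -> str:
    """Check the link first, so a DNS-only outage is never mistaken for a dead link."""
    if not link_ok():
        return "link_down"
    if not dns_ok():
        return "dns_down"
    return "ok"


def next_action(state: str, resets_so_far: int, max_resets: int = 3) -> str:
    """Decide what to do; action names are hypothetical labels for this sketch."""
    if state == "ok":
        return "none"
    if state == "dns_down":
        # The link and routes are fine, so leave them untouched and retry later.
        return "wait_and_retry"
    # Link is down: reset the interface, but only a bounded number of times.
    if resets_so_far < max_resets:
        return "reset_interface"
    return "alert_operator"
```

A caller would run something like `next_action(classify(can_reach_ip, can_resolve), resets_so_far)` on each check. The point of the design is that a DNS-only failure never touches the interface or routing setup, and repeated link resets eventually stop and hand over to a human instead of retrying indefinitely.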
If you experience any further issues in relation to this downtime please post about it in this thread.
I'll post a quick translation right after this ... Please bear with me for 5 minutes ...