[Nevis-linux] Linux cluster problems

William Seligman seligman at nevis.columbia.edu
Mon Jan 14 13:09:45 EST 2008


William Seligman wrote:

> - riverside, the Neutrino group server, is down with a severe hardware 
> problem; nothing appears on the screen when I turn it on.  This issue 
> has the highest priority.  If I have no insight in the next 15 minutes, 
> I'm going to restore the riverside:/home directory on some other box so 
> the Neutrino group can get their e-mail and do other work.  This box is 
> also the condor master server, so condor is down too.

I moved riverside's disks from that box to hermes.  After much pain, I managed 
to get hermes to boot properly and to "think" that it's riverside.  At this 
point, Neutrino users should be able to get their mail, log in, etc.

This is a quick fix, not a stable situation; hermes is nicknamed "the 
disk-eater."  However, unless someone disagrees, I'm inclined to leave things in 
their current functional state until after the NSF review.

-- 
Bill Seligman             | Phone: (914) 591-2823
Nevis Labs, Columbia Univ | mailto://seligman@nevis.columbia.edu
PO Box 137                | http://www.nevis.columbia.edu/~seligman/
Irvington NY 10533 USA    | XDI: http://public.xdi.org/=william.seligman