Strange behavior with the Netlab server

  • Curtis Sharon (Topic Author)

I was wondering whether anybody else has experienced some 'strange' behavior with the Netlab web server. On occasion, ours goes down. The only pattern so far was that it seemed to occur on the last Monday of each month. That held until this past weekend. Our campus had a scheduled power outage on Friday (lasting about 8 hours). We gracefully shut everything down on Thursday evening and brought everything back up on Saturday, apparently successfully. By Sunday afternoon, the web server was down. Normally we can recover by simply consoling in and restarting. This time we couldn't get past the initial Linux boot screen. After contacting NDG, we were instructed to perform a little surgery: remove the glue securing the connections to the HD, etc., and re-seat the connections. This proved to be successful.

My question is why? The NDG server is not subject to vibration and movement. Why should re-seating connections that were secured with glue fix this problem?

#9

Unfortunately, as I have not had this issue with any of my systems, any attempt at answering your question is speculation.

That being said, it might have been a factory issue, i.e. maybe the connection wasn't 100% seated when the unit was first assembled, and after a certain amount of time the internal disk movements wiggled it a bit further from its connection. This could have resulted in some I/O errors, which brought down the web server.

As this problem has been fixed, there is no need for further troubleshooting. However, I am curious what messages or errors showed up in your log files when the web interface went down. This knowledge may assist those who have a similar issue in the future.

#11

  • Curtis Sharon (Topic Author)

Dave,

Here is a section of the log file from when Netlab is down. It seems to repeat over and over until the system is restarted.

[2016-10-31 22:51:00 UTC] watchweb: webserver starting

[2016-10-31 22:51:00 UTC] mbusd: already running under pid 506
[2016-10-31 22:52:00 UTC] watchdog: mbusd process has failed
[2016-10-31 22:52:00 UTC] Unable to estabish NETLAB+ session at this time, Connection to netlabd API failed, Bad file descriptor
[2016-10-31 22:52:00 UTC] watchweb: could not get http://127.0.0.1 [0] (502 Bad Gateway)
[2016-10-31 22:52:00 UTC] Unable to estabish NETLAB+ session at this time, Connection to netlabd API failed, Bad file descriptor
[2016-10-31 22:52:00 UTC] watchweb: could not get http://127.0.0.1 [1] (502 Bad Gateway)
[2016-10-31 22:52:00 UTC] Unable to estabish NETLAB+ session at this time, Connection to netlabd API failed, Bad file descriptor
[2016-10-31 22:52:00 UTC] watchweb: could not get http://127.0.0.1 [2] (502 Bad Gateway)
[2016-10-31 22:52:00 UTC] watchweb: webserver not responding (URL 'http://127.0.0.1')

[2016-10-31 22:52:00 UTC] watchweb: webserver restarting

[2016-10-31 22:53:01 UTC] Unable to estabish NETLAB+ session at this time, Connection to netlabd API failed, Bad file descriptor
[2016-10-31 22:53:01 UTC] watchweb: could not get http://127.0.0.1 [0] (502 Bad Gateway)
[2016-10-31 22:53:01 UTC] Unable to estabish NETLAB+ session at this time, Connection to netlabd API failed, Bad file descriptor
[2016-10-31 22:53:01 UTC] watchweb: could not get http://127.0.0.1 [1] (502 Bad Gateway)
[2016-10-31 22:53:01 UTC] Unable to estabish NETLAB+ session at this time, Connection to netlabd API failed, Bad file descriptor
[2016-10-31 22:53:01 UTC] watchweb: could not get http://127.0.0.1 [2] (502 Bad Gateway)
[2016-10-31 22:53:01 UTC] watchweb: webserver not responding (URL 'http://127.0.0.1')
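For anyone comparing a similar log, here is a minimal Python sketch that counts messages per minute and per source, so the repeating cycle stands out, and checks whether a date falls on the last Monday of its month. The line format is assumed from the excerpt above. The names parse_line, summarize and is_last_monday are hypothetical helpers, not part of NETLAB+. Timestamps are UTC, so the local date may differ.

```python
import calendar
import re
from collections import Counter
from datetime import datetime

# Assumed line format, taken from the excerpt:
# [2016-10-31 22:52:00 UTC] watchdog: mbusd process has failed
LINE_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC\] (.*)$")

SAMPLE = [
    "[2016-10-31 22:52:00 UTC] watchdog: mbusd process has failed",
    "[2016-10-31 22:52:00 UTC] watchweb: could not get http://127.0.0.1 [0] (502 Bad Gateway)",
    "[2016-10-31 22:53:01 UTC] watchweb: could not get http://127.0.0.1 [0] (502 Bad Gateway)",
]


def parse_line(line):
    """Return (timestamp, source, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
    text = m.group(2)
    # Treat a leading "name: " as the source; lines without one get "unknown".
    head, sep, rest = text.partition(": ")
    if sep and re.fullmatch(r"\w+", head):
        return ts, head, rest
    return ts, "unknown", text


def is_last_monday(day):
    """True if `day` is the last Monday of its month."""
    days_in_month = calendar.monthrange(day.year, day.month)[1]
    return day.weekday() == calendar.MONDAY and day.day + 7 > days_in_month


def summarize(lines):
    """Count messages per (minute, source)."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip blank or unrecognized lines
        ts, source, _ = parsed
        counts[(ts.strftime("%Y-%m-%d %H:%M"), source)] += 1
    return counts


if __name__ == "__main__":
    for (minute, source), n in sorted(summarize(SAMPLE).items()):
        print(minute, source, n)
    print("last Monday:", is_last_monday(datetime(2016, 10, 31)))
```

Running summarize over the whole log file (one string per line) would show whether the same watchweb and watchdog messages recur every minute, and is_last_monday can be applied to the date of each outage to test the monthly pattern mentioned earlier in the thread.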




Curt

#12

Moderators: David Hovey, Shawn Monsen, Super User