Newsgroup: Novell Forums > novell.support.cluster-services
jessbryant <jes
NewsGroup User
Strange Node Joining Problem
10/20/2008 7:46:02 PM


I've got a strange problem, and have seen similar posts here in the
forum, but not quite the same...so I thought I'd post to see what y'all
think.

Last Friday, I had to down one of the nodes in the cluster (2 node
cluster). The 2nd node (I'll call server2) picked up the master ip, as
was expected. When I brought the 1st node (server1) back up, it was no
longer able to join the cluster. It simply hangs on "Joining....".
When I look at the log file, it says "Join retry, some other node
acquired the cluster lock".

In trying to troubleshoot this, I brought down server2, and rebooted
both servers. I let server1 acquire the master role, and server2 joined
without any issues. I then had server1 leave the cluster, and server2
then became the master. When trying to re-join the cluster with
server1, it would again hang. So basically, if server1 is the master,
both servers can join. But if server2 is the master, server1 is unable
to join.

I have verified that there are no communication problems between the
two servers, and that they can both see the SBD partition. My
heartbeat settings are at the installation defaults. I have downed both
servers completely, and it always comes back to the same thing (above
paragraph).
I have also tried reinstalling NCS on both servers (joining an existing
node), and it always fails on the "Join Node" section.

I've seen references to some TIDs that are supposed to have some
solutions, but am unable to find the TIDs. I've also seen references to
re-creating the SBD partition, but can't find any good instructions on
how to do that without hosing everything. I've read references to
changing the panning ID, but also can't find instructions on how to do
that.

Can someone please give me some advice? If re-creating the SBD
partition is what I should try next, can you explain how to do it?
Thank you!


--
jessbryant
------------------------------------------------------------------------
jessbryant's Profile: http://forums.novell.com/member.php?userid=13263
View this thread: http://forums.novell.com/showthread.php?t=347994

ataubman <ataub
NewsGroup User
Re: Strange Node Joining Problem
10/20/2008 10:16:02 PM


You delete the SBD partition as you would any other partition (use
NSSMU) and then issue the command SBD INSTALL. TID 10082213 has the
steps; just ignore the mirrored bit, assuming your SBD is not
mirrored.

If that doesn't pan out (boomtish*) TID 3075104 tells you about the
panning ID.


--
Andrew C Taubman
Novell Support Forums Volunteer SysOp
http://forums.novell.com/
(Sorry, support is not provided via e-mail)

Opinions expressed above are not
necessarily those of Novell Inc.
------------------------------------------------------------------------
ataubman's Profile: http://forums.novell.com/member.php?userid=34

"Klaus Arpe" <a
NewsGroup User
Re: Strange Node Joining Problem
10/21/2008 2:32:38 PM

Hello,

Maybe the "panning clusterid" got out of sync. I had that problem too,
and I know there was a TID about it, but I can't find it anymore :-(

Klaus


jessbryant <jes
NewsGroup User
Re: Strange Node Joining Problem
10/21/2008 6:06:07 PM


Well, we've deleted and recreated the sbd partition. That didn't fix
the problem. We have narrowed it down to this:

If we start up the "broken" server, and then turn the firewall off from
a command line, the node immediately joins the cluster.

We've compared the firewall settings with the "good" server, and they
are identical. So I'm not sure what else to check. It doesn't make
sense that the firewall would prevent it from joining the cluster as a
slave. The firewall doesn't prevent it from joining the cluster as long
as it thinks it's the "master". Baffling.
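
One way to double-check that two rule sets really are identical is to export each server's rules to a text file and diff them mechanically rather than by eye. The following is only a minimal sketch in Python, assuming both rule listings are available as plain text; the function name `rule_diff` and the "good"/"broken" labels are made up for illustration and are not part of any firewall tool.

```python
import difflib


def rule_diff(good_text, broken_text):
    """Return unified-diff lines between two firewall rule listings."""

    def normalize(text):
        # Strip surrounding whitespace and drop blank lines so that purely
        # cosmetic differences are not reported as rule changes.
        return [line.strip() for line in text.splitlines() if line.strip()]

    return list(
        difflib.unified_diff(
            normalize(good_text),
            normalize(broken_text),
            fromfile="good",
            tofile="broken",
            lineterm="",
        )
    )


# Hypothetical usage, with file names chosen only for this example:
#   with open("good_rules.txt") as g, open("broken_rules.txt") as b:
#       for line in rule_diff(g.read(), b.read()):
#           print(line)
```

An empty result means the two listings match line for line after whitespace is normalized. It says nothing about rules that are loaded at runtime but missing from the exported text.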

Any ideas to add to this, given my latest findings?

Thanks!!!



jessbryant <jes
NewsGroup User
Re: Strange Node Joining Problem
10/21/2008 8:16:02 PM


Just want to add a little bit more information about this strange
behavior:

If the firewall is disabled completely, and then the server is started
up, the node will still hang upon "joining". The firewall has to be
turned off manually from a command line after the server is up and
running. Once this has happened, the node joins the cluster.

I then tried to start the firewall, and as soon as I did that, the
server rebooted itself.

What's up with that?



"Klaus Arpe" <a
NewsGroup User
Re: Strange Node Joining Problem
10/22/2008 8:03:56 AM

If it seems to be related to the firewall, I would try a packet scan on
both machines. You can use PKTSCAN.NLM to make a Wireshark-compatible
trace and view it with Wireshark. Compare the traces from the working
and the failing situation.

Klaus


ashwin pankaj <
NewsGroup User
Re: Strange Node Joining Problem
10/31/2008 5:06:01 AM


You could use tcpdump.
On the master node:
tcpdump -vv proto 224 and src <master ip>

On the other nodes:
tcpdump -vv proto 224 and dst <master ip>
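
If the captures are written to files (tcpdump's -w option does this), the comparison between the working and the failing case can be scripted. Below is a rough Python sketch, not a finished tool. It assumes classic libpcap files with Ethernet framing and no VLAN tags, and it takes IP protocol 224 from the filter above. The function names `ip_protocol_counts` and `missing_flows` are invented for this example.

```python
import struct
from collections import Counter


def ip_protocol_counts(pcap_bytes):
    """Count IPv4 packets per (protocol, src, dst) in a classic pcap capture."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written on a little-endian host
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written on a big-endian host
    else:
        raise ValueError("not a classic (microsecond) pcap file")

    # The link-layer type is the last field of the 24-byte global header.
    linktype = struct.unpack(endian + "I", pcap_bytes[20:24])[0]
    if linktype != 1:
        raise ValueError("this sketch only handles Ethernet captures")

    counts = Counter()
    offset = 24
    while offset + 16 <= len(pcap_bytes):
        # Per-packet record header: ts_sec, ts_usec, captured len, original len.
        _sec, _usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", pcap_bytes[offset:offset + 16]
        )
        frame = pcap_bytes[offset + 16:offset + 16 + incl_len]
        offset += 16 + incl_len

        # 14-byte Ethernet header; EtherType 0x0800 marks an IPv4 payload.
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue
        proto = frame[23]  # protocol field, byte 9 of the IPv4 header
        src = ".".join(str(b) for b in frame[26:30])
        dst = ".".join(str(b) for b in frame[30:34])
        counts[(proto, src, dst)] += 1
    return counts


def missing_flows(good, broken, proto=224):
    """Flows of one IP protocol present in `good` but absent from `broken`."""
    return sorted(k for k in good if k[0] == proto and k not in broken)


# Hypothetical usage, with file names chosen only for this example:
#   good = ip_protocol_counts(open("firewall_off.pcap", "rb").read())
#   broken = ip_protocol_counts(open("firewall_on.pcap", "rb").read())
#   print(missing_flows(good, broken))
```

Any flow that shows up with the firewall off but not with it on is a candidate for the traffic being blocked.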




--
ashwin_pankaj
------------------------------------------------------------------------
ashwin_pankaj's Profile: http://forums.novell.com/member.php?userid=10708

All Times Are GMT