Archive | December, 2006

Cluster Primer

Posted on 31 December 2006 by Shawn

Hi all,

This post is meant to give everyone an idea of how the cluster works, how to access the nodes if necessary, and what kind of software makes the cluster a cluster (as soon as we get it working).

The nodes are arranged in their own Class C network space. The hosts file on each machine has the IP addresses of the other machines aliased, so users can access any individual node simply by typing its name. The names are:

godzilla #for the head node
godzilla1 #node 1
godzilla2 #node 2
godzilla3 #node 3
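
For anyone who wants to see what that aliasing amounts to, here is a minimal sketch that generates hosts-file lines for the four names. The subnet 192.168.1.0/24, the consecutive address assignment, and the name godzilla3 for node 3 are all assumptions made for illustration; the real addresses are not given in this post.

```python
import ipaddress

# Assumed values for illustration only: the post gives neither the subnet
# nor the per-node addresses, and "godzilla3" for node 3 is a guess.
SUBNET = "192.168.1.0/24"
NODE_NAMES = ["godzilla", "godzilla1", "godzilla2", "godzilla3"]


def hosts_entries(subnet, names, first_host=1):
    """Return hosts-file lines pairing consecutive addresses with names."""
    network = ipaddress.ip_network(subnet)
    base = network.network_address
    lines = []
    for offset, name in enumerate(names, start=first_host):
        address = base + offset
        # Refuse addresses outside the subnet or equal to its broadcast address.
        if address not in network or address == network.broadcast_address:
            raise ValueError(f"subnet {subnet} is too small for {name}")
        lines.append(f"{address}\t{name}")
    return lines


if __name__ == "__main__":
    # Print the lines; they would be appended to /etc/hosts on every node.
    print("\n".join(hosts_entries(SUBNET, NODE_NAMES)))
```

With lines like these in /etc/hosts on every machine, something like `ssh godzilla1` resolves without needing DNS.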

All of the cluster nodes have the same root account, and users' home directories will be mounted via NFS.

By aws4y


Clusterf*@k

Posted on 19 December 2006 by Shawn

So the nodes of the cluster are all running, that's the good news. They all have the serv_p4 MPI servers running, and that too would be good news. One problem…

I CAN’T GET ANY OF THE BLOODY EXAMPLE PROGRAMS TO RUN!!!!!!!!!111

They all give the same error:

rm_5138: p4_error: rm_start: net_conn_to_listener failed: 51539
p0_24034: p4_error: Child process exited while making connection to remote process on godzilla1: 0
p0_24034: (18.819009) net_send: could not write to fd=4, errno = 32
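
As a first step in reading that output, the errno on the last line can be decoded. The sketch below is not part of MPICH; `explain_errno` is a hypothetical helper that pulls the number out of a log line and looks it up with Python's standard `errno` and `os` modules. On Linux, 32 is EPIPE (broken pipe), which fits the earlier line saying the child process exited while the connection to godzilla1 was being made: the writer is sending to a socket whose other end is already gone. It does not say why the remote side went away.

```python
import errno
import os
import re

# Matches the "errno = 32" fragment seen in the p4 net_send message.
ERRNO_PATTERN = re.compile(r"errno\s*=\s*(\d+)")


def explain_errno(log_line):
    """Return a readable description of the errno in a log line, or None."""
    match = ERRNO_PATTERN.search(log_line)
    if match is None:
        return None
    code = int(match.group(1))
    # errno.errorcode maps numeric codes to symbolic names on this platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"errno {code} = {name} ({os.strerror(code)})"


if __name__ == "__main__":
    line = "p0_24034: (18.819009) net_send: could not write to fd=4, errno = 32"
    print(explain_errno(line))
```

The numeric values are platform specific, so the lookup should be run on the same kind of system that produced the log.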

So as it stands right now, I have to have the root directories and the portage tree mounted via NFS, since we're going to be using NFS trees. In the meantime, I am going to track down these errors, unless someone has some experience with MPI and can diagnose my problem right away. Right now I am using password authentication for SSH since I am root.
