In grid environments, a cluster, which connects many PCs and workstations in a network, is a typical computational resource. In OmniRPC, we can treat a cluster as a remote host: we can run a remote executable module on each node in the cluster and execute them in parallel, so that we can achieve good performance.
When using a cluster, we can access at least one computer in the cluster from the client host. We call this computer the cluster server host, and we call the other computers in the cluster the cluster node hosts.
We assume the environment below. The last item in the list assumes that the client host and the cluster are in the same network.
In OmniRPC, the omrpc-agent is invoked first on the cluster server host, and this agent activates the remote executable modules on each cluster node host with the appropriate scheduler. We can use one of the schedulers described below.
OmniRPC's built-in round-robin scheduler is a simple scheduler implemented in the agent. It activates remote executable modules on the cluster node hosts in turn.
To use this scheduler, we create a nodes file, which specifies the cluster node hosts, in the registry directory ("$HOME/.omrpc-register") on the cluster server host. Below is the setting for this example.
hpc1
hpc2
hpc3
On the client host side, we create the following hostfile.
<?xml version="1.0" ?>
<OmniRpcConfig>
  <Host name="hpc-serv.hpcc.jp" arch="i386" os="linux">
    <JobScheduler type="rr" maxjob="4" />
  </Host>
</OmniRpcConfig>

Set the type attribute of the JobScheduler element to "rr" for the round-robin scheduler. The default value of this attribute is "fork", which simply creates the process on the same host. Our example applies to an SMP system: the number of cluster node hosts is 4, so you should set maxjob equal to 4.
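For contrast, a hostfile without a JobScheduler element falls back to the default "fork" scheduler. A sketch, reusing the host name from the example above, might look like this:

```xml
<?xml version="1.0" ?>
<OmniRpcConfig>
  <!-- No JobScheduler element: the default scheduler type is "fork",
       which creates the rex processes on hpc-serv itself rather than
       distributing them over the cluster node hosts. -->
  <Host name="hpc-serv.hpcc.jp" arch="i386" os="linux" />
</OmniRpcConfig>
```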
The relationship between the agent and rex with this option is as follows.
In the above example, the client host and the cluster hosts are in the same network, and the remote executable programs running on the cluster node hosts connect directly to the client host. Even in the case in which the cluster and the client host are in different networks, the programs on the cluster node hosts can communicate with the client host in this way.
However, as the number of node hosts increases, more and more clusters are built on a private-address network. In such a configuration, only the server host has a global IP address, while the node hosts have local IP addresses. OmniRPC requires each cluster node host to communicate with the client host, but in this configuration a cluster node host cannot communicate directly with a client host outside the cluster's network.
In this situation, there are two ways to use the cluster.
We show an example hostfile.xml which is based on the second way to use the cluster.
<?xml version="1.0" ?>
<OmniRpcConfig>
  <Host name="hpc-serv.hpcc.jp" arch="i386" os="linux">
    <Agent invoker="rsh" mxio="on" />
    <JobScheduler type="rr" maxjob="4" />
  </Host>
</OmniRpcConfig>

You should set the mxio attribute of the Agent element to "on". In this case, because we assume that the cluster server host and the client host are in the same network, we use "rsh" as the invoker. If you want to invoke the agent with SSH, set it to "ssh"; if you want to invoke it through the Globus gatekeeper, use "globus".
With this option, the relationship is shown in the figure below.
The agent relays the communication between the client and every rex executed on the remote node hosts.
You don't have to make any other preparations for this situation.
We now explain the case of using clusters from outside a firewall. When there is a firewall, you need at least SSH access to the cluster server host. If you cannot use arbitrary ports other than the SSH port (22), you can use the agent's multiplexed-communication function. We show the hostfile for this example.
<?xml version="1.0" ?>
<OmniRpcConfig>
  <Host name="hpc-serv.hpcc.jp" arch="i386" os="linux">
    <Agent invoker="ssh" mxio="on" />
    <JobScheduler type="rr" maxjob="4" />
  </Host>
</OmniRpcConfig>
Set the mxio attribute to "on" in the Agent element to use the multiplexed-communication function.
In environments which use Globus, there is usually no firewall, so no such preparation is needed. However, you still have to set the mxio attribute in the same manner if the cluster nodes have private IP addresses.
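For a Globus cluster whose node hosts have private IP addresses, a hostfile might combine the Globus invoker with multiplexed communication. This is a sketch, assuming the same host name as in the earlier examples and the invoker="globus" setting described above:

```xml
<?xml version="1.0" ?>
<OmniRpcConfig>
  <Host name="hpc-serv.hpcc.jp" arch="i386" os="linux">
    <!-- Invoke the agent through the Globus gatekeeper; mxio="on"
         relays node-host communication through the agent because the
         node hosts have only private IP addresses. -->
    <Agent invoker="globus" mxio="on" />
    <JobScheduler type="rr" maxjob="4" />
  </Host>
</OmniRpcConfig>
```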
The relationship between the agent and rex is shown by the figure below.
The communication between the client and the rex processes executed on the remote node hosts is relayed by the agent, and the communication between the agent and the client is relayed through the firewall by SSH port forwarding.