Let's rokk! [Tudor Cret's blog]

October 6, 2008

CCS 2003 & MS-MPI (part2)

Filed under: Parallel programming — Tudor Cret @ 11:27 pm

Now that we have seen how to install and configure a Windows cluster (see part 1 for installing and configuring CCS), let’s find out more about MS-MPI and see how to run a parallel application.

Installing MS-MPI

MS-MPI is the Microsoft implementation of the MPI standard; it helps users create high-performance parallel applications for Microsoft Windows Compute Cluster Server 2003. Microsoft provides an SDK, available here, for creating parallel applications. The Microsoft Compute Cluster Pack SDK contains the executable binaries for Microsoft MPI as well as the headers and libraries for developing parallel applications. It also includes an API for integration with the Microsoft Job Scheduler.

Download the 32-bit or 64-bit version of the SDK from the link above and install it with the installation wizard, either on a cluster or on a single machine that simulates cluster behavior.

More about using the CCP SDK can be found on MSDN – Using CCP.

More about MS-MPI can be found here. There are also MPI implementations for the Microsoft .NET environment: MPI.NET and Pure MPI.NET.

Installing and enabling MPI Cluster Debugger

Visual Studio 2008 Professional Edition and Visual Studio 2008 Team System support remote debugging of applications, including parallel applications. The Visual Studio remote debugging process for a Message Passing Interface (MPI) application uses the following:

  • Msvsmon - the remote debugging monitor application of Visual Studio.
  • Smpd - the MPI daemon process. Starts mpishim.exe.
  • Mpishim - the application that connects to msvsmon.exe and that starts mpiexec.
  • Mpiexec - the MPI job launcher that starts the user’s application.

To use remote MPI debugging on a CCS cluster, the following requirements must be met:

  • MPI must be installed and configured on each node of the cluster.
  • The MPIShim.exe file must be installed on each node in the cluster, and in the same location on each node.
  • The Visual Studio Remote Debugging Monitor (msvsmon.exe) must be installed on each node in the cluster.
  • The Visual Studio host computer (the one from which you are debugging) must be set up with an account that has sufficient privileges to execute jobs on the cluster, and must be on a network segment and subnet that gives it access to the compute nodes of the cluster.

The Remote Debugging Monitor is specific to each processor architecture. It’s important that you install the x64 version. To install all the required remote debugging components, do the following at each compute node:

  • Insert the last disk of the Visual Studio 2005 installation set.
  • Navigate to the Remote Debugger\x64 folder using Windows Explorer.
  • Double-click rdbgsetup.exe to install the remote debugging components.

Debugging cluster applications

  • Install and configure Windows Compute Cluster Server 2003 as described in the previous section. All services will be configured on the public network, because each machine has only one network card. Assume the cluster has the following configuration:
    • Machine 1: HEADCLUSTER210 – the head node of the cluster, it is not a computational node.
    • Machine 2: CLUSTERNODE2101 – first computational node in the cluster.
    • Machine 3: CLUSTERNODE2102 – the second computational node in the cluster.
  • Check that the cluster is configured and running correctly, using the Compute Cluster Administrator interface.
  • Install the x64 version of the Microsoft Compute Cluster Pack SDK on each node, in the same location. Because CCS 2003 runs on the x64 architecture, all applications and processes that run on the cluster must be built for the x64 platform.
  • Check that MPI services are up and running on all nodes.
  • Install Visual Studio 2008 Professional Edition or Visual Studio 2008 Team Edition on the computational nodes, CLUSTERNODE2101 and CLUSTERNODE2102, in the same location on EACH node. Be sure to install the extensions necessary for building x64 applications.

Check x64 extensions

  • Check that the Visual Studio Remote Debugger components, including mpishim.exe, are installed on all nodes in the same location.
  • Modify the registry:

Cmd.exe has an issue with UNC paths. MPI debugging relies on these paths, so to be safe and make sure nothing breaks, make the following modification on each node of the cluster. Open the following registry key:

HKEY_CURRENT_USER\Software\Microsoft\Command Processor

Add a DWORD entry entitled “DisableUNCCheck” and set the value to 1:

Modify the registry

2. Running the application
  2.1. Configure a job with the Job Scheduler

If you want the cluster to do something for you, you need to use the job scheduler. Be sure that you are logged in on the workstations with a domain account that has administrative rights. All jobs will have to be submitted using this account. Debugging is no exception, as you need to create an empty job that will host the debugging application. To get started, open the Job Scheduler and choose File menu->Submit Job:

Create a job on the cluster

Name the job “Debug Job” and move over to the Processors tab. Select the number of processors you would like to use for this job and then check the box that says “Run Job until end of run time or until cancelled”. If you leave this box unchecked, the empty job will run and finish straight away. The job must keep running so that Visual Studio can attach the running processes to this specific job.

  Select job’s running time

Move to the Tasks tab and add msvsmon.exe to the task list, so that Visual Studio can communicate with the remote debugger when running the parallel application.

 Add msvsmon.exe to the tasks list

Move to the Advanced tab and select which nodes will be part of your debugging scheme. In our case we will use only the two nodes we have, but other computational nodes may be added.

 Allocate nodes for the job

Click Submit Job; the job should start running. Write down the ID of the job (in this case it is 16), as it will be needed later.

 Running created job

   2.2. Configure Visual Studio
  • Open Visual Studio 2008 and create a new application. See the previous section “Debugging parallel applications with Visual Studio 2008” for more detail about creating a new application.
  • As an example, take the parallel calculation of the value of PI:

#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
      int         NumIntervals      = 0;  //num intervals in the domain [0,1] of F(x)= 4 / (1 + x*x)
      double      IntervalWidth     = 0.0;      //width of intervals
      double      IntervalLength    = 0.0;      //running sum of F(x) at the interval mid points
      double      IntrvlMidPoint    = 0.0;      //x mid point of interval
      int         Interval          = 0;  //loop counter
      int         done              = 0;  //flag
      double      MyPI              = 0.0;      //storage for PI approximation results
      double      ReferencePI       = 3.141592653589793238462643; //ref value of PI for comparison
      double      PI                = 0.0;      //reduced result, meaningful on process 0 only
      char        processor_name[MPI_MAX_PROCESSOR_NAME];
      char        (*all_proc_names)[MPI_MAX_PROCESSOR_NAME];
      int         numprocs;
      int         MyID;
      int         namelen;
      int         proc = 0;

      //start MPI and find out who we are
      MPI_Init(&argc, &argv);
      MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
      MPI_Comm_rank(MPI_COMM_WORLD, &MyID);
      MPI_Get_processor_name(processor_name, &namelen);

      //numprocs must be known before the receive buffer is allocated
      all_proc_names = malloc(numprocs * MPI_MAX_PROCESSOR_NAME);
      if (all_proc_names == NULL)
            MPI_Abort(MPI_COMM_WORLD, 1);

      MPI_Gather(processor_name, MPI_MAX_PROCESSOR_NAME, MPI_CHAR, all_proc_names, MPI_MAX_PROCESSOR_NAME, MPI_CHAR, 0, MPI_COMM_WORLD);

      //only process 0 receives the gathered names
      if (MyID == 0){
            for (proc=0; proc < numprocs; ++proc)
                  printf("Process %d on %s\n", proc, all_proc_names[proc]);
      }

      while (!done) //loops until done != 0
      {
            IntervalLength = 0.0;
            if (MyID == 0){
                  printf("\nEnter the number of intervals: (0 quits) ");
                  fflush(stdout);
                  if (scanf("%d", &NumIntervals) != 1)
                        NumIntervals = 0; //treat unreadable input as a request to quit
                  //NumIntervals = 10;
            }
            MPI_Bcast(&NumIntervals, 1, MPI_INT, 0, MPI_COMM_WORLD);   /* send number of intervals to all procs */
            if (NumIntervals == 0)
                  done = 1;   //exit if number of intervals = 0
            else {
                  //approximate the value of PI
                  IntervalWidth   = 1.0 / (double) NumIntervals;
                  for (Interval = MyID + 1; Interval <= NumIntervals; Interval += numprocs){
                        IntrvlMidPoint = IntervalWidth * ((double)Interval - 0.5);
                        IntervalLength += (4.0 / (1.0 + IntrvlMidPoint*IntrvlMidPoint));
                  }
                  MyPI = IntervalWidth * IntervalLength;
                  MPI_Reduce(&MyPI, &PI, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

                  //report approximation
                  if (MyID == 0){
                        printf("PI is approximately %.16f, Error is %.16f\n",
                              PI, fabs(PI - ReferencePI));
                  }
            }
      }
      //printf("Hello world");

      free(all_proc_names);
      MPI_Finalize();
      return 0;
}
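The way the listing splits the work between processes can be checked without a cluster. The sketch below is a serial stand-in, not MS-MPI code: partial_pi and simulated_pi are hypothetical helper names, the loop over rank plays the role of the separate processes, and the running total plays the role of MPI_Reduce with MPI_SUM.

```c
#include <assert.h>

/* Partial midpoint-rule sum that one process would compute.
   Process `rank` handles intervals rank+1, rank+1+nprocs, ... (cyclic split),
   mirroring the for loop in the listing above. */
double partial_pi(int rank, int nprocs, int num_intervals)
{
    double width, sum = 0.0;
    int i;

    assert(rank >= 0 && nprocs > 0 && num_intervals > 0);
    width = 1.0 / (double)num_intervals;
    for (i = rank + 1; i <= num_intervals; i += nprocs) {
        double mid = width * ((double)i - 0.5);   /* mid point of interval i */
        sum += 4.0 / (1.0 + mid * mid);           /* F(x) = 4 / (1 + x*x) */
    }
    return width * sum;
}

/* Serial stand-in for the reduction: add up the part of every process. */
double simulated_pi(int nprocs, int num_intervals)
{
    double total = 0.0;
    int rank;

    for (rank = 0; rank < nprocs; ++rank)
        total += partial_pi(rank, nprocs, num_intervals);
    return total;
}
```

Calling simulated_pi(1, 1000) and simulated_pi(4, 1000) should give the same approximation up to floating-point rounding, because every interval is visited exactly once whatever the number of processes.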

  • Go to Project->Properties (or Alt + F7).
  • In the Configuration Properties tab, select Debugging.
  • Set the Debugger to launch property to MPI Cluster Debugger. The next screenshot shows the values of the debugger properties:

Project settings 

MPIRun Command:  mpiexec. This is required for MPI applications.
MPIRun Arguments:  The first argument, “-job 3.0”, specifies which job in the scheduler to use. In my case the ID was 3 when I created the job (use the ID you wrote down when you submitted your own job), and the 0 specifies the task, which every job has by default. Next comes “-np 2”, which specifies the number of processes to start for this job. Finally there is “-machinefile \\kim03a\bin\machines.txt”. The “-machinefile” argument specifies the UNC location of a text file that contains the names of the machines that will take part in this job, one name per line. The first line should name the machine from which the application is being debugged, so that standard console input/output is not redirected. So on CLUSTERNODE2101, machines.txt will look like this:

Machine file 

The machine names are duplicated because the physical machines on which the cluster is deployed are dual core, and the job that runs on the cluster needs one name for each core.
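Since the screenshot is not reproduced here, the layout of such a machine file can be sketched in code. This is only an illustration of the rule stated above (one name per line, each name repeated once per core, debugging machine first); write_machine_file is a hypothetical helper, and the host names are the two compute nodes of this example cluster.

```c
#include <assert.h>
#include <stdio.h>

/* Write a machine file: every host name is repeated once per core,
   one name per line. The host listed first in hosts[] ends up on the
   first line, which should be the machine used for debugging.
   Returns the number of lines written, or -1 on a write error. */
int write_machine_file(FILE *out, const char *const hosts[], int num_hosts,
                       int cores_per_host)
{
    int h, c, lines = 0;

    assert(out != NULL && num_hosts > 0 && cores_per_host > 0);
    for (h = 0; h < num_hosts; ++h) {
        for (c = 0; c < cores_per_host; ++c) {
            if (fprintf(out, "%s\n", hosts[h]) < 0)
                return -1;
            ++lines;
        }
    }
    return lines;
}
```

With hosts set to CLUSTERNODE2101 followed by CLUSTERNODE2102 and cores_per_host set to 2, the function writes four lines: CLUSTERNODE2101 twice, then CLUSTERNODE2102 twice.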

Application Command: This is the UNC path to the MPI application that you would like to debug. This application HAS to be compiled as 64-bit, and the debugging symbols should be in the same directory as well.
MPIShim Location: The path to the x64 mpishim.exe binary. Mpishim must exist on every machine at the specified local path.
MPI network security mode: Accept connections from any address to avoid problems.
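To make the shape of the MPIRun Arguments value concrete, here is a small sketch that assembles it from its parts. format_mpirun_args is a hypothetical helper, not part of Visual Studio or MS-MPI; it only concatenates the three arguments named in the text (-job, -np, -machinefile) and, as an extra assumption, rejects a machine file path that does not start with the two backslashes of a UNC path.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format the argument string as:
   -job <job id>.<task id> -np <process count> -machinefile <UNC path>
   Returns 0 on success, -1 if the path is not a UNC path or the
   buffer is too small. */
int format_mpirun_args(char *buf, size_t size, int job_id, int task_id,
                       int num_procs, const char *machinefile_unc)
{
    int n;

    assert(buf != NULL && machinefile_unc != NULL && num_procs > 0);
    if (strncmp(machinefile_unc, "\\\\", 2) != 0)
        return -1;                       /* not of the form \\server\share\... */
    n = snprintf(buf, size, "-job %d.%d -np %d -machinefile %s",
                 job_id, task_id, num_procs, machinefile_unc);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

For job 16, task 0, 4 processes and a machine file assumed to live at \\headcluster210\PDC\machines.txt, the buffer would hold: -job 16.0 -np 4 -machinefile \\headcluster210\PDC\machines.txt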

  • To build C code rather than C++ code, tell the compiler to compile the source as C. Go to the C/C++ -> Advanced tab and set the Compile As property to Compile as C or Compile as C++, depending on the code you have written. The examples presented on the MPI API site need Compile as C.

 Set the compiler for C code

  • Select the Linker->General tab and set Additional Library Directories to “C:\Program Files\Microsoft Compute Cluster Pack\Lib\amd64”. The path depends on where the CCP SDK was installed on the local machine.

Add additional library directories

  • Go to Linker->Input tab and add msmpi.lib to Additional Dependencies.

Add additional msmpi.lib

  • Go to C/C++->General tab and set the value for Additional Include Directories to “C:\Program Files\Microsoft Compute Cluster Pack\Include”.

 Include additional directories

  • Open Configuration Manager to select the x64 platform: choose New and select the x64 platform.

 Modify application platform

  • Set the Post Build Action. The executable should be copied to the shared location on the cluster so that it is accessible to all nodes. In this case the shared location is headcluster210\PDC

Set post build action

  • Go to the Tools->Options->Debugging tab and uncheck Break all processes when one process breaks.
  • Build the application and resolve any errors.
  • Start debugging the application. Debug->Windows->Processes shows the processes that are currently running.
  • The result looks like this:

 Running parallel application 

The MPI application is running in parallel on the cluster. In this case 4 processes are running in parallel: two of them on one compute node (node 1) and the other two on the other compute node (node 2).

  • Copy the newly created application, together with its settings, to the second node.
  • Modify machines.txt (or create another file). Now it looks like this:

 Machine file 

  • Build the application and resolve any errors.
  • Start debugging the application. Debug->Windows->Processes shows the processes that are currently running.
  • The result looks like this:

 Running parallel application 

The MPI application is running in parallel on the cluster. In this case only 3 processes are running in parallel (-np 4 was changed to -np 3): two of them on one compute node (node 2) and the third on the other compute node (node 1).
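The process counts in the two runs can be reproduced with a very simple placement rule. The rule itself is an assumption made for this sketch (ranks are handed out to the lines of the machine file in order, wrapping around if there are more ranks than lines); it is not taken from the MS-MPI documentation, and the function names are hypothetical.

```c
#include <assert.h>

/* Index of the machine-file line that would host a given rank, ASSUMING
   ranks are assigned to the lines in file order and wrap around when
   there are more ranks than lines. */
int machine_line_for_rank(int rank, int num_lines)
{
    assert(rank >= 0 && num_lines > 0);
    return rank % num_lines;
}

/* Count how many of the first num_procs ranks land on lines first..last
   (0-based, inclusive), i.e. on the node that owns those lines. */
int ranks_on_lines(int num_procs, int num_lines, int first, int last)
{
    int rank, count = 0;

    for (rank = 0; rank < num_procs; ++rank) {
        int line = machine_line_for_rank(rank, num_lines);
        if (line >= first && line <= last)
            ++count;
    }
    return count;
}
```

For a four-line machine file whose first two lines name node 2 and whose last two lines name node 1, ranks_on_lines(3, 4, 0, 1) counts the processes on node 2 and ranks_on_lines(3, 4, 2, 3) counts those on node 1 for the -np 3 run.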

Some common problems when debugging MPI application on a cluster and their solutions can be found here.

