Environment: (@VM)
host: CentOS-6.0
    kernel: 2.6.32-71.el6.x86_64 #1 SMP
    MemTotal: 7891424 kB (/proc/meminfo)
    CPU (4 logical cores): Intel(R) Core(TM) i3-2130 CPU @ 3.40GHz
    HDD: 1TB*2 (LVM ==> 2TB)
guest: CentOS-6.0
    kernel: 2.6.32-71.el6.x86_64 #1 SMP
    MemTotal: 1G
    CPU *1
    HDD: 200GB
172.16.173.143    172.16.173.144
+------------+    +------------+
|mpich2      |    |mpich2      |
|            |    |            |
|------------|    |------------|
|cent143     |    |cent144     |
+-----+------+    +------+-----+            +----NAT(@VM)
      |                  |                  |
------+------------------+------------------+
               |
         +-----+------+
         |mpich2      |
         |openmp      |
         |------------|
         |cent142     |
         +------------+
         172.16.173.142
Purpose:
1. install an OpenMP environment on one server and test it
2. install an MPI (MPICH2) environment on all servers and test it
OPENMP:
[Setup]
[root@cent142 ~]# yum install libgomp.x86_64 gcc-gfortran-4.4.6-4.el6.x86_64 -y
[C/C++ Code Example]
[root@cent142 ~]# mkdir -p ~/src/openmp; cd ~/src/openmp
[root@cent142 openmp]# vi ttt.c
#include <stdio.h>
#include <omp.h>
int main()
{
    int x;
    x = 2;
    #pragma omp parallel num_threads(2) shared(x)
    {
        if (omp_get_thread_num() == 1)
            x = 5;
        else
        {
            /* Print 1: this read of x races with the write above */
            printf("1: Thread %d: x = %d\n", omp_get_thread_num(), x);
        }
        #pragma omp barrier
        if (omp_get_thread_num() == 0)
        {
            /* Print 2: after the barrier, x is 5 for every thread */
            printf("2: Thread# %d, x = %d\n", omp_get_thread_num(), x);
        }
        else
        {
            /* Print 3 */
            printf("3: Thread# %d, x = %d\n", omp_get_thread_num(), x);
        }
    }
    return 0;
}
[root@cent142 openmp]# gcc -fopenmp -o ttt ttt.c
[root@cent142 openmp]# ./ttt
1: Thread 0: x = 2
2: Thread# 0, x = 5
3: Thread# 1, x = 5
[Fortran Code Example]
[root@cent142 openmp]# vi ttt.f
      PROGRAM A2
      INCLUDE "omp_lib.h"
      INTEGER X
      X = 2
!$OMP PARALLEL NUM_THREADS(2) SHARED(X)
      IF (OMP_GET_THREAD_NUM() .EQ. 0) THEN
         X = 5
      ELSE
!     PRINT 1: The following read of x has a race
         PRINT *,"1: THREAD# ", OMP_GET_THREAD_NUM(), "X = ", X
      ENDIF
!$OMP BARRIER
      IF (OMP_GET_THREAD_NUM() .EQ. 0) THEN
!     PRINT 2
         PRINT *,"2: THREAD# ", OMP_GET_THREAD_NUM(), "X =", X
      ELSE
!     PRINT 3
         PRINT *,"3: THREAD# ", OMP_GET_THREAD_NUM(), "X =", X
      ENDIF
!$OMP END PARALLEL
      END
[root@cent142 openmp]# gfortran -fopenmp -o ttt.for ttt.f
[root@cent142 openmp]# ./ttt.for
1: THREAD# 1 X = 5
2: THREAD# 0 X = 5
3: THREAD# 1 X = 5
MPICH2:
[Setup]
(1) Install Packages
[root@cent142 ~]# yum install mpich2.x86_64 mpich2-devel.x86_64 -y
[root@cent143 ~]# yum install mpich2.x86_64 mpich2-devel.x86_64 -y
[root@cent144 ~]# yum install mpich2.x86_64 mpich2-devel.x86_64 -y
MPICH2 Version: 1.2.1
MPICH2 Release date: Unknown, built on Fri Nov 12 05:07:30 GMT 2010
MPICH2 Device: ch3:nemesis
MPICH2 configure: --build=x86_64-unknown-linux-gnu....
MPICH2 CC: gcc ....
MPICH2 CXX: c++ ....
MPICH2 F77: gfortran ....
MPICH2 F90: gfortran ....
(2) /etc/hosts for all servers (scp /etc/hosts cent14X:/etc/.)
172.16.173.142 cent142
172.16.173.143 cent143
172.16.173.144 cent144
(3) a hosts file for MPI (~/src/mpich2/mpd.hosts)
cent142
cent143
cent144
(4) mpd.conf for all servers (the password checked when a task is submitted)
## all servers must have the same password
cat > /etc/mpd.conf << "EOF"
secretword=********
EOF
chmod 600 /etc/mpd.conf
scp /etc/mpd.conf cent142:/etc/.
scp /etc/mpd.conf cent143:/etc/.
scp /etc/mpd.conf cent144:/etc/.
(5) password used when a client submits a task to the server
## it must be the same as the server's
cat > ~/.mpd.conf << "EOF"
secretword=********
EOF
(6)password-less ssh
(7)start up mpich2
cd ~/src/mpich2
mpdboot -n 3 -f ./mpd.hosts
mpdboot_cent142 (handle_mpd_output 420): from mpd on cent142, invalid port info: no_port
If mpdboot fails with this error, check:
(1) ~/.mpd.conf exists and is named exactly ~/.mpd.conf
(2) ~/.mpd.conf and /etc/mpd.conf have permission 600
(3) a hybrid x86 / x64 architecture is not allowed for mpich2
openmpi or mpich2 instead
mpdtrace
cent142
cent144
cent143
[root@cent142 mpich2]# mpiexec -n 6 /bin/hostname
cent143
cent142
cent142
cent143
cent144
cent144
[root@cent142 mpich2]# mpiexec -n 3 date : -n 4 hostname
cent144
Wed Oct  3 18:53:19 CST 2012
cent143
Wed Oct  3 18:53:19 CST 2012
cent142
cent142
Wed Oct  3 18:53:19 CST 2012
[root@cent142 mpich2]# mpiexec -n 3 /bin/date : -n 2 -host cent143 /bin/hostname
Wed Oct  3 15:02:00 CST 2012
cent143
cent143
Wed Oct  3 15:02:00 CST 2012
Wed Oct  3 15:02:00 CST 2012
[root@cent142 mpich2]# mpiexec -machinefile mpd2.hosts -n 3 /bin/date
Wed Oct  3 18:54:47 CST 2012
Wed Oct  3 18:54:47 CST 2012
Wed Oct  3 18:54:47 CST 2012
(8) shut down the mpd service on all servers
mpdallexit
[C/C++ Code Example]
[root@cent142 ~]# mkdir -p ~/src/mpich2; cd ~/src/mpich2
[root@cent142 mpich2]# cat pi_cc.cc
#include <math.h>
#include "mpi.h"
#include <iostream>
int main(int argc, char *argv[])
{
    int n, rank, size, i;
    double PI25DT = 3.141592653589793238462643;
    double mypi, pi, h, sum, x;
    MPI::Init(argc, argv);
    size = MPI::COMM_WORLD.Get_size();
    rank = MPI::COMM_WORLD.Get_rank();
    while (1) {
        if (rank == 0) {
            std::cout << "Enter the number of intervals: (0 quits)"
                      << std::endl;
            std::cin >> n;
        }
        // every rank receives n from rank 0
        MPI::COMM_WORLD.Bcast(&n, 1, MPI::INT, 0);
        if (n == 0)
            break;
        else {
            h = 1.0 / (double) n;
            sum = 0.0;
            // rank r handles intervals r+1, r+1+size, r+1+2*size, ...
            for (i = rank + 1; i <= n; i += size) {
                x = h * ((double)i - 0.5);
                sum += (4.0 / (1.0 + x*x));
            }
            mypi = h * sum;
            // add up the partial results on rank 0
            MPI::COMM_WORLD.Reduce(&mypi, &pi, 1, MPI::DOUBLE,
                                   MPI::SUM, 0);
            if (rank == 0)
                std::cout << "pi is approximately " << pi
                          << ", Error is " << fabs(pi - PI25DT)
                          << std::endl;
        }
    }
    MPI::Finalize();
    return 0;
}
[root@cent142 mpich2]# mpic++ -o pi_cc pi_cc.cc
[root@cent142 mpich2]# mpdboot -n 3 -f ./mpd.hosts
[root@cent142 mpich2]# mpiexec -n 12 ./pi_cc
[root@cent142 mpich2]# ps -ef | grep pi_cc
root 4039 2350 0 16:27 pts/0 00:00:00 python2.6 /usr/bin/mpiexec -n 12 ./pi_cc
root 4045 4043 49 16:27 ? 00:01:01 ./pi_cc
root 4046 4044 49 16:27 ? 00:01:00 ./pi_cc
root 4047 4041 0 16:27 ? 00:00:00 ./pi_cc
root 4052 1357 0 16:29 pts/1 00:00:00 grep pi_cc
[root@cent143 ~]# ps -ef | grep pi_cc
root 4171 4166 20 16:27 ? 00:00:29 ./pi_cc
root 4172 4168 19 16:27 ? 00:00:27 ./pi_cc
root 4173 4169 18 16:27 ? 00:00:26 ./pi_cc
root 4174 4165 20 16:27 ? 00:00:29 ./pi_cc
root 4175 4170 20 16:27 ? 00:00:28 ./pi_cc
root 4180 2277 0 16:30 pts/0 00:00:00 grep pi_cc
[root@cent144 ~]# ps -ef | grep pi_cc
root 3696 3692 25 16:27 ? 00:00:16 ./pi_cc
root 3697 3691 22 16:27 ? 00:00:14 ./pi_cc
root 3698 3694 25 16:27 ? 00:00:16 ./pi_cc
root 3699 3695 23 16:27 ? 00:00:15 ./pi_cc
root 3703 2428 0 16:28 pts/0 00:00:00 grep pi_cc
Enter the number of intervals: (0 quits)
50
pi is approximately 3.14163, Error is 3.33333e-05
Enter the number of intervals: (0 quits)
0
[root@cent142 mpich2]# mpdallexit
[Fortran Code Example]
[root@cent142 mpich2]# cat pi_ff.f
      program main
      include "mpif.h"
      double precision PI25DT
      parameter (PI25DT = 3.141592653589793238462643d0)
      double precision mypi, pi, h, sum, x, f, a
      integer n, myid, numprocs, i, ierr
c     function to integrate
      f(a) = 4.d0 / (1.d0 + a*a)
      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)
 10   if ( myid .eq. 0 ) then
         print *, 'Enter the number of intervals: (0 quits) '
         read(*,*) n
      endif
c     broadcast n
      call MPI_BCAST(n,1,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)
c     check for quit signal
      if ( n .le. 0 ) goto 30
c     calculate the interval size
      h = 1.0d0/n
      sum = 0.0d0
      do 20 i = myid+1, n, numprocs
         x = h * (dble(i) - 0.5d0)
         sum = sum + f(x)
 20   continue
      mypi = h * sum
c     collect all the partial sums
      call MPI_REDUCE(mypi,pi,1,MPI_DOUBLE_PRECISION,MPI_SUM,0,
     &                MPI_COMM_WORLD,ierr)
c     node 0 prints the answer.
      if (myid .eq. 0) then
         print *, 'pi is ', pi, ' Error is', abs(pi - PI25DT)
      endif
      goto 10
 30   call MPI_FINALIZE(ierr)
      stop
      end
[root@cent142 mpich2]# cat make_pi_ff.sh
#!/bin/bash
## mpif77 -f77=gfortran -o pi_ff pi_ff.f
## mpif77 -f77=f77 -o pi_ff pi_ff.f   <-- behaves very differently from gfortran
mpif77 -o pi_ff pi_ff.f
[root@cent142 mpich2]# ./make_pi_ff.sh
[root@cent142 mpich2]# mpdboot -n 3 -f ./mpd.hosts
[root@cent142 mpich2]# mpiexec -n 12 ./pi_ff
10
0
Enter the number of intervals: (0 quits)
pi is 3.1424259850010983 Error is 8.33331411305149317E-004
Enter the number of intervals: (0 quits)
[A] comparison of different -n values when computing pi with Monte Carlo
[root@cent142 mpich2]# time mpiexec -n 3 ./monte_cc 0.00001
pi = 3.14159455782312946326
points: 735000
in: 577268, out: 157732, <ret> to exit
real 0m8.805s
user 0m0.052s
sys 0m0.015s
[root@cent142 mpich2]# time mpiexec -n 6 ./monte_cc 0.00001
pi = 3.14159455782312946326
points: 735000
in: 577268, out: 157732, <ret> to exit
real 0m5.406s
user 0m0.047s
sys 0m0.015s
[root@cent142 mpich2]# time mpiexec -n 42 ./monte_cc 0.00001
pi = 3.14158287705326033645
points: 1004500
in: 788930, out: 215570, <ret> to exit
real 0m6.829s
user 0m0.053s
sys 0m0.012s
[source code]
[root@cent142 mpich2]# cat monte_cc.c
#include <stdio.h>
#include <stdlib.h>   /* random() */
#include <math.h>
#include "mpi.h"
#include "mpe.h"
#define CHUNKSIZE 1000
/* We'd like a value that gives the maximum value returned by the function
   random, but no such value is *portable*. RAND_MAX is available on many
   systems but is not always the correct value for random (it isn't for
   Solaris). The value ((unsigned(1)<<31)-1) is common but not guaranteed */
#define INT_MAX 1000000000
/* message tags */
#define REQUEST 1
#define REPLY 2
int main( int argc, char *argv[] )
{
    int iter;
    int in, out, i, iters, max, ix, iy, ranks[1], done, temp;
    double x, y, Pi, error, epsilon;
    int numprocs, myid, server, totalin, totalout, workerid;
    int rands[CHUNKSIZE], request;
    MPI_Comm world, workers;
    MPI_Group world_group, worker_group;
    MPI_Status status;
    if (argc < 2)
    {
        /* the rest of this message is truncated in the source */
        printf("you must give a number of tolerance of pi when approaching it!!, ex:\n mpiexec -n ...\n");
        return -1;
    }
    MPI_Init(&argc,&argv);
    world = MPI_COMM_WORLD;
    MPI_Comm_size(world,&numprocs);
    MPI_Comm_rank(world,&myid);
    server = numprocs-1; /* last proc is server */
    if (myid == 0)
        sscanf( argv[1], "%lf", &epsilon );
    MPI_Bcast( &epsilon, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD );
    /* build a communicator that contains every process except the server */
    MPI_Comm_group( world, &world_group );
    ranks[0] = server;
    MPI_Group_excl( world_group, 1, ranks, &worker_group );
    MPI_Comm_create( world, worker_group, &workers );
    MPI_Group_free(&worker_group);
    if (myid == server) { /* I am the rand server */
        do {
            MPI_Recv(&request, 1, MPI_INT, MPI_ANY_SOURCE, REQUEST,
                     world, &status);
            if (request) {
                for (i = 0; i < CHUNKSIZE; ) {
                    rands[i] = random();
                    if (rands[i] <= INT_MAX) i++;
                }
                MPI_Send(rands, CHUNKSIZE, MPI_INT,
                         status.MPI_SOURCE, REPLY, world);
            }
        }
        while( request>0 );
    }
    else { /* I am a worker process */
        request = 1;
        done = in = out = 0;
        max = INT_MAX; /* max int, for normalization */
        MPI_Send( &request, 1, MPI_INT, server, REQUEST, world );
        MPI_Comm_rank( workers, &workerid );
        iter = 0;
        while (!done) {
            iter++;
            request = 1;
            MPI_Recv( rands, CHUNKSIZE, MPI_INT, server, REPLY,
                      world, &status );
            /* each pair of random numbers gives one point (x, y) in [-1, 1] */
            for (i=0; i<CHUNKSIZE-1; ) {
                x = (((double) rands[i++])/max) * 2 - 1;
                y = (((double) rands[i++])/max) * 2 - 1;
                if (x*x + y*y < 1.0)
                    in++;
                else
                    out++;
            }
            MPI_Allreduce(&in, &totalin, 1, MPI_INT, MPI_SUM,
                          workers);
            MPI_Allreduce(&out, &totalout, 1, MPI_INT, MPI_SUM,
                          workers);
            Pi = (4.0*totalin)/(totalin + totalout);
            error = fabs( Pi-3.141592653589793238462643);
            /* stop at the requested tolerance or after 1000000 points */
            done = (error < epsilon || (totalin+totalout) > 1000000);
            request = (done) ? 0 : 1;
            if (myid == 0) {
                printf( "\rpi = %23.20f", Pi );
                MPI_Send( &request, 1, MPI_INT, server, REQUEST,
                          world );
            }
            else {
                if (request)
                    MPI_Send(&request, 1, MPI_INT, server, REQUEST,
                             world);
            }
        }
        MPI_Comm_free(&workers);
    }
    if (myid == 0) {
        printf( "\npoints: %d\nin: %d, out: %d, <ret> to exit\n",
                totalin+totalout, totalin, totalout );
        getchar();
    }
    MPI_Finalize();
    return 0;
}
[B]mpiexec for fortran
[root@cent142 mpich2]# mpiexec -n 3 ./pitm_ff
20
0
Enter the number of intervals: (0 quits)
pi is 3.1418009868930934 Error is 2.08333303300278772E-004
time is 7.91406631469726563E-003 seconds
Enter the number of intervals: (0 quits)
[source code]
[root@cent142 mpich2]# cat pitm_ff.f
      program main
      include "mpif.h"
      double precision PI25DT
      parameter (PI25DT = 3.141592653589793238462643d0)
      double precision mypi, pi, h, sum, x, f, a
      double precision starttime, endtime
      integer n, myid, numprocs, i, ierr
c     function to integrate
      f(a) = 4.d0 / (1.d0 + a*a)
      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)
 10   if ( myid .eq. 0 ) then
         print *, 'Enter the number of intervals: (0 quits) '
         read(*,*) n
      endif
c     broadcast n
      starttime = MPI_WTIME()
      call MPI_BCAST(n,1,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)
c     check for quit signal
      if ( n .le. 0 ) goto 30
c     calculate the interval size
      h = 1.0d0/n
      sum = 0.0d0
      do 20 i = myid+1, n, numprocs
         x = h * (dble(i) - 0.5d0)
         sum = sum + f(x)
 20   continue
      mypi = h * sum
c     collect all the partial sums
      call MPI_REDUCE(mypi,pi,1,MPI_DOUBLE_PRECISION,MPI_SUM,0,
     &                MPI_COMM_WORLD,ierr)
c     node 0 prints the answer.
      endtime = MPI_WTIME()
      if (myid .eq. 0) then
         print *, 'pi is ', pi, ' Error is', abs(pi - PI25DT)
         print *, 'time is ', endtime-starttime, ' seconds'
      endif
      goto 10
 30   call MPI_FINALIZE(ierr)
      stop
      end
[Reference]
http://trac.nchc.org.tw/grid/wiki/mpich
http://www.mcs.anl.gov/research/projects/mpich2/
http://openmp.org/wp/
http://www.mcs.anl.gov/research/projects/mpi/usingmpi/examples/simplempi/main.htm
Hydra is widely discussed as the replacement for mpdboot. You can use mpiexec.hydra (without mpdboot) for your MPI cluster.
[root@cent142 mpich2]# cat mpd_hydra.hosts
cent142
cent143:4
cent144:2
[root@cent142 mpich2]# time mpiexec.hydra -f mpd_hydra.hosts -n 3 ./monte_cc 0.00001
pi = 3.14159455782312946326
points: 735000
in: 577268, out: 157732, <ret> to exit
real 0m2.184s
user 0m0.001s
sys 0m0.005s
[root@cent142 mpich2]# time mpiexec.hydra -f mpd_hydra.hosts -n 6 ./monte_cc 0.00001
pi = 3.14159455782312946326
points: 735000
in: 577268, out: 157732, <ret> to exit
real 0m1.447s
user 0m0.001s
sys 0m0.019s
[root@cent142 mpich2]# mpiexec.hydra --help
Since mpich2-1.3.x, Hydra has been the default process manager in MPICH2.
MPICH2 v1.2.x did not behave well when I tested it with the 'PI' and 'Monte Carlo' examples. I installed a newer version (1.4.1p1) and the problems went away.
[root@cent145 ~]# wget http://kojipkgs.fedoraproject.org/packages/mpich2/1.4.1p1/8.fc18/src/mpich2-1.4.1p1-8.fc18.src.rpm
[root@cent145 ~]# yum install rpm-build-4.8.0-27.el6.x86_64
[root@cent145 ~]# rpm -i mpich2-1.4.1p1-8.fc18.src.rpm
warning: user mockbuild does not exist - using root
warning: user mockbuild does not exist - using root
:
:
[root@cent145 ~]# cd rpmbuild/
[root@cent145 rpmbuild]# cd SPECS
[root@cent145 SPECS]# yum install libXt-devel bison flex libuuid-devel java-devel-openjdk gcc-gfortran hwloc-devel automake autoconf libtool gettext valgrind-devel gcc-c++-4.4.6-4.el6.x86_64 make
[root@cent145 SPECS]# rpmbuild -bb mpich2.spec
:
:
Checking for unpackaged file(s): /usr/lib/rpm/check-files /root/rpmbuild/BUILDROOT/mpich2-1.4.1p1-8.el6.x86_64
Wrote: /root/rpmbuild/RPMS/x86_64/mpich2-1.4.1p1-8.el6.x86_64.rpm
Wrote: /root/rpmbuild/RPMS/x86_64/mpich2-autoload-1.4.1p1-8.el6.x86_64.rpm
Wrote: /root/rpmbuild/RPMS/x86_64/mpich2-devel-1.4.1p1-8.el6.x86_64.rpm
Wrote: /root/rpmbuild/RPMS/noarch/mpich2-doc-1.4.1p1-8.el6.noarch.rpm
Executing(%clean): /bin/sh -e /var/tmp/rpm-tmp.L5CRnN
+ umask 022
+ cd /root/rpmbuild/BUILD
+ cd mpich2-1.4.1p1
+ /bin/rm -rf /root/rpmbuild/BUILDROOT/mpich2-1.4.1p1-8.el6.x86_64
+ exit 0
[root@cent145 SPECS]# cd ../RPMS
[root@cent145 RPMS]# ls -l noarch x86_64
noarch:
total 4240
-rw-r--r-- 1 root root 4341227 2012-10-04 20:23 mpich2-doc-1.4.1p1-8.el6.noarch.rpm
x86_64:
total 7748
-rw-r--r-- 1 root root 7385889 2012-10-04 20:23 mpich2-1.4.1p1-8.el6.x86_64.rpm
-rw-r--r-- 1 root root 9670 2012-10-04 20:23 mpich2-autoload-1.4.1p1-8.el6.x86_64.rpm
-rw-r--r-- 1 root root 531632 2012-10-04 20:23 mpich2-devel-1.4.1p1-8.el6.x86_64.rpm
#Add a new repository(createrepo) for mpich2,
#and then install mpich2 and all dependencies with yum for all servers.
[root@cent145 SPECS]# yum install mpich2-devel
[root@cent146 SPECS]# yum install mpich2-devel
[root@cent146 mpich2]# find / -name mpiexec
/usr/lib64/mpich2/bin/mpiexec
[root@cent146 mpich2]# vi ~/.bash_profile
:
:
#PATH=$PATH:$HOME/bin
PATH=$PATH:$HOME/bin:/usr/lib64/mpich2/bin
LD_LIBRARY_PATH=/usr/lib64/mpich2/lib:$LD_LIBRARY_PATH
export PATH
export LD_LIBRARY_PATH
~
[root@cent146 mpich2]# scp ~/.bash_profile cent145:~/.
[root@cent146 mpich2]# source ~/.bash_profile
[root@cent146 mpich2]# which mpicc
/usr/lib64/mpich2/bin/mpicc
[root@cent146 mpich2]# yum install gcc-c++
[root@cent146 mpich2]# mpic++ -o pi_cc pi_cc.cc -L/usr/lib64/mpich2/lib/ -lmpl -lopa
[root@cent146 mpich2]# cat hydra.hosts
cent145
cent146
[root@cent146 mpich2]# mpiexec -f hydra.hosts -n 2 ./pi_cc
Enter the number of intervals: (0 quits)
1
pi is approximately 3.2, Error is 0.0584073
Enter the number of intervals: (0 quits)
2
pi is approximately 3.16235, Error is 0.0207603
Enter the number of intervals: (0 quits)
10
pi is approximately 3.14243, Error is 0.000833331
Enter the number of intervals: (0 quits)
0
## here 'mpiexec' is the same as 'mpiexec.hydra'