Experimental evaluation of DiVinE Tools
The overall goal of DiVinE is to efficiently utilize available
computational resoururces to complete its verification task or complete it
sooner. The efficiency of selected DiVinE Tools is reported in the following
several case studies.
DiVinE Cluster on DAS3
In this experiment we evaluated performance of algorithms MAP and OWCTY
as implemented in DiVinE Cluster. Our primary aim was to show that the
tool can scale up to 256 CPU cores. Our performance evaluation was done on
the Distributed ASCI Supercomputer (DAS3), a widearea distributed system
for Computer Science research in the Netherlands. The casestudy was done
by Kees Verstoep from VU University, Amsterdam.
Distributed ASCI Supercomputer (DAS3)
DAS3 consists of five clusters distributed over four sites. It uses
Myri10G networking technology from Myricom both as an internal highspeed
interconnect as well as an interface to remote DAS3 clusters. DAS3 is
largely homogeneous: every cluster uses dualCPU AMD Opteron nodes, but
with different clock speeds and/or number of CPU cores. For the purpose
of this experiment we employ DAS3 cluster at VU University, which has the
largest number of compute nodes and cores (85 nodes with a dualcpu,
dualcore 2.4 GHz AMD Opterons).
Intitial experiments on DAS3 were performed with DiVinE Cluster
version 0.7.2 that contained several performance flaws due to which an
efficient usage of the tool was limited to clusters with small number of
nodes (less than 20). However, we succeed to optimize the tool to
efficently employ much larger number of cores that were available on DAS3
system. The optimized implementation reffered below corresponds to the
DiVinE Cluster version 0.8 (will be released shortly).
Experiments on Scalability and Efficiency
First, we show graphs depicting the scalability of unoptimized and
optimized implementations. For the purpose of this experiment, we run
DiVinE Cluster on anderson.8.prop.dve model from BEEM database. Each line in the
graphs below shows scalability for the given number of nodes when using 1
core per node, 2 cores per node, and 4 cores per node.
MAP Scalability 

OWCTY Scalability 



To get an estimate on the efficiency of MAP and OWCTY, we ran
singlecore versions on a special DAS3/VU node equipped with 16GB memory
(the regular compute nodes have 4GB, which is insufficient to store the
state space there). The efficiencies of optimized implementations are
quite good even for 256 cores. In particular, MAP on a single core took
1052.5 sec; the runtimes on 64, 128, and 256 cores are 23.0, 13.1, and 8.9
sec, giving efficiencies of 72%, 63% and 46%, respectively. OWCTY on a
single core took 631.7 sec; the runtime on 64, 128, and 256 cores are
13.3, 7.4 and 5.0 sec, giving efficiencies of 74%, 67% and 49%,
respectively. The results are shown in the two following tables.
MAP Efficiency 

OWCTY Efficiency 
Cores  Runtime (sec)  Efficiency 
1  1052.5  100% 
64  23.0  72% 
128  13.1  63% 
256  8.9  46% 


Cores  Runtime (sec)  Efficiency 
1  631.7  100% 
64  13.3  74% 
128  7.4  67% 
256  5.0  49% 

Measuring Network Throughput
The following graphs illustrate the effect of the improvements on the
applications' throughput as a function of their runtime. In this case a
larger instance of the Anderson problem was used to obtain a higher
resolution graph.
MAP Throughput 

OWCTY Throughput 



DiVinE MultiCore Performance Evaluation
The following tables reports on selected runtimes for Reachability and
LTL model checking verification tasks performed using DiVinE MultiCore
version 1.1. You may also be interested in the table with all
experiments.
Reported runtimes are in seconds (rounded). All models come from the
BEEM database. The
meassurements were performed for two different architectures:
manwe:  8x Dual Core AMD Opteron(tm) Processor 885

pheme:  2x Intel Xeon CPU 5130 @ 2.00GHz

The value X denotes that the
corresponding experiment didn't complete (memory overflow).
The value
 denotes that the corresponding
experiment was not performed.
The !
next to time means that a counterexample has been found before exploring
full state space of the model.
Reachability (generation of all reachable states)
Model   divine @
manwe_64bit
1 core  2 cores  4 cores  8 cores  16
cores 
  divine @
pheme_64bit
1 core  2 cores  3 cores  4 cores 

    
    
  2:49:29 X  25:28 X  16:28 X  11:44 X  13:18 X 
  34:52 X  5:14 X  3:17 X  2:48 X 

    
    
    
  30:39  24:29  18:22  19:17  18:46 
  24:17 X  18:40 X  14:20 X  14:08 X 

    
  2:15:04 X  1:52:44 X  30:47 X  12:53 X  5:19 X 
  34:43 X  20:05 X  7:21 X  6:51 X 

    
    
    
    
  2:12:03 X  32:43 X  15:50 X  0 X  5:11 X 
  24:23 X  8:05 X  4:59 X  4:06 X 

    
  1:04:03  30:03  13:49  7:20  3:33 
  
  15:06:33 X  3:30:19 X  1:12:51 X  31:56 X  13:04 X 
  2:22:41 X  34:44 X  19:48 X  14:44 X 

    
    
  13:55  12:22  12:24  12:24  13:42 
  10:09 X  10:16 X  10:09 X  10:15 X 

    
  1:53:48  43:39 X  32:26  17:33  9:20 
  40:07 X  6:18 X  3:41 X  3:34 X 

    
LTL Model checking
Model   divine @ manwe_64bit
1 core  2 cores  4
cores  8 cores  16 cores 
  divine @ pheme_64bit
1 core  2 cores  3
cores  4 cores 

    
    
  49:21  38:16  29:56  30:45  32:04 
  16:35 X  10:44 X  7:05 X  7:15 X 

elevator2.3.prop4  (valid) 
    
elevator2.3a.prop4  (valid) 
    
    
  1:26:43 X  20:32 X  10:03 X  6:04 X  0 X 
  21:17 X  5:13 X  3:16 X  2:40 X 

  1:32:48 X  20:01 X  9:46 X  6:00 X  3:09 X 
  22:49 X  5:10 X  3:07 X  2:36 X 

leader_election.5.prop2  (valid) 
    
leader_election.6.prop2  (valid) 
    
leader_filters.6.prop2  (valid) 
  1:06:16  37:05  21:42  15:54  11:06 
  12:35 X  5:03 X  3:16 X  2:33 X 

leader_filters.7.prop2  (valid) 
    
    
  1:38:18  52:58  32:12  20:20  13:35 
  26:06 X  4:27 X  3:09 X  2:45 X 

  1:51:46  58:48  36:18  22:40  15:41 
  26:22 X  7:49 X  5:55 X  4:19 X 

public_subscribe.5.prop1  (n/a) 
  1:20:41 X  22:36 X  10:17 X  5:13 X  2:41 X 
  20:30 X  5:41 X  3:29 X  2:37 X 

    
    
szymanski.4.prop4  (valid) 
    
anderson.3.prop2  (invalid) 
    
    
    
    
elevator.5.prop2  (invalid) 
  1:28:30  1:12:42  2:48
!  1:08
!  4
! 
  17:07 X  7:29 X  6:22 X  2:23 ! 

iprotocol.4.prop4  (invalid) 
    
lamport.6.prop2 
(invalid) 
    
  1:57:49  24:56 !  39:44  26:21  19:18 
  25:22 X  7:05 X  5:33 X  4:20 X 

peterson.5.prop2  (invalid) 
  1:43:02  14:20 !  36:16  23:49  17:14 
  27:27 X  4:23 X  2:21 !  1:40 ! 

    
szymanski.5.prop2  (invalid) 
  1:00:52  7:34 !  1:20 !  8 !  3 ! 
  
    