Scotch Brand 5.1.10 user manual

Abstracts of contents

Summary of the content on the page No. 1

Scotch and libScotch 5.1 User’s Guide
(version 5.1.10)
Franc¸ois Pellegrini
Bacchus team, INRIA Bordeaux Sud-Ouest
ENSEIRB & LaBRI, UMR CNRS 5800
Universit´e Bordeaux I
351 cours de la Lib´eration, 33405 TALENCE, FRANCE
pelegrin@labri.fr
August 29, 2010
Abstract
This document describes the capabilities and operations of Scotch and
libScotch, a software package and a software library devoted to static
mapping, partitioning, and sparse matrix block ordering of graphs and
meshes/hypergraphs. It giv

Summary of the content on the page No. 2

Contents 1 Introduction 6 1.1 Static mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.2 Sparse matrix ordering . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.3 Contents of this document . . . . . . . . . . . . . . . . . . . . . . . . 7 2 The Scotch project 7 2.1 Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 2.2 Availability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 3 Algorithms 8 3.1 Static mapping by Dual Re

Summary of the content on the page No. 3

6.3.5 gcv . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 6.3.6 gmap / gpart . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 6.3.7 gmk * . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 6.3.8 gmk msh . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 6.3.9 gmtst . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 6.3.10 gord . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 6.3.11 gotst . . . . . . . . . . . .

Summary of the content on the page No. 4

7.5.3 SCOTCH graphFree . . . . . . . . . . . . . . . . . . . . . . . . 77 7.5.4 SCOTCH graphLoad . . . . . . . . . . . . . . . . . . . . . . . . 78 7.5.5 SCOTCH graphSave . . . . . . . . . . . . . . . . . . . . . . . . 79 7.5.6 SCOTCH graphBuild . . . . . . . . . . . . . . . . . . . . . . . 79 7.5.7 SCOTCH graphBase . . . . . . . . . . . . . . . . . . . . . . . . 80 7.5.8 SCOTCH graphCheck . . . . . . . . . . . . . . . . . . . . . . . 81 7.5.9 SCOTCH graphSize . . . . . . . . . . . . . . . . . .

Summary of the content on the page No. 5

7.10.2 SCOTCH stratExit . . . . . . . . . . . . . . . . . . . . . . . . 108 7.10.3 SCOTCH stratSave . . . . . . . . . . . . . . . . . . . . . . . . 109 7.10.4 SCOTCH stratGraphBipart . . . . . . . . . . . . . . . . . . . 109 7.10.5 SCOTCH stratGraphMap . . . . . . . . . . . . . . . . . . . . . 110 7.10.6 SCOTCH stratGraphMapBuild . . . . . . . . . . . . . . . . . . 110 7.10.7 SCOTCH stratGraphOrder . . . . . . . . . . . . . . . . . . . . 111 7.10.8 SCOTCH stratGraphOrderBuild . . . . . . . . . .

Summary of the content on the page No. 6

1 Introduction 1.1 Static mapping The eﬃcient execution of a parallel program on a parallel machine requires that the communicating processes of the program be assigned to the processors of the machine so as to minimize its overall running time. When processes have a lim- ited duration and their logical dependencies are accounted for, this optimization problem is referred to as scheduling. When processes are assumed to coexist simul- taneously for the entire duration of the program, it is referr

Summary of the content on the page No. 7

1.3 Contents of this document This document describes the capabilities and operations of Scotch, a software packagedevotedtostatic mapping, graphandmeshpartitioning, andsparsematrix block ordering. Scotch allows the user to map eﬃciently any kind of weighted process graph onto any kind of weighted architecture graph, and provides high- quality block orderings of sparse matrices. The rest of this manual is organized as follows. Section 2 presents the goals of the Scotch project, and section 3 out

Summary of the content on the page No. 8

The free/libre software license under which Scotch 5.1 is distributed is the CeCILL-C license [6], which has basically the same features as the GNU LGPL (“Lesser General Public License”): ability to link the code as a library to any free/libre or even proprietary software, ability to modify the code and to redistribute these modiﬁcations. Version 4.0 of Scotch was distributed under the LGPL itself. Please refer to section 8 to see how to obtain the free/libre distribution of Scotch. 3 Algorithms

Summary of the content on the page No. 9

This function, which has already been considered by several authors for hyper- cube target topologies [11, 21, 25], has several interesting properties: it is easy to compute, allows incremental updates performed by iterative algorithms, and its minimization favors the mapping of intensively intercommunicating processes onto nearby processors; regardless of the type of routage implemented on the target machine (store-and-forward or cut-through), it models the traﬃc on the intercon- nection networ

Summary of the content on the page No. 10

callsagraph bipartitioning algorithm tosplitthesubsetofprocessesassociatedwith the domain across the two subdomains, as sketched in the following. mapping (D, P) Set_Of_Processors D; Set_Of_Processes P; { Set_Of_Processors D0, D1; Set_Of_Processes P0, P1; if (|P| == 0) return; /* If nothing to do. */ if (|D| == 1) { /* If one processor in D */ result (D, P); /* P is mapped onto it. */ return; } (D0, D1) = processor_bipartition (D); (P0, P1) = process_bipartition (P, D0, D1); mapping (D0, P0); /*

Summary of the content on the page No. 11

3.1.4 Partial cost function The production of eﬃcient complete mappings requires that all graph bipartition- ings favor the criteria that we have chosen. Therefore, the bipartitioning of a ′ subgraph S of S should maintain load balance within the user-speciﬁed tolerance, ′ and minimize the partial communication cost function f , deﬁned as C X def ′ ′ ′ f (τ ,ρ ) = w ({v,v})|ρ ({v,v})| , S,T S,T S S,T C ′ v ∈ V(S ) ′ {v,v } ∈E(S) ′ which accounts for the dilation of edges internal to subgraph S a

Summary of the content on the page No. 12

neighbor processes, whether they are handled by the job itself or not, since it can ′ estimateinf thedilationofthecorrespondingedges. Thisresultsinaninteresting C feedback eﬀect: once an edge has been kept in a cut between two subdomains, the distancebetweenits endverticeswill beaccountedforinthe partialcommunication cost function to be minimized, and following jobs will be more likely to keep these vertices close to each other, as illustrated in Figure 2. The relative eﬃciency of CL0 CL0 CL2 CL

Summary of the content on the page No. 13

space that derives from the one which was globally deﬁned at the coarsest level, thus preventing local optimization reﬁnement algorithms to be trapped in local optima of the ﬁner graphs [8]. Diﬀusion This global optimization method, presented in [42], ﬂows two kinds of antag- onistic liquids, scotch and anti-scotch, from two source vertices, and sets the new frontier as the limit between vertices which contain scotch and the ones which contain anti-scotch. In order to add load-balancing constrai

Summary of the content on the page No. 14

tree rooted at a randomly chosen vertex, and this process is iterated by se- lecting a new root vertex within the last layer as long as the number of layers increases. Then, starting from the current root vertex, vertices are assigned layerafter layerto the ﬁrst subdomain, until half of the total weight has been processed. Remaining vertices are then allocated to the second subdomain. AsfortheoriginalGibbs, Poole,andStockmeyeralgorithm,itisassumedthat the maximization of the number of layers res

Summary of the content on the page No. 15

for it to account for topological structures of the graph that would else be of a too high level for it to encompass in its local optimization process. 3.1.7 Mapping onto variable-sized architectures Several constrained graph partitioning problems can be modeled as mapping the problem graph onto a target architecture, the number of vertices and topology of which depend dynamically on the structure of the subgraphs to bipartition at each step. Variable-sized architectures are supported by the DRB

Summary of the content on the page No. 16

vertex separators are computed by using direct heuristics [28, 37], or from edge separators [48, and included references] by minimum cover techniques [9, 30], but other techniques such as spectral vertex partitioning have also been used [49]. Providedthat good vertex separatorsare found, the nested dissectionalgorithm produces orderings which, both in terms of ﬁll-in and operation count, compare favorably[19, 31, 46] to the ones obtained with the minimum degreealgorithm[39]. Moreover, the elimin

Summary of the content on the page No. 17

measure its quality, severalparameters canbe deﬁned: h , h , and h denote min max avg 1 the minimum, maximum, and average heights of the tree , respectively, and h dlt is the variance, expressed as a percentage of h . Since small separators result in avg small chains in the elimination tree, h should also indirectly reﬂect the quality avg of separators. 3.2.5 Ordering methods The core of our ordering algorithm uses graph ordering methods as black boxes, whichallowstheorderertorunanytypeoforderin

Summary of the content on the page No. 18

3.2.6 Graph separation methods The core of our incomplete nested dissection algorithm uses graph separation methods as black boxes. It allows the orderer to run any type of graph separation method compatible with our criteria for quality, that is, reducing the size of the vertex separator while maintaining the loads of the separated parts within some user-speciﬁed tolerance. Separation jobs maintain an internal image of the current vertex separator, indicating for every vertex of the job whether

Summary of the content on the page No. 19

Scotch can now handle compressed streams on the ﬂy, in several widely used formats such as gzip, bzip2 or lzma. Please refer to Section 6.2 for more informa- tion. 4.2 Changes from version 5.0 Anew integerindex type hasbeencreatedinthe Fortraninterface, toaddressarray indices larger than the maximum value which can be stored in a regular integer. Please refer to Section 8.3 for more information. 5 Files and data structures For the sake of portability, readability, and reduction of storage space,

Summary of the content on the page No. 20

The numeric ﬂag, similar to the one used by the Chaco graph format [24], is made of three decimal digits. A non-zero value in the units indicates that vertex weights are provided. A non-zero value in the tenths indicates that edge weights are provided. A non-zero value in the hundredths indicates that vertex labels are provided; if itis the case, verticescanbe storedinanyorderinthe ﬁle; else, natural order is assumed, starting from the graph base index. This header data is then followed by as ma

Alternative user manuals

#	User manual	Category	Download
1	Dodge ES142 User manual	Calculator	2
2	Sony CD-R/RW User manual	Calculator	24
3	Sony EL-W516B User manual	Calculator	2
4	Sony EL-W535B User manual	Calculator	2
5	Sony VGC-RBXX User manual	Calculator	1
6	Sony EL-S25 User manual	Calculator	2
7	Philips 3000 SERIES 32PFL3403D User manual	Calculator	20
8	Samsung CAMCORDER User manual	Calculator	5
9	Ativa AT-P2000 User manual	Calculator	192
10	Ativa AT-CK1 User manual	Calculator	16
11	Ativa AT-30S User manual	Calculator	37
12	Ativa PET1030 User manual	Calculator	10
13	Ativa PET720 User manual	Calculator	16
14	Casio ALGEBRA PLUS FX 2.0 User manual	Calculator	141
15	Casio 9860 User manual	Calculator	67