Parallel Colt 0.10.1

Package cern.colt.matrix.tdouble.algo

Linear Algebraic matrix computations operating on DoubleMatrix2D and DoubleMatrix1D.

See: Description

Package cern.colt.matrix.tdouble.algo Description

Linear Algebraic matrix computations operating on DoubleMatrix2D and DoubleMatrix1D.

Overview

The linalg package provides easy and performant access to compute intensive Linear Algebra. Much functionality is concentrated in class DenseDoubleAlgebra and SparseDoubleAlgebra. Five fundamental matrix decompositions, which consist of pairs or triples of matrices, permutation vectors, and the like, produce results in five decomposition classes.  These decompositions are accessed by the Algebra class to compute solutions of simultaneous linear equations, determinants, inverses and other matrix functions.  The five decompositions are

Colt and Jama

This package could only be rolled out easily because it is to a large degree adapted from interfaces and implementations of the Jama matrix package. See the Jama homepage. Due credit is given to Joe Hicklin, Cleve Moler, Peter Webb, Ronald F. Boisvert, Bruce Miller, Roldan Pozo and Karin Remington, the Jama authors from MathWorks and NIST.

Design Issues

Jama matrices are of type Jama.Matrix, Colt matrices of type cern.colt.matrix.tdouble.DoubleMatrix1D, cern.colt.matrix.tdouble.DoubleMatrix2D and cern.colt.matrix.DoubleMatrix3D.

Jama.Matrix is not a general-purpose array class. It is designed for a single special purpose: Linear algebra. Because of its limited scope, Jama can combine data structure and algorithms in a class Jama.Matrix. In contrast, Colt matrices are general-purpose array classes. Since multi-dimensional matrices (arrays) have many applications, of which only one is linear algebra, Colt matrix packages are designed to avoid fat interfaces, yet allow to form the basis on top of which a broad set of functionality and applications can be defined (a similar spirit is used in STL and IBM Array). Thus, data structure and special-purpose algorithms are separated. Class Algebra works on DoubleMatrix2D and contains the operations of Jama.Matrix, but holds no data structure. Class DoubleMatrix2D contains an efficient and flexible multi-dimensional array data structure, as well as multi-purpose operations, but (almost) no linear algebraic operations.

As a consequence a Colt user initially faces some additional complexity, but after getting used to such a design, will honour the fact that logically related functionality is logically separated. For example, if a user is not interested in Formatting, Sorting, Partitioning, Statistics, etc. he/she does not see this functionality, because it is neither defined in the linalg package nor the matrix package, but somewhere else.

Perhaps more importantly, such a design will scale over time, as more and more functionality from many scientific and engineering domains is added. Also see matrix algorithms.

Functionality

All methods of Jama.Matrix are provided in Algebra, except for some less important convenience methods. Colt matrices (similar to IBM Arrays) are powerful and flexible data structures. Subrange, slice, dice, flip, selection and sort views are available for Colt matrices, but not for Jama matrices. (They are difficult to implement efficiently with Jama matrices, because they internally use double[][] arrays).

Performance

No extensive performance studies have been carried out so far.
Jama matrices weakly encapsulate a normal double[][] array. Dense Colt matrices strongly encapsulate a double[] array and use some arithmetic to address cells in 2-d. Addressing a cell is more expensive using double[][] arrays, due to bounds-checking, null pointer checks, non-contigous memory, and problems that compilers have to optimize such code. Using double[] arrays less bounds-checking, less null pointer checks, better cache locality and better compiler optimizations can be seen, often eliminating bounds-checking and null-pointer checks, paving the way for effective pipelining. See the publications of IBM Watson's Ninja project.

To improve performance, matrix computations should use highly optimized kernels in innermost loops. These kernels are not part of class Algebra, but part of DoubleMatrix2D and DoubleMatrix1D. Otherwise they couldn't be fully optimized. For example, with some arithmetic (not exposed to a user), a loop over a 1-d or 2-d matrix can internally reduce cell adressing overhead. Some of the most critical types of (innermost) loop operations have a corresponding optimized method in DoubleMatrix2D and DoubleMatrix1D. For example, dot products, multiplications, assign(function) transforms and aggregate methods are such internally specialized kernels. Feedback may result in a few more optimized kernels. Thus, in the name of performance, in a few cases, algorithms and data structure are not completely separeted.

Some internal optimizations have been introduced, in particular for multiplications, the LU-Decomposition and the Cholesky-Decomposition. The other decomposition classes are almost identical to the corresponding Jama classes - as such they are functional but not (yet) particularly efficient.

For small matrices, you may be better off using Sun's Java 3D 1.2, see javax.vecmath - spec and javax.vecmath javadoc.

Parallel Colt 0.10.1

Jump to the Parallel Colt Homepage