libFLAME Release Notes

Source code

libFLAME is free software, licensed under the GNU Lesser General Public License (LGPL). It is available in two forms:

Previous milestone release. The most recent milestone release of libFLAME, version 2.0, may be found here. Note: This release almost certainly lacks some of our most recent bugfixes and may not be significantly less bug-prone than the nightly snapshots.
Nightly snapshots. We also provide nightly snapshots of the libFLAME source tree, identified by their Subversion revision numbers. We strongly encourage interested users to download the latest nightly snapshot instead of the previous milestone release. These snapshots provide the latest functionality and bugfixes, but may be slightly more prone to newer, more short-lived bugs than the most recent stable release, simply because a snapshot may capture recently introduced bugs or other breakage before a developer can identify and correct the problem. However, we make every effort to keep interim revisions functional. That said, if you think you've found a bug, please send us feedback!

FLAME is a methodology for developing dense linear algebra libraries that is radically different from the LINPACK/LAPACK approach that dates back to the 1970s. By libFLAME we denote the library that has resulted from this project. For additional information, visit the FLAME home page.

What's provided by libFLAME?

The following libFLAME features benefit both basic and advanced users, as well as library developers:

A solution based on fundamental computer science. The FLAME project advocates a new approach to developing linear algebra libraries. Algorithms are obtained systematically according to rigorous principles of formal derivation. These methods are based on fundamental theorems of computer science to guarantee that the resulting algorithm is correct. In addition, the FLAME methodology uses a new, more stylized notation for expressing loop-based linear algebra algorithms. This notation closely resembles how algorithms are naturally illustrated with pictures. (See Figure 1 and Figure 2 (left).)

Object-based abstractions and API. The BLAS, LAPACK, and ScaLAPACK projects place backward compatibility as a high priority, which hinders progress towards adopting modern software engineering principles such as object abstraction. libFLAME is built around opaque structures that hide implementation details of matrices, such as leading dimensions, and exports object-based programming interfaces to operate upon these structures. Likewise, FLAME algorithms are expressed (and coded) in terms of smaller operations on sub-partitions of the matrix operands. This abstraction facilitates programming without array or loop indices, which allows the user to avoid painful index-related programming errors altogether. Figure 2 compares the coding styles of libFLAME and LAPACK, highlighting the inherent elegance of FLAME code and its striking resemblance to the corresponding FLAME algorithm shown in Figure 1. This similarity is quite intentional, as it preserves the clarity of the original algorithm as it would be illustrated on a white-board or in a publication.
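As a rough illustration of the idea, here is a minimal sketch in plain C. It is not libFLAME's API: the mat_view type and the view_* helpers are hypothetical names invented for this example. It assumes column-major storage and shows how a small object can carry the leading dimension so that code operating on a quadrant never computes an array offset itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical matrix view (not a libFLAME type). The leading dimension
   travels with the object instead of being passed around by the caller. */
typedef struct {
    double *base;  /* address of the top-left element of this view */
    size_t  m, n;  /* rows and columns visible through this view   */
    size_t  ld;    /* leading dimension of the underlying storage  */
} mat_view;

/* Wrap an existing column-major buffer. */
mat_view view_create(double *buf, size_t m, size_t n, size_t ld)
{
    assert(buf != NULL && ld >= m);
    mat_view v = { buf, m, n, ld };
    return v;
}

/* The m x n sub-view of A whose top-left corner is element (i, j). */
mat_view view_sub(mat_view A, size_t i, size_t j, size_t m, size_t n)
{
    assert(i + m <= A.m && j + n <= A.n);
    mat_view v = { (m && n) ? A.base + i + j * A.ld : NULL, m, n, A.ld };
    return v;
}

/* Split A into four quadrants; the top-left quadrant is mb x nb. */
void view_part_2x2(mat_view A, size_t mb, size_t nb,
                   mat_view *TL, mat_view *TR,
                   mat_view *BL, mat_view *BR)
{
    assert(mb <= A.m && nb <= A.n);
    *TL = view_sub(A, 0,  0,  mb,       nb);
    *TR = view_sub(A, 0,  nb, mb,       A.n - nb);
    *BL = view_sub(A, mb, 0,  A.m - mb, nb);
    *BR = view_sub(A, mb, nb, A.m - mb, A.n - nb);
}

/* The only place where index arithmetic appears. */
double *view_at(mat_view v, size_t i, size_t j)
{
    assert(i < v.m && j < v.n);
    return v.base + i + j * v.ld;
}

/* An operation written purely against views: scale every element. */
void view_scale(mat_view v, double alpha)
{
    for (size_t j = 0; j < v.n; ++j)
        for (size_t i = 0; i < v.m; ++i)
            *view_at(v, i, j) *= alpha;
}
```

With these helpers, a caller can write view_part_2x2(A, 1, 2, &TL, &TR, &BL, &BR) followed by view_scale(TR, 2.0) and never touch an offset or a leading dimension directly.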

Educational value. Aside from the potential to introduce students to formal algorithm derivation, FLAME serves as an excellent vehicle for teaching linear algebra algorithms in a classroom setting. The clean abstractions afforded by the API also make FLAME ideally suited for teaching high-performance linear algebra courses at the undergraduate and graduate level. Robert van de Geijn routinely uses FLAME in his linear algebra and numerical analysis courses. Some colleagues of the FLAME project, including Timothy Mattson of Intel Corporation, are even beginning to use the notation to teach classes elsewhere around the country. Historically, the BLAS/LAPACK style of coding has been used in these settings. However, coding in this manner tends to obscure the algorithms; students get bogged down debugging the frustrating errors that often result from indexing directly into the arrays that represent the matrices. (See Figure 2.)

A complete dense linear algebra framework. Like LAPACK, libFLAME provides ready-made implementations of common linear algebra operations. The implementations found in libFLAME mirror many of those found in the BLAS and LAPACK packages. However, unlike LAPACK, libFLAME provides a framework for building complete custom linear algebra codes. We believe such an environment is more useful because it allows users to quickly prototype a linear algebra solution that fits the needs of their application. We are currently writing a complete user's guide for libFLAME. In the meantime, users may browse the full list of routines available in libFLAME through our online doxygen documentation.

High performance. In our publications and performance graphs, we do our best to dispel the myth that user- and programmer-friendly linear algebra codes cannot yield high performance. Our FLAME implementations of operations such as Cholesky factorization and Triangular Inversion often outperform the corresponding implementations available in the LAPACK library. Figure 3 shows an example of the performance increase possible by using libFLAME compared to LAPACK. Many instances of the libFLAME performance advantage result from the fact that LAPACK provides only one variant (algorithm) of every operation, while libFLAME provides all known variants. This allows the user and/or library developer to choose which algorithmic variant is most appropriate for a given situation. libFLAME relies only on the presence of a core set of highly optimized unblocked routines to perform the small sub-problems found in FLAME algorithm codes. Additional performance results may be found here, at our linear algebra wiki.

Dependency-aware multithreaded parallelism. Until recently, the authors of the BLAS and LAPACK advocated getting shared-memory parallelism from LAPACK routines by simply linking to multithreaded BLAS. This low-level solution requires no changes to LAPACK code but also suffers from sharp limitations in terms of efficiency and scalability for small- and medium-sized matrix problems. The fundamental bottleneck to introducing parallelism directly within many algorithms is the web of data dependencies that inevitably exists between sub-problems. The libFLAME project has developed a runtime system, SuperMatrix, to detect and analyze dependencies found within FLAME algorithms-by-blocks (algorithms whose sub-problems operate only on block operands). Once dependencies are known, the system schedules sub-operations to independent threads of execution. This system is completely abstracted from the algorithm that is being parallelized and requires virtually no change to the algorithm code, but at the same time exposes abundant high-level parallelism. We have observed that this method provides increased performance for a range of small- and medium-sized problems, as shown in Figure 4. The most recent version of LAPACK does not offer any similar mechanism.
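The dependency analysis described above can be sketched in a few lines of plain C. This is only an illustration of the general technique, not SuperMatrix itself: the task record, the block ids, and the function names are all hypothetical, and the sketch assumes each task reads at most two blocks and updates exactly one block in place. A later task must wait for an earlier one whenever either of them writes a block that the other touches.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 32

/* Hypothetical task record. Block ids are small non-negative integers
   chosen by the caller; -1 marks an unused read slot. */
typedef struct {
    const char *name;
    int reads[2];  /* ids of input-only blocks                       */
    int writes;    /* id of the block that is read and overwritten   */
} task;

/* Does task t read or write block b? */
int task_touches(const task *t, int b)
{
    return t->writes == b || t->reads[0] == b || t->reads[1] == b;
}

/* The later task must wait if either task writes a block the other
   touches (this covers flow, anti, and output dependences). */
int depends_on(const task *later, const task *earlier)
{
    return task_touches(later, earlier->writes) ||
           task_touches(earlier, later->writes);
}

/* Set dep[j][i] = 1 when task j must wait for task i, where i precedes j
   in program order. Returns the number of dependence edges found. */
int build_dependencies(const task *tasks, int n,
                       unsigned char dep[MAX_TASKS][MAX_TASKS])
{
    assert(n >= 0 && n <= MAX_TASKS);
    int edges = 0;
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < n; ++i) {
            dep[j][i] = (unsigned char)(i < j &&
                                        depends_on(&tasks[j], &tasks[i]));
            edges += dep[j][i];
        }
    return edges;
}

/* A task may be handed to a thread once all of its predecessors are done. */
int is_ready(int j, int n, unsigned char dep[MAX_TASKS][MAX_TASKS],
             const unsigned char *done)
{
    for (int i = 0; i < n; ++i)
        if (dep[j][i] && !done[i])
            return 0;
    return 1;
}
```

For the first iteration of a Cholesky factorization on a 3-by-3 grid of blocks, the two triangular solves that follow the factorization of the top-left block come out independent of each other, so a scheduler built on this information could run them on separate threads.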

Support for hierarchical storage-by-blocks. Storing matrices by blocks, a concept advocated years ago by Fred Gustavson of IBM, often yields performance gains through improved spatial locality. Instead of representing matrices as a single linear array of data with a prescribed leading dimension as legacy libraries require (for column- or row-major order), the storage scheme is encoded into the matrix object. Here, internal elements refer recursively to child objects that represent sub-matrices. Currently, libFLAME provides a subset of the conventional API that supports hierarchical matrices, allowing users to create and manage such matrix objects as well as convert between storage-by-blocks and conventional "flat" storage schemes.
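To illustrate the storage scheme, the following sketch converts a flat column-major matrix into a single level of storage-by-blocks and back. It is a simplified stand-in, not the libFLAME implementation: the blocked_matrix type and its functions are hypothetical, only one level of hierarchy is shown, blocks are square, and the matrix dimensions are assumed to be multiples of the block size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical one-level blocked matrix (not a libFLAME type): an
   mb x nb grid of children, each a contiguous b x b column-major array. */
typedef struct {
    size_t   mb, nb;  /* number of block rows and block columns       */
    size_t   b;       /* dimension of each square block               */
    double **blocks;  /* mb * nb child pointers, column-major by block */
} blocked_matrix;

void blocked_free(blocked_matrix *B)
{
    if (B == NULL) return;
    if (B->blocks != NULL) {
        for (size_t k = 0; k < B->mb * B->nb; ++k)
            free(B->blocks[k]);
        free(B->blocks);
    }
    free(B);
}

/* Copy a flat m x n matrix with leading dimension ld into blocks. */
blocked_matrix *blocked_from_flat(const double *flat, size_t m, size_t n,
                                  size_t ld, size_t b)
{
    assert(b > 0 && m > 0 && n > 0 && m % b == 0 && n % b == 0 && ld >= m);
    blocked_matrix *B = malloc(sizeof *B);
    if (B == NULL) return NULL;
    B->mb = m / b;
    B->nb = n / b;
    B->b  = b;
    B->blocks = calloc(B->mb * B->nb, sizeof *B->blocks);
    if (B->blocks == NULL) { free(B); return NULL; }
    for (size_t J = 0; J < B->nb; ++J)
        for (size_t I = 0; I < B->mb; ++I) {
            double *blk = malloc(b * b * sizeof *blk);
            if (blk == NULL) { blocked_free(B); return NULL; }
            B->blocks[I + J * B->mb] = blk;
            for (size_t j = 0; j < b; ++j)
                for (size_t i = 0; i < b; ++i)
                    blk[i + j * b] = flat[(I * b + i) + (J * b + j) * ld];
        }
    return B;
}

/* Read element (i, j) by locating its block, then its offset within it. */
double blocked_get(const blocked_matrix *B, size_t i, size_t j)
{
    assert(i < B->mb * B->b && j < B->nb * B->b);
    const double *blk = B->blocks[i / B->b + (j / B->b) * B->mb];
    return blk[i % B->b + (j % B->b) * B->b];
}

/* Copy the blocked matrix back into flat column-major storage. */
void blocked_to_flat(const blocked_matrix *B, double *flat, size_t ld)
{
    size_t m = B->mb * B->b, n = B->nb * B->b;
    assert(ld >= m);
    for (size_t j = 0; j < n; ++j)
        for (size_t i = 0; i < m; ++i)
            flat[i + j * ld] = blocked_get(B, i, j);
}
```

Because each block is contiguous, a kernel working on one block touches b*b adjacent doubles rather than b separate column segments spread ld elements apart, which is where the locality benefit comes from.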

Advanced build system. From its early revisions, libFLAME distributions have been bundled with a robust build system, featuring automatic makefile creation and a configuration script conforming to GNU standards (allowing the user to run the ./configure; make; make install sequence common to many open source software projects). Without any user input, the configure script searches for and chooses compilers based on a pre-defined preference order for each architecture. The user may request specific compilers via the configure interface, or enable other non-default features of libFLAME such as custom memory alignment, multithreading (via POSIX threads or OpenMP), compiler options (debugging symbols, warnings, optimizations), and memory leak detection. The reference BLAS and LAPACK libraries provide no configuration support and require the user to manually modify a makefile with appropriate references to compilers and compiler options depending on the host architecture.

Backwards compatibility with LAPACK. We understand that you may have already invested a lot of time in your current dense linear algebra application. That's why we provide a set of compatibility routines that map conventional LAPACK invocations to their corresponding implementations within libFLAME. By simply linking to libFLAME as explained below, you can take advantage of the performance benefits offered by FLAME with virtually no changes to your application. (Note: currently, any operation called through the liblapack2flame compatibility layer will execute sequentially. In order to invoke our parallelized implementations, you must use native FLAME interfaces.)

Figure 1: Blocked Cholesky Factorization (variant 2) expressed as a FLAME algorithm.

      SUBROUTINE DPOTRF( UPLO, N, A, LDA, INFO )

      CHARACTER          UPLO
      INTEGER            INFO, LDA, N
      DOUBLE PRECISION   A( LDA, * )

      DOUBLE PRECISION   ONE
      PARAMETER          ( ONE = 1.0D+0 )
      LOGICAL            UPPER
      INTEGER            J, JB, NB
      LOGICAL            LSAME
      INTEGER            ILAENV
      EXTERNAL           LSAME, ILAENV
      EXTERNAL           DGEMM, DPOTF2, DSYRK, DTRSM, XERBLA
      INTRINSIC          MAX, MIN

      INFO = 0
      UPPER = LSAME( UPLO, 'U' )
      IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN
         INFO = -1
      ELSE IF( N.LT.0 ) THEN
         INFO = -2
      ELSE IF( LDA.LT.MAX( 1, N ) ) THEN
         INFO = -4
      END IF
      IF( INFO.NE.0 ) THEN
         CALL XERBLA( 'DPOTRF', -INFO )
         RETURN
      END IF

      INFO = 0
      UPPER = LSAME( UPLO, 'U' )

      IF( N.EQ.0 )
     $   RETURN

      NB = ILAENV( 1, 'DPOTRF', UPLO, N, -1, -1, -1 )
      IF( NB.LE.1 .OR. NB.GE.N ) THEN
         CALL DPOTF2( UPLO, N, A, LDA, INFO )
      ELSE
         IF( UPPER ) THEN
*********** Upper triangular case omitted for purposes of fair comparison.
         ELSE
            DO 20 J = 1, N, NB
               JB = MIN( NB, N-J+1 )
               CALL DSYRK( 'Lower', 'No transpose', JB, J-1, -ONE,
     $                     A( J, 1 ), LDA, ONE, A( J, J ), LDA )
               CALL DPOTF2( 'Lower', JB, A( J, J ), LDA, INFO )
               IF( INFO.NE.0 )
     $            GO TO 30
               IF( J+JB.LE.N ) THEN
                  CALL DGEMM( 'No transpose', 'Transpose', N-J-JB+1, JB,
     $                        J-1, -ONE, A( J+JB, 1 ), LDA, A( J, 1 ),
     $                        LDA, ONE, A( J+JB, J ), LDA )
                  CALL DTRSM( 'Right', 'Lower', 'Transpose', 'Non-unit',
     $                        N-J-JB+1, JB, ONE, A( J, J ), LDA,
     $                        A( J+JB, J ), LDA )
               END IF
   20       CONTINUE
         END IF
      END IF
      GO TO 40
   30 CONTINUE
      INFO = INFO + J - 1
   40 CONTINUE
      RETURN
      END

Figure 2: FLAME/C code (left) for the algorithm shown in Figure 1, representing the style of coding found in libFLAME, and Fortran-77 LAPACK code (right) implementing the same algorithm.

Figure 3: Cholesky Factorization implementations compared on an 8-core Opteron system. Notes: For FLAME experiments, LAPACK was used only for the small unblocked Cholesky subproblem. GotoBLAS was configured to provide multithreaded parallelism for level-3 BLAS operations. Peak system performance is 38.4 GFLOPS.

Figure 4: Cholesky Factorization implementations compared on a 16-core Itanium2 system. Notes: libFLAME uses variant 3 while LAPACK uses variant 2. For non-SuperMatrix experiments, GotoBLAS was configured to provide multithreaded parallelism for level-3 BLAS operations. For SuperMatrix experiments, GotoBLAS parallelism was disabled. Theoretical peak system performance is 96 GFLOPS.

What's new in libFLAME 2.0?

We've added lots of functionality since libFLAME 1.0 was released on April 1, 2007. Here is a basic summary:

Library API and implementations

Integrated FLASH and SuperMatrix with FLAME/C algorithmic variant implementations via internal control trees.
Improved SuperMatrix abstractions and performance.
Added POSIX threads support in SuperMatrix.
New and expanded interfaces for FLASH along with improved, easier-to-read implementations.
Added new and/or extended support for the following operations:
- lu_nopiv: LU factorization without pivoting
- chol: Cholesky factorization
- ttmm: triangular transpose matrix multiply
- trinv: triangular matrix inversion
- spdinv: symmetric positive definite matrix inversion
- sylv: triangular Sylvester equation solver
- transpose: blocked in-place matrix transposition
Implemented remaining unblocked FLAME/C variants for level-3 BLAS operations.
Fixed a prominent 32-bit integer bug, allowing the code to malloc() regions of memory greater than 2GB. This allows the user to create, for example, double-precision matrices larger than 16384-by-16384.
Many other bugfixes and cleanups.
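The arithmetic behind the 2GB fix mentioned in the list above is easy to check: a 16384-by-16384 matrix of 8-byte doubles needs 2^14 * 2^14 * 2^3 = 2^31 bytes, which is one more than the largest value a signed 32-bit integer can hold. The sketch below is not taken from libFLAME; matrix_bytes and fits_in_int32 are hypothetical helpers, and it assumes the old limit came from holding the byte count in a signed 32-bit integer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for an m x n matrix, computed in size_t.
   Returns 0 if any dimension is 0 or the product would overflow size_t. */
size_t matrix_bytes(size_t m, size_t n, size_t elem_size)
{
    if (m == 0 || n == 0 || elem_size == 0) return 0;
    if (m > SIZE_MAX / n) return 0;              /* m * n would overflow */
    if (m * n > SIZE_MAX / elem_size) return 0;  /* total would overflow */
    return m * n * elem_size;
}

/* Would this byte count still fit in a signed 32-bit integer? */
int fits_in_int32(size_t bytes)
{
    return bytes <= (size_t)INT32_MAX;
}
```

Under that assumption, matrix_bytes(16384, 16384, sizeof(double)) yields exactly 2^31, which fits_in_int32() rejects, while a 16383-by-16383 matrix of doubles still fits.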

Build system

Added the option of disabling non-critical FLAME code.
Added the option of compiling and including into libFLAME various netlib implementations of the files that are needed for external wrappers to LAPACK-level operations. This is mostly useful because many FLAME implementations invoke wrappers to unblocked codes for the smaller subproblems. This applies to most LAPACK-level operations, such as Cholesky, LU, LQ, QR, triangular inversion, and the triangular Sylvester equation solver.
Added the option of disabling control trees in level-3 BLAS front-end.
Added the option of aligning memory to arbitrary base-2 boundaries via posix_memalign().
Added the option of aligning each column of a matrix to base-2 boundaries.
Added the option of interfacing to CBLAS in the external wrappers.
Renamed "parameter checking" option to "internal error checking".
Updated configure to use gfortran over g77, if present.
Combined libflame-base.a, libflame-blas.a, and libflame-lapack.a into a single library archive, libflame.a. (liblapack2flame.a is still a separate library due to the potential for linker symbol conflicts.)
Produce build products in completely separate directories, allowing builds for multiple architectures to be maintained simultaneously with the same source tree.
Updated config.guess and config.sub scripts (previously circa 2004).
Modified and improved various utility scripts.
Removed lots of outdated and unused build system cruft.
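The two alignment options in the list above can be pictured with a short sketch. This is not libFLAME's allocator: the function names are hypothetical, and padding the column stride is only one plausible way to keep every column on a boundary. The one real interface used is posix_memalign(), which requires the alignment to be a power of two and a multiple of sizeof(void*).

```c
#define _POSIX_C_SOURCE 200112L  /* expose posix_memalign() */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* posix_memalign() accepts only powers of two that are multiples of
   sizeof(void*). */
int is_valid_alignment(size_t align)
{
    return align != 0 &&
           (align & (align - 1)) == 0 &&
           align % sizeof(void *) == 0;
}

/* Smallest column stride (in elements) that is >= m and whose size in
   bytes is a multiple of align. If the first column is aligned, every
   later column is too. Assumes align is a multiple of elem_size. */
size_t aligned_column_stride(size_t m, size_t elem_size, size_t align)
{
    assert(is_valid_alignment(align));
    assert(elem_size > 0 && align % elem_size == 0);
    size_t per_unit = align / elem_size;  /* elements per alignment unit */
    return (m + per_unit - 1) / per_unit * per_unit;
}

/* Allocate an m x n column-major matrix of doubles whose base address and
   column starts all fall on align-byte boundaries. The padded leading
   dimension is returned through ld. Release the buffer with free(). */
double *alloc_aligned_matrix(size_t m, size_t n, size_t align, size_t *ld)
{
    void *p = NULL;
    if (!is_valid_alignment(align) || align % sizeof(double) != 0)
        return NULL;
    *ld = aligned_column_stride(m, sizeof(double), align);
    if (posix_memalign(&p, align, *ld * n * sizeof(double)) != 0)
        return NULL;
    return p;
}
```

For example, a 5-by-3 matrix of doubles aligned to 64 bytes gets a leading dimension of 8, so each column occupies exactly one 64-byte unit.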

Status of operation support

libFLAME contains implementations of many operations that are provided by the BLAS and LAPACK libraries. However, not all FLAME implementations support every datatype. Also, in many cases, we use a different naming convention for our routine names. The following table summarizes which routines are supported within libFLAME and also provides their corresponding netlib name for reference.

Notes:

y These routines are provided by libFLAME.
? Expands to one of {sdcz}.
~ These routines are not provided by LAPACK.
+ The LAPACK routine ?potri() differs from FLA_SPDinv() and FLASH_SPDinv() in that ?potri() requires the user to invoke the Cholesky factorization manually and then pass in the result as input, whereas the FLAME implementations perform the Cholesky factorization internally and automatically.
^ LAPACK provides only an unblocked implementation of the triangular Sylvester equation solver. The lapack2flame compatibility interface maps invocations of ?trsyl() to the blocked implementation in libFLAME.
* Invocations of routines with the FLASH_ prefix call SuperMatrix by default. If SuperMatrix was not enabled at configure-time, or it was disabled at runtime with FLASH_Queue_disable(), then FLASH_ routines execute sequentially, though they will still use hierarchical storage.

operation name	netlib routine name	libFLAME routine name	FLAME/C	FLASH	SuperMatrix	type support	l2f support
libFLAME routine prefix			FLA_	FLASH_*	FLASH_
Level-3 BLAS
general matrix-matrix multiply	?gemm	Gemm	y	y	y	sdcz	N/A
hermitian matrix-matrix multiply	?hemm	Hemm	y	y	y	sdcz	N/A
hermitian rank-k update	?herk	Herk	y	y	y	sdcz	N/A
hermitian rank-2k update	?her2k	Her2k	y	y	y	sdcz	N/A
symmetric matrix-matrix multiply	?symm	Symm	y	y	y	sdcz	N/A
symmetric rank-k update	?syrk	Syrk	y	y	y	sdcz	N/A
symmetric rank-2k update	?syr2k	Syr2k	y	y	y	sdcz	N/A
triangular matrix-matrix multiply	?trmm	Trmm	y	y	y	sdcz	N/A
triangular solve with multiple right-hand sides	?trsm	Trsm	y	y	y	sdcz	N/A
LAPACK
triangular transpose matrix-matrix multiply	?lauum	Ttmm	y	y	y	sdcz	sdcz
Cholesky factorization	?potrf	Chol	y	y	y	sdcz	sdcz
LU factorization with no pivoting	~	LU_nopiv	y	y	y	sdcz	sdcz
LU factorization with partial pivoting	?getrf	LU_piv	y			sdcz	sdcz
QR factorization	?geqrf	QR	y			sd	d
QR factorization via the UT transform	~	QR_UT	y			sd	d
LQ factorization	?gelqf	LQ	y			sd	d
LQ factorization via the UT transform	~	LQ_UT	y			sd	d
Reduction to upper Hessenberg form	?gehrd	Hess	y			d	d
Triangular matrix inversion	?trtri	Trinv	y	y	y	sdcz	sdcz
SPD matrix inversion	?potri +	SPDinv	y	y	y	sdcz	sdcz
Triangular Sylvester equation solve	?trsyl ^	Sylv	y	y	y	sdcz	sdcz

LAPACK compatibility support in libFLAME

We provide an interface, liblapack2flame, which allows legacy codes that link to LAPACK to utilize libFLAME without any code changes. However, liblapack2flame does not provide interfaces to all routines within LAPACK. The column labeled "l2f support" in the above table shows which datatypes are supported for each operation.

In addition, liblapack2flame provides interfaces to some routines that depend on the above operations. A partial list of these routines is:

dgees, dgeesx, dgeev, dgeevx, dggev, dggevx, dgelq2, dgeqp3, dgeqr2, dggqrf, dggrqf, dgesdd, dgesvd, dposv, dposvx, dsygvd, dsygv, dsygvx, dgegs, dgegv, dgges, dggesx, dggglm, dgglse, dgelsy, dgelsd, dgelss

System and software requirements

Before you attempt to build libFLAME, be sure you have the following software tools:

Linux/UNIX. At this time we only support Linux and Linux-like operating systems. (We have not been able to test our software on Windows under the cygwin environment.)

GNU tools. At this time we strongly recommend the availability of a GNU development environment. If a full GNU environment is not present, then at the very least we absolutely require that reasonably recent versions of GNU make and GNU bash (2.0 or later) are installed and specified in the user's PATH shell environment variable. (Note: On some non-Linux systems, such as AIX and Solaris, GNU make may be named gmake while the older UNIX/BSD implementation retains the name make.)

A working BLAS library. We strongly encourage the use of Kazushige Goto's GotoBLAS with libFLAME. GotoBLAS provides very good performance on a wide variety of mainstream architectures. However, other BLAS libraries such as ESSL (IBM), MKL (Intel), ACML (AMD), and netlib's BLAS should work just fine as well. Of course, performance will vary depending on which library is used.

Over time, libFLAME has been tested on a wide swath of modern architectures, including but not limited to x86 (Pentium/Athlon family), ia64 (Itanium family), x86_64 (Opteron/EM64T), and POWER4/5. Support for an architecture is primarily determined by the presence of an appropriate compiler. The configure script will attempt to find an appropriate compiler for a given architecture according to a predetermined search order for that architecture. For example, the first C compiler searched for on an Itanium2 system is Intel's icc. If icc is not found, then the search continues for GNU gcc. If gcc is not present, then the script checks for a generic compiler named cc. It is also possible for the user to specify the compiler explicitly at configure-time. Please see ./configure --help for further information on this and other related topics.

Building and Installing libFLAME

After downloading the software, you may proceed to build and install the libraries by performing the following steps. (Note here we assume you're building from a libflame 2.0 tarball.)

tar xzf libflame-2.0.tar.gz
cd libflame-2.0
Configure the library. Please run ./configure --help for the full range of configure options.
./configure --prefix=<install_prefix>

Alternatively, you may edit and run the configure wrapper in run-conf/run-configure.sh. Note that specifying the install prefix is optional. If it is omitted, the default is $HOME/flame (which we generally recommend).
Compile the source code.
make -j n

The -j option is optional. When building libFLAME on an SMP or multicore system, you may effectively parallelize the compilation process by specifying an argument n greater than 1. In this case, make spawns n processes, allowing it to compile up to n files simultaneously.
Install the library archive files to <install_prefix> ($HOME/flame by default).
make install

At this point, the libFLAME libraries have been installed into the lib subdirectory of <install_prefix>. We recommend symbolically linking the libraries to abbreviated names that do not contain the version. In addition, you might also omit the architecture from the symbolic link name if you will only be linking code for one architecture. This can be done manually, or with the help of some optional post-installation make targets. Execute

make install-symlinks

to create symbolic links that omit both version and architecture strings from the symbolic link name, or

make install-symlinks-with-arch

to create links that omit the version but contain an architecture string. This allows one to distinguish among libraries compiled for different architectures.

In your application's makefile, refer to the symbolic link. When it comes time to install an updated version of libFLAME, you need only update the symbolic links to the FLAME libraries (i.e., execute make install-symlinks) rather than the makefiles of the programs that reference them.

Configure options

If you are interested in configuring libFLAME with non-default options, please see the output of configure --help. We've summarized the most commonly used configure options here:

option	description	default
--enable-optimizations	Employ traditional compiler optimizations when compiling C and Fortran source code.	Enabled
--enable-warnings	Use the appropriate flag(s) to request warnings when compiling C and Fortran source code.	Enabled
--enable-debug	Use the appropriate debug flag (usually -g) when compiling C and Fortran source code.	Disabled
--enable-builtin-lapack-routines	Build and include into libFLAME blocked and unblocked LAPACK routines for all operations supported within libFLAME. When this option is disabled, LAPACK is required at link-time. Note that FLAME implementations of LAPACK operations (such as Cholesky, LU, and QR Factorizations) only use LAPACK code for their unblocked subproblems, though libFLAME also includes wrappers to external blocked implementations for reference testing. Enabling this option is useful when a user is setting up libFLAME for the first time and does not want to build LAPACK from source and has no intention of using a third-party library, such as MKL, to provide basic LAPACK functionality.	Disabled
--enable-goto-interfaces	Enable code that interfaces with internal/low-level libgoto functionality, such as those symbols that may be queried for architecture-dependent blocksize values.	Enabled
--enable-supermatrix	Enable Ernie Chan's dependency-aware task scheduling and parallel execution system.	Disabled
--enable-multithreading=model	Enable multithreading support. Valid values for model are pthreads and openmp. Threading must be enabled to access SMP/multicore parallelized implementations.	Disabled
--enable-memory-alignment=N	Enable code that aligns dynamically allocated memory regions at N-byte boundaries. Note: N must be a power of two and multiple of sizeof(void*), which is usually 4 on 32-bit architectures and 8 on 64-bit architectures.	Disabled
--enable-internal-error-checking	Enable internal runtime consistency checks of function parameters and return values.	Enabled
--enable-memory-counter	Enable code that keeps track of the balance between calls to FLA_malloc() and FLA_free(). Upon calling FLA_Finalize(), the counter value is output to standard error.	Disabled
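The balance kept by the memory counter option in the table above can be pictured with a small sketch. This is not libFLAME's implementation: counted_malloc(), counted_free(), and counted_finalize() are hypothetical stand-ins for the roles the table assigns to FLA_malloc(), FLA_free(), and FLA_Finalize(), and the sketch is not thread-safe.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Number of successful allocations that have not yet been freed. */
static long live_allocations = 0;

/* Allocate memory and record the allocation if it succeeded. */
void *counted_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        ++live_allocations;
    return p;
}

/* Release memory obtained from counted_malloc(); NULL is ignored. */
void counted_free(void *p)
{
    if (p == NULL)
        return;
    assert(live_allocations > 0);
    --live_allocations;
    free(p);
}

/* Report the balance on standard error and return it. A nonzero value
   at shutdown suggests a leak. */
long counted_finalize(void)
{
    fprintf(stderr, "allocation balance: %ld\n", live_allocations);
    return live_allocations;
}
```

A program that allocates two buffers and frees only one would see a balance of 1 reported at shutdown, pointing to the leaked buffer.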

Building and installing GotoBLAS

The developers of libFLAME enthusiastically encourage users to use the GotoBLAS implementation of the Basic Linear Algebra Subprograms (BLAS). To obtain the source code for GotoBLAS, please visit the Texas Advanced Computing Center software site. After downloading, perform the following steps:

tar xzf GotoBLAS-1.22.tar.gz
cd GotoBLAS
Please read the documentation that accompanies the GotoBLAS source.
Most users may build the GotoBLAS library by running quickbuild.32bit or quickbuild.64bit. Alternatively, advanced users may instead view and edit Makefile.rule and then execute:

make lib
Copy the library archive to a more permanent directory. You should also symbolically link the libgoto library to an abbreviated name:
ln -s libgoto_ITANIUM2-r1.10.a libgoto.a
If multiple architecture builds of libgoto share the same directory, then you should include an architecture substring in the symbolic link name to differentiate the builds:
ln -s libgoto_ITANIUM2-r1.10.a libgoto_ia64.a

We highly recommend using libFLAME with GotoBLAS! However, libFLAME will work with any BLAS library. If you want to use libFLAME with a different BLAS, use the configure-time option --disable-goto-interfaces before building libFLAME. If you have further questions about interfacing libFLAME with your preferred BLAS library, contact flame@cs.utexas.edu.

Linking your LAPACK-dependent application to libFLAME

Develop your algorithm with your favorite implementation of LAPACK. Let's assume that you compile and link your code via
gfortran ... -L<lapack_path> -L<blas_path> -llapack -lblas

where -llapack links the standard LAPACK library and -lblas links your favorite BLAS library, located in <lapack_path> and <blas_path>, respectively.
Once your code works correctly, link to the GotoBLAS library instead. (GotoBLAS often provides the fastest level-3 BLAS routines available.)
gfortran ... -L<lapack_path> -L<goto_path> -llapack -lgoto

where <goto_path> is the directory in which you keep the GotoBLAS library archive and its symbolic link.
Now it is time to experiment with linking to the libFLAME libraries:
gfortran ... -L<lapack_path> -L<goto_path> -L<flame_path> -llapack2flame -lflame -llapack -lgoto

where <flame_path> is the directory <install_prefix>/lib. Recall that <install_prefix> was determined when you configured FLAME for compiling. If you did not specify <install_prefix>, then the default value is used ($HOME/flame/lib).
- The order in which the libraries are listed is important!
- Including -llapack in the link command ensures that any LAPACK functionality not supported by libFLAME is simply linked from your favorite LAPACK library.
Run your code with the new FLAME implementations of the supported LAPACK routines listed above.

Running an Example

We offer a step-by-step walkthrough for running two example programs included in the libflame source distribution: the first executes a sequential Cholesky factorization with conventional ("flat") matrix storage; the second executes a multithreaded Cholesky factorization using SuperMatrix and hierarchical storage.

We also encourage potential users to browse the code examples provided at our linear algebra wiki.

Beyond LAPACK

libFLAME offers functionality beyond LAPACK. For example, we have routines for updating an LU factorization with pivoting. Adding more operations is not our top priority at the moment. However, if you have an operation that you would like to see supported, it doesn't hurt to contact us with your request!

Thank us!

We are very insecure people. So, if you like the libraries and find them useful, send us a message! We even make it easy. In the top-level directory of the libFLAME distribution, execute:

make send-thanks

This will automatically e-mail us a message!

Questions?

Contact flame@cs.utexas.edu.

Last Updated on 6 August 2008 by Field G. Van Zee.