Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding a recursive xLARFT #1080

Merged
merged 21 commits into from
Dec 3, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
21 commits
Select commit Hold shift + click to select a range
8a338cf
a
jprhyne May 29, 2024
7fdd346
Merge branch 'master' of github.com:jprhyne/lapack
jprhyne Jun 12, 2024
d1f787c
Merge branch 'master' of github.com:jprhyne/lapack
jprhyne Oct 14, 2024
4490848
Merge branch 'master' of github.com:jprhyne/lapack
jprhyne Oct 14, 2024
2122708
DO NOT MERGE: demonstrating changes work
jprhyne Oct 14, 2024
5495628
CAN MERGE: Implemented my version of xlarft with comments added, and …
jprhyne Oct 16, 2024
13aab4a
Merge branch 'Reference-LAPACK:master' into master
jprhyne Oct 16, 2024
298804e
Merge branch 'Reference-LAPACK:master' into larft
jprhyne Oct 16, 2024
b966220
Merge pull request #1 from jprhyne/larft
jprhyne Oct 16, 2024
1ba075c
updating parameter definition in the single complex version
jprhyne Oct 16, 2024
46e8388
Merge branch 'Reference-LAPACK:master' into master
jprhyne Nov 4, 2024
828db43
Merge branch 'Reference-LAPACK:master' into master
jprhyne Nov 16, 2024
3065ee8
Merge branch 'Reference-LAPACK:master' into larft
jprhyne Nov 22, 2024
60c66af
Merge pull request #2 from jprhyne/larft
jprhyne Nov 22, 2024
2534b59
updating documentation to be more descriptive
jprhyne Nov 22, 2024
dadd80e
Merge branch 'master' of github.com:jprhyne/lapack
jprhyne Nov 22, 2024
354a16f
Removed mod files and extranous file changes (hopefully)
jprhyne Nov 30, 2024
273ab49
removed extranous changes (hopefully x2)
jprhyne Nov 30, 2024
d4741c8
removed all extranous changes
jprhyne Nov 30, 2024
db48820
lowered line length to hopefully fix build failures in the CI
jprhyne Nov 30, 2024
e9b05ef
Updated variants information as well as fixed trailing line in zlarft
jprhyne Nov 30, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 8 additions & 2 deletions SRC/VARIANTS/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -30,9 +30,11 @@ LUREC = lu/REC/cgetrf.o lu/REC/dgetrf.o lu/REC/sgetrf.o lu/REC/zgetrf.o

QRLL = qr/LL/cgeqrf.o qr/LL/dgeqrf.o qr/LL/sgeqrf.o qr/LL/zgeqrf.o

LARFTL2 = larft/LL-LVL2/clarft.o larft/LL-LVL2/dlarft.o larft/LL-LVL2/slarft.o larft/LL-LVL2/zlarft.o


.PHONY: all
all: cholrl.a choltop.a lucr.a lull.a lurec.a qrll.a
all: cholrl.a choltop.a lucr.a lull.a lurec.a qrll.a larftl2.a

cholrl.a: $(CHOLRL)
$(AR) $(ARFLAGS) $@ $^
Expand All @@ -58,9 +60,13 @@ qrll.a: $(QRLL)
$(AR) $(ARFLAGS) $@ $^
$(RANLIB) $@

larftl2.a: $(LARFTL2)
$(AR) $(ARFLAGS) $@ $^
$(RANLIB) $@

.PHONY: clean cleanobj cleanlib
clean: cleanobj cleanlib
cleanobj:
rm -f $(CHOLRL) $(CHOLTOP) $(LUCR) $(LULL) $(LUREC) $(QRLL)
rm -f $(CHOLRL) $(CHOLTOP) $(LUCR) $(LULL) $(LUREC) $(QRLL) $(LARFTL2)
cleanlib:
rm -f *.a
2 changes: 2 additions & 0 deletions SRC/VARIANTS/README
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ This directory contains several variants of LAPACK routines in single/double/com
- [sdcz]geqrf with QR Left Looking Level 3 BLAS version algorithm [2]- Directory: SRC/VARIANTS/qr/LL
- [sdcz]potrf with Cholesky Right Looking Level 3 BLAS version algorithm [2]- Directory: SRC/VARIANTS/cholesky/RL
- [sdcz]potrf with Cholesky Top Level 3 BLAS version algorithm [2]- Directory: SRC/VARIANTS/cholesky/TOP
- [sdcz]larft using a Left Looking Level 2 BLAS version algorithm - Directory: SRC/VARIANTS/larft/LL-LVL2

References:For a more detailed description please refer to
- [1] Toledo, S. 1997. Locality of Reference in LU Decomposition with Partial Pivoting. SIAM J. Matrix Anal. Appl. 18, 4 (Oct. 1997),
Expand All @@ -44,6 +45,7 @@ Corresponding libraries created in SRC/VARIANTS:
- QR Left Looking : qrll.a
- Cholesky Right Looking : cholrl.a
- Cholesky Top : choltop.a
- LARFT Level 2: larftl2.a


===========
Expand Down
328 changes: 328 additions & 0 deletions SRC/VARIANTS/larft/LL-LVL2/clarft.f
Original file line number Diff line number Diff line change
@@ -0,0 +1,328 @@
*> \brief \b CLARFT VARIANT: left-looking Level 2 BLAS version of the algorithm
*
* =========== DOCUMENTATION ===========
*
* Online html documentation available at
* http://www.netlib.org/lapack/explore-html/
*
*> \htmlonly
*> Download CLARFT + dependencies
*> <a href="http://www.netlib.org/cgi-bin/netlibfiles.tgz?format=tgz&filename=/lapack/lapack_routine/clarft.f">
*> [TGZ]</a>
*> <a href="http://www.netlib.org/cgi-bin/netlibfiles.zip?format=zip&filename=/lapack/lapack_routine/clarft.f">
*> [ZIP]</a>
*> <a href="http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=/lapack/lapack_routine/clarft.f">
*> [TXT]</a>
*> \endhtmlonly
*
* Definition:
* ===========
*
* SUBROUTINE CLARFT( DIRECT, STOREV, N, K, V, LDV, TAU, T, LDT )
*
* .. Scalar Arguments ..
* CHARACTER DIRECT, STOREV
* INTEGER K, LDT, LDV, N
* ..
* .. Array Arguments ..
* COMPLEX T( LDT, * ), TAU( * ), V( LDV, * )
* ..
*
*
*> \par Purpose:
* =============
*>
*> \verbatim
*>
*> CLARFT forms the triangular factor T of a complex block reflector H
*> of order n, which is defined as a product of k elementary reflectors.
*>
*> If DIRECT = 'F', H = H(1) H(2) . . . H(k) and T is upper triangular;
*>
*> If DIRECT = 'B', H = H(k) . . . H(2) H(1) and T is lower triangular.
*>
*> If STOREV = 'C', the vector which defines the elementary reflector
*> H(i) is stored in the i-th column of the array V, and
*>
*> H = I - V * T * V**H
*>
*> If STOREV = 'R', the vector which defines the elementary reflector
*> H(i) is stored in the i-th row of the array V, and
*>
*> H = I - V**H * T * V
*> \endverbatim
*
* Arguments:
* ==========
*
*> \param[in] DIRECT
*> \verbatim
*> DIRECT is CHARACTER*1
*> Specifies the order in which the elementary reflectors are
*> multiplied to form the block reflector:
*> = 'F': H = H(1) H(2) . . . H(k) (Forward)
*> = 'B': H = H(k) . . . H(2) H(1) (Backward)
*> \endverbatim
*>
*> \param[in] STOREV
*> \verbatim
*> STOREV is CHARACTER*1
*> Specifies how the vectors which define the elementary
*> reflectors are stored (see also Further Details):
*> = 'C': columnwise
*> = 'R': rowwise
*> \endverbatim
*>
*> \param[in] N
*> \verbatim
*> N is INTEGER
*> The order of the block reflector H. N >= 0.
*> \endverbatim
*>
*> \param[in] K
*> \verbatim
*> K is INTEGER
*> The order of the triangular factor T (= the number of
*> elementary reflectors). K >= 1.
*> \endverbatim
*>
*> \param[in] V
*> \verbatim
*> V is COMPLEX array, dimension
*> (LDV,K) if STOREV = 'C'
*> (LDV,N) if STOREV = 'R'
*> The matrix V. See further details.
*> \endverbatim
*>
*> \param[in] LDV
*> \verbatim
*> LDV is INTEGER
*> The leading dimension of the array V.
*> If STOREV = 'C', LDV >= max(1,N); if STOREV = 'R', LDV >= K.
*> \endverbatim
*>
*> \param[in] TAU
*> \verbatim
*> TAU is COMPLEX array, dimension (K)
*> TAU(i) must contain the scalar factor of the elementary
*> reflector H(i).
*> \endverbatim
*>
*> \param[out] T
*> \verbatim
*> T is COMPLEX array, dimension (LDT,K)
*> The k by k triangular factor T of the block reflector.
*> If DIRECT = 'F', T is upper triangular; if DIRECT = 'B', T is
*> lower triangular. The rest of the array is not used.
*> \endverbatim
*>
*> \param[in] LDT
*> \verbatim
*> LDT is INTEGER
*> The leading dimension of the array T. LDT >= K.
*> \endverbatim
*
* Authors:
* ========
*
*> \author Univ. of Tennessee
*> \author Univ. of California Berkeley
*> \author Univ. of Colorado Denver
*> \author NAG Ltd.
*
*> \ingroup larft
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the old version shall live in VARIANTS, should the group be updated for the documentation? The other files in this folder carry "variant" in the name. Also, could you please add the variants to the build system? SRC/VARIANTS/Makefile and SRC/VARIANTS/README.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the build system catch! It should be in there for the most recent commit I've made. I'm also hoping that I ironed out all the line length related issues (mentioned above)

*
*> \par Further Details:
* =====================
*>
*> \verbatim
*>
*> The shape of the matrix V and the storage of the vectors which define
*> the H(i) is best illustrated by the following example with n = 5 and
*> k = 3. The elements equal to 1 are not stored.
*>
*> DIRECT = 'F' and STOREV = 'C': DIRECT = 'F' and STOREV = 'R':
*>
*> V = ( 1 ) V = ( 1 v1 v1 v1 v1 )
*> ( v1 1 ) ( 1 v2 v2 v2 )
*> ( v1 v2 1 ) ( 1 v3 v3 )
*> ( v1 v2 v3 )
*> ( v1 v2 v3 )
*>
*> DIRECT = 'B' and STOREV = 'C': DIRECT = 'B' and STOREV = 'R':
*>
*> V = ( v1 v2 v3 ) V = ( v1 v1 1 )
*> ( v1 v2 v3 ) ( v2 v2 v2 1 )
*> ( 1 v2 v3 ) ( v3 v3 v3 v3 1 )
*> ( 1 v3 )
*> ( 1 )
*> \endverbatim
*>
* =====================================================================
SUBROUTINE CLARFT( DIRECT, STOREV, N, K, V, LDV, TAU, T, LDT )
*
* -- LAPACK auxiliary routine --
* -- LAPACK is a software package provided by Univ. of Tennessee, --
* -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
*
* .. Scalar Arguments ..
CHARACTER DIRECT, STOREV
INTEGER K, LDT, LDV, N
* ..
* .. Array Arguments ..
COMPLEX T( LDT, * ), TAU( * ), V( LDV, * )
* ..
*
* =====================================================================
*
* .. Parameters ..
COMPLEX ONE, ZERO
PARAMETER ( ONE = ( 1.0E+0, 0.0E+0 ),
$ ZERO = ( 0.0E+0, 0.0E+0 ) )
* ..
* .. Local Scalars ..
INTEGER I, J, PREVLASTV, LASTV
* ..
* .. External Subroutines ..
EXTERNAL CGEMM, CGEMV, CTRMV
* ..
* .. External Functions ..
LOGICAL LSAME
EXTERNAL LSAME
* ..
* .. Executable Statements ..
*
* Quick return if possible
*
IF( N.EQ.0 )
$ RETURN
*
IF( LSAME( DIRECT, 'F' ) ) THEN
PREVLASTV = N
DO I = 1, K
PREVLASTV = MAX( PREVLASTV, I )
IF( TAU( I ).EQ.ZERO ) THEN
*
* H(i) = I
*
DO J = 1, I
T( J, I ) = ZERO
END DO
ELSE
*
* general case
*
IF( LSAME( STOREV, 'C' ) ) THEN
* Skip any trailing zeros.
DO LASTV = N, I+1, -1
IF( V( LASTV, I ).NE.ZERO ) EXIT
END DO
DO J = 1, I-1
T( J, I ) = -TAU( I ) * CONJG( V( I , J ) )
END DO
J = MIN( LASTV, PREVLASTV )
*
* T(1:i-1,i) := - tau(i) * V(i:j,1:i-1)**H * V(i:j,i)
*
CALL CGEMV( 'Conjugate transpose', J-I, I-1,
$ -TAU( I ), V( I+1, 1 ), LDV,
$ V( I+1, I ), 1,
$ ONE, T( 1, I ), 1 )
ELSE
* Skip any trailing zeros.
DO LASTV = N, I+1, -1
IF( V( I, LASTV ).NE.ZERO ) EXIT
END DO
DO J = 1, I-1
T( J, I ) = -TAU( I ) * V( J , I )
END DO
J = MIN( LASTV, PREVLASTV )
*
* T(1:i-1,i) := - tau(i) * V(1:i-1,i:j) * V(i,i:j)**H
*
CALL CGEMM( 'N', 'C', I-1, 1, J-I, -TAU( I ),
$ V( 1, I+1 ), LDV, V( I, I+1 ), LDV,
$ ONE, T( 1, I ), LDT )
END IF
*
* T(1:i-1,i) := T(1:i-1,1:i-1) * T(1:i-1,i)
*
CALL CTRMV( 'Upper', 'No transpose', 'Non-unit', I-1,
$ T,
$ LDT, T( 1, I ), 1 )
T( I, I ) = TAU( I )
IF( I.GT.1 ) THEN
PREVLASTV = MAX( PREVLASTV, LASTV )
ELSE
PREVLASTV = LASTV
END IF
END IF
END DO
ELSE
PREVLASTV = 1
DO I = K, 1, -1
IF( TAU( I ).EQ.ZERO ) THEN
*
* H(i) = I
*
DO J = I, K
T( J, I ) = ZERO
END DO
ELSE
*
* general case
*
IF( I.LT.K ) THEN
IF( LSAME( STOREV, 'C' ) ) THEN
* Skip any leading zeros.
DO LASTV = 1, I-1
IF( V( LASTV, I ).NE.ZERO ) EXIT
END DO
DO J = I+1, K
T( J, I ) = -TAU( I ) * CONJG( V( N-K+I , J ) )
END DO
J = MAX( LASTV, PREVLASTV )
*
* T(i+1:k,i) = -tau(i) * V(j:n-k+i,i+1:k)**H * V(j:n-k+i,i)
*
CALL CGEMV( 'Conjugate transpose', N-K+I-J, K-I,
$ -TAU( I ), V( J, I+1 ), LDV, V( J, I ),
$ 1, ONE, T( I+1, I ), 1 )
ELSE
* Skip any leading zeros.
DO LASTV = 1, I-1
IF( V( I, LASTV ).NE.ZERO ) EXIT
END DO
DO J = I+1, K
T( J, I ) = -TAU( I ) * V( J, N-K+I )
END DO
J = MAX( LASTV, PREVLASTV )
*
* T(i+1:k,i) = -tau(i) * V(i+1:k,j:n-k+i) * V(i,j:n-k+i)**H
*
CALL CGEMM( 'N', 'C', K-I, 1, N-K+I-J,
$ -TAU( I ),
$ V( I+1, J ), LDV, V( I, J ), LDV,
$ ONE, T( I+1, I ), LDT )
END IF
*
* T(i+1:k,i) := T(i+1:k,i+1:k) * T(i+1:k,i)
*
CALL CTRMV( 'Lower', 'No transpose', 'Non-unit',
$ K-I,
$ T( I+1, I+1 ), LDT, T( I+1, I ), 1 )
IF( I.GT.1 ) THEN
PREVLASTV = MIN( PREVLASTV, LASTV )
ELSE
PREVLASTV = LASTV
END IF
END IF
T( I, I ) = TAU( I )
END IF
END DO
END IF
RETURN
*
* End of CLARFT
*
END
Loading