大型稀疏矩陣的LU分解及特征值求解Grusoftcom課件

上傳人:陳** 文檔編號:248976459 上傳時(shí)間:2024-10-26 格式:PPT 頁數(shù):39 大?。?.95MB
收藏 版權(quán)申訴 舉報(bào) 下載
大型稀疏矩陣的LU分解及特征值求解Grusoftcom課件_第1頁
第1頁 / 共39頁
大型稀疏矩陣的LU分解及特征值求解Grusoftcom課件_第2頁
第2頁 / 共39頁
大型稀疏矩陣的LU分解及特征值求解Grusoftcom課件_第3頁
第3頁 / 共39頁

下載文檔到電腦,查找使用更方便

15 積分

下載資源

還剩頁未讀,繼續(xù)閱讀

資源描述:

《大型稀疏矩陣的LU分解及特征值求解Grusoftcom課件》由會(huì)員分享,可在線閱讀,更多相關(guān)《大型稀疏矩陣的LU分解及特征值求解Grusoftcom課件(39頁珍藏版)》請?jiān)谘b配圖網(wǎng)上搜索。

1、單擊此處編輯母版標(biāo)題樣式,,單擊此處編輯母版文本樣式,,第二級,,第三級,,第四級,,第五級,,單擊此處編輯母版標(biāo)題樣式,,單擊此處編輯母版文本樣式,,第二級,,第三級,,第四級,,第五級,,,,*,單擊此處編輯母版標(biāo)題樣式,,單擊此處編輯母版文本樣式,,第二級,,第三級,,第四級,,第五級,,,,*,,,,本作品采用,知識共享署名,-,非商業(yè)性使用,2.5,中國大陸許可協(xié)議,進(jìn)行許可。,,專業(yè)交流,模板超市,設(shè)計(jì)服務(wù),NordriDesign,中國專業(yè),PowerPoint,媒體設(shè)計(jì)與開發(fā),本作品的提供是以適用知識共享組織的公共許可( 簡稱“,CCPL”,或 “許可”) 條款為前提的。本作品

2、受著作權(quán)法以及其他相關(guān)法律的保護(hù)。對本作品的使用不得超越本許可授權(quán)的范圍。,,如您行使本許可授予的使用本作品的權(quán)利,就表明您接受并同意遵守本許可的條款。在您接受這些條款和規(guī)定的前提下,許可人授予您本許可所包括的權(quán)利。,,,,查看全部,…,單擊此處編輯母版標(biāo)題樣式,,單擊此處編輯母版文本樣式,,第二級,,第三級,,第四級,,第五級,,單擊此處編輯母版標(biāo)題樣式,,單擊此處編輯母版文本樣式,,第二級,,第三級,,第四級,,第五級,,,,*,單擊此處編輯母版標(biāo)題樣式,,單擊此處編輯母版文本樣式,,第二級,,第三級,,第四級,,第五級,,,,*,單擊此處編輯母版標(biāo)題樣式,,單擊此處編輯母版文本樣式,,第

3、二級,,第三級,,第四級,,第五級,,單擊此處編輯母版標(biāo)題樣式,,單擊此處編輯母版文本樣式,,第二級,,第三級,,第四級,,第五級,,2013. 7. 20,大型稀疏矩陣,,的,LU,分解及特征值求解,陳英時(shí),,2016 . 1. 9,稀疏矩陣求解的廣泛應(yīng)用,矩陣求解是數(shù)值計(jì)算的核心,[1],,稀疏矩陣求解是數(shù)值計(jì)算的關(guān)鍵之一,,偏微分方程,積分方程,特征值,優(yōu)化,…,,,萬階以上,dense matrix,不可行,,稀疏矩陣求解往往是資源瓶頸,,時(shí)間瓶頸,內(nèi)存,外存等瓶頸,,,直接法基于高斯消元法,即計(jì)算,A,的,LU,分解。,A,通常要經(jīng)過一系列置換排序,可歸并為左置換矩陣,P,,右置換

4、矩陣,Q,。基本步驟如下:,,1,)符號分析,:,,得到置換排序矩陣,P Q,,,2,)數(shù)值分解:,,,,3,)前代,回代:,I.S.Duff, A.M.Erisman, and J.K.Reid. Direct Methods for Sparse Matrices. London:Oxford Univ. Press,1986.,,J.J.Dongarra,I.S.Duff, ... Numerical Linear Algebra for High-Performance Computers.,,G.H.Golob, C.F.Van loan. Matrix Computations

5、. The Johns Hopkins University Press. 1996,稀疏矩陣復(fù)雜、多變,,基本參數(shù),,,對稱性,稀疏性,非零元分布,,敏感性,病態(tài)矩陣,,,條件數(shù),,格式多變,,,Harwell-Boeing Exchange Format,。。。,,測試集,,,Harwell-Boeing Sparse Matrix Collection,,,UF sparse matrix collection,,,,求解器的飛速發(fā)展,BBMAT,,,38744,階,分解后元素超過四千萬,.,,1988,巨型機(jī),cray-2,上,>1000,秒,,2003 4G umfpac

6、k4 32.6,秒,[4],,2006 3.0G GSS1.2 15,秒,,2012 3.0G 4,核,GSS 2.3 4,秒,,2014 i7 8,核,GSS 2.4 1,秒,,硬件的發(fā)展,CPU,,內(nèi)存等,,稀疏算法逐漸成熟,,,,稀疏排序,密集子陣,multifrontal ,supernodal…,,數(shù)學(xué)庫,,,BLAS,,,LAPACK,,稀疏,LU,分解算法的關(guān)鍵,根據(jù)符號分析,數(shù)值分解算法的不同,直接法有以下幾類,[15],:,,1,),general technique(,通用方法,),:,,主要采用消去樹等結(jié)構(gòu)進(jìn)行,LU,分解。適用于非常稀疏的非結(jié)構(gòu)化矩陣。,,

7、2,),frontal scheme(,波前法,),,LU,分解過程中,將連續(xù)多行合并為一個(gè)密集子塊,(,波前,),,對這個(gè)子塊采用,BLAS,等高效數(shù)學(xué)庫進(jìn)行分解。,,3,),multifrontal method(,多波前法,),,,將符號結(jié)構(gòu)相同的多行,(,列,),合并為多個(gè)密集子塊,以這些密集子塊為單位進(jìn)行,LU,分解。這些子塊的生成,消去,裝配,釋放等都需要特定的數(shù)據(jù)結(jié)構(gòu)及算法。,,1,零元是動(dòng)態(tài)的概念,需要,稀疏排序,,,減少注入元,(fill-in),2,密集子陣,稀疏,LU,分解 (不考慮,零元,的),LU,分解,稀疏排序,,排序算法的作用是減少矩陣,LU,分解過程中產(chǎn)生的注

8、入元,求解矩陣的最優(yōu)排序方法是個(gè),NP,完全問題(不能夠在合理的時(shí)間內(nèi)進(jìn)行求解),對具體矩陣而言,目前也沒有方法或指標(biāo)來判定哪種算法好。因此實(shí)測不同的算法,對比產(chǎn)生的注入元,是確定排序算法的可靠依據(jù)。,,,主要的排序算法有最小度排序(,MMD,,,minimum degree ordering algorithm,)和嵌套排序(,nested dissection,)兩種,。,矩陣排序方面相應(yīng)的成熟軟件庫有,AMD[12],、,COLAMD,、,METIS[13],等。,,Yannokakis M. Computing the minimum fill-in in NP-Complete. S

9、IAM J.Algebraic Discrete Methods, 1981, 2:77~79,近似最小度,排序算法,–,商圖,,,近似最小度排序(,AMD,,,Approximate Minimum Degree OrderingAlgorithm,)算法于,1996,年左右由,Patrick R. Amestoy,等學(xué)者提出,AMESTOY, P. R., DAVIS, ... 1996a. An approximate minimum degree ordering algorithm. SIAM J. Matrix Anal. Applic. 17, 4, 886,–,905.,為何需

10、要密集子塊(Dense Matrix),,,多波前法,(multifrontal),簡介,發(fā)展,,,Duff and Raid [2],,等分析,改進(jìn),[3],,開發(fā),UMFPACK [4],,基本算法,,,利用稀疏矩陣的特性,得到一系列密集子陣(波前)。將,LU,分解轉(zhuǎn)化為對這些波前的裝配,消去,更新等操作。,,多波前法的優(yōu)點(diǎn),,波前是,dense matrix ,,可直接調(diào)用高性能庫(,BLAS,等),,密集子陣可以節(jié)省下標(biāo)存儲(chǔ),,提高并行性,,目前主要的求解器,,,UMFPACK,,,,GSS,,MUMPS,PARDISO,,WSMP,,HSL MA41,等,,LU,分解形成,front

11、al,10,階矩陣。,,藍(lán)點(diǎn)代表非零元。紅點(diǎn)表示分解產(chǎn)生的注入元,(fill-in),,Frontal,劃分,{a}, {c}dvrzbdn {e} {f,g}{h,i,j},Frontal,的裝配,消去,更新過程,a,c h,,c · ·,,h · ·,,{c},{f,g},,{e},{h,i,j},{a},c,,g,h,,g · ·,,h · ·,b,e j,,e · ·,,j · ·,f,,g,h,,g,g,·,,h · ·,e,,i,j,,i · ·,,j · ·,h,,i, j,,i,i,·,,j ·,j,drpvftr,d,,i,j,,i · ·,,j · ·,消去

12、樹,消去樹,消去樹是符號分析的關(guān)鍵結(jié)構(gòu),其每個(gè)節(jié)點(diǎn)對應(yīng)于矩陣的一列(行),該節(jié)點(diǎn)只與其父節(jié)點(diǎn)相連,父節(jié)點(diǎn)定義如下:,J.W.H.Liu. The Role of Elimination Trees in Sparse Factorization. SIAM J.Matrix Anal.Applic.,11(1):134-172,1990.,,J.W.H.Liu. Elimination structures for unsymmetric sparse LU factors. SIAM J. Matrix Anal. Appl. 14, no. 2, 334--352, 1993.,,GSS,

13、簡介,標(biāo)準(zhǔn),C,開發(fā),適用于各種平臺(tái),,比,INTEL PARDISO,更快,更穩(wěn)定,,CPU/GPU,混合計(jì)算,,突破,32,位,Windows,內(nèi)存限制,,32,個(gè)用戶參數(shù),,,支持用戶定制模塊,高校,研究所,,中國電力科學(xué)研究院,,香港大學(xué) 南京大學(xué) 河海大學(xué),,中國石油大學(xué),,電子科技大學(xué),,三峽大學(xué) 山東大學(xué),,user,detail,Why they choose GSS,crosslight,Industry leader in TCAD simulation,Hybrid GPU/CPU version, more than 2 times faster than PARDIS

14、O, MUMPS and other sparse solvers.,soilvision,The most technically advanced suite of 1D/2D/3D geotechnical software,Much faster than their own sparse solver.,FEM consulting,The leading research teams in the area of the Finite Element Method since 1967,GSS is faster than PARDISO and provide many cust

15、om module.,GSCAD,Leading building software in China,GSS provide a user-specific module to deal with ill-conditioned matrix.,ICAROS,A global turnkey geospatial mapping service provider and state of the art photogrammetric technologies developer.,GSS is faster than PARDISO. Also provide some technical

16、 help.,EPRI,China Electric Power Research Institute,3-4 times faster than KLU,他們?yōu)槭裁催x擇,GSS?,GSS,,–,加權(quán),消去樹,工作量消去樹,,基于消去樹結(jié)構(gòu)來計(jì)算數(shù)值分解的工作量,將,LU,分解劃分為多個(gè)獨(dú)立的任務(wù),為高效并行計(jì)算奠定基礎(chǔ)。,GSS,,-,雙閾值列選主元算法,GSS,,- CPU/GPU,混合計(jì)算,1) After divides LU factorization into many parallel tasks, GSS will use adaptive strategy to run th

17、ese tasks in different hardware (CPU, GPU …). That is, if GPU have high computing power, then it will run more tasks automatically. If CPU is more powerful, then GSS will give it more tasks.,,,2) And furthermore, if CPU is free (have finished all tasks) and GPU still run a task, then GSS will divide

18、 this task to some small tasks then assign some child-task to CPU, then CPU do computing again. So get higher parallel efficiency.,,,3) GSS will also do some testing to get best parameters for different hardware.,GSS,,–,求解,頻域譜元方法生成,的,矩陣,矩陣較稠密,,約,40,萬階,,15,億個(gè)非零元,GSS,約需,15G,內(nèi)存,需要求解多個(gè)右端項(xiàng),,32,個(gè)右端項(xiàng) 需,80

19、,秒?,進(jìn)一步優(yōu)化,,CPU/GPU,混合計(jì)算 數(shù)值分解約,35,秒,,重復(fù)利用符號分析結(jié)果,,根據(jù)矩陣的特殊結(jié)構(gòu)來進(jìn)一步減少非零元,,估計(jì),80/4=20,秒,一次,LU,分解,,符號分解時(shí)間,50,秒,,數(shù)值分解時(shí)間,46,秒,,回代,2.5,秒,對比測試,The test matrices are all from the UF sparse matrix collection,,PARDISO is from Intel Composer XE 2013 SP1.,,GSS 2.4 use CPU-GPU hybrid computing.,,The testing CPU is I

20、NTEL Core i7-4770(3.4GHz) with 24G memory. The graphics card is ASUS GTX780 (with compute capability 3.5). NVIDIA CUDA Toolkit is 5.5. The operating system is Windows 7 64. Both solvers use default parameters.,,,For large matrices need long time computing, GSS 2.4 is Nearly 3 times faster than PARDI

21、SO. For matrices need short time computing, PARDISO is faster than GSS. One reason is that complex synchronization between CPU/GPU do need some extra time.,大型稀疏矩陣的特征值求解,重視,,A,為,全過程動(dòng)態(tài)仿真程序中大規(guī)模稀疏矩陣,為,特征值,,x,為對應(yīng)的特征向量,150 Years old and still alive: eigenproblems,,,1997 --- by Henk A. van der Vorst , Gene

22、 H. Golub,稀疏,LU,分解,-,理論上即高斯消元法,,稀疏特征值,–,趨近于純數(shù)學(xué),代數(shù)特征值問題,,,G.H.Golob, C.F.Van loan. Matrix Computations. The Johns Hopkins University Press. 1996,,Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide. Zhaojun Bai , …. 2000,冪迭代,-- Power iteration,冪迭代,--,一個(gè)形象的解釋,瑞利商迭代,-- Raylei

23、gh Quotient iteration,子空間迭代,,Krylov,子空間,Arnoldi,迭代,Arnoldi,迭代的基本算法,ARPACK,,,ON RESTARTING THE ARNOLDI METHOD FOR LARGENONSYMMETRIC EIGENVALUE PROBLEMS, MORGAN,,R.B. Lehoucq. Analysis and Implementation of an Implicitly Restarted Iteration. PhD thesis,Arnoldi,迭代求解特征值,Krylov,分解,,--,基于酉相似變換,(unitary si

24、milarity ),重視,基于酉相似變換的分解具有后向穩(wěn)定性,,---Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide,,,P.173,,標(biāo)準(zhǔn),Arnoldi,分解,廣義,Krylov,分解,,其中,Q,為酉矩陣,且可以連續(xù)疊加,Matrix Algorithms || EigenSystems. G.W.Stewart P.309,實(shí)測更,效率,,實(shí)測迭代次數(shù),運(yùn)行時(shí)間都減少約,1/3,。,(與,ARPACK,對比),Krylov-schur,分解的優(yōu)點(diǎn),Deflation,操作的基

25、本思路,也類似,更加復(fù)雜,。,易于挑選,ritz,值作為,implicit shift,易于,Deflation(Lock+Purge),Schur,分解將任何一個(gè)矩陣歸約為上三角矩陣,對角線即為該矩陣的特征值;并且在這條對角線上,特征值可以通過酉相似變換來任意排列。也就是說,在生成,Rayleigh,矩陣,并計(jì)算出所有的,ritz,值之后,可以把需要的,Ritz,值排到前面,而不需要的,Ritz,值排到后面,重啟之后,只有挑出來的,Ritz,值出現(xiàn)在序列中。,G. W. Stewart, A Krylov–Schur algorithm for large eigenproblems, SI

26、AM J. Matrix Anal. Appl., 23 (2001), pp. 601–614.,Krylov-schur,及其重啟,,考慮重啟后,,B,矩陣更加復(fù)雜,如右圖所示,包含重啟的,Krylov-Schur,分解算法,Matrix Algorithms || EigenSystems. G.W.Stewart P329-330,收斂速度更快,經(jīng)多個(gè)實(shí)際算例驗(yàn)證,其速度明顯快于目前通用的,ARPACK,,一般迭代次數(shù)僅為,ARPACK,的,60%-70%,。,,,Why?,最新算法,,Subspace iteration with approximate spectral proj

27、ection,FEAST AS A SUBSPACE ITERATION EIGENSOLVER ACCELERATED BY APPROXIMATE SPECTRAL PROJECTION,---,P,.,TAK,,,P,.,TANG,,,E,.,POLIZZI,可求出復(fù)平面內(nèi)指定區(qū)域內(nèi)的所有特征值,,主要用于對稱矩陣,需推廣到非對稱矩陣,基于,cauch,積分的,spectral projection,shift-invert,變換,標(biāo)準(zhǔn)的,shift-invert,變換,Matrix transformations for computing rightmost eigenvalues

28、of large sparse non-symmetric eigenvalue problems -- K.MEERBERGEN, D.ROOSE,Cayley,變換,Matrix transformations for computing rightmost eigenvalues of large sparse non-symmetric eigenvalue problems -- K.MEERBERGEN, D.ROOSE,,Cayley,變換,Cayley,變換特性,,1,平行于虛軸的直線,,映射到單位圓,,2,該直線左側(cè)的點(diǎn)被映射到單位圓內(nèi)部,,3,該直線右側(cè)的點(diǎn)被映射到單位

29、圓外部,Filter polynomial,--- Matrix Algorithms || EigenSystems. G.W.Stewart P.317,Aleksei Nikolaevich Krylov,(1863–1945),,showed in 1931 how to use sequences of the form {b, Ab, A2b, . . .} to construct the characteristic polynomial of a matrix. Krylov was a Russian applied mathematician whose scienti

30、fic interests arose from his early training in naval science that involved the theories of buoyancy, stability, rolling and pitching, vibrations, and compass theories. Krylov served as the director of the Physics–Mathematics Institute of the Soviet Academy of Sciences from 1927 until 1932, and in 19

31、43 he was awarded a “state prize” for his work on compass theory. Krylov was made a “hero of socialist labor,” and he is one of a few athematicians to have a lunar feature named in his honor—on the moon there is the “Crater Krylov.”,Walter Edwin Arnold,i (1917–1995),,was an American engineer who pub

32、lished this technique in 1951, not far from the time that Lanczos’s algorithm emerged. Arnoldi received his undergraduate degree in mechanical engineering from Stevens Institute of Technology, Hoboken, New Jersey, in 1937 and his MS degree at Harvard University in 1939. He spent his career working a

33、s an engineer in the Hamilton Standard Division of the United Aircraft Corporation where he eventually became the division’s chief researcher. He retired in 1977. While his research concerned mechanical and aerodynamic properties of aircraft and aerospace structures, Arnoldi’s name is kept alive by

34、his orthogonalization procedure.,附一,,附二 參考文獻(xiàn),[1] Numerical Analysis. Rainer Kress. Springer-Verlag. 1991,,[2] I.S.Duff, A.M.Erisman, and J.K.Reid. Direct Methods for Sparse Matrices. London:Oxford Univ. Press,1986.,,[3] J.W.H.Liu. The Multifrontal Method for Sparse Matrix Solution: Theory and Pract

35、ice. SIAM Rev., 34 (1992), pp. 82--109.,,[4] T.A.Davis. A column pre-ordering strategy for the unsymmetric-pattern multifrontal method, ACM Trans. Math. Software, vol 30, no. 2, pp. 165-195, 2004.,,[5] N.J.Higham. Accuracy and Stability of Numerical Algorithms. SIAM,2002,,[6] G..H.Golob, C.F.Van loa

36、n. Matrix Computations. The Johns Hopkins University Press. 1996,,[7] J.W.H.Liu. The Multifrontal Method for Sparse Matrix Solution: Theory and Practice. SIAM Rev., 34 (1992), pp. 82--109.,,[8] Fast PageRank Computation via a Sparse Linear System (Extended Abstract)?Gianna M. Del Corso,Antonio Gullí

37、, Francesco Romani.,,[9] Y.Saad, Iterative Methods for Sparse Linear Systems, PWS, Boston,1996,,[10] Y.S. Chen* ,Simon Li. Application of Multifrontal Method for Doubly-Bordered Sparse Matrix in Laser Diode Simulator. NUSOD,2004,,[11],陳英時(shí) 吳文勇等,.,采用多波前法求解大型結(jié)構(gòu)方程組,.,建筑結(jié)構(gòu),,2007,年,09,期,,[12],宋新立,,,陳英時(shí)等,.,電力系統(tǒng)全過程動(dòng)態(tài)仿真中大型稀疏線性方程組的分塊求解算法,,,,非常感謝各位老師,同學(xué)!,,,

展開閱讀全文
溫馨提示:
1: 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
2: 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶所有。
3.本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會(huì)有圖紙預(yù)覽,若沒有圖紙預(yù)覽就沒有圖紙。
4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
5. 裝配圖網(wǎng)僅提供信息存儲(chǔ)空間,僅對用戶上傳內(nèi)容的表現(xiàn)方式做保護(hù)處理,對用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對任何下載內(nèi)容負(fù)責(zé)。
6. 下載文件中如有侵權(quán)或不適當(dāng)內(nèi)容,請與我們聯(lián)系,我們立即糾正。
7. 本站不保證下載資源的準(zhǔn)確性、安全性和完整性, 同時(shí)也不承擔(dān)用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。

最新文檔

相關(guān)資源

更多
正為您匹配相似的精品文檔
關(guān)于我們 - 網(wǎng)站聲明 - 網(wǎng)站地圖 - 資源地圖 - 友情鏈接 - 網(wǎng)站客服 - 聯(lián)系我們

copyright@ 2023-2025  zhuangpeitu.com 裝配圖網(wǎng)版權(quán)所有   聯(lián)系電話:18123376007

備案號:ICP2024067431-1 川公網(wǎng)安備51140202000466號


本站為文檔C2C交易模式,即用戶上傳的文檔直接被用戶下載,本站只是中間服務(wù)平臺(tái),本站所有文檔下載所得的收益歸上傳人(含作者)所有。裝配圖網(wǎng)僅提供信息存儲(chǔ)空間,僅對用戶上傳內(nèi)容的表現(xiàn)方式做保護(hù)處理,對上載內(nèi)容本身不做任何修改或編輯。若文檔所含內(nèi)容侵犯了您的版權(quán)或隱私,請立即通知裝配圖網(wǎng),我們立即給予刪除!