大型稀疏矩陣的LU分解及特征值求解Grusoftcom課件
《大型稀疏矩陣的LU分解及特征值求解Grusoftcom課件》由會員分享,可在線閱讀,更多相關《大型稀疏矩陣的LU分解及特征值求解Grusoftcom課件(39頁珍藏版)》請在裝配圖網(wǎng)上搜索。
1、單擊此處編輯母版標題樣式,,單擊此處編輯母版文本樣式,,第二級,,第三級,,第四級,,第五級,,單擊此處編輯母版標題樣式,,單擊此處編輯母版文本樣式,,第二級,,第三級,,第四級,,第五級,,,,*,單擊此處編輯母版標題樣式,,單擊此處編輯母版文本樣式,,第二級,,第三級,,第四級,,第五級,,,,*,,,,本作品采用,知識共享署名,-,非商業(yè)性使用,2.5,中國大陸許可協(xié)議,進行許可。,,專業(yè)交流,模板超市,設計服務,NordriDesign,中國專業(yè),PowerPoint,媒體設計與開發(fā),本作品的提供是以適用知識共享組織的公共許可( 簡稱“,CCPL”,或 “許可”) 條款為前提的。本作品
2、受著作權法以及其他相關法律的保護。對本作品的使用不得超越本許可授權的范圍。,,如您行使本許可授予的使用本作品的權利,就表明您接受并同意遵守本許可的條款。在您接受這些條款和規(guī)定的前提下,許可人授予您本許可所包括的權利。,,,,查看全部,…,單擊此處編輯母版標題樣式,,單擊此處編輯母版文本樣式,,第二級,,第三級,,第四級,,第五級,,單擊此處編輯母版標題樣式,,單擊此處編輯母版文本樣式,,第二級,,第三級,,第四級,,第五級,,,,*,單擊此處編輯母版標題樣式,,單擊此處編輯母版文本樣式,,第二級,,第三級,,第四級,,第五級,,,,*,單擊此處編輯母版標題樣式,,單擊此處編輯母版文本樣式,,第
3、二級,,第三級,,第四級,,第五級,,單擊此處編輯母版標題樣式,,單擊此處編輯母版文本樣式,,第二級,,第三級,,第四級,,第五級,,2013. 7. 20,大型稀疏矩陣,,的,LU,分解及特征值求解,陳英時,,2016 . 1. 9,稀疏矩陣求解的廣泛應用,矩陣求解是數(shù)值計算的核心,[1],,稀疏矩陣求解是數(shù)值計算的關鍵之一,,偏微分方程,積分方程,特征值,優(yōu)化,…,,,萬階以上,dense matrix,不可行,,稀疏矩陣求解往往是資源瓶頸,,時間瓶頸,內(nèi)存,外存等瓶頸,,,直接法基于高斯消元法,即計算,A,的,LU,分解。,A,通常要經(jīng)過一系列置換排序,可歸并為左置換矩陣,P,,右置換
4、矩陣,Q,?;静襟E如下:,,1,)符號分析,:,,得到置換排序矩陣,P Q,,,2,)數(shù)值分解:,,,,3,)前代,回代:,I.S.Duff, A.M.Erisman, and J.K.Reid. Direct Methods for Sparse Matrices. London:Oxford Univ. Press,1986.,,J.J.Dongarra,I.S.Duff, ... Numerical Linear Algebra for High-Performance Computers.,,G.H.Golob, C.F.Van loan. Matrix Computations
5、. The Johns Hopkins University Press. 1996,稀疏矩陣復雜、多變,,基本參數(shù),,,對稱性,稀疏性,非零元分布,,敏感性,病態(tài)矩陣,,,條件數(shù),,格式多變,,,Harwell-Boeing Exchange Format,。。。,,測試集,,,Harwell-Boeing Sparse Matrix Collection,,,UF sparse matrix collection,,,,求解器的飛速發(fā)展,BBMAT,,,38744,階,分解后元素超過四千萬,.,,1988,巨型機,cray-2,上,>1000,秒,,2003 4G umfpac
6、k4 32.6,秒,[4],,2006 3.0G GSS1.2 15,秒,,2012 3.0G 4,核,GSS 2.3 4,秒,,2014 i7 8,核,GSS 2.4 1,秒,,硬件的發(fā)展,CPU,,內(nèi)存等,,稀疏算法逐漸成熟,,,,稀疏排序,密集子陣,multifrontal ,supernodal…,,數(shù)學庫,,,BLAS,,,LAPACK,,稀疏,LU,分解算法的關鍵,根據(jù)符號分析,數(shù)值分解算法的不同,直接法有以下幾類,[15],:,,1,),general technique(,通用方法,),:,,主要采用消去樹等結構進行,LU,分解。適用于非常稀疏的非結構化矩陣。,,
7、2,),frontal scheme(,波前法,),,LU,分解過程中,將連續(xù)多行合并為一個密集子塊,(,波前,),,對這個子塊采用,BLAS,等高效數(shù)學庫進行分解。,,3,),multifrontal method(,多波前法,),,,將符號結構相同的多行,(,列,),合并為多個密集子塊,以這些密集子塊為單位進行,LU,分解。這些子塊的生成,消去,裝配,釋放等都需要特定的數(shù)據(jù)結構及算法。,,1,零元是動態(tài)的概念,需要,稀疏排序,,,減少注入元,(fill-in),2,密集子陣,稀疏,LU,分解 (不考慮,零元,的),LU,分解,稀疏排序,,排序算法的作用是減少矩陣,LU,分解過程中產(chǎn)生的注
8、入元,求解矩陣的最優(yōu)排序方法是個,NP,完全問題(不能夠在合理的時間內(nèi)進行求解),對具體矩陣而言,目前也沒有方法或指標來判定哪種算法好。因此實測不同的算法,對比產(chǎn)生的注入元,是確定排序算法的可靠依據(jù)。,,,主要的排序算法有最小度排序(,MMD,,,minimum degree ordering algorithm,)和嵌套排序(,nested dissection,)兩種,。,矩陣排序方面相應的成熟軟件庫有,AMD[12],、,COLAMD,、,METIS[13],等。,,Yannokakis M. Computing the minimum fill-in in NP-Complete. S
9、IAM J.Algebraic Discrete Methods, 1981, 2:77~79,近似最小度,排序算法,–,商圖,,,近似最小度排序(,AMD,,,Approximate Minimum Degree OrderingAlgorithm,)算法于,1996,年左右由,Patrick R. Amestoy,等學者提出,AMESTOY, P. R., DAVIS, ... 1996a. An approximate minimum degree ordering algorithm. SIAM J. Matrix Anal. Applic. 17, 4, 886,–,905.,為何需
10、要密集子塊(Dense Matrix),,,多波前法,(multifrontal),簡介,發(fā)展,,,Duff and Raid [2],,等分析,改進,[3],,開發(fā),UMFPACK [4],,基本算法,,,利用稀疏矩陣的特性,得到一系列密集子陣(波前)。將,LU,分解轉化為對這些波前的裝配,消去,更新等操作。,,多波前法的優(yōu)點,,波前是,dense matrix ,,可直接調(diào)用高性能庫(,BLAS,等),,密集子陣可以節(jié)省下標存儲,,提高并行性,,目前主要的求解器,,,UMFPACK,,,,GSS,,MUMPS,PARDISO,,WSMP,,HSL MA41,等,,LU,分解形成,front
11、al,10,階矩陣。,,藍點代表非零元。紅點表示分解產(chǎn)生的注入元,(fill-in),,Frontal,劃分,{a}, {c}8q8i88o {e} {f,g}{h,i,j},Frontal,的裝配,消去,更新過程,a,c h,,c · ·,,h · ·,,{c},{f,g},,{e},{h,i,j},{a},c,,g,h,,g · ·,,h · ·,b,e j,,e · ·,,j · ·,f,,g,h,,g,g,·,,h · ·,e,,i,j,,i · ·,,j · ·,h,,i, j,,i,i,·,,j ·,j,w0aa828,d,,i,j,,i · ·,,j · ·,消去
12、樹,消去樹,消去樹是符號分析的關鍵結構,其每個節(jié)點對應于矩陣的一列(行),該節(jié)點只與其父節(jié)點相連,父節(jié)點定義如下:,J.W.H.Liu. The Role of Elimination Trees in Sparse Factorization. SIAM J.Matrix Anal.Applic.,11(1):134-172,1990.,,J.W.H.Liu. Elimination structures for unsymmetric sparse LU factors. SIAM J. Matrix Anal. Appl. 14, no. 2, 334--352, 1993.,,GSS,
13、簡介,標準,C,開發(fā),適用于各種平臺,,比,INTEL PARDISO,更快,更穩(wěn)定,,CPU/GPU,混合計算,,突破,32,位,Windows,內(nèi)存限制,,32,個用戶參數(shù),,,支持用戶定制模塊,高校,研究所,,中國電力科學研究院,,香港大學 南京大學 河海大學,,中國石油大學,,電子科技大學,,三峽大學 山東大學,,user,detail,Why they choose GSS,crosslight,Industry leader in TCAD simulation,Hybrid GPU/CPU version, more than 2 times faster than PARDIS
14、O, MUMPS and other sparse solvers.,soilvision,The most technically advanced suite of 1D/2D/3D geotechnical software,Much faster than their own sparse solver.,FEM consulting,The leading research teams in the area of the Finite Element Method since 1967,GSS is faster than PARDISO and provide many cust
15、om module.,GSCAD,Leading building software in China,GSS provide a user-specific module to deal with ill-conditioned matrix.,ICAROS,A global turnkey geospatial mapping service provider and state of the art photogrammetric technologies developer.,GSS is faster than PARDISO. Also provide some technical
16、 help.,EPRI,China Electric Power Research Institute,3-4 times faster than KLU,他們?yōu)槭裁催x擇,GSS?,GSS,,–,加權,消去樹,工作量消去樹,,基于消去樹結構來計算數(shù)值分解的工作量,將,LU,分解劃分為多個獨立的任務,為高效并行計算奠定基礎。,GSS,,-,雙閾值列選主元算法,GSS,,- CPU/GPU,混合計算,1) After divides LU factorization into many parallel tasks, GSS will use adaptive strategy to run th
17、ese tasks in different hardware (CPU, GPU …). That is, if GPU have high computing power, then it will run more tasks automatically. If CPU is more powerful, then GSS will give it more tasks.,,,2) And furthermore, if CPU is free (have finished all tasks) and GPU still run a task, then GSS will divide
18、 this task to some small tasks then assign some child-task to CPU, then CPU do computing again. So get higher parallel efficiency.,,,3) GSS will also do some testing to get best parameters for different hardware.,GSS,,–,求解,頻域譜元方法生成,的,矩陣,矩陣較稠密,,約,40,萬階,,15,億個非零元,GSS,約需,15G,內(nèi)存,需要求解多個右端項,,32,個右端項 需,80
19、,秒?,進一步優(yōu)化,,CPU/GPU,混合計算 數(shù)值分解約,35,秒,,重復利用符號分析結果,,根據(jù)矩陣的特殊結構來進一步減少非零元,,估計,80/4=20,秒,一次,LU,分解,,符號分解時間,50,秒,,數(shù)值分解時間,46,秒,,回代,2.5,秒,對比測試,The test matrices are all from the UF sparse matrix collection,,PARDISO is from Intel Composer XE 2013 SP1.,,GSS 2.4 use CPU-GPU hybrid computing.,,The testing CPU is I
20、NTEL Core i7-4770(3.4GHz) with 24G memory. The graphics card is ASUS GTX780 (with compute capability 3.5). NVIDIA CUDA Toolkit is 5.5. The operating system is Windows 7 64. Both solvers use default parameters.,,,For large matrices need long time computing, GSS 2.4 is Nearly 3 times faster than PARDI
21、SO. For matrices need short time computing, PARDISO is faster than GSS. One reason is that complex synchronization between CPU/GPU do need some extra time.,大型稀疏矩陣的特征值求解,重視,,A,為,全過程動態(tài)仿真程序中大規(guī)模稀疏矩陣,為,特征值,,x,為對應的特征向量,150 Years old and still alive: eigenproblems,,,1997 --- by Henk A. van der Vorst , Gene
22、 H. Golub,稀疏,LU,分解,-,理論上即高斯消元法,,稀疏特征值,–,趨近于純數(shù)學,代數(shù)特征值問題,,,G.H.Golob, C.F.Van loan. Matrix Computations. The Johns Hopkins University Press. 1996,,Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide. Zhaojun Bai , …. 2000,冪迭代,-- Power iteration,冪迭代,--,一個形象的解釋,瑞利商迭代,-- Raylei
23、gh Quotient iteration,子空間迭代,,Krylov,子空間,Arnoldi,迭代,Arnoldi,迭代的基本算法,ARPACK,,,ON RESTARTING THE ARNOLDI METHOD FOR LARGENONSYMMETRIC EIGENVALUE PROBLEMS, MORGAN,,R.B. Lehoucq. Analysis and Implementation of an Implicitly Restarted Iteration. PhD thesis,Arnoldi,迭代求解特征值,Krylov,分解,,--,基于酉相似變換,(unitary si
24、milarity ),重視,基于酉相似變換的分解具有后向穩(wěn)定性,,---Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide,,,P.173,,標準,Arnoldi,分解,廣義,Krylov,分解,,其中,Q,為酉矩陣,且可以連續(xù)疊加,Matrix Algorithms || EigenSystems. G.W.Stewart P.309,實測更,效率,,實測迭代次數(shù),運行時間都減少約,1/3,。,(與,ARPACK,對比),Krylov-schur,分解的優(yōu)點,Deflation,操作的基
25、本思路,也類似,更加復雜,。,易于挑選,ritz,值作為,implicit shift,易于,Deflation(Lock+Purge),Schur,分解將任何一個矩陣歸約為上三角矩陣,對角線即為該矩陣的特征值;并且在這條對角線上,特征值可以通過酉相似變換來任意排列。也就是說,在生成,Rayleigh,矩陣,并計算出所有的,ritz,值之后,可以把需要的,Ritz,值排到前面,而不需要的,Ritz,值排到后面,重啟之后,只有挑出來的,Ritz,值出現(xiàn)在序列中。,G. W. Stewart, A Krylov–Schur algorithm for large eigenproblems, SI
26、AM J. Matrix Anal. Appl., 23 (2001), pp. 601–614.,Krylov-schur,及其重啟,,考慮重啟后,,B,矩陣更加復雜,如右圖所示,包含重啟的,Krylov-Schur,分解算法,Matrix Algorithms || EigenSystems. G.W.Stewart P329-330,收斂速度更快,經(jīng)多個實際算例驗證,其速度明顯快于目前通用的,ARPACK,,一般迭代次數(shù)僅為,ARPACK,的,60%-70%,。,,,Why?,最新算法,,Subspace iteration with approximate spectral proj
27、ection,FEAST AS A SUBSPACE ITERATION EIGENSOLVER ACCELERATED BY APPROXIMATE SPECTRAL PROJECTION,---,P,.,TAK,,,P,.,TANG,,,E,.,POLIZZI,可求出復平面內(nèi)指定區(qū)域內(nèi)的所有特征值,,主要用于對稱矩陣,需推廣到非對稱矩陣,基于,cauch,積分的,spectral projection,shift-invert,變換,標準的,shift-invert,變換,Matrix transformations for computing rightmost eigenvalues
28、of large sparse non-symmetric eigenvalue problems -- K.MEERBERGEN, D.ROOSE,Cayley,變換,Matrix transformations for computing rightmost eigenvalues of large sparse non-symmetric eigenvalue problems -- K.MEERBERGEN, D.ROOSE,,Cayley,變換,Cayley,變換特性,,1,平行于虛軸的直線,,映射到單位圓,,2,該直線左側的點被映射到單位圓內(nèi)部,,3,該直線右側的點被映射到單位
29、圓外部,Filter polynomial,--- Matrix Algorithms || EigenSystems. G.W.Stewart P.317,Aleksei Nikolaevich Krylov,(1863–1945),,showed in 1931 how to use sequences of the form {b, Ab, A2b, . . .} to construct the characteristic polynomial of a matrix. Krylov was a Russian applied mathematician whose scienti
30、fic interests arose from his early training in naval science that involved the theories of buoyancy, stability, rolling and pitching, vibrations, and compass theories. Krylov served as the director of the Physics–Mathematics Institute of the Soviet Academy of Sciences from 1927 until 1932, and in 19
31、43 he was awarded a “state prize” for his work on compass theory. Krylov was made a “hero of socialist labor,” and he is one of a few athematicians to have a lunar feature named in his honor—on the moon there is the “Crater Krylov.”,Walter Edwin Arnold,i (1917–1995),,was an American engineer who pub
32、lished this technique in 1951, not far from the time that Lanczos’s algorithm emerged. Arnoldi received his undergraduate degree in mechanical engineering from Stevens Institute of Technology, Hoboken, New Jersey, in 1937 and his MS degree at Harvard University in 1939. He spent his career working a
33、s an engineer in the Hamilton Standard Division of the United Aircraft Corporation where he eventually became the division’s chief researcher. He retired in 1977. While his research concerned mechanical and aerodynamic properties of aircraft and aerospace structures, Arnoldi’s name is kept alive by
34、his orthogonalization procedure.,附一,,附二 參考文獻,[1] Numerical Analysis. Rainer Kress. Springer-Verlag. 1991,,[2] I.S.Duff, A.M.Erisman, and J.K.Reid. Direct Methods for Sparse Matrices. London:Oxford Univ. Press,1986.,,[3] J.W.H.Liu. The Multifrontal Method for Sparse Matrix Solution: Theory and Pract
35、ice. SIAM Rev., 34 (1992), pp. 82--109.,,[4] T.A.Davis. A column pre-ordering strategy for the unsymmetric-pattern multifrontal method, ACM Trans. Math. Software, vol 30, no. 2, pp. 165-195, 2004.,,[5] N.J.Higham. Accuracy and Stability of Numerical Algorithms. SIAM,2002,,[6] G..H.Golob, C.F.Van loa
36、n. Matrix Computations. The Johns Hopkins University Press. 1996,,[7] J.W.H.Liu. The Multifrontal Method for Sparse Matrix Solution: Theory and Practice. SIAM Rev., 34 (1992), pp. 82--109.,,[8] Fast PageRank Computation via a Sparse Linear System (Extended Abstract)?Gianna M. Del Corso,Antonio Gullí
37、, Francesco Romani.,,[9] Y.Saad, Iterative Methods for Sparse Linear Systems, PWS, Boston,1996,,[10] Y.S. Chen* ,Simon Li. Application of Multifrontal Method for Doubly-Bordered Sparse Matrix in Laser Diode Simulator. NUSOD,2004,,[11],陳英時 吳文勇等,.,采用多波前法求解大型結構方程組,.,建筑結構,,2007,年,09,期,,[12],宋新立,,,陳英時等,.,電力系統(tǒng)全過程動態(tài)仿真中大型稀疏線性方程組的分塊求解算法,,,,非常感謝各位老師,同學!,,,
- 溫馨提示:
1: 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
2: 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權益歸上傳用戶所有。
3.本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會有圖紙預覽,若沒有圖紙預覽就沒有圖紙。
4. 未經(jīng)權益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
5. 裝配圖網(wǎng)僅提供信息存儲空間,僅對用戶上傳內(nèi)容的表現(xiàn)方式做保護處理,對用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對任何下載內(nèi)容負責。
6. 下載文件中如有侵權或不適當內(nèi)容,請與我們聯(lián)系,我們立即糾正。
7. 本站不保證下載資源的準確性、安全性和完整性, 同時也不承擔用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。
最新文檔
- 2025年防凍教育安全教育班會全文PPT
- 2025年寒假安全教育班會全文PPT
- 初中2025年冬季防溺水安全教育全文PPT
- 初中臘八節(jié)2024年專題PPT
- 主播直播培訓提升人氣的方法正確的直播方式如何留住游客
- XX地區(qū)機關工委2024年度年終黨建工作總結述職匯報
- 心肺復蘇培訓(心臟驟停的臨床表現(xiàn)與診斷)
- 我的大學生活介紹
- XX單位2024年終專題組織生活會理論學習理論學習強黨性凝心聚力建新功
- 2024年XX單位個人述職述廉報告
- 一文解讀2025中央經(jīng)濟工作會議精神(使社會信心有效提振經(jīng)濟明顯回升)
- 2025職業(yè)生涯規(guī)劃報告自我評估職業(yè)探索目標設定發(fā)展策略
- 2024年度XX縣縣委書記個人述職報告及2025年工作計劃
- 寒假計劃中學生寒假計劃安排表(規(guī)劃好寒假的每個階段)
- 中央經(jīng)濟工作會議九大看點學思想強黨性重實踐建新功