dc.date.accessioned | 2014-01-08T16:36:23Z | |
dc.date.available | 2014-01-08T16:36:23Z | |
dc.date.created | 2013-09-26T13:34:43Z | |
dc.date.issued | 2013 | |
dc.identifier.citation | Su, Huayou; Wu, Nan; Wen, Mei; Zhang, Chunyuan; Cai, Xing. Performance of sediment transport simulations on NVIDIA’s Kepler architecture. Procedia Computer Science. 2013, 18, 1275-1281 | |
dc.identifier.uri | http://hdl.handle.net/10852/37955 | |
dc.description.abstract | Aiming to understand how high-performance CUDA programming can be done for NVIDIA's new Kepler architecture, we have investigated a specific case of simulating sediment transport. The arisen stencil computations have distinct features connected to the two nonlinear partial differential equations that constitute the mathematical model. Consequently, the required CUDA programming effort differs for the two corresponding CUDA kernel functions. While Kepler's new read-only data cache brings enough benefits for one kernel function, performance of the other kernel function is further enhanceable through using the shared memory and so-called halo threads. The highest achieved performance of the stencil computation amounts to 190.45 GFLOPs on a Tesla K20 GPU. | |
dc.language | EN | |
dc.rights | Attribution-NonCommercial-NoDerivs 3.0 Unported | |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/ | |
dc.title | Performance of sediment transport simulations on NVIDIA’s Kepler architecture | |
dc.type | Journal article | |
dc.creator.author | Su, Huayou | |
dc.creator.author | Wu, Nan | |
dc.creator.author | Wen, Mei | |
dc.creator.author | Zhang, Chunyuan | |
dc.creator.author | Cai, Xing | |
cristin.unitcode | 185,15,5,52 | |
cristin.unitname | Beregningsorientert matematikk | |
cristin.ispublished | true | |
cristin.fulltext | original | |
cristin.qualitycode | 1 | |
dc.identifier.cristin | 1052642 | |
dc.identifier.bibliographiccitation | info:ofi/fmt:kev:mtx:ctx&ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.jtitle=Procedia Computer Science&rft.volume=18&rft.spage=1275&rft.date=2013 | |
dc.identifier.jtitle | Procedia Computer Science | |
dc.identifier.volume | 18 | |
dc.identifier.startpage | 1275 | |
dc.identifier.endpage | 1281 | |
dc.identifier.doi | http://dx.doi.org/10.1016/j.procs.2013.05.294 | |
dc.identifier.urn | URN:NBN:no-40241 | |
dc.type.document | Journal article | |
dc.type.peerreviewed | Peer reviewed | |
dc.source.issn | 1877-0509 | |
dc.identifier.fulltext | Fulltext https://www.duo.uio.no/bitstream/handle/10852/37955/1/iccs2013.pdf | |
dc.type.version | PublishedVersion | |