
Research at St Andrews

Large-scale hierarchical k-means for heterogeneous many-core supercomputers

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Author(s)

Lideng Li, Teng Yu, Wenlai Zhao, Haohuan Fu, Chenyu Wang, Li Tan, Guangwen Yang, John Thomson


Abstract

This paper presents a novel design and implementation of the k-means clustering algorithm targeting the Sunway TaihuLight supercomputer. We introduce a multi-level parallel partition approach that partitions not only by dataflow and centroid, but also by dimension. Our multi-level (nkd) approach unlocks the potential of the hierarchical parallelism in the SW26010 heterogeneous many-core processor and the system architecture of the supercomputer.

Our design can process large-scale clustering problems with up to 196,608 dimensions and over 160,000 target centroids while maintaining high performance and high scalability, significantly improving the capability of k-means over previous approaches. The evaluation shows that our implementation takes less than 18 seconds per iteration for a large-scale clustering case with 196,608 data dimensions and 2,000 centroids when using 4,096 nodes (1,064,496 cores) in parallel, making k-means a more feasible solution for complex scenarios.
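
The three-way (nkd) partitioning named in the abstract can be illustrated with a small, single-process sketch. This is not the authors' implementation and does not model the SW26010 hardware; the helper names `chunks` and `assign_nkd` and the tile counts are hypothetical, chosen only for illustration. It shows that the assignment step of k-means splits into independent tiles over samples (n), centroids (k), and dimensions (d), with partial squared distances summed across dimension tiles before the nearest centroid is chosen.

```python
from itertools import product


def chunks(length, parts):
    """Split range(length) into `parts` contiguous, near-equal (start, stop) slices."""
    base, extra = divmod(length, parts)
    bounds, start = [], 0
    for i in range(parts):
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def assign_nkd(points, centroids, n_parts, k_parts, d_parts):
    """Nearest-centroid assignment with work tiled by sample, centroid, and dimension.

    Illustrative only: the tiles run sequentially here, whereas a parallel
    implementation would map them onto separate workers.
    """
    n, k, d = len(points), len(centroids), len(points[0])
    # partial[i][j] accumulates the squared distance from point i to centroid j.
    partial = [[0.0] * k for _ in range(n)]
    # Every (n, k, d) tile touches a disjoint slice of the work.
    for (n0, n1), (k0, k1), (d0, d1) in product(
        chunks(n, n_parts), chunks(k, k_parts), chunks(d, d_parts)
    ):
        for i in range(n0, n1):
            for j in range(k0, k1):
                # Reduction over d: each dimension tile adds its partial sum.
                partial[i][j] += sum(
                    (points[i][t] - centroids[j][t]) ** 2 for t in range(d0, d1)
                )
    # Reduction over k: pick the centroid with the smallest total distance.
    return [min(range(k), key=lambda j: row[j]) for row in partial]


points = [[0, 0, 0, 0], [1, 1, 1, 1], [9, 9, 9, 9], [10, 10, 10, 10]]
centroids = [[0, 0, 0, 0], [10, 10, 10, 10]]
labels = assign_nkd(points, centroids, n_parts=2, k_parts=2, d_parts=2)
print(labels)
```

The result is the same for any choice of tile counts, because the partial sums over dimension tiles add up to the full squared distance. That property is what makes it possible to split by dimension in addition to the more common splits by sample and by centroid.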

Details

Original language: English
Title of host publication: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC '18)
Place of Publication: Piscataway
Publisher: IEEE Press
Chapter: 13
Number of pages: 11
ISBN (Electronic): 9781538683842
State: Published - 11 Nov 2018
Event: The International Conference for High Performance Computing, Networking, Storage, and Analysis - Dallas, United States
Duration: 11 Nov 2018 - 16 Nov 2018
https://sc18.supercomputing.org/

Conference

Conference: The International Conference for High Performance Computing, Networking, Storage, and Analysis
Abbreviated title: SC18
Country: United States
City: Dallas
Period: 11/11/18 - 16/11/18

    Research areas

  • Supercomputer, Multi/many-core Processors, Clustering, Parallel computing


Related by author

  1. Lattice-based scheduling for multi-FPGA systems

    Yu, T., Feng, B., Stillwell, M., Guo, L., Ma, Y. & Thomson, J. D. 26 Oct 2018 (Accepted/In press) Proceedings of the International Conference on Field-Programmable Technology 2018, Naha, Okinawa, Japan. IEEE Press

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

  2. Predicting and optimizing image compression

    Murashko, O., Thomson, J. D. & Leather, H. 1 Oct 2016 Proceedings of the 24th ACM International Conference on Multimedia. ACM, p. 665-669

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

  3. Milepost GCC: Machine Learning Enabled Self-tuning Compiler

    Fursin, G., Kashnikov, Y., Memon, A., Chamski, Z., Temam, O., Namolaru, M., Yom-Tov, E., Mendelson, B., Zaks, A., Courtois, E., Bodin, F., Barnard, P., Ashton, E., Bonilla, E., Thomson, J. D., Williams, C. & O'Boyle, M. 2011 In: International Journal of Parallel Programming. 39, 3, p. 296-327 32 p.

    Research output: Contribution to journal > Article

  4. Automatic OpenCL device characterization: guiding optimized kernel design

    Thoman, P., Kofler, K., Studt, H., Thomson, J. D. & Fahringer, T. 2011 Euro-Par 2011 Parallel Processing: 17th International Conference, Euro-Par 2011, Bordeaux, France, August 29 - September 2, 2011, Proceedings, Part II. Berlin, Heidelberg: Springer-Verlag, p. 438-452 15 p. (Lecture Notes in Computer Science; vol. 6853/2011)

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

  5. Workload characterization supporting the development of domain-specific compiler optimizations using decision trees for data mining

    Fenacci, D., Franke, B. & Thomson, J. 2010 Proceedings of the 13th International Workshop on Software & Compilers for Embedded Systems. New York, NY, USA: ACM, p. 5:1-5:10 (SCOPES '10)

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

ID: 255501866