R has been widely adopted as a data analysis and mining tool in many domains over the past decade. As Big Data overwhelms those fields, the computational needs and workloads of existing R solutions increase significantly. With recent hardware and software developments, it is possible to enable massive parallelism in existing R solutions with little to no modification. In this paper, three approaches to speeding up R computations are evaluated: multiple cores, the Intel Xeon Phi SE10P coprocessor, and the general-purpose graphics processing unit (GPGPU). The performance engineering and evaluation efforts in this study are based on a popular R benchmark script. The paper presents preliminary results from running R-benchmark with the above packages and hardware technology combinations.
| Published in | International Journal on Data Science and Technology (Volume 4, Issue 2) | 
| DOI | 10.11648/j.ijdst.20180402.11 | 
| Page(s) | 42-48 | 
| Creative Commons | This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. | 
| Copyright | Copyright © The Author(s), 2018. Published by Science Publishing Group | 
Keywords: Performance Evaluation, R, Intel Xeon Phi, Multi-Core Computing, GPGPU
APA Style
Hui Zhang. (2018). Performance Engineering for Scientific Computing with R. International Journal on Data Science and Technology, 4(2), 42-48. https://doi.org/10.11648/j.ijdst.20180402.11
@article{10.11648/j.ijdst.20180402.11,
  author = {Hui Zhang},
  title = {Performance Engineering for Scientific Computing with R},
  journal = {International Journal on Data Science and Technology},
  volume = {4},
  number = {2},
  pages = {42-48},
  doi = {10.11648/j.ijdst.20180402.11},
  url = {https://doi.org/10.11648/j.ijdst.20180402.11},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ijdst.20180402.11},
  year = {2018}
}
											
TY  - JOUR
T1  - Performance Engineering for Scientific Computing with R
AU  - Hui Zhang
Y1  - 2018/06/26
PY  - 2018
N1  - https://doi.org/10.11648/j.ijdst.20180402.11
DO  - 10.11648/j.ijdst.20180402.11
T2  - International Journal on Data Science and Technology
JF  - International Journal on Data Science and Technology
JO  - International Journal on Data Science and Technology
SP  - 42
EP  - 48
PB  - Science Publishing Group
SN  - 2472-2235
UR  - https://doi.org/10.11648/j.ijdst.20180402.11
VL  - 4
IS  - 2
ER  -