DOI: 10.3414/ME11-02-0048
Costs of Cloud Computing for a Biometry Department[*]
A Case Study
Publication History
Received: 21 November 2011
Accepted: 03 July 2012
Publication Date: 20 January 2018 (online)
Summary
Background: “Cloud” computing providers, such as Amazon Web Services (AWS), offer stable and scalable computational resources based on hardware virtualization, with short, usually hourly, billing periods. The pay-as-you-use model seems appealing for biometry research units that have only limited access to university or corporate data center resources or grids.
Objectives: This case study compares the costs of an existing heterogeneous on-site hardware pool in a Medical Biometry and Statistics department to a comparable AWS offer.
Methods: The “total cost of ownership”, including all direct costs, is determined for the on-site hardware, and hourly prices are derived from actual system utilization during 2011. Indirect costs, which are difficult to quantify, are not included in this comparison, although some rough guidance from our experience is given. To indicate the scale of costs for a methodological research project, a simulation study of a permutation-based statistical approach is performed on both AWS and on-site hardware.
Results: In the presented case, with a system utilization of 25–30 percent and a 3- to 5-year amortization period, on-site hardware can cost less than hourly rental in the cloud, depending on the instance chosen. The need to rent cloud instances with sufficient main memory is a deciding factor in this comparison.
Conclusions: Costs for on-site hardware may vary depending on the specific infrastructure at a research unit, but this variation has only a moderate impact on the overall comparison and on the subsequent decision about how to obtain affordable scientific computing resources. Overall utilization has a much stronger impact, as it determines the actual computing hours needed per year. Taking this into account, cloud computing might still be a viable option for projects with limited maturity, or as a supplement for short peaks in demand.
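The dependence of the comparison on utilization can be illustrated with a minimal sketch. It assumes that the hourly on-site price is the annual direct cost (amortized purchase price plus annual running costs) divided by the hours actually used per year. The function names, the cost breakdown, and all figures are hypothetical illustrations and are not taken from the study.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def annual_direct_cost(purchase_price, amortization_years, annual_running_cost):
    """Annual direct cost: linear amortization plus running costs (assumed model)."""
    if amortization_years <= 0:
        raise ValueError("amortization_years must be positive")
    return purchase_price / amortization_years + annual_running_cost


def on_site_hourly_cost(purchase_price, amortization_years,
                        annual_running_cost, utilization):
    """Cost per utilized hour; utilization is a fraction in (0, 1]."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    yearly = annual_direct_cost(purchase_price, amortization_years,
                                annual_running_cost)
    return yearly / (HOURS_PER_YEAR * utilization)


def break_even_utilization(purchase_price, amortization_years,
                           annual_running_cost, cloud_hourly_price):
    """Utilization above which on-site hardware is cheaper per hour than the cloud."""
    if cloud_hourly_price <= 0:
        raise ValueError("cloud_hourly_price must be positive")
    yearly = annual_direct_cost(purchase_price, amortization_years,
                                annual_running_cost)
    return yearly / (HOURS_PER_YEAR * cloud_hourly_price)


if __name__ == "__main__":
    # Hypothetical machine: price 6000, 4-year amortization, 690 per year running costs.
    hourly = on_site_hourly_cost(6000, 4, 690, utilization=0.25)
    threshold = break_even_utilization(6000, 4, 690, cloud_hourly_price=0.5)
    print(f"on-site cost per utilized hour: {hourly:.2f}")
    print(f"break-even utilization: {threshold:.0%}")
```

Under this simple model, halving the utilization doubles the effective hourly price of on-site hardware. Changing the purchase price or running costs only shifts the annual cost in proportion to their share of it.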
Keywords
Costs and cost analysis - biostatistics - mathematical computing - cloud computing - Amazon Web Services
* Supplementary material published on our website www.methods-online.com