Retail Site Selection using Machine Learning Algorithms
Hui-Jia Yee1, Choo-Yee Ting2, Chiung Ching Ho3
1Hui-Jia Yee*, Faculty of Computing and Informatics, Multimedia University, Cyberjaya, Malaysia. Email:
2Choo-Yee Ting, Institute of Postgraduate Studies, Multimedia University, Cyberjaya, Malaysia.
3Chiung Ching Ho, Faculty of Computing and Informatics, Multimedia University, Cyberjaya, Malaysia.

Manuscript received on November 15, 2019. | Revised Manuscript received on November 23, 2019. | Manuscript published on November 30, 2019. | PP: 2422-2431 | Volume-8 Issue-4, November 2019. | Retrieval Number: D7186118419/2019©BEIESP | DOI: 10.35940/ijrte.D7186.118419

© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (

Abstract: Selecting a new site for retail business expansion has always been a challenge for decision-makers. It requires not only sales data but also geographic data to identify a location suited to the retailer's purposes. Proper use of these data could lead to better decision-making. To date, common techniques such as geographic information systems (GIS) and multi-criteria decision making (MCDM) have been applied to site selection. These methods, however, not only require extensive human effort but, more importantly, make it difficult to validate the importance of the identified variables. In this work, sales performance is proposed as a function of geospatial features to determine the suitability of a retail location. The main aim of this study was to identify the features that contribute to optimal site selection, which in turn facilitate sales prediction for a telecommunication company in Malaysia. Various feature selection techniques and machine learning models were applied to sales prediction in order to determine the suitability of a new location. The findings show that the top three feature selection techniques are the prediction step in VSURF, random search, and fuse learner with search strategy; the top three model families are boosting, random forest, and bagging; and the top three classifiers are C5.0, rf, and parRF. Crossing the top feature selection techniques with the top classifiers can produce an AUC of more than 0.75. The highest AUC, 0.8354, was obtained with random search combined with parRF.
Keywords: Information Analytics, Machine Learning Comparison, Retail Site Selection.
Scope of the Article: Machine Learning.
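
The abstract ranks feature selection and classifier combinations by AUC (area under the ROC curve). As a minimal sketch of what that metric measures, the snippet below computes AUC as the fraction of (positive, negative) pairs that a model ranks correctly. The function name `auc`, the labels, and the scores are hypothetical illustrations, not the authors' data or code.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison (the Mann-Whitney U formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Count pairs where the positive example outscores the negative one;
    # tied scores count as half a correctly ranked pair.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = high-sales site, 0 = low-sales site.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]  # assumed classifier outputs
site_auc = auc(labels, scores)
print(f"AUC = {site_auc:.4f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported values above 0.75 indicate that the selected models rank higher-performing sites ahead of lower-performing ones in most pairs.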