Malware Detection by Risky Zone “Signature” Extraction from API Calls String Transformation using (AOSSR) Technique
Gamal A. N. Mohamed1, Norafida Bte Ithnin2
1Gamal A. N. Mohamed, Faculty of Computing, University Technology Malaysia, Skudai, Johor Bahru, Malaysia.
2Department of Computing, Muscat College, Muscat, Sultanate of Oman.
3Dr. Norafida Bte Ithnin, Faculty of Computing, University Technology Malaysia, Skudai, Johor Bahru, Malaysia. 

Manuscript received on January 09, 2020. | Revised Manuscript received on January 22, 2020. | Manuscript published on January 30, 2020. | PP: 2865-2875 | Volume-8 Issue-5, January 2020. | Retrieval Number: E6260018520/2020©BEIESP | DOI: 10.35940/ijrte.E6260.018520

© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Malware detectors built on various malware representation architectures suffer from low detection accuracy for several reasons. In the case of API call graphs, a major issue is building a precise call graph from the information collected about malware samples; in addition, graph matching algorithms face NP-complete problems and are slow because of their computational complexity [1], [2]. Moreover, attempts to increase API call graph detection accuracy by enhancing the graph construction and matching algorithms increase the computational time of both processes, because the many graphs created make it difficult to identify the malware (Elhadi et al., 2013) [3]. Li et al. (2018) further argue that it is difficult to decide whether a piece of software is malicious, because determining whether the behaviour of a program is malicious is itself a complex task [4]. This research proposes an enhancement of the API string-based representation technique through a malware detection framework that adopts a string-based representation of the malware signature. The signature is a compact representation of the malware risky zone, in which only the set of API calls representing the actual malware behaviour is included in the string. The malware strings are represented using the Absolute Order Signature String Representation (AOSSR) technique, which results in better malware detection accuracy. The methodology of this research is composed of three phases. The first phase deals with the conversion of the known malware samples from text format to string format.
The second phase addresses the extraction of the risky zone of an input file, which is the file that needs to be checked, with the help of the file signatures already present in the database. The third phase addresses how to match two strings efficiently. The last experiment, conducted on 515 malware samples, demonstrates that the proposed malware detection architecture achieves 98% accuracy and a false positive rate of 0. An Analysis of Variance (ANOVA) test comparing three malware families shows no significant difference (p > 0.05) in the detection rate of the algorithm across the three families, which indicates that the algorithm's performance is consistent. The results also show that the Receiver Operating Characteristic (ROC) curves display a better True Positive Rate (TPR) for the proposed architecture than for previous attempts, which reflects a significant improvement in TPR.
Keywords: Malware, Detection Techniques, API String, Risky Zone, ANOVA, AOSSR.
Scope of the Article: Software Engineering Techniques and Production Perspectives.
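
The three phases named in the abstract (string conversion, risky zone extraction, string matching) can be illustrated with a minimal Python sketch. This is not the authors' AOSSR implementation, whose details are not given in the abstract. It assumes that each API call name maps to one symbol in call order, that the risky zone is the longest contiguous run shared with a stored signature, and that matching uses the similarity ratio from the standard library's difflib. The function names, the 0.8 threshold, the family label, and the sample API traces are all hypothetical.

```python
from difflib import SequenceMatcher


def encode_trace(api_calls, alphabet):
    """Phase 1 (assumed scheme): map each API call name to one symbol, keeping call order."""
    symbols = []
    for name in api_calls:
        if name not in alphabet:
            # Hypothetical encoding: next unused code point in the private-use range.
            alphabet[name] = chr(0xE000 + len(alphabet))
        symbols.append(alphabet[name])
    return "".join(symbols)


def extract_risky_zone(input_string, signature):
    """Phase 2 (assumed): longest contiguous part of the input that also occurs in the signature."""
    matcher = SequenceMatcher(None, input_string, signature, autojunk=False)
    match = matcher.find_longest_match(0, len(input_string), 0, len(signature))
    return input_string[match.a:match.a + match.size]


def similarity(zone, signature):
    """Phase 3 (assumed): similarity in [0, 1] between the risky zone and a signature."""
    return SequenceMatcher(None, zone, signature, autojunk=False).ratio()


def detect(api_calls, signatures, alphabet, threshold=0.8):
    """Return (best matching label or None, best score) for an API call trace."""
    encoded = encode_trace(api_calls, alphabet)
    best_label, best_score = None, 0.0
    for label, signature in signatures.items():
        zone = extract_risky_zone(encoded, signature)
        score = similarity(zone, signature)
        if score > best_score:
            best_label, best_score = label, score
    if best_score >= threshold:
        return best_label, best_score
    return None, best_score


# Illustrative data only: a made-up signature database with one family.
alphabet = {}
signatures = {
    "demo_family": encode_trace(
        ["OpenProcess", "VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"],
        alphabet,
    )
}
trace = [
    "CreateFileW", "OpenProcess", "VirtualAllocEx",
    "WriteProcessMemory", "CreateRemoteThread", "CloseHandle",
]
label, score = detect(trace, signatures, alphabet)
print(label, score)
```

In this sketch the signature database is a plain dictionary and the same alphabet is shared between signature creation and detection, so that identical API calls always receive the same symbol. A longest-common-substring match is only one possible reading of "risky zone"; a subsequence-based or edit-distance match would tolerate interleaved benign calls better.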