Analysis of the Fuzziness of Image Caption Generation Models due to Data Augmentation Techniques
Kota Akshith Reddy¹, Satish C. J², Polsani Jahnavi³, Chintapalli Teja Naveen⁴, Gangapatnam Sai Ananya⁵
¹Kota Akshith Reddy*, Department of Computer Science, Vellore Institute of Technology, Vellore, Tamil Nadu, India.
²Satish C J, Department of Computer Science, Anna University, Tamil Nadu, India.
³Jahnavi Polsani, Department of Computer Science, Vellore Institute of Technology, Vellore, Tamil Nadu, India.
⁴Teja Naveen Chintapalli, Department of Computer Science, Vellore Institute of Technology, Vellore, Tamil Nadu, India.
⁵Gangapatnam Sai Ananya, Department of Computer Science, Narayana Engineering College, Tamil Nadu, India.
Manuscript received on September 03, 2021. | Revised Manuscript received on September 13, 2021. | Manuscript published on September 30, 2021. | PP: 131-139 | Volume-10 Issue-3, September 2021. | Retrieval Number: 100.1/ijrte.C64390910321 | DOI: 10.35940/ijrte.C6439.0910321
© The Authors. Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Automatic image caption generation is one of the core problems in deep learning. Data augmentation increases the amount of available training data by transforming existing samples with techniques such as flipping, rotating, zooming, and brightening. In this work, we build an image captioning model and test its robustness against all the major types of image augmentation. The results show the fuzziness of the model: given the same image under different augmentation techniques, it produces a different caption each time. We also report how the model's performance changes after these augmentation techniques are applied. The study uses the Flickr8k dataset, with the BLEU score as the evaluation metric for the image captioning model.
Keywords: Automatic Image, Data Augmentation, Flickr8k dataset, BLEU score.
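
The evaluation idea described in the abstract can be illustrated with a minimal sketch: augment an image, caption each augmented copy, and score every caption against a reference with BLEU. The code below is not the authors' implementation. The helper names (flip_horizontal, rotate_90, brighten, bleu), the toy image, and the caption strings are all assumptions made for illustration; no captioning model is included, and the BLEU function is a simplified, unsmoothed sentence-level variant.

```python
import math
from collections import Counter

def flip_horizontal(img):
    """Mirror each row of a 2-D grayscale image (a list of pixel rows)."""
    return [row[::-1] for row in img]

def rotate_90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def brighten(img, factor):
    """Scale pixel intensities and clip them to the 0-255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights."""
    cand = candidate.lower().split()
    refs = [r.lower().split() for r in references]
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in refs:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, one zero precision zeroes the score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty uses the reference length closest to the candidate length.
    ref_len = min((len(r) for r in refs), key=lambda n_ref: (abs(n_ref - len(cand)), n_ref))
    bp = 1.0 if len(cand) > ref_len else math.exp(1 - ref_len / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)

# Toy 2x2 grayscale image and its augmented copies.
image = [[10, 20], [30, 200]]
augmented = {
    "flipped": flip_horizontal(image),
    "rotated": rotate_90(image),
    "brightened": brighten(image, 1.5),
}

# Hypothetical captions standing in for model output on each version of the image.
reference = ["a dog runs on the grass"]
captions = {
    "original": "a dog runs on the grass",
    "flipped": "a dog runs on grass",
    "brightened": "a cat sits on a sofa",
}
scores = {name: bleu(text, reference, max_n=1) for name, text in captions.items()}
```

In this sketch, comparing the entries of scores shows how a caption that drifts from the reference after augmentation lowers the BLEU score. A real experiment would replace the hypothetical captions with the output of a trained captioning model and would typically rely on an established BLEU implementation with smoothing.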
