An Alternative Fashion to Automate the Appropriateness of ALT-Text using Microsoft Computer Vision API
Karamjeet Singh Gulati1, Anupreet Sihra2, Veena Khandelwal3, Sergej Dogadov4
1Karamjeet Singh Gulati, SRM Institute of Science & Technology, Delhi NCR Campus, Ghaziabad (U.P), India.
2Anupreet Sihra, Banasthali University, Rajasthan, India.
3Dr. Veena Khandelwal, SRM Institute of Science & Technology, Delhi NCR Campus, Ghaziabad (U.P), India.
4Sergej Dogadov, Technische Universität, Berlin, Germany. 

Manuscript received on 25 October 2022 | Revised Manuscript received on 31 October 2022 | Manuscript Accepted on 15 November 2022 | Manuscript published on 30 November 2022 | PP: 57-63 | Volume-11 Issue-4, November 2022 | Retrieval Number: 100.1/ijrte.D73321111422 | DOI: 10.35940/ijrte.D7332.1111422

© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (

Abstract: Designing and releasing software that contains images takes considerable time because ALT-text attributes must be written for every image embedded in the application. This paper automates the task of writing ALT-text attributes in HTML, which is especially useful when the number of images is large, using a Python PIP package and the Microsoft Computer Vision API. This saves developers considerable time and effort by largely removing the need to caption images manually. The main challenge is the quality of the machine-generated annotations relative to human-generated annotations. To study the appropriateness of the captions delivered by the API, a blend of human and machine assessment was used. We observed a high similarity between human- and machine-generated annotations, as measured by individual and cumulative BLEU scores. A second metric is the confidence score, with a percentage mean of 0.5. We also measured the time taken per caption, which is 1.6 seconds per image; captioning 200 images took 6.01 minutes.
Keywords: ALT-text, Beautiful Soup, Image Captioning Automation, Microsoft Computer Vision API, PIP Package.
Scope of the Article: Computer Vision
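
As a rough illustration of the workflow the abstract describes (find images in HTML that lack ALT-text and fill it in from a captioning service), the following minimal sketch may help. It is not the authors' package. To stay self-contained it uses Python's standard-library `html.parser` in place of Beautiful Soup, and a hypothetical `stub_caption` function in place of a call to the Microsoft Computer Vision API. The names `AltTextFiller`, `fill_alt_text`, and `stub_caption` are assumptions made for this example only.

```python
from html import escape
from html.parser import HTMLParser
from typing import Callable, List


class AltTextFiller(HTMLParser):
    """Re-emit HTML as parsed, adding alt text to <img> tags that have none."""

    def __init__(self, caption_for: Callable[[str], str]) -> None:
        # convert_charrefs=False keeps entities such as &amp; as written.
        super().__init__(convert_charrefs=False)
        self.caption_for = caption_for
        self.out: List[str] = []

    def _render(self, tag, attrs) -> str:
        raw = self.get_starttag_text()
        attr_map = dict(attrs)
        # Leave non-image tags, images that already carry an alt attribute,
        # and images without a usable src exactly as they were written.
        if tag != "img" or "alt" in attr_map or not attr_map.get("src"):
            return raw
        new_attrs = list(attrs) + [("alt", self.caption_for(attr_map["src"]))]
        rendered = " ".join(
            k if v is None else f'{k}="{escape(v, quote=True)}"'
            for k, v in new_attrs
        )
        closer = " />" if raw.rstrip().endswith("/>") else ">"
        return f"<img {rendered}{closer}"

    def handle_starttag(self, tag, attrs):
        self.out.append(self._render(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        self.out.append(self._render(tag, attrs))

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def fill_alt_text(html: str, caption_for: Callable[[str], str]) -> str:
    """Return html with an alt attribute added to every <img> lacking one."""
    parser = AltTextFiller(caption_for)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


def stub_caption(src: str) -> str:
    """Hypothetical stand-in for a captioning API call: names the file."""
    name = src.rsplit("/", 1)[-1].rsplit(".", 1)[0]
    return "image of " + name.replace("-", " ").replace("_", " ")


if __name__ == "__main__":
    page = '<p>Hi &amp; bye</p><img src="img/red-car.png"><img src="b.png" alt="A boat">'
    print(fill_alt_text(page, stub_caption))
```

In a real pipeline, `stub_caption` would be replaced by a function that sends the image to the captioning service and returns its description. Passing the captioner in as a parameter keeps the HTML rewriting testable without network access. Images that already have an `alt` attribute, including an intentionally empty one, are left untouched in this sketch.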