Detecting and Captioning Images Using Deep Neural Networks and Flask

Authors

  • Mohammed Saif, Scholar, Department of Information Technology Engineering, SKN Sinhgad Institute of Technology and Science, Lonavala, India
  • Vaibhav Mohurle, Scholar, Department of Information Technology Engineering, SKN Sinhgad Institute of Technology and Science, Lonavala, India
  • Kajal Dhumale, Scholar, Department of Information Technology Engineering, SKN Sinhgad Institute of Technology and Science, Lonavala, India
  • Ajay Sonawane, Professor, Department of Information Technology Engineering, SKN Sinhgad Institute of Technology and Science, Lonavala, India

Keywords:

RNN, CNN, LSTM, API, Flask, MSCOCO, NLP Model, Flask REST API, Transfer Learning, VGG Model, TensorFlow, Keras

Abstract

Automatically describing what is seen is one of the most important functions of the human visual system. An application that automatically captions the scenes around its user and then converts the caption to a plain message has numerous benefits. In this study, we present a model based on CNN-LSTM neural networks that recognizes objects in photos and automatically generates descriptions of them. Object detection is performed with multiple pre-trained models based on transfer learning, and captions are generated using a CNN and an LSTM. The model therefore performs two tasks. The first is to recognize objects in the image; the second is to generate a caption describing them.
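The caption-generation step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how words are selected at inference time, so greedy decoding is assumed here. The names greedy_caption and toy_scorer, the <start>/<end> tokens, and the tiny vocabulary are hypothetical; toy_scorer merely stands in for a trained CNN encoder and LSTM decoder and is not the authors' model.

```python
from typing import Callable, Dict, List, Sequence

START, END = "<start>", "<end>"  # hypothetical sentinel tokens

# Hypothetical decoder step: given image features and the words generated
# so far, return a score for every candidate next word.
NextWordScorer = Callable[[Sequence[float], List[str]], Dict[str, float]]


def greedy_caption(
    features: Sequence[float],
    next_word_scores: NextWordScorer,
    max_len: int = 20,
) -> str:
    """Build a caption one word at a time, always taking the top-scoring word."""
    words = [START]
    for _ in range(max_len):
        scores = next_word_scores(features, words)
        best = max(scores, key=lambda w: scores[w])
        if best == END:
            break  # the decoder signalled the end of the caption
        words.append(best)
    return " ".join(words[1:])  # drop the start token


def toy_scorer(features: Sequence[float], words: List[str]) -> Dict[str, float]:
    """Stand-in for a trained LSTM decoder: follows a fixed word chain.

    A real decoder would condition on `features` (the CNN output);
    this toy version ignores them.
    """
    chain = {START: "a", "a": "dog", "dog": "runs", "runs": END}
    target = chain[words[-1]]
    return {w: (1.0 if w == target else 0.0) for w in ["a", "dog", "runs", END]}


if __name__ == "__main__":
    print(greedy_caption([0.1, 0.2], toy_scorer))
```

In a real system, the features would come from a pre-trained CNN such as a VGG model, and the scorer would be one step of the LSTM followed by a softmax over the vocabulary. The max_len cap guards against a decoder that never emits the end token.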

Published

21-11-2021

Section

Articles

How to Cite

[1]
M. Saif, V. Mohurle, K. Dhumale, and A. Sonawane, “Detecting and Captioning Images Using Deep Neural Networks and Flask”, IJRAMT, vol. 2, no. 11, pp. 66–68, Nov. 2021, Accessed: Jan. 22, 2025. [Online]. Available: https://journals.ijramt.com/index.php/ijramt/article/view/1516