This is a multimodal AI web app built with Streamlit and powered by Google's Gemini 2.5 models through the Gemini API. It supports both text-to-text and image-to-text generation.
- Text Generation: Enter a prompt and receive intelligent, contextual completions using Gemini's LLM capabilities.
- Image Captioning: Upload an image and get a detailed description using Gemini's vision model.
- Updated UI: Clean and intuitive layout with improved user experience.
- Built with Streamlit for responsive, real-time interaction.
- Powered by Google Generative AI (LLM + Vision multimodal models).
- Python
- Streamlit
- Google Generative AI (Gemini 2.5)
- Pillow (the maintained fork of PIL, the Python Imaging Library)
- Natural Language Text Completion
- Image Understanding / Caption Generation
- AI Demos and Multimodal Prototypes
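
The text-generation and image-captioning flows described above could be wired together roughly as follows. This is a minimal sketch, not the app's actual source: it assumes the `google-generativeai` package, a model name of `gemini-2.5-flash`, and an API key stored in a `GOOGLE_API_KEY` environment variable. The helper names (`build_contents`, `generate`, `load_model`, `render_app`) and the default caption prompt are hypothetical.

```python
import os


def build_contents(prompt, image=None):
    """Build the list sent to the model: text only, or text plus an image."""
    prompt = (prompt or "").strip()
    if image is not None:
        # Hypothetical default instruction used when captioning without a prompt.
        return [prompt or "Describe this image in detail.", image]
    if not prompt:
        raise ValueError("Enter a prompt or upload an image.")
    return [prompt]


def generate(model, prompt, image=None):
    """Call any object exposing generate_content() and return the reply text."""
    response = model.generate_content(build_contents(prompt, image))
    return response.text


def load_model(model_name="gemini-2.5-flash"):
    """Create a Gemini model. The model name and env var are assumptions."""
    # Imported lazily so the helpers above can be used without the SDK installed.
    import google.generativeai as genai  # pip install google-generativeai

    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    return genai.GenerativeModel(model_name)


def render_app():
    """Draw a small Streamlit page with a prompt box and an optional image upload."""
    import streamlit as st  # pip install streamlit
    from PIL import Image  # pip install pillow

    st.title("Gemini text and image demo")
    prompt = st.text_area("Prompt")
    uploaded = st.file_uploader("Image (optional)", type=["png", "jpg", "jpeg"])
    image = Image.open(uploaded) if uploaded else None
    if image is not None:
        st.image(image)
    if st.button("Generate"):
        try:
            st.write(generate(load_model(), prompt, image))
        except ValueError as err:
            st.warning(str(err))
```

To try it, save the block as a script (for example `app.py`, a placeholder name), add a final `render_app()` call, and start it with `streamlit run app.py`. Keeping `generate` independent of the SDK means it can be exercised with a stand-in model object in tests, without network access.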
