AI Mineral Tech
AUTOMATIC ANALYSIS OF
ROCK THIN SECTIONS
AUTOMATIC IMAGE DESCRIPTION
The goal of automatic image description is to generate a coherent and fluent sentence that accurately describes the content of the image.

Feature extraction from an andesite image

CNN EXPLAINER
Convolutional Neural Network
A CNN is a type of neural network specifically designed to process data that has a grid structure, such as images.
Here it is responsible for extracting features from the input image, such as the locations, sizes, and colors of objects.
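
As a rough illustration of how a convolutional layer extracts a feature, the sketch below slides a small kernel over a toy image and records the response at each position. The 5x5 image, the hand-picked vertical-edge kernel, and the function name conv2d are assumptions made for this example; in a trained CNN the kernel values are learned from data rather than chosen by hand.

```python
def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation (the operation CNN layers compute)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            # Weighted sum of the image patch under the kernel
            out[i][j] = sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(kh)
                for b in range(kw)
            )
    return out

# Toy grayscale image: dark columns on the left, bright columns on the right
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]

# Hypothetical vertical-edge kernel: responds where brightness rises left to right
kernel = [[-1.0, 0.0, 1.0] for _ in range(3)]

feature_map = conv2d(image, kernel)
for row in feature_map:
    print(row)
```

The feature map is large where the patch straddles the dark-to-bright boundary and zero over the uniform bright region, which is the sense in which a kernel "detects" a feature and its location.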
TRANSFORMER
The Transformer is a neural network architecture designed primarily for natural language processing tasks. It was first proposed in the 2017 article "Attention Is All You Need" by Vaswani et al.
Instead of relying on recurrence or convolutions, the Transformer uses attention mechanisms to process input and output sequences in a parallel and efficient manner.
The Transformer has become the basis for many cutting-edge models in the field of natural language processing, including BERT and GPT.
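
The attention mechanism mentioned above can be sketched as scaled dot-product attention, the core operation described in the Vaswani et al. article. This is a minimal single-head version in plain Python, with no learned projections and no masking; the toy queries, keys, and values are assumptions chosen only to make the weights easy to inspect.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k)
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors
        out = [
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Toy example: two positions with 2-dimensional vectors (illustrative values)
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0]]
queries = [[4.0, 0.0], [0.0, 4.0]]

outputs, weights = attention(queries, keys, values)
for w in weights:
    print([round(x, 3) for x in w])
```

Each query is handled independently of the others, which is why all positions of a sequence can be processed in parallel rather than one step at a time.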

This component takes image features and their reference descriptions as input and learns to generate a description for a given image.

EVALUATION OF THE MODEL
Generated descriptions are usually evaluated with metrics such as BLEU, METEOR, ROUGE, and CIDEr.
In this model, the BLEU (Bilingual Evaluation Understudy) metric is used to measure the n-gram overlap between the automatically generated description and the reference description.
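
To make the BLEU comparison concrete, the following is a minimal sketch of a sentence-level BLEU score with a single reference: clipped n-gram precisions combined by a geometric mean and multiplied by a brevity penalty. It is not the evaluation code used for this model. The function names, the choice of max_n=2 for such short sentences, and the two example descriptions are assumptions made for illustration. No smoothing is applied, so any n-gram order with zero overlap gives a score of 0.

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU against a single reference."""
    cand = candidate.split()
    ref = reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        # Clip each candidate n-gram count by its count in the reference
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or overlap == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize candidates shorter than the reference
    if len(cand) > len(ref):
        bp = 1.0
    else:
        bp = math.exp(1.0 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)

# Hypothetical descriptions, invented for this example
reference = "plagioclase phenocrysts in a fine grained groundmass"
candidate = "plagioclase phenocrysts in a glassy groundmass"

score = bleu(candidate, reference, max_n=2)
print(round(score, 4))
```

A score of 1 means the candidate matches the reference exactly at every n-gram order considered; lower scores reflect missing or extra n-grams and candidates that are too short.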