4th Interdisciplinary Doctoral School Seminar

A unified framework for testing image captioning models.

Mateusz Bartosiewicz

supervisor: Marcin Iwanowski

My doctoral dissertation investigates methods and algorithms for image captioning. The work aims to automate the generation of image descriptions in the form of sentences that fully describe the scene.

The Unified Testing Framework makes it possible to automate the process of designing image captioning models and provides a repeatable, heterogeneous data environment for testing models at each stage of development. It also provides a unified input data structure that the model can use directly.

The framework is applicable at every stage of the target model's development. It lets the user focus on designing the model architecture rather than on supporting steps such as data loading, data preprocessing, or building an evaluation engine. The framework supports the COCO, Flickr8k, and Flickr30k datasets, with captions available in Polish and English. Users can also reshuffle the splits of these datasets.
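As an illustration of reshuffling dataset splits in a repeatable way, the following minimal sketch reassigns image identifiers to train, validation, and test splits using a fixed seed. The function name `shuffle_splits`, its parameters, and the split proportions are assumptions made for this example and are not taken from the framework's source code.

```python
import random


def shuffle_splits(image_ids, train_frac=0.8, val_frac=0.1, seed=0):
    """Reassign image IDs to train/val/test splits reproducibly.

    Hypothetical helper: names and default fractions are illustrative only.
    """
    # Sort first so the result depends only on the seed, not on input order.
    ids = sorted(image_ids)
    random.Random(seed).shuffle(ids)

    n_train = int(len(ids) * train_frac)
    n_val = int(len(ids) * val_frac)
    return {
        "train": ids[:n_train],
        "val": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],  # remainder goes to the test split
    }


# Placeholder identifiers, not real dataset entries.
splits = shuffle_splits([f"img_{i:04d}" for i in range(100)], seed=42)
```

Fixing the seed keeps the shuffled splits identical between runs, which is what makes results on a reshuffled dataset comparable.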

The Unified Testing Framework is built around six general-purpose modules. It is driven by a configuration file in which the user defines the source dataset for the training and testing stages. In the Data Loader module, data is loaded into the unified structure according to the settings in the configuration file. The Data Processor prepares captions and images for consumption by the model. Users can replace the components of this module entirely to check how the change affects the training results. In the Model module, the user defines the general image captioning model; a sample model is provided in the framework source code. The Model Evaluator is part of the testing stage and evaluates the model with the BLEU-1 to BLEU-4, METEOR, CIDEr, WMD, ROUGE, and SPICE metrics. The metric implementations are placed in a separate module so that their versions stay consistent. Finally, the framework generates a report with all of the metrics listed above, both separately for each caption in the test set and overall for the whole test set. In addition, a general overview of all test results is generated, covering all the metrics that were run.
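The configuration-driven flow and the two-level report described above can be sketched roughly as follows. Everything in this example is hypothetical: the configuration keys, the `Sample` record, and the `build_report` function are illustrative names and do not reflect the framework's actual schema or API. The overall score is computed here as a plain mean of per-caption scores, which is a simplification; some metrics, such as BLEU, are normally computed at corpus level instead.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical configuration: key names are illustrative, not the real schema.
CONFIG = {
    "train": {"dataset": "coco", "language": "en"},
    "test": {"dataset": "flickr8k", "language": "pl"},
    "metrics": ["BLEU_1", "METEOR"],
}


@dataclass
class Sample:
    """Hypothetical unified record produced by a data loader."""
    image_path: str
    captions: list
    language: str
    split: str


def build_report(per_caption_scores):
    """Combine per-caption scores into a per-caption and an overall section.

    per_caption_scores maps an image ID to a {metric_name: score} dict.
    The overall value is a simple mean (an assumption of this sketch).
    """
    metrics = sorted({m for scores in per_caption_scores.values() for m in scores})
    overall = {
        m: mean(s[m] for s in per_caption_scores.values() if m in s)
        for m in metrics
    }
    return {"per_caption": per_caption_scores, "overall": overall}


# Placeholder scores for illustration only, not measured results.
report = build_report({
    "img_0001": {"BLEU_1": 0.5, "METEOR": 0.25},
    "img_0002": {"BLEU_1": 1.0, "METEOR": 0.75},
})
```

Keeping the per-caption scores alongside the aggregate makes it possible to inspect individual failures without rerunning the evaluation.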

The Unified Testing Framework provides a research environment that is easy to run and customize. It can be used, for example, to check how the choice of dataset affects the results of a particular model.
