A comprehensive multimodal recommender system for fashion ecommerce

Mikołaj Wieczorek

supervisor: Grzegorz Pastuszak



This Industrial PhD thesis, conducted in cooperation with Synerise S.A., focuses on creating a comprehensive multimodal recommender system for the fashion industry.


The system relies heavily on Computer Vision methods to encode and ‘understand’ images, as clothes are best assessed by their look. However, appearance alone may not suffice; the system therefore needs additional data about the user, such as transaction history and clicked products, and textual descriptions of the products may also be helpful. The combination of images, text and behavioural data makes the system multimodal and best suited to serve personalised recommendations to users.


There are two main research challenges: 1) designing a fusion mechanism to combine information from visual appearance, textual data and user behaviour; 2) preparing the data and a training schema that teach a model a notion of ‘fashionability’ and ‘compatibility’ for a population or a single user.


During the PhD course, two papers were published with state-of-the-art results in the fashion retrieval task. Some of the models created so far are currently being prepared for use by Synerise.


The main goal is to create a comprehensive, production-ready recommender system for fashion retailers. Moreover, each submodule of the system is expected to perform at a state-of-the-art level.