Raúl Gómez blog

Jun 25, 2022 Visual Styles RecSys: Learning Users’ Preferred Visual Styles in an Image Marketplace
A model that learns users’ visual style preferences across the projects they work on, with the aim of personalising the content served at Shutterstock. Presented as an oral in the ACM RecSys ’22 industrial track.
Oct 8, 2020 PhD Thesis Defence
I successfully defended my PhD on October 8, 2020 and was awarded an Excellent Cum Laude. This post makes the thesis PDF, the presentation slides, and a video of the presentation available.
Aug 20, 2020 Retrieval Guided Unsupervised Multi-Domain Image to Image Translation
We propose using an image retrieval system to boost the performance of an image to image translation system, experimenting with a dataset of face images.
Jun 3, 2020 Location Sensitive Image Retrieval and Tagging
We design a model to retrieve images related to a query hashtag and near a given location, and to tag images by exploiting their location information.
Jan 23, 2020 Face Images Retrieval with Attributes Modifications
Design of an image retrieval system able to retrieve images similar to a query image but with some modified attributes. In this case, I work with face images and attributes describing face characteristics.
Oct 9, 2019 Exploring Hate Speech Detection in Multimodal Publications
We target the problem of hate speech detection in multimodal publications formed by a text and an image. We gather and annotate a large-scale dataset from Twitter, MMHS150K, and propose different models that jointly analyze textual and visual information for hate speech detection.
May 14, 2019 Selective Text Style Transfer
A selective style transfer model is trained to learn text styles and transfer them to text instances found in images. Experiments in different text domains (scene text, machine printed text and handwritten text) show the potential of text style transfer in different applications.
Apr 3, 2019 Understanding Ranking Loss, Contrastive Loss, Margin Loss, Triplet Loss, Hinge Loss and all those confusing names
A review of different variants and names of Ranking Losses, Siamese Nets, Triplet Nets and their application in multi-modal self-supervised learning.
Jan 14, 2019 A CNN can learn Miró's surrealism: Joan Miró Neural Style Transfer & DeepDream
Magenta Neural Style Transfer is trained to transfer the style of different paintings by Joan Miró. DeepDream is applied on models trained with #joanmiró data to visualize which visual features a CNN learns from those posts.
Oct 10, 2018 Barcelona DeepDream
Using the Google DeepDream algorithm on models trained with #Barcelona Instagram data to visualize what the users (and the CNN) highlight about the city.
Aug 2, 2018 Learning from #Barcelona what Locals and Tourists post about its Neighbourhoods
We learn relations between words, images and Barcelona neighbourhoods from Instagram data. We split the dataset by language and analyze what locals and tourists post about the different Barcelona neighbourhoods.
Aug 1, 2018 Learning to Learn from Web Data
A performance comparison of different text embeddings in an image-by-text retrieval task. A multimodal retrieval pipeline is trained in a self-supervised way with Web and Social Media data, and the performance of Word2Vec, GloVe, Doc2Vec, FastText and LDA on different datasets is reported.
Aug 1, 2018 The InstaCities1M Dataset
A dataset of social media images with associated text, formed by Instagram images each associated with one of the 10 most populated English-speaking cities.
May 23, 2018 Understanding Categorical Cross-Entropy Loss, Binary Cross-Entropy Loss, Softmax Loss, Logistic Loss, Focal Loss and all those confusing names
A review of different variants and names of Cross-Entropy Loss, analyzing its different applications, its gradients and the Cross-Entropy Loss layers in deep learning frameworks.
Feb 11, 2018 Presenting Tourism Applications of my Research at ForumTurisTIC
Applying algorithms that learn from images and associated text to Barcelona Instagram images leads to interesting results for the tourism industry, which I presented at ForumTurisTIC.
Jan 12, 2018 What Do People Think about Barcelona?
A joint image and text embedding is trained using Instagram data related to Barcelona. It is shown how the embedding can be used to perform interesting social or commercial analysis, which can be extrapolated to other topics.
Jan 8, 2018 FCN for Face and Hair Segmentation
Training a fully convolutional network to perform pixel-level segmentation of faces and hair.
Sep 14, 2017 Data Augmentation Benchmarking on CNN Training
Benchmarking of different data augmentation techniques to train a CNN for image classification. Does data augmentation help to get a model that generalizes better?
Aug 16, 2017 SetaMind: Building an Android App that Identifies the Species of a Mushroom
SetaMind is a simple Android application: you take a picture of a mushroom with your phone and the app identifies the species and provides information about it. To identify the species it uses a classification CNN that runs locally on the phone.
Jul 26, 2017 About my Participation in the WebVision Challenge
An ImageNet-like competition, but trained with noisy data collected from the web. Notes about training a GoogLeNet from scratch on a cluster and about how to combine images, noisy labels and associated text to train a classifier.
Jul 19, 2017 Inferring Ingredients from Food Images
An LDA and a CNN are used to embed the ingredient lists and the food images, respectively, in a topic space. The CNN can predict topic distributions from food images, and from the topic distribution we predict the ingredients.
Jun 30, 2017 Learning Image Topics from Instagram to Build an Image Retrieval System
Learning a joint embedding of text and images using InstaCities1M. An LDA and a CNN are used to embed text and images, respectively, in a topic space. Then a retrieval-by-text system is built and tested.
Jun 25, 2017 Using Instagram Data to Learn a City Classifier
Construction of InstaCities1M, a dataset of Instagram images each associated with a city, and training of a CNN that learns to classify images between the different cities. A simple experiment to show how social media data can be used to learn.