Yannis Avrithis

Working on computer vision and machine learning

News

Apr 15, 2025

Preprint released at arXiv: Deep Multimodal Fusion

Feb 17, 2025

I serve as Area Chair of NeurIPS 2025

Dec 19, 2024

Paper accepted at PR: Unlabeled Data in Few-Shot Learning

Dec 05, 2024

FreeDom code released

Dec 04, 2024

Preprint released at arXiv: Composed Image Retrieval

Oct 28, 2024

Paper accepted at WACV 2025: Composed Image Retrieval

Oct 23, 2024

Paper accepted at TASL: Power Set Mixing

Sep 23, 2024

Felipe Torres Figueroa defends his PhD thesis on Interpretable Visual Recognition

Sep 10, 2024

Bill Psomas defends his PhD thesis on Visual and Multimodal Representations

Aug 09, 2024

Opti-CAM code released

Jul 29, 2024

Paper accepted at CVIU: Saliency Maps for Interpretability

Jul 01, 2024

Shashanka Venkataramanan defends his PhD thesis on Data Augmentation

Jun 11, 2024

Deniz Engin defends her PhD thesis on Video Question Answering

May 24, 2024

Preprint released at arXiv: Composed Retrieval for Remote Sensing

May 13, 2024

UT-KD code released

May 10, 2024

Preprint released at arXiv: Unsupervised Domain Adaptation

May 06, 2024

Our Walking Tours paper receives an Outstanding Paper Honorable Mention at ICLR 2024

May 01, 2024

I serve as Area Chair of BMVC 2024

Apr 29, 2024

DoRA code released

Apr 23, 2024

Preprint released at arXiv: Interpretable Gradients

Apr 23, 2024

Preprint released at arXiv: Cross Attention Stream

Apr 07, 2024

Paper accepted at XAI4CV/CVPR 2024: Cross Attention Stream

Apr 01, 2024

Preprint released at arXiv: Revisiting Google Landmarks

Mar 29, 2024

RGLDv2-clean dataset released

Mar 27, 2024

I serve as Area Chair of NeurIPS 2024

Mar 15, 2024

Paper accepted at IGARSS 2024: Composed Retrieval for Remote Sensing

Feb 26, 2024

Paper accepted at CVPR 2024: Revisiting Google Landmarks

Jan 24, 2024

AM code released

Jan 19, 2024

I am a Program Committee member of ESSAI 2024

Jan 18, 2024

WTours dataset released

Full timeline

About me

I am a researcher in computer vision and machine learning. Between 2022 and 2023 I was a Principal Investigator at the Institute of Advanced Research on Artificial Intelligence (IARAI). Between 2021 and 2022 I was a Research Director at the Information Management Systems Institute (IMSI) of Athena Research Center. I enjoy working on learning visual representations from data, with as little supervision as possible. My recent work focuses on different learning settings, including metric learning, incremental learning and few-shot learning; multimodal learning, including vision, language and 3D models; interpretability; and video question answering.

Between 2016 and 2021 I was a research scientist in the LinkMedia team of Inria Rennes-Bretagne Atlantique, and I taught Deep Learning for Vision at University of Rennes 1. My work focused on exploring the manifold structure of data and, apart from image retrieval, using it for unsupervised, semi-supervised and few-shot learning. I also worked on adversarial examples and on investigating the sparsity of convolutional activations, applied to spatial matching and unsupervised object discovery. In 2020, I was awarded the Habilitation à Diriger des Recherches (HDR) qualification from University of Rennes 1.

In 2015 I was at the Laboratory of Algebraic and Geometric Algorithms (ΕρΓΑ) of the National and Kapodistrian University of Athens (NKUA), where I worked on large-scale clustering and nearest neighbor search with Ioannis Emiris. An achievement of this period was Web-Scale Image Clustering.

Between 2001 and 2015 I was at the Image, Video and Multimedia Systems Laboratory (IVML) of the National Technical University of Athens (NTUA). In the latter part of this period, from 2008, I led the Image and Video Analysis (IVA) team. We worked on a diverse set of problems including local feature and salient region detection, spatial and spatiotemporal visual representation and matching, object recognition and tracking, action recognition in video, scene classification, and image/video indexing, retrieval and summarization.

A large part of this work focused on local feature/descriptor representations. In this context, we delivered repeatable feature detection, extremely fast spatial matching, geometry indexing, multi-view and single-view feature selection, approximate nearest neighbor search for large-scale clustering and vocabulary construction, and mining of 3D scenes from millions of images. Examples of this work are Hough Pyramid Matching (HPM), the Aggregated Selective Match Kernel (ASMK), and Locally Optimized Product Quantization (LOPQ). In 2017, Yahoo! Research chose LOPQ, combined with a CNN image representation, to index its entire Flickr collection and provide similar image search over it. A flagship product of this period was our application Visual Image Retrieval and Localization (VIRaL).

Before 2008, I worked on problems related to object detection and image understanding. These include regions of interest for generic object detection, semantic segmentation, integrated segmentation and region labeling, spatio-temporal object segmentation and tracking, image classification, and the use of visual context and prior knowledge.

While at NTUA, I had the opportunity to collaborate with a large number of researchers across Europe by participating in the Networks of Excellence Muscle and K-Space and in Integrated Projects like aceMedia and WeKnowIt. I also served as Program Chair or General Chair of workshops and conferences in the field of multimedia, such as WIAMIS, CBMI, MMM and CIVR. For several years I taught Signals and Systems and Image and Video Analysis.

Between 1996 and 2001 I worked on my Ph.D. at NTUA with Stefanos Kollias, studying visual representations for video sequence analysis. I investigated accurate spatio-temporal segmentation by fusing color, motion and depth, as well as a region-based representation for retrieval and summarization in the form of key-frames. I designed an affine-invariant representation of object contours for shape-based object classification and retrieval. Finally, I studied temporal segmentation and parsing of broadcast news based on human face detection.

In 1993-1994 I completed an M.Sc. with Distinction in Communications and Signal Processing at Imperial College, University of London. As part of my Master's thesis, I worked on communications and investigated the capacity of a cellular CDMA system with Athanassios Manikas.

Between 1988 and 1993 I studied Electrical and Computer Engineering at NTUA. As part of my Diploma thesis, I worked on analog electronics and developed a hardware implementation of a fuzzy logic processor with Yannis Tsividis. Before that, my first exposure to computer vision and machine learning was the study of an invariant representation for optical character recognition with Anastasios Delopoulos, leading to my first conference publication before my Diploma.

Complete resume

Recent publications

Composed Image Retrieval for Training-Free Domain Conversion

N. Efthymiadis, B. Psomas, Z. Laskar, K. Karantzalos, Y. Avrithis, O. Chum, G. Tolias

WACV 2025

Exploiting Unlabeled Data in Few-Shot Learning

M. Lazarou, T. Stathaki, Y. Avrithis

PR, 2025

Adaptive Manifold for Imbalanced Transductive Few-Shot Learning

M. Lazarou, Y. Avrithis, T. Stathaki

WACV 2024

A Learning Paradigm for Interpretable Gradients

F. Torres Figueroa, H. Zhang, R. Sicre, Y. Avrithis, S. Ayache

VISAPP 2024 Oral

Is Imagenet Worth 1 Video? Learning Strong Image Encoders from 1 Long Unlabelled Video

S. Venkataramanan, M.N. Rizve, J. Carreira, Y.M. Asano, Y. Avrithis

ICLR 2024 Oral

Composed Image Retrieval for Remote Sensing

B. Psomas, I. Kakogeorgiou, N. Efthymiadis, G. Tolias, O. Chum, Y. Avrithis, K. Karantzalos

IGARSS 2024 Oral

CA-Stream: Attention-Based Pooling for Interpretable Image Recognition

F. Torres Figueroa, H. Zhang, R. Sicre, S. Ayache, Y. Avrithis

XAI4CV/CVPR 2024

On Train-Test Class Overlap and Detection for Image Retrieval

C.H. Song, J. Yoon, T. Hwang, S. Choi, Y.H. Gu, Y. Avrithis

CVPR 2024

PowMix: a Versatile Regularizer for Multimodal Sentiment Analysis

E. Georgiou, Y. Avrithis, A. Potamianos

TASL, 2024

Opti-CAM: Optimizing Saliency Maps for Interpretability

H. Zhang, F. Torres, R. Sicre, Y. Avrithis, S. Ayache

CVIU, 2024

Boosting Vision Transformers for Image Retrieval

C.H. Song, J. Yoon, S. Choi, Y. Avrithis

WACV 2023

Generating Part-Aware Editable 3D Shapes without 3D Supervision

K. Tertikas, D. Paschalidou, B. Pan, J.J. Park, M.A. Uy, I. Emiris, Y. Avrithis, L. Guibas

CVPR 2023

Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts

D. Engin, Y. Avrithis

CLVL/ICCV 2023

Keep It SimPool: Who Said Supervised Transformers Suffer from Attention Deficit?

B. Psomas, I. Kakogeorgiou, K. Karantzalos, Y. Avrithis

ICCV 2023

Adaptive Anchors Label Propagation for Transductive Few-Shot Learning

M. Lazarou, Y. Avrithis, G. Ren, T. Stathaki

ICIP 2023

Embedding Space Interpolation Beyond Mini-Batch, Beyond Pairs and Beyond Examples

S. Venkataramanan, E. Kijak, L. Amsaleg, Y. Avrithis

NeurIPS 2023

Deep Neural Network Attacks and Defense: the Case of Image Classification

H. Zhang, T. Furon, L. Amsaleg, Y. Avrithis

Wiley, 2022

All the Attention You Need: Global-Local, Spatial-Channel Attention for Image Retrieval

C.H. Song, H.J. Han, Y. Avrithis

WACV 2022

Tensor Feature Hallucination for Few-Shot Learning

M. Lazarou, T. Stathaki, Y. Avrithis

WACV 2022

It Takes Two to Tango: Mixup for Deep Metric Learning

S. Venkataramanan, B. Psomas, E. Kijak, L. Amsaleg, K. Karantzalos, Y. Avrithis

ICLR 2022

AlignMixup: Improving Representations by Interpolating Aligned Features

S. Venkataramanan, E. Kijak, L. Amsaleg, Y. Avrithis

CVPR 2022

What to Hide from Your Students: Attention-Guided Masked Image Modeling

I. Kakogeorgiou, S. Gidaris, B. Psomas, Y. Avrithis, A. Bursuc, K. Karantzalos, N. Komodakis

ECCV 2022

Asymmetric Metric Learning for Knowledge Transfer

M. Budnik, Y. Avrithis

CVPR 2021

Patch Replacement: a Transformation-Based Method to Improve Robustness Against Adversarial Attacks

H. Zhang, Y. Avrithis, T. Furon, L. Amsaleg

TAI/ACM-MM 2021

On the Hidden Treasure of Dialog in Video Question Answering

D. Engin, N.Q.K. Duong, F. Schnitzler, Y. Avrithis

ICCV 2021

Iterative Label Cleaning for Transductive and Semi-Supervised Few-Shot Learning

M. Lazarou, T. Stathaki, Y. Avrithis

ICCV 2021

Walking on the Edge: Fast, Low-Distortion Adversarial Examples

H. Zhang, Y. Avrithis, T. Furon, L. Amsaleg

TIFS, 2021

Training Object Detectors from Few Weakly-Labeled and Many Unlabeled Images

Z. Yang, M. Shi, C. Xu, V. Ferrari, Y. Avrithis

PR, 2021

Graph Convolutional Networks for Learning with Few Clean and Many Noisy Labels

A. Iscen, G. Tolias, Y. Avrithis, O. Chum, C. Schmid

ECCV 2020

Local Propagation for Few-Shot Learning

Y. Lifchitz, Y. Avrithis, S. Picard

ICPR 2020

Few-Shot Few-Shot Learning and the Role of Spatial Attention

Y. Lifchitz, Y. Avrithis, S. Picard

ICPR 2020

Rethinking Deep Active Learning: Using Unlabeled Data at Model Training

O. Siméoni, M. Budnik, Y. Avrithis, G. Gravier

ICPR 2020

Smooth Adversarial Examples

H. Zhang, Y. Avrithis, T. Furon, L. Amsaleg

JIS, 2020

Exploring and Learning from Visual Data

Y. Avrithis

UR1, 2020 Talk

Label Propagation for Deep Semi-Supervised Learning

A. Iscen, G. Tolias, Y. Avrithis, O. Chum

CVPR 2019

Dense Classification and Implanting for Few-Shot Learning

Y. Lifchitz, Y. Avrithis, S. Picard, A. Bursuc

CVPR 2019

Local Features and Visual Words Emerge in Activations

O. Siméoni, Y. Avrithis, O. Chum

CVPR 2019

Revisiting the Medial Axis for Planar Shape Decomposition

N. Papanelopoulos, Y. Avrithis, S. Kollias

CVIU, 2019

Graph-Based Particular Object Discovery

O. Siméoni, A. Iscen, G. Tolias, Y. Avrithis, O. Chum

MVA, 2019

Unsupervised Object Discovery for Instance Recognition

O. Siméoni, A. Iscen, G. Tolias, Y. Avrithis, O. Chum

WACV 2018

Mining on Manifolds: Metric Learning without Labels

A. Iscen, G. Tolias, Y. Avrithis, O. Chum

CVPR 2018

Revisiting Oxford And Paris: Large-Scale Image Retrieval Benchmarking

F. Radenović, A. Iscen, G. Tolias, Y. Avrithis, O. Chum

CVPR 2018

Fast Spectral Ranking for Similarity Search

A. Iscen, Y. Avrithis, G. Tolias, T. Furon, O. Chum

CVPR 2018

Hybrid Diffusion: Spectral-Temporal Graph Filtering for Manifold Ranking

A. Iscen, Y. Avrithis, G. Tolias, T. Furon, O. Chum

ACCV 2018

Panorama to Panorama Matching for Location Recognition

A. Iscen, G. Tolias, Y. Avrithis, T. Furon, O. Chum

ICMR 2017

Efficient Diffusion on Region Manifolds: Recovering Small Objects with Compact CNN Representations

A. Iscen, G. Tolias, Y. Avrithis, T. Furon, O. Chum

CVPR 2017

Unsupervised Part Learning for Visual Recognition

R. Sicre, Y. Avrithis, E. Kijak, F. Jurie

CVPR 2017

Automatic Discovery of Discriminative Parts as a Quadratic Assignment Problem

R. Sicre, J. Rabin, Y. Avrithis, T. Furon, F. Jurie, E. Kijak

CEFRL/ICCV 2017

High-Dimensional Visual Similarity Search: k-d Generalized Randomized Forests

Y. Avrithis, I. Emiris, G. Samaras

CGI 2016

$\alpha$-Shapes for Local Feature Detection

C. Varytimidis, K. Rapantzikos, Y. Avrithis, S. Kollias

PR, 2016

Image Search with Selective Match Kernels: Aggregation Across Single and Multiple Images

G. Tolias, Y. Avrithis, H. Jégou

IJCV, 2016

Early Burst Detection for Memory-Efficient Image Retrieval

M. Shi, Y. Avrithis, H. Jégou

CVPR 2015

Planar Shape Decomposition Made Simple

N. Papanelopoulos, Y. Avrithis

BMVC 2015

Dithering-Based Sampling and Weighted $\alpha$-Shapes for Local Feature Detection

C. Varytimidis, K. Rapantzikos, Y. Avrithis, S. Kollias

CVA, 2015

Web-Scale Image Clustering Revisited

Y. Avrithis, Y. Kalantidis, E. Anagnostopoulos, I. Emiris

ICCV 2015 Oral

Locally Optimized Product Quantization for Approximate Nearest Neighbor Search

Y. Kalantidis, Y. Avrithis

CVPR 2014

Improving Local Features by Dithering-Based Image Sampling

C. Varytimidis, K. Rapantzikos, Y. Avrithis, S. Kollias

ACCV 2014

Hough Pyramid Matching: Speeded-Up Geometry Re-Ranking for Large Scale Image Retrieval

Y. Avrithis, G. Tolias

IJCV, 2014

Towards Large-Scale Geometry Indexing by Feature Selection

G. Tolias, Y. Kalantidis, Y. Avrithis, S. Kollias

CVIU, 2014

Quantize and Conquer: a Dimensionality-Recursive Solution to Clustering, Vector Quantization, and Image Retrieval

Y. Avrithis

ICCV 2013

To Aggregate or Not to Aggregate: Selective Match Kernels for Image Search

G. Tolias, Y. Avrithis, H. Jégou

ICCV 2013 Oral

Multimodal Saliency and Fusion for Movie Summarization Based on Aural, Visual, and Textual Attention

G. Evangelopoulos, A. Zlatintsi, A. Potamianos, P. Maragos, K. Rapantzikos, G. Skoumas, Y. Avrithis

TMM, 2013

W$\alpha$Sh: Weighted $\alpha$-Shapes for Local Feature Detection

C. Varytimidis, K. Rapantzikos, Y. Avrithis

ECCV 2012

Approximate Gaussian Mixtures for Large Scale Vocabularies

Y. Avrithis, Y. Kalantidis

ECCV 2012

SymCity: Feature Selection by Symmetry for Large Scale Image Retrieval

G. Tolias, Y. Kalantidis, Y. Avrithis

ACM-MM 2012 Full

Vision, Attention Control, and Goals Creation System

K. Rapantzikos, Y. Avrithis, S. Kollias

Springer, 2011

Scalable Triangulation-Based Logo Recognition

Y. Kalantidis, L.G. Pueyo, M. Trevisiol, R. van Zwol, Y. Avrithis

ICMR 2011

Speeded-Up, Relaxed Spatial Matching

G. Tolias, Y. Avrithis

ICCV 2011

The Medial Feature Detector: Stable Regions from Image Boundaries

Y. Avrithis, K. Rapantzikos

ICCV 2011

VIRaL: Visual Image Retrieval and Localization

Y. Kalantidis, G. Tolias, Y. Avrithis, M. Phinikettos, E. Spyrou, Ph. Mylonas, S. Kollias

MTAP, 2011

Spatiotemporal Features for Action Recognition and Salient Event Detection

K. Rapantzikos, Y. Avrithis, S. Kollias

CC, 2011

Detecting Regions from Single Scale Edges

K. Rapantzikos, Y. Avrithis, S. Kollias

SGA/ECCV 2010

Full list of publications

Contact

Institute of Advanced Research on Artificial Intelligence

Landstraßer Hauptstraße 5
1030 Vienna, Austria

+43 1 9089990

Scholar

Social