Yannis Avrithis

Working on computer vision and machine learning

News

Jan 24, 2024
AM code released
Jan 19, 2024
Jan 18, 2024
WTours dataset released
Jan 16, 2024
Paper accepted at ICLR 2024: Walking Tours
Dec 23, 2023
I become a Reviewer of ICML
Dec 22, 2023
Paper accepted at VISAPP 2024: Interpretable Gradients
Dec 19, 2023
Preprint released at arXiv: PowMix
Nov 24, 2023
Nov 09, 2023
Preprint released at arXiv: MultiMix
Oct 24, 2023
Paper accepted at WACV 2024: Imbalanced Few-Shot Learning
Oct 13, 2023
Thodoris Lymperopoulos defends his MSc thesis on Explainable AI
Oct 12, 2023
Preprint released at arXiv: Walking Tours
Oct 06, 2023
Christos Morfopoulos defends his MSc thesis on Vision-Language Tasks
Oct 01, 2023
ViTiS code released
Sep 27, 2023
Preprint released at arXiv: ViTiS
Sep 22, 2023
Felipe wins the best project award at ELLIS Summer School 2023
Sep 21, 2023
Paper accepted at NeurIPS 2023: MultiMix
Sep 17, 2023
SimPool code released
Sep 13, 2023
Preprint released at arXiv: SimPool
Aug 15, 2023
Paper accepted at CLVL/ICCV 2023: ViTiS
Aug 14, 2023
I serve as Area Chair of WACV 2024
Jul 14, 2023
Paper accepted at ICCV 2023: SimPool
Jul 06, 2023
A2LP code released
Jun 21, 2023
Paper accepted at ICIP 2023: Adaptive Anchor Label Propagation
Jun 12, 2023
PartNeRF code released
Jun 06, 2023
Apr 27, 2023
Preprint released at arXiv: Adaptive Manifold
Apr 18, 2023
Shashank accepted at ICVSS 2023
Apr 10, 2023
I serve as Area Chair of BMVC 2023
Apr 07, 2023
Metrix code released

About me

Since 2022 I have been a Principal Investigator at the Institute of Advanced Research on Artificial Intelligence (IARAI), carrying out research on computer vision and machine learning. Between 2021 and 2022 I was a Research Director at the Information Management Systems Institute (IMSI) of Athena Research Center. I enjoy working on learning visual representations from data, with as little supervision as possible. My recent work focuses on different learning settings, including metric learning, incremental learning and few-shot learning; multimodal learning, including vision, language and 3D models; interpretability; and video question answering.

Between 2016 and 2021 I was a research scientist in the LinkMedia team of Inria Rennes-Bretagne Atlantique, and I taught Deep Learning for Vision at the University of Rennes 1. My work focused on exploring the manifold structure of data and, apart from image retrieval, using it for unsupervised, semi-supervised and few-shot learning. I also worked on adversarial examples and on investigating the sparsity of convolutional activations, applied to spatial matching and unsupervised object discovery. In 2020, I was awarded the Habilitation à Diriger des Recherches (HDR) qualification from the University of Rennes 1.

In 2015 I was at the Laboratory of Algebraic and Geometric Algorithms (ΕρΓΑ) of the National and Kapodistrian University of Athens (NKUA), where I worked on large-scale clustering and nearest neighbor search with Ioannis Emiris. An achievement of this period was Web-Scale Image Clustering.

Between 2001 and 2015 I was at the Image, Video and Multimedia Systems Laboratory (IVML) of the National Technical University of Athens (NTUA). In the latter part of this period, from 2008, I led the Image and Video Analysis (IVA) team. We worked on a diverse set of problems including local feature and salient region detection, spatial and spatiotemporal visual representation and matching, object recognition and tracking, action recognition in video, scene classification, and image/video indexing, retrieval and summarization.

A large body of this work focused on local feature/descriptor representations. In this context, we delivered repeatable feature detection, extremely fast spatial matching, geometry indexing, multi-view and single-view feature selection, approximate nearest neighbor search for large-scale clustering and vocabulary construction, and mining of 3D scenes from millions of images. Examples of this work are Hough Pyramid Matching (HPM), the Aggregated Selective Match Kernel (ASMK), and Locally Optimized Product Quantization (LOPQ). In 2017, Yahoo! Research chose LOPQ, combined with a CNN image representation, to index its entire Flickr collection and provide similar-image search on it. A flagship product of this period was our unique application Visual Image Retrieval and Localization (VIRaL).

Before 2008, I worked on problems related to object detection and image understanding. These include regions of interest for generic object detection, semantic segmentation, integrated segmentation and region labeling, spatio-temporal object segmentation and tracking, image classification, as well as the use of visual context and prior knowledge.

While at NTUA, I had the opportunity to collaborate with a large number of researchers across Europe by participating in the Networks of Excellence Muscle and K-Space and in Integrated Projects like aceMedia and WeKnowIt. I also served as Program Chair or General Chair of workshops and conferences in the field of multimedia, such as WIAMIS, CBMI, MMM and CIVR. For several years I taught Signals and Systems and Image and Video Analysis.

Between 1996 and 2001 I worked on my Ph.D. at NTUA with Stefanos Kollias, studying visual representations for video sequence analysis. I investigated accurate spatio-temporal segmentation by fusing color, motion and depth, as well as a region-based representation for retrieval and summarization in the form of key-frames. I designed an affine-invariant representation of object contours for shape-based object classification and retrieval. Finally, I studied temporal segmentation and parsing of broadcast news based on human face detection.

In 1993-1994 I completed an M.Sc. with Distinction in Communications and Signal Processing at Imperial College, University of London. As part of my Master's thesis, I worked on communications and investigated the capacity of a cellular CDMA system with Athanassios Manikas.

Between 1988 and 1993 I studied Electrical and Computer Engineering at NTUA. As part of my Diploma thesis, I worked on analog electronics and developed a hardware implementation of a fuzzy logic processor with Yannis Tsividis. Before that, my first exposure to computer vision and machine learning was the study of an invariant representation for optical character recognition with Anastasios Delopoulos, leading to my first conference publication before my Diploma.

Recent publications

Is ImageNet Worth 1 Video? Learning Strong Image Encoders from 1 Long Unlabelled Video
S. Venkataramanan, M.N. Rizve, J. Carreira, Y.M. Asano, Y. Avrithis
ICLR 2024 Oral
A Learning Paradigm for Interpretable Gradients
F. Torres Figueroa, H. Zhang, R. Sicre, Y. Avrithis, S. Ayache
VISAPP 2024 Oral
Adaptive Manifold for Imbalanced Transductive Few-Shot Learning
M. Lazarou, Y. Avrithis, T. Stathaki
WACV 2024
Embedding Space Interpolation Beyond Mini-Batch, Beyond Pairs and Beyond Examples
S. Venkataramanan, E. Kijak, L. Amsaleg, Y. Avrithis
NeurIPS 2023
Generating Part-Aware Editable 3D Shapes without 3D Supervision
K. Tertikas, D. Paschalidou, B. Pan, J.J. Park, M.A. Uy, I. Emiris, Y. Avrithis, L. Guibas
Boosting Vision Transformers for Image Retrieval
C.H. Song, J. Yoon, S. Choi, Y. Avrithis
What to Hide from Your Students: Attention-Guided Masked Image Modeling
I. Kakogeorgiou, S. Gidaris, B. Psomas, Y. Avrithis, A. Bursuc, K. Karantzalos, N. Komodakis
AlignMixup: Improving Representations by Interpolating Aligned Features
S. Venkataramanan, E. Kijak, L. Amsaleg, Y. Avrithis
It Takes Two to Tango: Mixup for Deep Metric Learning
S. Venkataramanan, B. Psomas, E. Kijak, L. Amsaleg, K. Karantzalos, Y. Avrithis
On the Hidden Treasure of Dialog in Video Question Answering
D. Engin, N.Q.K. Duong, F. Schnitzler, Y. Avrithis
Smooth Adversarial Examples
H. Zhang, Y. Avrithis, T. Furon, L. Amsaleg
Local Propagation for Few-Shot Learning
Y. Lifchitz, Y. Avrithis, S. Picard
Graph-Based Particular Object Discovery
O. Siméoni, A. Iscen, G. Tolias, Y. Avrithis, O. Chum
Label Propagation for Deep Semi-Supervised Learning
A. Iscen, G. Tolias, Y. Avrithis, O. Chum
Dense Classification and Implanting for Few-Shot Learning
Y. Lifchitz, Y. Avrithis, S. Picard, A. Bursuc
Mining on Manifolds: Metric Learning without Labels
A. Iscen, G. Tolias, Y. Avrithis, O. Chum
Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking
F. Radenović, A. Iscen, G. Tolias, Y. Avrithis, O. Chum
Fast Spectral Ranking for Similarity Search
A. Iscen, Y. Avrithis, G. Tolias, T. Furon, O. Chum
Unsupervised Object Discovery for Instance Recognition
O. Siméoni, A. Iscen, G. Tolias, Y. Avrithis, O. Chum
Automatic Discovery of Discriminative Parts as a Quadratic Assignment Problem
R. Sicre, J. Rabin, Y. Avrithis, T. Furon, F. Jurie, E. Kijak
Unsupervised Part Learning for Visual Recognition
R. Sicre, Y. Avrithis, E. Kijak, F. Jurie
Panorama to Panorama Matching for Location Recognition
A. Iscen, G. Tolias, Y. Avrithis, T. Furon, O. Chum
α-Shapes for Local Feature Detection
C. Varytimidis, K. Rapantzikos, Y. Avrithis, S. Kollias
Web-Scale Image Clustering Revisited
Y. Avrithis, Y. Kalantidis, E. Anagnostopoulos, I. Emiris
Planar Shape Decomposition Made Simple
N. Papanelopoulos, Y. Avrithis
Towards Large-Scale Geometry Indexing by Feature Selection
G. Tolias, Y. Kalantidis, Y. Avrithis, S. Kollias
Improving Local Features by Dithering-Based Image Sampling
C. Varytimidis, K. Rapantzikos, Y. Avrithis, S. Kollias
Multimodal Saliency and Fusion for Movie Summarization Based on Aural, Visual, and Textual Attention
G. Evangelopoulos, A. Zlatintsi, A. Potamianos, P. Maragos, K. Rapantzikos, G. Skoumas, Y. Avrithis
VIRaL: Visual Image Retrieval and Localization
Y. Kalantidis, G. Tolias, Y. Avrithis, M. Phinikettos, E. Spyrou, Ph. Mylonas, S. Kollias
Scalable Triangulation-Based Logo Recognition
Y. Kalantidis, L.G. Pueyo, M. Trevisiol, R. van Zwol, Y. Avrithis