Alaaeldin El-Nouby

I am a Staff Research Scientist at Meta Superintelligence Labs (FAIR), working on agentic coding post-training and, previously, LLM pre-training. Before that, I was part of the MLR team at Apple, where I led research on scaling vision and multimodal pre-training.

I completed my PhD in Computer Science at Meta AI (FAIR) and École Normale Supérieure, advised by Ivan Laptev, Natalia Neverova, and Hervé Jégou. I have an MSc in Computer Engineering from the University of Guelph, where I was advised by Dr. Graham Taylor. During that time, I was a student researcher at the Vector Institute.

Email: alaaelnouby-at-gmail.com  /  Google Scholar  /  Resume


News

  • Apr 2026 Meta announced Muse Spark, its latest frontier LLM. My team contributed to its agentic coding capabilities.
  • Jun 2025 Joined Meta Superintelligence Labs (FAIR).
  • Apr 2025 Scaling Laws for Native Multimodal Models was accepted at ICCV 2025 as an oral.
  • Feb 2025 Released FlexTok, accepted at ICML 2025.
  • Jan 2025 Parameters vs FLOPs was accepted at ICML 2025.
  • Nov 2024 Released AIMv2, accepted at CVPR 2025 as a highlight.
  • Jun 2024 Released DataComp-LM, accepted at NeurIPS 2024.
  • Jan 2024 Released Autoregressive Image Models (AIM), accepted at ICML 2024.
  • Aug 2023 Joined Apple MLR as a Research Scientist.
  • May 2023 Mark Zuckerberg announced our recent foundational multimodal model ImageBind.
  • May 2023 Released ImageBind, accepted at CVPR 2023.

Research

I'm interested in agentic coding post-training, LLM pre-training, multimodal vision-language models, and large-scale visual representation learning.


Scaling Laws for Optimal Data Mixtures

Mustafa Shukor, Louis Bethune, Dan Busbridge, David Grangier, Enrico Fini, Alaaeldin El-Nouby, Pierre Ablin
NeurIPS 2025
paper

Scaling Laws for Native Multimodal Models

Mustafa Shukor, Enrico Fini, Victor Guilherme Turrisi da Costa, Matthieu Cord, Joshua Susskind, Alaaeldin El-Nouby
ICCV 2025 Oral
paper

FlexTok: Resampling Images into 1D Token Sequences of Flexible Length

Roman Bachmann, Jesse Allardice, David Mizrahi, Enrico Fini, Oguzhan Fatih Kar, Elmira Amirloo, Alaaeldin El-Nouby, Amir Zamir, Afshin Dehghan
ICML 2025
paper code

Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models

Samira Abnar, Harshay Shah, Dan Busbridge, Alaaeldin El-Nouby, Josh Susskind, Vimal Thilak
ICML 2025
paper

Multimodal Autoregressive Pre-training of Large Vision Encoders

Enrico Fini, Mustafa Shukor, Xiujun Li, Philipp Dufter, Michal Klein, David Haldimann, Sai Aitharaju, Victor Guilherme Turrisi da Costa, Louis Bethune, Zhe Gan, Alexander T Toshev, Marcin Eichner, Moin Nabi, Yinfei Yang, Joshua M. Susskind, Alaaeldin El-Nouby
CVPR 2025 Highlight
paper code

DataComp-LM: In Search of the Next Generation of Training Sets for Language Models

Jeffrey Li, Alex Fang, Georgios Smyrnis, Maor Ivgi, Matt Jordan, Vaishaal Shankar et al.
NeurIPS 2024
paper code

Scalable Pre-training of Large Autoregressive Image Models

Alaaeldin El-Nouby, Michal Klein, Shuangfei Zhai, Miguel Angel Bautista, Alexander Toshev, Vaishaal Shankar, Joshua M. Susskind, Armand Joulin
ICML 2024
paper code

ImageBind: One Embedding Space To Bind Them All

Rohit Girdhar, Alaaeldin El-Nouby, Zhuang Liu, Mannat Singh, Kalyan Vasudev Alwala, Armand Joulin, Ishan Misra
CVPR 2023 Highlight
paper code

DINOv2: Learning Robust Visual Features without Supervision

Maxime Oquab, Timothee Darcet, Theo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Mahmoud Assran, Nicolas Ballas, Wojciech Galuba, Russell Howes, Po-Yao Huang, Shang-Wen Li, Ishan Misra, Michael Rabbat, Vasu Sharma, Gabriel Synnaeve, Hu Xu, Hervé Jégou, Julien Mairal, Patrick Labatut, Armand Joulin, Piotr Bojanowski
Preprint
paper code

Improving Statistical Fidelity for Neural Image Compression with Implicit Local Likelihood Models

Matthew J. Muckley, Alaaeldin El-Nouby, Karen Ullrich, Hervé Jégou, Jakob Verbeek
ICML 2023
paper

Image Compression with Product Quantized Masked Image Modeling

Alaaeldin El-Nouby, Matthew J. Muckley, Karen Ullrich, Ivan Laptev, Jakob Verbeek, Hervé Jégou
Transactions of Machine Learning Research (TMLR)
paper

OmniMAE: Single Model Masked Pretraining on Images and Videos

Rohit Girdhar*, Alaaeldin El-Nouby*, Mannat Singh*, Kalyan Vasudev Alwala*, Armand Joulin, Ishan Misra*
CVPR 2023
paper code

Three things everyone should know about Vision Transformers

Hugo Touvron, Matthieu Cord, Alaaeldin El-Nouby, Jakob Verbeek, Hervé Jégou
ECCV 2022
paper code

Are Large-scale Datasets Necessary for Self-Supervised Pre-training?

Alaaeldin El-Nouby*, Gautier Izacard*, Hugo Touvron, Ivan Laptev, Hervé Jégou, Edouard Grave
Under Review
paper

XCiT: Cross-Covariance Image Transformer

Alaaeldin El-Nouby, Hugo Touvron, Mathilde Caron, Piotr Bojanowski, Matthijs Douze, Armand Joulin, Ivan Laptev, Natalia Neverova, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou
NeurIPS 2021
paper video code

ResMLP: Feedforward networks for image classification with data-efficient training

Hugo Touvron, Piotr Bojanowski, Mathilde Caron, Matthieu Cord, Alaaeldin El-Nouby, Edouard Grave, Gautier Izacard, Armand Joulin, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou
TPAMI
paper code

LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference

Ben Graham, Alaaeldin El-Nouby, Hugo Touvron, Pierre Stock, Armand Joulin, Hervé Jégou, Matthijs Douze
ICCV 2021
paper code

Training Vision Transformers for Image Retrieval

Alaaeldin El-Nouby, Natalia Neverova, Ivan Laptev, Hervé Jégou
Preprint
paper

Skip-Clip: Self-Supervised Spatiotemporal Representation Learning by Future Clip Order Ranking

Alaaeldin El-Nouby, Shuangfei Zhai, Graham W. Taylor, Joshua M. Susskind
Holistic Video Understanding Workshop, ICCV 2019 (Best Poster Award)
paper poster bibtex

Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction

Alaaeldin El-Nouby, Shikhar Sharma, Hannes Schulz, Devon Hjelm, Layla El Asri, Samira Ebrahimi Kahou, Yoshua Bengio, Graham W. Taylor
Proceedings of the 2019 IEEE International Conference on Computer Vision (ICCV)
paper code poster blog bibtex

Real-Time End-to-End Action Detection with Two-Stream Networks

Alaaeldin El-Nouby, Graham W. Taylor
15th Conference on Computer and Robot Vision, CRV 2018 Oral
paper bibtex

Spatiotemporal Representation Learning For Human Action Recognition And Localization

Alaaeldin El-Nouby
paper bibtex

Invited Talks

  • L3D-IVU Workshop, CVPR’24 - Scalable Pre-training of Large Autoregressive Image Models
  • IMAGINE lab, École des Ponts ParisTech - Bringing the power of Transformers to Computer Vision
  • Vector Institute / University of Guelph - Masked Image Modeling for Visual Representation Learning
  • Transformers for Vision workshop, CVPR’22 - Are Large-scale Datasets Necessary for Self-Supervised Pre-training?
  • Max Planck Institute / Tübingen AI Center - Are Large-scale Datasets Necessary for Self-Supervised Pre-training?
  • Computer Vision Group (CVG), University of Bern - Are Large-scale Datasets Necessary for Self-Supervised Pre-training?
  • Large Scale Holistic Video Understanding workshop, CVPR’21 - Training Vision Transformers for Image Retrieval
  • KTH Royal Institute of Technology - Training Vision Transformers for Image Retrieval
  • Microsoft Research Montreal - Sequential Scene Understanding and Generation
  • DeepVision workshop, Simon Fraser University - Real-Time End-to-End Action Detection with Two-Stream Networks
  • Twenty Billion Neurons - Real-Time End-to-End Action Detection with Two-Stream Networks

Reviewing

  • CVPR’22-’25, ECCV’22-’24, NeurIPS’24, ICLR’24, ICCV’21, NeurIPS’22 SSL workshop, TPAMI (2021-Present).



Design and source code from Jon Barron's website