Variational Autoencoders (VAEs)
- Auto-Encoding Variational Bayes (Kingma & Welling, ICLR 2014)
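The core mechanism of the VAE paper is the reparameterization trick plus a closed-form KL regularizer. A minimal NumPy sketch (not from any official code release; function names are illustrative):

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Reparameterization trick (Kingma & Welling): draw z ~ N(mu, sigma^2)
    as a differentiable function of (mu, log_var) plus external noise."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, sigma^2) || N(0, I)), the VAE's latent regularizer."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

rng = np.random.default_rng(0)
mu = np.zeros(3)
log_var = np.zeros(3)          # i.e. sigma = 1: already the standard normal
z = reparameterize(mu, log_var, rng)
print(kl_to_standard_normal(mu, log_var))  # 0.0
```

When the approximate posterior equals the prior, the KL term vanishes, which is why the example prints zero.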
Generative Adversarial Networks (GANs)
- Generative Adversarial Networks (Goodfellow et al., NIPS 2014)
- Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks (DCGAN) (Radford et al., ICLR 2016)
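The GAN paper's minimax objective reduces to two simple log losses; a small sketch of both (a hand-rolled illustration, not the paper's code), including the non-saturating generator loss the paper recommends in practice:

```python
import numpy as np

def gan_losses(d_real, d_fake):
    """Discriminator and (non-saturating) generator losses from
    Goodfellow et al. (2014). d_real / d_fake are the discriminator's
    probabilities D(x) on real and on generated samples."""
    d_loss = -np.mean(np.log(d_real) + np.log(1.0 - d_fake))
    g_loss = -np.mean(np.log(d_fake))   # non-saturating heuristic from the paper
    return d_loss, g_loss

# At the theoretical equilibrium the discriminator outputs D(x) = 0.5 everywhere.
d_loss, g_loss = gan_losses(np.full(4, 0.5), np.full(4, 0.5))
print(round(d_loss, 4))  # 1.3863, i.e. 2*log(2), the paper's equilibrium value
```

The `2*log(2)` value at equilibrium is the constant the paper derives when the generator distribution matches the data distribution.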
Sequence-to-Sequence Models & Attention
- Sequence to Sequence Learning with Neural Networks (Sutskever et al., NIPS 2014)
- Neural Machine Translation by Jointly Learning to Align and Translate (Bahdanau et al., ICLR 2015)
- Attention Is All You Need (Transformer) (Vaswani et al., NIPS 2017)
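The Transformer's central operation, scaled dot-product attention, fits in a few lines. A minimal NumPy sketch (single head, no masking or projections):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, as defined in
    'Attention Is All You Need'. Q, K: (seq, d_k); V: (seq, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                       # similarity logits
    scores -= scores.max(axis=-1, keepdims=True)          # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)        # row-wise softmax
    return weights @ V                                    # weighted sum of values

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

The `1/sqrt(d_k)` scaling is the detail the paper highlights: without it, dot products grow with dimension and push the softmax into regions with vanishing gradients.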
Optimizers & Normalization
- Adam: A Method for Stochastic Optimization (Kingma & Ba, ICLR 2015)
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (Ioffe & Szegedy, ICML 2015)
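Adam's update rule from the Kingma & Ba paper is compact enough to sketch directly (hand-written illustration, default hyperparameters from the paper):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient and its
    square, bias-corrected, then an adaptive per-parameter step."""
    m = beta1 * m + (1 - beta1) * grad        # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad**2     # second moment (uncentered variance)
    m_hat = m / (1 - beta1**t)                # bias correction (t starts at 1)
    v_hat = v / (1 - beta2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(x) = x^2, whose gradient is 2x, starting from x = 1.
theta, m, v = np.array([1.0]), np.zeros(1), np.zeros(1)
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.01)
print(theta)  # close to 0
```

Because `m_hat / sqrt(v_hat)` is roughly the sign of the gradient, the effective step size is bounded near `lr`, which is the paper's motivation for the method's robustness to gradient scale.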
Computer Vision Architectures
- Going Deeper with Convolutions (GoogLeNet/Inception) (Szegedy et al., CVPR 2015)
- Very Deep Convolutional Networks for Large-Scale Image Recognition (VGGNet) (Simonyan & Zisserman, ICLR 2015)
- Deep Residual Learning for Image Recognition (ResNet) (He et al., CVPR 2016)
- You Only Look Once: Unified, Real-Time Object Detection (YOLO) (Redmon et al., CVPR 2016)
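ResNet's contribution is the identity shortcut: each block learns a residual F(x) added onto its input. A simplified sketch (fully connected instead of convolutional, no batch norm, for brevity):

```python
import numpy as np

def residual_block(x, W1, W2):
    """Simplified residual block (He et al., 2016): output = x + F(x),
    so gradients can flow unchanged through the identity shortcut."""
    h = np.maximum(0.0, x @ W1)   # ReLU nonlinearity
    return x + h @ W2             # identity skip connection

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 16))
W1 = rng.normal(size=(16, 16)) * 0.01   # near-zero weights: block starts
W2 = rng.normal(size=(16, 16)) * 0.01   # close to the identity map
y = residual_block(x, W1, W2)
print(np.abs(y - x).max())  # small: the block barely perturbs its input
```

This is the intuition behind training 100+ layer networks: a block with small weights defaults to the identity, so depth cannot make the mapping worse at initialization.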
Key 2015-2021 Papers
- U-Net: Convolutional Networks for Biomedical Image Segmentation (Ronneberger et al., MICCAI 2015)
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Devlin et al., NAACL 2019)
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (ViT) (Dosovitskiy et al., ICLR 2021)
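The ViT paper's title is literal: a 224x224 image cut into 16x16 patches becomes a sequence of 196 "words". The patchification step is pure reshaping (illustrative sketch, not the paper's code):

```python
import numpy as np

def image_to_patches(img, p):
    """ViT-style patchification (Dosovitskiy et al., 2021): split an
    H x W x C image into (H/p)*(W/p) flattened p x p patches, which
    become the token sequence fed to a standard Transformer encoder."""
    H, W, C = img.shape
    patches = img.reshape(H // p, p, W // p, p, C)
    patches = patches.transpose(0, 2, 1, 3, 4)     # group by patch grid position
    return patches.reshape(-1, p * p * C)          # one flat vector per patch

img = np.zeros((224, 224, 3))
print(image_to_patches(img, 16).shape)  # (196, 768)
```

Each 768-dimensional patch vector is then linearly projected to the model width, after which the architecture is the unmodified Transformer from Vaswani et al.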
Recent Foundational Models (2020-2023)
- Highly accurate protein structure prediction with AlphaFold (Jumper et al., Nature 2021)
- Learning Transferable Visual Models From Natural Language Supervision (CLIP) (Radford et al., ICML 2021)
- Denoising Diffusion Probabilistic Models (DDPM) (Ho et al., NeurIPS 2020)
- High-Resolution Image Synthesis with Latent Diffusion Models (Stable Diffusion / LDM) (Rombach et al., CVPR 2022)
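The DDPM forward (noising) process has a closed form, which is what makes training tractable: any timestep can be sampled directly from the clean image. A short sketch using the paper's linear noise schedule (hand-written illustration):

```python
import numpy as np

def q_sample(x0, t, alpha_bar, rng):
    """DDPM forward process in closed form (Ho et al., 2020):
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    noise = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise

T = 1000
betas = np.linspace(1e-4, 0.02, T)      # linear schedule from the paper
alpha_bar = np.cumprod(1.0 - betas)     # cumulative signal retention

rng = np.random.default_rng(0)
x0 = rng.normal(size=(8,))
x_T = q_sample(x0, T - 1, alpha_bar, rng)
print(alpha_bar[-1])  # near 0: by step T the sample is almost pure noise
```

The reverse (generative) process is what the neural network learns; latent diffusion (Rombach et al.) runs this same machinery in a VAE's latent space instead of pixel space.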
Large Language Models (LLMs)
- Language Models are Few-Shot Learners (GPT-3) (Brown et al., NeurIPS 2020)
- Training language models to follow instructions with human feedback (InstructGPT) (Ouyang et al., NeurIPS 2022)
- GPT-4 Technical Report (OpenAI, 2023)
- Llama 2: Open Foundation and Fine-Tuned Chat Models (Touvron et al., 2023)
Transformers Beyond Text
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (ViT) (Dosovitskiy et al., ICLR 2021)
- Segment Anything (Kirillov et al., ICCV 2023)
- Generative Agents: Interactive Simulacra of Human Behavior (Park et al., 2023)
Miscellaneous & Other Notable Papers
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (Wei et al., NeurIPS 2022)
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models (Yao et al., NeurIPS 2023)
- Visual Instruction Tuning (LLaVA) (Liu et al., NeurIPS 2023)