Vector Quantization
VQ (STE)
[NeurIPS 2017] [arXiv:1711.00937] Neural Discrete Representation Learning
[CVPR 2021] [arXiv:2012.09841] Taming Transformers for High-Resolution Image Synthesis
VQ (Gumbel Softmax)
[ICLR 2020] [arXiv:1910.05453] vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations
Improved VQ
[ICLR 2022] [arXiv:2110.04627] Vector-quantized Image Modeling with Improved VQGAN
FSQ
[ICLR 2024] [arXiv:2309.15505] Finite Scalar Quantization: VQ-VAE Made Simple
LFQ
[ICLR 2024] [arXiv:2310.05737] Language Model Beats Diffusion: Tokenizer is Key to Visual Generation
BSQ
[ICLR 2025] [arXiv:2406.07548] Image and Video Tokenization with Binary Spherical Quantization
Rotation Trick
[ICLR 2025] [arXiv:2410.06424] Restructuring Vector Quantization with the Rotation Trick
SimVQ
[ICCV 2025] [arXiv:2411.02038] Addressing Representation Collapse in Vector Quantized Models with One Linear Layer
GSQ
[arXiv:2412.02632] Scaling Image Tokenizers with Grouped Spherical Quantization
IBQ
[ICCV 2025] [arXiv:2412.02692] Taming Scalable Visual Tokenizer for Autoregressive Image Generation
SoftVQ
[CVPR 2025] [arXiv:2412.10958] SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer
OptVQ
[arXiv:2412.15195] Preventing Local Pitfalls in Vector Quantization via Optimal Transport