Imran M, Krebs JR, Sivaraman VB, Zhang T, Kumar A, Ueland WR, Fassler MJ, Huang J, Sun X, Wang L, Shi P. Multi-Class Segmentation of Aortic Branches and Zones in Computed Tomography Angiography: The AortaSeg24 Challenge. arXiv preprint arXiv:2502.05330. 2025 Feb 7. link
Durante Z, Huang Q, Wake N, Gong R, Park JS, Sarkar B, Taori R, Noda Y, Terzopoulos D, Choi Y, Ikeuchi K. Agent ai: Surveying the horizons of multimodal interaction. arXiv preprint arXiv:2401.03568. 2024 Jan 7. link
Du C, Wang Y, Song S, Huang G. Probabilistic contrastive learning for long-tailed visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2024 Feb 23. link
Zhao D, Wang S, Zang Q, Quan D, Ye X, Jiao L. Towards better stability and adaptability: Improve online self-training for model adaptation in semantic segmentation. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 11733-11743). link
Gemini Team, Anil R, Borgeaud S, Alayrac JB, Yu J, Soricut R, Schalkwyk J, Dai AM, Hauth A, Millican K, Silver D. Gemini: a family of highly capable multimodal models. arXiv preprint arXiv:2312.11805. 2023 Dec 19. link
Xu X, Shen Y, Chi Y, Ma C. The power of preconditioning in overparameterized low-rank matrix sensing. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 38611-38654). PMLR. link
Wang H, Ma S, Huang S, Dong L, Wang W, Peng Z, Wu Y, Bajaj P, Singhal S, Benhaim A, Patra B. Magneto: A foundation transformer. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 36077-36092). PMLR. link
Wang W, Bao H, Dong L, Bjorck J, Peng Z, Liu Q, Aggarwal K, Mohammed OK, Singhal S, Som S, Wei F. Image as a foreign language: Beit pretraining for vision and vision-language tasks. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 19175-19186). link
Yao Y, Gong M, Du Y, Yu J, Han B, Zhang K, Liu T. Which is better for learning with noisy labels: The semi-supervised method or modeling label noise?. InInternational conference on machine learning 2023 Jul 3 (pp. 39660-39673). PMLR. link
Wang J, Ge Y, Yan R, Ge Y, Lin KQ, Tsutsui S, Lin X, Cai G, Wu J, Shan Y, Qie X. All in one: Exploring unified video-language pre-training. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 6598-6608). link
Su W, Zhu X, Tao C, Lu L, Li B, Huang G, Qiao Y, Wang X, Zhou J, Dai J. Towards all-in-one pre-training via maximizing multi-modal mutual information. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 15888-15899). link
Selva J, Johansen AS, Escalera S, Nasrollahi K, Moeslund TB, Clapés A. Video transformers: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2023 Feb 9;45(11):12922-43. link
Novack Z, McAuley J, Lipton ZC, Garg S. Chils: Zero-shot image classification with hierarchical label sets. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 26342-26362). PMLR. link
Radenovic F, Dubey A, Kadian A, Mihaylov T, Vandenhende S, Patel Y, Wen Y, Ramanathan V, Mahajan D. Filtering, distillation, and hard negatives for vision-language pre-training. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 6967-6977). link
Müller P, Meissen F, Brandt J, Kaissis G, Rueckert D. Anatomy-driven pathology detection on chest x-rays. InInternational Conference on Medical Image Computing and Computer-Assisted Intervention 2023 Oct 1 (pp. 57-66). Cham: Springer Nature Switzerland. link
Mirza MJ, Soneira PJ, Lin W, Kozinski M, Possegger H, Bischof H. Actmad: Activation matching to align distributions for test-time-training. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 24152-24161). link
Luo X, Wu H, Zhang J, Gao L, Xu J, Song J. A closer look at few-shot classification again. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 23103-23123). PMLR. link
Long Y, Wen Y, Han J, Xu H, Ren P, Zhang W, Zhao S, Liang X. Capdet: Unifying dense captioning and open-world detection pretraining. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 15233-15243). link
Li L, Spratling M. Improved Adversarial Training Through Adaptive Instance-wise Loss Smoothing. arXiv preprint arXiv:2303.14077. 2023 Mar 24. link
Koh JY, Salakhutdinov R, Fried D. Grounding language models to images for multimodal inputs and outputs. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 17283-17300). PMLR. link
Kaul P, Xie W, Zisserman A. Multi-modal classifiers for open-vocabulary object detection. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 15946-15969). PMLR. link
Karani N, Dey N, Golland P. Boundary-weighted logit consistency improves calibration of segmentation networks. InInternational conference on medical image computing and computer-assisted intervention 2023 Oct 1 (pp. 367-377). Cham: Springer Nature Switzerland. link
Hu Q, Chen Y, Xiao J, Sun S, Chen J, Yuille AL, Zhou Z. Label-free liver tumor segmentation. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 7422-7432). link
Pethick T, Latafat P, Patrinos P, Fercoq O, Cevher V. Escaping limit cycles: Global convergence for constrained nonconvex-nonconcave minimax problems. arXiv preprint arXiv:2302.09831. 2023 Feb 20. link
Liu Y, Zhang Y, Wang Y, Zhang Y, Tian J, Shi Z, Fan J, He Z. SAP-DETR: bridging the gap between salient points and queries-based transformer detector for fast model convergency. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 15539-15547). link
Hang T, Gu S, Li C, Bao J, Chen D, Hu H, Geng X, Guo B. Efficient diffusion training via min-snr weighting strategy. InProceedings of the IEEE/CVF international conference on computer vision 2023 (pp. 7441-7451). link
Chen M, Huang K, Zhao T, Wang M. Score approximation, estimation and distribution recovery of diffusion models on low-dimensional data. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 4672-4712). PMLR. link
Axiotis K, Sviridenko M. Gradient descent converges linearly for logistic regression on separable data. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 1302-1319). PMLR. link
Taheri H, Thrampoulidis C. Fast convergence in learning two-layer neural networks with separable data. InProceedings of the AAAI Conference on Artificial Intelligence 2023 Jun 26 (Vol. 37, No. 8, pp. 9944-9952). link
Liu A, Bragin MA, Chen X, Guan X. Accelerating level-value adjustment for the polyak stepsize. arXiv preprint arXiv:2311.18255. 2023 Nov 30. link
Lin X, Yan Z, Deng X, Zheng C, Yu L. ConvFormer: Plug-and-play CNN-style transformers for improving medical image segmentation. InInternational conference on medical image computing and computer-assisted intervention 2023 Oct 1 (pp. 642-651). Cham: Springer Nature Switzerland. link
Li K, Wang Y, Li Y, Wang Y, He Y, Wang L, Qiao Y. Unmasked teacher: Towards training-efficient video foundation models. InProceedings of the IEEE/CVF International Conference on Computer Vision 2023 (pp. 19948-19960). link
Knyazev B, Hwang D, Lacoste-Julien S. Can we scale transformers to predict parameters of diverse imagenet models?. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 17243-17259). PMLR. link
Li H, Lin Z. Restarted Nonconvex Accelerated Gradient Descent: No More Polylogarithmic Factor in the O(epsilon^(-7/4)) Complexity. Journal of Machine Learning Research. 2023;24(157):1-37. link
Wang Z, Balasubramanian K, Ma S, Razaviyayn M. Zeroth-order algorithms for nonconvex–strongly-concave minimax problems with improved complexities. Journal of Global Optimization. 2023 Nov;87(2):709-40. link
Arjevani Y, Carmon Y, Duchi JC, Foster DJ, Srebro N, Woodworth B. Lower bounds for non-convex stochastic optimization. Mathematical Programming. 2023 May;199(1):165-214. link
He Y, Yang G, Ge R, Chen Y, Coatrieux JL, Wang B, Li S. Geometric visual similarity learning in 3d medical image self-supervised pre-training. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 9538-9547). link
Han Y, Zhang L, Chen Q, Chen Z, Li Z, Yang J, Cao Z. Fashionsap: Symbols and attributes prompt for fine-grained fashion vision-language pre-training. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 15028-15038). link
Gupta T, Kembhavi A. Visual programming: Compositional visual reasoning without training. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 14953-14962). link
Evron I, Moroshko E, Buzaglo G, Khriesh M, Marjieh B, Srebro N, Soudry D. Continual learning in linear classification on separable data. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 9440-9484). PMLR. link
Dima AF, Zimmer VA, Menten MJ, Li HB, Graf M, Lemke T, Raffler P, Graf R, Kirschke JS, Braren R, Rueckert D. 3D arterial segmentation via single 2D projections and depth supervision in contrast-enhanced CT images. InInternational Conference on Medical Image Computing and Computer-Assisted Intervention 2023 Oct 1 (pp. 141-151). Cham: Springer Nature Switzerland. link
Dehghani M, Djolonga J, Mustafa B, Padlewski P, Heek J, Gilmer J, Steiner AP, Caron M, Geirhos R, Alabdulmohsin I, Jenatton R. Scaling vision transformers to 22 billion parameters. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 7480-7512). PMLR. link
Chien E, Zhang J, Hsieh CJ, Jiang JY, Chang WC, Milenkovic O, Yu HF. Pina: leveraging side information in extreme multi-label classification via predicted instance neighborhood aggregation. arXiv preprint arXiv:2305.12349. 2023 May 21. link
Chen S, Hou W, Hong Z, Ding X, Song Y, You X, Liu T, Zhang K. Evolving semantic prototype improves generative zero-shot learning. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 4611-4622). PMLR. link
Chen L, Xu J, Luo L. Faster gradient-free algorithms for nonsmooth nonconvex stochastic optimization. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 5219-5233). PMLR. link
Chen F, Fei J, Chen Y, Huang C. Decoupled consistency for semi-supervised medical image segmentation. InInternational conference on medical image computing and computer-assisted intervention 2023 Oct 1 (pp. 551-561). Cham: Springer Nature Switzerland. link
Chen F, Zhang H, Hu K, Huang YK, Zhu C, Savvides M. Enhanced training of query-based object detection via selective query recollection. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 23756-23765). link
Chen C, Zhong A, Wu D, Luo J, Li Q. Contrastive masked image-text modeling for medical visual representation learning. InInternational Conference on Medical Image Computing and Computer-Assisted Intervention 2023 Oct 1 (pp. 493-503). Cham: Springer Nature Switzerland. link
Chen B, Ye Z, Liu Y, Zhang Z, Pan J, Zeng B, Lu G. Combating medical label noise via robust semi-supervised contrastive learning. InInternational Conference on Medical Image Computing and Computer-Assisted Intervention 2023 Oct 1 (pp. 562-572). Cham: Springer Nature Switzerland. link
Adaptive Agent Team, Bauer J, Baumli K, Baveja S, Behbahani F, Bhoopchand A, Bradley-Schmieg N, Chang M, Clay N, Collister A, Dasagi V. Human-timescale adaptation in an open-ended task space. arXiv preprint arXiv:2301.07608. 2023 Jan 18. link
Basu S, Rawat AS, Zaheer M. A statistical perspective on retrieval-based models. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 1852-1886). PMLR. link
Allingham JU, Ren J, Dusenberry MW, Gu X, Cui Y, Tran D, Liu JZ, Lakshminarayanan B. A simple zero-shot prompt weighting technique to improve prompt ensembling in text-image models. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 547-568). PMLR. link
Ghamizi S, Zhang J, Cordy M, Papadakis M, Sugiyama M, Le Traon Y. Gat: guided adversarial training with pareto-optimal auxiliary tasks. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 11255-11282). PMLR. link
Kanai S, Yamaguchi SY, Yamada M, Takahashi H, Ohno K, Ida Y. One-vs-the-rest loss to focus on important samples in adversarial training. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 15669-15695). PMLR. link
Feng Y, Wang Z, Xu X, Wang Y, Fu H, Li S, Zhen L, Lei X, Cui Y, Ting JS, Ting Y. Contrastive domain adaptation with consistency match for automated pneumonia diagnosis. Medical Image Analysis. 2023 Jan 1;83:102664. link
Mohammadi K, Zhao H, Zhai M, Tung F. Ranking Regularization for Critical Rare Classes: Minimizing False Positives at a High True Positive Rate. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 15783-15792). link
Chen KY, Chiang PH, Chou HR, Chen TW, Chang TH. Trompt: Towards a better deep neural network for tabular data. arXiv preprint arXiv:2305.18446. 2023 May 29. link
Ali-Bey A, Chaib-draa B, Giguère P. Global proxy-based hard mining for visual place recognition. arXiv preprint arXiv:2302.14217. 2023 Feb 28. link
Yang Z, Kafle K, Dernoncourt F, Ordonez V. Improving visual grounding by encouraging consistent gradient-based explanations. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 19165-19174). link
Wang H, Joshi D, Wang S, Ji Q. Gradient-based uncertainty attribution for explainable bayesian deep learning. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 12044-12053). link
Wang C, Luo X, Ross K, Li D. Vrl3: A data-driven framework for visual deep reinforcement learning. Advances in Neural Information Processing Systems. 2022 Dec 6;35:32974-88. link
Ding Z, Wang J, Tu Z. Open-vocabulary universal image segmentation with maskclip. arXiv preprint arXiv:2208.08984. 2022 Aug 18. link
Ben-David E, Ziser Y, Reichart R. Domain adaptation from scratch. arXiv preprint arXiv:2209.00830. 2022 Sep 2. link
Zhou D, Wang N, Gao X, Han B, Wang X, Zhan Y, Liu T. Improving adversarial robustness via mutual information estimation. InInternational conference on machine learning 2022 Jun 28 (pp. 27338-27352). PMLR. link
Chen Y, Jamieson K, Du S. Active multi-task representation learning. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 3271-3298). PMLR. link
Crabbé J, van der Schaar M. Label-free explainability for unsupervised models. arXiv preprint arXiv:2203.01928. 2022 Mar 3. link
Zhang Y, Zhang G, Khanduri P, Hong M, Chang S, Liu S. Revisiting and advancing fast adversarial training through the lens of bi-level optimization. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 26693-26712). PMLR. link
Zhang M, Sohoni NS, Zhang HR, Finn C, Ré C. Correct-n-contrast: A contrastive approach for improving robustness to spurious correlations. arXiv preprint arXiv:2203.01517. 2022 Mar 3. link
Yan S, Xiong X, Arnab A, Lu Z, Zhang M, Sun C, Schmid C. Multiview transformers for video recognition. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 3333-3343). link
Yan L, Wang Q, Cui Y, Feng F, Quan X, Zhang X, Liu D. Gl-rg: Global-local representation granularity for video captioning. arXiv preprint arXiv:2205.10706. 2022 May 22. link
Diakonikolas I, Kontonis V, Tzamos C, Zarifis N. Learning general halfspaces with adversarial label noise via online gradient descent. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 5118-5141). PMLR. link
Hu Q, Zhong Y, Yang T. Multi-block min-max bilevel optimization with applications in multi-task deep auc maximization. Advances in Neural Information Processing Systems. 2022 Dec 6;35:29552-65. link
Frieder S, Lukasiewicz T. (Non-) Convergence Results for Predictive Coding Networks. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 6793-6810). PMLR. link
Park J, Ryu EK. Exact optimal accelerated complexity for fixed-point iterations. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 17420-17457). PMLR. link
Gao H, Li J, Huang H. On the convergence of local stochastic compositional gradient descent with momentum. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 7017-7035). PMLR. link
Gasnikov A, Novitskii A, Novitskii V, Abdukhakimov F, Kamzolov D, Beznosikov A, Takáč M, Dvurechensky P, Gu B. The power of first-order smooth optimization for black-box non-smooth problems. arXiv preprint arXiv:2201.12289. 2022 Jan 28. link
Gorbunov E, Berard H, Gidel G, Loizou N. Stochastic extragradient: General analysis and improved rates. InInternational Conference on Artificial Intelligence and Statistics 2022 May 3 (pp. 7865-7901). PMLR. link
Gorbunov E, Loizou N, Gidel G. Extragradient method: O (1/k) last-iterate convergence for monotone variational inequalities and connections with cocoercivity. InInternational Conference on Artificial Intelligence and Statistics 2022 May 3 (pp. 366-402). PMLR. link
Irons NJ, Scetbon M, Pal S, Harchaoui Z. Triangular flows for generative modeling: Statistical consistency, smoothness classes, and fast rates. InInternational Conference on Artificial Intelligence and Statistics 2022 May 3 (pp. 10161-10195). PMLR. link
Kim J, Yang I. Accelerated gradient methods for geodesically convex optimization: Tractable algorithms and convergence analysis. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 11255-11282). PMLR. link
Kostic VR, Salzo S, Pontil M. Batch greenkhorn algorithm for entropic-regularized multimarginal optimal transport: Linear rate of convergence and iteration complexity. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 11529-11558). PMLR. link
Leahy JM, Kerimkulov B, Siska D, Szpruch L. Convergence of policy gradient for entropy regularized MDPs with neural network approximation in the mean-field regime. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 12222-12252). PMLR. link
Liu Y, Shang F, An W, Liu H, Lin Z. Kill a bird with two stones: Closing the convergence gaps in non-strongly convex optimization by directly accelerated SVRG with double compensation and snapshots. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 14008-14035). PMLR. link
Emmenegger N, Kyng R, Zehmakan AN. On the oracle complexity of higher-order smooth non-convex finite-sum optimization. InInternational Conference on Artificial Intelligence and Statistics 2022 May 3 (pp. 10718-10752). PMLR. link
Kim D, Kim Y, Kwon SJ, Kang W, Moon IC. Refining generative process with discriminator guidance in score-based diffusion models. arXiv preprint arXiv:2211.17091. 2022 Nov 28. link
Zilber P, Nadler B. Inductive matrix completion: No bad local minima and a fast algorithm. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 27671-27692). PMLR. link
Zhu D, Li G, Wang B, Wu X, Yang T. When AUC meets DRO: Optimizing partial AUC for deep learning with non-convex convergence guarantee. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 27548-27573). PMLR. link
Zhang G, Wang Y, Lessard L, Grosse RB. Near-optimal local convergence of alternating gradient descent-ascent for minimax optimization. InInternational Conference on Artificial Intelligence and Statistics 2022 May 3 (pp. 7659-7679). PMLR. link
Zhang G, Luo Z, Yu Y, Cui K, Lu S. Accelerating DETR convergence via semantic-aligned matching. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 949-958). link
Yun J, Lozano A, Yang E. Adablock: SGD with practical block diagonal matrix adaptation for deep learning. InInternational Conference on Artificial Intelligence and Statistics 2022 May 3 (pp. 2574-2606). PMLR. link
Shin Y. Effects of depth, width, and initialization: A convergence analysis of layer-wise training for deep linear neural networks. Analysis and Applications. 2022 Jan 31;20(01):73-119. link
Bah B, Rauhut H, Terstiege U, Westdickenberg M. Learning deep linear neural networks: Riemannian gradient flows and convergence to global minimizers. Information and Inference: A Journal of the IMA. 2022 Mar;11(1):307-53. link
Scaman K, Malherbe C, Dos Santos L. Convergence rates of non-convex stochastic gradient descent under a generic lojasiewicz condition and local smoothness. InInternational conference on machine learning 2022 Jun 28 (pp. 19310-19327). PMLR. link
Ren T, Cui F, Atsidakou A, Sanghavi S, Ho N. Towards statistical and computational complexities of Polyak step size gradient descent. InInternational Conference on Artificial Intelligence and Statistics 2022 May 3 (pp. 3930-3961). PMLR. link
Qiu ZH, Hu Q, Zhong Y, Zhang L, Yang T. Large-scale stochastic optimization of NDCG surrogates for deep learning with provable convergence. arXiv preprint arXiv:2202.12183. 2022 Feb 24. link
El Hanchi A, Stephens D, Maddison C. Stochastic reweighted gradient descent. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 8359-8374). PMLR. link
Salim A, Sun L, Richtarik P. A convergence theory for SVGD in the population limit under Talagrand’s inequality T1. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 19139-19152). PMLR. link
Rangwani H, Aithal SK, Mishra M, Jain A, Radhakrishnan VB. A closer look at smoothness in domain adversarial training. InInternational conference on machine learning 2022 Jun 28 (pp. 18378-18399). PMLR. link
Rakotomamonjy A, Flamary R, Salmon J, Gasso G. Convergent working set algorithm for lasso with non-convex sparse regularizers. InInternational Conference on Artificial Intelligence and Statistics 2022 May 3 (pp. 5196-5211). PMLR. link
Raj A, Joulani P, Gyorgy A, Szepesvári C. Faster Rates, Adaptive Algorithms, and Finite-Time Bounds for Linear Composition Optimization and Gradient TD Learning. InInternational Conference on Artificial Intelligence and Statistics 2022 May 3 (pp. 7176-7186). PMLR. link
Li H, Farnia F, Das S, Jadbabaie A. On convergence of gradient descent ascent: A tight local analysis. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 12717-12740). PMLR. link
Yu Y, Lin T, Mazumdar EV, Jordan M. Fast distributionally robust learning with variance-reduced min-max optimization. InInternational Conference on Artificial Intelligence and Statistics 2022 May 3 (pp. 1219-1250). PMLR. link
Yang J, Orvieto A, Lucchi A, He N. Faster single-loop algorithms for minimax optimization without strong concavity. InInternational Conference on Artificial Intelligence and Statistics 2022 May 3 (pp. 5485-5517). PMLR. link
Yang J, Ren S. Informed learning by wide neural networks: Convergence, generalization and sampling complexity. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 25198-25240). PMLR. link
Xing Y, Song Q, Cheng G. Unlabeled data help: Minimax analysis and adversarial robustness. InInternational Conference on Artificial Intelligence and Statistics 2022 May 3 (pp. 136-168). PMLR. link
Wu CT, Masoomi A, Gretton A, Dy J. Deep layer-wise networks have closed-form weights. InInternational Conference on Artificial Intelligence and Statistics 2022 May 3 (pp. 188-225). PMLR. link
Wang JK, Lin CH, Wibisono A, Hu B. Provable acceleration of heavy ball beyond quadratics for a class of Polyak-Lojasiewicz functions when the non-convexity is averaged-out. InInternational conference on machine learning 2022 Jun 28 (pp. 22839-22864). PMLR. link
Wang H, Si H, Li B, Zhao H. Provable domain generalization via invariant-feature subspace recovery. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 23018-23033). PMLR. link
Wang B, Yang T. Finite-sum coupled compositional stochastic optimization: Theory and applications. arXiv preprint arXiv:2202.12396. 2022 Feb 24. link
Cohen E, Hallak N, Teboulle M. A dynamic alternating direction of multipliers for nonconvex minimization with nonlinear functional equality constraints. Journal of Optimization Theory and Applications. 2022 Jun 1:1-30. link
Chamon LF, Paternain S, Calvo-Fullana M, Ribeiro A. Constrained learning with non-convex losses. IEEE Transactions on Information Theory. 2022 Jul 1;69(3):1739-60. link
Horváth S, Lei L, Richtárik P, Jordan MI. Adaptivity of stochastic gradient methods for nonconvex optimization. SIAM Journal on Mathematics of Data Science. 2022;4(2):634-48. link
Vigogna S, Meanti G, De Vito E, Rosasco L. Multiclass learning with margin: exponential rates with no bias-variance trade-off. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 22260-22269). PMLR. link
Vaswani S, Dubois-Taine B, Babanezhad R. Towards noise-adaptive, problem-adaptive (accelerated) stochastic gradient descent. InInternational conference on machine learning 2022 Jun 28 (pp. 22015-22059). PMLR. link
Tran TH, Scheinberg K, Nguyen LM. Nesterov accelerated shuffling gradient method for convex optimization. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 21703-21732). PMLR. link
Sebbouh O, Cuturi M, Peyré G. Randomized stochastic gradient descent ascent. InInternational Conference on Artificial Intelligence and Statistics 2022 May 3 (pp. 2941-2969). PMLR. link
Loizou N, Vaswani S, Laradji IH, Lacoste-Julien S. Stochastic polyak step-size for sgd: An adaptive learning rate for fast convergence. InInternational Conference on Artificial Intelligence and Statistics 2021 Mar 18 (pp. 1306-1314). PMLR. link
Asi H, Duchi J, Fallah A, Javidbakht O, Talwar K. Private adaptive gradient methods for convex optimization. InInternational Conference on Machine Learning 2021 Jul 1 (pp. 383-392). PMLR. link
Chen T, Sun Y, Yin W. Solving stochastic compositional optimization is nearly as easy as solving stochastic optimization. IEEE Transactions on Signal Processing. 2021 Jun 25;69:4937-48. link
Jin C, Netrapalli P, Ge R, Kakade SM, Jordan MI. On nonconvex optimization for machine learning: Gradients, stochasticity, and saddle points. Journal of the ACM (JACM). 2021 Feb 24;68(2):1-29. link
Diakonikolas J, Daskalakis C, Jordan MI. Efficient methods for structured nonconvex-nonconcave min-max optimization. InInternational Conference on Artificial Intelligence and Statistics 2021 Mar 18 (pp. 2746-2754). PMLR. link
Amir I, Koren T, Livni R. SGD generalizes better than GD (and regularization doesn’t help). InConference on Learning Theory 2021 Jul 21 (pp. 63-92). PMLR. link
Vlaski S, Sayed AH. Distributed learning in non-convex environments—Part I: Agreement at a linear rate. IEEE Transactions on Signal Processing. 2021 Jan 12;69:1242-56. link
Vlaski S, Sayed AH. Distributed learning in non-convex environments—Part II: Polynomial escape from saddle-points. IEEE Transactions on Signal Processing. 2021 Jan 13;69:1257-70. link
Bu Z, Xu S, Chen K. A dynamical view on optimization algorithms of overparameterized neural networks. InInternational conference on artificial intelligence and statistics 2021 Mar 18 (pp. 3187-3195). PMLR. link
Farnia F, Ozdaglar A. Train simultaneously, generalize better: Stability of gradient-based minimax learners. InInternational Conference on Machine Learning 2021 Jul 1 (pp. 3174-3185). PMLR. link
Lamperski A. Projected stochastic gradient langevin algorithms for constrained sampling and non-convex learning. InConference on Learning Theory 2021 Jul 21 (pp. 2891-2937). PMLR. link
Guminov S, Dvurechensky P, Tupitsa N, Gasnikov A. On a combination of alternating minimization and Nesterov’s momentum. InInternational conference on machine learning 2021 Jul 1 (pp. 3886-3898). PMLR. link
Hsieh YP, Mertikopoulos P, Cevher V. The limits of min-max optimization algorithms: Convergence to spurious non-critical sets. InInternational Conference on Machine Learning 2021 Jul 1 (pp. 4337-4348). PMLR. link
Li M, Sutter T, Kuhn D. Distributionally robust optimization with Markovian data. InInternational Conference on Machine Learning 2021 Jul 1 (pp. 6493-6503). PMLR. link
Li Z, Chen PY, Liu S, Lu S, Xu Y. Rate-improved inexact augmented Lagrangian method for constrained nonconvex optimization. InInternational Conference on Artificial Intelligence and Statistics 2021 Mar 18 (pp. 2170-2178). PMLR. link
Zhang S, Yang J, Guzmán C, Kiyavash N, He N. The complexity of nonconvex-strongly-concave minimax optimization. InUncertainty in Artificial Intelligence 2021 Dec 1 (pp. 482-492). PMLR. link
Li Z, Bao H, Zhang X, Richtárik P. PAGE: A simple and optimal probabilistic gradient estimator for nonconvex optimization. InInternational conference on machine learning 2021 Jul 1 (pp. 6286-6295). PMLR. link
Liu Y, Sun Y, Yin W. Decentralized learning with lazy and approximate dual gradients. IEEE Transactions on Signal Processing. 2021 Feb 4;69:1362-77. link
Usmanova I, Kamgarpour M, Krause A, Levy K. Fast projection onto convex smooth constraints. InInternational Conference on Machine Learning 2021 Jul 1 (pp. 10476-10486). PMLR. link
Tao W, Li W, Pan Z, Tao Q. Gradient descent averaging and primal-dual averaging for strongly convex optimization. InProceedings of the AAAI conference on artificial intelligence 2021 May 18 (Vol. 35, No. 11, pp. 9843-9850). link
Richards D, Rabbat M. Learning with gradient descent and weakly convex losses. InInternational Conference on Artificial Intelligence and Statistics 2021 Mar 18 (pp. 1990-1998). PMLR. link
Ene A, Nguyen HL, Vladu A. Adaptive gradient methods for constrained convex optimization and variational inequalities. InProceedings of the AAAI Conference on Artificial Intelligence 2021 May 18 (Vol. 35, No. 8, pp. 7314-7321). link
Lacotte J, Pilanci M. Optimal randomized first-order methods for least-squares problems. InInternational Conference on Machine Learning 2020 Nov 21 (pp. 5587-5597). PMLR. link
Ma C, Rao Y, Cheng Y, Chen C, Lu J, Zhou J. Structure-preserving super resolution with gradient guidance. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2020 (pp. 7769-7778). link
Pham HT, Nguyen PM. A note on the global convergence of multilayer neural networks in the mean field regime. arXiv preprint arXiv:2006.09355. 2020 Jun 16. link
Nguyen QN, Mondelli M. Global convergence of deep networks with one wide layer followed by pyramidal topology. Advances in Neural Information Processing Systems. 2020;33:11961-72. link
Mokhtari A, Ozdaglar AE, Pattathil S. Convergence rate of O(1/k) for optimistic gradient and extragradient methods in smooth convex-concave saddle point problems. SIAM Journal on Optimization. 2020;30(4):3230-51. link
Lei Y, Zhou DX. Convergence of online mirror descent. Applied and Computational Harmonic Analysis. 2020 Jan 1;48(1):343-73. link
Golowich N, Pattathil S, Daskalakis C, Ozdaglar A. Last iterate is slower than averaged iterate in smooth convex-concave saddle point problems. InConference on Learning Theory 2020 Jul 15 (pp. 1758-1784). PMLR. link
Fang H, Fan Z, Sun Y, Friedlander M. Greed meets sparsity: Understanding and improving greedy coordinate descent for sparse optimization. InInternational Conference on Artificial Intelligence and Statistics 2020 Jun 3 (pp. 434-444). PMLR. link
Eftekhari A. Training linear neural networks: Non-local convergence and complexity results. InInternational Conference on Machine Learning 2020 Nov 21 (pp. 2836-2847). PMLR. link
Doikov N, Nesterov Y. Contracting proximal methods for smooth convex optimization. SIAM Journal on Optimization. 2020;30(4):3146-69. link
Cocola J, Hand P. Global convergence of sobolev training for overparameterized neural networks. InInternational Conference on Machine Learning, Optimization, and Data Science 2020 Jul 19 (pp. 574-586). Cham: Springer International Publishing. link
Assran M, Rabbat M. On the convergence of nesterov's accelerated gradient method in stochastic settings. arXiv preprint arXiv:2002.12414. 2020 Feb 27. link
Goerigk M, Kurtz J. Data-driven robust optimization using unsupervised deep learning. arXiv preprint arXiv:2011.09769. 2020 Nov 19. link
Le H, Gillis N, Patrinos P. Inertial block proximal methods for non-convex non-smooth optimization. InInternational Conference on Machine Learning 2020 Nov 21 (pp. 5671-5681). PMLR. link
Ma R, Lin Q, Yang T. Quadratically regularized subgradient methods for weakly convex optimization with weakly convex constraints. InInternational Conference on Machine Learning 2020 Nov 21 (pp. 6554-6564). PMLR. link
Bunel RR, Hinder O, Bhojanapalli S, Dvijotham K. An efficient nonconvex reformulation of stagewise convex optimization problems. Advances in Neural Information Processing Systems. 2020;33:8247-58. link
Frei S, Cao Y, Gu Q. Agnostic learning of a single neuron with gradient descent. Advances in neural information processing systems. 2020;33:5417-28. link
Jiang R, Li D. A linear-time algorithm for generalized trust region subproblems. SIAM Journal on Optimization. 2020;30(1):915-32. link
Kim S, Kojima M, Toh KC. A geometrical analysis on convex conic reformulations of quadratic and polynomial optimization problems. SIAM Journal on Optimization. 2020;30(2):1251-73. link
Ahuja K, Dhurandhar A, Varshney KR. Learning to Initialize Gradient Descent Using Gradient Descent. arXiv preprint arXiv:2012.12141. 2020 Dec 22. link
Lin T, Jin C, Jordan M. On gradient descent ascent for nonconvex-concave minimax problems. InInternational conference on machine learning 2020 Nov 21 (pp. 6083-6093). PMLR. link
Lu S, Razaviyayn M, Yang B, Huang K, Hong M. Finding second-order stationary points efficiently in smooth nonconvex linearly constrained optimization problems. Advances in Neural Information Processing Systems. 2020;33:2811-22. link
Qu Q, Zhai Y, Li X, Zhang Y, Zhu Z. Geometric analysis of nonconvex optimization landscapes for overcomplete learning. InInternational Conference on Learning Representations 2020 Apr. link
Tran-Dinh Q, Pham N, Nguyen L. Stochastic Gauss-Newton algorithms for nonconvex compositional optimization. InInternational Conference on Machine Learning 2020 Nov 21 (pp. 9572-9582). PMLR. link
Hu J, Liu X, Wen ZW, Yuan YX. A brief introduction to manifold optimization. Journal of the Operations Research Society of China. 2020 Jun;8:199-248. link
Yan Y, Xu Y, Zhang L, Xiaoyu W, Yang T. Stochastic optimization for non-convex inf-projection problems. InInternational Conference on Machine Learning 2020 Nov 21 (pp. 10660-10669). PMLR. link
Kazemipour A, Larsen B, Druckmann S. No spurious local minima in deep quadratic networks. arXiv preprint arXiv:2001.00098. 2020. link
Arjevani Y, Carmon Y, Duchi JC, Foster DJ, Sekhari A, Sridharan K. Second-order information in non-convex stochastic optimization: Power and limitations. InConference on Learning Theory 2020 Jul 15 (pp. 242-299). PMLR. link
Dey SS, Wang G, Xie Y. Approximation algorithms for training one-node ReLU neural networks. IEEE Transactions on Signal Processing. 2020 Nov 24;68:6696-706. link
Guo Z, Liu M, Yuan Z, Shen L, Liu W, Yang T. Communication-efficient distributed stochastic auc maximization with deep neural networks. InInternational conference on machine learning 2020 Nov 21 (pp. 3864-3874). PMLR. link
Li Y, Wu C, Duan Y. The TV^p regularized Mumford-Shah model for image labeling and segmentation. IEEE Transactions on Image Processing. 2020 Jun 2;29:7061-75. link
Mai V, Johansson M. Convergence of a stochastic gradient method with momentum for non-smooth non-convex optimization. InInternational conference on machine learning 2020 Nov 21 (pp. 6630-6639). PMLR. link
Xie Y, Wu X, Ward R. Linear convergence of adaptive stochastic gradient descent. InInternational conference on artificial intelligence and statistics 2020 Jun 3 (pp. 1475-1485). PMLR. link
Yang Y, Yu J. Fast proximal gradient descent for a class of non-convex and non-smooth sparse learning problems. InUncertainty in Artificial Intelligence 2020 Aug 6 (pp. 1253-1262). PMLR. link
Korba A, Salim A, Arbel M, Luise G, Gretton A. A non-asymptotic analysis for Stein variational gradient descent. Advances in Neural Information Processing Systems. 2020;33:4672-82. link
Ji Z, Dudík M, Schapire RE, Telgarsky M. Gradient descent follows the regularization path for general losses. InConference on Learning Theory 2020 Jul 15 (pp. 2109-2136). PMLR. link
Heckel R, Soltanolkotabi M. Compressive sensing with un-trained neural networks: Gradient descent finds a smooth approximation. InInternational conference on machine learning 2020 Nov 21 (pp. 4149-4158). PMLR. link
Mokhtari A, Ozdaglar A, Pattathil S. A unified analysis of extra-gradient and optimistic gradient methods for saddle point problems: Proximal point approach. InInternational Conference on Artificial Intelligence and Statistics 2020 Jun 3 (pp. 1497-1507). PMLR. link
Celentano M, Montanari A, Wu Y. The estimation error of general first order methods. InConference on Learning Theory 2020 Jul 15 (pp. 1078-1141). PMLR. link
Bu Y, Zou S, Veeravalli VV. Tightening mutual information-based bounds on generalization error. IEEE Journal on Selected Areas in Information Theory. 2020 Apr 28;1(1):121-30. link
Terada Y, Hirose R. Fast generalization error bound of deep learning without scale invariance of activation functions. Neural Networks. 2020 Sep 1;129:344-58. link
Amid E, Warmuth MK. Winnowing with gradient descent. InConference on Learning Theory 2020 Jul 15 (pp. 163-182). PMLR. link
He F, Liu T, Tao D. Why resnet works? residuals generalize. IEEE transactions on neural networks and learning systems. 2020 Feb 5;31(12):5349-62. link
Hellström F, Durisi G. Generalization bounds via information density and conditional information density. IEEE Journal on Selected Areas in Information Theory. 2020 Nov;1(3):824-39. link
Zou D, Long PM, Gu Q. On the global convergence of training deep linear resnets. arXiv preprint arXiv:2003.01094. 2020 Mar 2. link
Zhou P, Feng J, Ma C, Xiong C, Hoi SC. Towards theoretically understanding why sgd generalizes better than adam in deep learning. Advances in Neural Information Processing Systems. 2020;33:21285-96. link
Zhou B, Liu J, Sun W, Chen R, Tomlin CJ, Yuan Y. pbSGD: Powered Stochastic Gradient Descent Methods for Accelerated Non-Convex Optimization. InIJCAI 2020 Jul 20 (pp. 3258-3266). link
Zhang B, Jin J, Fang C, Wang L. Improved analysis of clipping algorithms for non-convex optimization. Advances in Neural Information Processing Systems. 2020;33:15511-21. link
Yang Y, Yuan Y, Chatzimichailidis A, van Sloun RJ, Lei L, Chatzinotas S. Proxsgd: Training structured neural networks under regularization and constraints. InInternational Conference on Learning Representations (ICLR) 2020. link
Wu J, Zou D, Braverman V, Gu Q. Direction matters: On the implicit bias of stochastic gradient descent with moderate learning rate. arXiv preprint arXiv:2011.02538. 2020 Nov 4. link
Wu J, Hu W, Xiong H, Huan J, Braverman V, Zhu Z. On the noisy gradient descent that generalizes as sgd. InInternational Conference on Machine Learning 2020 Nov 21 (pp. 10367-10376). PMLR. link
Wang JK, Lin CH, Abernethy J. Escaping saddle points faster with stochastic momentum. arXiv preprint arXiv:2106.02985. 2021 Jun 5. link
Ma S, Zhou Y. Understanding the impact of model incoherence on convergence of incremental sgd with random reshuffle. InInternational Conference on Machine Learning 2020 Nov 21 (pp. 6565-6574). PMLR. link
Lu Y, Ma C, Lu Y, Lu J, Ying L. A mean field analysis of deep resnet and beyond: Towards provably optimization via overparameterization from depth. InInternational Conference on Machine Learning 2020 Nov 21 (pp. 6426-6436). PMLR. link
Leluc R, Portier F. Asymptotic analysis of conditioned stochastic gradient descent. arXiv preprint arXiv:2006.02745. 2020 Jun 4. link
Lei Y, Ying Y. Fine-grained analysis of stability and generalization for stochastic gradient descent. InInternational Conference on Machine Learning 2020 Nov 21 (pp. 5809-5819). PMLR. link
Lei Q, Lee J, Dimakis A, Daskalakis C. Sgd learns one-layer networks in wgans. InInternational Conference on Machine Learning 2020 Nov 21 (pp. 5799-5808). PMLR. link
Gorbunov E, Danilova M, Gasnikov A. Stochastic optimization with heavy-tailed noise via accelerated gradient clipping. Advances in Neural Information Processing Systems. 2020;33:15042-53. link
Gargiani M, Zanelli A, Tran-Dinh Q, Diehl M, Hutter F. Convergence analysis of homotopy-sgd for non-convex optimization. arXiv preprint arXiv:2011.10298. 2020 Nov 20. link
Garber D. On the convergence of stochastic gradient descent with low-rank projections for convex low-rank matrix problems. InConference on Learning Theory 2020 Jul 15 (pp. 1666-1681). PMLR. link
Allen-Zhu Z, Ebrahimian F, Li J, Alistarh D. Byzantine-resilient non-convex stochastic gradient descent. arXiv preprint arXiv:2012.14368. 2020 Dec 28. link
Ahn K, Sra S. On tight convergence rates of without-replacement sgd. arXiv preprint arXiv:2004.08657. 2020 Apr 18. link
Li X, Gu Q, Zhou Y, Chen T, Banerjee A. Hessian based analysis of sgd for deep nets: Dynamics and generalization. InProceedings of the 2020 SIAM International Conference on Data Mining 2020 (pp. 190-198). Society for Industrial and Applied Mathematics. link
Gorbunov E, Hanzely F, Richtárik P. A unified theory of SGD: Variance reduction, sampling, quantization and coordinate descent. InInternational Conference on Artificial Intelligence and Statistics 2020 Jun 3 (pp. 680-690). PMLR. link
Keswani V, Mangoubi O, Sachdeva S, Vishnoi NK. A convergent and dimension-independent min-max optimization algorithm. arXiv preprint arXiv:2006.12376. 2020 Jun 22. link
Zhou Z, Mertikopoulos P, Bambos N, Boyd SP, Glynn PW. On the convergence of mirror descent beyond stochastic convex programming. SIAM Journal on Optimization. 2020;30(1):687-716. link
Zhang J, Xiao P, Sun R, Luo Z. A single-loop smoothed gradient descent-ascent algorithm for nonconvex-concave min-max problems. Advances in neural information processing systems. 2020;33:7377-89. link
Zhang J, Luo ZQ. A proximal alternating direction method of multiplier for linearly constrained nonconvex minimization. SIAM Journal on Optimization. 2020;30(3):2272-302. link
Yang J, Kiyavash N, He N. Global convergence and variance reduction for a class of nonconvex-nonconcave minimax problems. Advances in Neural Information Processing Systems. 2020;33:1153-65. link
Wang Z, Zhou Y, Liang Y, Lan G. Cubic regularization with momentum for nonconvex optimization. InUncertainty in Artificial Intelligence 2020 Aug 6 (pp. 313-322). PMLR. link
Shi Q, Hong M. Penalty dual decomposition method for nonsmooth nonconvex optimization—Part I: Algorithms and convergence analysis. IEEE Transactions on Signal Processing. 2020 Jun 18;68:4108-22. link
Pieper K, Petrosyan A. Nonconvex penalization for sparse neural networks. arXiv preprint arXiv:2004.11515. 2020 Apr 24. link
Nouiehed M, Razaviyayn M. A trust region method for finding second-order stationarity in linearly constrained nonconvex optimization. SIAM Journal on Optimization. 2020;30(3):2501-29. link
Mukkamala MC, Ochs P, Pock T, Sabach S. Convex-concave backtracking for inertial Bregman proximal gradient algorithms in nonconvex optimization. SIAM Journal on Mathematics of Data Science. 2020;2(3):658-82. link
Jin C, Netrapalli P, Jordan M. What is local optimality in nonconvex-nonconcave minimax optimization?. InInternational conference on machine learning 2020 Nov 21 (pp. 4880-4889). PMLR. link
Geffner T, Domke J. On the difficulty of unbiased alpha divergence minimization. arXiv preprint arXiv:2010.09541. 2020 Oct 19. link
Zhu Z, Soudry D, Eldar YC, Wakin MB. The global optimization geometry of shallow linear neural networks. Journal of Mathematical Imaging and Vision. 2020 Apr;62(3):279-92. link
Davis D, Drusvyatskiy D. High probability guarantees for stochastic convex optimization. InConference on Learning Theory 2020 Jul 15 (pp. 1411-1427). PMLR. link
Gerchinovitz S, Ménard P, Stoltz G. Fano’s inequality for random variables. link
Joulani P, Raj A, Gyorgy A, Szepesvari C. A simpler approach to accelerated optimization: iterative averaging meets optimism. InInternational conference on machine learning 2020 Nov 21 (pp. 4984-4993). PMLR. link
Lee J, Pacchiano A, Bartlett P, Jordan M. Accelerated message passing for entropy-regularized MAP inference. InInternational Conference on Machine Learning 2020 Nov 21 (pp. 5736-5746). PMLR. link
Lin T, Fan C, Wang M, Jordan MI. Improved sample complexity for stochastic compositional variance reduced gradient. In2020 American Control Conference (ACC) 2020 Jul 1 (pp. 126-131). IEEE. link
Ma C, Wu L. On the generalization properties of minimum-norm solutions for over-parameterized neural network models. Preprint. 2019 Dec. link
Jin C, Netrapalli P, Ge R, Kakade SM, Jordan MI. Stochastic gradient descent escapes saddle points efficiently. arXiv preprint. 2019. link
Kawaguchi K, Huang J. Gradient descent finds global minima for generalizable deep neural networks of practical sizes. In2019 57th annual allerton conference on communication, control, and computing (Allerton) 2019 Sep 24 (pp. 92-99). IEEE. link
Yu S, Principe JC. Understanding autoencoders with information theoretic concepts. Neural Networks. 2019 Sep 1;117:104-23. link
Du S, Hu W. Width provably matters in optimization for deep linear neural networks. InInternational Conference on Machine Learning 2019 May 24 (pp. 1655-1664). PMLR. link
Malitsky Y, Mishchenko K. Adaptive gradient descent without descent. arXiv preprint arXiv:1910.09529. 2019 Oct 21. link
Zhao R, Haskell WB, Tan VY. An optimal algorithm for stochastic three-composite optimization. InThe 22nd International Conference on Artificial Intelligence and Statistics 2019 Apr 11 (pp. 428-437). PMLR. link
Azizan N, Lale S, Hassibi B. Stochastic mirror descent on overparameterized nonlinear models: Convergence, implicit regularization, and generalization. arXiv preprint arXiv:1906.03830. 2019 Jun 10. link
Allen-Zhu Z, Li Y, Song Z. A convergence theory for deep learning via over-parameterization. InInternational conference on machine learning 2019 May 24 (pp. 242-252). PMLR. link
Li J, Luo X, Qiao M. On generalization error bounds of noisy gradient methods for non-convex learning. arXiv preprint arXiv:1902.00621. 2019 Feb 2. link
Lei Y, Hu T, Li G, Tang K. Stochastic gradient descent for nonconvex learning without bounded gradient assumptions. IEEE transactions on neural networks and learning systems. 2019 Dec 11;31(10):4394-400. link
Dubey SR, Chakraborty S, Roy SK, Mukherjee S, Singh SK, Chaudhuri BB. diffGrad: an optimization method for convolutional neural networks. IEEE transactions on neural networks and learning systems. 2019 Dec 23;31(11):4500-11. link
Tong Q, Liang G, Bi J. Calibrating the learning rate for adaptive gradient methods to improve generalization performance. arXiv preprint arXiv:1908.00700. 2019. link
Stich SU. Unified optimal analysis of the (stochastic) gradient method. arXiv preprint arXiv:1907.04232. 2019 Jul 9. link
Sievert S, Shah S. Improving the convergence of SGD through adaptive batch sizes. arXiv preprint arXiv:1910.08222. 2019 Oct 18. link
Kreusser LM, Osher SJ, Wang B. A deterministic approach to avoid saddle points. arXiv preprint arXiv:1901.06827. 2019 Jan 22. link
Cao Y, Gu Q. Generalization bounds of stochastic gradient descent for wide and deep neural networks. Advances in neural information processing systems. 2019;32. link
Yu X, Wu L, Xu C, Hu Y, Ma C. A novel neural network for solving nonsmooth nonconvex optimization problems. IEEE transactions on neural networks and learning systems. 2019 Jun 27;31(5):1475-88. link
Yang Y, Pesavento M, Luo ZQ, Ottersten B. Inexact block coordinate descent algorithms for nonsmooth nonconvex optimization. IEEE Transactions on Signal Processing. 2019 Dec 11;68:947-61. link
Liu R, Cheng S, He Y, Fan X, Lin Z, Luo Z. On the convergence of learning-based iterative methods for nonconvex inverse problems. IEEE transactions on pattern analysis and machine intelligence. 2019 Jun 3;42(12):3027-39. link
Li W, Bian W, Xue X. Projected neural network for a class of non-Lipschitz optimization problems with linear constraints. IEEE Transactions on Neural Networks and Learning Systems. 2019 Nov 1;31(9):3361-73. link
Allen-Zhu Z, Li Y, Liang Y. Learning and generalization in overparameterized neural networks, going beyond two layers. Advances in neural information processing systems. 2019;32. link
Chang FC, Wang HJ, Chou CN, Chang EY. G2R Bound: A Generalization Bound for Supervised Learning from GAN-Synthetic Data. arXiv preprint arXiv:1905.12313. 2019 May 29. link
Frei S, Cao Y, Gu Q. Algorithm-dependent generalization bounds for overparameterized deep residual networks. Advances in neural information processing systems. 2019;32. link
Latorre F, Cevher V. Fast and provable ADMM for learning with generative priors. Advances in Neural Information Processing Systems. 2019;32. link
Chen L, Fang F, Wang T, Zhang G. Blind image deblurring with local maximum gradient prior. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2019 (pp. 1742-1750). link
Zou F, Shen L, Jie Z, Zhang W, Liu W. A sufficient condition for convergences of adam and rmsprop. InProceedings of the IEEE/CVF Conference on computer vision and pattern recognition 2019 (pp. 11127-11135). link
Liu M, Mroueh Y, Ross J, Zhang W, Cui X, Das P, Yang T. Towards better understanding of adaptive gradient algorithms in generative adversarial nets. arXiv preprint arXiv:1912.11940. 2019 Dec 26. link
Cotter A, Jiang H, Sridharan K. Two-player games for efficient non-convex constrained optimization. InAlgorithmic Learning Theory 2019 Mar 10 (pp. 300-332). PMLR. link
Sanyal A, Torr PH, Dokania PK. Stable rank normalization for improved generalization in neural networks and gans. arXiv preprint arXiv:1906.04659. 2019 Jun 11. link
Ma YA, Chen Y, Jin C, Flammarion N, Jordan MI. Sampling can be faster than optimization. Proceedings of the National Academy of Sciences. 2019 Oct 15;116(42):20881-5. link
Sahin MF, Alacaoglu A, Latorre F, Cevher V. An inexact augmented Lagrangian framework for nonconvex optimization with nonlinear constraints. Advances in Neural Information Processing Systems. 2019;32. link
Zhang Y, Liu T, Long M, Jordan M. Bridging theory and algorithm for domain adaptation. InInternational conference on machine learning 2019 May 24 (pp. 7404-7413). PMLR. link
Khrulkov V, Oseledets I. Universality theorems for generative models. arXiv preprint arXiv:1905.11520. 2019 May 27. link
Sedghi H. Size-free generalization bounds for convolutional neural networks. link
Mukherjee S. General information bottleneck objectives and their applications to machine learning. arXiv preprint arXiv:1912.06248. 2019 Dec 12. link
Hu Z, Yang Z, Salakhutdinov RR, Qin LI, Liang X, Dong H, Xing EP. Deep generative models with learnable knowledge constraints. Advances in Neural Information Processing Systems. 2018;31. link
Sun H, Chen X, Shi Q, Hong M, Fu X, Sidiropoulos ND. Learning to optimize: Training deep neural networks for interference management. IEEE Transactions on Signal Processing. 2018 Aug 23;66(20):5438-53. link
Yu H, Zhang Z, Qin Z, Wu H, Li D, Zhao J, Lu X. Loss rank mining: A general hard example mining method for real-time detectors. In2018 International Joint Conference on neural networks (IJCNN) 2018 Jul 8 (pp. 1-8). IEEE. link
Zhao P, Lai L. Nonparametric direct entropy difference estimation. In2018 IEEE Information Theory Workshop (ITW) 2018 Nov 25 (pp. 1-5). IEEE. link
Lee C, Zame W, Yoon J, Van Der Schaar M. Deephit: A deep learning approach to survival analysis with competing risks. InProceedings of the AAAI conference on artificial intelligence 2018 Apr 26 (Vol. 32, No. 1). link
Sanjabi M, Ba J, Razaviyayn M, Lee JD. Solving approximate Wasserstein GANs to stationarity. arXiv preprint arXiv:1802.08249. 2018. link
Zou F, Shen L. On the convergence of adagrad with momentum for training deep neural networks. arXiv preprint arXiv:1808.03408. 2018 Aug 10. link
Du SS, Zhai X, Poczos B, Singh A. Gradient descent provably optimizes over-parameterized neural networks. arXiv preprint arXiv:1810.02054. 2018 Oct 4. link
Vishnoi NK. Geodesic convex optimization: Differentiation on manifolds, geodesics, and convexity. arXiv preprint arXiv:1806.06373. 2018 Jun 17. link
Li J, Madry A, Peebles J, Schmidt L. On the limitations of first-order approximation in GAN dynamics. InInternational Conference on Machine Learning 2018 Jul 3 (pp. 3005-3013). PMLR. link
Yin D, Chen Y, Kannan R, Bartlett P. Byzantine-robust distributed learning: Towards optimal statistical rates. InInternational conference on machine learning 2018 Jul 3 (pp. 5650-5659). PMLR. link
Chen J, Zhou D, Tang Y, Yang Z, Cao Y, Gu Q. Closing the generalization gap of adaptive gradient methods in training deep neural networks. arXiv preprint arXiv:1806.06763. 2018 Jun 18. link
Hu Y, Liu X, Jacob M. A generalized structured low-rank matrix completion algorithm for MR image recovery. IEEE transactions on medical imaging. 2018 Dec 11;38(8):1841-51. link
Mardani M, Sun Q, Donoho D, Papyan V, Monajemi H, Vasanawala S, Pauly J. Neural proximal gradient descent for compressive imaging. Advances in Neural Information Processing Systems. 2018;31. link
Smirnov E, Melnikov A, Oleinik A, Ivanova E, Kalinovskiy I, Luckyanets E. Hard example mining with auxiliary embeddings. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops 2018 (pp. 37-46). link
Liu C, Belkin M. Accelerating SGD with momentum for over-parameterized learning. arXiv preprint arXiv:1810.13395. 2018 Oct 31. link
Li X, Lu J, Wang Z, Haupt J, Zhao T. On tighter generalization bound for deep neural networks: Cnns, resnets, and beyond. arXiv preprint arXiv:1806.05159. 2018 Jun 13. link
Neyshabur B, Li Z, Bhojanapalli S, LeCun Y, Srebro N. Towards understanding the role of over-parametrization in generalization of neural networks. arXiv preprint arXiv:1805.12076. 2018 May 30. link
Tran-Dinh Q, Cevher V. Smoothing Alternating Direction Methods for Fully Nonsmooth Constrained Convex Optimization. Large-Scale and Distributed Optimization. 2018:57-95. link
Xu Z, Figueiredo MA, Yuan X, Studer C, Goldstein T. Adaptive relaxed ADMM: Convergence theory and practical implementation. InProceedings of the IEEE conference on computer vision and pattern recognition 2017 (pp. 7389-7398). link
Perekrestenko D, Cevher V, Jaggi M. Faster coordinate descent via adaptive importance sampling. InArtificial Intelligence and Statistics 2017 Apr 10 (pp. 869-877). PMLR. link
Haeffele BD, Vidal R. Global optimality in neural network training. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2017 (pp. 7331-7339). link
Ma S, Nguyen CT, Christodoulou AG, Luthringer D, Kobashigawa J, Lee SE, Chang HJ, Li D. Accelerated cardiac diffusion tensor imaging using joint low-rank and sparsity constraints. IEEE Transactions on Biomedical Engineering. 2017 Dec 25;65(10):2219-30. link
Boyd N, Schiebinger G, Recht B. The alternating descent conditional gradient method for sparse inverse problems. SIAM Journal on Optimization. 2017;27(2):616-39. link
Drusvyatskiy D, Wolkowicz H. The many faces of degeneracy in conic optimization. Foundations and Trends® in Optimization. 2017 Dec 19;3(2):77-170. link
Du SS, Jin C, Lee JD, Jordan MI, Singh A, Poczos B. Gradient descent can take exponential time to escape saddle points. Advances in neural information processing systems. 2017;30. link
Lu H, Kawaguchi K. Depth creates no bad local minima. arXiv preprint arXiv:1702.08580. 2017 Feb 27. link
Agarwal N, Allen-Zhu Z, Bullins B, Hazan E, Ma T. Finding approximate local minima faster than gradient descent. InProceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing 2017 Jun 19 (pp. 1195-1199). link
Brutzkus A, Globerson A. Globally optimal gradient descent for a convnet with gaussian inputs. InInternational conference on machine learning 2017 Jul 17 (pp. 605-614). PMLR. link
Li Y, Yuan Y. Convergence analysis of two-layer neural networks with relu activation. Advances in neural information processing systems. 2017;30. link
Nguyen Q, Hein M. The loss surface of deep and wide neural networks. InInternational conference on machine learning 2017 Jul 17 (pp. 2603-2612). PMLR. link
Allen-Zhu Z. Natasha 2: Faster non-convex optimization than SGD. arXiv preprint arXiv:1708.08694. 2017. link
Park J, Boyd S. General heuristics for nonconvex quadratically constrained quadratic programming. arXiv preprint arXiv:1703.07870. 2017 Mar 22. link
Lei L, Ju C, Chen J, Jordan MI. Non-convex finite-sum optimization via scsg methods. Advances in neural information processing systems. 2017;30. link
Hong M, Luo ZQ, Razaviyayn M. Convergence analysis of alternating direction method of multipliers for a family of nonconvex problems. SIAM Journal on Optimization. 2016;26(1):337-64. link
Moritz P, Nishihara R, Jordan M. A linearly-convergent stochastic L-BFGS algorithm. InArtificial Intelligence and Statistics 2016 May 2 (pp. 249-258). PMLR. link
Soudry D, Carmon Y. No bad local minima: Data independent training error guarantees for multilayer neural networks. arXiv preprint arXiv:1605.08361. 2016 May 26. link
Taylor G, Burmeister R, Xu Z, Singh B, Patel A, Goldstein T. Training neural networks without gradients: A scalable admm approach. InInternational conference on machine learning 2016 Jun 11 (pp. 2722-2731). PMLR. link
Eldan R, Shamir O. The power of depth for feedforward neural networks. InConference on learning theory 2016 Jun 6 (pp. 907-940). PMLR. link
Safran I, Shamir O. On the quality of the initial basin in overspecified neural networks. InInternational Conference on Machine Learning 2016 Jun 11 (pp. 774-782). PMLR. link
Ravishankar S, Bresler Y. Sparsifying transform learning with efficient optimal updates and convergence guarantees. IEEE Transactions on Signal Processing. 2015 Feb 19;63(9):2389-404. link
Giselsson P. Tight linear convergence rate bounds for Douglas-Rachford splitting and ADMM. In2015 54th IEEE Conference on Decision and Control (CDC) 2015 Dec 15 (pp. 3305-3310). IEEE. link
Zheng Q, Lafferty J. A convergent gradient descent algorithm for rank minimization and semidefinite programming from random linear measurements. Advances in Neural Information Processing Systems. 2015;28. link
Chen Y, Wainwright MJ. Fast low-rank estimation by projected gradient descent: General statistical and algorithmic guarantees. arXiv preprint arXiv:1509.03025. 2015 Sep 10. link
Sun C, Dai R. An iterative approach to rank minimization problems. In2015 54th IEEE Conference on Decision and Control (CDC) 2015 Dec 15 (pp. 3317-3323). IEEE. link
Scutari G, Facchinei F, Lampariello L, Song P. Parallel and distributed methods for nonconvex optimization. In2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2014 May 4 (pp. 840-844). IEEE. link
Goldstein T, O'Donoghue B, Setzer S, Baraniuk R. Fast alternating direction optimization methods. SIAM Journal on Imaging Sciences. 2014;7(3):1588-623. link
Giselsson P, Boyd S. Metric selection in Douglas-Rachford splitting and ADMM. arXiv preprint arXiv:1410.8479. 2014 Oct. link
Blumensath T. Compressed sensing with nonlinear observations and related nonlinear optimization problems. IEEE Transactions on Information Theory. 2013 Feb 22;59(6):3466-74. link
Nesterov Y. Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM Journal on Optimization. 2012;22(2):341-62. link
Tao M, Yuan X. On the O(1/t) Convergence Rate of Alternating Direction Method with Logarithmic-Quadratic Proximal Regularization. SIAM Journal on Optimization. 2012;22(4):1431-48. link
Chen Y, Ye X. Projection onto a simplex. arXiv preprint arXiv:1101.6081. 2011 Jan 31. link
Ben-David S, Blitzer J, Crammer K, Kulesza A, Pereira F, Vaughan JW. A theory of learning from different domains. Machine learning. 2010 May;79:151-75. link
Gregor K, LeCun Y. Learning fast approximations of sparse coding. InProceedings of the 27th international conference on international conference on machine learning 2010 Jun 21 (pp. 399-406). link
Cai JF, Osher S, Shen Z. Linearized Bregman iterations for compressed sensing. Mathematics of computation. 2009;78(267):1515-36. link
Cai JF, Osher S, Shen Z. Convergence of the linearized Bregman iteration for ℓ₁-norm minimization. Mathematics of Computation. 2009;78(268):2127-36. link
Daubechies I, Fornasier M, Loris I. Accelerated projected gradient method for linear inverse problems with sparsity constraints. journal of fourier analysis and applications. 2008 Dec;14:764-92. link
Yin W, Osher S, Goldfarb D, Darbon J. Bregman iterative algorithms for ℓ₁-minimization with applications to compressed sensing. SIAM Journal on Imaging Sciences. 2008;1(1):143-68. link
Chartrand R, Staneva V. Restricted isometry properties and nonconvex compressive sensing. Inverse Problems. 2008 May 14;24(3):035020. link
Ben-David S, Blitzer J, Crammer K, Pereira F. Analysis of representations for domain adaptation. Advances in neural information processing systems. 2006;19. link
Burer S, Monteiro RD. Local minima and convergence in low-rank semidefinite programming. Mathematical programming. 2005 Jul;103(3):427-44. link
Daubechies I, Defrise M, De Mol C. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Communications on Pure and Applied Mathematics: A Journal Issued by the Courant Institute of Mathematical Sciences. 2004 Nov;57(11):1413-57. link
Bertsekas DP, Tsitsiklis JN. Gradient convergence in gradient methods with errors. SIAM Journal on Optimization. 2000;10(3):627-42. link
Liu S, Zeng Z, Ren T, Li F, Zhang H, Yang J, Jiang Q, Li C, Yang J, Su H, Zhu J. Grounding dino: Marrying dino with grounded pre-training for open-set object detection. InEuropean Conference on Computer Vision 2024 Sep 29 (pp. 38-55). Cham: Springer Nature Switzerland. link
Hu J, Lin J, Gong S, Cai W. Relax image-specific prompt requirement in sam: A single generic prompt for segmenting camouflaged objects. InProceedings of the AAAI Conference on Artificial Intelligence 2024 Mar 24 (Vol. 38, No. 11, pp. 12511-12518). link
Gao Z, Du Y, Zhang X, Ma X, Han W, Zhu SC, Li Q. Clova: A closed-loop visual assistant with tool usage and update. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024 (pp. 13258-13268). link
Guo J, Wang N, Qi L, Shi Y. Aloft: A lightweight mlp-like architecture with dynamic low-frequency transform for domain generalization. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 24132-24141). link
Saberi SA, Najafi A, Motahari A, Khalaj B. Sample complexity bounds for learning high-dimensional simplices in noisy regimes. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 29514-29541). PMLR. link
Gao S, Zhou C, Zhang J. Generalized relation modeling for transformer tracking. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 18686-18695). link
Dong P, Kong Z, Meng X, Zhang P, Tang H, Wang Y, Chou CH. SpeedDETR: Speed-aware transformers for end-to-end object detection. link
Dai R, Zhang Y, Fang Z, Han B, Tian X. Moderately distributional exploration for domain generalization. arXiv preprint arXiv:2304.13976. 2023 Apr 27. link
Cui J, Liang J, Yue Q, Liang J. A general representation learning framework with generalization performance guarantees. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 6522-6544). PMLR. link
Lin S, Ju P, Liang Y, Shroff N. Theory on forgetting and generalization of continual learning. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 21078-21100). PMLR. link
Lin A, Tolooshams B, Atchadé Y, Ba DE. Probabilistic unrolling: Scalable, inverse-free maximum likelihood estimation for latent Gaussian models. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 21153-21181). PMLR. link
Hu Z, Wu X, Huang H. Beyond lipschitz smoothness: A tighter analysis for nonconvex optimization. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 13652-13678). PMLR. link
Chuang CY, Jegelka S, Alvarez-Melis D. Infoot: Information maximizing optimal transport. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 6228-6242). PMLR. link
Chen L, Zhang Y, Song Y, Shan Y, Liu L. Improved test-time adaptation for domain generalization. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 24172-24182). link
Castro E, Pereira JC, Cardoso JS. Symmetry-based regularization in deep breast cancer screening. Medical Image Analysis. 2023 Jan 1;83:102690. link
Bai H, Canal G, Du X, Kwon J, Nowak RD, Li Y. Feed two birds with one scone: Exploiting wild data for both out-of-distribution generalization and detection. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 1454-1471). PMLR. link
Baevski A, Babu A, Hsu WN, Auli M. Efficient self-supervised learning with contextualized target representations for vision, speech and language. InInternational conference on machine learning 2023 Jul 3 (pp. 1416-1429). PMLR. link
Azizi S, Culp L, Freyberg J, Mustafa B, Baur S, Kornblith S, Chen T, Tomasev N, Mitrović J, Strachan P, Mahdavi SS. Robust and data-efficient generalization of self-supervised machine learning for diagnostic imaging. Nature Biomedical Engineering. 2023 Jun;7(6):756-79. link
Hasson H, Maddix DC, Wang B, Gupta G, Park Y. Theoretical guarantees of learning ensembling strategies with applications to time series forecasting. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 12616-12632). PMLR. link
Liu C, Ding H, Jiang X. Gres: Generalized referring expression segmentation. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 23592-23601). link
Godau P, Kalinowski P, Christodoulou E, Reinke A, Tizabi M, Ferrer L, Jäger PF, Maier-Hein L. Deployment of image analysis algorithms under prevalence shifts. InInternational Conference on Medical Image Computing and Computer-Assisted Intervention 2023 Oct 1 (pp. 389-399). Cham: Springer Nature Switzerland. link
Ghosh S, Yu K, Batmanghelich K. Distilling blackbox to interpretable models for efficient transfer learning. InInternational Conference on Medical Image Computing and Computer-Assisted Intervention 2023 Oct 1 (pp. 628-638). Cham: Springer Nature Switzerland. link
Li M, Sun K, Gu Y, Zhang K, Sun Y, Li Z, Shen D. Developing large pre-trained model for breast tumor segmentation from ultrasound images. InInternational Conference on Medical Image Computing and Computer-Assisted Intervention 2023 Oct 1 (pp. 89-96). Cham: Springer Nature Switzerland. link
Li H, Zhu J, Jiang X, Zhu X, Li H, Yuan C, Wang X, Qiao Y, Wang X, Wang W, Dai J. Uni-perceiver v2: A generalist model for large-scale vision and vision-language tasks. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 2691-2700). link
Jin J, Li Z, Lyu K, Du SS, Lee JD. Understanding incremental learning of gradient descent: A fine-grained analysis of matrix sensing. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 15200-15238). PMLR. link
Huang W, Chen C, Li Y, Li J, Li C, Song F, Yan Y, Xiong Z. Style projected clustering for domain generalized semantic segmentation. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 3061-3071). link
Hu S, Liao Z, Xia Y. Devil is in channels: Contrastive single domain generalization for medical image segmentation. InInternational Conference on Medical Image Computing and Computer-Assisted Intervention 2023 Oct 1 (pp. 14-23). Cham: Springer Nature Switzerland. link
Lu J, Clark C, Lee S, Zhang Z, Khosla S, Marten R, Hoiem D, Kembhavi A. Unified-io 2: Scaling autoregressive multimodal models with vision language audio and action. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024 (pp. 26439-26455). link
Wang Z, Ding N, Levinboim T, Chen X, Soricut R. Improving robust generalization by direct pac-bayesian bound minimization. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 16458-16468). link
Lei Y, Yang T, Ying Y, Zhou DX. Generalization analysis for contrastive representation learning. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 19200-19227). PMLR. link
Kong I, Yang D, Lee J, Ohn I, Baek G, Kim Y. Masked Bayesian neural networks: Theoretical guarantee and its posterior inference. InInternational conference on machine learning 2023 Jul 3 (pp. 17462-17491). PMLR. link
Kawaguchi K, Deng Z, Ji X, Huang J. How does information bottleneck help deep learning?. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 16049-16096). PMLR. link
Zandieh A, Han I, Daliri M, Karbasi A. Kdeformer: Accelerating transformers via kernel density estimation. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 40605-40623). PMLR. link
Wu J, Zou D, Chen Z, Braverman V, Gu Q, Kakade SM. Finite-sample analysis of learning high-dimensional single relu neuron. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 37919-37951). PMLR. link
Tong P, Su W, Li H, Ding J, Haoxiang Z, Chen SX. Distribution free domain generalization. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 34369-34378). PMLR. link
Williams DJ, Liu S. Approximate stein classes for truncated density estimation. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 37066-37090). PMLR. link
Wang Y, Chen Y, Jamieson K, Du SS. Improved active multi-task representation learning via lasso. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 35548-35578). PMLR. link
Shen M, Bu Y, Wornell GW. On balancing bias and variance in unsupervised multi-source-free domain adaptation. InInternational conference on machine learning 2023 Jul 3 (pp. 30976-30991). PMLR. link
Mazzetto A, Upfal E. Nonparametric density estimation under distribution drift. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 24251-24270). PMLR. link
Gálvez BR, Blaas A, Rodríguez P, Golinski A, Suau X, Ramapuram J, Busbridge D, Zappella L. The role of entropy and reconstruction in multi-view self-supervised learning. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 29143-29160). PMLR. link
Evron I, Moroshko E, Buzaglo G, Khriesh M, Marjieh B, Srebro N, Soudry D. Continual learning in linear classification on separable data. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 9440-9484). PMLR. link
Yuan Y. On the power of foundation models. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 40519-40530). PMLR. link
Wei J, Narasimhan H, Amid E, Chu WS, Liu Y, Kumar A. Distributionally robust post-hoc classifiers under prior shifts. arXiv preprint arXiv:2309.08825. 2023 Sep 16. link
Wang X, Wang W, Cao Y, Shen C, Huang T. Images speak in images: A generalist painter for in-context visual learning. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 6830-6839). link
Noarov G, Roth A. The statistical scope of multicalibration. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 26283-26310). PMLR. link
Oko K, Akiyama S, Suzuki T. Diffusion models are minimax optimal distribution estimators. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 26517-26582). PMLR. link
Lv T, Liu Y, Miao K, Li L, Pan X. Diffusion kinetic model for breast cancer segmentation in incomplete dce-mri. InInternational Conference on Medical Image Computing and Computer-Assisted Intervention 2023 Oct 1 (pp. 100-109). Cham: Springer Nature Switzerland. link
Song Z, Ye M, Yin J, Zhang L. A Nearly-Optimal Bound for Fast Regression with $\ell_\infty$ Guarantee. arXiv preprint arXiv:2302.00248. 2023 Feb 1. link
Brekelmans R, Huang S, Ghassemi M, Steeg GV, Grosse R, Makhzani A. Improving mutual information estimation with annealed and energy-based bounds. arXiv preprint arXiv:2303.06992. 2023 Mar 13. link
Ramaswamy VV, Kim SS, Fong R, Russakovsky O. Overlooked factors in concept-based explanations: Dataset choice, concept learnability, and human capability. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 10932-10941). link
Aghbalou A, Staerman G. Hypothesis transfer learning with surrogate classification losses. arXiv preprint arXiv:2305.19694. 2023. link
Federici M, Ruhe D, Forré P. On the effectiveness of hybrid mutual information estimation. arXiv preprint arXiv:2306.00608. 2023 Jun 1. link
Lachapelle S, Deleu T, Mahajan D, Mitliagkas I, Bengio Y, Lacoste-Julien S, Bertrand Q. Synergies between disentanglement and sparsity: Generalization and identifiability in multi-task learning. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 18171-18206). PMLR. link
Wei H, Zhuang H, Xie R, Feng L, Niu G, An B, Li Y. Mitigating memorization of noisy labels by clipping the model prediction. InInternational conference on machine learning 2023 Jul 3 (pp. 36868-36886). PMLR. link
Aggarwal P, Deshpande A, Narasimhan KR. Semsup-xc: semantic supervision for zero and few-shot extreme classification. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 228-247). PMLR. link
Wu Y, Zheng B, Chen J, Chen DZ, Wu J. Self-learning and one-shot learning based single-slice annotation for 3d medical image segmentation. InInternational Conference on Medical Image Computing and Computer-Assisted Intervention 2022 Sep 16 (pp. 244-254). Cham: Springer Nature Switzerland. link
Johnson DD, Hanchi AE, Maddison CJ. Contrastive learning can find an optimal basis for approximately view-invariant functions. arXiv preprint arXiv:2210.01883. 2022 Oct 4. link
Yang K, Zhou T, Tian X, Tao D. Identity-disentangled adversarial augmentation for self-supervised learning. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 25364-25381). PMLR. link
Zhang J, Lin H, Das S, Sra S, Jadbabaie A. Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 26347-26361). PMLR. link
Teng J, Huang W, He H. Can pretext-based self-supervised learning be boosted by downstream data? a theoretical analysis. InInternational Conference on Artificial Intelligence and Statistics 2022 May 3 (pp. 4198-4216). PMLR. link
Mustafa W, Lei Y, Kloft M. On the generalization analysis of adversarial learning. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 16174-16196). PMLR. link
Manole T, Ho N. Refined convergence rates for maximum likelihood estimation under finite mixture models. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 14979-15006). PMLR. link
Liu R, Liu X, Zeng S, Zhang J, Zhang Y. Optimization-derived learning with essential convergence analysis of training and hyper-training. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 13825-13856). PMLR. link
Yoo J, Lee H, Seo S, Chung I, Kwak N. End-to-end multi-object detection with a regularized mixture model. arXiv preprint arXiv:2205.08714. 2022 May 18. link
Zhu Z, Dong Z, Liu Y. Detecting corrupted labels without training a model to predict. InInternational conference on machine learning 2022 Jun 28 (pp. 27412-27427). PMLR. link
Zhou Z, Qi L, Yang X, Ni D, Shi Y. Generalizable cross-modality medical image segmentation via style augmentation and dual normalization. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 20856-20865). link
Zhou K, Liu Z, Qiao Y, Xiang T, Loy CC. Domain generalization: A survey. IEEE transactions on pattern analysis and machine intelligence. 2022 Aug 1;45(4):4396-415. link
Zhang Y, Jiang H, Miura Y, Manning CD, Langlotz CP. Contrastive learning of medical visual representations from paired images and text. InMachine learning for healthcare conference 2022 Dec 31 (pp. 2-25). PMLR. link
Zhang H, Zhang YF, Liu W, Weller A, Schölkopf B, Xing EP. Towards principled disentanglement for domain generalization. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 8024-8034). link
Xu A, Li W, Guo P, Yang D, Roth HR, Hatamizadeh A, Zhao C, Xu D, Huang H, Xu Z. Closing the generalization gap of cross-silo federated medical image segmentation. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2022 (pp. 20866-20875). link
Witowski J, Heacock L, Reig B, Kang SK, Lewin A, Pysarenko K, Patel S, Samreen N, Rudnicki W, Łuczyńska E, Popiela T. Improving breast cancer diagnostics with deep learning for MRI. Science translational medicine. 2022 Sep 28;14(664):eabo4802. link
Zhou HY, Chen X, Zhang Y, Luo R, Wang L, Yu Y. Generalized radiograph representation learning via cross-supervision between images and free-text radiology reports. Nature Machine Intelligence. 2022 Jan;4(1):32-40. link
Zhang X, Blanchet J, Ghosh S, Squillante MS. A class of geometric structures in transfer learning: Minimax bounds and optimality. InInternational Conference on Artificial Intelligence and Statistics 2022 May 3 (pp. 3794-3820). PMLR. link
Yang Z, Gan Z, Wang J, Hu X, Ahmed F, Liu Z, Lu Y, Wang L. Unitab: Unifying text and box outputs for grounded vision-language modeling. InEuropean Conference on Computer Vision 2022 Oct 23 (pp. 521-539). Cham: Springer Nature Switzerland. link
Yao Y, Lin Q, Yang T. Large-scale optimization of partial auc in a range of false positive rates. Advances in Neural Information Processing Systems. 2022 Dec 6;35:31239-53. link
Shah K, Deshpande A, Goyal N. Learning and Generalization in Overparameterized Normalizing Flows. InInternational Conference on Artificial Intelligence and Statistics 2022 May 3 (pp. 9430-9504). PMLR. link
Xu X, Zhang JY, Ma E, Son HH, Koyejo S, Li B. Adversarially robust models may not transfer better: Sufficient conditions for domain transferability from the view of regularization. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 24770-24802). PMLR. link
Xu J, Teng J, Yao AC. Relaxing the feature covariance assumption: Time-variant bounds for benign overfitting in linear regression. arXiv preprint arXiv:2202.06054. 2022. link
Xia M, Yang H, Qu Y, Guo Y, Zhou G, Zhang F, Wang Y. Multilevel structure-preserved GAN for domain adaptation in intravascular ultrasound analysis. Medical Image Analysis. 2022 Nov 1;82:102614. link
Li B, Shen Y, Yang J, Wang Y, Ren J, Che T, Zhang J, Liu Z. Sparse mixture-of-experts are domain generalizable learners. arXiv preprint arXiv:2206.04046. 2022 Jun 8. link
Wang Z, Li M, Xu R, Zhou L, Lei J, Lin X, Wang S, Yang Z, Zhu C, Hoiem D, Chang SF. Language models with image descriptors are strong few-shot video-language learners. Advances in Neural Information Processing Systems. 2022 Dec 6;35:8483-97. link
Kumar A, Ma T, Liang P. Understanding self-training for gradual domain adaptation. InInternational conference on machine learning 2020 Nov 21 (pp. 5468-5479). PMLR. link
Wang CR, Gao F, Zhang F, Zhong F, Yu Y, Wang Y. Disentangling disease-related representation from obscure for disease prediction. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 22652-22664). PMLR. link
Shen Z, Yang H, Zhang S. Deep network approximation in terms of intrinsic parameters. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 19909-19934). PMLR. link
Zhou D, Gu Q. Dimension-free Complexity Bounds for High-order Nonconvex Finite-sum Optimization. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 27143-27158). PMLR. link
Yang S. Global Hardest Example Mining with Prototype-based Triplet Loss. link
Aghabozorgi M, Peng S, Li K. Adaptive IMLE for few-shot pretraining-free generative modelling. link
XGBoost tutorials. link
XGBoost frontiers. link
Hollmann N, Müller S, Purucker L, Krishnakumar A, Körfer M, Hoo SB, Schirrmeister RT, Hutter F. Accurate predictions on small data with a tabular foundation model. Nature. 2025 Jan 9;637(8045):319-26. link
Shahbaz M, Basharat A, Yasmin R, Ahmad N, Abbasi R, Hussain S. Leveraging Explainable AI for Early Risk Prediction and Type Classification for Leukemia: Insights using Clinical Data from Pakistan. IEEE Journal of Biomedical and Health Informatics. 2025 Jan 17. link
Imrie F, Denner S, Brunschwig LS, Maier-Hein K, Van Der Schaar M. Automated ensemble multimodal machine learning for healthcare. IEEE Journal of Biomedical and Health Informatics. 2025 Jan 15. link
Condado JG, Elorriaga IT, Cortes JM, Erramuzpe A. AgeML: Age modeling with Machine Learning. IEEE Journal of Biomedical and Health Informatics. 2025 Jan 17. link
Yang Y, Wang ZY, Liu Q, Sun S, Wang K, Chellappa R, Zhou Z, Yuille A, Zhu L, Zhang YD, Chen J. Medical world model: Generative simulation of tumor evolution for treatment planning. arXiv preprint arXiv:2506.02327. 2025 Jun 2. link
Gottweis J, Weng WH, Daryin A, Tu T, Palepu A, Sirkovic P, Myaskovsky A, Weissenberger F, Rong K, Tanno R, Saab K. Towards an AI co-scientist. arXiv preprint arXiv:2502.18864. 2025 Feb 26. link
Wang X, Zhu H. Artificial intelligence in image-based cardiovascular disease analysis: A comprehensive survey and future outlook. arXiv preprint arXiv:2402.03394. 2024 Feb 4. link
Dentamaro V, Giglio P, Impedovo D, Pirlo G, Di Ciano M. An interpretable adaptive multiscale attention deep neural network for tabular data. IEEE Transactions on Neural Networks and Learning Systems. 2024 May 15. link
Ma J, Thomas V, Hosseinzadeh R, Kamkari H, Labach A, Cresswell JC, Golestan K, Yu G, Volkovs M, Caterini AL. Tabdpt: Scaling tabular foundation models. arXiv preprint arXiv:2410.18164. 2024 Oct 23. link
Wen X, Zhang H, Zheng S, Xu W, Bian J. From supervised to generative: A novel paradigm for tabular deep learning with large language models. InProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2024 Aug 25 (pp. 3323-3333). link
van Breugel B, van der Schaar M. Why tabular foundation models should be a research priority. ICML 2024. link
Qi H, Hu Y, Fan R, Deng L. Tab-Cox: An Interpretable Deep Survival Analysis Model for Patients with Nasopharyngeal Carcinoma based on TabNet. IEEE Journal of Biomedical and Health Informatics. 2024 May 13. link
Mendoza A, Tume S, Puri K, Acosta S, Cavallaro JR. Clinical Features and Physiological Signals Fusion Network for Mechanical Circulatory Support Need Prediction in Pediatric Cardiac Intensive Care Unit. IEEE Journal of Biomedical and Health Informatics. 2024 Dec 2. link
Vo HQ, Wang L, Wong KK, Ezeana CF, Yu X, Nguyen HV, Wong ST. Frozen Large-scale Pretrained Vision-Language Models are an Effective Foundational Backbone for Enhancing Multimodal Breast Cancer Risk Assessment. Authorea Preprints. 2024 Jan 22. link
Zhang Y, Sun K, Liu Y, Xie F, Guo Q, Shen D. A Modality-Flexible Framework for Alzheimer's Disease Diagnosis Following Clinical Routine. IEEE Journal of Biomedical and Health Informatics. 2024 Oct 1. link
He H, Xi Y, Chen Y, Malin B, Ho J. A flexible generative model for heterogeneous tabular EHR with missing modality. InThe Twelfth International Conference on Learning Representations 2024 Apr. link
Jolicoeur-Martineau A, Fatras K, Kachman T. Generating and imputing tabular data via diffusion and flow-based gradient-boosted trees. InInternational Conference on Artificial Intelligence and Statistics 2024 Apr 18 (pp. 1288-1296). PMLR. link
Tran QM, Hoang SN, Nguyen LM, Phan D, Lam HT. TabularFM: An open framework for tabular foundational models. In2024 IEEE International Conference on Big Data (BigData) 2024 Dec 15 (pp. 1694-1699). IEEE. link
Cresswell JC, Kim T. Scaling Up Diffusion and Flow-based XGBoost Models. arXiv preprint arXiv:2408.16046. 2024 Aug 28. link
Chen GH. Survival kernets: Scalable and interpretable deep kernel survival analysis with an accuracy guarantee. Journal of Machine Learning Research. 2024;25(40):1-78. link
Zhang H, Wen X, Zheng S, Xu W, Bian J. Towards foundation models for learning on tabular data. link
Fang X, Xu W, Tan FA, Zhang J, Hu Z, Qi Y, Nickleach S, Socolinsky D, Sengamedu S, Faloutsos C. Large Language Models (LLMs) on Tabular Data: Prediction, Generation, and Understanding--A Survey. arXiv preprint arXiv:2402.17944. 2024 Feb 27. link
Ferté T, Dutartre D, Hejblum BP, Griffier R, Jouhet V, Thiébaut R, Legrand P, Hinaut X. Reservoir computing for short high-dimensional time series: an application to SARS-CoV-2 hospitalization forecast. Proceedings of Machine Learning Research. 2024 Jul 8. link
Emami S, Martínez-Muñoz G. Sequential training of neural networks with gradient boosting. IEEE Access. 2023 Apr 28;11:42738-50. link
Tang J, Hua F, Gao Z, Zhao P, Li J. Gadbench: Revisiting and benchmarking supervised graph anomaly detection. Advances in Neural Information Processing Systems. 2023 Dec 15;36:29628-53. link
Hegselmann S, Buendia A, Lang H, Agrawal M, Jiang X, Sontag D. Tabllm: Few-shot classification of tabular data with large language models. InInternational Conference on Artificial Intelligence and Statistics 2023 Apr 11 (pp. 5549-5581). PMLR. link
Kotelnikov A, Baranchuk D, Rubachev I, Babenko A. Tabddpm: Modelling tabular data with diffusion models. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 17564-17579). PMLR. link
Wang Z, Zhang W, Liu N, Wang J. Learning interpretable rules for scalable data representation and classification. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2023 Oct 31;46(2):1121-33. link
Yang J, Soltan AA, Eyre DW, Clifton DA. Algorithmic fairness and bias mitigation for clinical machine learning with deep reinforcement learning. Nature Machine Intelligence. 2023 Aug;5(8):884-94. link
Ruiz C, Ren H, Huang K, Leskovec J. High dimensional, tabular deep learning with an auxiliary knowledge graph. Advances in Neural Information Processing Systems. 2023 Dec 15;36:26348-71. link
Qian Z, Davis R, Van Der Schaar M. Synthcity: a benchmark framework for diverse use cases of tabular synthetic data. Advances in neural information processing systems. 2023 Dec 15;36:3173-88. link
Errica F. On class distributions induced by nearest neighbor graphs for node classification of tabular data. Advances in Neural Information Processing Systems. 2023 Dec 15;36:28910-40. link
Papicchio S, Papotti P, Cagliero L. Qatch: Benchmarking sql-centric tasks with table representation learning models on your data. Advances in Neural Information Processing Systems. 2023 Dec 15;36:30898-917. link
Chen P, Sarkar S, Lausen L, Srinivasan B, Zha S, Huang R, Karypis G. Hytrel: Hypergraph-enhanced tabular data representation learning. Advances in Neural Information Processing Systems. 2023 Dec 15;36:32173-93. link
Hansen L, Seedat N, van der Schaar M, Petrovic A. Reimagining synthetic tabular data generation through data-centric AI: A comprehensive benchmark. Advances in neural information processing systems. 2023 Dec 15;36:33781-823. link
Cherepanova V, Levin R, Somepalli G, Geiping J, Bruss CB, Wilson AG, Goldstein T, Goldblum M. A performance-driven benchmark for feature selection in tabular deep learning. Advances in Neural Information Processing Systems. 2023 Dec 15;36:41956-79. link
Beyazit E, Kozaczuk J, Li B, Wallace V, Fadlallah B. An inductive bias for tabular deep learning. Advances in Neural Information Processing Systems. 2023 Dec 15;36:43108-35. link
Liu J, Wang T, Cui P, Namkoong H. On the need for a language describing distribution shifts: Illustrations on tabular datasets. Advances in Neural Information Processing Systems. 2023 Dec 15;36:51371-408. link
Gardner J, Popovic Z, Schmidt L. Benchmarking distribution shift in tabular data with tableshift. Advances in Neural Information Processing Systems. 2023 Dec 15;36:53385-432. link
McElfresh D, Khandagale S, Valverde J, Prasad C V, Ramakrishnan G, Goldblum M, White C. When do neural nets outperform boosted trees on tabular data?. Advances in Neural Information Processing Systems. 2023 Dec 15;36:76336-69. link
Chen KY, Chiang PH, Chou HR, Chen TW, Chang TH. Trompt: Towards a better deep neural network for tabular data. arXiv preprint arXiv:2305.18446. 2023 May 29. link
Telyatnikov L, Scardapane S. Egg-gae: scalable graph neural networks for tabular data imputation. InInternational conference on artificial intelligence and statistics 2023 Apr 11 (pp. 2661-2676). PMLR. link
Zhu B, Shi X, Erickson N, Li M, Karypis G, Shoaran M. XTab: cross-table pretraining for tabular transformers. InProceedings of the 40th International Conference on Machine Learning 2023 Jul 23 (pp. 43181-43204). link
Akhtar N, Jalwana MA. Towards credible visual model interpretation with path attribution. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 439-457). PMLR. link
Barnes J, Brendel M, Gao VR, Rajendran S, Kim J, Li Q, Malmsten JE, Sierra JT, Zisimopoulos P, Sigaras A, Khosravi P. A non-invasive artificial intelligence approach for the prediction of human blastocyst ploidy: a retrospective model development and validation study. The Lancet Digital Health. 2023 Jan 1;5(1):e28-40. link
Doudesis D, Lee KK, Boeddinghaus J, Bularga A, Ferry AV, Tuck C, Lowry MT, Lopez-Ayala P, Nestelberger T, Koechlin L, Bernabeu MO. Machine learning for diagnosis of myocardial infarction using cardiac troponin concentrations. Nature Medicine. 2023 May;29(5):1201-10. link
Doveh S, Arbelle A, Harary S, Schwartz E, Herzig R, Giryes R, Feris R, Panda R, Ullman S, Karlinsky L. Teaching structured vision & language concepts to vision & language models. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 2657-2668). link
Ghosh S, Yu K, Arabshahi F, Batmanghelich K. Dividing and conquering a blackbox to a mixture of interpretable models: route, interpret, repeat. InInternational Conference on Machine Learning 2023 Jul (Vol. 202, p. 11360). PMLR. link
Hager P, Menten MJ, Rueckert D. Best of both worlds: Multimodal contrastive learning with tabular and imaging data. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 23924-23935). link
Kwong JC, Khondker A, Meng E, Taylor N, Kuk C, Perlis N, Kulkarni GS, Hamilton RJ, Fleshner NE, Finelli A, van der Kwast TH. Development, multi-institutional external validation, and algorithmic audit of an artificial intelligence-based Side-specific Extra-Prostatic Extension Risk Assessment tool (SEPERA) for patients undergoing radical prostatectomy: a retrospective cohort study. The Lancet Digital Health. 2023 Jul 1;5(7):e435-45. link
Liu X, Hu P, Yeung W, Zhang Z, Ho V, Liu C, Dumontier C, Thoral PJ, Mao Z, Cao D, Mark RG. Illness severity assessment of older adults in critical illness using machine learning (ELDER-ICU): an international multicentre study with subgroup bias evaluation. The Lancet Digital Health. 2023 Oct 1;5(10):e657-67. link
Zeng A, Wu C, Lin G, Xie W, Hong J, Huang M, Zhuang J, Bi S, Pan D, Ullah N, Khan KN. ImageCAS: A large-scale dataset and benchmark for coronary artery segmentation based on computed tomography angiography images. Computerized Medical Imaging and Graphics. 2023 Oct 1;109:102287. link
Li X, Hu Y, Li C, Yang X, Jiang T. Sparse estimation via lower-order penalty optimization methods in high-dimensional linear regression. Journal of Global Optimization. 2023 Feb;85(2):315-49. link
Celentano M, Montanari A, Wei Y. The lasso with general gaussian designs with applications to hypothesis testing. The Annals of Statistics. 2023 Oct;51(5):2194-220. link
Jentzen A, Welti T. Overall error analysis for the training of deep neural networks via stochastic gradient descent with random initialisation. Applied Mathematics and Computation. 2023 Oct 15;455:127907. link
Damm S, Forster D, Velychko D, Dai Z, Fischer A, Lücke J. The elbo of variational autoencoders converges to a sum of entropies. InInternational Conference on Artificial Intelligence and Statistics 2023 Apr 11 (pp. 3931-3960). PMLR. link
Ahn K, Zhang J, Sra S. Understanding the unstable convergence of gradient descent. InInternational conference on machine learning 2022 Jun 28 (pp. 247-257). PMLR. link
Biza O, Platt R, van de Meent JW, Wong LL, Kipf T. Binding actions to objects in world models. arXiv preprint arXiv:2204.13022. 2022 Apr 27. link
Oikonomou EK, Spatz ES, Suchard MA, Khera R. Individualising intensive systolic blood pressure reduction in hypertension using computational trial phenomaps and machine learning: a post-hoc analysis of randomised clinical trials. The Lancet Digital Health. 2022 Nov 1;4(11):e796-805. link
Zhang Y, Ma J, Li J. Coronary r-cnn: Vessel-wise method for coronary artery lesion detection and analysis in coronary ct angiography. InInternational Conference on Medical Image Computing and Computer-Assisted Intervention 2022 Sep 16 (pp. 207-216). Cham: Springer Nature Switzerland. link
Birrell J, Katsoulakis MA, Pantazis Y. Optimizing variational representations of divergences and accelerating their statistical estimation. IEEE Transactions on Information Theory. 2022 Mar 18;68(7):4553-72. link
Grinsztajn L, Oyallon E, Varoquaux G. Why do tree-based models still outperform deep learning on typical tabular data?. Advances in neural information processing systems. 2022 Dec 6;35:507-20. link
Rodriguez-Almeida AJ, Fabelo H, Ortega S, Deniz A, Balea-Fernandez FJ, Quevedo E, Soguero-Ruiz C, Wägner AM, Callico GM. Synthetic patient data generation and evaluation in disease prediction using small and imbalanced datasets. IEEE journal of biomedical and health informatics. 2022 Aug 5;27(6):2670-80. link
Seedat N, Crabbé J, Bica I, van der Schaar M. Data-iq: Characterizing subgroups with heterogeneous outcomes in tabular data. Advances in Neural Information Processing Systems. 2022 Dec 6;35:23660-74. link
Wang Z, Sun J. Transtab: Learning transferable tabular transformers across tables. Advances in Neural Information Processing Systems. 2022 Dec 6;35:2902-15. link
Yang C, Bender G, Liu H, Kindermans PJ, Udell M, Lu Y, Le QV, Huang D. TabNAS: rejection sampling for neural architecture search on tabular datasets. Advances in Neural Information Processing Systems. 2022 Dec 6;35:11906-17. link
Du K, Zhang W, Zhou R, Wang Y, Zhao X, Jin J, Gan Q, Zhang Z, Wipf DP. Learning enhanced representation for tabular data via neighborhood propagation. Advances in neural information processing systems. 2022 Dec 6;35:16373-84. link
Gorishniy Y, Rubachev I, Babenko A. On embeddings for numerical features in tabular deep learning. Advances in Neural Information Processing Systems. 2022 Dec 6;35:24991-5004. link
Jesus S, Pombal J, Alves D, Cruz A, Saleiro P, Ribeiro R, Gama J, Bizarro P. Turning the tables: Biased, imbalanced, dynamic tabular datasets for ml evaluation. Advances in Neural Information Processing Systems. 2022 Dec 6;35:33563-75. link
Borisov V, Leemann T, Seßler K, Haug J, Pawelczyk M, Kasneci G. Deep neural networks and tabular data: A survey. IEEE transactions on neural networks and learning systems. 2022 Dec 23. link
Gavrilev D, Amangeldiuly N, Ivanov S, Burnaev E. High performance of gradient boosting in binding affinity prediction. arXiv preprint arXiv:2205.07023. 2022 May 14. link
Xia H, Tang J, Yu W, Qiao J. Tree broad learning system for small data modeling. IEEE Transactions on Neural Networks and Learning Systems. 2022 Nov 3. link
Teague NJ. Feature Encodings for Gradient Boosting with Automunge. arXiv preprint arXiv:2209.12309. 2022 Sep 25. link
Dubey A, Radenovic F, Mahajan D. Scalable interpretability via polynomials. Advances in neural information processing systems. 2022 Dec 6;35:36748-61. link
Yu Y, Tran H. An XGBoost-based fitted Q iteration for finding the optimal STI strategies for HIV patients. IEEE Transactions on Neural Networks and Learning Systems. 2022 Jun 2;35(1):648-56. link
Zhao C, Wu D, Huang J, Yuan Y, Zhang HT, Peng R, Shi Z. BoostTree and BoostForest for ensemble learning. IEEE transactions on pattern analysis and machine intelligence. 2022 Dec 7;45(7):8110-26. link
Iosipoi L, Vakhrushev A. Sketchboost: Fast gradient boosted decision tree for multioutput problems. Advances in Neural Information Processing Systems. 2022 Dec 6;35:25422-35. link
Dutta S, Long J, Mishra S, Tilli C, Magazzeni D. Robust counterfactual explanations for tree-based ensembles. InInternational conference on machine learning 2022 Jun 28 (pp. 5742-5756). PMLR. link
Shwartz-Ziv R, Armon A. Tabular data: Deep learning is not all you need. Information Fusion. 2022 May 1;81:84-90. link
Xia S, Chen B, Wang G, Zheng Y, Gao X, Giem E, Chen Z. mCRF and mRD: Two classification methods based on a novel multiclass label noise filtering learning framework. IEEE Transactions on Neural Networks and Learning Systems. 2021 Jan 11;33(7):2916-30. link
Bhattacharya S, Maiti T. Statistical foundation of variational bayes neural networks. Neural Networks. 2021 May 1;137:151-73. link
Muehlebach M, Jordan MI. Optimization with momentum: Dynamical, control-theoretic, and symplectic perspectives. Journal of Machine Learning Research. 2021;22(73):1-50. link
Liu F, Liao Z, Suykens J. Kernel regression in high dimensions: Refined analysis beyond double descent. InInternational Conference on Artificial Intelligence and Statistics 2021 Mar 18 (pp. 649-657). PMLR. link
Whang J, Lei Q, Dimakis A. Solving inverse problems with a flow-based noise model. InInternational Conference on Machine Learning 2021 Jul 1 (pp. 11146-11157). PMLR. link
Ma C, Jaggi M, Curtis FE, Srebro N, Takáč M. An accelerated communication-efficient primal-dual optimization framework for structured machine learning. Optimization Methods and Software. 2021 Jan 2;36(1):20-44. link
Lin J, Cevher V. Kernel conjugate gradient methods with random projections. Applied and Computational Harmonic Analysis. 2021 Nov 1;55:223-69. link
Adcock B, Antun V, Hansen AC. Uniform recovery in infinite-dimensional compressed sensing and applications to structured binary sampling. Applied and Computational Harmonic Analysis. 2021 Nov 1;55:1-40. link
Gillis N, Leplat V, Tan VY. Distributionally robust and multi-objective nonnegative matrix factorization. IEEE transactions on pattern analysis and machine intelligence. 2021 Feb 11;44(8):4052-64. link
Foucart S, Needell D, Pathak R, Plan Y, Wootters M. Weighted matrix completion from non-random, non-uniform sampling patterns. IEEE Transactions on Information Theory. 2020 Nov 19;67(2):1264-90. link
Zadaianchuk A, Seitzer M, Martius G. Self-supervised visual reinforcement learning with object-centric representations. arXiv preprint arXiv:2011.14381. 2020 Nov 29. link
Lotfi M, Vidyasagar M. Compressed sensing using binary matrices of nearly optimal dimensions. IEEE Transactions on Signal Processing. 2020 Apr 27;68:3008-21. link
Heckel R, Huang W, Hand P, Voroninski V. Deep denoising: Rate-optimal recovery of structured signals with a deep prior. link
Li H, Wen G. Modeling reverse thinking for machine learning. Soft Computing. 2020 Jan;24:1483-96. link
Lundberg SM, Erion G, Chen H, DeGrave A, Prutkin JM, Nair B, Katz R, Himmelfarb J, Bansal N, Lee SI. From local explanations to global understanding with explainable AI for trees. Nature machine intelligence. 2020 Jan;2(1):56-67. link
Wang Z, Cao C, Zhu Y. Entropy and confidence-based undersampling boosting random forests for imbalanced problems. IEEE transactions on neural networks and learning systems. 2020 Jan 24;31(12):5178-91. link
Zhang Z, Jung C. GBDT-MO: Gradient-boosted decision trees for multiple outputs. IEEE transactions on neural networks and learning systems. 2020 Aug 4;32(7):3156-67. link
Li L, Fang Y, Wu J, Wang J, Ge Y. Encoder–decoder full residual deep networks for robust regression and spatiotemporal estimation. IEEE transactions on neural networks and learning systems. 2020 Sep 3;32(9):4217-30. link
Mai V, Johansson M. Convergence of a stochastic gradient method with momentum for non-smooth non-convex optimization. InInternational conference on machine learning 2020 Nov 21 (pp. 6630-6639). PMLR. link
Posada JG, Vani A, Schwarzer M, Lacoste-Julien S. Gait: A geometric approach to information theory. InInternational Conference on Artificial Intelligence and Statistics 2020 Jun 3 (pp. 2601-2611). PMLR. link
Shen Z, Savvides M. Meal v2: Boosting vanilla resnet-50 to 80%+ top-1 accuracy on imagenet without tricks. arXiv preprint arXiv:2009.08453. 2020 Sep 17. link
Mukherjee S, Dittmer S, Shumaylov Z, Lunz S, Öktem O, Schönlieb CB. Learned convex regularizers for inverse problems. arXiv preprint arXiv:2008.02839. 2020 Aug 6. link
Ouyang D, He B, Ghorbani A, Yuan N, Ebinger J, Langlotz CP, Heidenreich PA, Harrington RA, Liang DH, Ashley EA, Zou JY. Video-based AI for beat-to-beat assessment of cardiac function. Nature. 2020 Apr;580(7802):252-6. link
Ozdemir O, Russell RL, Berlin AA. A 3D probabilistic deep learning system for detection and diagnosis of lung cancer using low-dose CT scans. IEEE transactions on medical imaging. 2019 Oct 30;39(5):1419-29. link
Aden-Ali I, Ashtiani H. On the sample complexity of learning sum-product networks. InInternational Conference on Artificial Intelligence and Statistics 2020 Jun 3 (pp. 4508-4518). PMLR. link
Becker P, Arenz O, Neumann G. Expected information maximization: Using the i-projection for mixture density estimation. arXiv preprint arXiv:2001.08682. 2020 Jan 23. link
Brehmer J, Louppe G, Pavez J, Cranmer K. Mining gold from implicit models to improve likelihood-free inference. Proceedings of the National Academy of Sciences. 2020 Mar 10;117(10):5242-9. link
Chen Z, Cao Y, Gu Q, Zhang T. Mean-field analysis of two-layer neural networks: Non-asymptotic rates and generalization bounds. arXiv preprint arXiv:2002.04026. 2020. link
Chong KF. A closer look at the approximation capabilities of neural networks. arXiv preprint arXiv:2002.06505. 2020 Feb 16. link
Chu D, Zhang C, Sun S, Tao Q. Semismooth Newton Algorithm for Efficient Projections onto $\ell_{1,\infty}$-norm Ball. InInternational Conference on Machine Learning 2020 Nov 21 (pp. 1974-1983). PMLR. link
Chun IY, Adcock B. Uniform recovery from subgaussian multi-sensor measurements. Applied and Computational Harmonic Analysis. 2020 Mar 1;48(2):731-65. link
Vidal AF, De Bortoli V, Pereyra M, Durmus A. Maximum likelihood estimation of regularization parameters in high-dimensional inverse problems: An empirical bayesian approach part i: Methodology and experiments. SIAM Journal on Imaging Sciences. 2020;13(4):1945-89. link
De Bortoli V, Durmus A, Pereyra M, Vidal AF. Maximum likelihood estimation of regularization parameters in high-dimensional inverse problems: an empirical Bayesian approach. Part II: Theoretical analysis. SIAM Journal on Imaging Sciences. 2020;13(4):1990-2028. link
de Dios J, Bruna J. On sparsity in overparametrised shallow ReLU networks. arXiv preprint arXiv:2006.10225. 2020 Jun 18. link
Duersch JA, Gu M. Randomized projection for rank-revealing matrix factorizations and low-rank approximations. SIAM Review. 2020;62(3):661-82. link
Eboli T, Sun J, Ponce J. End-to-end interpretable learning of non-blind image deblurring. InComputer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XVII 16 2020 (pp. 314-331). Springer International Publishing. link
Gourgoulias K, Katsoulakis MA, Rey-Bellet L, Wang J. How biased is your model? Concentration inequalities, information and model bias. IEEE Transactions on Information Theory. 2020 Feb 28;66(5):3079-97. link
Hoang Q, Le T, Phung D. Parameterized rate-distortion stochastic encoder. InInternational Conference on Machine Learning 2020 Nov 21 (pp. 4293-4303). PMLR. link
Huang C, Sun X, Xiong J, Yao Y. Boosting with structural sparsity: A differential inclusion approach. Applied and Computational Harmonic Analysis. 2020 Jan 1;48(1):1-45. link
Huang SL, Xu X, Zheng L. An information-theoretic approach to unsupervised feature selection for high-dimensional data. IEEE Journal on Selected Areas in Information Theory. 2020 Mar 17;1(1):157-66. link
Bueno LF, Martínez JM. On the complexity of an inexact restoration method for constrained optimization. SIAM Journal on Optimization. 2020;30(1):80-101. link
Bullins B. Highly smooth minimization of non-smooth problems. InConference on Learning Theory 2020 Jul 15 (pp. 988-1030). PMLR. link
Jha SK, Jha S, Ewetz R, Raj S, Velasquez A, Pullum LL, Swami A. An extension of Fano's inequality for characterizing model susceptibility to membership inference attacks. arXiv preprint arXiv:2009.08097. 2020 Sep 17. link
Jin B, Zhou Z, Zou J. On the convergence of stochastic gradient descent for nonlinear ill-posed problems. SIAM Journal on Optimization. 2020;30(2):1421-50. link
Kamath A, Price E, Karmalkar S. On the power of compressed sensing with generative models. InInternational Conference on Machine Learning 2020 Nov 21 (pp. 5101-5109). PMLR. link
Kim H, Lim YK, Goh Y, Jeong C, Hwang UJ, Choi SH, Cho B, Kwak J. Plan optimization with L0-norm and group sparsity constraints for a new rotational, intensity-modulated brachytherapy for cervical cancer. Plos one. 2020 Jul 28;15(7):e0236585. link
Lan X, Guo X, Barner KE. PAC-Bayesian generalization bounds for multilayer perceptrons. arXiv preprint arXiv:2006.08888. 2020 Jun 16. link
Levy D, Carmon Y, Duchi JC, Sidford A. Large-scale methods for distributionally robust optimization. Advances in Neural Information Processing Systems. 2020;33:8847-60. link
Lin T, Ho N, Jordan M. On efficient optimal transport: An analysis of greedy and accelerated mirror descent algorithms. InInternational Conference on Machine Learning 2019 May 24 (pp. 3982-3991). PMLR. link
Zou D, Gu Q. An improved analysis of training over-parameterized deep neural networks. Advances in neural information processing systems. 2019;32. link
Bai Y, Lee JD. Beyond linearization: On quadratic and higher-order approximation of wide neural networks. arXiv preprint arXiv:1910.01619. 2019 Oct 3. link
Shen Z, He Z, Xue X. Meal: Multi-model ensemble via adversarial learning. InProceedings of the AAAI conference on artificial intelligence 2019 Jul 17 (Vol. 33, No. 01, pp. 4886-4893). link
Alghunaim S, Yuan K, Sayed AH. A linearly convergent proximal gradient algorithm for decentralized optimization. Advances in Neural Information Processing Systems. 2019;32. link
Chen M, Li X, Zhao T. On generalization bounds of a family of recurrent neural networks. arXiv preprint arXiv:1910.12947. 2019 Oct 28. link
Combettes PL, Glaudin LE. Proximal activation of smooth functions in splitting algorithms for convex image recovery. SIAM Journal on Imaging Sciences. 2019;12(4):1905-35. link
Diadie BA. Non parametric estimation of residual-past entropy, mean residual-past lifetime, residual-past inaccuracy measure and asymptotic limits. arXiv preprint arXiv:1912.00150. 2019 Nov 30. link
Jacob M, Mani MP, Ye JC. Structured low-rank algorithms: Theory, MR applications, and links to machine learning. arXiv preprint arXiv:1910.12162. 2019. link
Daei S, Haddadi F, Amini A, Lotz M. On the error in phase transition computations for compressed sensing. IEEE Transactions on Information Theory. 2019 Jun 4;65(10):6620-32. link
Jiang H, Ahn JH, Wang X. Lipschitz Learning for Signal Recovery. arXiv preprint arXiv:1910.02142. 2019 Oct 4. link
Krishnamurthy A, Mazumdar A, McGregor A, Pal S. Sample complexity of learning mixture of sparse linear regressions. Advances in Neural Information Processing Systems. 2019;32. link
Mahloujifar S, Zhang X, Mahmoody M, Evans D. Empirically measuring concentration: Fundamental limits on intrinsic robustness. Advances in Neural Information Processing Systems. 2019;32. link
Malgouyres F, Landsberg J. Multilinear compressive sensing and an application to convolutional linear networks. SIAM Journal on Mathematics of Data Science. 2019;1(3):446-75. link
Ji Z, Telgarsky M, Xian R. Neural tangent kernels, transportation mappings, and universal approximation. arXiv preprint arXiv:1910.06956. 2019 Oct 15. link
Murdoch WJ, Singh C, Kumbier K, Abbasi-Asl R, Yu B. Definitions, methods, and applications in interpretable machine learning. Proceedings of the National Academy of Sciences. 2019 Oct 29;116(44):22071-80. link
Nguyen LT, Kim J, Shim B. Low-rank matrix completion: A contemporary survey. IEEE Access. 2019 Jul 10;7:94215-37. link
Stankovic L, Mandic D, Dakovic M, Kisil I. An intuitive derivation of the coherence index relation in compressive sensing. arXiv preprint arXiv:1903.11136. 2019 Mar 26. link
Truong TT. Convergence to minima for the continuous version of backtracking gradient descent. arXiv preprint arXiv:1911.04221. 2019 Nov 11. link
Nair AV, Pong V, Dalal M, Bahl S, Lin S, Levine S. Visual reinforcement learning with imagined goals. Advances in neural information processing systems. 2018;31. link
Sankar AR, Srinivasan V, Balasubramanian VN. On the Analysis of Trajectories of Gradient Descent in the Optimization of Deep Neural Networks. arXiv preprint arXiv:1807.08140. 2018 Jul 21. link
Scaman K, Bach F, Bubeck S, Massoulié L, Lee YT. Optimal algorithms for non-smooth distributed optimization in networks. Advances in Neural Information Processing Systems. 2018;31. link
Zhang J, Zhu G, Heath Jr RW, Huang K. Grassmannian learning: Embedding geometry awareness in shallow and deep learning. arXiv preprint arXiv:1808.02229. 2018 Aug 7. link
Mukherjee A, Basu A, Arora R, Mianjy P. Understanding deep neural networks with rectified linear units. InInternational Conference on Learning Representations 2018. link
Huang J, Wang J, Zhang F, Wang W. New Sufficient Conditions of Signal Recovery With Tight Frames via $l_1$-Analysis Approach. IEEE Access. 2018 May 7;6:26718-28. link
Gabrié M, Manoel A, Luneau C, Macris N, Krzakala F, Zdeborová L. Entropy and mutual information in models of deep neural networks. Advances in neural information processing systems. 2018;31. link
Feng J, Yu Y, Zhou ZH. Multi-layered gradient boosting decision trees. Advances in neural information processing systems. 2018;31. link
Amari SI, Karakida R, Oizumi M. Statistical neurodynamics of deep networks: Geometry of signal spaces. arXiv preprint arXiv:1808.07169. 2018 Aug 22. link
Cho M, Vijay Mishra K, Xu W. Computable performance guarantees for compressed sensing matrices. EURASIP journal on advances in signal processing. 2018 Dec;2018:1-8. link
Tropp JA, Yurtsever A, Udell M, Cevher V. Practical sketching algorithms for low-rank matrix approximation. SIAM Journal on Matrix Analysis and Applications. 2017;38(4):1454-85. link
Chollet F. Xception: Deep learning with depthwise separable convolutions. InProceedings of the IEEE conference on computer vision and pattern recognition 2017 (pp. 1251-1258). link
Gamarnik D, Zadik I. Sparse high-dimensional linear regression. Algorithmic barriers and a local search algorithm. arXiv preprint arXiv:1711.04952. 2017 Nov 14. link
Zhou Y, Liang Y. Characterization of gradient dominance and regularity conditions for neural networks. arXiv preprint arXiv:1710.06910. 2017 Oct 18. link
Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, Ye Q, Liu TY. Lightgbm: A highly efficient gradient boosting decision tree. Advances in neural information processing systems. 2017;30. link
Higgins I, Matthey L, Pal A, Burgess C, Glorot X, Botvinick M, Mohamed S, Lerchner A. beta-vae: Learning basic visual concepts with a constrained variational framework. InInternational conference on learning representations 2017 Feb 6. link
Kabanava M, Kueng R, Rauhut H, Terstiege U. Stable low-rank matrix recovery via null space properties. Information and Inference: A Journal of the IMA. 2016 Dec 1;5(4):405-41. link
Tropp JA, Yurtsever A, Udell M, Cevher V. Randomized single-view algorithms for low-rank matrix approximation. link
Yue MC, So AM. A perturbation inequality for concave functions of singular values and its applications in low-rank matrix recovery. Applied and Computational Harmonic Analysis. 2016 Mar 1;40(2):396-416. link
Bi S, Pan S. Error bounds for rank constrained optimization problems and applications. Operations Research Letters. 2016 May 1;44(3):336-41. link
Qian CZ, Zhou ZS. Accelerated stochastic admm for empirical risk minimization. arXiv preprint arXiv:1611.04074. 2016 Nov. link
Chen T, Guestrin C. Xgboost: A scalable tree boosting system. InProceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining 2016 Aug 13 (pp. 785-794). link
El Halabi M, Cevher V. A totally unimodular view of structured sparsity. InArtificial Intelligence and Statistics 2015 Feb 21 (pp. 223-231). PMLR. link
Tropp JA. Convex recovery of a structured signal from independent random linear measurements. Sampling theory, a renaissance. 2015 Dec 8:67-101. link
Bubeck S. Convex optimization: Algorithms and complexity. Foundations and Trends® in Machine Learning. 2015 Nov 11;8(3-4):231-357. link
Bhave S, Eslami R, Jacob M. Sparse spectral deconvolution algorithm for noncartesian MR spectroscopic imaging. Magnetic resonance in medicine. 2014 Feb;71(2):469-76. link
Yu AW, Ma W, Yu Y, Carbonell JG, Sra S. Efficient structured matrix rank minimization. Advances in neural information processing systems. 2014;27. link
Ghadimi E, Teixeira A, Shames I, Johansson M. Optimal parameter selection for the alternating direction method of multipliers (ADMM): Quadratic problems. IEEE Transactions on Automatic Control. 2014 Sep 5;60(3):644-58. link
Shalev-Shwartz S. Online learning and online convex optimization. Foundations and Trends® in Machine Learning. 2012 Mar 28;4(2):107-94. link
Yang Z, Zhang C, Xie L. Robustly stable signal recovery in compressed sensing with structured matrix perturbation. IEEE Transactions on Signal Processing. 2012 May 28;60(9):4658-71. link
Nowozin S, Lampert CH. Structured learning and prediction in computer vision. Foundations and Trends® in Computer Graphics and Vision. 2011 May 22;6(3–4):185-365. link
Tan VY, Balzano L, Draper SC. Rank minimization over finite fields: Fundamental limits and coding-theoretic interpretations. IEEE transactions on information theory. 2011 Dec 2;58(4):2018-39. link
Baraniuk RG, Cevher V, Duarte MF, Hegde C. Model-based compressive sensing. IEEE Transactions on information theory. 2010 Mar 22;56(4):1982-2001. link
Cai TT, Wang L, Xu G. Stable recovery of sparse signals and an oracle inequality. IEEE Transactions on Information Theory. 2010 Jun 14;56(7):3516-22. link
Teschke G, Borries C. Accelerated projected steepest descent method for nonlinear inverse problems with sparsity constraints. Inverse Problems. 2010 Jan 12;26(2):025007. link
Hale ET, Yin W, Zhang Y. Fixed-point continuation for ℓ1-minimization: Methodology and convergence. SIAM Journal on Optimization. 2008;19(3):1107-30. link
Markovsky I. Structured low-rank approximation and its applications. Automatica. 2008 Apr 1;44(4):891-909. link
Candes EJ, Romberg JK, Tao T. Stable signal recovery from incomplete and inaccurate measurements. Communications on Pure and Applied Mathematics: A Journal Issued by the Courant Institute of Mathematical Sciences. 2006 Aug;59(8):1207-23. link
From: UK Biobank publications
Kany S, Rämö JT, Playford D, Strange G, Hou C, Jurgens SJ, Nauffal V, Cunningham JW, Lau ES, Butte AJ, Ho JE. New threshold for defining mild aortic stenosis derived from velocity-encoded MRI in 60,000 individuals. Journal of the American College of Cardiology. 2025 Apr 8;85(13):1387-99. link
Oikonomou EK, Holste G, Yuan N, Coppi A, McNamara RL, Haynes NA, Vora AN, Velazquez EJ, Li F, Menon V, Kapadia SR. A multimodal video-based AI biomarker for aortic stenosis development and progression. JAMA cardiology. 2024 Jun 1;9(6):534-44. link
Thomson RJ, Grafton‐Clarke C, Matthews G, Swoboda PP, Swift AJ, Frangi A, Petersen SE, Aung N, Garg P. Risk factors for raised left ventricular filling pressure by cardiovascular magnetic resonance: Prognostic insights. ESC Heart Failure. 2024 Dec;11(6):4148-59. link
Muffoletto M, Xu H, Burns R, Suinesiaputra A, Nasopoulou A, Kunze KP, Neji R, Petersen SE, Niederer SA, Rueckert D, Young AA. Evaluation of deep learning estimation of whole heart anatomy from automated cardiovascular magnetic resonance short-and long-axis analyses in UK Biobank. European Heart Journal-Cardiovascular Imaging. 2024 Oct;25(10):1374-83. link
Chadalavada S, Fung K, Rauseo E, Lee AM, Khanji MY, Amir-Khalili A, Paiva J, Naderi H, Banik S, Chirvasa M, Jensen MT. Myocardial strain measured by cardiac magnetic resonance predicts cardiovascular morbidity and death. Journal of the American College of Cardiology. 2024 Aug 13;84(7):648-59. link
Bertrand A, Lewis A, Camps J, Grau V, Rodriguez B. Multi-modal characterisation of early-stage, subclinical cardiac deterioration in patients with type 2 diabetes. Cardiovascular Diabetology. 2024 Oct 19;23(1):371. link
Linge J, Widholm P, Nilsson D, Kugelberg A, Olbers T, Leinhard OD. Risk stratification using magnetic resonance imaging-derived, personalized z-scores of visceral adipose tissue, subcutaneous adipose tissue, and liver fat in persons with obesity. Surgery for Obesity and Related Diseases. 2024 May 1;20(5):419-24. link
Li L, Camps J, Wang ZJ, Beetz M, Banerjee A, Rodriguez B, Grau V. Toward enabling cardiac digital twins of myocardial infarction using deep computational models for inverse inference. IEEE transactions on medical imaging. 2024 Feb 19;43(7):2466-78. link
Lang O, Yaya-Stupp D, Traynis I, Cole-Lewis H, Bennett CR, Lyles CR, Lau C, Irani M, Semturs C, Webster DR, Corrado GS. Using generative AI to investigate medical imagery models and datasets. EBioMedicine. 2024 Apr 1;102. link
Li Y, Chan E, Puyol-Antón E, Ruijsink B, Cecelja M, King AP, Razavi R, Chowienczyk P. Hemodynamic determinants of elevated blood pressure and hypertension in the middle to older-age UK population: a UK Biobank Imaging Study. Hypertension. 2023 Nov;80(11):2473-84. link
Dawood T, Chen C, Sidhu BS, Ruijsink B, Gould J, Porter B, Elliott MK, Mehta V, Rinaldi CA, Puyol-Antón E, Razavi R. Uncertainty aware training to improve deep learning model calibration for classification of cardiac MR images. Medical Image Analysis. 2023 Aug 1;88:102861. link
Raisi-Estabragh Z, McCracken C, Hann E, Condurache DG, Harvey NC, Munroe PB, Ferreira VM, Neubauer S, Piechnik SK, Petersen SE. Incident clinical and mortality associations of myocardial native T1 in the UK Biobank. Cardiovascular Imaging. 2023 Apr 1;16(4):450-60. link
Pujadas ER, Raisi-Estabragh Z, Szabo L, McCracken C, Morcillo CI, Campello VM, Martín-Isla C, Atehortua AM, Vago H, Merkely B, Maurovich-Horvat P. Prediction of incident cardiovascular events using machine learning and CMR radiomics. European radiology. 2023 May;33(5):3488-500. link
Raisi-Estabragh Z, McCracken C, Condurache D, Aung N, Vargas JD, Naderi H, Munroe PB, Neubauer S, Harvey NC, Petersen SE. Left atrial structure and function are associated with cardiovascular outcomes independent of left ventricular measures: a UK Biobank CMR study. European Heart Journal-Cardiovascular Imaging. 2022 Sep 1;23(9):1191-200. link
Pujadas ER, Raisi-Estabragh Z, Szabo L, Morcillo CI, Campello VM, Martin-Isla C, Vago H, Merkely B, Harvey NC, Petersen SE, Lekadir K. Atrial fibrillation prediction by combining ECG markers and CMR radiomics. Scientific Reports. 2022 Nov 7;12(1):18876. link
Cecelja M, Ruijsink B, Puyol‐Antón E, Li Y, Godwin H, King AP, Razavi R, Chowienczyk P. Aortic distensibility measured by automated analysis of magnetic resonance imaging predicts adverse cardiovascular events in UK biobank. Journal of the American Heart Association. 2022 Dec 6;11(23):e026361. link
Puyol-Antón E, Sidhu BS, Gould J, Porter B, Elliott MK, Mehta V, Rinaldi CA, King AP. A multimodal deep learning model for cardiac resynchronisation therapy response prediction. Medical Image Analysis. 2022 Jul 1;79:102465. link
Ardissino M, McCracken C, Bard A, Antoniades C, Neubauer S, Harvey NC, Petersen SE, Raisi-Estabragh Z. Pericardial adiposity is independently linked to adverse cardiovascular phenotypes: a CMR study of 42 598 UK Biobank participants. European Heart Journal-Cardiovascular Imaging. 2022 Nov 1;23(11):1471-81. link
Pirruccello JP, Lin H, Khurshid S, Nekoui M, Weng LC, Vasan RS, Isselbacher EM, Benjamin EJ, Lubitz SA, Lindsay ME, Ellinor PT. Development of a prediction model for ascending aortic diameter among asymptomatic individuals. JAMA. 2022 Nov 15;328(19):1935-44. link
Khurshid S, Friedman S, Pirruccello JP, Di Achille P, Diamant N, Anderson CD, Ellinor PT, Batra P, Ho JE, Philippakis AA, Lubitz SA. Deep learning to predict cardiac magnetic resonance–derived left ventricular mass and hypertrophy from 12-lead ECGs. Circulation: Cardiovascular Imaging. 2021 Jun;14(6):e012281. link
Puyol-Antón E, Ruijsink B, Baumgartner CF, Masci PG, Sinclair M, Konukoglu E, Razavi R, King AP. Automated quantification of myocardial tissue characteristics from native T1 mapping using neural networks with uncertainty-based quality-control. Journal of Cardiovascular Magnetic Resonance. 2020 Jan 20;22(1):60. link
Ruijsink B, Puyol-Antón E, Oksuz I, Sinclair M, Bai W, Schnabel JA, Razavi R, King AP. Fully automated, quality-controlled cardiac analysis from CMR: validation and large-scale application to characterize cardiac function. Cardiovascular Imaging. 2020 Mar 1;13(3):684-95. link
Guo F, Ng M, Goubran M, Petersen SE, Piechnik SK, Neubauer S, Wright G. Improving cardiac MRI convolutional neural network segmentation on small training datasets and dataset shift: A continuous kernel cut approach. Medical image analysis. 2020 Apr 1;61:101636. link
Honigberg MC, Pirruccello JP, Aragam K, Sarma AA, Scott NS, Wood MJ, Natarajan P. Menopausal age and left ventricular remodeling by cardiac magnetic resonance imaging among 14,550 women. American heart journal. 2020 Nov 1;229:138-43. link
Bai W, Suzuki H, Huang J, Francis C, Wang S, Tarroni G, Guitton F, Aung N, Fung K, Petersen SE, Piechnik SK. A population-based phenome-wide association study of cardiac and aortic structure and function. Nature medicine. 2020 Oct;26(10):1654-62. link
Córdova-Palomera A, Tcheandjieu C, Fries JA, Varma P, Chen VS, Fiterau M, Xiao K, Tejeda H, Keavney BD, Cordell HJ, Tanigawa Y. Cardiac imaging of aortic valve area from 34 287 UK Biobank participants reveals novel genetic associations and shared genetic comorbidity with multiple disease phenotypes. Circulation: Genomic and Precision Medicine. 2020 Dec;13(6):e003014. link
Pirruccello JP, Bick A, Wang M, Chaffin M, Friedman S, Yao J, Guo X, Venkatesh BA, Taylor KD, Post WS, Rich S. Analysis of cardiac magnetic resonance imaging in 36,000 individuals yields genetic insights into dilated cardiomyopathy. Nature communications. 2020 May 7;11(1):2254. link
Fung K, Cheshire C, Cooper JA, Catarino P, Piechnik SK, Neubauer S, Bhagra S, Pettit S, Petersen SE. Validation of cardiovascular magnetic resonance–derived equation for predicted left ventricular mass using the UK Biobank Imaging Cohort: tool for donor-recipient size matching. Circulation: Heart Failure. 2019 Dec;12(12):e006362. link
Fries JA, Varma P, Chen VS, Xiao K, Tejeda H, Saha P, Dunnmon J, Chubb H, Maskatia S, Fiterau M, Delp S. Weakly supervised classification of aortic valve malformations using unlabeled cardiac MRI sequences. Nature communications. 2019 Jul 15;10(1):3111. link
Suinesiaputra A, Sanghvi MM, Aung N, Paiva JM, Zemrak F, Fung K, Lukaschuk E, Lee AM, Carapella V, Kim YJ, Francis J. Fully-automated left ventricular mass and volume MRI analysis in the UK Biobank population cohort: evaluation of initial results. The international journal of cardiovascular imaging. 2018 Feb;34(2):281-91. link
Sanghvi MM, Aung N, Cooper JA, Paiva JM, Lee AM, Zemrak F, Fung K, Thomson RJ, Lukaschuk E, Carapella V, Kim YJ. The impact of menopausal hormone therapy (MHT) on cardiac structure and function: insights from the UK Biobank imaging enhancement study. PloS one. 2018 Mar 8;13(3):e0194015. link
Puyol-Anton E, Sinclair M, Gerber B, Amzulescu MS, Langet H, De Craene M, Aljabar P, Piro P, King AP. A multimodal spatiotemporal cardiac motion atlas from MR and ultrasound data. Medical image analysis. 2017 Aug 1;40:96-110. link
From: NIH NCI CDAS
Rathore S, Gautam A, Raghav P, Subramaniam V, Gupta V, Rathore M, Rathore A, Rathore S, Iyengar S. Fully automated coronary artery calcium score and risk categorization from chest CT using deep learning and multiorgan segmentation: A validation study from National Lung Screening Trial (NLST). IJC Heart & Vasculature. 2025 Feb 1;56:101593. link
Gendarme S, Irajizad E, Long JP, Fahrmann JF, Dennison JB, Ghasemi SM, Dou R, Volk RJ, Meza R, Toumazis I, Canoui-Poitrine F. Impact of comorbidities on the mortality benefits of lung cancer screening: a post-hoc analysis of the PLCO and NLST trials. Journal of Thoracic Oncology. 2025 May 1;20(5):565-76. link
Jiang Y, Ebrahimpour L, Després P, Manem VS. A benchmark of deep learning approaches to predict lung cancer risk using national lung screening trial cohort. Scientific reports. 2025 Jan 11;15(1):1736. link
Meza R, ten Haaf K, Kong CY, Erdogan A, Black WC, Tammemagi MC, Choi SE, Jeon J, Han SS, Munshi V, van Rosmalen J. Comparative analysis of 5 lung cancer natural history and screening models that reproduce outcomes of the NLST and PLCO trials. Cancer. 2014 Jun 1;120(11):1713-24. link
Oo DW, Sturniolo A, Jung M, Langenbach M, Foldyna B, Kiel DP, Aerts HJ, Natarajan P, Lu MT, Raghu VK. Opportunistic assessment of cardiovascular risk using AI-derived structural aortic and cardiac phenotypes from non-contrast chest computed tomography. medRxiv. 2025 Jan 29. link
Langenbach IL, Hadzic I, Zeleznik R, Langenbach MC, Maintz D, Mayrhofer T, Lu MT, Aerts HJ, Foldyna B. Association of epicardial adipose tissue changes on serial chest CT scans with mortality: insights from the national lung screening trial. Radiology. 2025 Feb 18;314(2):e240473. link
Beeche C, Yu T, Wang J, Wilson D, Chen P, Duman E, Pu J. A generalized health index: automated thoracic CT-derived biomarkers predict life expectancy. British Journal of Radiology. 2025 Mar;98(1167):412-21. link
Jiang Y, Manem VS. Data augmented lung cancer prediction framework using the nested case control NLST cohort. Frontiers in oncology. 2025 Feb 25;15:1492758. link
Jang S, Kim J, Lee S, Kim YW, Kim J, Lee KW, Lee CT. Visual Emphysema as a Category Modifier in Lung-RADS: Secondary Analysis of National Lung Screening Trial. Journal of the American College of Radiology. 2025 Mar 4. link
Tailor TD, Gutman R, An N, Hoffman RM, Chiles C, Carlos RC, Sicks JD, Gareen IF. Positive Screens Are More Likely in a National Lung Cancer Screening Registry Than the National Lung Screening Trial. Journal of the American College of Radiology. 2025 Feb 27. link
Lin F, Zhang Z, Wang J, Liang C, Xu J, Zeng X, Zeng Q, Chen H, Zhuang J, Ma Y, Ma Q. AutoCOPD–A novel and practical machine learning model for COPD detection using whole-lung inspiratory quantitative CT measurements: a retrospective, multicenter study. EClinicalMedicine. 2025 Apr 1;82. link
Behr CM, IJzerman MJ, Kip MM, Groen HJ, Heuvelmans MA, van den Berge M, van der Harst P, Vonder M, Vliegenthart R, Koffijberg H. Model-Based Cost-Utility Analysis of Combined Low-Dose Computed Tomography Screening for Lung Cancer, Chronic Obstructive Pulmonary Disease, and Cardiovascular Disease. JTO clinical and research reports. 2025 Feb 19;6(5):100813. link
Wang JM, Bose S, Murray S, Labaki WW, Kazerooni EA, Chung JH, Flaherty KR, Han MK, Hatt CR, Oldham JM. Quantitative CT Measures of Lung Fibrosis and Outcomes in the National Lung Screening Trial. Annals of the American Thoracic Society. 2025 Apr 11. link
Wang X, Sharpnack J, Lee TC. Improving lung cancer diagnosis and survival prediction with deep learning and CT imaging. PLoS One. 2025 Jun 11;20(6):e0323174. link
Krishnaswamy D, Bontempi D, Thiriveedhi VK, Punzo D, Clunie D, Bridge CP, Aerts HJ, Kikinis R, Fedorov A. Enrichment of lung cancer computed tomography collections with AI-derived annotations. Scientific data. 2024 Jan 4;11(1):25. link
Chen JR, Hou KY, Wang YC, Lin SP, Mo YH, Peng SC, Lu CF. Enhanced Malignancy Prediction of Small Lung Nodules in Different Populations Using Transfer Learning on Low-Dose Computed Tomography. Diagnostics. 2025 Jun 8;15(12):1460. link
Xie Y, Zhang Y, Zhang P, Li Y, Xu B, Shao F, Zhang Y, Yang T, Li J, Li C, Chen T. Timing of screening benefit for lung cancer with low-dose computed tomography. Chest. 2025 Jun 18. link
Sun Y, Kang J, Haridas C, Mayne N, Potter A, Yang CF, Christiani DC, Li Y. Penalized deep partially linear cox models with application to CT scans of lung cancer patients. Biometrics. 2024 Mar;80(1):ujad024. link
Krishnan AR, Xu K, Li TZ, Remedios LW, Sandler KL, Maldonado F, Landman BA. Lung CT harmonization of paired reconstruction kernel images using generative adversarial networks. Medical Physics. 2024 Aug;51(8):5510-23. link
Foldyna B, Hadzic I, Zeleznik R, Langenbach MC, Raghu VK, Mayrhofer T, Lu MT, Aerts HJ. Deep learning analysis of epicardial adipose tissue to predict cardiovascular risk in heavy smokers. Communications medicine. 2024 Mar 13;4(1):44. link
Liu J, Qi L, Xu Q, Chen J, Cui S, Li F, Wang Y, Cheng S, Tan W, Zhou Z, Wang J. A self-supervised learning-based fine-grained classification model for distinguishing malignant from benign subcentimeter solid pulmonary nodules. Academic Radiology. 2024 Nov 1;31(11):4687-95. link
Wang Y, Zhou C, Ying L, Lee E, Chan HP, Chughtai A, Hadjiiski LM, Kazerooni EA. Leveraging serial low-dose CT scans in radiomics-based reinforcement learning to improve early diagnosis of lung cancer at baseline screening. Radiology: Cardiothoracic Imaging. 2024 May 16;6(3):e230196. link
Thiriveedhi VK, Krishnaswamy D, Clunie D, Pieper S, Kikinis R, Fedorov A. Cloud-based large-scale curation of medical imaging data using AI segmentation. Research Square. 2024 May 3:rs-3. link
Hermoza R, Nascimento JC, Carneiro G. Weakly-supervised preclinical tumor localization associated with survival prediction from lung cancer screening Chest X-ray images. Computerized Medical Imaging and Graphics. 2024 Jul 1;115:102395. link
Wang Z, Sui X, Song W, Xue F, Han W, Hu Y, Jiang J. Reinforcement learning for individualized lung cancer screening schedules: A nested case–control study. Cancer Medicine. 2024 Jul;13(13):e7436. link
Yang R, Li W, Yu S, Wu Z, Zhang H, Liu X, Tao L, Li X, Huang J, Guo X. Deep learning model for pathological grading and prognostic assessment of lung cancer using CT imaging: a study on NLST and external validation cohorts. Academic Radiology. 2025 Jan 1;32(1):533-42. link
Mascalchi M, Puliti D, Cavigli E, Cortés-Ibáñez FO, Picozzi G, Carrozzi L, Gorini G, Delorme S, Zompatori M, De Luca GR, Diciotti S. Large cell carcinoma of the lung: LDCT features and survival in screen-detected cases. European journal of radiology. 2024 Oct 1;179:111679. link
Marcinkiewicz AM, Buchwald M, Shanbhag A, Bednarski BP, Killekar A, Miller RJ, Builoff V, Lemley M, Berman DS, Dey D, Slomka PJ. AI for multistructure incidental findings and mortality prediction at chest CT in lung cancer screening. Radiology. 2024 Sep 17;312(3):e240541. link
Mascalchi M, Cavigli E, Picozzi G, Cozzi D, De Luca GR, Diciotti S. The Azygos Esophageal Recess Is Not to Be Missed in Screening Lung Cancer With LDCT. Journal of Thoracic Imaging. 2025 May 1;40(3):e0813. link
Aslani S, Alluri P, Gudmundsson E, Chandy E, McCabe J, Devaraj A, Horst C, Janes SM, Chakkara R, Alexander DC, Nair A. Enhancing cancer prediction in challenging screen-detected incident lung nodules using time-series deep learning. Computerized Medical Imaging and Graphics. 2024 Sep 1;116:102399. link
Ebrahimpour L, Després P, Manem VS. Differential Radiomics‐Based Signature Predicts Lung Cancer Risk Accounting for Imaging Parameters in NLST Cohort. Cancer Medicine. 2024 Oct;13(20):e70359. link
Simon J, Mikhael P, Graur A, Chang AE, Skates SJ, Osarogiagbon RU, Sequist LV, Fintelmann FJ. Significance of Image Reconstruction Parameters for Future Lung Cancer Risk Prediction Using Low-Dose Chest Computed Tomography and the Open-Access Sybil Algorithm. Investigative Radiology. 2024 Dec 13. link
Pan Z, Zhang R, Shen S, Lin Y, Zhang L, Wang X, Ye Q, Wang X, Chen J, Zhao Y, Christiani DC. OWL: an optimized and independently validated machine learning prediction model for lung cancer screening based on the UK Biobank, PLCO, and NLST populations. EBioMedicine. 2023 Feb 1;88. link
Liu P, Ji L, Ye F, Fu B. GraphLSurv: A scalable survival prediction network with adaptive and sparse structure learning for histopathological whole-slide images. Computer methods and programs in biomedicine. 2023 Apr 1;231:107433. link
Mikhael PG, Wohlwend J, Yala A, Karstens L, Xiang J, Takigami AK, Bourgouin PP, Chan P, Mrah S, Amayri W, Juan YH. Sybil: a validated deep learning model to predict future lung cancer risk from a single low-dose chest computed tomography. Journal of Clinical Oncology. 2023 Apr 20;41(12):2191-200. link
Huang YS, Wang TC, Huang SZ, Zhang J, Chen HM, Chang YC, Chang RF. An improved 3-D attention CNN with hybrid loss and feature fusion for pulmonary nodule classification. Computer Methods and Programs in Biomedicine. 2023 Feb 1;229:107278. link
Landy R, Wang VL, Baldwin DR, Pinsky PF, Cheung LC, Castle PE, Skarzynski M, Robbins HA, Katki HA. Recalibration of a deep learning model for low-dose computed tomographic images to inform lung cancer screening intervals. JAMA Network Open. 2023 Mar 1;6(3):e233273-. link
Fan L, Sowmya A, Meijering E, Song Y. Cancer survival prediction from whole slide images with self-supervised learning and slide consistency. IEEE Transactions on Medical Imaging. 2022 Dec 12;42(5):1401-12. link
Masquelin AH, Alshaabi T, Cheney N, Estépar RS, Bates JH, Kinsey CM. Perinodular parenchymal features improve indeterminate lung nodule classification. Academic radiology. 2023 Jun 1;30(6):1073-80. link
Xu K, Khan MS, Li TZ, Gao R, Terry JG, Huo Y, Lasko TA, Carr JJ, Maldonado F, Landman BA, Sandler KL. AI body composition in lung cancer screening: added value beyond lung cancer detection. Radiology. 2023 Jul 25;308(1):e222937. link
Li TZ, Hin Lee H, Xu K, Gao R, Dawant BM, Maldonado F, Sandler KL, Landman BA. Quantifying emphysema in lung screening computed tomography with robust automated lobe segmentation. Journal of Medical Imaging. 2023 Jul 1;10(4):044002-. link
Ma M, Zhang X, Li Y, Wang X, Zhang R, Wang Y, Sun P, Wang X, Sun X. ConvLSTM coordinated longitudinal transformer under spatio-temporal features for tumor growth prediction. Computers in Biology and Medicine. 2023 Sep 1;164:107313. link
Li S, Chen M, Wang Y, Li X, Gao G, Luo X, Tang L, Liu X, Wu N. An Effective Malignancy Prediction Model for Incidentally Detected Pulmonary Subsolid Nodules Based on Current and Prior CT Scans. Clinical Lung Cancer. 2023 Dec 1;24(8):e301-10. link
Venkadesh KV, Aleef TA, Scholten ET, Saghir Z, Silva M, Sverzellati N, Pastorino U, van Ginneken B, Prokop M, Jacobs C. Prior CT improves deep learning for malignancy risk estimation of screening-detected pulmonary nodules. Radiology. 2023 Aug 1;308(2):e223308. link
Juwara L, Yang YA, Velly AM, Saha-Chaudhuri P. Privacy-preserving analysis of time-to-event data under nested case-control sampling. Statistical Methods in Medical Research. 2024 Jan;33(1):96-111. link
Callender T, Imrie F, Cebere B, Pashayan N, Navani N, Van der Schaar M, Janes SM. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLoS Medicine. 2023 Oct 3;20(10):e1004287. link
Sun J, Liao X, Yan Y, Zhang X, Sun J, Tan W, Liu B, Wu J, Guo Q, Gao S, Li Z. Detection and staging of chronic obstructive pulmonary disease using a computed tomography–based weakly supervised deep learning approach. European Radiology. 2022 Aug;32(8):5319-29. link
Di D, Zhang J, Lei F, Tian Q, Gao Y. Big-hypergraph factorization neural network for survival prediction from whole slide image. IEEE Transactions on Image Processing. 2022 Jan 4;31:1149-60. link
Nagaraj Y, Wisselink HJ, Rook M, Cai J, Nagaraj SB, Sidorenkov G, Veldhuis R, Oudkerk M, Vliegenthart R, van Ooijen P. AI-driven model for automatic emphysema detection in low-dose computed tomography using disease-specific augmentation. Journal of Digital Imaging. 2022 Jun;35(3):538-50. link
Chetan MR, Dowson N, Price NW, Ather S, Nicolson A, Gleeson FV. Developing an understanding of artificial intelligence lung nodule risk prediction using insights from the Brock model. European Radiology. 2022 Aug;32(8):5330-8. link
Van Velzen SG, de Vos BD, Noothout JM, Verkooijen HM, Viergever MA, Išgum I. Generative models for reproducible coronary calcium scoring. Journal of Medical Imaging. 2022 Sep 1;9(5):052406-. link
Zheng Y, Gindra RH, Green EJ, Burks EJ, Betke M, Beane JE, Kolachalama VB. A graph-transformer for whole slide image classification. IEEE transactions on medical imaging. 2022 May 20;41(11):3003-15. link
Tan H, Bates JH, Matthew Kinsey C. Discriminating TB lung nodules from early lung cancers using deep learning. BMC Medical Informatics and Decision Making. 2022 Jun 21;22(1):161. link
Gichoya JW, Banerjee I, Bhimireddy AR, Burns JL, Celi LA, Chen LC, Correa R, Dullerud N, Ghassemi M, Huang SC, Kuo PC. AI recognition of patient race in medical imaging: a modelling study. The Lancet Digital Health. 2022 Jun 1;4(6):e406-14. link
Zhai Z, van Velzen SG, Lessmann N, Planken N, Leiner T, Išgum I. Learning coronary artery calcium scoring in coronary CTA from non-contrast CT using unsupervised domain adaptation. Frontiers in cardiovascular medicine. 2022 Sep 12;9:981901. link
Di D, Zou C, Feng Y, Zhou H, Ji R, Dai Q, Gao Y. Generating hypergraph-based high-order representations of whole-slide histopathological images for survival prediction. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2022 Sep 26;45(5):5800-15. link
Huang P, Illei PB, Franklin W, Wu PH, Forde PM, Ashrafinia S, Hu C, Khan H, Vadvala HV, Shih IM, Battafarano RJ. Lung cancer recurrence risk prediction through integrated deep learning evaluation. Cancers. 2022 Aug 27;14(17):4150. link
Adams SJ, Madtes DK, Burbridge B, Johnston J, Goldberg IG, Siegel EL, Babyn P, Nair VS, Calhoun ME. Clinical impact and generalizability of a computer-assisted diagnostic tool to risk-stratify lung nodules with CT. Journal of the American College of Radiology. 2023 Feb 1;20(2):232-42. link
Wang H, Xiao N, Zhang J, Yang W, Ma Y, Suo Y, Zhao J, Qiang Y, Lian J, Yang Q. Static–dynamic coordinated transformer for tumor longitudinal growth prediction. Computers in Biology and Medicine. 2022 Sep 1;148:105922. link
Song P, Hou J, Xiao N, Zhao J, Zhao J, Qiang Y, Yang Q. MSTS-Net: malignancy evolution prediction of pulmonary nodules from longitudinal CT images via multi-task spatial-temporal self-attention network. International Journal of Computer Assisted Radiology and Surgery. 2023 Apr;18(4):685-93. link
Killekar A, Grodecki K, Lin A, Cadet S, McElhinney P, Razipour A, Chan C, Pressman BD, Julien P, Chen P, Simon J. Rapid quantification of COVID-19 pneumonia burden from computed tomography with convolutional long short-term memory networks. Journal of Medical Imaging. 2022 Sep 1;9(5):054001-. link
Zeleznik R, Foldyna B, Eslami P, Weiss J, Alexander I, Taron J, Parmar C, Alvi RM, Banerji D, Uno M, Kikuchi Y. Deep convolutional neural networks to predict cardiovascular risk from computed tomography. Nature communications. 2021 Jan 29;12(1):715. link
Masquelin AH, Cheney N, Kinsey CM, Bates JH. Wavelet decomposition facilitates training on small datasets for medical image classification by deep learning. Histochemistry and cell biology. 2021 Feb;155(2):309-17. link
Henderson LM, Durham DD, Tammemägi MC, Benefield T, Marsh MW, Rivera MP. Lung cancer screening with low dose computed tomography in patients with and without prior history of cancer in the National Lung Screening Trial. Journal of Thoracic Oncology. 2021 Jun 1;16(6):980-9. link
Schreuder A, Jacobs C, Lessmann N, Broeders MJ, Silva M, Išgum I, de Jong PA, Sverzellati N, Prokop M, Pastorino U, Schaefer-Prokop CM. Combining pulmonary and cardiac computed tomography biomarkers for disease-specific risk modelling in lung cancer screening. European Respiratory Journal. 2021 Sep 2;58(3). link
Heuvelmans MA, van Ooijen PM, Ather S, Silva CF, Han D, Heussel CP, Hickes W, Kauczor HU, Novotny P, Peschl H, Rook M. Lung cancer prediction by Deep Learning to identify benign lung nodules. Lung cancer. 2021 Apr 1;154:1-4. link
Raghu VK, Weiss J, Hoffmann U, Aerts HJ, Lu MT. Deep learning to estimate biological age from chest radiographs. Cardiovascular Imaging. 2021 Nov 1;14(11):2226-36. link
Schreuder A, Mets OM, Schaefer-Prokop CM, Jacobs C, Prokop M. Microsimulation modeling of extended annual CT screening among lung cancer cases in the National Lung Screening Trial. Lung Cancer. 2021 Jun 1;156:5-11. link
Chillakuru YR, Kranen K, Doppalapudi V, Xiong Z, Fu L, Heydari A, Sheth A, Seo Y, Vu T, Sohn JH. High precision localization of pulmonary nodules on chest CT utilizing axial slice number labels. BMC medical imaging. 2021 Apr 9;21(1):66. link
Venkadesh KV, Setio AA, Schreuder A, Scholten ET, Chung K, Wille MMW, Saghir Z, van Ginneken B, Prokop M, Jacobs C. Deep learning for malignancy risk estimation of pulmonary nodules detected at low-dose screening CT. Radiology. 2021 Aug;300(2):438-47. link
Kastner J, Hossain R, Jeudy J, Dako F, Mehta V, Dalal S, Dharaiya E, White C. Lung-RADS version 1.0 versus lung-RADS version 1.1: comparison of categories using nodules from the national lung screening trial. Radiology. 2021 Jul;300(1):199-206. link
Yoo H, Lee SH, Arru CD, Doda Khera R, Singh R, Siebert S, Kim D, Lee Y, Park JH, Eom HJ, Digumarthy SR. AI-based improvement in lung cancer detection on chest radiographs: results of a multi-reader study in NLST dataset. European radiology. 2021 Dec;31(12):9664-74. link
National Lung Screening Trial Research Team. Reduced lung-cancer mortality with low-dose computed tomographic screening. New England Journal of Medicine. 2011 Aug 4;365(5):395-409. link
Kerssies T, Cavagnero N, Hermans A, Norouzi N, Averta G, Leibe B, Dubbelman G, de Geus D. Your ViT is Secretly an Image Segmentation Model. arXiv preprint arXiv:2503.19108. 2025 Mar 24. link
Erisen S. SERNet-former: semantic segmentation by efficient residual network with attention-boosting gates and attention-fusion networks. arXiv preprint arXiv:2401.15741. 2024 Jan 28. link
Zhang B, Liu L, Phan MH, Tian Z, Shen C, Liu Y. Segvit v2: Exploring efficient and continual semantic segmentation with plain vision transformers. International Journal of Computer Vision. 2024 Apr;132(4):1126-47. link
Chen X, Ding M, Wang X, Xin Y, Mo S, Wang Y, Han S, Luo P, Zeng G, Wang J. Context autoencoder for self-supervised representation learning. International Journal of Computer Vision. 2024 Jan;132(1):208-23. link
Shi D. Transnext: Robust foveal visual perception for vision transformers. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2024 (pp. 17773-17783). link
BaoLong N, Zhang C, Shi Y, Hirakawa T, Yamashita T, Matsui T, Fujiyoshi H. DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention. InProceedings of the Asian Conference on Computer Vision 2024 (pp. 4455-4472). link
Zhu X, Yang X, Wang Z, Li H, Dou W, Ge J, Lu L, Qiao Y, Dai J. Parameter-inverted image pyramid networks. Advances in Neural Information Processing Systems. 2024 Dec 16;37:132267-88. link
Ding X, Zhang Y, Ge Y, Zhao S, Song L, Yue X, Shan Y. Unireplknet: A universal perception large-kernel convnet for audio video point cloud time-series and image recognition. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024 (pp. 5513-5524). link
Kondapaneni N, Marks M, Knott M, Guimarães R, Perona P. Text-image alignment for diffusion-based perception. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024 (pp. 13883-13893). link
Oquab M, Darcet T, Moutakanni T, Vo H, Szafraniec M, Khalidov V, Fernandez P, Haziza D, Massa F, El-Nouby A, Assran M. Dinov2: Learning robust visual features without supervision. arXiv preprint arXiv:2304.07193. 2023 Apr 14. link
Ji Y, Chen Z, Xie E, Hong L, Liu X, Liu Z, Lu T, Li Z, Luo P. Ddp: Diffusion model for dense visual prediction. InProceedings of the IEEE/CVF International Conference on Computer Vision 2023 (pp. 21741-21752). link
Guo MH, Lu CZ, Liu ZN, Cheng MM, Hu SM. Visual attention network. Computational visual media. 2023 Dec;9(4):733-52. link
Wei G, Zhang Z, Lan C, Lu Y, Chen Z. Active token mixer. InProceedings of the AAAI conference on artificial intelligence 2023 Jun 26 (Vol. 37, No. 3, pp. 2759-2767). link
Wang W, Chen W, Qiu Q, Chen L, Wu B, Lin B, He X, Liu W. Crossformer++: A versatile vision transformer hinging on cross-scale attention. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2023 Dec 19;46(5):3123-36. link
Zhu L, Wang X, Ke Z, Zhang W, Lau RW. Biformer: Vision transformer with bi-level routing attention. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 10323-10333). link
Cui J, Zhong Z, Tian Z, Liu S, Yu B, Jia J. Generalized parametric contrastive learning. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2023 May 22. link
Xia Z, Pan X, Song S, Li LE, Huang G. Dat++: Spatially dynamic vision transformer with deformable attention. arXiv preprint arXiv:2309.01430. 2023 Sep 4. link
Wang W, Dai J, Chen Z, Huang Z, Li Z, Zhu X, Hu X, Lu T, Lu L, Li H, Wang X. Internimage: Exploring large-scale vision foundation models with deformable convolutions. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 14408-14419). link
Woo S, Debnath S, Hu R, Chen X, Liu Z, Kweon IS, Xie S. Convnext v2: Co-designing and scaling convnets with masked autoencoders. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 16133-16142). link
Wu D, Guo Z, Li A, Yu C, Gao C, Sang N. Conditional boundary loss for semantic segmentation. IEEE Transactions on Image Processing. 2023 Jul 5;32:3717-31. link
Jain J, Singh A, Orlov N, Huang Z, Li J, Walton S, Shi H. Semask: Semantically masked transformers for semantic segmentation. InProceedings of the IEEE/CVF international conference on computer vision 2023 (pp. 752-761). link
Wan Q, Huang Z, Kang B, Feng J, Zhang L. Harnessing diffusion models for visual perception with meta prompts. arXiv preprint arXiv:2312.14733. 2023 Dec 22. link
He H, Cai J, Pan Z, Liu J, Zhang J, Tao D, Zhuang B. Dynamic focus-aware positional queries for semantic segmentation. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 11299-11308). link
Su W, Zhu X, Tao C, Lu L, Li B, Huang G, Qiao Y, Wang X, Zhou J, Dai J. Towards all-in-one pre-training via maximizing multi-modal mutual information. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 15888-15899). link
Wang W, Bao H, Dong L, Bjorck J, Peng Z, Liu Q, Aggarwal K, Mohammed OK, Singhal S, Som S, Wei F. Image as a foreign language: Beit pretraining for vision and vision-language tasks. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 19175-19186). link
Li F, Zhang H, Xu H, Liu S, Zhang L, Ni LM, Shum HY. Mask dino: Towards a unified transformer-based framework for object detection and segmentation. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 3041-3050). link
Fang Y, Wang W, Xie B, Sun Q, Wu L, Wang X, Huang T, Wang X, Cao Y. Eva: Exploring the limits of masked visual representation learning at scale. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 19358-19369). link
Wang P, Wang S, Lin J, Bai S, Zhou X, Zhou J, Wang X, Zhou C. One-peace: Exploring one general representation model toward unlimited modalities. arXiv preprint arXiv:2305.11172. 2023 May 18. link
Liu Z, Hu H, Lin Y, Yao Z, Xie Z, Wei Y, Ning J, Cao Y, Zhang Z, Dong L, Wei F. Swin transformer v2: Scaling up capacity and resolution. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 12009-12019). link
Chen Z, Duan Y, Wang W, He J, Lu T, Dai J, Qiao Y. Vision transformer adapter for dense predictions. arXiv preprint arXiv:2205.08534. 2022 May 17. link
Cai Y, Zhou Y, Han Q, Sun J, Kong X, Li J, Zhang X. Reversible column networks. arXiv preprint arXiv:2212.11696. 2022 Dec 22. link
Wei Y, Hu H, Xie Z, Zhang Z, Cao Y, Bao J, Chen D, Guo B. Contrastive learning rivals masked image modeling in fine-tuning via feature distillation. arXiv preprint arXiv:2205.14141. 2022 May 27. link
Liu X, Zhou J, Kong T, Lin X, Ji R. Exploring target representations for masked autoencoders. arXiv preprint arXiv:2209.03917. 2022 Sep 8. link
Liu Z, Mao H, Wu CY, Feichtenhofer C, Darrell T, Xie S. A convnet for the 2020s. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 11976-11986). link
Dong X, Bao J, Chen D, Zhang W, Yu N, Yuan L, Chen D, Guo B. Cswin transformer: A general vision transformer backbone with cross-shaped windows. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 12124-12134). link
Lin Y, Liu Z, Zhang Z, Hu H, Zheng N, Lin S, Cao Y. Could giant pretrained image models extract universal representations?. arXiv preprint arXiv:2211.02043. 2022 Nov 3. link
Cui J, Yuan Y, Zhong Z, Tian Z, Hu H, Lin S, Jia J. Region rebalance for long-tailed semantic segmentation. arXiv preprint arXiv:2204.01969. 2022 Apr 5. link
Rao Y, Zhao W, Tang Y, Zhou J, Lim SN, Lu J. Hornet: Efficient high-order spatial interactions with recursive gated convolutions. Advances in Neural Information Processing Systems. 2022 Dec 6;35:10353-66. link
Hassani A, Shi H. Dilated neighborhood attention transformer. arXiv preprint arXiv:2209.15001. 2022 Sep 29. link
Yang C, Qiao S, Yu Q, Yuan X, Zhu Y, Yuille A, Adam H, Chen LC. Moat: Alternating mobile convolution and attention brings strong vision models. arXiv preprint arXiv:2210.01820. 2022 Oct 4. link
Yang J, Li C, Dai X, Gao J. Focal modulation networks. Advances in Neural Information Processing Systems. 2022 Dec 6;35:4203-17. link
Li S, Wang Z, Liu Z, Tan C, Lin H, Wu D, Chen Z, Zheng J, Li SZ. Moganet: Multi-order gated aggregation network. arXiv preprint arXiv:2211.03295. 2022 Nov 7. link
Khirodkar R, Smith B, Chandra S, Agrawal A, Criminisi A. Sequential ensembling for semantic segmentation. arXiv preprint arXiv:2210.05387. 2022 Oct 8. link
Yuan L, Hou Q, Jiang Z, Feng J, Yan S. Volo: Vision outlooker for visual recognition. IEEE transactions on pattern analysis and machine intelligence. 2022 Sep 12;45(5):6575-86. link
He K, Chen X, Xie S, Li Y, Dollár P, Girshick R. Masked autoencoders are scalable vision learners. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 16000-16009). link
Hong Y, Pan H, Sun W, Yu X, Gao H. Representation separation for semantic segmentation with vision transformers. arXiv preprint arXiv:2212.13764. 2022 Dec 28. link
Cheng B, Misra I, Schwing AG, Kirillov A, Girdhar R. Masked-attention mask transformer for universal image segmentation. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 1290-1299). link
Yang J, Li C, Zhang P, Dai X, Xiao B, Yuan L, Gao J. Focal self-attention for local-global interactions in vision transformers. arXiv preprint arXiv:2107.00641. 2021 Jul 1. link
Huang S, Lu Z, Cheng R, He C. FaPN: Feature-aligned pyramid network for dense image prediction. InProceedings of the IEEE/CVF international conference on computer vision 2021 (pp. 864-873). link
Bao H, Dong L, Piao S, Wei F. Beit: Bert pre-training of image transformers. arXiv preprint arXiv:2106.08254. 2021 Jun 15. link
Bousselham W, Thibault G, Pagano L, Machireddy A, Gray J, Chang YH, Song X. Efficient self-ensemble for semantic segmentation. arXiv preprint arXiv:2111.13280. 2021 Nov 26. link
Zhang W, Pang J, Chen K, Loy CC. K-net: Towards unified image segmentation. Advances in Neural Information Processing Systems. 2021 Dec 6;34:10326-38. link
Gong C, Wang D, Li M, Chandra V, Liu Q. Vision transformers with patch diversification. arXiv preprint arXiv:2104.12753. 2021 Apr 26. link
Geng Z, Guo MH, Chen H, Li X, Wei K, Lin Z. Is attention better than matrix decomposition?. arXiv preprint arXiv:2109.04553. 2021 Sep 9. link
Touvron H, Cord M, El-Nouby A, Bojanowski P, Joulin A, Synnaeve G, Jégou H. Augmenting convolutional networks with attention-based aggregation. arXiv preprint arXiv:2112.13692. 2021 Dec 27. link
Xie E, Wang W, Yu Z, Anandkumar A, Alvarez JM, Luo P. SegFormer: Simple and efficient design for semantic segmentation with transformers. Advances in neural information processing systems. 2021 Dec 6;34:12077-90. link
Jiang ZH, Hou Q, Yuan L, Zhou D, Shi Y, Jin X, Wang A, Feng J. All tokens matter: Token labeling for training better vision transformers. Advances in neural information processing systems. 2021 Dec 6;34:18590-602. link
Liu Z, Lin Y, Cao Y, Hu H, Wei Y, Zhang Z, Lin S, Guo B. Swin transformer: Hierarchical vision transformer using shifted windows. InProceedings of the IEEE/CVF international conference on computer vision 2021 (pp. 10012-10022). link
Strudel R, Garcia R, Laptev I, Schmid C. Segmenter: Transformer for semantic segmentation. InProceedings of the IEEE/CVF international conference on computer vision 2021 (pp. 7262-7272). link
Cheng B, Schwing A, Kirillov A. Per-pixel classification is not all you need for semantic segmentation. Advances in neural information processing systems. 2021 Dec 6;34:17864-75. link
Zhu X, Yang X, Wang Z, Li H, Dou W, Ge J, Lu L, Qiao Y, Dai J. Parameter-inverted image pyramid networks. Advances in Neural Information Processing Systems. 2024 Dec 16;37:132267-88. link
Wu J, Wang J, Yang Z, Gan Z, Liu Z, Yuan J, Wang L. Grit: A generative region-to-text transformer for object understanding. InEuropean Conference on Computer Vision 2024 Sep 29 (pp. 207-224). Cham: Springer Nature Switzerland. link
Wu J, Jiang Y, Liu Q, Yuan Z, Bai X, Bai S. General object foundation model for images and videos at scale. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024 (pp. 3783-3795). link
Hou X, Liu M, Zhang S, Wei P, Chen B, Lan X. Relation detr: Exploring explicit position relation prior for object detection. InEuropean Conference on Computer Vision 2024 Sep 29 (pp. 89-105). Cham: Springer Nature Switzerland. link
Liu S, Zeng Z, Ren T, Li F, Zhang H, Yang J, Jiang Q, Li C, Yang J, Su H, Zhu J. Grounding dino: Marrying dino with grounded pre-training for open-set object detection. InEuropean Conference on Computer Vision 2024 Sep 29 (pp. 38-55). Cham: Springer Nature Switzerland. link
Zong Z, Song G, Liu Y. Detrs with collaborative hybrid assignments training. InProceedings of the IEEE/CVF international conference on computer vision 2023 (pp. 6748-6758). link
Lin Y, Yuan Y, Zhang Z, Li C, Zheng N, Hu H. Detr does not need multi-scale or locality design. InProceedings of the IEEE/CVF International Conference on Computer Vision 2023 (pp. 6545-6554). link
Wang W, Bao H, Dong L, Bjorck J, Peng Z, Liu Q, Aggarwal K, Mohammed OK, Singhal S, Som S, Wei F. Image as a foreign language: Beit pretraining for vision and vision-language tasks. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 19175-19186). link
Duan K, Bai S, Xie L, Qi H, Huang Q, Tian Q. CenterNet++ for object detection. IEEE transactions on pattern analysis and machine intelligence. 2023 Dec 13;46(5):3509-21. link
Wang CY, Bochkovskiy A, Liao HY. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 7464-7475). link
Song P, Li P, Dai L, Wang T, Chen Z. Boosting R-CNN: Reweighting R-CNN samples by RPN’s error for underwater object detection. Neurocomputing. 2023 Apr 14;530:150-64. link
Wang W, Dai J, Chen Z, Huang Z, Li Z, Zhu X, Hu X, Lu T, Lu L, Li H, Wang X. Internimage: Exploring large-scale vision foundation models with deformable convolutions. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 14408-14419). link
Su W, Zhu X, Tao C, Lu L, Li B, Huang G, Qiao Y, Wang X, Zhou J, Dai J. Towards all-in-one pre-training via maximizing multi-modal mutual information. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 15888-15899). link
Oksuz K, Kuzucu S, Joy T, Dokania PK. Mocae: Mixture of calibrated experts significantly improves object detection. arXiv preprint arXiv:2309.14976. 2023 Sep 26. link
Ren T, Yang J, Liu S, Zeng A, Li F, Zhang H, Li H, Zeng Z, Zhang L. A strong and reproducible object detector with only public datasets. arXiv preprint arXiv:2304.13027. 2023 Apr 25.
Fang Y, Wang W, Xie B, Sun Q, Wu L, Wang X, Huang T, Wang X, Cao Y. Eva: Exploring the limits of masked visual representation learning at scale. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 19358-19369). link
Chen Q, Wang J, Han C, Zhang S, Li Z, Chen X, Chen J, Wang X, Han S, Zhang G, Feng H. Group detr v2: Strong object detector with encoder-decoder pretraining. arXiv preprint arXiv:2211.03594. 2022 Nov 7. link
Yang J, Li C, Dai X, Gao J. Focal modulation networks. Advances in Neural Information Processing Systems. 2022 Dec 6;35:4203-17. link
Wei Y, Hu H, Xie Z, Zhang Z, Cao Y, Bao J, Chen D, Guo B. Contrastive learning rivals masked image modeling in fine-tuning via feature distillation. arXiv preprint arXiv:2205.14141. 2022 May 27. link
Cai Y, Zhou Y, Han Q, Sun J, Kong X, Li J, Zhang X. Reversible column networks. arXiv preprint arXiv:2212.11696. 2022 Dec 22. link
Ouyang-Zhang J, Cho JH, Zhou X, Krähenbühl P. Nms strikes back. arXiv preprint arXiv:2212.06137. 2022 Dec 12. link
Zhang H, Li F, Liu S, Zhang L, Su H, Zhu J, Ni LM, Shum HY. Dino: Detr with improved denoising anchor boxes for end-to-end object detection. arXiv preprint arXiv:2203.03605. 2022 Mar 7. link
Liu Z, Hu H, Lin Y, Yao Z, Xie Z, Wei Y, Ning J, Cao Y, Zhang Z, Dong L, Wei F. Swin transformer v2: Scaling up capacity and resolution. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 12009-12019). link
Zhang H, Zhang P, Hu X, Chen YC, Li L, Dai X, Wang L, Yuan L, Hwang JN, Gao J. Glipv2: Unifying localization and vision-language understanding. Advances in Neural Information Processing Systems. 2022 Dec 6;35:36067-80. link
Li LH, Zhang P, Zhang H, Yang J, Li C, Zhong Y, Wang L, Yuan L, Zhang L, Hwang JN, Chang KW. Grounded language-image pre-training. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 10965-10975). link
Chen Z, Duan Y, Wang W, He J, Lu T, Dai J, Qiao Y. Vision transformer adapter for dense predictions. arXiv preprint arXiv:2205.08534. 2022 May 17. link
Liang T, Chu X, Liu Y, Wang Y, Tang Z, Chu W, Chen J, Ling H. Cbnet: A composite backbone network architecture for object detection. IEEE Transactions on Image Processing. 2022 Oct 28;31:6893-906. link
Liu X, Zhou J, Kong T, Lin X, Ji R. Exploring target representations for masked autoencoders. arXiv preprint arXiv:2209.03917. 2022 Sep 8. link
Zhang H, Wu C, Zhang Z, Zhu Y, Lin H, Zhang Z, Sun Y, He T, Mueller J, Manmatha R, Li M. Resnest: Split-attention networks. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 2736-2746). link
Xu S, Wang X, Lv W, Chang Q, Cui C, Deng K, Wang G, Dang Q, Wei S, Du Y, Lai B. PP-YOLOE: An evolved version of YOLO. arXiv preprint arXiv:2203.16250. 2022 Mar 30. link
Li S, Wu D, Wu F, Zang Z, Li SZ. Architecture-agnostic masked image modeling: from vit back to cnn. arXiv preprint arXiv:2205.13943. 2022 May 27. link
Yuan L, Chen D, Chen YL, Codella N, Dai X, Gao J, Hu H, Huang X, Li B, Li C, Liu C. Florence: A new foundation model for computer vision. arXiv preprint arXiv:2111.11432. 2021 Nov 22. link
Xu M, Zhang Z, Hu H, Wang J, Wang L, Wei F, Bai X, Liu Z. End-to-end semi-supervised object detection with soft teacher. InProceedings of the IEEE/CVF international conference on computer vision 2021 (pp. 3060-3069). link
Dai X, Chen Y, Xiao B, Chen D, Liu M, Yuan L, Zhang L. Dynamic head: Unifying object detection heads with attentions. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2021 (pp. 7373-7382). link
Yang J, Li C, Zhang P, Dai X, Xiao B, Yuan L, Gao J. Focal self-attention for local-global interactions in vision transformers. arXiv preprint arXiv:2107.00641. 2021 Jul 1. link
Liu Z, Lin Y, Cao Y, Hu H, Wei Y, Zhang Z, Lin S, Guo B. Swin transformer: Hierarchical vision transformer using shifted windows. InProceedings of the IEEE/CVF international conference on computer vision 2021 (pp. 10012-10022). link
Ghiasi G, Cui Y, Srinivas A, Qian R, Lin TY, Cubuk ED, Le QV, Zoph B. Simple copy-paste is a strong data augmentation method for instance segmentation. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2021 (pp. 2918-2928). link
Zhou X, Koltun V, Krähenbühl P. Probabilistic two-stage detection. arXiv preprint arXiv:2103.07461. 2021 Mar 12. link
Hu J, Cao L, Lu Y, Zhang S, Wang Y, Li K, Huang F, Shao L, Ji R. Istr: End-to-end instance segmentation with transformers. arXiv preprint arXiv:2105.00637. 2021 May 3. link
Fang Y, Yang S, Wang X, Li Y, Fang C, Shan Y, Feng B, Liu W. Instances as queries. InProceedings of the IEEE/CVF international conference on computer vision 2021 (pp. 6910-6919). link
Wang CY, Bochkovskiy A, Liao HY. Scaled-yolov4: Scaling cross stage partial network. InProceedings of the IEEE/cvf conference on computer vision and pattern recognition 2021 (pp. 13029-13038). link
Qiao S, Chen LC, Yuille A. Detectors: Detecting objects with recursive feature pyramid and switchable atrous convolution. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2021 (pp. 10213-10224). link
Wang CY, Yeh IH, Liao HY. You only learn one representation: Unified network for multiple tasks. arXiv preprint arXiv:2105.04206. 2021 May 10. link
Shinya Y. USB: Universal-scale object detection benchmark. arXiv preprint arXiv:2103.14027. 2021 Mar 25. link
Li X, Wang W, Hu X, Li J, Tang J, Yang J. Generalized focal loss v2: Learning reliable localization quality estimation for dense object detection. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2021 (pp. 11632-11641). link
Ge Z, Liu S, Li Z, Yoshie O, Sun J. Ota: Optimal transport assignment for object detection. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2021 (pp. 303-312). link
Ge Z, Liu S, Wang F, Li Z, Sun J. Yolox: Exceeding yolo series in 2021. arXiv preprint arXiv:2107.08430. 2021 Jul 18. link
Zoph B, Ghiasi G, Lin TY, Cui Y, Liu H, Cubuk ED, Le Q. Rethinking pre-training and self-training. Advances in neural information processing systems. 2020;33:3833-45. link
Kim K, Lee HS. Probabilistic anchor assignment with iou prediction for object detection. InComputer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXV 16 2020 (pp. 355-371). Springer International Publishing. link
Chi C, Wei F, Hu H. Relationnet++: Bridging visual representations for object detection via transformer decoder. Advances in Neural Information Processing Systems. 2020;33:13564-74. link
Tan M, Pang R, Le QV. Efficientdet: Scalable and efficient object detection. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2020 (pp. 10781-10790). link
Zhu X, Su W, Lu L, Li B, Wang X, Dai J. Deformable detr: Deformable transformers for end-to-end object detection. arXiv preprint arXiv:2010.04159. 2020 Oct 8. link
Cao Y, Xu J, Lin S, Wei F, Hu H. Global context networks. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2020 Dec 24;45(6):6881-95. link
Du X, Lin TY, Jin P, Ghiasi G, Tan M, Cui Y, Le QV, Song X. Spinenet: Learning scale-permuted backbone for recognition and localization. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2020 (pp. 11592-11601). link
Chen Y, Zhang Z, Cao Y, Wang L, Lin S, Hu H. Reppoints v2: Verification meets regression for object detection. Advances in Neural Information Processing Systems. 2020;33:5621-31. link
Cao J, Chen Q, Guo J, Shi R. Attention-guided context feature pyramid network for object detection. arXiv preprint arXiv:2005.11475. 2020 May 23. link
Song G, Liu Y, Wang X. Revisiting the sibling head in object detector. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2020 (pp. 11563-11572). link
Zhang S, Chi C, Yao Y, Lei Z, Li SZ. Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2020 (pp. 9759-9768). link
Zoph B, Cubuk ED, Ghiasi G, Lin TY, Shlens J, Le QV. Learning data augmentation strategies for object detection. InComputer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXVII 16 2020 (pp. 566-583). Springer International Publishing. link
Oksuz K, Cam BC, Akbas E, Kalkan S. A ranking-based, balanced loss function unifying classification and localisation in object detection. Advances in Neural Information Processing Systems. 2020;33:15534-45. link
Wang X, Zhang S, Yu Z, Feng L, Zhang W. Scale-equalizing pyramid convolution for object detection. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2020 (pp. 13359-13368). link
Cao J, Cholakkal H, Anwer RM, Khan FS, Pang Y, Shao L. D2det: Towards high quality object detection and instance segmentation. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2020 (pp. 11485-11494). link
Zhang H, Chang H, Ma B, Wang N, Chen X. Dynamic R-CNN: Towards high quality object detection via dynamic training. InComputer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XV 16 2020 (pp. 260-275). Springer International Publishing. link
Duan K, Xie L, Qi H, Bai S, Huang Q, Tian Q. Corner proposal network for anchor-free, two-stage object detection. InEuropean Conference on Computer Vision 2020 Aug 23 (pp. 399-416). Cham: Springer International Publishing. link
Duan K, Xie L, Qi H, Bai S, Huang Q, Tian Q. Location-sensitive visual recognition with cross-iou loss. arXiv preprint arXiv:2104.04899. 2021 Apr 11. link
Zhou J, Wei C, Wang H, Shen W, Xie C, Yuille A, Kong T. ibot: Image bert pre-training with online tokenizer. arXiv preprint arXiv:2111.07832. 2021 Nov 15. link
Wang C, Li K, Jiang T, Zeng X, Wang Y, Wang L. Make Your Training Flexible: Towards Deployment-Efficient Video Models. arXiv preprint arXiv:2503.14237. 2025 Mar 18. link
Srivastava S, Sharma G. Omnivec2-a novel transformer based network for large scale multimodal and multitask learning. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2024 (pp. 27412-27424). link
Lu H, Jian H, Poppe R, Salah AA. Enhancing video transformers for action understanding with vlm-aided training. arXiv preprint arXiv:2403.16128. 2024 Mar 24. link
Wang Y, Li K, Li X, Yu J, He Y, Chen G, Pei B, Zheng R, Wang Z, Shi Y, Jiang T. Internvideo2: Scaling foundation models for multimodal video understanding. InEuropean Conference on Computer Vision 2024 Sep 29 (pp. 396-416). Cham: Springer Nature Switzerland. link
Srivastava S, Sharma G. Omnivec: Learning robust representations with cross modal sharing. InProceedings of the IEEE/CVF winter conference on applications of computer vision 2024 (pp. 1236-1248). link
Agrawal T, Ali A, Dantcheva A, Bremond F. AM Flow: Adapters for Temporal Processing in Action Recognition. arXiv preprint arXiv:2411.02065. 2024 Nov 4. link
Li X, Zhu Y, Wang L. Zeroi2v: Zero-cost adaptation of pre-trained transformers from image to video. InEuropean Conference on Computer Vision 2024 Sep 29 (pp. 425-443). Cham: Springer Nature Switzerland. link
Kim M, Seo PH, Schmid C, Cho M. Learning correlation structures for vision transformers. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2024 (pp. 18941-18951).
Kahatapitiya K, Arnab A, Nagrani A, Ryoo MS. Victr: Video-conditioned text representations for activity recognition. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024 (pp. 18547-18558). link
Li K, Li X, Wang Y, He Y, Wang Y, Wang L, Qiao Y. Videomamba: State space model for efficient video understanding. InEuropean Conference on Computer Vision 2024 Sep 29 (pp. 237-255). Cham: Springer Nature Switzerland. link
Zhao Z, Huang B, Xing S, Wu G, Qiao Y, Wang L. Asymmetric masked distillation for pre-training small foundation models. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024 (pp. 18516-18526). link
Bandara WG, Patel N, Gholami A, Nikkhah M, Agrawal M, Patel VM. Adamae: Adaptive masking for efficient spatiotemporal learning with masked autoencoders. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 14507-14517). link
Lee D, Lee J, Choi J. CAST: cross-attention in space and time for video action recognition. Advances in Neural Information Processing Systems. 2023 Dec 15;36:79399-425. link
Wang L, Huang B, Zhao Z, Tong Z, He Y, Wang Y, Wang Y, Qiao Y. Videomae v2: Scaling video masked autoencoders with dual masking. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 14549-14560). link
Li K, Wang Y, Li Y, Wang Y, He Y, Wang L, Qiao Y. Unmasked teacher: Towards training-efficient video foundation models. InProceedings of the IEEE/CVF International Conference on Computer Vision 2023 (pp. 19948-19960). link
Huang Z, Zhang S, Pan L, Qing Z, Zhang Y, Liu Z, Ang Jr MH. Temporally-adaptive models for efficient video understanding. arXiv preprint arXiv:2308.05787. 2023 Aug 10. link
Wu W, Wang X, Luo H, Wang J, Yang Y, Ouyang W. Bidirectional cross-modal knowledge exploration for video recognition with pre-trained vision-language models. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 6620-6630). link
Tu S, Dai Q, Wu Z, Cheng ZQ, Hu H, Jiang YG. Implicit temporal modeling with learnable alignment for video recognition. InProceedings of the ieee/cvf international conference on computer vision 2023 (pp. 19936-19947). link
Fang Y, Wang W, Xie B, Sun Q, Wu L, Wang X, Huang T, Wang X, Cao Y. Eva: Exploring the limits of masked visual representation learning at scale. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 19358-19369). link
Yao H, Wu W, Li Z. Side4video: Spatial-temporal side network for memory-efficient image-to-video transfer learning. arXiv preprint arXiv:2311.15769. 2023 Nov 27. link
Wu W, Song Y, Sun Z, Wang J, Xu C, Ouyang W. What can simple arithmetic operations do for temporal modeling?. InProceedings of the IEEE/CVF international conference on computer vision 2023 (pp. 13712-13722). link
Qing Z, Zhang S, Huang Z, Wang X, Wang Y, Lv Y, Gao C, Sang N. Mar: Masked autoencoders for efficient action recognition. IEEE Transactions on Multimedia. 2023 Mar 30;26:218-33. link
Wu W, Sun Z, Ouyang W. Revisiting classifier: Transferring vision-language models for video recognition. InProceedings of the AAAI conference on artificial intelligence 2023 Jun 26 (Vol. 37, No. 3, pp. 2847-2855). link
Wang Y, Li K, Li Y, He Y, Huang B, Zhao Z, Zhang H, Xu J, Liu Y, Wang Z, Xing S. Internvideo: General video foundation models via generative and discriminative learning. arXiv preprint arXiv:2212.03191. 2022 Dec 6. link
Li K, Wang Y, He Y, Li Y, Wang Y, Wang L, Qiao Y. Uniformerv2: Spatiotemporal learning by arming image vits with video uniformer. arXiv preprint arXiv:2211.09552. 2022 Nov 17. link
Yan S, Xiong X, Arnab A, Lu Z, Zhang M, Sun C, Schmid C. Multiview transformers for video recognition. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 3333-3343). link
Yu J, Wang Z, Vasudevan V, Yeung L, Seyedhosseini M, Wu Y. Coca: Contrastive captioners are image-text foundation models. arXiv preprint arXiv:2205.01917. 2022 May 4. link
Wang P, Wang S, Lin J, Bai S, Zhou X, Zhou J, Wang X, Zhou C. One-peace: Exploring one general representation model toward unlimited modalities. arXiv preprint arXiv:2305.11172. 2023 May 18. link
Park J, Lee J, Sohn K. Dual-path adaptation from image to video transformers. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 2203-2213). link
Yang T, Zhu Y, Xie Y, Zhang A, Chen C, Li M. Aim: Adapting image models for efficient video action recognition. arXiv preprint arXiv:2302.03024. 2023 Feb 6. link
Xu H, Ye Q, Yan M, Shi Y, Ye J, Xu Y, Li C, Bi B, Qian Q, Wang W, Xu G. mplug-2: A modularized multi-modal foundation model across text, image and video. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 38728-38748). PMLR. link
Wang R, Chen D, Wu Z, Chen Y, Dai X, Liu M, Yuan L, Jiang YG. Masked video distillation: Rethinking masked feature modeling for self-supervised video representation learning. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 6312-6322). link
Dehghani M, Djolonga J, Mustafa B, Padlewski P, Heek J, Gilmer J, Steiner AP, Caron M, Geirhos R, Alabdulmohsin I, Jenatton R. Scaling vision transformers to 22 billion parameters. InInternational Conference on Machine Learning 2023 Jul 3 (pp. 7480-7512). PMLR. link
Ryali C, Hu YT, Bolya D, Wei C, Fan H, Huang PY, Aggarwal V, Chowdhury A, Poursaeed O, Hoffman J, Malik J. Hiera: A hierarchical vision transformer without the bells-and-whistles. InInternational conference on machine learning 2023 Jul 3 (pp. 29441-29454). PMLR. link
Lin Z, Geng S, Zhang R, Gao P, De Melo G, Wang X, Dai J, Qiao Y, Li H. Frozen clip models are efficient video learners. InEuropean Conference on Computer Vision 2022 Oct 23 (pp. 388-404). Cham: Springer Nature Switzerland. link
Tong Z, Song Y, Wang J, Wang L. Videomae: Masked autoencoders are data-efficient learners for self-supervised video pre-training. Advances in neural information processing systems. 2022 Dec 6;35:10078-93. link
Pan J, Lin Z, Zhu X, Shao J, Li H. St-adapter: Parameter-efficient image-to-video transfer learning. Advances in Neural Information Processing Systems. 2022 Dec 6;35:26462-77. link
Wei C, Fan H, Xie S, Wu CY, Yuille A, Feichtenhofer C. Masked feature prediction for self-supervised visual pre-training. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 14668-14678). link
Qiu Z, Yao T, Ngo CW, Mei T. Mlp-3d: A mlp-like 3d architecture with grouped time mixing. InProceedings of the ieee/cvf conference on computer vision and pattern recognition 2022 (pp. 3062-3072). link
Liu Z, Ning J, Cao Y, Wei Y, Zhang Z, Lin S, Hu H. Video swin transformer. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 3202-3211). link
Girdhar R, Singh M, Ravi N, Van Der Maaten L, Joulin A, Misra I. Omnivore: A single model for many visual modalities. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 16102-16112). link
Hernandez J, Villegas R, Ordonez V. Vic-mae: Self-supervised representation learning from images and video with contrastive masked autoencoders. InEuropean Conference on Computer Vision 2024 Sep 29 (pp. 444-463). Cham: Springer Nature Switzerland. link
Liu Z, Hu H, Lin Y, Yao Z, Xie Z, Wei Y, Ning J, Cao Y, Zhang Z, Dong L, Wei F. Swin transformer v2: Scaling up capacity and resolution. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 12009-12019). link
Li K, Wang Y, Gao P, Song G, Liu Y, Li H, Qiao Y. Uniformer: Unified transformer for efficient spatiotemporal representation learning. arXiv preprint arXiv:2201.04676. 2022 Jan 12. link
Truong TD, Bui QH, Duong CN, Seo HS, Phung SL, Li X, Luu K. Direcformer: A directed attention in transformer approach to robust action recognition. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 20030-20040). link
Xiang W, Li C, Wang B, Wei X, Hua XS, Zhang L. Spatiotemporal self-attention modeling with temporal patch shift for action recognition. InEuropean Conference on Computer Vision 2022 Oct 23 (pp. 627-644). Cham: Springer Nature Switzerland. link
Ni B, Peng H, Chen M, Zhang S, Meng G, Fu J, Xiang S, Ling H. Expanding language-image pretrained models for general video recognition. InEuropean conference on computer vision 2022 Oct 23 (pp. 1-18). Cham: Springer Nature Switzerland. link
Li Y, Wu CY, Fan H, Mangalam K, Xiong B, Malik J, Feichtenhofer C. Mvitv2: Improved multiscale vision transformers for classification and detection. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 4804-4814). link
Long F, Qiu Z, Pan Y, Yao T, Luo J, Mei T. Stand-alone inter-frame attention in video models. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2022 (pp. 3192-3201). link
Wang M, Xing J, Liu Y. Actionclip: A new paradigm for video action recognition. arXiv preprint arXiv:2109.08472. 2021 Sep 17. link
Zhang B, Yu J, Fifty C, Han W, Dai AM, Pang R, Sha F. Co-training transformer with videos and images improves action recognition. arXiv preprint arXiv:2112.07175. 2021 Dec 14. link
Patrick M, Campbell D, Asano Y, Misra I, Metze F, Feichtenhofer C, Vedaldi A, Henriques JF. Keeping your eye on the ball: Trajectory attention in video transformers. Advances in neural information processing systems. 2021 Dec 6;34:12493-506. link
Akbari H, Yuan L, Qian R, Chuang WH, Chang SF, Cui Y, Gong B. Vatt: Transformers for multimodal self-supervised learning from raw video, audio and text. Advances in neural information processing systems. 2021 Dec 6;34:24206-21. link
Ryoo MS, Piergiovanni AJ, Arnab A, Dehghani M, Angelova A. Tokenlearner: What can 8 learned tokens do for images and videos?. arXiv preprint arXiv:2106.11297. 2021 Jun 21. link
Kondratyuk D, Yuan L, Li Y, Zhang L, Tan M, Brown M, Gong B. Movinets: Mobile video networks for efficient video recognition. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2021 (pp. 16020-16030). link
Zhang Y, Li X, Liu C, Shuai B, Zhu Y, Brattoli B, Chen H, Marsic I, Tighe J. Vidtr: Video transformer without convolutions. InProceedings of the IEEE/CVF international conference on computer vision 2021 (pp. 13577-13587). link
Bertasius G, Wang H, Torresani L. Is space-time attention all you need for video understanding?. InICML 2021 Jul 18 (Vol. 2, No. 3, p. 4). link
Nagrani A, Yang S, Arnab A, Jansen A, Schmid C, Sun C. Attention bottlenecks for multimodal fusion. Advances in neural information processing systems. 2021 Dec 6;34:14200-13. link
Fan H, Xiong B, Mangalam K, Li Y, Yan Z, Malik J, Feichtenhofer C. Multiscale vision transformers. InProceedings of the IEEE/CVF international conference on computer vision 2021 (pp. 6824-6835). link
Du X, Li Y, Cui Y, Qian R, Li J, Bello I. Revisiting 3D ResNets for video recognition. arXiv preprint arXiv:2109.01696. 2021 Sep 3. link
Arnab A, Dehghani M, Heigold G, Sun C, Lučić M, Schmid C. Vivit: A video vision transformer. InProceedings of the IEEE/CVF international conference on computer vision 2021 (pp. 6836-6846). link
Sharir G, Noy A, Zelnik-Manor L. An image is worth 16x16 words, what is a video worth?. arXiv preprint arXiv:2103.13915. 2021 Mar 25. link
Feichtenhofer C. X3d: Expanding architectures for efficient video recognition. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2020 (pp. 203-213). link
Duan H, Zhao Y, Xiong Y, Liu W, Lin D. Omni-sourced webly-supervised learning for video recognition. InEuropean conference on computer vision 2020 Aug 23 (pp. 670-688). Cham: Springer International Publishing. link
Tran D, Wang H, Torresani L, Feiszli M. Video classification with channel-separated convolutional networks. InProceedings of the IEEE/CVF international conference on computer vision 2019 (pp. 5552-5561). link
Ghadiyaram D, Tran D, Mahajan D. Large-scale weakly-supervised pre-training for video action recognition. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2019 (pp. 12046-12055). link
Qiu Z, Yao T, Ngo CW, Tian X, Mei T. Learning spatio-temporal representation with local and global diffusion. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2019 (pp. 12056-12065). link
Alexandridis KP, Deng J, Nguyen A, Luo S. Adaptive Parametric Activation. InEuropean Conference on Computer Vision 2024 Sep 29 (pp. 455-476). Cham: Springer Nature Switzerland. link
Yang Z, Xu Q, Wang Z, Li S, Han B, Bao S, Cao X, Huang Q. Harnessing hierarchical label distribution variations in test agnostic long-tail recognition. arXiv preprint arXiv:2405.07780. 2024 May 13. link
Rangwani H, Mondal P, Mishra M, Asokan AR, Babu RV. Deit-lt: Distillation strikes back for vision transformer training on long-tailed datasets. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024 (pp. 23396-23406). link
Du C, Wang Y, Song S, Huang G. Probabilistic contrastive learning for long-tailed visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2024 Feb 23. link
Shi JX, Wei T, Zhou Z, Shao JJ, Han XY, Li YF. Long-tail learning with foundation model: Heavy fine-tuning hurts. arXiv preprint arXiv:2309.10019. 2023 Sep 18. link
Zhao Q, Jiang C, Hu W, Zhang F, Liu J. Mdcs: More diverse experts with consistency self-distillation for long-tailed recognition. InProceedings of the IEEE/CVF International Conference on Computer Vision 2023 (pp. 11597-11608). link
Iscen A, Fathi A, Schmid C. Improving image recognition by retrieving from web-scale image-text data. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 19295-19304). link
Suh MK, Seo SW. Long-tailed recognition by mutual information maximization between latent features and ground-truth labels. InInternational conference on machine learning 2023 Jul 3 (pp. 32770-32782). PMLR. link
Cui J, Zhong Z, Tian Z, Liu S, Yu B, Jia J. Generalized parametric contrastive learning. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2023 May 22. link
Sinha S, Ohashi H. Difficulty-net: Learning to predict difficulty for long-tailed recognition. InProceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision 2023 (pp. 6444-6453). link
Du F, Yang P, Jia Q, Nan F, Chen X, Yang Y. Global and local mixture consistency cumulative learning for long-tailed visual recognitions. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 15814-15823). link
Li J, Tan Z, Wan J, Lei Z, Guo G. Nested collaborative learning for long-tailed visual recognition. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 6949-6958). link
Cui J, Liu S, Tian Z, Zhong Z, Jia J. Reslt: Residual learning for long-tailed recognition. IEEE transactions on pattern analysis and machine intelligence. 2022 May 13;45(3):3695-706. link
Rangwani H, Aithal SK, Mishra M. Escaping saddle points for effective generalization on class-imbalanced data. Advances in Neural Information Processing Systems. 2022 Dec 6;35:22791-805. link
Li T, Cao P, Yuan Y, Fan L, Yang Y, Feris RS, Indyk P, Katabi D. Targeted supervised contrastive learning for long-tailed recognition. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 6918-6928). link
Alshammari S, Wang YX, Ramanan D, Kong S. Long-tailed recognition via weight balancing. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 6897-6907). link
Gesmundo A. A continual development methodology for large-scale multitask dynamic ML systems. arXiv preprint arXiv:2209.07326. 2022 Sep 15. link
Tian C, Wang W, Zhu X, Dai J, Qiao Y. Vl-ltr: Learning class-wise visual-linguistic representation for long-tailed visual recognition. InEuropean conference on computer vision 2022 Oct 20 (pp. 73-91). Cham: Springer Nature Switzerland. link
Park S, Hong Y, Heo B, Yun S, Choi JY. The majority can help the minority: Context-rich minority oversampling for long-tailed classification. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 6887-6896). link
Zhu J, Wang Z, Chen J, Chen YP, Jiang YG. Balanced contrastive learning for long-tailed visual recognition. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 6908-6917). link
Hou Z, Yu B, Tao D. Batchformer: Learning to explore sample relationships for robust representation learning. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 7256-7266). link
Zada S, Benou I, Irani M. Pure noise to the rescue of insufficient data: Improving imbalanced classification by training on random noise images. InInternational Conference on Machine Learning 2022 Jun 28 (pp. 25817-25833). PMLR. link
Zhang Y, Hooi B, Hong L, Feng J. Self-supervised aggregation of diverse experts for test-agnostic long-tailed recognition. Advances in neural information processing systems. 2022 Dec 6;35:34077-90. link
Ma T, Geng S, Wang M, Shao J, Lu J, Li H, Gao P, Qiao Y. A simple long-tailed recognition baseline via vision-language model. arXiv preprint arXiv:2111.14745. 2021 Nov 29. link
Hong Y, Han S, Choi K, Seo S, Kim B, Chang B. Disentangling label distribution for long-tailed visual recognition. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2021 (pp. 6626-6636). link
Wang J, Lukasiewicz T, Hu X, Cai J, Xu Z. Rsg: A simple but effective module for learning imbalanced datasets. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2021 (pp. 3784-3793). link
Zhong Z, Cui J, Liu S, Jia J. Improving calibration for long-tailed recognition. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2021 (pp. 16489-16498). link
Li S, Gong K, Liu CH, Wang Y, Qiao F, Cheng X. Metasaug: Meta semantic augmentation for long-tailed visual recognition. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2021 (pp. 5212-5221). link
Samuel D, Atzmon Y, Chechik G. From generalized zero-shot learning to long-tail with class descriptors. InProceedings of the IEEE/CVF winter conference on applications of computer vision 2021 (pp. 286-295). link
Tang K, Huang J, Zhang H. Long-tailed classification by keeping the good and removing the bad momentum causal effect. Advances in neural information processing systems. 2020;33:1513-24. link
Yang Y, Xu Z. Rethinking the value of labels for improving class-imbalanced learning. Advances in neural information processing systems. 2020;33:19290-301. link
Kang B, Li Y, Xie S, Yuan Z, Feng J. Exploring balanced feature spaces for representation learning. InInternational conference on learning representations 2020 Apr. link
Kozerawski J, Fragoso V, Karianakis N, Mittal G, Turk M, Chen M. Blt: Balancing long-tailed datasets with adversarially-perturbed images. InProceedings of the Asian Conference on Computer Vision 2020. link
Zhu L, Yang Y. Inflated episodic memory with region self-attention for long-tailed visual recognition. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2020 (pp. 4344-4353). link
Sharma S, Yu N, Fritz M, Schiele B. Long-tailed recognition using class-balanced experts. InPattern Recognition: 42nd DAGM German Conference, DAGM GCPR 2020, Tübingen, Germany, September 28–October 1, 2020, Proceedings 42 2021 (pp. 86-100). Springer International Publishing. link
Ren J, Yu C, Ma X, Zhao H, Yi S. Balanced meta-softmax for long-tailed visual recognition. Advances in neural information processing systems. 2020;33:4175-86. link
Xiang L, Ding G, Han J. Learning from multiple experts: Self-paced knowledge distillation for long-tailed classification. InComputer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part V 16 2020 (pp. 247-263). Springer International Publishing. link
Sinha S, Ohashi H, Nakamura K. Class-wise difficulty-balanced loss for solving class-imbalance. InProceedings of the Asian conference on computer vision 2020. link
Kang B, Xie S, Rohrbach M, Yan Z, Gordo A, Feng J, Kalantidis Y. Decoupling representation and classifier for long-tailed recognition. arXiv preprint arXiv:1910.09217. 2019 Oct 21. link
Jamal MA, Brown M, Yang MH, Wang L, Gong B. Rethinking class-balanced methods for long-tailed visual recognition from a domain adaptation perspective. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2020 (pp. 7610-7619). link
Chu P, Bian X, Liu S, Ling H. Feature space augmentation for long-tailed data. InComputer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXIX 16 2020 (pp. 694-710). Springer International Publishing. link
Menon AK, Jayasumana S, Rawat AS, Jain H, Veit A, Kumar S. Long-tail learning via logit adjustment. arXiv preprint arXiv:2007.07314. 2020 Jul 14. link
Iscen A, Araujo A, Gong B, Schmid C. Class-balanced distillation for long-tailed visual recognition. arXiv preprint arXiv:2104.05279. 2021 Apr 12. link
He YY, Wu J, Wei XS. Distilling virtual examples for long-tailed recognition. InProceedings of the IEEE/CVF international conference on computer vision 2021 (pp. 235-244). link
Zhang S, Li Z, Yan S, He X, Sun J. Distribution alignment: A unified framework for long-tail visual recognition. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2021 (pp. 2361-2370). link
Samuel D, Chechik G. Distributional robustness loss for long-tail learning. InProceedings of the IEEE/CVF international conference on computer vision 2021 (pp. 9495-9504). link
Wang X, Lian L, Miao Z, Liu Z, Yu SX. Long-tailed recognition by routing diverse distribution-aware experts. arXiv preprint arXiv:2010.01809. 2020 Oct 5. link
Liu Z, Miao Z, Zhan X, Wang J, Gong B, Yu SX. Large-scale long-tailed recognition in an open world. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2019 (pp. 2537-2546). link
Zhang Z, Tang H, Tang J. Multi-scale Activation, Selection, and Aggregation: Exploring Diverse Cues for Fine-Grained Bird Recognition. InProceedings of the AAAI Conference on Artificial Intelligence 2025 Apr 11 (Vol. 39, No. 10, pp. 10385-10393). link
Zheng R, Liu L, Yu Z, Zhang Y, Cheng HV, Ding C. Learning class unique features in fine-grained visual classification. InICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025 Apr 6 (pp. 1-5). IEEE. link
Chen V, Taesiri MR, Nguyen AT. PCNN: Probable-class nearest-neighbor explanations improve fine-grained image classification accuracy for AIs and humans. arXiv preprint arXiv:2308.13651. 2023 Aug 25. link
Chou PY, Kao YY, Lin CH. Fine-grained visual classification with high-temperature refinement and background suppression. arXiv preprint arXiv:2303.06442. 2023 Mar 11. link
Singh M, Gustafson L, Adcock A, de Freitas Reis V, Gedik B, Kosaraju RP, Mahajan D, Girshick R, Dollár P, Van Der Maaten L. Revisiting weakly supervised pre-training of visual perception models. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2022 (pp. 804-814). link
He J, Chen JN, Liu S, Kortylewski A, Yang C, Bai Y, Wang C. Transfg: A transformer architecture for fine-grained recognition. InProceedings of the AAAI conference on artificial intelligence 2022 Jun 28 (Vol. 36, No. 1, pp. 852-860). link
Liang Y, Zhu L, Wang X, Yang Y. A simple episodic linear probe improves visual recognition in the wild. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2022 (pp. 9559-9569). link
Chou PY, Lin CH, Kao WC. A novel plug-in module for fine-grained visual classification. arXiv preprint arXiv:2202.03822. 2022 Feb 8. link
Wu D, Li S, Zang Z, Li SZ. Exploring localization for self-supervised fine-grained contrastive learning. arXiv preprint arXiv:2106.15788. 2021 Jun 30. link
Behera A, Wharton Z, Hewage PR, Bera A. Context-aware attentional pooling (cap) for fine-grained visual classification. InProceedings of the AAAI conference on artificial intelligence 2021 May 18 (Vol. 35, No. 2, pp. 929-937). link
Rao Y, Chen G, Lu J, Zhou J. Counterfactual attention learning for fine-grained visual categorization and re-identification. InProceedings of the IEEE/CVF international conference on computer vision 2021 (pp. 1025-1034). link
Wang J, Li N, Luo Z, Zhong Z, Li S. High-order-interaction for weakly supervised fine-grained visual categorization. Neurocomputing. 2021 Nov 13;464:27-36. link
Wang J, Yu X, Gao Y. Feature fusion vision transformer for fine-grained visual categorization. arXiv preprint arXiv:2107.02341. 2021 Jul 6. link
Song J, Yang R. Feature boosting, suppression, and diversification for fine-grained visual classification. In2021 International joint conference on neural networks (IJCNN) 2021 Jul 18 (pp. 1-8). IEEE. link
Zhang F, Li M, Zhai G, Liu Y. Multi-branch and multi-scale attention learning for fine-grained visual categorization. InMultiMedia Modeling: 27th International Conference, MMM 2021, Prague, Czech Republic, June 22–24, 2021, Proceedings, Part I 27 2021 (pp. 136-147). Springer International Publishing. link
Imran A, Athitsos V. Domain adaptive transfer learning on visual attention aware data augmentation for fine-grained visual categorization. InAdvances in Visual Computing: 15th International Symposium, ISVC 2020, San Diego, CA, USA, October 5–7, 2020, Proceedings, Part II 15 2020 (pp. 53-65). Springer International Publishing. link
Zhou M, Bai Y, Zhang W, Zhao T, Mei T. Look-into-object: Self-supervised structure modeling for object recognition. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2020 (pp. 11774-11783). link
Nawaz S, Calefati A, Caraffini M, Landro N, Gallo I. Are these birds similar: Learning branched networks for fine-grained representations. In2019 International Conference on Image and Vision Computing New Zealand (IVCNZ) 2019 Dec 2 (pp. 1-5). IEEE. link
Touvron H, Vedaldi A, Douze M, Jégou H. Fixing the train-test resolution discrepancy. Advances in neural information processing systems. 2019;32. link
Hu T, Qi H, Huang Q, Lu Y. See better before looking closer: Weakly supervised data augmentation network for fine-grained visual classification. arXiv preprint arXiv:1901.09891. 2019 Jan 26. link
Zheng H, Fu J, Zha ZJ, Luo J. Looking for the devil in the details: Learning trilinear attention sampling network for fine-grained image recognition. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2019 (pp. 5012-5021). link
Li P, Xie J, Wang Q, Gao Z. Towards faster training of global covariance pooling networks by iterative matrix square root normalization. InProceedings of the IEEE conference on computer vision and pattern recognition 2018 (pp. 947-955). link
Wang Y, Morariu VI, Davis LS. Learning a discriminative filter bank within a CNN for fine-grained recognition. InProceedings of the IEEE conference on computer vision and pattern recognition 2018 (pp. 4148-4157). link
Dubey A, Gupta O, Guo P, Raskar R, Farrell R, Naik N. Pairwise confusion for fine-grained visual classification. InProceedings of the European conference on computer vision (ECCV) 2018 (pp. 70-86). link
Zheng H, Fu J, Mei T, Luo J. Learning multi-attention convolutional neural network for fine-grained image recognition. InProceedings of the IEEE international conference on computer vision 2017 (pp. 5209-5217). link
Zheng Z, Zheng L, Yang Y. Unlabeled samples generated by gan improve the person re-identification baseline in vitro. InProceedings of the IEEE international conference on computer vision 2017 (pp. 3754-3762). link
Zhang N, Farrell R, Iandola F, Darrell T. Deformable part descriptors for fine-grained recognition and attribute prediction. InProceedings of the IEEE International Conference on Computer Vision 2013 (pp. 729-736). link
Feng M, Lu H, Yu Y. Residual learning for salient object detection. IEEE Transactions on Image Processing. 2020 Feb 28;29:4696-708. link
Famulari G, Duclos M, Enger SA. A novel 169Yb‐based dynamic‐shield intensity modulated brachytherapy delivery system for prostate cancer. Medical physics. 2020 Mar;47(3):859-68. link
Emami M, Sahraee-Ardakan M, Pandit P, Rangan S, Fletcher A. Generalization error of generalized linear models in high dimensions. InInternational conference on machine learning 2020 Nov 21 (pp. 2892-2901). PMLR. link
Dutordoir V, Wilk M, Artemev A, Hensman J. Bayesian image classification with deep convolutional Gaussian processes. InInternational Conference on Artificial Intelligence and Statistics 2020 Jun 3 (pp. 1529-1539). PMLR. link
Duanmu Z, Liu W, Li Z, Wang Z. Modeling generalized rate-distortion functions. IEEE Transactions on Image Processing. 2020 Jun 23;29:7331-44. link
Du Y, Xu J, Xiong H, Qiu Q, Zhen X, Snoek CG, Shao L. Learning to learn with variational information bottleneck for domain generalization. InComputer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part X 16 2020 (pp. 200-216). Springer International Publishing. link
Dombrowski AK, Anders CJ, Müller KR, Kessel P. Towards robust explanations for deep neural networks. Pattern Recognition. 2022 Jan 1;121:108194. link
Díaz-Vico D, Dorronsoro JR. Deep least squares Fisher discriminant analysis. IEEE transactions on neural networks and learning systems. 2019 Apr 11;31(8):2752-63. link
Diakonikolas I, Goel S, Karmalkar S, Klivans AR, Soltanolkotabi M. Approximation schemes for relu regression. InConference on learning theory 2020 Jul 15 (pp. 1452-1485). PMLR. link
De Klerk E, Glineur F, Taylor AB. Worst-case convergence analysis of inexact gradient and Newton methods through semidefinite programming performance estimation. SIAM Journal on Optimization. 2020;30(3):2053-82. link
Dagan Y, Feldman V. PAC learning with stable and private predictions. InConference on Learning Theory 2020 Jul 15 (pp. 1389-1410). PMLR. link
Condat L, Malinovsky G, Richtárik P. Distributed proximal splitting algorithms with rates and acceleration. Frontiers in Signal Processing. 2022 Jan 25;1:776825. link
Chizat L, Bach F. Implicit bias of gradient descent for wide two-layer neural networks trained with the logistic loss. InConference on learning theory 2020 Jul 15 (pp. 1305-1338). PMLR. link
Cheng W, Wang Y, Li H, Duan Y. Learned full-sampling reconstruction from incomplete data. IEEE Transactions on Computational Imaging. 2020 May 25;6:945-57. link
Charoenphakdee N, Vongkulbhisal J, Chairatanakul N, Sugiyama M. On focal loss for class-posterior probability estimation: A theoretical perspective. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2021 (pp. 5202-5211). link
Cao T, Law M, Fidler S. A theoretical analysis of the number of shots in few-shot learning. arXiv preprint arXiv:1909.11722. 2019 Sep 25. link
Bhattacharya S, Liu Z, Maiti T. Variational bayes neural network: Posterior consistency, classification accuracy and computational challenges. arXiv preprint arXiv:2011.09592. 2020 Nov 19. link
Bertsimas D, Li ML. Fast exact matrix completion: A unified optimization framework for matrix completion. Journal of Machine Learning Research. 2020;21(231):1-43. link
Bergou EH, Gorbunov E, Richtarik P. Stochastic three points method for unconstrained smooth minimization. SIAM Journal on Optimization. 2020;30(4):2726-49. link
Bartlett PL, Long PM, Lugosi G, Tsigler A. Benign overfitting in linear regression. Proceedings of the National Academy of Sciences. 2020 Dec 1;117(48):30063-70. link
Barnes LP, Han Y, Ozgur A. Lower bounds for learning distributions under communication constraints via fisher information. Journal of Machine Learning Research. 2020;21(236):1-30. link
Bai J, Song Q, Cheng G. Nearly optimal variational inference for high dimensional regression with shrinkage priors. arXiv preprint arXiv:2010.12887. 2020 Oct 24. link
Bahmani S, Romberg J. Convex programming for estimation in nonlinear recurrent models. Journal of Machine Learning Research. 2020;21(235):1-20. link
Baher HL, Lemaire V, Trinquart R. On the intrinsic robustness to noise of some leading classifiers and symmetric loss function--an empirical evaluation. arXiv preprint arXiv:2010.13570. 2020 Oct 22. link
Axelrod B, Garg S, Sharan V, Valiant G. Sample amplification: Increasing dataset size even when learning is impossible. InInternational Conference on Machine Learning 2020 Nov 21 (pp. 442-451). PMLR. link
Arjovsky M, Bottou L, Gulrajani I, Lopez-Paz D. Invariant risk minimization. arXiv preprint arXiv:1907.02893. 2019 Jul 5. link
Ardizzone L, Mackowiak R, Rother C, Köthe U. Training normalizing flows with the information bottleneck for competitive generative classification. Advances in Neural Information Processing Systems. 2020;33:7828-40. link
Amjad RA, Geiger BC. Learning representations for neural network-based classification using the information bottleneck principle. IEEE transactions on pattern analysis and machine intelligence. 2019 Apr 2;42(9):2225-39. link
Adams Q, Hopfensperger KM, Kim Y, Wu X, Flynn RT. 169Yb‐based rotating shield brachytherapy for prostate cancer. Medical physics. 2020 Dec;47(12):6430-9. link
Yu B, Zhang J, Zhu Z. On the learning dynamics of two-layer nonlinear convolutional neural networks. arXiv preprint arXiv:1905.10157. 2019 May 24. link
Yang J, Sun S, Roy DM. Fast-rate PAC-Bayes generalization bounds via shifted Rademacher processes. Advances in Neural Information Processing Systems. 2019;32. link
Wu S, Dimakis A, Sanghavi S, Yu F, Holtmann-Rice D, Storcheus D, Rostamizadeh A, Kumar S. Learning a compressed sensing measurement matrix via gradient unrolling. InInternational Conference on Machine Learning 2019 May 24 (pp. 6828-6839). PMLR. link
Wen B, Ravishankar S, Pfister L, Bresler Y. Transform learning for magnetic resonance image reconstruction: From model-based learning to building neural networks. IEEE Signal Processing Magazine. 2020 Jan 17;37(1):41-53. link
Thesing L, Hansen AC. Non-uniform recovery guarantees for binary measurements and infinite-dimensional compressed sensing. Journal of Fourier Analysis and Applications. 2021 Apr;27(2):14. link
Shu R, Bui H, Whang J, Ermon S. Training variational autoencoders with buffered stochastic variational inference. InThe 22nd International Conference on Artificial Intelligence and Statistics 2019 Apr 11 (pp. 2134-2143). PMLR. link
Sher Y. Review of algorithms for compressive sensing of images. arXiv preprint arXiv:1908.01642. 2019 Aug 5. link
Rudolph M, Wandt B, Rosenhahn B. Structuring autoencoders. InProceedings of the IEEE/CVF International Conference on Computer Vision Workshops 2019 (pp. 0-0). link
Montanari A, Venkataramanan R. Estimation of low-rank matrices via approximate message passing. The Annals of Statistics. 2021;49(1):321-45. link
Montanari A, Ruan F, Sohn Y, Yan J. The generalization error of max‐margin linear classifiers: High‐dimensional asymptotics in the overparametrized regime. arXiv preprint arXiv:1911.01544. 2019. link
Monga V, Li Y, Eldar YC. Algorithm unrolling: Interpretable, efficient deep learning for signal and image processing. IEEE Signal Processing Magazine. 2021 Feb 25;38(2):18-44. link
Marcu A, Prügel-Bennett A. Rethinking Generalisation. arXiv preprint arXiv:1911.04301. 2019 Nov 11. link
Ma Y, Rush C, Baron D. Analysis of approximate message passing with non-separable denoisers and Markov random field priors. IEEE Transactions on Information Theory. 2019 Aug 9;65(11):7367-89. link
Lin S, Zhang J. Generalization bounds for convolutional neural networks. arXiv preprint arXiv:1910.01487. 2019 Oct 3. link
Li Y, Wei C, Ma T. Towards explaining the regularization effect of initial large learning rate in training neural networks. Advances in neural information processing systems. 2019;32. link
AK M, TS M. Neural Networks-based Regularization of Large-Scale Inverse Problems in Medical Imaging. link
Kempfert KC, Wang Y, Chen C, Wong SW. A comparison study on nonlinear dimension reduction methods with kernel variations: Visualization, optimization and classification. Intelligent Data Analysis. 2020 Mar;24(2):267-90. link
Duan Z, Min MR, Li LE, Cai M, Xu Y, Ni B. Disentangled deep autoencoding regularization for robust image classification. arXiv preprint arXiv:1902.11134. 2019 Feb 27. link
Li J, Xiong C, Hoi SC. Learning from noisy data with robust representation learning. InProceedings of the IEEE/CVF international conference on computer vision 2021 (pp. 9485-9494). link
Deshmukh AA, Lei Y, Sharma S, Dogan U, Cutler JW, Scott C. A generalization error bound for multi-class domain generalization. arXiv preprint arXiv:1905.10392. 2019 May 24. link
Deng Z, Kammoun A, Thrampoulidis C. A model of double descent for high-dimensional binary linear classification. Information and Inference: A Journal of the IMA. 2022 Jun;11(2):435-95. link
Cheng H, Lian D, Gao S, Geng Y. Utilizing information bottleneck to evaluate the capability of deep neural networks for image classification. Entropy. 2019 May 1;21(5):456. link
Chen C, Li O, Tao D, Barnett A, Rudin C, Su JK. This looks like that: deep learning for interpretable image recognition. Advances in neural information processing systems. 2019;32. link
Chau G, Wohlberg B, Rodriguez P. Efficient Projection onto the ℓ∞,1 Mixed-Norm Ball Using a Newton Root Search Method. SIAM Journal on Imaging Sciences. 2019;12(1):604-23. link
Celentano M, Montanari A. Fundamental barriers to high-dimensional regression with convex penalties. The Annals of Statistics. 2022 Feb;50(1):170-96. link
Castera C, Bolte J, Févotte C, Pauwels E. An inertial newton algorithm for deep learning. Journal of Machine Learning Research. 2021;22(134):1-31. link
Cao Y, Gu Q. Tight sample complexity of learning one-hidden-layer convolutional neural networks. Advances in Neural Information Processing Systems. 2019;32. link
Bolya D, Zhou C, Xiao F, Lee YJ. Yolact: Real-time instance segmentation. InProceedings of the IEEE/CVF international conference on computer vision 2019 (pp. 9157-9166). link
Arpit D, Bengio Y. The benefits of over-parameterization at initialization in deep ReLU networks. arXiv preprint arXiv:1901.03611. 2019 Jan 11. link
YOLO series
Lei M, Li S, Wu Y, Hu H, Zhou Y, Zheng X, Ding G, Du S, Wu Z, Gao Y. YOLOv13: Real-Time Object Detection with Hypergraph-Enhanced Adaptive Visual Perception. arXiv preprint arXiv:2506.17733. 2025 Jun 21. link
Tian Y, Ye Q, Doermann D. Yolov12: Attention-centric real-time object detectors. arXiv preprint arXiv:2502.12524. 2025 Feb 18. link
Khanam R, Hussain M. Yolov11: An overview of the key architectural enhancements. arXiv preprint arXiv:2410.17725. 2024 Oct 23. link
Wang A, Chen H, Liu L, Chen K, Lin Z, Han J. Yolov10: Real-time end-to-end object detection. Advances in Neural Information Processing Systems. 2024 Dec 16;37:107984-8011. link
Wang CY, Yeh IH, Mark Liao HY. Yolov9: Learning what you want to learn using programmable gradient information. InEuropean conference on computer vision 2024 Sep 29 (pp. 1-21). Cham: Springer Nature Switzerland. link
Wang CY, Bochkovskiy A, Liao HY. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 7464-7475). link
Li C, Li L, Jiang H, Weng K, Geng Y, Li L, Ke Z, Li Q, Cheng M, Nie W, Li Y. YOLOv6: A single-stage object detection framework for industrial applications. arXiv preprint arXiv:2209.02976. 2022 Sep 7. link
Khanam R, Hussain M. What is YOLOv5: A deep look into the internal features of the popular object detector. arXiv preprint arXiv:2407.20892. 2024 Jul 30. link
Bochkovskiy A, Wang CY, Liao HY. Yolov4: Optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934. 2020 Apr 23. link
Redmon J, Farhadi A. Yolov3: An incremental improvement. arXiv preprint arXiv:1804.02767. 2018 Apr 8. link
Redmon J, Farhadi A. YOLO9000: better, faster, stronger. InProceedings of the IEEE conference on computer vision and pattern recognition 2017 (pp. 7263-7271). link
Redmon J, Divvala S, Girshick R, Farhadi A. You only look once: Unified, real-time object detection. InProceedings of the IEEE conference on computer vision and pattern recognition 2016 (pp. 779-788). link
VQ-VAE series
Razavi A, Van den Oord A, Vinyals O. Generating diverse high-fidelity images with vq-vae-2. Advances in neural information processing systems. 2019;32. link
Van Den Oord A, Vinyals O. Neural discrete representation learning. Advances in neural information processing systems. 2017;30. link
InternVideo series
Wang Y, Li K, Li X, Yu J, He Y, Chen G, Pei B, Zheng R, Wang Z, Shi Y, Jiang T. Internvideo2: Scaling foundation models for multimodal video understanding. InEuropean Conference on Computer Vision 2024 Sep 29 (pp. 396-416). Cham: Springer Nature Switzerland. link
Wang Y, Li K, Li Y, He Y, Huang B, Zhao Z, Zhang H, Xu J, Liu Y, Wang Z, Xing S. Internvideo: General video foundation models via generative and discriminative learning. arXiv preprint arXiv:2212.03191. 2022 Dec 6. link
InternVL series
Zhu J, Wang W, Chen Z, Liu Z, Ye S, Gu L, Tian H, Duan Y, Su W, Shao J, Gao Z. Internvl3: Exploring advanced training and test-time recipes for open-source multimodal models. arXiv preprint arXiv:2504.10479. 2025 Apr 14. link
Chen Z, Wang W, Cao Y, Liu Y, Gao Z, Cui E, Zhu J, Ye S, Tian H, Liu Z, Gu L. Expanding performance boundaries of open-source multimodal models with model, data, and test-time scaling. arXiv preprint arXiv:2412.05271. 2024 Dec 6. link
Chen Z, Wu J, Wang W, Su W, Chen G, Xing S, Zhong M, Zhang Q, Zhu X, Lu L, Li B. Internvl: Scaling up vision foundation models and aligning for generic visual-linguistic tasks. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2024 (pp. 24185-24198). link
V-JEPA series
Assran M, Bardes A, Fan D, Garrido Q, Howes R, Muckley M, Rizvi A, Roberts C, Sinha K, Zholus A, Arnaud S. V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning. arXiv preprint arXiv:2506.09985. 2025 Jun 11. link
Bardes A, Garrido Q, Ponce J, Chen X, Rabbat M, LeCun Y, Assran M, Ballas N. Revisiting feature prediction for learning visual representations from video. arXiv preprint arXiv:2404.08471. 2024 Feb 15. link
Uniperceiver series
Li H, Zhu J, Jiang X, Zhu X, Li H, Yuan C, Wang X, Qiao Y, Wang X, Wang W, Dai J. Uni-perceiver v2: A generalist model for large-scale vision and vision-language tasks. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 2691-2700). link
Zhu X, Zhu J, Li H, Wu X, Li H, Wang X, Dai J. Uni-perceiver: Pre-training unified architecture for generic perception for zero-shot and few-shot tasks. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2022 (pp. 16804-16815). link
Uniformer series
Li K, Wang Y, He Y, Li Y, Wang Y, Wang L, Qiao Y. Uniformerv2: Unlocking the potential of image vits for video understanding. InProceedings of the IEEE/CVF International Conference on Computer Vision 2023 (pp. 1632-1643). link
Li K, Wang Y, Gao P, Song G, Liu Y, Li H, Qiao Y. Uniformer: Unified transformer for efficient spatiotemporal representation learning. arXiv preprint arXiv:2201.04676. 2022 Jan 12. link
Unified IO series
Lu J, Clark C, Lee S, Zhang Z, Khosla S, Marten R, Hoiem D, Kembhavi A. Unified-io 2: Scaling autoregressive multimodal models with vision language audio and action. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024 (pp. 26439-26455). link
Lu J, Clark C, Zellers R, Mottaghi R, Kembhavi A. Unified-io: A unified model for vision, language, and multi-modal tasks. arXiv preprint arXiv:2206.08916. 2022 Jun 17. link
Unbiased teacher series
Liu YC, Ma CY, Kira Z. Unbiased teacher v2: Semi-supervised object detection for anchor-free and anchor-based detectors. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2022 (pp. 9819-9828). link
Liu YC, Ma CY, He Z, Kuo CW, Chen K, Zhang P, Wu B, Kira Z, Vajda P. Unbiased teacher for semi-supervised object detection. arXiv preprint arXiv:2102.09480. 2021 Feb 18. link
Swin Transformer series
Liu Z, Hu H, Lin Y, Yao Z, Xie Z, Wei Y, Ning J, Cao Y, Zhang Z, Dong L, Wei F. Swin transformer v2: Scaling up capacity and resolution. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 12009-12019). link
Liu Z, Lin Y, Cao Y, Hu H, Wei Y, Zhang Z, Lin S, Guo B. Swin transformer: Hierarchical vision transformer using shifted windows. InProceedings of the IEEE/CVF international conference on computer vision 2021 (pp. 10012-10022). link
ShuffleNet series
Ma N, Zhang X, Zheng HT, Sun J. Shufflenet v2: Practical guidelines for efficient cnn architecture design. InProceedings of the European conference on computer vision (ECCV) 2018 (pp. 116-131). link
Zhang X, Zhou X, Lin M, Sun J. Shufflenet: An extremely efficient convolutional neural network for mobile devices. InProceedings of the IEEE conference on computer vision and pattern recognition 2018 (pp. 6848-6856). link
SAM series
Ravi N, Gabeur V, Hu YT, Hu R, Ryali C, Ma T, Khedr H, Rädle R, Rolland C, Gustafson L, Mintun E. Sam 2: Segment anything in images and videos. arXiv preprint arXiv:2408.00714. 2024 Aug 1. link
Kirillov A, Mintun E, Ravi N, Mao H, Rolland C, Gustafson L, Xiao T, Whitehead S, Berg AC, Lo WY, Dollár P. Segment anything. InProceedings of the IEEE/CVF international conference on computer vision 2023 (pp. 4015-4026). link
RepPoints series
Chen Y, Zhang Z, Cao Y, Wang L, Lin S, Hu H. Reppoints v2: Verification meets regression for object detection. Advances in Neural Information Processing Systems. 2020;33:5621-31. link
Yang Z, Liu S, Hu H, Wang L, Lin S. Reppoints: Point set representation for object detection. InProceedings of the IEEE/CVF international conference on computer vision 2019 (pp. 9657-9666). link
R-CNN series
He K, Gkioxari G, Dollár P, Girshick R. Mask r-cnn. InProceedings of the IEEE international conference on computer vision 2017 (pp. 2961-2969). link
Ren S, He K, Girshick R, Sun J. Faster r-cnn: Towards real-time object detection with region proposal networks. Advances in neural information processing systems. 2015;28. link
Girshick R. Fast r-cnn. InProceedings of the IEEE international conference on computer vision 2015 (pp. 1440-1448). link
Girshick R, Donahue J, Darrell T, Malik J. Rich feature hierarchies for accurate object detection and semantic segmentation. InProceedings of the IEEE conference on computer vision and pattern recognition 2014 (pp. 580-587). link
PVT series
Wang W, Xie E, Li X, Fan DP, Song K, Liang D, Lu T, Luo P, Shao L. Pvt v2: Improved baselines with pyramid vision transformer. Computational visual media. 2022 Sep;8(3):415-24. link
Wang W, Xie E, Li X, Fan DP, Song K, Liang D, Lu T, Luo P, Shao L. Pyramid vision transformer: A versatile backbone for dense prediction without convolutions. InProceedings of the IEEE/CVF international conference on computer vision 2021 (pp. 568-578). link
PaLM series
Anil R, Dai AM, Firat O, Johnson M, Lepikhin D, Passos A, Shakeri S, Taropa E, Bailey P, Chen Z, Chu E. Palm 2 technical report. arXiv preprint arXiv:2305.10403. 2023 May 17. link
Chowdhery A, Narang S, Devlin J, Bosma M, Mishra G, Roberts A, Barham P, Chung HW, Sutton C, Gehrmann S, Schuh P. Palm: Scaling language modeling with pathways. Journal of Machine Learning Research. 2023;24(240):1-113.
Multiscale vision transformer series
Li Y, Wu CY, Fan H, Mangalam K, Xiong B, Malik J, Feichtenhofer C. Mvitv2: Improved multiscale vision transformers for classification and detection. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 4804-4814). link
Fan H, Xiong B, Mangalam K, Li Y, Yan Z, Malik J, Feichtenhofer C. Multiscale vision transformers. InProceedings of the IEEE/CVF international conference on computer vision 2021 (pp. 6824-6835). link
MOTR series
Yu E, Wang T, Li Z, Zhang Y, Zhang X, Tao W. Motrv3: Release-fetch supervision for end-to-end multi-object tracking. arXiv preprint arXiv:2305.14298. 2023 May 23. link
Zhang Y, Wang T, Zhang X. Motrv2: Bootstrapping end-to-end multi-object tracking by pretrained object detectors. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 22056-22065). link
Zeng F, Dong B, Zhang Y, Wang T, Zhang X, Wei Y. Motr: End-to-end multiple-object tracking with transformer. InEuropean conference on computer vision 2022 Oct 23 (pp. 659-675). Cham: Springer Nature Switzerland. link
MoCo series
Chen X, Xie S, He K. An empirical study of training self-supervised vision transformers. InProceedings of the IEEE/CVF international conference on computer vision 2021 (pp. 9640-9649). link
Chen X, Fan H, Girshick R, He K. Improved baselines with momentum contrastive learning. arXiv preprint arXiv:2003.04297. 2020 Mar 9. link
He K, Fan H, Wu Y, Xie S, Girshick R. Momentum contrast for unsupervised visual representation learning. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2020 (pp. 9729-9738). link
MobileNet series
Qin D, Leichner C, Delakis M, Fornoni M, Luo S, Yang F, Wang W, Banbury C, Ye C, Akin B, Aggarwal V. MobileNetV4: universal models for the mobile ecosystem. InEuropean Conference on Computer Vision 2024 Sep 29 (pp. 78-96). Cham: Springer Nature Switzerland. link
Howard A, Sandler M, Chu G, Chen LC, Chen B, Tan M, Wang W, Zhu Y, Pang R, Vasudevan V, Le QV. Searching for mobilenetv3. InProceedings of the IEEE/CVF international conference on computer vision 2019 (pp. 1314-1324). link
Sandler M, Howard A, Zhu M, Zhmoginov A, Chen LC. Mobilenetv2: Inverted residuals and linear bottlenecks. InProceedings of the IEEE conference on computer vision and pattern recognition 2018 (pp. 4510-4520). link
Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861. 2017 Apr 17. link
Llama series
The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation.
Grattafiori A, Dubey A, Jauhri A, Pandey A, Kadian A, Al-Dahle A, Letman A, Mathur A, Schelten A, Vaughan A, Yang A. The llama 3 herd of models. arXiv preprint arXiv:2407.21783. 2024 Jul 31. link
Touvron H, Martin L, Stone K, Albert P, Almahairi A, Babaei Y, Bashlykov N, Batra S, Bhargava P, Bhosale S, Bikel D. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288. 2023 Jul 18. link
Touvron H, Lavril T, Izacard G, Martinet X, Lachaux MA, Lacroix T, Rozière B, Goyal N, Hambro E, Azhar F, Rodriguez A. Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971. 2023 Feb 27. link
InceptionNet series
Szegedy C, Ioffe S, Vanhoucke V, Alemi A. Inception-v4, inception-resnet and the impact of residual connections on learning. InProceedings of the AAAI conference on artificial intelligence 2017 Feb 12 (Vol. 31, No. 1). link
Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z. Rethinking the inception architecture for computer vision. InProceedings of the IEEE conference on computer vision and pattern recognition 2016 (pp. 2818-2826). link
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A. Going deeper with convolutions. InProceedings of the IEEE conference on computer vision and pattern recognition 2015 (pp. 1-9). link
GPT series
Achiam J, Adler S, Agarwal S, Ahmad L, Akkaya I, Aleman FL, Almeida D, Altenschmidt J, Altman S, Anadkat S, Avila R. Gpt-4 technical report. arXiv preprint arXiv:2303.08774. 2023 Mar 15. link
Ouyang L, Wu J, Jiang X, Almeida D, Wainwright C, Mishkin P, Zhang C, Agarwal S, Slama K, Ray A, Schulman J. Training language models to follow instructions with human feedback. Advances in neural information processing systems. 2022 Dec 6;35:27730-44. link
Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A, Agarwal S. Language models are few-shot learners. Advances in neural information processing systems. 2020;33:1877-901. link
Radford A, Narasimhan K, Salimans T, Sutskever I. Improving language understanding by generative pre-training. link
Generalized focal loss series
Li X, Wang W, Hu X, Li J, Tang J, Yang J. Generalized focal loss v2: Learning reliable localization quality estimation for dense object detection. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2021 (pp. 11632-11641). link
Li X, Wang W, Wu L, Chen S, Hu X, Li J, Tang J, Yang J. Generalized focal loss: Learning qualified and distributed bounding boxes for dense object detection. Advances in neural information processing systems. 2020;33:21002-12. link
ExpansionNet series
Hu JC. Expansionnet: Exploring the sequence length bottleneck in the transformer for image captioning. CoRR. 2022 Jan 1. link
Hu JC, Cavicchioli R, Capotondi A. Exploiting multiple sequence lengths in fast end to end training for image captioning. In2023 IEEE International Conference on Big Data (BigData) 2023 Dec 15 (pp. 2173-2182). IEEE. link
EfficientNet series
Tan M, Le Q. Efficientnetv2: Smaller models and faster training. InInternational conference on machine learning 2021 Jul 1 (pp. 10096-10106). PMLR. link
Tan M, Le Q. Efficientnet: Rethinking model scaling for convolutional neural networks. InInternational conference on machine learning 2019 May 24 (pp. 6105-6114). PMLR. link
Florence series
Xiao B, Wu H, Xu W, Dai X, Hu H, Lu Y, Zeng M, Liu C, Yuan L. Florence-2: Advancing a unified representation for a variety of vision tasks. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024 (pp. 4818-4829). link
Yuan L, Chen D, Chen YL, Codella N, Dai X, Gao J, Hu H, Huang X, Li B, Li C, Liu C. Florence: A new foundation model for computer vision. arXiv preprint arXiv:2111.11432. 2021 Nov 22. link
Detclip series
Yao L, Pi R, Han J, Liang X, Xu H, Zhang W, Li Z, Xu D. Detclipv3: Towards versatile generative open-vocabulary object detection. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024 (pp. 27391-27401). link
Yao L, Han J, Liang X, Xu D, Zhang W, Li Z, Xu H. Detclipv2: Scalable open-vocabulary object detection pre-training via word-region alignment. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 23497-23506). link
Yao L, Han J, Wen Y, Liang X, Xu D, Zhang W, Li Z, Xu C, Xu H. Detclip: Dictionary-enriched visual-concept paralleled pre-training for open-world detection. Advances in Neural Information Processing Systems. 2022 Dec 6;35:9125-38. link
Deformable convolutional network series
Xiong Y, Li Z, Chen Y, Wang F, Zhu X, Luo J, Wang W, Lu T, Li H, Qiao Y, Lu L. Efficient deformable convnets: Rethinking dynamic and sparse operator for vision applications. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024 (pp. 5652-5661). link
Li H, Zhang Y, Zhang Y, Li H, Sang L. Dcnv3: Towards next generation deep cross network for ctr prediction. arXiv e-prints. 2024 Jul:arXiv-2407. link
Zhu X, Hu H, Lin S, Dai J. Deformable convnets v2: More deformable, better results. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2019 (pp. 9308-9316). link
Dai J, Qi H, Xiong Y, Li Y, Zhang G, Hu H, Wei Y. Deformable convolutional networks. InProceedings of the IEEE international conference on computer vision 2017 (pp. 764-773). link
Deepseek series
Liu A, Feng B, Xue B, Wang B, Wu B, Lu C, Zhao C, Deng C, Zhang C, Ruan C, Dai D. Deepseek-v3 technical report. arXiv preprint arXiv:2412.19437. 2024 Dec 27. link
Liu A, Feng B, Wang B, Wang B, Liu B, Zhao C, Dengr C, Ruan C, Dai D, Guo D, Yang D. Deepseek-v2: A strong, economical, and efficient mixture-of-experts language model. arXiv preprint arXiv:2405.04434. 2024 May 7. link
Xin H, Ren ZZ, Song J, Shao Z, Zhao W, Wang H, Liu B, Zhang L, Lu X, Du Q, Gao W. Deepseek-prover-v1.5: Harnessing proof assistant feedback for reinforcement learning and monte-carlo tree search. arXiv preprint arXiv:2408.08152. 2024 Aug 15. link
Bi X, Chen D, Chen G, Chen S, Dai D, Deng C, Ding H, Dong K, Du Q, Fu Z, Gao H. Deepseek llm: Scaling open-source language models with longtermism. arXiv preprint arXiv:2401.02954. 2024 Jan 5. link
DALL·E series
What's new with DALL·E 3?
Ramesh A, Dhariwal P, Nichol A, Chu C, Chen M. Hierarchical text-conditional image generation with clip latents. arXiv preprint arXiv:2204.06125. 2022 Apr 13;1(2):3. link
Ramesh A, Pavlov M, Goh G, Gray S, Voss C, Radford A, Chen M, Sutskever I. Zero-shot text-to-image generation. InInternational conference on machine learning 2021 Jul 1 (pp. 8821-8831). PMLR. link
ConvNext series
Woo S, Debnath S, Hu R, Chen X, Liu Z, Kweon IS, Xie S. Convnext v2: Co-designing and scaling convnets with masked autoencoders. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 16133-16142). link
Liu Z, Mao H, Wu CY, Feichtenhofer C, Darrell T, Xie S. A convnet for the 2020s. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 11976-11986). link
BLIP series
Xue L, Shu M, Awadalla A, Wang J, Yan A, Purushwalkam S, Zhou H, Prabhu V, Dai Y, Ryoo MS, Kendre S. xgen-mm (blip-3): A family of open large multimodal models. arXiv preprint arXiv:2408.08872. 2024 Aug 16. link
Li J, Li D, Savarese S, Hoi S. Blip-2: Bootstrapping language-image pre-training with frozen image encoders and large language models. InInternational conference on machine learning 2023 Jul 3 (pp. 19730-19742). PMLR. link
Li J, Li D, Xiong C, Hoi S. Blip: Bootstrapping language-image pre-training for unified vision-language understanding and generation. InInternational conference on machine learning 2022 Jun 28 (pp. 12888-12900). PMLR. link
BEIT series
Wang W, Bao H, Dong L, Bjorck J, Peng Z, Liu Q, Aggarwal K, Mohammed OK, Singhal S, Som S, Wei F. Image as a foreign language: Beit pretraining for vision and vision-language tasks. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 19175-19186). link
Peng Z, Dong L, Bao H, Ye Q, Wei F. Beit v2: Masked image modeling with vector-quantized visual tokenizers. arXiv preprint arXiv:2208.06366. 2022 Aug 12. link
Bao H, Dong L, Piao S, Wei F. Beit: Bert pre-training of image transformers. arXiv preprint arXiv:2106.08254. 2021 Jun 15. link