KAIST · EPFL · Microsoft · SAP · NYU
Discrete diffusion models offer a promising alternative to autoregressive generation through parallel decoding, but they suffer from a sampling wall: once categorical sampling occurs, rich distributional information collapses into one-hot vectors and cannot be propagated across steps, forcing subsequent steps to operate with limited information. To mitigate this problem, we introduce Loopholing, a novel and simple mechanism that preserves this information via a deterministic latent pathway, leading to Loopholing Discrete Diffusion Models (LDDMs). Trained efficiently with a self-conditioning strategy, LDDMs achieve substantial gains, reducing generative perplexity by up to 61% over prior baselines, closing the gap with autoregressive models (and in some cases surpassing them), and producing more coherent text. LDDMs also improve performance on reasoning benchmarks such as Countdown and Game of 24. These results further indicate that loopholing mitigates idle steps and oscillations, providing a scalable path toward high-quality non-autoregressive text generation.
Loopholing is a simple mechanism that preserves pre-sampling distributional context through a deterministic latent pathway—bypassing the sampling wall—to stabilize discrete diffusion and reduce idle steps and oscillations.
At each denoising step, a discrete diffusion model predicts the clean sequence by generating a rich categorical distribution over the vocabulary for each token. This distribution captures nuanced information about plausible token candidates and their relative likelihoods. However, the subsequent sampling process collapses this rich distribution into a one-hot vector, a phenomenon we term the "sampling wall".
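To make the sampling wall concrete, here is a minimal PyTorch sketch (illustrative only, not the paper's code) for a single token position: the denoiser's logits define a full categorical distribution, but only a one-hot sample survives the sampling step.

```python
import torch
import torch.nn.functional as F

vocab_size = 8
logits = torch.randn(vocab_size)                       # denoiser output for one token
probs = F.softmax(logits, dim=-1)                      # full categorical distribution

token_id = torch.multinomial(probs, num_samples=1)     # categorical sampling
one_hot = F.one_hot(token_id.squeeze(-1), vocab_size)  # all that the next step receives

# `probs` ranks every candidate token, but only `one_hot` is carried forward;
# the rest of the distribution is discarded at the sampling wall.
print(probs)
print(one_hot)
```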
Motivated by this observation, we introduce Loopholing, which preserves this distributional context by passing a deterministic latent summary across steps. This ensures the information is not lost at the sampling step, allowing later updates to reference both the sample and its surrounding probability mass. With this deterministic contextual information, generation becomes more stable, reducing oscillations and idle steps and yielding more coherent outputs.
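The sketch below shows what a loopholing denoising step could look like, assuming a transformer denoiser that returns both logits and its final hidden states. All module and argument names (`denoiser`, `context_proj`, `cond`) are illustrative stand-ins rather than the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LoopholingStepSketch(nn.Module):
    """Illustrative denoising step that carries a deterministic latent
    context across steps in addition to the sampled tokens."""

    def __init__(self, denoiser: nn.Module, hidden_dim: int):
        super().__init__()
        self.denoiser = denoiser                       # assumed to return (logits, hidden)
        self.context_proj = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, x_t, t, context):
        # Condition the denoiser on the previous step's latent context
        # (here simply projected and injected as an extra conditioning input).
        logits, hidden = self.denoiser(x_t, t, cond=self.context_proj(context))

        # Categorical sampling still happens ...
        probs = F.softmax(logits, dim=-1)
        x_next = torch.distributions.Categorical(probs=probs).sample()

        # ... but the pre-sampling hidden states are passed along
        # deterministically, bypassing the sampling wall.
        next_context = hidden
        return x_next, next_context
```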
Generation with Loopholing requires propagating the latent embedding across steps, which introduces a recurrent dependency. A key advantage of standard diffusion models, however, is that their training process avoids this time-consuming temporal unrolling by operating on randomly sampled time steps. Maintaining this efficiency within Loopholing's training process is therefore a significant challenge.
To address this, we introduce a self-conditioning approach that avoids unrolling the full generation path during training. The core idea is to simulate context propagation using two forward passes: the model first computes a pseudo-context and then uses it in a second, context-conditioned pass to make the final prediction. Crucially, gradients flow only through the second forward pass. This allows the model to learn how to consume its own representations as context without the prohibitive cost of backpropagating through time.
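The two-pass scheme can be sketched as follows; the function name, the `context` argument, and the assumption that the model returns `(logits, latent)` are hypothetical. The key point is that the first pass runs without gradient tracking and the loss is computed only on the second, context-conditioned pass.

```python
import torch

def loopholing_training_step(model, x_t, t, x_0, loss_fn, optimizer):
    """One illustrative training step with self-conditioning.

    `model(x_t, t, context=...)` is assumed to return (logits, latent);
    names and signatures are hypothetical, not the paper's actual code.
    """
    # Pass 1: compute a pseudo-context without tracking gradients,
    # so there is no backpropagation through the generation path.
    with torch.no_grad():
        _, pseudo_context = model(x_t, t, context=None)

    # Pass 2: condition on the pseudo-context; gradients flow only here.
    logits, _ = model(x_t, t, context=pseudo_context)
    loss = loss_fn(logits, x_0)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```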
We empirically validate the effectiveness of the proposed loopholing mechanism across a variety of models and tasks. Our experiments demonstrate that, by accumulating contextual information, our method achieves superior perplexity and generation quality in language modeling, alongside higher success rates on reasoning tasks.
To demonstrate the effectiveness of our method in language modeling, we integrate loopholing into the Masked Diffusion Language Model (MDLM) and the Uniform Diffusion Language Model (UDLM), creating LDDM-M and LDDM-U, respectively.
Table 1 reports the likelihood evaluation: LDDMs outperform their respective baselines when trained on the OpenWebText (OWT) and One Billion Word (LM1B) datasets.
Table 2 presents the zero-shot likelihood evaluation. When trained on the OWT dataset, our loopholing-enhanced model, LDDM-M, consistently outperforms the baseline MDLM on all evaluated unseen datasets except for LM1B.
The loopholing mechanism is designed to address the sampling wall issue, a property that is difficult to verify solely through likelihood evaluation. To directly assess its impact on generation quality, we therefore employ two metrics. First, we measure the perplexity of unconditionally generated samples using a pretrained GPT-2 Large model (Gen PPL). Second, we utilize GPT-4.1 to score the consistency and naturalness of the samples on a 0-to-10 scale, following the G-eval framework.
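As an illustration of the first metric, a Gen PPL computation could look like the following sketch using Hugging Face Transformers; batching, maximum length, and other details of the paper's evaluation protocol are assumptions here.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2-large")
model = GPT2LMHeadModel.from_pretrained("gpt2-large").eval()

@torch.no_grad()
def generative_perplexity(texts):
    """Perplexity of a pretrained GPT-2 Large on generated samples."""
    nll, n_tokens = 0.0, 0
    for text in texts:
        ids = tokenizer(text, return_tensors="pt").input_ids
        # Cross-entropy of GPT-2 on the sample (labels are shifted internally).
        out = model(ids, labels=ids)
        nll += out.loss.item() * (ids.shape[1] - 1)
        n_tokens += ids.shape[1] - 1
    return float(torch.exp(torch.tensor(nll / n_tokens)))
```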
As shown in the figure above, LDDMs significantly outperform baseline discrete diffusion models. For instance, at 1024 sampling steps, LDDM-M achieves a Gen PPL of 49.13, more than halving MDLM's 108.94. Similarly, LDDM-U (28.76) shows a roughly 2.5x improvement over UDLM (73.95). Notably, LDDM-U surpasses the strong autoregressive baseline after 512 steps.
To evaluate the effectiveness of loopholing on reasoning tasks, we integrate it into the Multi-Granularity Diffusion Model (MGDM), a masked diffusion framework designed for reasoning, resulting in the model we refer to as LDDM-G.
As presented in Table 3, LDDM-G demonstrates substantial performance gains over the MGDM baseline across all evaluated tasks and model scales. For instance, with the 85M parameter model, LDDM-G achieves a 16% improvement on Game of 24 and an almost 8% gain on Countdown 4.
In this work, we identified the sampling wall as a key limitation of discrete diffusion models, where rich distributional information collapses into one-hot representations, leading to inefficiencies such as idle steps and excessive oscillation. To overcome this, we proposed the loopholing mechanism and developed Loopholing Discrete Diffusion Models (LDDMs), which preserve and propagate distributional context across denoising steps through a deterministic latent pathway. Extensive experiments demonstrated that LDDMs improve fluency, naturalness, and semantic consistency in text generation and reasoning tasks, significantly narrowing the performance gap with autoregressive models. These results highlight loopholing as a general mechanism for enhancing discrete diffusion, with promising future directions including multimodal extensions, theoretical analysis, and integration with broader non-autoregressive frameworks.
@misc{jo2025loopholingdiscretediffusiondeterministic,
title={Loopholing Discrete Diffusion: Deterministic Bypass of the Sampling Wall},
author={Mingyu Jo and Jaesik Yoon and Justin Deschenaux and Caglar Gulcehre and Sungjin Ahn},
year={2025},
eprint={2510.19304},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2510.19304},
}