In our work, we use constrained beam search to probe whether an LLM has memorized web content from a domain that disallows access by the LLM's bot. Below are the overall statistics of "memorized" instances across different LLMs.
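As a rough illustration of how a generated continuation might be flagged as "memorized", the sketch below compares it with the original subsequent sentence and applies a cutoff. The token-level `difflib` ratio, the helper names `token_similarity` and `is_memorized`, and the `0.8` threshold are all assumptions made for this example. They are not necessarily the similarity measure or cutoff used in this work.

```python
from difflib import SequenceMatcher


def token_similarity(generated: str, original: str) -> float:
    """Return a similarity ratio in [0, 1] over lowercased whitespace tokens."""
    gen_tokens = generated.lower().split()
    orig_tokens = original.lower().split()
    # ratio() is 2 * (matched tokens) / (total tokens in both sequences).
    return SequenceMatcher(None, gen_tokens, orig_tokens).ratio()


def is_memorized(generated: str, original: str, threshold: float = 0.8) -> bool:
    """Flag a continuation as "memorized" when similarity reaches the threshold.

    The default threshold is a hypothetical value chosen for illustration.
    """
    return token_similarity(generated, original) >= threshold


if __name__ == "__main__":
    original_next = "the quick brown fox jumps over the lazy dog"
    generated_next = "The quick brown fox jumps over the lazy dog"
    print(is_memorized(generated_next, original_next))
```

A token-level comparison is used here so that small differences in casing or spacing do not affect the score. A stricter or looser metric could be substituted without changing the overall flow.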
The following table lists sentences that the LLM generated from the probing sentences it was given and that closely match the original subsequent sentences.