Skeleton-of-Thought

Xuefei Ning*, Zinan Lin*, Zixuan Zhou*, Zifu Wang, Huazhong Yang, Yu Wang

Let us accelerate the end-to-end generation of LLMs by 2x without any change to the model, system, or hardware!

[arXiv] [BibTeX] [Code] [OpenReview] [Poster]