The LED block cipher also performs well in software, where speed is measured in cycles per byte. On this page we list the best table-based software implementations of LED (benchmarks done on an Intel(R) Core(TM) i7 CPU Q 720 clocked at 1.60GHz). The best speeds recorded so far are 12 and 18 cycles per byte for LED-64 and LED-128 respectively, achieved by a bit-sliced implementation that processes 32 blocks in parallel on a Core i3-2367M @ 1.4GHz. More implementation results and source code are available here.
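To put the cycles-per-byte figures in more familiar terms, the following minimal sketch converts them into cycles per block and an approximate throughput. The function names are hypothetical and are not taken from the LED source code. The conversion assumes the CPU runs at its nominal clock, which ignores frequency scaling, so the throughput is only a rough estimate.

```python
# Illustrative helpers (hypothetical names, not part of any LED code base).

LED_BLOCK_BYTES = 8  # LED uses a 64-bit block for both LED-64 and LED-128


def cycles_per_block(cycles_per_byte, block_bytes=LED_BLOCK_BYTES):
    """Cycles spent per encrypted block, given a cycles-per-byte figure."""
    return cycles_per_byte * block_bytes


def throughput_mb_per_s(cycles_per_byte, clock_hz):
    """Approximate throughput in MB/s (1 MB = 10**6 bytes).

    Assumes the core runs at exactly clock_hz for the whole measurement.
    """
    return clock_hz / cycles_per_byte / 1e6


# Figures quoted above for the bit-sliced implementation on a 1.4GHz core.
for name, cpb in (("LED-64", 12), ("LED-128", 18)):
    print(name,
          cycles_per_block(cpb), "cycles/block,",
          round(throughput_mb_per_s(cpb, 1.4e9), 1), "MB/s")
```

Under these assumptions, 12 cycles per byte corresponds to 96 cycles per 64-bit block and roughly 117 MB/s at 1.4GHz. Because the bit-sliced code processes 32 blocks at once, the per-block figure is an amortized cost rather than the latency of a single encryption.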
The following are the implementation figures for a 4-bit microcontroller (Epson S1C63003, 1.5V model), with and without masking.