Application-specific instruction-set processor (ASIP) design is part of my job and a hobby as well. A programmable processor can be built around different instruction set architectures, for example the popular reduced instruction set computer (RISC) and complex instruction set computer (CISC). For baseband signal processing, very long instruction word (VLIW) and transport triggered architecture (TTA) are more popular.
I design TTA processors. More about the architecture can be found here:
http://en.wikipedia.org/wiki/Transport_triggered_architecture
The tool I use to design the processor is called the TTA-based Co-design Environment (TCE), and it's open source! Most processor design tools are very expensive, so this is a big advantage. TCE can be downloaded from:
http://tce.cs.tut.fi/downloads/INSTALL
The tool is very simple to use and very well documented:
http://tce.cs.tut.fi/user_manual/TCE/
The screenshot below shows the processor designer tool of TCE, where different function units and register files can be added. In this example the processor is named minimal1.adf and has a load-store unit (LSU), an arithmetic logic unit (ALU) and other function units. More function units can be added drag-and-drop style from the edit section.
The next picture shows the cycle-accurate simulator. We can run C code or assembly code on the processor and check how many clock cycles it takes. In the figure below we can see buses 3-5 and lines 62-76 of the assembly code. I didn't write the assembly code. I wrote the C code, and TCE generated the assembly itself with the help of its retargetable compiler.
TCE also lets us follow the dataflow through the different function units in each clock cycle. The figure below shows a data transport through the ALU.
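To give an idea of the kind of C code one might hand to the compiler, here is a minimal sketch of a fixed-point dot product, a typical building block in baseband processing. This is not the code behind the screenshots; the function name dot_product and its parameters are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example kernel (not from the original post):
   dot product of two length-n vectors of 16-bit samples,
   accumulated in 32 bits. Overflow of the accumulator is not
   handled; this sketch assumes small inputs. */
int32_t dot_product(const int16_t *a, const int16_t *b, int n)
{
    int32_t acc = 0;
    int i;

    assert(n >= 0); /* precondition: length must not be negative */

    for (i = 0; i < n; i++) {
        /* widen before multiplying so the product is computed in 32 bits */
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return acc;
}
```

A kernel like this would be called from a main function of your own. The multiply-accumulate loop is the part whose cycle count is interesting to compare across different processor configurations.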
More screenshots can be found here:
http://tce.cs.tut.fi/screenshots.html
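These are the commands I use for the whole flow: compiling the C code, opening the simulator, creating the binary encoding map, generating the processor and generating the memory images.
tcecc -a new_minimal.adf -o new_LMMSE.tpef new_LMMSE.c
proxim minimal_with_stdout.adf test.tpef &
createbem minimal_with_stdout2.adf
generateprocessor -e toplevel -i minimal_with_stdout2.idf -o proge_output3 minimal_with_stdout2.adf
generatebits -e toplevel -b minimal_with_stdout2.bem -d -w 4 -p test2.tpef -x proge_output3 -f 'vhdl' -o 'vhdl' minimal_with_stdout2.adf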