For a quick overview of RAMP Gold, please see our tutorial slides from ISPASS 2010. Besides the basic setup, they briefly describe the functional/timing interfaces as well as several simple use cases for extending the simulator, e.g. adding instructions.

Hardware Requirements

  • Xilinx XUPV5-LX110T FPGA board
  • DDR2-800 dual-rank 2GB unbuffered SODIMM, e.g. Crucial CT25664AC800
  • A fansink is strongly encouraged, since the FPGA can otherwise run above 100 °C

Software Requirements

    Synplify Pro D-2010.03 or above. E-2011.03 and E-2011.03-SP1 are strongly discouraged, due to known regressions in synthesizing LUTRAM. (E-2010.09-SP3 is recommended)
    Mentor Graphics ModelSim 6.6 or above (10.0b is recommended)
    Xilinx ISE 11.5i or above
    64-bit Linux (for appserver)

    Source Code Directory Structure

    Hardware tarball:
          |-----designs         top-level RTL modules for synthesis, simulation, etc.
          |-----lib             SystemVerilog source code
          |      |----cpu       most of the RAMP Gold functional models, e.g. SPARC v8 pipelines
          |      |----stdlib    helper libraries
          |      |----tech      FPGA-dependent modules, e.g. register files, host caches and memory controllers
          |      |----tm        RAMP Gold timing models
          |-----project_files   ModelSim, Synplify, ISE project files


    • [IMPORTANT] Test the board with the diagnostics on the ACE CF card first! We have seen boards with broken DDR2 and broken Ethernet interfaces. XUP boards are cheap, but the build quality is not good.
    • Replace the included 256 MB DIMM with the 2GB one mentioned above.
    • Connect the XUP board to a PC over a Gigabit Ethernet connection (a 100Mbps network will not work)
    • Program the FPGA with the corresponding bit file.

    Guide to build the RAMP Gold Hardware

    Logic synthesis with Synopsys Synplify

    • Open the appropriate project file in Hardware/Processor/sparcmt/project_files/synthesis/synplify/...
    • Hit Run

    Using ModelSim/Questasim

    • ModelSim/Questasim version 10.0 or above is recommended. A compiled Xilinx unisim library is required.
    • Build the SPARC v8 disassembler DPI module in 'lib/sim/disasm'. GNU binutils is required. See /lib/sim/disasm/disasm.c for instructions on building binutils.
    • Build the appserver SystemVerilog frontend DPI module in 'lib/sim/mac_fedriver' by typing 'make' in the same directory.
    • Make sure all path names in 'sparcmt/project_files/simulation/test_1P_bee3mem_neweth/test_1P_bee3mem_neweth.mpf' (or some other project) are correct. ModelSim tends to use absolute path names for everything.
    • Run vsim to start ModelSim with the .mpf project file.
    • Run 'vlib work' on the ModelSim command line to create a work library
    • Right click on Project window and select Compile All
    • Double click on Simulation file in project window to start the simulation
    • On the simulation command prompt, type 'run 10ms' (or however long you want to simulate)
    • Start the appserver with the 'vs' option. The following example loads the ROS kernel in the simulation.
               $ sparc_app -p2 vs -s -fappserver_ros.conf obj/kern/kernel none
    The functional simulator talks to the Verilog simulation over a Unix domain socket/named pipe, so the socket name must be specified correctly in the .conf file.
    Note that Verilog simulation is extremely slow when simulating complex software. Execution may never really start because of the long binary loading time. A simplified proxykernel with a smaller memory footprint, such as kernel.ramp.tiny, is recommended. In our SVN, we also provide a tool called ptgen to load target binaries directly into a magic memory (search for MAGIC_MEM in the top-level module), so the simulation can start right away with "sparc_app -p[#cores] vs none none". In addition, the maximum simulated memory in Verilog simulation is always limited by your host machine configuration (although in theory you can still configure the Micron simulation module to simulate the full 2GB memory). By default, the proxykernel is compiled to run with 2GB memory, so the address layout needs to be changed in Verilog simulations with <2GB target memory.
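
    Checking the path names in the .mpf project file by hand is tedious, so the check can be partly automated. The following is a minimal sketch, not part of the RAMP Gold distribution. It assumes that the project file lists its sources on lines of the form "Project_File_<n> = <path>"; the function name missing_project_files is hypothetical.

```python
import os
import re

# Assumed .mpf layout: each source file appears as "Project_File_<n> = <path>".
_FILE_ENTRY = re.compile(r"^Project_File_(\d+)\s*=\s*(.+?)\s*$")


def missing_project_files(mpf_text, exists=os.path.exists):
    """Return the source paths listed in the .mpf text that do not exist."""
    missing = []
    for line in mpf_text.splitlines():
        match = _FILE_ENTRY.match(line.strip())
        if match is None:
            continue  # not a source entry (e.g. property lines, other keys)
        # Expand environment variables such as $HOME if the path uses them.
        path = os.path.expandvars(match.group(2))
        if not exists(path):
            missing.append(path)
    return missing
```

    To use it, read the .mpf file into a string and pass it to missing_project_files; any path it returns should be corrected in the project file before starting vsim.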

    Using Xilinx ISE

    • Open ISE with some project file in Hardware/Processor/sparcmt/project_files/synthesis/ise/...
    • Assign the appropriate P&R constraints to their associated modules (details below) and do a simple floorplan to improve the PAR QoR.
    • Go to the Design tab, in the Processes window, expand the User Constraints item, and double click on Floorplan Area/IO/Logic
    • Select the inhibit_false.ucf constraint file
    • PlanAhead will launch
    • Floorplan the functional model (optional if the design meets timing)
      • In the Netlist window, right click on gen_cpu and select Assign...
      • Assign to pblock_fm
      • Expand gen_cpu, and right click on gen_tm and select Unassign (all the other subsets of gen_cpu should still be assigned)
    • Floorplan the memory controller (very important for reliability)
      • Assign block 'gen_bee3mem' to area-group 'pblock_gen_bee3mem' (shown as a Pblock in the PlanAhead GUI)
      • Expand gen_bee3mem, expand ddr, and right click on genblk*.ddrbankx[*].ddr and select Assign...
      • Assign to the matching ddrbank*
      • Repeat for all 8 ddr IO banks
    • Save floorplan
    • Back in ISE, double click on Implement Design in the Processes window
      • Older ISE versions might require the environment variable "XIL_PAR_ENABLE_CHKCIBIPINS" to be set to 1 to work around a Xilinx tool issue.
    • Handle multicycle path constraints defined in inhibit_false.ucf  (caution)
      • Some versions of Synplify are very aggressive when sharing registers between modules, but some of the shared registers are explicitly used in MCPs defined in 'inhibit_false.ucf', so PAR may fail because of missing registers. Unfortunately, we have to put MCPs in the UCF because of several known bugs in Synplify. This may not be necessary in the future.
      • To work around this problem, check the PAR log carefully and try adding "syn_preserve=1" to the definitions of those registers in the corresponding source files. For example, PAR may report a failure on the following line in the UCF file:
        • INST "gen_cpu/gen_irqmp/delr_xc.tid*" TNM = "TNM_IRQTID";

        You can modify the source file as follows:
          (* syn_maxfan = 16,  syn_preserve = 1*) struct {
        After the change, resynthesize the design before running PAR again.

    Guide to build the RAMP Gold Software Infrastructure

    Build the cross compiler
    • $ cd tools/compilers/gcc-glibc/
    • Create a Makelocal file based on Makelocal.template. It should look like:
            RAMP_INSTDIR := /scratch/hcook/ros-gcc-glibc/install-sparc-ros-gcc
            X86_INSTDIR := /scratch/hcook/ros-gcc-glibc/install-i386-ros-gcc
            ROSDIR := /scratch/hcook/ros
    • $ make ramp
    • Add the install-sparc-ros-gcc/bin directory to your PATH

    Build the appserver and the proxy kernel:
    • Go to Software/build
    • Create Makelocal by copying Makelocal.template. You must have su access on the machine.
    • Make sure you can build the appserver and the functional simulator
    • Add Software/build/bin to your PATH
    • Check the build/bin/appserver*conf* files to make sure the paths, etc. are correct


    The software package downloaded from our website includes some README files and more examples.

    Build and Run Akaros kernel



    Stability Issues of the Memory Controller

    Although there are many design features in the controller to improve reliability, the memory controller is still the most fragile component. Before running any simulation, we suggest passing both appserver memtest and kernel memtest.

    Appserver memtest (test DDR2 analog interface as well as the frontend link)
    sparc_app -p64 hw memtest none

    Kernel memtest (stress test, can be built in Application/memtest)
    sparc_app -p64 hw kernel.ramp memtest
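
    The two memtests above can be chained so that the kernel stress test only runs once the appserver memtest has passed. The following is a minimal sketch, not part of the RAMP Gold tools. The command lines are copied from above; the function name run_memtests is hypothetical, and the sketch assumes that sparc_app exits with a nonzero status when a memtest fails, which should be verified before relying on it.

```python
import subprocess

# Commands exactly as listed above; only the ordering logic is new.
MEMTESTS = [
    ("appserver memtest", ["sparc_app", "-p64", "hw", "memtest", "none"]),
    ("kernel memtest", ["sparc_app", "-p64", "hw", "kernel.ramp", "memtest"]),
]


def run_memtests(run=subprocess.call):
    """Run both memtests in order; return the first failing test's name, or None.

    Assumption: sparc_app returns a nonzero exit status on a memtest failure.
    If it does not, inspect its output instead of the exit status.
    """
    for name, argv in MEMTESTS:
        if run(argv) != 0:
            return name  # stop at the first failure
    return None
```

    Calling run_memtests() with no arguments launches the real commands, so sparc_app must be on your PATH and the board must be programmed.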

    By default, the controller runs at 225 MHz. If you keep seeing memory errors, try lowering the DDR2 memory controller frequency by adjusting the DRAM clock PLL settings in lib/. However, lowering the clock frequency is rarely necessary in practice.

    parameter DRAM_PERIOD   = 10.0;      // DDR2 PLL input clock period (ns)
    parameter DRAM_CLKMUL   = 9.0;       // Setting this to 8.0 lowers the clock to 200 MHz
    parameter DRAM_CLKDIV   = 4.0;
    parameter DRAM_PLLDIV   = 1.0;
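
    As a sanity check before changing these settings, the resulting clock can be estimated with a few lines of Python. The formula below is an assumption inferred from the numbers quoted above (a 10 ns input period with a multiplier of 9 and a divider of 4 gives 225 MHz, and a multiplier of 8 gives 200 MHz). The function name is hypothetical, and how DRAM_PLLDIV actually enters the PLL configuration should be checked against the RTL.

```python
def dram_clock_mhz(period_ns=10.0, clkmul=9.0, clkdiv=4.0, plldiv=1.0):
    """Estimate the DDR2 controller clock in MHz from the PLL parameters.

    Assumed relation (inferred from the quoted frequencies, not from the RTL):
        f_out = f_in * DRAM_CLKMUL / (DRAM_PLLDIV * DRAM_CLKDIV)
    where f_in = 1000 / DRAM_PERIOD, with the period in ns and f_in in MHz.
    """
    f_in_mhz = 1000.0 / period_ns
    return f_in_mhz * clkmul / (plldiv * clkdiv)


print(dram_clock_mhz())            # default settings
print(dram_clock_mhz(clkmul=8.0))  # lowered multiplier
```

    Under this assumption the default settings give 225 MHz and a multiplier of 8.0 gives 200 MHz, matching the values stated above.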