DE0-Nano | NiosII LCD driver
Here's how I hooked up my LCD display to the NiosII from the previous tut. Part of the reason I upgraded the memory to 32MB was to have enough headroom for frame buffers. The PSP screen is quite high-res (for a microcontroller) at 480x272 pixels with 24-bit color. To see what it can do I'd like to use the full color range, which requires 4 bytes per pixel (due to memory alignment, I'll get to that later) and 510KB for a full frame. Twice that for double-buffered drawing.
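The frame buffer arithmetic can be checked with a few lines of C. This is only a sketch: the constant and function names are made up for illustration and are not part of the project.

```c
#include <stdint.h>

/* Panel geometry from the text; the names are illustrative only. */
enum {
    SCREEN_WIDTH    = 480,
    SCREEN_HEIGHT   = 272,
    BYTES_PER_PIXEL = 4   /* 24-bit color padded to one 32-bit word */
};

/* Size of one frame buffer in bytes. */
static uint32_t frame_bytes(void)
{
    return (uint32_t)SCREEN_WIDTH * SCREEN_HEIGHT * BYTES_PER_PIXEL;
}

/* Size needed for double-buffered drawing. */
static uint32_t double_buffer_bytes(void)
{
    return 2u * frame_bytes();
}
```

One frame comes out at 522240 bytes (510KB), so two buffers take roughly 1MB of the 32MB SDRAM.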
I'll go "quickly" through the steps of
adding a Video Sync Generator core to the NiosII which will interface with the screen and generate all the necessary signals
adding a 9MHz video clock to the existing PLL component
a Scatter-Gather DMA Controller core that will feed the Sync Generator with data
a few bits and bobs to glue it all together
and then write a few lines of code to test the screen
Hardware
So the basis for this new extension is going to be the previously built NiosII with SDRAM project.
Copy the folder and rename it.
Then open the project from the new folder in QuartusII and fire up SOPC builder.
Scatter-Gather DMA Controller
Add the Scatter-Gather DMA Controller (Bridges and Adapters>DMA>Scatter-Gather DMA Controller).
change the Transfer mode to Memory To Stream.
leave everything else on defaults and click Finish.
rename it to sgdma.
you'll see a lot of complaints about missing connections.
connect the sgdma.descriptor_read to onchip_memory2.s1 by right-clicking the signal descriptor_read (listed under the sgdma component) and from the pop-up menu selecting sgdma.descriptor_read Connections>onchip_memory2.s1
do the same thing for sgdma.descriptor_write (sgdma.descriptor_write Connections>onchip_memory2.s1)
connect sgdma.m_read to sdram.s1
ignore the error about the missing out connection for now. We'll connect that in the next step.
if not already connected automatically, connect sgdma.csr to cpu.data_master
Avalon-ST Dual Clock FIFO
To cross the two clock domains (50MHz RAM, 9MHz display pixel clock) and to smooth out potential memory access collisions between DMA and CPU, we add a dual-clock FIFO buffer.
Add an Avalon-ST Dual Clock FIFO (Memories and Memory Controllers>On-Chip>Avalon-ST Dual Clock FIFO)
change the Symbols per beat to 4
set the FIFO depth to a value of your liking. I set mine to an overkill of 512. The larger the buffer, the longer a data traffic jam can last before the display runs dry, but this is on-chip memory, which is quite limited.
enable Use packets
click Finish
rename the component to dc_fifo
connect sgdma.out to dc_fifo.in
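To get a feel for what a depth of 512 buys, here is a rough back-of-the-envelope sketch. It assumes that one FIFO entry holds one beat of 4 symbols (one pixel) and that the display side drains one entry per pixel clock. Both are assumptions, and the function names are invented for this example.

```c
#include <stdint.h>

/* Bytes of storage behind the FIFO, assuming each entry holds
   one beat of symbols_per_beat bytes. */
static uint32_t fifo_storage_bytes(uint32_t depth, uint32_t symbols_per_beat)
{
    return depth * symbols_per_beat;
}

/* How long (in microseconds) a full FIFO can keep the display fed while
   the DMA is stalled, assuming one entry is drained per pixel clock. */
static double fifo_ride_through_us(uint32_t depth, double pixel_clock_hz)
{
    return (double)depth / pixel_clock_hz * 1e6;
}
```

Under these assumptions a depth of 512 costs about 2KB of on-chip memory and covers a stall of roughly 57 microseconds at 9MHz, which is a little more than one visible line of 480 pixels.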
Pixel Converter
Now the Video Sync Generator expects to be fed triplets of color bytes, one 24-bit symbol per pixel. The DMA controller, however, reads data in 4-byte chunks, which means there is one unnecessary byte in every 32-bit word.
The Pixel Converter core just snips off that byte.
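What the converter does in hardware can be modelled in software as follows. This is only an illustration of the idea, not the core's implementation, and it assumes the padding byte is the last one of each 4-byte group, as the name BGR0 suggests.

```c
#include <stddef.h>
#include <stdint.h>

/* Software model of a BGR0 -> BGR conversion: copy three bytes of
   every 4-byte group and skip the fourth.
   Assumption: the padding byte is the last one in each group.
   Returns the number of bytes written to dst (3 * pixel_count). */
static size_t strip_padding(const uint8_t *src, size_t pixel_count, uint8_t *dst)
{
    size_t i;
    for (i = 0; i < pixel_count; ++i) {
        dst[3 * i + 0] = src[4 * i + 0];
        dst[3 * i + 1] = src[4 * i + 1];
        dst[3 * i + 2] = src[4 * i + 2];
        /* src[4 * i + 3] is the unused byte and is dropped */
    }
    return 3 * pixel_count;
}
```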
add a Pixel Converter component (Peripherals>Display>Pixel Converter (BGR0->BGR))
change the Source symbols per beat to 1
click Finish
and rename it to pixel_converter
connect dc_fifo.out to pixel_converter.in
Video Sync Generator
And finally we get to the Sync Generator which reads the data stream and presents the color signals to the display while also generating various sync signals.
add a Video Sync Generator (Peripherals>Display>Video Sync Generator)
change the Data Stream Bit Width to 24
Beats per Pixel to 1
Number of Columns to 480
Number of Rows to 272
Horizontal Blank Pixels to 43
Horizontal Front Porch Pixels to 2
Horizontal Sync Pulse Pixels to 41
leave the Horizontal Sync Pulse Polarity at 0
Vertical Blank Lines to 12
Vertical Front Porch Lines to 2
Vertical Sync Pulse Lines to 10
Vertical Sync Pulse Polarity to 0
Total Horizontal Scan Pixels to 525
and Total Vertical Scan Lines to 286
(Most of these values can be found in Sharp's datasheet for the screen)
click Finish
rename it to video_sync_generator
connect pixel_converter.out to video_sync_generator.in
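A quick sanity check of the timing values entered above: the visible area, the blank period and the front porch should add up to the total scan length. Judging by the numbers, the blank period seems to include the sync pulse (43 = 41 + 2 horizontally, 12 = 10 + 2 vertically), and the sketch below assumes that. The struct and field names are illustrative only.

```c
/* Video timing as entered in the wizard; field names are illustrative. */
typedef struct {
    int active;       /* visible pixels or lines */
    int blank;        /* blank period, assumed to include the sync pulse */
    int front_porch;  /* front porch pixels or lines */
    int sync_pulse;   /* sync pulse width */
    int total;        /* total scan pixels or lines */
} ScanTiming;

static const ScanTiming horizontal = { 480, 43, 2, 41, 525 };
static const ScanTiming vertical   = { 272, 12, 2, 10, 286 };

/* Returns 1 when the parts add up to the total and the sync pulse
   fits inside the blank period, 0 otherwise. */
static int timing_is_consistent(const ScanTiming *t)
{
    return t->active + t->blank + t->front_porch == t->total
        && t->sync_pulse <= t->blank;
}
```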
Adding a video clock
The video sync generator and its inputs require a pixel clock, which governs the refresh rate of the screen. Including front porches and blank periods we need to tick off 525*286 pixels, 60 times a second (50Hz works too, but lower rates will start to flicker). This results in a pixel clock of a little over 9MHz. We have a lot of unused clocks left over in our PLL module so let's use one for this.
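The pixel clock calculation works in both directions, as this small sketch shows. The function names are made up for this example.

```c
/* Pixel clock in Hz needed for a given refresh rate. */
static long pixel_clock_hz(long total_cols, long total_rows, long refresh_hz)
{
    return total_cols * total_rows * refresh_hz;
}

/* Refresh rate in Hz that a given pixel clock produces. */
static double refresh_rate_hz(double clock_hz, long total_cols, long total_rows)
{
    return clock_hz / (double)(total_cols * total_rows);
}
```

For 525*286 pixels at 60Hz this gives 9009000Hz. Running the PLL output at exactly 9MHz instead yields a refresh rate of about 59.94Hz, which is close enough.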
select the shift_clock module and click Edit to open the wizard again
click Next on the next few screens until you get to tab 3) Output Clocks
leave clock c0 as it is, it's still driving the SDRAM. click next to get to clock c1
tick the checkbox Use this clock
select the Enter output frequency radio button and set the frequency to 9 MHz
leave everything else on Default and click Finish (twice)
Up top, back in SOPC builder in the Clock Settings box our new 9MHz clock signal should now appear as shift_clk_c1.
now we have to tell the pixel-clock-components to use this clock instead of the default clk_50
tell dc_fifo.out to use shift_clk_c1 by clicking on the Clock column next to it that still says clk_50.
A selection list will pop up from which you can select the new clock source shift_clk_c1.
set pixel_converter.in also to shift_clk_c1
set video_sync_generator.in to shift_clk_c1
Wrapping up
Before we leave the SOPC builder and generate the new core, let's switch the cpu back to using onchip_memory for its program memory. Compared to on-chip memory the SDRAM is actually quite slow. On top of having to service the display continuously we are also going to constantly hammer it with draw calls. It's not a bad idea to run the program from somewhere else (if possible).
select the cpu and click Edit to open the wizard
set the Reset Vector back to onchip_memory2 with Offset 0x0
and set the Exception Vector to onchip_memory2 with Offset 0x20
click Finish
Our new core should now look like this:
click Generate to build the core and exit SOPC builder. Save the file when requested.
back in QuartusII, recompile the design
Updating the symbol and pin assignments
Now the following pin assignments obviously depend on how you connected the display to your DE0-Nano board. As an example I'll list the connections I used for my prototype board.
double click the mynios2.bdf file in the project navigator to open it in the schematic editor
to prevent a double creation of pins select all pins and wires around the symbol and delete them
right click the Nios symbol and select Update Symbol or Block (select any of the options presented)
the new signals from the Video Sync Generator will appear in the symbol
right click the symbol again and select Generate Pins for Symbol Ports
Save the file and recompile
Next, just as before, assign the pins of the Video Sync Generator by either using the Pin Planner or by editing the mynios2.qsf file (here are my assignments). Note that we are salvaging 2 pins from the LED port for the DISP signal and the backlight. Make sure to disconnect PIN_R8 from out_port_from_the_pio_led[7] and PIN_L3 from out_port_from_the_pio_led[6].
( DE0-Nano pinout )
recompile the design
and upload it to the board using the Programmer
Software
Hello
find your project folder on disk and delete the contents of the software sub folder as well as the .metadata folder
then fire up NiosII Software Build Tools for Eclipse and switch the workspace to your project folder. The workspace should open up completely empty
select File > New > NiosII Application and BSP from Template
select your DE0_NANO_SOPC.sopcinfo file as target hardware and call the project display_test
click Finish
The hello_world project (display_test) and the BSP project are created for you
right click the BSP project and select NiosII > Bsp Editor...
in the Bsp Editor under the Main tab in Settings/Hal tick the box enable_reduced_device_drivers
also tick enable_small_c_library
go to the Linker Script tab and under Linker Section Mappings switch all sections (.bss, .heap, etc.) to onchip_memory2
click Generate and Exit to close the Editor
select Project > Build All
right click the display_test project and select Run As > NiosII Hardware
the demo message should appear in the Nios Console
Hello from Nios II!
DMA
Now that we know that the core is functional we can initialize the DMA controller to send some data to the display.
We need to build a data structure in memory that tells the DMA controller what memory we want copied where. This structure is called a Descriptor, and each Descriptor can describe a transfer of just under 64KB at a time (its length field is 16 bits wide). That means we need to chain a bunch of them together to transfer the entire frame buffer (480*272*4 = 522240 bytes).
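How the frame buffer splits into descriptors can be worked out with a small helper. This is a sketch with invented names. It assumes a chunk size of 0xfffc, the largest multiple of 4 that fits into a 16-bit length, so that no pixel is split across two descriptors.

```c
#include <stdint.h>

#define MAX_DESC_BYTES 0xfffcu  /* largest multiple of 4 that fits in 16 bits */

/* Number of descriptors needed to cover total_bytes. */
static uint32_t descriptor_count(uint32_t total_bytes)
{
    return (total_bytes + MAX_DESC_BYTES - 1) / MAX_DESC_BYTES;
}

/* Length in bytes of descriptor number index (0-based) in the chain.
   index must be smaller than descriptor_count(total_bytes). */
static uint16_t descriptor_length(uint32_t total_bytes, uint32_t index)
{
    uint32_t offset = index * MAX_DESC_BYTES;
    uint32_t remaining = total_bytes - offset;
    return (uint16_t)(remaining > MAX_DESC_BYTES ? MAX_DESC_BYTES : remaining);
}
```

For 522240 bytes this gives 8 descriptors: 7 of 65532 (0xfffc) bytes and a last one of 63516 (0xf81c) bytes.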
Here's a shortened version of the code with comments:
// this subroutine initializes a chain of descriptors
//
void init_framebuffer(alt_sgdma_dev *dma)
{
    // 480*272 pixels * 4 bytes = 522240 bytes
    // 65532 (0xfffc) bytes * 7 = 458724
    // +63516 (0xf81c) bytes

    // frame buffer A
    alt_u8* buff = (alt_u8*)frameBufferA;
    int i;
    for(i = 0; i < 8; ++i) {
        alt_u16 size = (i<7) ? 0xfffc : 0xf81c;
        alt_avalon_sgdma_construct_mem_to_stream_desc(
            &dmaDescA[i],
            (i<7) ? (&dmaDescA[i+1]) : &dmaDescEND,
            (alt_u32*)buff,
            size,
            0,      // read from incrementing addresses
            i==0,   // start of packet on the first descriptor
            i==7,   // end of packet on the last descriptor
            0);
        buff += size;
    }
}

int main()
{
    // initialize the DMA and get a device handle
    //
    alt_sgdma_dev *dma = alt_avalon_sgdma_open("/dev/sgdma");

    // now we just continuously copy frameBufferA to the Video Sync Generator
    //
    while(1) {
        // init the descriptors
        //
        init_framebuffer(dma);

        // transfer
        //
        alt_avalon_sgdma_do_sync_transfer(dma, dmaDescA);
    }
}
The complete code is here: hello_world.c1
This will effectively display the contents of the (uninitialized) memory on screen. Usually colorful garbage. Almost there!
Interrupt Service Routine
Now, our cpu is not going to be very useful if we keep it busy transferring data to the display. That's what the DMA controller is for and it can do this independently from the cpu. In fact it can be set up to run infinitely in a loop without any further cpu intervention. However, it can be very useful to be notified when a frame transfer has finished. Sometimes it is necessary to synchronize a program to the refresh rate or to switch the currently displayed memory to a different frame buffer.
This can be done with an interrupt service routine that's registered with the DMA controller and is called every time a descriptor chain has completed its transfer.
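The buffer-switching idea can be sketched independently of the DMA driver. The bookkeeping below is hypothetical and not taken from the program. The drawing code asks for a swap, and the frame-complete handler performs it between two frames, so the display never shows a half-drawn buffer.

```c
#include <stdint.h>

/* Hypothetical double-buffer bookkeeping; none of these names
   come from the original program. */
typedef struct {
    uint32_t *front;               /* buffer currently sent to the display */
    uint32_t *back;                /* buffer the program draws into */
    volatile int swap_requested;   /* set by the drawing code */
    volatile unsigned frames_done; /* counts completed frame transfers */
} FrameState;

/* Called by the drawing code when the back buffer is ready to be shown. */
static void request_swap(FrameState *s)
{
    s->swap_requested = 1;
}

/* Meant to be called from the frame-complete callback.
   Returns the buffer the next transfer should read from. */
static uint32_t *on_frame_complete(FrameState *s)
{
    s->frames_done++;
    if (s->swap_requested) {
        uint32_t *tmp = s->front;
        s->front = s->back;
        s->back = tmp;
        s->swap_requested = 0;
    }
    return s->front;
}
```

In a real program each buffer would need its own descriptor chain, and the callback would start the transfer for whichever chain belongs to the returned buffer. A main loop could also wait for frames_done to change in order to synchronize to the refresh rate.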
Here's a shortened version of the program slightly modified for asynchronous DMA transfer, interrupt service routine and some drawing routines.
// The Interrupt Service Routine (actually a callback function called by the ISR)
//
void my_dma_callback(void *data)
{
    // set the OWNED_BY_HW bit in the descriptors again to reuse the chain
    int i;
    for(i = 0; i < 8; ++i)
        dmaDescA[i].control |= 1<<ALTERA_AVALON_SGDMA_DESCRIPTOR_CONTROL_OWNED_BY_HW_OFST;

    // trigger another transfer all over again
    alt_avalon_sgdma_do_async_transfer((alt_sgdma_dev*)data, dmaDescA);
}

// this subroutine initializes a chain of descriptors, registers the
// interrupt service routine and starts the first asynchronous transfer
//
void init_and_start_framebuffer(alt_sgdma_dev *dma)
{
    // 480*272 pixels * 4 bytes = 522240 bytes
    // 65532 (0xfffc) bytes * 7 = 458724
    // +63516 (0xf81c) bytes

    // frame buffer A
    alt_u8* buff = (alt_u8*)frameBufferA;
    int i;
    for(i = 0; i < 8; ++i) {
        alt_u16 size = (i<7) ? 0xfffc : 0xf81c;
        alt_avalon_sgdma_construct_mem_to_stream_desc(
            &dmaDescA[i],
            (i<7) ? (&dmaDescA[i+1]) : &dmaDescEND,
            (alt_u32*)buff,
            size,
            0,
            i==0,
            i==7,
            0);
        buff += size;
    }

    // register our interrupt routine
    alt_avalon_sgdma_register_callback(
        dma,
        my_dma_callback,
        ALTERA_AVALON_SGDMA_CONTROL_IE_CHAIN_COMPLETED_MSK
        | ALTERA_AVALON_SGDMA_CONTROL_IE_GLOBAL_MSK,
        (void*)dma);

    // initiate the transfer
    alt_avalon_sgdma_do_async_transfer(dma, dmaDescA);
}

// basic drawing routines --------------------------------------
//

// draw a pixel
inline void setPix(const int x, const int y, const Color col, alt_u32* buffer)
{
    buffer[x + y * 480] = col.color32;
}

// bresenham line drawing
void line(alt_u32* buffer, int x0, int y0, int x1, int y1, Color color) { ... }

// fill the screen with a fractal (quite slow)
void MandelBrot(alt_u32* buffer) { ... }

int main()
{
    // switch on the backlight
    //
    IOWR_ALTERA_AVALON_PIO_DATA(PIO_LED_BASE, 64);

    // initialize the DMA and get a device handle
    //
    alt_sgdma_dev *dma = alt_avalon_sgdma_open("/dev/sgdma");

    // assert the DISP signal
    //
    IOWR_ALTERA_AVALON_PIO_DATA(PIO_LED_BASE, 128+64);

    // start the DMA
    //
    init_and_start_framebuffer(dma);

    // now we actually have some time to draw something
    //
    Particle particles[2] = { {100, 0, -1, -1}, {0, 120, -1, -1} };
    Color col;
    int count;
    for(count=1;;count++) {
        // blink an led to see the program running (gonna be too fast to see)
        IOWR_ALTERA_AVALON_PIO_DATA(PIO_LED_BASE, (count & 0x0001)+128+64);

        // draw a line in a random color
        col.color8.r = rand()%256;
        col.color8.g = rand()%256;
        col.color8.b = rand()%256;
        line(frameBufferA, particles[0].x, particles[0].y,
             particles[1].x, particles[1].y, col);

        // bounce the particles around
        int i;
        for(i = 0; i < 2; ++i) {
            if(particles[i].x == 0 || particles[i].x == 479)
                particles[i].vx *= -1;
            particles[i].x += particles[i].vx;
            if(particles[i].y == 0 || particles[i].y == 271)
                particles[i].vy *= -1;
            particles[i].y += particles[i].vy;
        }
    }
    return 0;
}
The complete source code is here.
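The shortened listing leaves out the Color type and the body of the line routine. The following is one way they could look. It is a guess, not the code from the complete source. In particular, the channel order inside the union is only assumed from the BGR0 pixel format, and set_pix and draw_line are stand-in names for this sketch. The line routine is the usual integer Bresenham variant that handles all directions.

```c
#include <stdint.h>
#include <stdlib.h>

#define SCREEN_W 480
#define SCREEN_H 272

/* Assumed layout: one 32-bit word per pixel with byte-wise access to
   the channels. The channel order is a guess based on the BGR0 format. */
typedef union {
    uint32_t color32;
    struct { uint8_t b, g, r, unused; } color8;
} Color;

/* Write one pixel, ignoring coordinates outside the frame buffer. */
static void set_pix(uint32_t *buffer, int x, int y, Color col)
{
    if (x < 0 || x >= SCREEN_W || y < 0 || y >= SCREEN_H)
        return;
    buffer[x + y * SCREEN_W] = col.color32;
}

/* Integer Bresenham line from (x0,y0) to (x1,y1), both ends included. */
static void draw_line(uint32_t *buffer, int x0, int y0, int x1, int y1, Color col)
{
    int dx = abs(x1 - x0), sx = x0 < x1 ? 1 : -1;
    int dy = -abs(y1 - y0), sy = y0 < y1 ? 1 : -1;
    int err = dx + dy;

    for (;;) {
        set_pix(buffer, x0, y0, col);
        if (x0 == x1 && y0 == y1)
            break;
        int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }  /* step in x */
        if (e2 <= dx) { err += dx; y0 += sy; }  /* step in y */
    }
}
```

The bounds check in set_pix costs a few cycles per pixel but keeps a stray coordinate from overwriting memory next to the frame buffer.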
That's it for now. Happy drawing.
Related Links
Video Sync Generator:
Video Sync Generator and Pixel Converter Cores
Scatter-Gather DMA Controller (and most other IP cores)
Embedded Peripherals IP - User Guide