Why use a symbol table?
Embedding a symbol table in ROM offers several advantages, all based on the fact that you can find functions and variables by name at run time. This is handy when the address of the function or the variable is not known at compile time.
This can happen if the name of the function is supplied by the user through an interface such as a console. Excellent code exists to implement this, however it is statically linked and, as you can see in this example, creates many dependencies on your code. With a symbol table, the program parses the serial port input looking for a function name. Once the whole function name has been received, the program looks up the address of the function in the symbol table and invokes it.
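The console flow described above could look roughly like the following. This is only a minimal sketch: console_feed, resolver_fn, NAME_BUF_SIZE and the demo_resolve/led_toggle stand-ins are hypothetical names made up for illustration, and the resolver is a placeholder for the real symbol table search in ROM.

```c
#include <stddef.h>
#include <string.h>

#define NAME_BUF_SIZE 32   /* assumed maximum name length, including '\0' */

typedef void (*command_fn)(void);

/* Hypothetical lookup hook: returns the function registered under `name`,
 * or NULL. On target this would search the symbol table in ROM. */
typedef command_fn (*resolver_fn)(const char *name);

static char   name_buf[NAME_BUF_SIZE];
static size_t name_len;
static int    name_overflow;

/* Feed one received character.
 * Returns 1 if a function was invoked, -1 if a complete line could not be
 * resolved (unknown or too long), and 0 while the name is still incomplete. */
int console_feed(char c, resolver_fn resolve)
{
    if (c != '\n' && c != '\r') {
        if (name_len < NAME_BUF_SIZE - 1)
            name_buf[name_len++] = c;
        else
            name_overflow = 1;          /* name does not fit, reject it later */
        return 0;
    }

    /* End of line: terminate the name and reset the state for the next one. */
    int overflow = name_overflow;
    size_t len = name_len;
    name_buf[len] = '\0';
    name_len = 0;
    name_overflow = 0;

    if (overflow)
        return -1;
    if (len == 0)
        return 0;                       /* empty line, nothing to do */

    command_fn fn = resolve(name_buf);
    if (fn == NULL)
        return -1;
    fn();
    return 1;
}

/* Demo stand-ins so the sketch can run on a host PC. */
static int led_toggle_count;
static void led_toggle(void) { led_toggle_count++; }

static command_fn demo_resolve(const char *name)
{
    return strcmp(name, "led_toggle") == 0 ? led_toggle : NULL;
}
```

On the microcontroller, each byte read from the UART would be passed to console_feed, with a resolver that converts the address found in the symbol table into a function pointer.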
Invoking a function by name is also handy when two independently compiled pieces of code are located in memory, for example a bootloader and a main program. To save ROM space, the main program may want to link against the bootloader to reuse its drivers and other functions. One way of doing this is to statically declare the addresses of these functions. The drawback is that these addresses have to be updated any time the bootloader is modified (although that is less likely to happen).
Another example is the ability to patch parts of a software suite. If you compile your code into separate modules, say module A, module B and module C, a symbol table allows the modules to be linked dynamically against each other, so a single module can be patched on its own. There is no need to reflash the whole microcontroller to update the program. Since this limits the amount of code that has to be reprogrammed, it is an interesting feature if you have a limited uplink to your microcontroller (say it's floating in space or roving on Mars, and there are only so many bytes per second you can reliably send there). It also allows some code to be patched while other critical code stays untouched.
The following example explains how to embed a symbol table in an LPC2000 microcontroller, using the Logomatic board.
Generating a symbol table
For this example I will use the WinARM toolchain and modify the stock Logomatic firmware.
The stock Logomatic makefile already generates the symbol table. This is done with the following rule:
# Create a symbol table from ELF output file.
%.sym: %.elf
	@echo $(MSG_SYMBOL_TABLE) $@
	$(NM) -n $< > $@
This will generate a *.sym file, which will contain information similar to this:
00000000 a CODE_BASE
00000000 a EXTMEM_MODE
00000000 a MAMCR_OFS
00000000 a PLLCON_OFS
00000000 a RAM_MODE
00000000 a REMAP
00000000 a VECTREMAPPED
00000000 a VPBDIV_Val
00000001 a PLLCON_PLLE
00000001 a VPBDIV_SETUP
00000002 a MAMCR_Val
00000002 a PLLCON_PLLC
00000004 a MAMTIM_OFS
00000004 a MAMTIM_Val
00000004 a PLLCFG_OFS
00000008 a PLLSTAT_OFS
0000000c a PLLFEED_OFS
00000010 a Mode_USR
The first column is the absolute address of the symbol in memory.
The second column specifies the type of the symbol. The GNU nm manual page explains what each letter means: T means the symbol is in the text section (most likely a function), B means the symbol is in the BSS section, and so on. The a entries in the listing above are absolute symbols.
The third column is the name of the symbol. In C it's straightforward: the name of the symbol is the name used in the code. C++ introduces name mangling, which makes finding a symbol a bit trickier.
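To make the three columns concrete, here is a small sketch that splits one line of the *.sym file into its address, type and name. The names nm_entry, parse_nm_line, is_function and SYM_NAME_MAX are my own for illustration, and the 63-character name limit is an arbitrary assumption.

```c
#include <stdio.h>

#define SYM_NAME_MAX 64   /* assumed upper bound on name length, including '\0' */

struct nm_entry {
    unsigned long address;        /* first column: absolute address   */
    char          type;           /* second column: nm type letter    */
    char          name[SYM_NAME_MAX]; /* third column: symbol name    */
};

/* Parse one line such as "00000010 a Mode_USR".
 * Returns 1 on success, 0 if the line does not have all three columns
 * (for example an undefined symbol, which has no address). */
int parse_nm_line(const char *line, struct nm_entry *out)
{
    return sscanf(line, "%lx %c %63s",
                  &out->address, &out->type, out->name) == 3;
}

/* Symbols in the text section are marked T (global) or t (local). */
int is_function(const struct nm_entry *e)
{
    return e->type == 'T' || e->type == 't';
}
```

Names longer than 63 characters are truncated by this sketch, so a real tool would want a larger buffer or an explicit length check.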
Generating a linkable symbol table object
In order to simplify the parsing of the symbol table (once in ROM), and to limit its footprint in memory, I wrote a small utility that will:
- Remove unnecessary spaces
- Add end of string characters at the end of each line in the table (thus creating what are later referred to as entries)
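As an illustration of these two steps, the sketch below compacts one line of the *.sym file into an entry. It is not the actual utility: format_entry is a hypothetical name, and I am assuming here that all whitespace is dropped, so that "00000010 a Mode_USR" becomes "00000010aMode_USR" followed by an end of string character.

```c
#include <ctype.h>
#include <stddef.h>

/* Copy `line` into `out` without any whitespace and append '\0'.
 * Returns the number of bytes written, including the terminator,
 * or 0 if the line is blank or `out` is too small. */
size_t format_entry(const char *line, char *out, size_t out_size)
{
    size_t n = 0;

    for (; *line != '\0'; ++line) {
        if (isspace((unsigned char)*line))
            continue;                   /* drops spaces and the newline */
        if (n + 1 >= out_size)
            return 0;                   /* no room left for the terminator */
        out[n++] = *line;
    }
    if (n == 0)
        return 0;                       /* blank line, no entry */

    out[n++] = '\0';                    /* end of string marks end of entry */
    return n;
}
```

Writing the returned number of bytes for each line to the output file, one entry after the other, would give a table in which every entry is a C string.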
Add this rule to the makefile to generate the formatted symbol table:
The utility takes the *.sym file and converts it into a *.mys file (the formatted symbol table). Once the *.mys file is generated, we need to convert it into an object that can be linked with the program.
This is done by adding the following rule to the makefile:
symbol_table.o : $(TARGET).sym.mys
	@echo Converting $<
	@cp $(<) $(*).tmp
	@$(OBJCOPY) -I binary -O elf32-littlearm -B arm \
	--rename-section .data=.symbol_table,contents,alloc,load,readonly,data \
	--redefine-sym _binary_$*_tmp_start=symbol_table \
	--redefine-sym _binary_$*_tmp_end=symbol_table_end \
	--redefine-sym _binary_$*_tmp_size=symbol_table_size_sym \
	$(*).tmp $@
We use arm-elf-objcopy to convert the text file into an ELF file. By default arm-elf-objcopy puts the symbol table in the .data section. We add options to force the symbol table into a section called .symbol_table. We also rename the symbols marking the start and end of the table to simpler ones.
This creates an ELF file that can be linked against the main program.
Add the following rule to link against the main program:
As well, edit the build rule to include the second linking step
And modify the source of the final output file
# Create final output file (.bin) from ELF output file.
@echo $(MSG_FLASH) $@
$(OBJCOPY) -O binary $< $@
Edit the cleanup rule to remove the .mys, .elfagain and symbol_table.o files.
Mapping the symbol table in ROM
The final step is to map the symbol table object in ROM. This is done by modifying the linker script (main_memory_block.ld), by adding the symbol_table section.
The symbol_table section is located at the end of all the sections placed in ROM, so that the mapping of the other sections does not change during the second link step.
. = ALIGN(4);
__symbol_table_start__ = .;
__symbol_table_end__ = __symbol_table_start__ + SIZEOF(.symbol_table);
. = ALIGN(4);
_etext = . ;
PROVIDE (etext = .);
Check the map file to make sure the symbol_table section is at the end of the flash section (you could also check the symbol table itself).
Once you've done that, that's it, the symbol table is part of your program, and will be located at address &__symbol_table_start__, with a size of (&__symbol_table_end__ - &__symbol_table_start__) bytes.
I have attached a zip file that contains the modified logomatic firmware, to allow the embedding of the symbol table in ROM.
Here is how to programmatically access entries in the symbol table.
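As a rough idea of what such an access can look like, the sketch below walks the table entry by entry and returns the address stored for a given name. It rests on assumptions that may not match the real table: each entry is taken to be 8 hexadecimal address digits, one type character, the symbol name and an end of string character. The names find_symbol and ADDR_DIGITS are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define ADDR_DIGITS 8   /* assumed: address stored as 8 hex characters */

/* Search the table in [table, table_end) for `name`.
 * Returns 1 and stores the symbol address in *address if found, else 0. */
int find_symbol(const char *table, const char *table_end,
                const char *name, unsigned long *address)
{
    const char *entry = table;

    while (entry < table_end) {
        /* Each entry ends with '\0'; stop if the last one is unterminated. */
        const char *nul = memchr(entry, '\0', (size_t)(table_end - entry));
        if (nul == NULL)
            break;

        size_t len = (size_t)(nul - entry);
        /* The name starts after the address digits and the type character. */
        if (len > ADDR_DIGITS + 1 &&
            strcmp(entry + ADDR_DIGITS + 1, name) == 0) {
            char digits[ADDR_DIGITS + 1];
            memcpy(digits, entry, ADDR_DIGITS);
            digits[ADDR_DIGITS] = '\0';
            *address = strtoul(digits, NULL, 16);
            return 1;
        }
        entry = nul + 1;                /* move on to the next entry */
    }
    return 0;
}
```

On the target, the two bounds would be the addresses of __symbol_table_start__ and __symbol_table_end__ from the linker script, and the returned address could then be cast to a function pointer of the appropriate type before calling it.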
Throughout the build process, one assumption is that the memory mapping does not change between the first and second link steps. So far this assumption has held, though I've noticed that make does not always detect modifications made to the source code. I've found it's a good idea to always run a clean before rebuilding the code.