Introduction: Software

To program a microprocessor, a programmer can choose among many programming languages, which differ widely in complexity. The stack on the right shows the different types of programming languages. Some languages are close to a form that humans understand; these are high-level languages. Examples include C/C++, Java, and Fortran. One line of high-level code can encompass a complex series of operations. The same (or very similar) high-level code can be used on different computer architectures.

Some languages are close to a form that the hardware understands; these are low-level languages or machine languages. Machine language refers to binary code that can be loaded into memory and executed directly by the processor. One line of machine code represents one processor instruction, which is usually a small operation. Different computer architectures have different machine languages. For example, code in ARM machine language (most embedded processors) cannot be executed on a machine with the IA-32 architecture (most general-purpose computer processors).
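
To make the incompatibility concrete, here is a minimal sketch in Python. The two opcode tables, ARCH_A and ARCH_B, are invented for illustration and do not describe ARM, IA-32, or any real processor; the point is only that the same bytes mean different things to different hardware.

```python
# Two made-up opcode tables (assumptions for illustration only).
# They do not describe ARM, IA-32, or any real processor.
ARCH_A = {0x01: "LOAD", 0x02: "ADD", 0x03: "STORE"}
ARCH_B = {0x01: "JUMP", 0x02: "LOAD", 0x04: "ADD"}


def decode(machine_code, opcode_table):
    """Interpret each byte as one instruction under the given opcode table."""
    return [opcode_table.get(byte, "ILLEGAL") for byte in machine_code]


# Three bytes of "machine code" written with ARCH_A in mind.
program = bytes([0x01, 0x02, 0x03])

if __name__ == "__main__":
    print(decode(program, ARCH_A))  # ['LOAD', 'ADD', 'STORE']
    print(decode(program, ARCH_B))  # ['JUMP', 'LOAD', 'ILLEGAL']
```

Under the second table, the same three bytes become a different (and partly invalid) program, which is why machine code built for one architecture cannot simply be run on another.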

 

Assembly language sits between high-level language and machine language. It is similar to a high-level language in that it is written as text and is readable by humans. It is comparable to machine language in that it is machine-specific and each line of code corresponds to a single machine instruction. Most people do not write code in assembly because it is much harder than using a high-level language and the code cannot be ported between architectures. However, in some exceptional cases, a programmer may need to do so. Assembly language is also used for debugging. In that case, the machine code is translated to assembly (disassembled) so that the programmer can see which instruction is being executed.

A compiler is computer software that transforms code written in one programming language (the source language) into another programming language (the target language). The name “compiler” is primarily used for programs that translate source code from a high-level programming language to a lower-level language (e.g., assembly language or machine code). The compiler is a sophisticated, complex tool that can optimize and improve the code written by the programmer. The same high-level code can be translated to many different versions of machine code (1-to-many), depending on the target architecture and the compiler settings.
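
The 1-to-many idea can be sketched with a toy translator in Python. This is not how a production compiler works; it is a minimal illustration that turns one high-level statement into instruction sequences for two hypothetical targets, a stack machine and a register machine. All mnemonics (PUSH, POP, load, store, and so on) are invented for this example, and Python's standard ast module is used only to parse the statement.

```python
import ast

# Hypothetical mnemonics for two made-up targets; neither is a real instruction set.
STACK_OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}
REG_OPS = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul"}


def compile_for_stack_machine(statement):
    """Translate 'target = expression' into instructions for a toy stack machine."""
    assign = ast.parse(statement).body[0]
    code = []

    def emit(node):
        if isinstance(node, ast.Name):
            code.append(f"PUSH {node.id}")
        elif isinstance(node, ast.BinOp):
            emit(node.left)
            emit(node.right)
            code.append(STACK_OPS[type(node.op)])
        else:
            raise ValueError("unsupported expression")

    emit(assign.value)
    code.append(f"POP {assign.targets[0].id}")
    return code


def compile_for_register_machine(statement):
    """Translate the same statement for a toy load/store register machine."""
    assign = ast.parse(statement).body[0]
    code = []
    next_reg = 0

    def emit(node):
        nonlocal next_reg
        if isinstance(node, ast.Name):
            reg = f"r{next_reg}"
            next_reg += 1
            code.append(f"load {reg}, {node.id}")
            return reg
        if isinstance(node, ast.BinOp):
            left = emit(node.left)
            right = emit(node.right)
            # The result overwrites the register that held the left operand.
            code.append(f"{REG_OPS[type(node.op)]} {left}, {left}, {right}")
            return left
        raise ValueError("unsupported expression")

    result = emit(assign.value)
    code.append(f"store {result}, {assign.targets[0].id}")
    return code


if __name__ == "__main__":
    for line in compile_for_stack_machine("y = a + b * c"):
        print(line)
    for line in compile_for_register_machine("y = a + b * c"):
        print(line)
```

For the single statement y = a + b * c, each toy backend emits six instructions, and the two sequences have nothing in common beyond their meaning. A real compiler additionally optimizes the result, which this sketch does not attempt.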

 

The machine code produced by translating one program file is called object code. Most often, this code cannot be executed directly on the machine. It needs to be linked with other object code (from the same program, but a different file) or with library object code (from a library) to create code that can be executed on a machine. The resulting file is called an executable. The program that combines multiple object files and library functions into a single executable is called a linker. In essence, object code and executables are both in machine language. However, object code is not complete on its own and has to be linked with other files to become an executable file.
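
The following Python sketch models what a linker does, under heavy simplification. The object-file layout used here (a dictionary with "code", "defines", and "needs" entries) is an assumption made for this example and is not a real object file format. Each toy object file holds a list of machine words, the symbols it defines, and the holes that must be filled with the address of a symbol defined elsewhere.

```python
def link(object_files):
    """Combine toy object files into one executable image (a flat list of words)."""
    image = []
    symbol_table = {}
    pending = []  # (absolute position in the image, symbol name)

    # Pass 1: lay the files out back to back and record where each symbol lands.
    for obj in object_files:
        base = len(image)
        for name, offset in obj["defines"].items():
            if name in symbol_table:
                raise ValueError(f"duplicate symbol: {name}")
            symbol_table[name] = base + offset
        for offset, name in obj["needs"]:
            pending.append((base + offset, name))
        image.extend(obj["code"])

    # Pass 2: fill every hole with the final address of the symbol it refers to.
    for position, name in pending:
        if name not in symbol_table:
            raise ValueError(f"undefined symbol: {name}")
        image[position] = symbol_table[name]

    return image


# Hypothetical object files. The numbers stand in for machine words,
# and None marks a hole the linker has to fill.
main_obj = {
    "code": [0x10, None, 0xFF],  # word 1 needs the address of 'square'
    "defines": {"main": 0},
    "needs": [(1, "square")],
}
math_obj = {
    "code": [0x20, 0x21],
    "defines": {"square": 0},
    "needs": [],
}

if __name__ == "__main__":
    print(link([main_obj, math_obj]))  # [16, 3, 255, 32, 33]
```

Linking main_obj on its own raises an "undefined symbol" error, which mirrors the point above: the object code is machine code, but it is incomplete until the linker has combined it with the file that defines the missing symbol.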


An assembler translates assembly to machine code. A disassembler does the reverse. Both the assembler and the disassembler perform an essentially one-to-one translation between assembly and machine code. When a programmer writes code in assembly, they use an assembler to create object code. When a programmer writes code in a high-level language and uses a compiler to generate object code, they can use a disassembler to turn the object code back into assembly and debug the program at the instruction level. The process of transforming high-level language code into an executable is called building.
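
The translation between assembly and machine code can be illustrated with a toy assembler and disassembler in Python. The instruction format below (one byte per instruction, opcode in the high 4 bits, operand in the low 4 bits) and the four mnemonics are assumptions invented for this sketch; real instruction sets are considerably more involved.

```python
# Hypothetical 8-bit instruction format: high 4 bits = opcode, low 4 bits = operand.
OPCODES = {"LOAD": 0x1, "ADD": 0x2, "STORE": 0x3, "HALT": 0xF}
MNEMONICS = {code: name for name, code in OPCODES.items()}
NO_OPERAND = {"HALT"}  # instructions written without an operand


def assemble(lines):
    """Translate each assembly line into exactly one machine-code byte."""
    machine_code = []
    for line in lines:
        parts = line.split()
        operand = int(parts[1]) if len(parts) > 1 else 0
        if not 0 <= operand <= 0xF:
            raise ValueError(f"operand out of range: {line}")
        machine_code.append((OPCODES[parts[0]] << 4) | operand)
    return bytes(machine_code)


def disassemble(machine_code):
    """Translate each machine-code byte back into one line of assembly."""
    lines = []
    for byte in machine_code:
        mnemonic = MNEMONICS[byte >> 4]
        if mnemonic in NO_OPERAND:
            lines.append(mnemonic)
        else:
            lines.append(f"{mnemonic} {byte & 0xF}")
    return lines


if __name__ == "__main__":
    source = ["LOAD 4", "ADD 5", "STORE 6", "HALT"]
    binary = assemble(source)
    print(binary.hex())         # 142536f0
    print(disassemble(binary))  # ['LOAD 4', 'ADD 5', 'STORE 6', 'HALT']
```

Four lines of assembly become four bytes, and disassembling those bytes recovers the original four lines. This round trip is what lets a debugger show the instruction being executed even when the program was written in a high-level language.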

The figure below summarizes these concepts. The pink callouts represent the tools. The white rectangles represent file formats that are readable by humans. Finally, the gray rectangles represent file formats that are in machine code.

ASK yourself