A Look Into The C Compiler
Guess what happened after George recently made a few simple changes
to a product for a customer and then recompiled the code? Compiler
errors! In this column, he describes how he addressed the problem.
In an article I read recently, the author wrote that he
was using assembly language so he would have more control over the
code. I thought this wasn't true. The C language should give you
all of the control over the generated code you would ever want. A
short while later, a customer asked me to add some changes to a
product to try out some new features for a new family of
applications. I had recently written all of the code in C, so it
should have been fresh in my mind. These changes consisted of new
operating modes that were based on existing modes. Little did I
know that I was about to explore all of the code a compiler
generates in depth.
OUT OF SPACE
The microcontroller in this project is a Texas Instruments MSP430,
which I have written about before. I'm also using one of the
simpler (smaller) versions in the family, specifically the
MSP430F122IPW (http://focus.ti.com/docs/prod/folders/print/msp430f122.html).
It is a simple device with 4
KB of flash memory, 256 bytes of RAM, and several
peripherals.
I made the changes and recompiled the code. What!?! Compiler errors.
The code was too big. How could that be? After thrashing about for a
while, I realized that in a 16-bit system (MSP430), the 4 KB is
2,048 words of memory space. Counted in words, those 2,048 locations
run from offset 0x0000 to 0x07FF. Yes, the compiler was correct. So, I
decided to look for a larger device that would plug right into the
PCB. No luck. All of the MSP430 devices that have a larger memory
are in a different package.
Because this change to the code was to experimentally evaluate new
operating features, I asked the customer if I could remove some of
the existing code to make room. The customer agreed, so I figured
it would be rather easy! Not so fast.
I started commenting out lines of C code that I thought would free
up enough memory. No good. The compiler still complained that we
were out of room. It was time to open up the lid on the compiler
and see what was going on.
COMPILER OUTPUT
In previous articles, I talked about and recommended the IAR
Systems compiler for the MSP430. It's free and compiles code up to
4 KB. It's a solid compiler at the right price. Most compilers have
the option to produce a listing output that mixes the C source with
the assembly language code produced. I went back to the product's
original baseline code and generated listing outputs for each
module. Actually, this is my default method for compiling. I always
keep an eye on the compiler's output to ensure that none of my C
code is compiling into a large and complicated assembly language
mess.
I made a copy of the output listing for the keyboard routine. You
can find the complete listing on the Circuit Cellar FTP site. I
deleted some of the information to protect the customer and the
project. This is a long listing and a good candidate to help you
understand what the compiler is doing and learn MSP430 assembly
language. That isn't a typo. We can learn a lot about the assembly
language by studying the compiler output.
In the listing file, lines of C code begin with the line number of
the original C file, while lines of generated assembly language
begin with the \ character. Remember that this is a listing file.
It is meant to be read by a human, not necessarily processed further
by a machine.
Let's just look at some of the outputs and comment about what's
going on. I'll start with simple operations:
127   LEDTimer = 0;        // Timer = 0
\ 0000BC 8243......        MOV.W #0x0, &LEDTimer
The 16-bit variable LEDTimer is set to 0. Note that the assembly
language command moves a word of 0 in value directly to the address
of the variable LEDTimer. One line of C, one line of assembly! The
...... is the actual memory address for LEDTimer and that will be
filled in (resolved) at link time. Also notice that this is a
common operation and it has a dedicated opcode for that
purpose. Finally, someone is listening:
131   ShortBeep(1);        // One Short beep
\ 0000CC 5C43              MOV.B #0x1, R12
\ 0000CE B012......        CALL #ShortBeep
The routine ShortBeep is called and passed a parameter value of 1.
The assembly language instructions load the 8-bit value 1 into R12
(register 12) with MOV.B and call the ShortBeep routine. In the IAR
implementation, C parameters are passed using the registers if
there are enough registers. If there are not enough free registers,
the parameters are passed using the stack. Depending on the
compiler design and the CPU design, this can be a source for slow
performance. Imagine 20 parameters to pass to a routine. If I have
more than four parameters, I make up a structure, save the
parameters in the structure, and pass a pointer to that structure
in the call to a routine. You might say, "Just make all of the
parameters global and then there is no need for passing anything."
Well, this works for smaller designs and is the same solution for C
or assembly, but it's just not practical for larger projects.
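The structure trick described above might look something like the sketch below. The names BeepParams and TotalBeepTimeMs and all of the members are made up for illustration and are not from the project. How many arguments actually fit in registers depends on the compiler's calling convention, so check the listing before assuming this saves anything.

```c
#include <stdint.h>

/* Hypothetical parameter block: none of these names come from the project. */
typedef struct {
    uint16_t frequency_hz;
    uint16_t duration_ms;
    uint8_t  repeat_count;
    uint8_t  volume;
    uint8_t  channel;
} BeepParams;

/* One pointer argument instead of five separate values. */
uint32_t TotalBeepTimeMs(const BeepParams *p)
{
    return (uint32_t)p->duration_ms * p->repeat_count;
}
```

The caller fills in one BeepParams variable and passes its address, so only a single pointer has to be loaded before the CALL.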
COMPARISONS
The C code in Listing 1 is comparing the LEDCounter variable to
the value 99. If the variable is larger, then the following section
of code enclosed in the {} brackets is executed. The assembly code
compares the 16-bit variable located at the address of LEDCounter
with the value of 0x64 (100 decimal), because for integers "greater
than 99" is the same as "not less than 100," and then the JL instruction
is executed. I bet the JL instruction is defined as jump if the
comparison results evaluated to "less than." And the jump
destination bypasses the C code inside the {} brackets. The
comparison feature alone is enough of a reason to use C. Every
microcontroller has different compare instructions, different flag
meanings, and different jump instructions. I don't want to spend
another minute of my time reading manuals and trying to figure it
all out. And neither should you. I didn't mention that you also
need to consider whether the variable is 8, 16, or 32 bits and
signed or unsigned. OK. I'll move on. But I hope you got the point.
Before we do, notice on line 120 in Listing 1 that adding 0xFF9C is
the equivalent of subtracting 100. Well, 0xFF9C is 16-bit two's
complement notation for -100, and LEDCounter is a 16-bit signed
variable (see Listing 2). C rocks!
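The 0xFF9C arithmetic is easy to check on any machine. Here is a small sketch with two made-up helper functions (AddFF9C and Subtract100 are not from the project) that do the addition and the subtraction in 16 bits so the results can be compared.

```c
#include <stdint.h>

/* Add 0xFF9C to a 16-bit value and let the result wrap at 16 bits. */
uint16_t AddFF9C(uint16_t value)
{
    return (uint16_t)(value + 0xFF9Cu);
}

/* Subtract 100 the ordinary way, also wrapped to 16 bits. */
uint16_t Subtract100(uint16_t value)
{
    return (uint16_t)(value - 100u);
}
```

Both functions return the same 16-bit pattern for every input, which is why the compiler is free to pick whichever form encodes better.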
STATEMENTS & ROUTINES
Listing 2 shows five C statements that do the same operation of
zeroing a variable. Notice that some of the variables are 16-bit
and some are only 8-bit entities. What if we wrote them as follows:
SSTimer = LEDTimer = LEDCounter = BeepTimer = BeepTimerCounter = 0;
The IAR compiler and MSP430 did the operation in 20 bytes of code.
How much code would your favorite microcontroller and compiler take
to perform this function? What's the best form of C code to do the
job? Can assembly language do it any better? This is a trivial
example, but you can imagine a more complicated example that would
help you evaluate microcontrollers and compilers:
233   FanOnOff(ESC_ON);
\ 000262 5C43              MOV.B #0x1, R12
\ 000264 B012......        CALL #FanOnOff
Here's how I turn a fan on and off. I have a routine that is passed
a parameter. If the parameter is equal to on, I turn the fan on and
likewise for off. If, however, I have a routine for FanOn and
another for FanOff, then there would be no need for parameter
passing. And one less word for every time I call those functions.
It is probably a good thing to do on this project because I'm
cramped for space. And it's an easy change:
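A rough sketch of the two styles follows. FanOnOff and ESC_ON appear in the listing above; ESC_OFF, FanOn, FanOff, FanIsOn, and the fan_state variable are assumptions made for illustration, with fan_state standing in for whatever output pin the real code drives.

```c
#include <stdint.h>

#define ESC_OFF 0u   /* assumed value, not shown in the article */
#define ESC_ON  1u   /* matches the #0x1 loaded in the listing */

static uint8_t fan_state;   /* stand-in for the real output pin */

/* Existing style: one routine, so every caller loads an argument first. */
void FanOnOff(uint8_t state)
{
    fan_state = (state == ESC_ON) ? 1u : 0u;
}

/* Alternative: two routines, so there is no argument to load at the call site. */
void FanOn(void)  { fan_state = 1u; }
void FanOff(void) { fan_state = 0u; }

/* Helper so the effect can be checked. */
uint8_t FanIsOn(void) { return fan_state; }
```

Whether the split version really comes out smaller depends on how many call sites there are, because the second routine costs a few words of its own.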
239   SSTimer++;
\ 000272 9253......        ADD.W #0x1, &SSTimer
This code fragment increments a 16-bit variable named SSTimer. What
if I had written SSTimer = SSTimer + 1; instead? Would the compiler load the
variable into a register, increment it, and then save that
variable?
The code generated in Listing 3 loads the 16-bit variable into R15,
adds 1 to it, saves that variable, and then uses the copy in R15
for the comparison. Is there a better way? What about this:
TriggerSwitchState++;
if (TriggerSwitchState > ESC_STATE_ACTIVE) {
That would produce the following:
\ 00001A 9253......        ADD.W #0x1, &TriggerSwitchState
\ 00001E B2906302......    CMP.W #0x263, &TriggerSwitchState
\ 000024 1F38              JL ??OperateKbStates_1
The code size went from eight words to six words, saving two words.
That is not a lot, but if this type of comparison is done
frequently, then two words could be saved every time.
SWITCH STATEMENTS
Probably the most important concept to take away from what we've
covered so far is that you need to get to know your compiler and
microcontroller so you can write C statements that will maximize
performance. Another area to investigate is how switch statements
are handled (see Listing 4).
The switch statement jumps to different locations depending on the
value of OperMode. Note that OperMode's defined values are listed
in Listing 5. If the mode is CONT or a nondefined mode, nothing is
done. If the mode is SoftStart, High, Low, or Pulse, then a timer is
incremented.
Notice what the compiler generated. The 8-bit variable OperMode is
copied into R14. Then the value 2 is subtracted. If the result is
0, then the decrementing path is taken. It would be 0 if the
OperMode was SoftStart. Look at the other tests the control portion
takes to figure out which value OperMode was set to. All of that
work is done in 10 word locations. Also consider the switch
statement at line 399 of the listing file on the Circuit Cellar FTP
site. The case statements are different and slightly different code
is generated.
Another complicated switch statement is shown in Listing 6. It is
close to the others but different enough to produce significantly
different code. A jump table is produced in this switch statement.
The possible states for LED_State are in Listing 7.
Note that all of the states are in numerical order with no missing
states. I made it easy for the compiler to build and use a jump
table. What if there were missing states? Well, the table could be
built with the default state inserted for those missing entries.
What if we used a switch statement and we were switching on a small
list of ASCII values? Good question. Why not put that into your C
compiler and microcontroller and see what happens? Not all
microcontrollers and compilers are created equal. With IAR, you are
looking at over 20 years of compiler design experience. And with
Texas Instruments, I've seen their CPUs evolve over the past 30
years. So, in all fairness, don't compare an 8031 or a 6800 to a
modern CPU.
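To make the jump table idea concrete, here is a hand-built version of what a compiler might do for a dense set of states, with the gap filled by the default handler. This is only a sketch: the state names, their values, and the handlers are invented for illustration and are not the LED_State values from Listing 7, and a real compiler's table will hold jump targets rather than C function pointers.

```c
#include <stdint.h>

/* Hypothetical states: values 0, 1, and 3 are used, and 2 is a gap. */
enum { LED_IDLE = 0, LED_BLINK = 1, LED_SOLID = 3, LED_STATE_COUNT = 4 };

typedef uint8_t (*StateHandler)(void);

static uint8_t HandleIdle(void)    { return 10u; }
static uint8_t HandleBlink(void)   { return 20u; }
static uint8_t HandleSolid(void)   { return 30u; }
static uint8_t HandleDefault(void) { return 0u; }

/* One entry per value; the gap at 2 points at the default handler. */
static const StateHandler led_table[LED_STATE_COUNT] = {
    HandleIdle, HandleBlink, HandleDefault, HandleSolid
};

uint8_t DispatchLedState(uint8_t state)
{
    if (state >= LED_STATE_COUNT) {
        return HandleDefault();   /* out-of-range values take the default path */
    }
    return led_table[state]();
}
```

The table costs one entry per possible value, so it pays off for dense state lists and gets wasteful for a sparse one such as a handful of ASCII codes.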
ARRAYS
The examples in Listing 8 are not found in the listing posted on
the Circuit Cellar FTP site. Arrays are another area to
investigate.
DispQue is an array of a structure, and one element in that
structure is Type. The C code is saving the local variable nType
(new Type) in the array. So, first the 8-bit variable LCDQueInPtr
is loaded into R15, and then that variable is converted to a 16-bit
entity by extending the sign. So, 0xFF would become 0xFFFF and 0x01
would become 0x0001. If I had defined LCDQueInPtr as an unsigned
type, the sign extension operation would not be required. Next, R15
is shifted to the left once and then once again. This is a left
shift by two or a multiplication by four. And four is the size, in
bytes, of the structure. So if LCDQueInPtr were pointing to the 0th
element, then R15 would hold 0x0000. If it were pointing to the
fifth element, then R15 would hold 5 × 4 = 20 = 0x0014. The last
step is to save the variable in R12 (the new Type) to the DispQue
array offset by the contents of R15. This is a bit of insight on
how arrays and structures compile to assembly language. Rather
slick I must say.
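The same steps can be written out in C to see the arithmetic. In this sketch only the member name Type comes from the article; the other members of DispEntry and the helper names are assumptions, chosen so the structure is 4 bytes long as described.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 4-byte queue entry; only Type is named in the article. */
typedef struct {
    uint8_t Type;
    uint8_t Row;
    uint8_t Col;
    uint8_t Value;
} DispEntry;

/* Widen a signed 8-bit pattern to 16 bits, the job the SXT instruction does. */
uint16_t SignExtend8(uint8_t raw)
{
    return (raw & 0x80u) ? (uint16_t)(raw | 0xFF00u) : (uint16_t)raw;
}

/* Byte offset of an element: sign extend, then shift left twice (times 4). */
uint16_t EntryOffset(uint8_t index)
{
    uint16_t offset = SignExtend8(index);
    offset = (uint16_t)(offset << 1);
    offset = (uint16_t)(offset << 1);
    return offset;
}

/* Store nType at base + offset, which is what queue[index].Type = nType does. */
void StoreType(DispEntry *queue, uint8_t index, uint8_t nType)
{
    uint8_t *base = (uint8_t *)queue;
    base[EntryOffset(index) + offsetof(DispEntry, Type)] = nType;
}
```

StoreType is only meaningful for indexes that stay inside the array; it is written this way to mirror the generated code, not as a replacement for ordinary array indexing.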
Listing 9 is an example of an array operation. The compiler knows
all there is to know about the memory location of LastLine2Dig[0].
It's at a specific address, as is LastLine2Dig[1]. So the compiler
just uses those addresses.
SAVING SPACE
Again, if we know the address of an array element, we get the code
in Listing 10. Space is saved for the 8-bit local (temporary)
variables Digit[0] and Digit[1] on the stack. When the routine
returns, the contents of these variables are lost. Because the
compiler knows the location of each variable, it's rather
straightforward to save new values in it.
Another example of arrays of structures and for(;;) loops is in
Listing 11. In the code, i is a local 8-bit variable on the stack.
We are attempting to set all the BeepQue[].Type locations to 0.
Notice the ??InitBeeperQue_0: and ??InitBeeperQue_1: labels. The
compiler generated them for destinations of jump instructions. The
??InitBeeperQue_1: location is the test to see if i is less than
MAX_BEEPS. If not, control flows to the ??InitBeeperQue_0 location.
Notice that because i is an 8-bit variable, the SXT instruction is
required. If I were looking for speed, I would have made i a 16-bit
variable and then I would not need to execute that SXT instruction.
The more RAM is used, the less code is used. Isn't that always the
case?
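For comparison, here are the two loop variants side by side. BeepQue, Type, and MAX_BEEPS are named in the article, but the value of MAX_BEEPS, the second structure member, and the function names below are assumptions for this sketch. Whether a particular compiler really drops the extension step for the wider index is something to confirm in its listing output.

```c
#include <stdint.h>

#define MAX_BEEPS 8                 /* hypothetical queue depth */

typedef struct {
    uint8_t Type;
    uint8_t Length;                 /* hypothetical second member */
} BeepEntry;

BeepEntry BeepQue[MAX_BEEPS];

/* 8-bit index: it has to be widened before it can be used as an array offset. */
void InitBeeperQue8(void)
{
    int8_t i;
    for (i = 0; i < MAX_BEEPS; i++) {
        BeepQue[i].Type = 0;
    }
}

/* Native-width index: no widening step, at the cost of a larger variable. */
void InitBeeperQueWide(void)
{
    int i;
    for (i = 0; i < MAX_BEEPS; i++) {
        BeepQue[i].Type = 0;
    }
}

/* Helper for checking the result: counts entries whose Type is not 0. */
int CountActiveBeeps(void)
{
    int i;
    int count = 0;
    for (i = 0; i < MAX_BEEPS; i++) {
        if (BeepQue[i].Type != 0) {
            count++;
        }
    }
    return count;
}
```

Both functions leave the queue in the same state; the only difference worth looking at is the code each one compiles to.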
Well, we just scratched the surface of compiler output. I used the
listing outputs to find large areas of code that could be deleted
to make room for the new features. The customer got his unit on
time. Remember we've got a code review going on from last time.
Next time, I'll talk about an off-grid solar home I recently
completed.
George Martin (gmartin@circuitcellar.com) began his career in the
aerospace industry in 1969. After five years at a real job, he set
out on his own and co-founded a design and manufacturing firm
(www.embeddeddesigner.com). His designs typically include
servo-motion control, graphical input and output, data acquisition,
and remote control systems. George is a charter member of the
Ciarcia Design Works Team. He's currently working on a mobile
communications system that announces highway info. He is a
nationally ranked revolver shooter.
PROJECT FILES
To download code, go to ftp://ftp.circuitcellar.com/pub/Circuit_Cellar/2008/214.
SOURCE
MSP430F122IPW Microcontroller
Texas Instruments, Inc.
www.ti.com