Discussion:
[avr-gcc-list] When does the Stack Frame Pointer (Y) get setup?
Bob Paddock
2012-07-03 17:05:31 UTC
When does the Stack Frame Pointer (Y) get setup?

Should the following code work? I find it trashing memory in bss:

void id_filter_init8( void ) __attribute__ ((naked)) __attribute__
((section(".init8")));
void id_filter_init8( void )
{
uint8_t ff_u8[128U];

/* memset() should really be used here */
for( uint16_t idx_u16 = 0U; idx_u16 < (uint16_t) sizeof( ff_u8 ); idx_u16++ )
{
ff_u8[ idx_u16 ] = 0U;
}

/* Anything that makes ff_u8 be used, so it is not optimized away in
this example: */
eeprom_update_block ( ff_u8, sysgrp_mask_eeprom_u8, sizeof(
sysgrp_mask_eeprom_u8 ) );
}

.lss:

00008cc4 <__ctors_end>:
8cc4: 11 24 eor r1, r1
8cc6: 1f be out 0x3f, r1 ; 63
8cc8: cf ef ldi r28, 0xFF ; 255 Set Y and stack
8cca: df e3 ldi r29, 0x3F ; 63
8ccc: de bf out 0x3e, r29 ; 62
8cce: cd bf out 0x3d, r28 ; 61
8cd0: 00 e0 ldi r16, 0x00 ; 0
8cd2: 0c bf out 0x3c, r16 ; 60
8cd4: 18 be out 0x38, r1 ; 56
8cd6: 19 be out 0x39, r1 ; 57
8cd8: 1a be out 0x3a, r1 ; 58
8cda: 1b be out 0x3b, r1 ; 59

[snip lots of stuff that changes r28/r29, doesn't save them]

[Is Y really a frame pointer value at this point?:]

void id_filter_init8( void ) __attribute__ ((naked)) __attribute__
((section(".init8")));
void id_filter_init8( void )
{
93b4: fe 01 movw r30, r28
93b6: 31 96 adiw r30, 0x01 ; 1
#include "event.h"
extern void event_bug( void );
#endif

void id_filter_init8( void ) __attribute__ ((naked)) __attribute__
((section(".init8")));
void id_filter_init8( void )
93b8: ce 01 movw r24, r28
93ba: 8f 57 subi r24, 0x7F ; 127
93bc: 9f 4f sbci r25, 0xFF ; 255
{
uint8_t ff_u8[128U];

for( uint16_t idx_u16 = 0U; idx_u16 < (uint16_t) sizeof( ff_u8 ); idx_u16++ )
{
ff_u8[ idx_u16 ] = 0U;
93be: 11 92 st Z+, r1
void id_filter_init8( void ) __attribute__ ((naked)) __attribute__
((section(".init8")));
void id_filter_init8( void )
{
uint8_t ff_u8[128U];

for( uint16_t idx_u16 = 0U; idx_u16 < (uint16_t) sizeof( ff_u8 ); idx_u16++ )
93c0: e8 17 cp r30, r24
93c2: f9 07 cpc r31, r25
93c4: e1 f7 brne .-8 ; 0x93be <id_filter_init8+0xa>
{
ff_u8[ idx_u16 ] = 0U;
}

[Looks like Y Frame Pointer is set and used as such from here on, and
gets saved in function calls:]

MAIN_OS int main( void )
{
95d2: cd b7 in r28, 0x3d ; 61
95d4: de b7 in r29, 0x3e ; 62
95d6: cb 54 subi r28, 0x4B ; 75
95d8: d0 40 sbci r29, 0x00 ; 0
95da: cd bf out 0x3d, r28 ; 61
95dc: de bf out 0x3e, r29 ; 62

[more stuff]

return( 0 ); /* When main() returns the .FINIsh function will be
called to put I/O in its lowest power state. Probably the same as the
power up state */
}/* main() */
9aec: 80 e0 ldi r24, 0x00 ; 0
9aee: 90 e0 ldi r25, 0x00 ; 0
9af0: c5 5b subi r28, 0xB5 ; 181
9af2: df 4f sbci r29, 0xFF ; 255
9af4: cd bf out 0x3d, r28 ; 61
9af6: de bf out 0x3e, r29 ; 62
9af8: 08 95 ret

Did I miss anything in the documentation that would tell me not to use
auto variables in .initX sections after .init2 that sets the stack?
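
A note on reading the loop bounds in the .init8 listing above: "movw r30, r28 ; adiw r30, 1" makes Z = Y + 1, and the subi/sbci pair on r24/r25 subtracts the 16-bit constant 0xFF7F, which is the same as adding 129. So the loop zeroes the 128 bytes at Y+1 .. Y+128, wherever Y happens to point. The sketch below models that arithmetic in plain C so it can be checked on a host; the function names are made up for illustration and are not part of any toolchain.

```c
#include <stdint.h>

/* Models "subi lo, K_lo ; sbci hi, K_hi" on a 16-bit register pair:
 * the pair is reduced by the 16-bit constant, wrapping modulo 65536. */
uint16_t subi_sbci(uint16_t pair, uint8_t k_lo, uint8_t k_hi)
{
    uint16_t k = (uint16_t)(((uint16_t)k_hi << 8) | k_lo);
    return (uint16_t)(pair - k);
}

/* Loop start pointer: "movw r30, r28 ; adiw r30, 1" gives Z = Y + 1. */
uint16_t loop_start(uint16_t y)
{
    return (uint16_t)(y + 1U);
}

/* Loop end pointer: "movw r24, r28 ; subi r24, 0x7F ; sbci r25, 0xFF"
 * subtracts 0xFF7F, i.e. adds 129 to Y. */
uint16_t loop_end(uint16_t y)
{
    return subi_sbci(y, 0x7FU, 0xFFU);
}
```

The same reading applies to main(): "subi r28, 0x4B ; sbci r29, 0x00" in its prologue lowers Y by 75 bytes, and "subi r28, 0xB5 ; sbci r29, 0xFF" before the ret adds the 75 bytes back.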
Weddington, Eric
2012-07-03 18:21:09 UTC
Hi Bob,

Avr-libc user manual, section on Memory Sections:

http://www.nongnu.org/avr-libc/user-manual/mem_sections.html

According to that, it looks like .init2.

A disassembly of the whole program should allow you to verify what happens in the .initX sections.

I'm not sure why your program doesn't seem to work...

Eric
Bob Paddock
2012-07-03 22:00:02 UTC
On Tue, Jul 3, 2012 at 2:21 PM, Weddington, Eric
Post by Weddington, Eric
http://www.nongnu.org/avr-libc/user-manual/mem_sections.html
According to that, it looks like .init2.
I'm very familiar with that section. It says the stack pointer is set,
so call/returns can be used thereafter etc.
Beyond that it does not say anything about the frame pointer. As all
of this .initX stuff is not standard C, other than bss being zeroed
before main() starts,
there may be no reason that the frame pointer should be valid at this
point, and this is simply a documentation issue that should have a
warning added.
Post by Weddington, Eric
A disassembly of the whole program should allow you to verify what happens in the .initX sections.
Did that, that is where the code I posted came from.
Looks to me like r28/r29 is set in .init2; however, it does not seem to
be preserved until main() sets it again, and it is preserved
thereafter.
Post by Weddington, Eric
I'm not sure why your program doesn't seem to work...
It does not work because the code I posted is overwriting a structure
array that was initialized in an earlier .init8 section, and is now
filled with zeros because of the errant frame pointer. It's not a
question of why my code is broken, but a question of should it be
broken?

All my, relevant to this issue, code is in C. I'm not doing anything
with the registers. Is the frame pointer guaranteed to be preserved
from the time it is set in .init2 until it is used in main(), across
all of the .initX sections (which are *not* proper functions with the
normal epilog/prologs that would push/pop the r28/r29 pair)? It does
not look like it is from the listing, and certainly does not behave
like it is. I make heavy use of the .initX sections, however only the
one has anything put on the stack, which is what I posted.
--
http://blog.softwaresafety.net/
http://www.designer-iii.com/
http://www.wearablesmartsensors.com/
Weddington, Eric
2012-07-03 22:44:22 UTC
Post by Bob Paddock
Post by Weddington, Eric
I'm not sure why your program doesn't seem to work...
It does not work because the code I posted is overwriting a structure
array that was initialized in an earlier .init8 section, and is now
filled with zeros because of the errant frame pointer. It's not a
question of why my code is broken, but a question of should it be
broken?
All my, relevant to this issue, code is in C.
Well it's really hard to follow what you posted as you mix up your C and the assembly listing. I can't recreate something that I can test on my end to find out what is going on.

I noticed that you're clearing the array in .init8. You have a note in a comment that says that memset() should really be used instead. Have you tried it with memset()?

For that matter, you're setting the array to all zeros. Why not just make the array static? It will end up in the .bss section which gets zeroed out in the startup code. You don't have to fiddle with a .init8 section.

Eric
Bob Paddock
2012-07-03 23:03:04 UTC
Post by Weddington, Eric
Well it's really hard to follow what you posted as you mix up your C and the assembly listing.
That is right from the .lss as the compiler produced it. Optimizing
compilers don't make easy to read code for Humans.
Post by Weddington, Eric
I can't recreate something that I can test on my end to find out what is going on.
I noticed that you're clearing the array in .init8. You have a note in a comment that says that memset() should really be used instead. Have you tried it with memset()?
Yes, memset does not work. It has the same frame pointer issue. I
posted the for() loop code sample that you can use to reproduce the
problem above the compiler .lss output. Don already explained the
problem, and it is what I thought it was, an oversight in the
documentation.
Post by Weddington, Eric
For that matter, you're setting the array to all zeros. Why not just make the array static? It will end up in the .bss section which gets zeroed out in the startup code.
A static array would take the 128 bytes forever. I only need them to
be filled with zeros long enough for them to be written to the EEPROM
on the first system power up (Power Up Flag set in the MCUSR
register). It would make more sense if you saw the real code. That
was a minimal test case that I posted that showed the problem.
Post by Weddington, Eric
You don't have to fiddle with a .init8 section.
I'm well aware of that. I use the .initX sections as
pseudo-constructors and the .finiX sections as pseudo-destructors.

Makes me wonder if this .initX/.finiX stuff is breaking LTO?
Weddington, Eric
2012-07-04 00:06:47 UTC
Post by Bob Paddock
I'm well aware of that. I use the .initX sections as
pseudo-constructors and the .finiX sections as pseudo-destructors.
Maybe you should use C++? ;-)
Post by Bob Paddock
Makes me wonder if this .initX/.finiX stuff is breaking LTO?
Hmm. I wouldn't necessarily think so. But that's just speculation.
Don Kinzer
2012-07-03 22:40:02 UTC
Post by Bob Paddock
Is the frame pointer guaranteed to be preserved from the time it is set in .init2 until it is used in main() [...]
The *stack pointer* is initialized in .init2 but the *frame pointer*
is not. Rather, the frame pointer is set up in the prologue for each
function that needs it (i.e. has local storage or takes the address of
one or more parameters). The "naked" attribute tells the compiler to
omit the function prologue and epilogue which, apparently, also causes
the frame pointer setup to be omitted as well. It would be useful to
add a caveat to the documentation about using local
variables in naked functions.

One way to work around the problem is to create a separate helper
function to do the work and then call it from the .init8 function.
You'll need to ensure that the helper function is *not* static to
avoid having the compiler inline the code. You may also have to
adjust the compiler options to prevent the compiler from inlining the
code. The output below (resulting from code derived from your
example) was obtained using the default call cost and inline size
option settings.

Don Kinzer

--- Example Code ---
void id_filter_init8( void ) __attribute__ ((naked))
__attribute__((section(".init8")));
void id_filter_init8( void )
{
98: 07 d0 rcall .+14 ; 0xa8 <bar>
9a: 03 d0 rcall .+6 ; 0xa2 <main>
9c: 2a c0 rjmp .+84 ; 0xf2 <_exit>
...
void bar(void)
{
a8: df 93 push r29
aa: cf 93 push r28
ac: cd b7 in r28, 0x3d ; 61
ae: de b7 in r29, 0x3e ; 62
b0: c0 58 subi r28, 0x80 ; 128
b2: d0 40 sbci r29, 0x00 ; 0
b4: 0f b6 in r0, 0x3f ; 63
b6: f8 94 cli
b8: de bf out 0x3e, r29 ; 62
ba: 0f be out 0x3f, r0 ; 63
bc: cd bf out 0x3d, r28 ; 61
be: fe 01 movw r30, r28
c0: 31 96 adiw r30, 0x01 ; 1
uint8_t ff_u8[128U];

/* memset() should really be used here */
for( uint16_t idx_u16 = 0U; idx_u16 < (uint16_t) sizeof( ff_u8 ); idx_u16++)
c2: ce 01 movw r24, r28
c4: 8f 57 subi r24, 0x7F ; 127
c6: 9f 4f sbci r25, 0xFF ; 255
{
ff_u8[ idx_u16 ] = 0U;
c8: 11 92 st Z+, r1
void bar(void)
{
uint8_t ff_u8[128U];

/* memset() should really be used here */
for( uint16_t idx_u16 = 0U; idx_u16 < (uint16_t) sizeof( ff_u8 ); idx_u16++)
ca: e8 17 cp r30, r24
cc: f9 07 cpc r31, r25
ce: e1 f7 brne .-8 ; 0xc8 <bar+0x20>
{
ff_u8[ idx_u16 ] = 0U;
}
foo(ff_u8[ val ]); // <-- this is to prevent the code above from being optimized away
d0: 80 91 00 01 lds r24, 0x0100
d4: fe 01 movw r30, r28
d6: e8 0f add r30, r24
d8: f1 1d adc r31, r1
da: 81 81 ldd r24, Z+1 ; 0x01
dc: e1 df rcall .-62 ; 0xa0 <foo>
}
de: c0 58 subi r28, 0x80 ; 128
e0: df 4f sbci r29, 0xFF ; 255
e2: 0f b6 in r0, 0x3f ; 63
e4: f8 94 cli
e6: de bf out 0x3e, r29 ; 62
e8: 0f be out 0x3f, r0 ; 63
ea: cd bf out 0x3d, r28 ; 61
ec: cf 91 pop r28
ee: df 91 pop r29
f0: 08 95 ret
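
At the source level, the workaround in this post might look roughly like the sketch below. It is an assumption-laden illustration, not the code that produced the listing: consume_block(), id_filter_fill() and the INIT8_ATTR macro are hypothetical names, consume_block() merely stands in for eeprom_update_block(), and the noinline attribute is used in place of relying on the helper being non-static. Note also that GCC's documentation only supports basic asm statements inside a naked function, so even a plain C call from one is worth confirming in the .lss file.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__AVR__)
#define INIT8_ATTR __attribute__((naked, used, section(".init8")))
#else
#define INIT8_ATTR /* host build: an ordinary function, for illustration */
#endif

/* Hypothetical sink standing in for eeprom_update_block(); it only records
 * what it was given so the buffer cannot be optimized away. */
volatile uint16_t last_block_sum;
volatile size_t   last_block_len;

__attribute__((noinline))
void consume_block(const uint8_t *src, size_t len)
{
    uint16_t sum = 0U;

    for (size_t i = 0U; i < len; i++) {
        sum = (uint16_t)(sum + src[i]);
    }
    last_block_sum = sum;
    last_block_len = len;
}

/* Ordinary function: the compiler emits its prologue and epilogue, so the
 * 128-byte buffer gets a real stack frame. noinline keeps the body from
 * being merged into the naked caller. */
__attribute__((noinline))
void id_filter_fill(void)
{
    uint8_t ff_u8[128U];

    memset(ff_u8, 0, sizeof ff_u8);
    consume_block(ff_u8, sizeof ff_u8);
}

/* Naked .init8 fragment: no locals here, only the call to the helper. */
void id_filter_init8(void) INIT8_ATTR;
void id_filter_init8(void)
{
    id_filter_fill();
}
```

The point of the split is that everything needing stack storage lives in a function that gets a normal prologue, while the fragment placed in .init8 touches no locals at all.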
Joerg Wunsch
2012-07-04 07:13:43 UTC
Post by Bob Paddock
When does the Stack Frame Pointer (Y) get setup?
There's a "stack pointer", which is set up early (on modern AVRs,
it's already set up by the hardware).

There's a frame pointer (Y), which is set up upon each entry of a
function which needs it. Not all functions need a stack frame.
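
To make the distinction concrete, here is a small sketch of two functions, one that normally needs no frame and one that does. The function names are invented for this example, and whether a frame is actually created depends on the compiler version and optimization level, so the generated assembly is the final word.

```c
#include <stdint.h>

/* Everything here fits in registers, so an optimizing compiler would
 * normally not set up a frame pointer for this function. */
uint8_t add_small(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* A local array indexed at run time has to live in memory, so the
 * compiler normally reserves a stack frame in the prologue and addresses
 * the array relative to the frame pointer (Y on AVR). */
uint8_t pick_doubled(uint8_t idx)
{
    volatile uint8_t table[16U];

    for (uint8_t i = 0U; i < 16U; i++) {
        table[i] = (uint8_t)(i * 2U);
    }
    return table[idx & 15U];
}
```

Compiling both with avr-gcc -Os -S and comparing the prologues should show the in/subi/sbci/out sequence on r28/r29 only in the second function.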
--
cheers, J"org .-.-. --... ...-- -.. . DL8DTL

http://www.sax.de/~joerg/ NIC: JW11-RIPE
Never trust an operating system you don't have sources for. ;-)
Joerg Wunsch
2012-07-04 07:43:42 UTC
Post by Bob Paddock
Did I miss anything in the documentation that would tell me not to
use auto variables in .initX sections after .init2 that sets the
stack?
If you declare the function being "naked", you cannot expect the
compiler to allocate a stack frame.
Bob Paddock
2012-07-06 14:57:17 UTC
Post by Joerg Wunsch
Post by Bob Paddock
Did I miss anything in the documentation that would tell me not to
use auto variables in .initX sections after .init2 that sets the
stack?
If you declare the function being "naked", you cannot expect the
compiler to allocate a stack frame.
What one expects and reality are often different. Before I submit a
patch for the documentation, to prevent others from shooting
themselves in the foot, I wanted to make sure I understood how stack
frames are working.

In looking at the .lss file, I see something that I do not understand
when compiling code for the XMega128A1 using GCC 4.7.1-rc1.

Is the timing of the XMega OUT instruction different from that of a
non-XMega part, in that it would prevent an interrupt between two
sequential OUT instructions? What happens when an interrupt occurs
between the instructions that manipulate the stack pointer in the
prologue/epilogue? If there are 256 bytes of head-room between the
stack and the heap/bss, maybe nothing (it would not be noticed as a
problem).
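
To put numbers on the concern, the sketch below computes the value the 16-bit stack pointer would hold after only one of its two bytes has been written. It is a host-side illustration only: torn_sp() is a made-up helper and the addresses in the note after it are invented, not taken from the listings.

```c
#include <stdint.h>

/* Value the 16-bit stack pointer holds after only ONE of its two bytes
 * has been written. low_first != 0 models "out SPL" before "out SPH";
 * low_first == 0 models "out SPH" before "out SPL". */
uint16_t torn_sp(uint16_t old_sp, uint16_t new_sp, int low_first)
{
    if (low_first) {
        return (uint16_t)((old_sp & 0xFF00U) | (new_sp & 0x00FFU));
    }
    return (uint16_t)((new_sp & 0xFF00U) | (old_sp & 0x00FFU));
}
```

For example, moving SP from 0x3F10 down to 0x3EF0 (a 32-byte frame) passes through 0x3FF0 if the low byte goes first, which is above the old stack top, so an interrupt taken there would push its return address over live caller data. If the high byte goes first the intermediate value is 0x3E10, which is 224 bytes below the intended SP; that is harmless only when that much head-room exists.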

Going back to Anatoly's original OS_Main/OS_Task patch:

http://www.mail-archive.com/avr-gcc-***@nongnu.org/msg03812.html

Using his example code from that email, copied in to the file main.c,
compiling using the Makefile at the end of this message, Non-XMega run:

# Non-XMega: the stack manipulation is properly protected.
# The non-XMega parts delay a cycle when enabling interrupts,
# hence it is safe to save stack_hi, enable interrupts, save_lo.
# This code disables interrupts (SREG is 0x3F) when modifying the
# stack pointer 0x3D/0x3E:
00000090 <__prologue_saves__>:
[snip pushs]
b4: cd b7 in r28, 0x3d ; 61
b6: de b7 in r29, 0x3e ; 62
b8: ca 1b sub r28, r26
ba: db 0b sbc r29, r27
bc: 0f b6 in r0, 0x3f ; 63
be: f8 94 cli ; Disable Interrupts
c0: de bf out 0x3e, r29 ; 62
c2: 0f be out 0x3f, r0 ; 63 Enable Interrupts are delayed till after next instruction
c4: cd bf out 0x3d, r28 ; 61
c6: 09 94 ijmp

000000c8 <__epilogue_restores__>:
[snip pops]
f0: 0f b6 in r0, 0x3f ; 63
f2: f8 94 cli ; Disable Interrupts
f4: de bf out 0x3e, r29 ; 62
f6: 0f be out 0x3f, r0 ; 63 Enable Interrupts are delayed till after next instruction
f8: cd bf out 0x3d, r28 ; 61
fa: ca 2f mov r28, r26
fc: db 2f mov r29, r27
fe: 08 95 ret

I'm fine with the above code, however I question this
XMega run:

# This code DOES NOT disable interrupts when modifying the stack
# pointer 0x3D/0x3E:
000002b0 <__prologue_saves__>:
[snip pushs]
2d4: cd b7 in r28, 0x3d ; 61
2d6: de b7 in r29, 0x3e ; 62
2d8: ca 1b sub r28, r26
2da: db 0b sbc r29, r27
2dc: cd bf out 0x3d, r28 ; 61
; [What happens if an interrupt happens between these two out instructions?]
2de: de bf out 0x3e, r29 ; 62
2e0: 19 94 eijmp

000002e2 <__epilogue_restores__>:
[snip ldds]
306: ce 0f add r28, r30
308: d1 1d adc r29, r1
30a: cd bf out 0x3d, r28 ; 61
; [What happens if an interrupt happens between these two out instructions?]
30c: de bf out 0x3e, r29 ; 62
30e: ed 01 movw r28, r26
310: 08 95 ret

For anyone that wants to reproduce:

# -------------- Cut Here main.c --------------
__attribute__ ((noinline))
int fn0(long a1, long a2, long a3, long a4, long a5)
{
return 0;
}

// Note this is commented out, OS_MAIN/OS_TASK are not relevant at this point:
//__attribute__ ((OS_main))
int main(void)
{
volatile long long a; // local var, need function frame

a = 1;

return fn0(1, 2, 3, 4, 5); // use call-saved regs
}
# -------------- Cut Here --------------

And this Makefile of mine:

# -------------- Cut Here Makefile --------------
MCU=atxmega128a1
#
# NOTE, that this is commented out, it is the important point.
# Uncomment to see the Xmega main.lss output and compare.
#
#M_MCU=-mmcu=$(MCU)

SRC=main.c
TARGET=main
OBJDIR = .

CSTANDARD = -std=gnu99
CFLAGS += $(CSTANDARD)

# Functions prologues/epilogues expanded as call to appropriate
# subroutines. Code size will be smaller. Use subroutines for function
# prologue/epilogue. For complex functions that use many registers (that needs
# to be saved/restored on function entry/exit), this saves some space at the
# cost of a slightly increased execution time.
CFLAGS += -mcall-prologues

# -adhlns...: create assembler listing
CFLAGS += -Wa,-adhlns=$(<:%.c=$(OBJDIR)/%.lst)

# Be verbose:
CFLAGS += -v

ALL_CFLAGS = -v $(M_MCU) $(CFLAGS)

# Tools:
OBJDUMP = avr-objdump
CC = avr-gcc
SHELL = sh
REMOVE = rm -f

OBJ = $(SRC:%.c=$(OBJDIR)/%.o) $(CPPSRC:%.cpp=$(OBJDIR)/%.o) $(ASRC:%.S=$(OBJDIR)/%.o)

all: build

build: elf lss

elf: $(TARGET).elf
lss: $(TARGET).lss

# Create extended listing file from ELF output file.
%.lss: %.elf
# -h Display section headers
# -S Intermix source code with disassembly
# -z Do not skip blocks of zero when disassembling
	$(OBJDUMP) -h -S -z $(OBJDIR)/$< > $(OBJDIR)/$@

%.elf: $(OBJ)
	$(info )
	$(info ALL_CFLAGS = $(ALL_CFLAGS))
	$(info )
	$(CC) $(ALL_CFLAGS) $^ --output $(OBJDIR)/$@ $(LDFLAGS)

# Compile: create object files from C source files.
$(OBJDIR)/%.o : %.c
	$(CC) -c $(ALL_CFLAGS) $< -o $@

clean:
	$(REMOVE) $(TARGET).elf
	$(REMOVE) $(TARGET).lss
# -------------- Cut Here --------------
Bob Paddock
2012-07-06 17:00:07 UTC
I believe xmega automatically disables interrupts for one cycle after
anything that touches the stack and/or int priority registers (it has
been a few years so I might have some details wrong).
Thank you.

That would be this in the XMega Manual:

To prevent corruption when updating the Stack Pointer from software, a
write to SPL will automatically disable interrupts for up to 4
instructions or until the next I/O memory write.
Georg-Johann Lay
2012-07-08 17:58:56 UTC
Post by Bob Paddock
Post by Joerg Wunsch
Post by Bob Paddock
Did I miss anything in the documentation that would tell me not to
use auto variables in .initX sections after .init2 that sets the
stack?
If you declare the function being "naked", you cannot expect the
compiler to allocate a stack frame.
What one expects and reality are often different. Before I submit a
patch for the documentation, to prevent others from shooting
themselves in the foot, I wanted to make sure I understood how stack
frames are working.
The documentation says clearly that a naked function will neither get
a prologue nor an epilogue:

http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html#index-function-without-a-prologue_002fepilogue-code-2571
Post by Bob Paddock
Is the timing of the XMega OUT instruction different than a non-XMega
part, in that it would prevent an interrupt between two sequential
OUT instructions?
Writing SPL will switch off I-flag for some ticks.
Post by Bob Paddock
I'm fine with the above code, however I question this
# This code DOES NOT disable interrupts when modifying the stack
[snip pushs]
2d4: cd b7 in r28, 0x3d ; 61
2d6: de b7 in r29, 0x3e ; 62
2d8: ca 1b sub r28, r26
2da: db 0b sbc r29, r27
2dc: cd bf out 0x3d, r28 ; 61
; [What happens if an interrupt happens between these two out instructions?]
2de: de bf out 0x3e, r29 ; 62
2e0: 19 94 eijmp
The code does not disable IRQs because the hardware does.
Hint: To reproduce it is helpful to show an example that
is as easy as it can get. In particular, to compile the code
and see the generated assembly no Makefile is needed.

Simply

$ avr-gcc -mmcu=avrxmega5 foo.c -Os -S

for example with foo.c

void foo (void)
{
long volatile a = 0;
}

The epilogue then reads:

/* epilogue start */
adiw r28,4
out __SP_L__,r28
out __SP_H__,r29
pop r29
pop r28
ret

As frame setup is handled by the prologue and epilogue, you don't get
frame handling/setup with a naked function, of course.

The impact of naked will be the same as if you deleted all
instructions up to the /* prologue: function */ comment and all
instructions after the /* epilogue start */ comment.

Besides acting on generated code, -mmcu= also takes effect on the
multilib variant of libgcc/libc that is linked with your code.

prologue_saves and epilogue_restores reside in libgcc. The xmega's
multilibs take advantage of the hardware magic that disables I.

See __prologue_saves__ for example:

http://gcc.gnu.org/viewcvs/trunk/libgcc/config/avr/lib1funcs.S?revision=185907&view=markup#l1671


Johann

Don Kinzer
2012-07-03 19:24:43 UTC
You'll probably have to create a "static inline" function [...]
In my test, the stack frame setup still was not done with a call to a
static inline function. However, removing the "static inline"
resulted in the stack frame setup being in the code for the helper
function at the expense of a call from the .init8 function.

void id_filter_init8( void ) __attribute__ ((naked))
__attribute__((section(".init8")));
void id_filter_init8( void )
{
98: 07 d0 rcall .+14 ; 0xa8 <bar>
9a: 03 d0 rcall .+6 ; 0xa2 <main>
9c: 2a c0 rjmp .+84 ; 0xf2 <_exit>
...
void bar(void)
{
a8: df 93 push r29
aa: cf 93 push r28
ac: cd b7 in r28, 0x3d ; 61
ae: de b7 in r29, 0x3e ; 62
b0: c0 58 subi r28, 0x80 ; 128
b2: d0 40 sbci r29, 0x00 ; 0
b4: 0f b6 in r0, 0x3f ; 63
b6: f8 94 cli
b8: de bf out 0x3e, r29 ; 62
ba: 0f be out 0x3f, r0 ; 63
bc: cd bf out 0x3d, r28 ; 61
be: fe 01 movw r30, r28
c0: 31 96 adiw r30, 0x01 ; 1
uint8_t ff_u8[128U];

/* memset() should really be used here */
for( uint16_t idx_u16 = 0U; idx_u16 < (uint16_t) sizeof( ff_u8 ); idx_u16++)
c2: ce 01 movw r24, r28
c4: 8f 57 subi r24, 0x7F ; 127
c6: 9f 4f sbci r25, 0xFF ; 255
{
ff_u8[ idx_u16 ] = 0U;
c8: 11 92 st Z+, r1
void bar(void)
{
uint8_t ff_u8[128U];

/* memset() should really be used here */
for( uint16_t idx_u16 = 0U; idx_u16 < (uint16_t) sizeof( ff_u8 ); idx_u16++)
ca: e8 17 cp r30, r24
cc: f9 07 cpc r31, r25
ce: e1 f7 brne .-8 ; 0xc8 <bar+0x20>
{
ff_u8[ idx_u16 ] = 0U;
}
foo(ff_u8[ val ]); // <-- this is to prevent the code above from being optimized away
d0: 80 91 00 01 lds r24, 0x0100
d4: fe 01 movw r30, r28
d6: e8 0f add r30, r24
d8: f1 1d adc r31, r1
da: 81 81 ldd r24, Z+1 ; 0x01
dc: e1 df rcall .-62 ; 0xa0 <foo>
}
de: c0 58 subi r28, 0x80 ; 128
e0: df 4f sbci r29, 0xFF ; 255
e2: 0f b6 in r0, 0x3f ; 63
e4: f8 94 cli
e6: de bf out 0x3e, r29 ; 62
e8: 0f be out 0x3f, r0 ; 63
ea: cd bf out 0x3d, r28 ; 61
ec: cf 91 pop r28
ee: df 91 pop r29
f0: 08 95 ret
Don Kinzer
2012-07-03 22:31:32 UTC
Post by Bob Paddock
Is the frame pointer guaranteed to be preserved from the time it is set in .init2 until it is used in main() [...]
The *stack pointer* is initialized in .init2 but the *frame pointer*
is not. Rather, the frame pointer is set up in the prologue for each
function that needs it (i.e. has local storage or takes the address of
one or more parameters). As I pointed out in an earlier reply, the
"naked" attribute tells the compiler to omit the function prologue and
epilogue which, apparently, also causes the frame pointer setup to be
omitted as well.

I demonstrated how to work around the issue in my second reply. It
would be useful to add a caveat to the documentation about using local
variables in naked functions.

Don Kinzer
Georg-Johann Lay
2012-07-04 14:21:54 UTC
Post by Bob Paddock
When does the Stack Frame Pointer (Y) get setup?
- If a frame pointer is needed
- If its setup is not avoided, e.g. by attribute naked
Post by Bob Paddock
Should the following code work?
No. The code obviously relies on a frame but inhibits frame
generation by "naked".
Post by Bob Paddock
void id_filter_init8( void ) __attribute__ ((naked)) __attribute__
((section(".init8")));
void id_filter_init8( void )
{
uint8_t ff_u8[128U];
/* memset() should really be used here */
for( uint16_t idx_u16 = 0U; idx_u16 < (uint16_t) sizeof( ff_u8 ); idx_u16++ )
{
ff_u8[ idx_u16 ] = 0U;
}
Consider writing this as a constructor, i.e.
__attribute__ ((constructor))
instead of
__attribute__ ((naked, section(".init8")))
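
A minimal sketch of what the constructor variant could look like follows. The names id_filter_init and id_filter_ready are invented for the example, and the flag exists only so the effect can be observed; real code would pass ff_u8 to eeprom_update_block() instead. Because a constructor is an ordinary function, it gets a normal prologue and epilogue and so a valid frame for the local array.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical flag, present only so the demo has an observable effect. */
volatile uint8_t id_filter_ready;

/* Runs automatically before main(). Not naked, so the compiler sets up
 * and tears down the stack frame that ff_u8 lives in. */
__attribute__((constructor))
void id_filter_init(void)
{
    uint8_t ff_u8[128U];

    memset(ff_u8, 0, sizeof ff_u8);
    /* Real code would hand ff_u8 to eeprom_update_block() here. */
    id_filter_ready = (uint8_t)(ff_u8[0] == 0U && ff_u8[127] == 0U);
}
```

One thing to check before switching: as far as I know, avr-libc runs constructors from .init6, which is earlier than .init8, so code that depends on running after other .init8 fragments may need its ordering revisited.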

Johann