Hosed UART

fgonza2 · February 28, 2013, 7:22pm

My program uses 3 interrupt sources: UART, P1INT and Timer1. There seems to be a race condition between P1INT and UART1: after multiple interrupts, UART1 stops responding and I need to reset it (by running SerialUart1Init again). I have given max priority to P1INT. I have not modified the UART1 code at all; I am using the Wixel SDK. Are there any known issues, or ideas on why this would happen? The symptom is basically that any bytes I send to the UART don’t get transmitted. The rest of the program continues to run without an issue.

DavidEGrayson · February 28, 2013, 8:07pm

Hello. It’s pretty hard to debug this without seeing a simplified version of your code, along with instructions for how to reproduce the undesirable behavior. I reviewed the Wixel’s UART library source code and the only thing I noticed is that we have lines that disable and enable the TX interrupt by setting the UTXxIE bit in IEN2. I believe these lines should compile to single instructions, but maybe somehow that still causes a problem. The IEN2 register also contains P1IE, which you are probably using.

–David

fgonza2 · February 28, 2013, 9:27pm

I will post the interrupt routine code when I get home tonight, but basically the scenario is the following:

P1_6 is used as an input to monitor a square wave which is the output of a speed sensor. I use Timer1 to calculate the period of that signal, so i can calculate the speed of a motor and also the position by counting the number of pulses. The interrupt fires on rising edge and the P1INT ISR code only increments the value of a variable there and nothing else.
In the meantime there is a serial stream of data being received at 9600 bps and another stream being sent at 9600 bps as well. Both use UART1.
I use IEN2 and IP0, IP1 to give the highest possible priority to P1INT, so I don’t miss any pulses coming through P1INT.
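
For reference, the Timer1-based period measurement described above can be sketched in plain C roughly as follows. This is only a sketch under assumptions: the tick rate `TIMER_TICK_HZ` and the function names are placeholders made up for illustration, not values from the actual project, and it assumes a 16-bit counter plus a software overflow count.

```c
#include <stdint.h>

/* Hypothetical timer tick rate in Hz. The real value depends on the
   Timer 1 prescaler settings, which are not shown in this thread. */
#define TIMER_TICK_HZ 187500UL

/* Combine the software overflow count with the 16-bit counter value
   to get the full period in timer ticks. */
static uint32_t periodTicks(uint16_t overflows, uint16_t counter)
{
    return ((uint32_t)overflows << 16) | counter;
}

/* Convert a period in ticks to a pulse frequency in millihertz.
   Returns 0 for a zero period to avoid dividing by zero. */
static uint32_t pulseMilliHz(uint32_t ticks)
{
    if (ticks == 0) { return 0; }
    return (uint32_t)(((uint64_t)TIMER_TICK_HZ * 1000u) / ticks);
}
```

With the assumed tick rate, the 83 ms minimum period corresponds to roughly 15,560 ticks, so the overflow count only matters for much slower pulses.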

Reproduction scenario:

Run the system with both the serial stream and the signal that triggers P1INT (a square wave with a minimum period of 83 ms). The UART will hang after 2 to 3 minutes, not always at exactly the same time. That is why I think it is some sort of race condition: the square wave that triggers P1INT has a variable, unpredictable period.
If I don’t trigger P1INT (by disconnecting the square wave source), the UART never hangs.

Now that you mention that the UART code disables interrupts, it makes sense why I am seeing missing pulses on my counter. Could this be the source of the problem? Is there any drawback to modifying the UART code so that it never disables interrupts?

fgonza2 · March 1, 2013, 8:19am

Here is the ISR code:

ISR(P1INT, 0)  // interrupt to capture the counter pulse (CP)
{
    P1IFG  &= ~0x08;      //  Clear interrupt flag of P1_3 (P1F3)
                          
    IRCON2 &= ~0x08;      //  Clear CPU interrupt flag
    
    if (UP_DOWN_PIN)      //  Read direction, and increment or decrement the total pulse count
        pulseCounter++;
    else
        pulseCounter--;
   
    tachoPulseTime=readTimer1Counter();   // Read value of timer used to calculate the pulse period
    T1CNTL = 0;           // Reset the counter to zero, so a new period measurement can take place
    timer1OvrFlow=0;
    newTachoReadAvailable=1;     // Flag used to signal that the interrupt has been attended, meaning fresh data is available
}

ISR(T3,0)  // interrupt used as serial command watchdog
{
    TIMIF &= ~0b00000111;      //   T3OVFIF Clear T3 overflow interrupt flag - bit 0 (clearing all interrupts)
    auxCounter++;              // variable used to expand the timer resolution, will count number of times that there is an overflow
    if(auxCounter==WATCHDOG_TIMEOUT)
    {
        watchDogExpired=1;      // set timer watchdog flag - will need to be cleared by the consumer
        auxCounter=0;
    }
}

I am giving P1INT interrupt priority 1; UART1 is priority 3. Any ideas on why the UART gets stuck randomly? It only happens when both interrupts are enabled. With P1INT disabled, or without the source triggering it, the UART works fine.

DavidEGrayson · March 1, 2013, 6:05pm

The UART code only disables one interrupt. You could change it, but then I think you will need to have your main loop handle the transfer of bytes from the TX buffer to the UART, instead of having the interrupt do it.

I looked at the code you provided and do not have any ideas of what might be going wrong.

–David

fgonza2 · March 1, 2013, 10:39pm

Any ideas on how to modify the UART library to make it non-interrupt driven? I can call it from my main loop.

DavidEGrayson · March 1, 2013, 10:42pm

First you would need to understand what the code in the TX interrupt is doing, and then make a function in the main loop that serves the same purpose. However, I am not convinced that any of this will help because we still do not know why you are getting the unexpected behavior.

–David
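
To make the idea concrete, here is a rough sketch of a polled transmit path: a ring buffer filled by the application, and a service function called from the main loop that moves one byte to the UART whenever the transmitter is idle. The hardware is faked with two plain variables (`fakeTxBusy`, `fakeTxData`) so the sketch compiles anywhere; on the CC2511 these would be replaced by the UART status and data registers. All names are hypothetical and none of this is the Wixel SDK's code.

```c
#include <stdbool.h>
#include <stdint.h>

#define TX_BUFFER_SIZE 16u   /* must be a power of two for the index mask */

static uint8_t txBuffer[TX_BUFFER_SIZE];
static uint8_t txHead = 0;   /* next slot to write (application side) */
static uint8_t txTail = 0;   /* next slot to send (service side) */

/* Stand-ins for the hardware: "transmitter busy" status and data register. */
static bool    fakeTxBusy = false;
static uint8_t fakeTxData = 0;

/* Number of bytes that can still be queued. One slot stays empty so that
   head == tail always means "empty". */
static uint8_t txAvailable(void)
{
    return (uint8_t)((txTail - txHead - 1u) & (TX_BUFFER_SIZE - 1u));
}

/* Queue one byte; returns false if the buffer is full. */
static bool txSendByte(uint8_t b)
{
    if (txAvailable() == 0) { return false; }
    txBuffer[txHead] = b;
    txHead = (uint8_t)((txHead + 1u) & (TX_BUFFER_SIZE - 1u));
    return true;
}

/* Call this from the main loop. If the transmitter is idle and data is
   queued, hand exactly one byte to the (fake) data register. */
static void txService(void)
{
    if (fakeTxBusy || txHead == txTail) { return; }
    fakeTxData = txBuffer[txTail];
    txTail = (uint8_t)((txTail + 1u) & (TX_BUFFER_SIZE - 1u));
    fakeTxBusy = true;   /* real hardware would set its busy flag itself */
}
```

At 9600 bps with 8N1 framing a byte takes about 1.04 ms on the wire, so the main loop would need to call `txService()` at least that often to keep the transmitter busy.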

fgonza2 · March 2, 2013, 3:22am

I have instrumented the code to follow the behavior, and uartTxSend is called with the proper arguments, but nothing comes out. Something is hosing it, and it only happens when the two interrupts are enabled simultaneously. I will try increasing the interrupt priority of the UART to see if that makes a difference.

fgonza2 · April 20, 2013, 7:51pm

I resolved this in two ways:

I rewrote the UART libraries to use DMA. While doing that I found that the problem was interrupt priority: the uartN library sets the priority of the serial interrupts itself, overriding the interrupt priority settings I had made before calling the UART init function, so the serial interrupts ended up at a lower priority. With that fixed I don’t see the problem anymore. The issue is well documented in the SDCC manual. Basically, the UART ISRs were being interrupted by another ISR and, given the atomic access issues with SDCC, the UART ISR stayed hosed even after the other ISR returned. Longer term it will be safer to implement the UART libraries with DMA to avoid this risk.
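
For anyone hitting the same thing, the ordering problem can be sketched like this. On the CC2511, as I read the datasheet, interrupts are arranged in six priority groups, and each group's level (0 lowest to 3 highest) is a 2-bit value split across the same bit position of IP1 and IP0. The sketch models the two registers as plain variables so it compiles on a PC. `fakeUartInit` is a made-up stand-in for a library init that writes its own priority, not the actual Wixel SDK code, and the group numbers and level values are assumptions that should be checked against the datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Plain variables standing in for the IP0 and IP1 registers. */
static uint8_t ip0 = 0;
static uint8_t ip1 = 0;

/* Assumed group numbers (verify against the datasheet). */
#define GROUP_UART1 3u   /* assumed: URX1 / T3 / UTX1 */
#define GROUP_P1INT 4u   /* assumed: ENC / T4 / P1INT */

/* Set the 2-bit priority level (0..3) of one interrupt group. */
static void setGroupPriority(unsigned group, unsigned level)
{
    uint8_t mask;
    assert(group < 6u && level < 4u);
    mask = (uint8_t)(1u << group);
    if (level & 1u) { ip0 |= mask; } else { ip0 &= (uint8_t)~mask; }
    if (level & 2u) { ip1 |= mask; } else { ip1 &= (uint8_t)~mask; }
}

/* Read back the level of a group. */
static unsigned getGroupPriority(unsigned group)
{
    assert(group < 6u);
    return ((ip0 >> group) & 1u) | (((ip1 >> group) & 1u) << 1);
}

/* Hypothetical library init that sets its own interrupt priority. */
static void fakeUartInit(void)
{
    setGroupPriority(GROUP_UART1, 1u);
}

/* Application setup: run the library init first, then apply the
   application's priorities so they are the ones that stick.
   Example values only: equal levels do not preempt each other. */
static void appSetup(void)
{
    fakeUartInit();
    setGroupPriority(GROUP_UART1, 2u);
    setGroupPriority(GROUP_P1INT, 2u);
}
```

The point is only the ordering: any priority written before the init call is lost if the init writes the same bits, and calling the init again later (for example to reset a hung UART) would overwrite the application's setting once more.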

DavidEGrayson · April 21, 2013, 6:25am

Hello.

I am glad you were able to figure it out. There are some weird pitfalls with using interrupts in SDCC. Section 3.9.2 says:

SDCC Manual:

If the interrupt service routine is defined without __using a register bank or with register bank 0 (__using 0), the compiler will save the registers used by itself on the stack upon entry and restore them at exit, however if such an interrupt service routine calls another function then the entire register bank will be saved on the stack. This scheme may be advantageous for small interrupt service routines which have low register usage.

If the interrupt service routine is defined to be using a specific register bank then only a, b, dptr & psw are saved and restored, if such an interrupt service routine calls another function (using another register bank) then the entire register bank of the called function will be saved on the stack. This scheme is recommended for larger interrupt service routines.

SDCC’s treatment of bank 0 sounds like it is exactly what we want in order to be safe. Each interrupt should save the registers it uses to the stack and restore them afterwards. This is why we use bank 0 for all the interrupts in the Wixel SDK. If all the interrupts in your system use bank 0, there should not be a problem with registers getting clobbered. If you have two interrupts using the same non-zero bank and they have different priority levels, that could cause interrupts to step on each other.

There are also unavoidable atomic access issues, but that should only cause a problem if two interrupts are trying to access the same variable or special function register. Was that happening in your system?
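
To illustrate the kind of atomic access problem meant here, the sketch below replays a non-atomic read-modify-write by hand on a PC. The "interrupt" is just a function called between the load and the store, which is the unlucky timing that loses an update on real hardware. Everything in it (`flags`, `fakeIsr`, the step-by-step split) is a made-up model, not code from the Wixel SDK.

```c
#include <stdbool.h>
#include <stdint.h>

static volatile uint8_t flags = 0;

/* Stand-in for an ISR that sets its own bit in the shared variable. */
static void fakeIsr(void)
{
    flags |= 0x40;
}

/* "flags |= 0x80" split into the load / modify / store steps an 8-bit CPU
   may use. If interruptHere is true, the fake ISR runs between the load
   and the store, and its update is overwritten by the stale value. */
static void setBit7Unprotected(bool interruptHere)
{
    uint8_t tmp = flags;        /* load */
    if (interruptHere) { fakeIsr(); }
    tmp |= 0x80;                /* modify */
    flags = tmp;                /* store: clobbers bit 6 if the ISR ran */
}

/* Same update with interrupts "disabled": the fake ISR is held off until
   the store has completed, so both bits survive. */
static void setBit7Protected(bool interruptPending)
{
    uint8_t tmp = flags;
    tmp |= 0x80;
    flags = tmp;
    if (interruptPending) { fakeIsr(); }   /* runs after re-enabling */
}
```

Starting from `flags == 0`, the unprotected version ends with only bit 7 set, while the protected version ends with both bits set. In SDCC the protected form would typically be a short critical section around the access, for example using the `__critical` keyword.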

–David

fgonza2 · April 22, 2013, 10:51pm

Hi, I am using the same bank 0 on all interrupts, but with different priorities.

When P1INT or USBINT interrupts the UART ISRs, the UART never recovers and has to be reset.

As for atomic access, judging by the SDCC documentation it may be easier to hit, and harder to recover from, than it seems.

3.9.1.2 Common interrupt pitfall: non-atomic access
If the access to these variables is not atomic (i.e. the processor needs more than one instruction for the access and could be interrupted while accessing the variable) the interrupt must be disabled during the access to avoid inconsistent data.
Access to 16 or 32 bit variables is obviously not atomic on 8 bit CPUs and should be protected by disabling interrupts. You’re not automatically on the safe side if you use 8 bit variables though. We need an example here: f.e. on the 8051 the harmless looking "flags |= 0x80;" is not atomic if flags resides in xdata. Setting "flags |= 0x40;" from within an interrupt routine might get lost if the interrupt occurs at the wrong time. "counter += 8;" is not atomic on the 8051 even if counter is located in data memory.
Bugs like these are hard to reproduce and can cause a lot of trouble.
3.9.1.3 Common interrupt pitfall: stack overflow
The return address and the registers used in the interrupt service routine are saved on the stack so there must be sufficient stack space. If there isn’t variables or registers (or even the return address itself) will be corrupted. This stack overflow is most likely to happen if the interrupt occurs during the "deepest" subroutine when the stack is already in use for f.e. many return addresses.
3.9.1.4 Common interrupt pitfall: use of non-reentrant functions
A special note here, int (16 bit) and long (32 bit) integer division, multiplication & modulus and floating-point operations are implemented using external support routines. If an interrupt service routine needs to do any of these operations then the support routines (as mentioned in a following section) will have to be recompiled using the --stack-auto option and the source file will need to be compiled using the --int-long-reent compiler option.
Note, the type promotion required by ANSI C can cause 16 bit routines to be used without the programmer being aware of it. See f.e. the cast (unsigned char)(tail-1) within the if clause in section 3.13.2.
Calling other functions from an interrupt service routine is not recommended, avoid it if possible. Note that when some function is called from an interrupt service routine it should be preceded by a #pragma nooverlay if it is not reentrant. Furthermore nonreentrant functions should not be called from the main program while the interrupt service routine might be active. They also must not be called from low priority interrupt service routines while a high priority interrupt service routine might be active. You could use semaphores or make the function critical if all parameters are passed in registers.
Also see section 3.8 about Overlaying and section 3.11 about Functions using private register banks.
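
The 16-bit case from 3.9.1.2 can be replayed on a PC in a similar way. The sketch below stores a counter as two separate bytes, the way an 8-bit CPU sees it, and calls a fake ISR between the two byte reads. It also shows one alternative to disabling interrupts, a read-and-retry loop; that is a swapped-in technique for illustration, not what the SDCC manual recommends, and all names are made up.

```c
#include <stdbool.h>
#include <stdint.h>

/* Shared 16-bit counter stored as two bytes, starting at 0x00FF. */
static volatile uint8_t counterLow  = 0xFF;
static volatile uint8_t counterHigh = 0x00;

/* Stand-in for an ISR that increments the 16-bit counter. */
static void fakeIsrIncrement(void)
{
    uint16_t v = (uint16_t)(((uint16_t)counterHigh << 8) | counterLow);
    v++;
    counterLow  = (uint8_t)(v & 0xFFu);
    counterHigh = (uint8_t)(v >> 8);
}

/* Byte-by-byte read. If the fake ISR runs between the two byte reads,
   the result mixes an old low byte with a new high byte. */
static uint16_t readUnprotected(bool interruptHere)
{
    uint8_t low = counterLow;
    uint8_t high;
    if (interruptHere) { fakeIsrIncrement(); }
    high = counterHigh;
    return (uint16_t)(((uint16_t)high << 8) | low);
}

/* Read high, then low, then check high again; retry if it changed. */
static uint16_t readConsistent(bool interruptHere)
{
    uint8_t high, low;
    do {
        high = counterHigh;
        if (interruptHere) { fakeIsrIncrement(); interruptHere = false; }
        low = counterLow;
    } while (high != counterHigh);
    return (uint16_t)(((uint16_t)high << 8) | low);
}
```

Starting from 0x00FF, the unprotected read with an interrupt in the middle returns 0x01FF, a value the counter never held, while the retry version returns 0x0100.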

fgonza2 · May 9, 2013, 11:42pm

I hit this problem again. There is definitely an issue with interrupts and the SDK UART libraries. If you use both UARTs at the same time, the one with the lowest interrupt priority will stop working and has to be reset. I have the two UARTs running simultaneously, sending and receiving a stream of data (at 9600 bps).

I will now move on and rewrite them using DMA, as I need to use both for my application.