
Digibutler hangs

The topic on number crunching

Postby ponedelnik » Sun Jun 12, 2011 12:00 am

I bought a couple of Digibutlers in 2008 with the aim of developing a home-control application (mainly heating control and regulation). Other activities quickly got in the way, though, and only recently have I been able to put substantial time into this old project.

Once I had a Digibutler working (of course, I ran into a number of the problems discussed elsewhere in this forum), I connected it to my local Ethernet and tried to access it from each of the four PCs on the network. Simultaneous connections requesting the variables.htm page (which reloads itself quite frequently) quickly caused the device to hang, with a reboot being the only way out.

I wonder whether this problem has already been discussed elsewhere in this forum, because I guess it can hardly have gone unnoticed for two years. Anyway, I hope this contribution will prove useful to a few readers.

I have to modify the software substantially, so a deep dive into its guts was unavoidable. During my explorations, I came across the following code snippets in module freescale_http_server.c:

1. Function: freescale_http_cmdcb (interrupt routine)

      case M_CLOSED:
         while( semaphore ){};
         semaphore = 1;
         freescale_http_remove( so ); // EMG - 3/31/06
         semaphore = 0;

2. Function: freescale_http_check (interruptible routine)

      while( semaphore ){};
      semaphore = 1;
      while(freescale_http_connection(so) == 0)
      {
        semaphore = 0;
        freescale_http_loop();
      }

Obviously, this semaphore was a last-minute patch to prevent the concurrent execution of two functions (session allocation and deallocation), which would have a good chance of leaving the session data structure in an inconsistent state.
But as implemented, the cure may well be worse than the disease:

 while( semaphore ){};

is an infinite loop whenever the semaphore is set; the only way out of it is an interrupt routine clearing the semaphore. If this loop is located in the interrupt routine itself, the story ends there.
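
To make the hang concrete, here is a minimal host-side sketch (not the Digibutler code). It assumes the call-back can be modelled as an ordinary function call made while the main code holds the semaphore. The names callback_wait, interrupted_in_critical_section and spin_limit are hypothetical, and the spin limit exists only so that the demonstration terminates; the original loop has no such exit.

```c
static volatile int semaphore = 0;

/* Mimics the wait at the top of the original M_CLOSED branch.
   spin_limit is a demo-only escape hatch (assumption, not in the original). */
static int callback_wait(long spin_limit)
{
    long spins = 0;
    while (semaphore) {
        if (++spins >= spin_limit)
            return 0;   /* on the real device this loop would never end */
    }
    return 1;           /* semaphore was free, the call-back may proceed */
}

/* Mimics the main loop being interrupted inside its critical section. */
static int interrupted_in_critical_section(void)
{
    int result;
    semaphore = 1;                     /* main code takes the semaphore       */
    result = callback_wait(1000000L);  /* "interrupt" fires before release    */
    semaphore = 0;                     /* reached only thanks to the limit    */
    return result;
}
```

Calling interrupted_in_critical_section() returns 0: the wait never succeeds, because the only code that could clear the semaphore is the code that was interrupted.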
One solution is to remove the infinite loop and, when the semaphore is set, to move the call of the freescale_http_remove function to the interruptible freescale_http_check function.
It is this routine that has to protect its critical code from an interrupt, so it alone manipulates the semaphore. When the interrupt (call-back) function finds the semaphore set, it writes the socket handle to a global variable and does nothing else. Once the interruptible routine has cleared the semaphore, it tests the global variable and, if it is set, performs the remove action and clears the variable.
But what if the interrupt routine executes more than once while the other one is still in its critical code? It would then overwrite the first handle, which would be lost, causing a resource leak that is fatal over time. (I tested for this condition, and it actually occurs.)
Hence, the global variable must have more than one slot, and can be implemented as a silo, first in, first out:
------------------
// Socket deferred closing silo data structure
#define STC_SIZE 4 // max number of sockets to close deferred
volatile M_SOCK socket_to_close[STC_SIZE];
int stc_ins = 0; // insert index
int stc_rem = 0; // remove index

volatile int semaphore = 0;
------------------


Here are the modifications to the two functions:

------------------
int freescale_http_cmdcb(int code, M_SOCK so, void * data)
{
  int e = 0;

  switch(code)
  {
    // socket open complete
    case M_OPENOK:
      msring_add(&emg_http_msring, so);
      break;

    // socket has closed
    case M_CLOSED:
      if (semaphore)
      {
        socket_to_close[stc_ins++] = so;
        if (stc_ins >= STC_SIZE) stc_ins = 0;
      }
      else freescale_http_remove( so );    // EMG - 3/31/06

      break; // let stale conn timer catch these

------------------

void freescale_http_check(void)
{
  M_SOCK so;
  if ( emg_http_server_socket == INVALID_SOCKET )
    return ;
  while(msring_del(&emg_http_msring, &so) == 0)
  {
    semaphore = 1;
    while(freescale_http_connection(so) == 0)
    {
      semaphore = 0;
      freescale_http_loop();
    }
    semaphore = 0;

    while (stc_rem != stc_ins)
    {
      freescale_http_remove((M_SOCK) socket_to_close[stc_rem++]);
      if (stc_rem >= STC_SIZE) stc_rem = 0;
      html_vars[17]++; // occurrence counter
    }
  }
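
One caveat with a silo of this kind: if STC_SIZE closes arrive before the main loop drains it, stc_ins wraps around to stc_rem, the silo looks empty, and those handles are lost. The sketch below shows one possible guard, under stated assumptions: M_SOCK is replaced by a plain int so that the code compiles on a PC, and stc_put, stc_get and stc_dropped are hypothetical names that do not exist in the Freescale sources. It keeps one slot free to tell full from empty, so it holds at most STC_SIZE - 1 handles.

```c
#define STC_SIZE 4                    /* slots; one is kept free (see above)     */

typedef int M_SOCK;                   /* assumption: stand-in for the real type  */

static volatile M_SOCK socket_to_close[STC_SIZE];
static volatile int stc_ins = 0;      /* written only by the call-back side      */
static volatile int stc_rem = 0;      /* written only by the main-loop side      */
static volatile int stc_dropped = 0;  /* hypothetical overflow counter           */

/* Call-back side: returns 1 if the handle was queued, 0 if the silo was full. */
static int stc_put(M_SOCK so)
{
    int next = stc_ins + 1;
    if (next >= STC_SIZE) next = 0;
    if (next == stc_rem) {            /* full: queuing would hide pending entries */
        stc_dropped++;
        return 0;
    }
    socket_to_close[stc_ins] = so;
    stc_ins = next;                   /* publish the index after the slot is set  */
    return 1;
}

/* Main-loop side: returns 1 and stores the oldest handle in *so, 0 if empty. */
static int stc_get(M_SOCK *so)
{
    int next;
    if (stc_rem == stc_ins) return 0;
    *so = socket_to_close[stc_rem];
    next = stc_rem + 1;
    if (next >= STC_SIZE) next = 0;
    stc_rem = next;
    return 1;
}
```

In the two functions, stc_put(so) would take the place of the insert in the M_CLOSED branch, and a loop of the form while (stc_get(&so)) would replace the drain loop. What to do when stc_put returns 0 (for example, leaving the socket to the stale-connection timer) is left open here. This guard is a suggestion only and is not part of the modifications described in this post.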

With these two functions modified, my Digibutler has worked for days without trouble. Let's now get on to real work!
ponedelnik
 
Posts: 10
Joined: Thu Jan 02, 2014 10:44 am

Postby ponedelnik » Sun Jun 12, 2011 12:00 am

Obviously, my efforts to format my message nicely did not produce the expected results. Here is an "unformatted" copy.
-----------------------------------------------------------------------------
I bought a couple of Digibutlers in 2008 with the aim of developing a home-control application (mainly heating control and regulation). Other activities quickly got in the way, though, and only recently have I been able to put substantial time into this old project.

Once I had a Digibutler working (of course, I ran into a number of the problems discussed elsewhere in this forum), I connected it to my local Ethernet and tried to access it from each of the four PCs on the network. Simultaneous connections requesting the "variables.htm" page (which reloads itself quite frequently) quickly caused the device to hang, with a reboot being the only way out.

I wonder whether this problem has already been discussed elsewhere in this forum, because I guess it can hardly have gone unnoticed for two years. Anyway, I hope this contribution will prove useful to a few readers.

I have to modify the software substantially, so a deep dive into its guts was unavoidable. During my explorations, I came across the following code snippets in module freescale_http_server.c:

1. Function: freescale_http_cmdcb (interrupt routine)
case M_CLOSED:
  while( semaphore ){ };
  semaphore = 1;
  freescale_http_remove( so ); // EMG - 3/31/06
  semaphore = 0;

2. Function: freescale_http_check (interruptible routine)
while( semaphore ){ };
semaphore = 1;
while(freescale_http_connection(so) == 0)
{
  semaphore = 0;
  freescale_http_loop();
}

Obviously, this semaphore was a last-minute patch to prevent the concurrent execution of two functions (session allocation and deallocation), which would have a good chance of leaving the session data structure in an inconsistent state.
But as implemented, the cure may well be worse than the disease:

while( semaphore ){ };

is an infinite loop whenever the semaphore is set; the only way out of it is an interrupt routine clearing the semaphore. If this loop is located in the interrupt routine itself, the story ends there.
One solution is to remove the infinite loop and, when the semaphore is set, to move the call of the freescale_http_remove function to the interruptible freescale_http_check function.
It is this routine that has to protect its critical code from an interrupt, so it alone manipulates the semaphore. When the interrupt (call-back) function finds the semaphore set, it writes the socket handle to a global variable and does nothing else. Once the interruptible routine has cleared the semaphore, it tests the global variable and, if it is set, performs the remove action and clears the variable.
But what if the interrupt routine executes more than once while the other one is still in its critical code? It would then overwrite the first handle, which would be lost, causing a resource leak that is fatal over time. (I tested for this condition, and it actually occurs.)
Hence, the global variable must have more than one slot, and can be implemented as a silo, first in, first out:

------------------
// Socket deferred closing silo data structure
#define STC_SIZE 4 // max number of sockets to close deferred
volatile M_SOCK socket_to_close[STC_SIZE];
int stc_ins = 0; // insert index
int stc_rem = 0; // remove index

volatile int semaphore = 0;
------------------

Here are the modifications to the two functions:

------------------
int freescale_http_cmdcb(int code, M_SOCK so, void * data)
{
  int e = 0;

  switch(code)
  {
    // socket open complete
    case M_OPENOK:
      msring_add(&emg_http_msring, so);
      break;

    // socket has closed
    case M_CLOSED:
      if (semaphore)
      {
        socket_to_close[stc_ins++] = so;
        if (stc_ins >= STC_SIZE) stc_ins = 0;
      }
      else freescale_http_remove( so ); // EMG - 3/31/06

      break; // let stale conn timer catch these

------------------

void freescale_http_check(void)
{
  M_SOCK so;
  if ( emg_http_server_socket == INVALID_SOCKET )
    return ;
  while(msring_del(&emg_http_msring, &so) == 0)
  {
    semaphore = 1;
    while(freescale_http_connection(so) == 0)
    {
      semaphore = 0;
      freescale_http_loop();
    }
    semaphore = 0;

    while (stc_rem != stc_ins)
    {
      freescale_http_remove((M_SOCK) socket_to_close[stc_rem++]);
      if (stc_rem >= STC_SIZE) stc_rem = 0;
      html_vars[17]++; // occurrence counter
    }
  }

With these two functions modified, my Digibutler has worked for days without trouble. Let's now get on to real work!

Postby ponedelnik » Thu Jul 14, 2011 12:00 am

What I posted above are code snippets, which means that the code not shown is unchanged. Just to avoid any confusion, here is the full code of the modified functions:

//---------------------------------------------------------------
int freescale_http_cmdcb(int code, M_SOCK so, void * data)
{
  int e = 0;

  switch(code)
  {
    // socket open complete
    case M_OPENOK:
      msring_add(&emg_http_msring, so);
      break;

    // socket has closed
    case M_CLOSED:
      if (semaphore)
      {
        socket_to_close[stc_ins++] = so;
        if (stc_ins >= STC_SIZE) stc_ins = 0;
      }
      else freescale_http_remove( so );  // EMG - 3/31/06
      break; // let stale conn timer catch these

    // passing received data
    // blocked transmit now ready
    case M_RXDATA:          // received data packet, let recv() handle it
    case M_TXDATA:          // ready to send more, loop will do it
      e = -1;               // return nonzero code to indicate we dont want it
      break;

    default:
      dtrap();              // not a legal case
      return 0;
  }

  TK_WAKE(&to_emghttpsrv);  // wake server task
  USE_VOID(data);
  return e;
}

//---------------------------------------------------------------
void freescale_http_check(void)
{
  M_SOCK so;

  if ( emg_http_server_socket == INVALID_SOCKET )
    return ;

  while(msring_del(&emg_http_msring, &so) == 0)
  {
    semaphore = 1;
    while(freescale_http_connection(so) == 0)
    {
      semaphore = 0;
      freescale_http_loop();
    }
    semaphore = 0;

    while (stc_rem != stc_ins)
    {
      freescale_http_remove( (M_SOCK) socket_to_close[stc_rem++] );
      if (stc_rem >= STC_SIZE) stc_rem = 0;
    }

#if(MAX_NUMBER_OF_SESSIONS>0)
    m_ioctl(so, SO_NONBLOCK, NULL);   /* make socket non-blocking */
#endif
  } // while (msring_del...

  freescale_http_loop();
}

Postby shorty3 » Thu Sep 01, 2011 12:00 am

Thank you for your valuable bug fix. I am also surprised that this problem did not come up in the forum earlier.
However, in my current project (a shutter control) I use the transfer of HTML vars extensively (18 vars), and the server still gets stuck after some days. I also noticed that if I stay continuously connected to the Digibutler, the server hangs much sooner (in less than 12 hours).
This happens without invoking the variables.htm page, just while getting real-time data on the main page.
Unfortunately, it seems that some internal memory is being overwritten, because before the server actually stops communicating via Ethernet, the debugging output on RS232 becomes corrupted and shows only cryptic symbols.
I did not modify the HTTP server (apart from the modification proposed in this thread).
I am planning to do some further tests this weekend, but maybe someone can give me a hint as to where the problem might be located.

regards

Werner
shorty3
 
Posts: 7
Joined: Thu Jan 02, 2014 10:44 am

