a.yearsley

Stack Overflow Program Fault - RSLogix5000

11 posts in this topic

I keep getting a major fault on a CompactLogix L43 CPU that I can't get to the bottom of.

Fault details:
Major fault Type 04 - Program fault
Code 84 - Stack overflow. Stack too small to perform operation

It always occurs on the same rung, where I am calling an AOI that is used many times in the program. The manual says to check for too much nesting of sub-routines etc. The total call stack for the rung in question is Task > Program > JSR > AOI. There are plenty of other places with more nesting involved.

The CPU doesn't seem to be too heavily loaded. I'm running 2 periodic tasks (10ms update with ~2-3ms execution time and 30ms update with ~12ms execution time). On the comms side I have 1x Ethernet/IP PointIO rack (20ms RPI) and 1x SCADA (1000ms update rate with ~1000 tags).

Is there any way I can get a full call stack for the CPU? Any ideas as to how I can resolve the fault?

can you post the entire ACD program file? ... if not, I'd try deactivating rungs one at a time in the subroutine that you mentioned until the problem goes away ... the error message that you've mentioned usually is caused by a JSR instruction "calling itself" so that an infinite loop develops ... after about 34 "jumps" the processor declares a "stack overflow" fault ...

By de-activating the rung it kept faulting on ... the same fault just occurs, but in an entirely different program, calling a different AOI. There are definitely no circular references within these AOIs, and each of the AOIs is pretty simple, without any loops or anything like that.

I've mucked around with the task parameters and noted that the frequency of the fault increases with the update rates of the tasks (i.e. if I call the tasks at a faster rate, it faults more often). I've also noted that if I inhibit the 'fast' task then I don't get the stack overflow fault at all.

It'd be great to have a stack/call trace or something similar so I can see for sure whether it is the interplay between the two tasks that is causing this fault.

once again - can you post the entire ACD program file? ... also (if you can't post the file) what are the priorities for your tasks? ...

Sorry - I'm unable to post the ACD file.

Priorities: Fast task = 9, Slow task (the one where the fault is occurring) = 10.

Initially I had them at the same priority (10). I then gave the fast task higher priority (9) and continued to get these stack overflow faults. If I swap the priorities (i.e. slow task is higher priority) then the frequency of stack overflows reduces significantly ... but I get task overruns on the fast task (not particularly surprised about that).

A stack overflow is 99% of the time caused by calling too many nested routines, or by a circular reference. I suppose it MIGHT be possible if you are calling AOIs with huge numbers of IN and/or OUT parameters, but I doubt it. Why the JSR->AOI? AOIs are designed to be used inline, not like separate subroutines.

It seems that 'Stack Overflow' is also linked to CPU load. If I significantly reduce the load on the PLC (decrease the update rates of the tasks) then these faults disappear. I haven't quite figured out the tipping point for my application. I think I start to get stack overflow faults at around 50-60% CPU loading.

I probably wasn't clear but ... "JSR > AOI" is me trying to list the entire path to the rung that faulted. Within that sub-routine there are many lines of code, many AOIs used, etc. (i.e. the 'main' routine of the program just calls a set of sub-routines where all the actual code resides. Just a way to logically organise the code). Hope that makes sense.
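For a rough feel of the task load being discussed, here is a back-of-envelope sketch in Python using the figures quoted in the first post (10 ms period with ~2-3 ms execution, taken here as 2.5 ms, and 30 ms period with ~12 ms execution). The 2.5 ms midpoint and the `utilization` helper are assumptions for illustration only; comms and system overhead are not included, so the real figure would be higher.

```python
# Rough task-load estimate from the figures quoted in the first post.
# Assumption: 2.5 ms is used as the midpoint of the "~2-3 ms" fast-task time.
# Comms and system overhead are ignored, so this is a lower bound.

def utilization(exec_ms: float, period_ms: float) -> float:
    """Average fraction of CPU time used by one periodic task."""
    return exec_ms / period_ms

# name -> (execution time in ms, period in ms)
tasks = {
    "fast task (10 ms period)": (2.5, 10.0),
    "slow task (30 ms period)": (12.0, 30.0),
}

total = 0.0
for name, (exec_ms, period_ms) in tasks.items():
    u = utilization(exec_ms, period_ms)
    total += u
    print(f"{name}: {u:.0%}")

print(f"total task load: {total:.0%}")
```

With these numbers the two tasks alone come to roughly 65%. Also worth noting: the slow task needs about 12 ms of its own execution time while the fast task fires every 10 ms, so when the fast task has the higher priority, every slow scan is interrupted at least once.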

It isn't decreasing the update rates that is fixing the problem (directly) then; it is that you have too many periodic tasks with short periods, and they are interrupting too often. How many periodic tasks do you have? And at what rate?

In general, I have one fast PT (10 to 25 ms), and do all the fast stuff in there. If I have a task that should run every 100ms, I call it every fourth pass through the fast PT. Then I'll have a slow PT (250 to 1000ms) for housekeeping functions or slow PID loops. Even there, I'll often break up my calls to the PID updates so I only do a handful of loops each time through.
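The "every fourth pass" idea above can be sketched in Python as a simple scan divider. This is only an illustration of the counting pattern, not Logix code; `ScanDivider`, `slow_logic` and the 25 ms task period are hypothetical names and values chosen for the example.

```python
class ScanDivider:
    """Runs a callback once every `divisor` passes of the calling task."""

    def __init__(self, divisor, callback):
        self.divisor = divisor
        self.callback = callback
        self.count = 0

    def tick(self):
        """Call once per pass of the fast periodic task."""
        self.count += 1
        if self.count >= self.divisor:
            self.count = 0
            self.callback()


slow_runs = []   # records the fast-task pass numbers on which the slow logic ran
pass_no = 0


def slow_logic():
    # Stand-in for the logic that should run every 100 ms.
    slow_runs.append(pass_no)


divider = ScanDivider(4, slow_logic)

# Simulate 12 passes of a 25 ms fast task (300 ms in total).
for pass_no in range(1, 13):
    # ... fast logic would run here on every pass ...
    divider.tick()

print(slow_runs)  # [4, 8, 12]
```

Because the slower logic runs inside the fast task rather than in a task of its own, it can never be interrupted by that fast task, which is the point of the approach.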

Interrupts that interrupt a large AOI can be the problem. If the interrupting task also calls an AOI, then two AOI stack states must be kept on the stack. In other words, interrupts may be the problem. You may also need to check for AOIs calling other AOIs.

Just resolved the issue this morning.

The issue was that I had 1x instance of an AOI called multiple times throughout the program (effectively using it as a 'Function' rather than a 'Function block'). At some point the AOI was being called in the slower task and then, whilst that AOI was still being processed, the slower task was interrupted by the faster (higher priority) task ... which then tried to call that same instance of the AOI. This overlap causes the major fault (not surprisingly).

Solution - declare a unique instance of the AOI for each task. Use an instance of an AOI exclusively within 1 task only.

Thanks to all who replied for your insights.
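To illustrate why sharing one instance between two tasks is risky, here is a small Python sketch. It is only an analogy: it shows instance data being overwritten when a call is interrupted part-way through, and does not reproduce the controller's actual stack overflow mechanism. `ScaleAOI`, `finish` and `slow_task_interrupted` are hypothetical names made up for the example.

```python
class ScaleAOI:
    """Hypothetical stand-in for one AOI instance and its backing data."""

    def __init__(self):
        self.scratch = 0  # working data stored in the instance

    def run(self, value):
        """Generator, so a call can be paused part-way like a preemption."""
        self.scratch = value      # step 1: store the input in instance data
        yield                     # a higher-priority task may run here
        return self.scratch * 2   # step 2: compute from instance data


def finish(call):
    """Drive a paused or new call to completion and return its result."""
    try:
        while True:
            next(call)
    except StopIteration as stop:
        return stop.value


def slow_task_interrupted(slow_inst, fast_inst):
    slow_call = slow_inst.run(10)
    next(slow_call)                          # slow task does step 1, then is preempted
    fast_result = finish(fast_inst.run(1))   # fast task runs its whole call
    slow_result = finish(slow_call)          # slow task resumes at step 2
    return slow_result, fast_result


shared = ScaleAOI()
print(slow_task_interrupted(shared, shared))          # (2, 2): slow result is wrong
print(slow_task_interrupted(ScaleAOI(), ScaleAOI()))  # (20, 2): one instance per task
```

With a shared instance the interrupted call resumes with the other task's data; with one instance per task, each call sees only its own.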

I hope this is not correct as it destroys the concept of code reuse, especially if you have a lot of little AOI utility functions...which should become more common because that's what you typically see with Siemens programmers. I would strongly encourage you to direct your inquiry to AB's tech support so that hopefully if there really is something going on here which is in fact a bug, they will find it and fix it. Edited by paulengr
