gromit

PLC5 Lost Serial Data

15 posts in this topic

Gents, I have found an application that periodically loses serial data from the foreign device to the PLC. A pulse is recognized in the PLC, but when it goes false the PLC never recognizes it, or never receives the transition to false. What is a common, "best-practices" strategy to address this issue?

Does that data come in at a constant speed? If so, then look at the port activity: if you lose data, your comms port is down. Then you have to work out which device is down, the PLC or device X. I might use a third-party device as a buffer if the data is critical and you can't afford to lose it.

Well... I know that one way to resolve this issue is to use the done bit of a TON timer to reset the lost/left-behind bit, but is there a more efficient solution?

How consistently does that data come into the port? Is the data duplicated, or is every input different?

I'm not exactly sure how to answer, other than to say that the Ethernet communication port is 10 Mbps half duplex. Also, the PLC is an SLC 5/05 rather than a PLC-5. The digital bits that get lost or left behind are packed in two 16-bit integer words, N7:0 and N7:1. Rather than focusing on protocols and bandwidth specs, I would like this question to address how the PLC logic should manage critical pulsed digital bits. For instance, the DCS pulses a bit true for five seconds and then back to false. It is sent to the PLC via a 3rd-party serial interface. At times the true-to-false transition is not received at the PLC, and therefore the value is not reset to zero. So, other than using a TON timer done bit to turn off the bit received from the DCS after a short delay, how else could this be achieved? Consider the logic listed below:
XIC N7:0/0 BST TON T4:0 1.0 5 0 NXB XIC T4:0/DN OTU N7:0/0 BND
Thanks for any feedback.
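
For readers who want to see the scan-by-scan behaviour of the timer rung in the post above, here is a rough Python simulation. It is only a sketch under stated assumptions: `Ton`, `scan` and `run_stuck_bit` are made-up names, the timer is modelled as a simple accumulator driven by an assumed fixed scan time `dt_s`, and the DCS is assumed to write the bit only on change (it does not keep rewriting the 1).

```python
from dataclasses import dataclass


@dataclass
class Ton:
    """Simplified stand-in for a TON timer: accumulates while enabled, resets otherwise."""
    preset_s: float
    acc_s: float = 0.0
    done: bool = False

    def update(self, enabled: bool, dt_s: float) -> None:
        if enabled:
            self.acc_s = min(self.acc_s + dt_s, self.preset_s)
            self.done = self.acc_s >= self.preset_s
        else:
            self.acc_s = 0.0
            self.done = False


def scan(bit: bool, timer: Ton, dt_s: float) -> bool:
    """One scan of the rung; returns the new state of the command bit."""
    timer.update(bit, dt_s)      # XIC N7:0/0  TON T4:0
    if bit and timer.done:       # XIC N7:0/0  XIC T4:0/DN
        bit = False              # OTU N7:0/0
    return bit


def run_stuck_bit(preset_s: float = 5.0, dt_s: float = 0.5, scans: int = 20) -> list[bool]:
    """The DCS writes a 1 once and the 0 never arrives; record the bit after each scan."""
    timer = Ton(preset_s)
    bit = True
    history = []
    for _ in range(scans):
        bit = scan(bit, timer, dt_s)
        history.append(bit)
    return history
```

With the assumed 5 s preset and 0.5 s scan time, the stuck bit stays true for nine scans and is unlatched on the tenth; the timer then resets on the following scan because the rung has gone false.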

That's exactly the logic I use with a particular DCS serial interface I work with. The DCS doesn't have the ability to write a "momentary" value to the PLC. Instead it writes a "1" and the PLC sets the value to 0 after 2 seconds. Of course we need separate commands for "start" and "stop" functions. Is your goal to correct instances where the DCS writes a 1 but fails to write the following 0, so that later "0 -> 1" transitions are seen properly by the PLC? Or is this a watchdog pulse? It seems this logic would defeat a watchdog function.

Thanks for your reply, Ken. I'm not using it as a watchdog pulse. Rather, I am using it to reset the pulsed bits from the DCS when the PLC occasionally misses the true-to-false transition and the bit is left stuck true. I was hoping for a more efficient solution, but I believe the logic that I posted will work.

I'm very curious about your statement that the PLC fails to get the True-to-False message. Are there times when the PLC also fails to get the False-to-True message, or will that not show up because the DCS operators keep "pushing the button" if the desired action fails to take place? I would be amazed if the network or controller could malfunction in such a way that the data payload value actually made a difference to the success of the message transaction.

I'd suggest you take a closer look at what you have and look for dropped data. Here's how I envision this system:
1. A computer (aka PC) running some flavor of DCS program, which gives you a graphic of what it thinks it is doing.
2. A third-party serial interface program that makes the connection between your DCS and the SLC data registers.
3. The SLC, with a serial port and an Ethernet port.
Now, what I imagine happens is that the DCS program sends the datagram with the true-to-false transition, and the 3rd-party piece acknowledges to the DCS: "I've got it and am sending your information." The DCS "hears" that the info was sent, is fat, dumb and happy, and shows on the GUI that it sent a T to F. Now you send a T-to-F datagram to the 3rd-party device. It says: "I received your data and will send it when I'm done with the F to T you asked me to send." Message queued. And that's where the fun begins, I think. Sorry, no answer, but I hope my early-morning/late-night rambling shed some light.

Ken, I can't be sure if the PLC misses any False-to-True transitions. If so, the operator will most likely continue to press the DCS push button until the sequence advances, as you suggested. I too was amazed the first time that I witnessed a malfunction such as this. It was on a TMR (Triple Modular Redundant) PLC communicating with a Wonderware InTouch HMI application. The False-to-True stop pulse for a gas turbine auxiliary lube oil pump was received by the PLC, and it never reset to false after the short delay. Days, weeks, or months later, when the primary lube oil pump tripped, we came dangerously close to trashing the gas turbine before identifying and rectifying the issue. Wonderware tech support suggested that this dysfunction must be shored up in the PLC application rather than relying on the HMI. That was the first time I witnessed this lost or left-behind pulse data. The second time was the current issue, in which the duct burner lightoff pulsed command was never reset. That caused an unintentional lightoff sequence, complete with gas double block and bleeds and a spark from the ignitor.

DCSs are never mission critical, and never have been, no matter how many layers of redundancy you add. There are reams of data on this. Modern PLCs are rapidly driving down the reliability issue, with TÜV ratings to back it up. There are no IEC 61508-rated DCSs. If you want a reliable system (not to mention one for 10% of the cost), throw the DCS in the trash.

It sounds like you are using a change-of-state interface. Remember... HMIs (and some DCSs) operate with "change of state" or event-based semantics. PLCs are generally polling devices and constantly receive updates as to state. So if something goes wrong with one message to a PLC, it will get the updated state on the next scan. On a change-of-state system, if the state change gets garbled, you are out of luck. The solution is to introduce a combination of handshaking and/or atomic states.
For example, many HMIs support the semantics of a "momentary button". This is extremely dangerous because the actual implementation is something else entirely. It requires two separate events to occur... a "1" followed by a "0". If there is any sort of queue in between, the queue may receive both packets (if the message is delayed) and simplify them to just one state transition, or maybe none. In a similar vein, if something happens to communications along the way, you can get a "stuck" button, which is frequently even worse!
The solution that I always implement is that the "event-based" system sends just ONE signal, a change of state for "button pressed". The PLC takes action when it receives the message and resets the bit. In other words, I never, ever rely on the HMI to send a "0". It's a momentary button after all, so there's no value in receiving that second state in the first place.
Most PC systems take this one step further. Usually they do nothing for "mouse down" events and only interpret "mouse up" events as a "click". So if you click something accidentally and then realize it's a mistake, you can drag off the button without releasing the mouse and get out of making a mistake. It also naturally performs something of a "debounce" function.
If you really need reliable communication, it is best to use similar kinds of mechanisms. For instance, two counters (or sometimes 3) work great. The sender increments a counter. The receiver notes that the counter changed value and then updates an "acknowledge" counter by simply copying the sender's counter into it. Adding watchdog timers to this finishes out the system and provides a way to detect failed communication. If the watchdog timers expire on failed communication, you should go to the fail-safe state.
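
The counter, acknowledge and watchdog idea described above could look roughly like the following on the receiving side. This is a hedged Python sketch, not anyone's actual implementation: the names `Receiver`, `next_count` and `demo`, the 16-bit counter width, and a watchdog measured in scans rather than seconds are all assumptions made for illustration.

```python
from dataclasses import dataclass

COUNTER_MODULUS = 1 << 16  # assumption: counters live in 16-bit words


def next_count(value: int) -> int:
    """Sender side: increment a counter with 16-bit rollover."""
    return (value + 1) % COUNTER_MODULUS


@dataclass
class Receiver:
    """PLC-side view of a command counter, an acknowledge counter and a heartbeat."""
    watchdog_limit_scans: int
    ack_counter: int = 0
    last_heartbeat: int = 0
    scans_since_heartbeat: int = 0
    fail_safe: bool = False
    actions_taken: int = 0

    def scan(self, cmd_counter: int, heartbeat: int) -> None:
        # Watchdog: the heartbeat must keep changing, else comms is presumed dead.
        if heartbeat != self.last_heartbeat:
            self.last_heartbeat = heartbeat
            self.scans_since_heartbeat = 0
        else:
            self.scans_since_heartbeat += 1
            if self.scans_since_heartbeat >= self.watchdog_limit_scans:
                self.fail_safe = True

        if self.fail_safe:
            return  # ignore commands; fault reset is not modelled here

        # A changed counter is the event; copying it back is the acknowledge.
        if cmd_counter != self.ack_counter:
            self.actions_taken += 1
            self.ack_counter = cmd_counter


def demo() -> Receiver:
    plc = Receiver(watchdog_limit_scans=3)
    cmd, hb = 0, 0
    hb = next_count(hb)
    plc.scan(cmd, hb)              # idle, heartbeat alive
    cmd, hb = next_count(cmd), next_count(hb)
    plc.scan(cmd, hb)              # one command arrives
    hb = next_count(hb)
    plc.scan(cmd, hb)              # same counter seen again: no repeat action
    for _ in range(3):
        plc.scan(cmd, hb)          # link dies: heartbeat stops changing
    return plc
```

In `demo()` the single command is acted on exactly once, however many times the same counter value is re-read, and the receiver drops to its fail-safe state after three scans without a heartbeat change. The sender can compare its own counter with the acknowledge counter to confirm the command was consumed.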

Thanks, Paul. How can I determine if the system is event-based or atomic? Can you reference a white paper that elaborates on how to manage this? So, you would never use the "momentary button" element offered by many HMI packages, such as Wonderware, Citect, RSView32, Intellution, or even from a DCS or other foreign device? I am familiar with the watchdog timer approach but believe it to be overkill for a serial communication malfunction. I am interested in the counters approach. Does this approach use a set of counters for each point, or somehow for an entire packet? Short of bringing the system to a safe stop, how would I then determine which bit did not get reset, in order to know which one to toggle off? I have more questions, but they might be answered by reviewing a reference document which expands on this. Thanks for your detailed response.

Each "bit" must be a distinct, separate communication, even if you waste a few bits by signalling a single state change with a separate packet. In practice it's quite easy to implement and consists of just two rungs of logic in the PLC:
<do the normal thing that you are used to doing with a push button>
<if the "glass push button" bit is set, clear it>
That works for operators pushing buttons on HMIs. If it's an action where you need reliable communication (with fail-safe states), there are a couple of possibilities. One is that the remote process increments a counter. The PLC simply maintains a local cached copy, and when the local copy != the counter, it detects that a change happened. There is error checking because if the counter increments by more than "1" at a time, obviously there was a communication failure along the way, but I rarely implement this possibility.
With just a counter, it is difficult to detect "no communication". So either you have a second counter which must periodically update (monitor it with a timer, and move to the fail-safe state if it doesn't change), known as a "heartbeat", or you can sometimes rely on the communication mechanism itself for failure messages (this only works for the process executing a write, and even then it misses certain failures, such as the remote end crashing).
All this being said, I'm a huge fan of distributed I/O and HMIs. At least with these, you get the choice of how to deal with failures and you get lots of diagnostic information. And your "vulnerability surface" (the number of ways things can go wrong) decreases because instead of hundreds of wires going back to a central PLC cabinet, you have just a few network cables. Losing one of them causes far more widespread problems, but the number of faults is limited in size and extent, and they can usually be repaired quicker since most faults occur at the end-points (devices) anyways.
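
As a loose illustration of the two ideas in the post above (act on the "glass push button" bit and clear it in the same scan, and compare a remote counter against a local cached copy), here is a small Python sketch. The names `scan_glass_button` and `CounterWatcher`, the callback standing in for the normal push-button logic, and the 16-bit rollover are assumptions, not part of the original description.

```python
COUNTER_MODULUS = 1 << 16  # assumption: the counter is a 16-bit word


def scan_glass_button(button_bit: bool, action) -> bool:
    """Rung 1: act on the bit. Rung 2: clear it. Returns the bit's new value."""
    if button_bit:
        action()            # rung 1: the normal push-button logic
        button_bit = False  # rung 2: never wait for the HMI to send a 0
    return button_bit


class CounterWatcher:
    """Detects remote events by comparing a counter with a local cached copy."""

    def __init__(self, initial: int = 0) -> None:
        self.cached = initial

    def scan(self, remote: int) -> tuple[int, bool]:
        """Return (events_since_last_scan, gap_detected)."""
        events = (remote - self.cached) % COUNTER_MODULUS
        self.cached = remote
        # More than one step at a time suggests an update was lost on the way.
        return events, events > 1
```

A step of more than one is treated here as a gap, in line with the post; a sender that can legitimately increment faster than the receiver scans would trip the same check, so the threshold is a design choice rather than a rule.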

Thanks Paul. Much to digest. Very detailed and informative.

