scalvert

NA5 screen Crashes

15 posts in this topic

Hello all, I'm new to the Sysmac Studio software and I'm having an issue with the HMI screens crashing.
I have a screen that shows the active alarms. I do this with the "User Alarm Viewer" tool and by adding Groups under the "User Alarms".

I have 42 different alarm groups and each one has its own tag that the expression looks at to trigger the alarms. I have made sure that I don't have any overlapping tags in any of the groups, and each tag array is larger than or equal to the size of the group associated with it. All the tags in the HMI global variables match the size of the ones in the PLC global variables.

The crashing issue only happens when I'm on the Alarm screen, but it doesn't happen right away. Sometimes it takes 5 minutes and sometimes longer.

Has anyone seen this, and does anyone know how to fix it or even know what I should be looking for?

I can't attach any images. I also posted this on another forum but haven't gotten any responses.

 

Thanks for any help.


42 alarm groups?  Why so many?  How many alarms in each group?  Are the triggers numeric or binary?

I've been using NA for years and never had an issue with the alarm viewer object.


Thanks for the response.

Each robot, tool, and welder was put into its own group (just following the customer's programming example).  I was actually seeing screen delays when going from screen to screen too, but I have since placed all the robots into one group, the tools into their own group, and so on.

 

The groups vary in size: most are 16 bits, some are 64, 8, or 32.  They are triggered by a binary bit.

 

On a separate note, this might be part of the issue.  The HMIs and robots are communicating over EtherNet/IP, and we just noticed that the robots are sending packets to the PLC and to a random IP address that is non-existent.  Each robot is sending to a different IP address (none are sending to the same IP).  We are noticing network issues, as in our robots just stop sending I/O data and cause E-Stop faults.  I'm wondering if this could be causing the screens to crash.  We are looking into the robot side to see if we can fix it.

 

I'm not sure whether I'm allowed to post links to other forums, but if you go here ( http://www.plctalk.net/qanda/showthread.php?t=129283 ) you can see the images of the error message that pops up when it crashes.

 

 


Just trying to understand how your system is set up, so bear with me.

You have 42 alarm groups on an NA series HMI screen.

You say each group varies in size. Do you mean the quantity of alarms in each group (i.e. 16 bit = 16 alarms)?

Each group is 'triggered' by a binary bit. So does this mean that all the alarms in 1 group have a single bit that triggers all of them at once? Or does each alarm have its own trigger?

You mention tags and tag arrays in your first post. What do you mean by these? Is this like an array of bits for each group (i.e. Array A-Bit 0 triggers Group A-Alarm 0, Array A-Bit 1 triggers Group A-Alarm 1 and so on), with you setting the values for these bits elsewhere in Sysmac Studio?

Are you able to post a screenshot of one of the Alarm Groups in Sysmac Studio?

Roughly how many alarms in total are you working with? (The easiest way to find this out is to export them all to Excel: right-click "User Alarms" and click Export.)

A project I recently finished uses about 12 different alarm groups, with a combined total of over 250 alarms, and I had no issues with it. I don't recall any delays between screens either (except for the iPad's VNC connection to the HMI, but that is a different matter entirely :-))


Hey BE, thanks for your response.

I had 42 alarm groups on the NA series HMI screen.  Within each group I would have different quantities of alarms (e.g. a fixture tool has 32 alarms and a robot tip change has 8 alarms.  The tool alarm group would have a tag that is a 32-bit union and the tip change alarm group would have a tag that is an 8-bit union).  Side note: I said "array" because these are tagged to unions and they looked like arrays when I first looked at them.  Ignore the "array" term in the first post.

So alarm group for R1 tip change would have the tag "Alarm_R1_TP" which would be a data type of UNION_8.

UNION_8 looks like this

B - Boolean(7)

BYT - BYTE

So if the tag "Alarm_R1_TP.B(0)" is set high, the first alarm in the R1 Tip Change alarm group would be displayed on the NA screen.
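
For anyone not used to unions, here is a rough Python model of the idea: one byte that can be read either as a whole (BYT) or bit by bit (B(0) to B(7)), with each bit standing for one alarm in the group. This is only a sketch and not Sysmac Studio code. The class name `Union8`, its methods, and the assumption that B(0) is the least significant bit of BYT are all made up for illustration.

```python
class Union8:
    """Toy model of an 8-bit union: one byte seen as BYT or as bits B(0)..B(7).

    Assumption (not confirmed by the thread): B(0) is the least significant bit.
    """

    def __init__(self, byt=0):
        if not 0 <= byt <= 0xFF:
            raise ValueError("BYT must fit in 8 bits")
        self.byt = byt

    @staticmethod
    def _check(index):
        if not 0 <= index <= 7:
            raise IndexError("bit index must be 0..7")

    def get_bit(self, index):
        """Return True if bit B(index) is set."""
        self._check(index)
        return bool((self.byt >> index) & 1)

    def set_bit(self, index, value):
        """Set or clear bit B(index); BYT changes with it because they share storage."""
        self._check(index)
        if value:
            self.byt |= 1 << index
        else:
            self.byt &= ~(1 << index) & 0xFF

    def active_alarms(self):
        """Indices of the alarms in this group whose trigger bit is high."""
        return [i for i in range(8) if self.get_bit(i)]


# Hypothetical stand-in for the tag "Alarm_R1_TP"
alarm_r1_tp = Union8()
alarm_r1_tp.set_bit(0, True)   # B(0) high: first alarm in the group is active
print(alarm_r1_tp.byt, alarm_r1_tp.active_alarms())
```

The point is just that setting one bit raises exactly one alarm in the group, and the same storage can be read back as a single byte.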

 

I have 818 alarms in the NA screen.

 

It turns out that the robots are in Multicast and the traffic is only at 800Kbs, so the speed shouldn't be an issue.


Thanks for confirming those points. One other thing: are you viewing the alarm history or just the active alarms when it crashes? Assuming you are viewing active alarms, does it crash if nothing is on the alarm screen?

I personally can't see anything wrong with how you have set that up. That said, I have never used unions for alarm expressions. I have used structures with no issues, so I would assume unions would be fine also, but I will leave that to someone with more experience to comment on.

If you haven't played around with the import/export function for user alarms, I would suggest having a play around with it before doing the following. I would hate to have to enter 800 alarms again :-0. You can export all user alarms into one spreadsheet, or export individual groups. To export a single group, right-click it and click Export. To export all the alarms into one spreadsheet, right-click User Alarms and click Export.

At this point, my suggestion would be to export all your alarms to an Excel sheet as I mentioned above. Then delete all the alarms in your HMI program, upload to the NA and see if you still have the issue when on the alarm screen. If you still have the problem, then it probably isn't your alarms causing the issue.

If you don't have any issues after deleting the alarms, I would be inclined to make a copy of your spreadsheet, then edit the copy and remove about 80% of the alarms. Import the copy back into the HMI program, re-upload and see if the problem arises again. If all is well, keep adding alarms until the problem shows up again. If you exported all the groups individually, you could just import individual groups back in instead of editing a large spreadsheet, which might make life easier.

If the NA crashes basically as soon as you add some of the union alarms back in, remove them again and see if you can add some alarms in that reference non-union variables (probably just make some up from other booleans in the program), and see if you still have the problem. If the problem still happens with non-union alarms, then again, it's probably not your alarms causing the issue.

I know it's tedious, time consuming and painful, but basically if I were in your shoes I would be trying to determine the cause by elimination, and then slowly re-introducing everything until the fault happens again.
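
The elimination approach above can be sketched roughly as below. In practice the "is it stable?" check is a manual step (download to the NA, sit on the alarm screen, wait), so `is_stable` here is only a hypothetical stand-in, and the group names, sizes, and the limit of 100 are invented purely to make the sketch runnable. Nothing here reflects a documented NA limit.

```python
def reintroduce_groups(groups, is_stable):
    """Add alarm groups back one at a time until the fault reappears.

    groups:    list of (name, alarm_count) tuples, in the order they are re-imported
    is_stable: callable taking the current list of groups, returning True if the
               HMI ran without crashing (in real life: a manual test on the panel)

    Returns (stable_groups, failing_group); failing_group is None if the fault
    never came back.
    """
    active = []
    for group in groups:
        candidate = active + [group]
        if not is_stable(candidate):
            return active, group
        active = candidate
    return active, None


# Hypothetical data and stand-in check, for illustration only.
HYPOTHETICAL_LIMIT = 100  # invented number, not a documented limit


def fake_is_stable(current_groups):
    """Pretend the panel is stable while the total alarm count stays under the limit."""
    return sum(count for _, count in current_groups) <= HYPOTHETICAL_LIMIT


demo_groups = [("G1", 32), ("G2", 8), ("G3", 64), ("G4", 16)]
stable, failing = reintroduce_groups(demo_groups, fake_is_stable)
print("stable:", stable)
print("fault came back when adding:", failing)
```

If the fault comes back no matter which group is added last, that points at the total number of groups or alarms rather than the content of one particular group.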

7 hours ago, BE said:

If you haven't played around with the import/export function for user alarms, I would suggest having a play around with it before doing the following. I would hate to have to enter 800 alarms again :-0. You can export all user alarms into one spreadsheet, or export individual groups. To export a single group, right-click it and click Export. To export all the alarms into one spreadsheet, right-click User Alarms and click Export.

Hey BE, yes I used the export/import to create these alarms.  Manually entering all these would have been painful.

8 hours ago, BE said:

At this point, my suggestion would be to export all your alarms to an Excel sheet as I mentioned above. Then delete all the alarms in your HMI program, upload to the NA and see if you still have the issue when on the alarm screen. If you still have the problem, then it probably isn't your alarms causing the issue.

If you don't have any issues after deleting the alarms, I would be inclined to make a copy of your spreadsheet, then edit the copy and remove about 80% of the alarms. Import the copy back into the HMI program, re-upload and see if the problem arises again. If all is well, keep adding alarms until the problem shows up again. If you exported all the groups individually, you could just import individual groups back in instead of editing a large spreadsheet, which might make life easier.

So, I have done this.  I cut the alarms roughly in half and downloaded the program to the NA screen.  It seemed to run fine (no crashes in 24hrs of running), and then I added the groups back one at a time.  I got to about 22 groups and it crashed.  I then removed that last group and placed another group in there, and I had the same crashing issue.  One thing I didn't do was export the alarms when it wasn't crashing to see how many alarms were in there.  I wonder if there is a maximum?  I haven't been able to find that anywhere in a manual or anything.

 

8 hours ago, BE said:

Thanks for confirming those points. One other thing: are you viewing the alarm history or just the active alarms when it crashes? Assuming you are viewing active alarms, does it crash if nothing is on the alarm screen?

Yes, just when viewing the active alarm screen.


The only other thing I would try then is to merge some of your groups together. Maybe do a group "Robots 1-5" etc., and reduce the quantity of groups to about 10 (these 10 groups would include all your alarms). Try uploading these groups and see if you still have the problem. Basically, I'm just trying to determine whether it is the large number of groups that is causing it, or the number of alarms.


Hey BE, sorry, I meant to mention that.  I now have all my alarm groups merged together (i.e. Robots, Tools, Utilities, Safety...) and I'm still having the issue.  We are in contact with Omron directly now and they have mentioned that only 2 HMIs should be used with 1 controller (we have 4, but 2 are turned off because of other issues we were seeing).

 

One thing to note: all our remote I/O (safe and non-safe) communicates over EtherCAT, while our HMIs and robots communicate over EtherNet/IP.  Our task period is set to 4ms, but the execution time is going over.  From my understanding, the EtherNet/IP network is scanned at the end of each task scan, so with the time being over, we don't have time to scan the EtherNet/IP network.  This might be the root cause of the crashes.  We are seeing other issues here too (screen lag, Sysmac Studio freezing/crashing, robots losing comms), which could all be from the scan time issues, because we are connected via Ethernet.

 

We have been directed by Omron to try some changes and once implemented, I'll keep you posted on the solution/results.

 

Scott


Sounds good.

I had some weird problems come up on my last project due to task execution times being too low. Increasing the time solved my issues, but I didn't have anything like what you are dealing with.

Good luck, let us all know how you go.


You may be on the right track.  System services on the NJ (which include Ethernet) run at the lowest priority, so if you are exceeding the task period, there may not be any time left to perform the communications.  I know it's a bit late in the game for this, but that isn't an issue with the newer NX-Series controllers.  

When a bunch of HMIs get to requesting data from it, there are issues like this.  

I assume they told you that you can try to put some of your less critical code into Task 16, 17, or 18 and run them slower than the Task 4 primary.  This can make a huge difference to the main cycle time.  In Controller Setup, Operation Settings, there is an option called System Service Monitoring Settings. Raising those values may also help a bit.
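
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It is a simplified utilization estimate (sum of execution time divided by period for each task), not how the controller actually schedules system services. The 4 ms primary period comes from this thread, and the 3 ms / 20 ms figures match what is reported further down; the 4.5 ms overrun and the 2 ms execution time for the slower task are invented for illustration, and the function name is hypothetical.

```python
def remaining_cpu_fraction(tasks):
    """Estimate the CPU fraction left over once all listed tasks have run.

    tasks: iterable of (period_ms, exec_ms) pairs.
    Returns a value between 0.0 and 1.0; 0.0 means the tasks use everything
    (or more), so lowest-priority work such as communications gets starved.
    """
    utilization = 0.0
    for period_ms, exec_ms in tasks:
        if period_ms <= 0:
            raise ValueError("period must be positive")
        utilization += exec_ms / period_ms
    return max(0.0, 1.0 - utilization)


# Before: everything in the primary task, overrunning its 4 ms period
# (the 4.5 ms execution time is a made-up example of "going over").
before = remaining_cpu_fraction([(4.0, 4.5)])

# After: primary task down to 3 ms, less critical code in a 20 ms task
# (the 2 ms execution time of the slower task is a made-up figure).
after = remaining_cpu_fraction([(4.0, 3.0), (20.0, 2.0)])

print(f"left for system services before: {before:.0%}")
print(f"left for system services after:  {after:.0%}")
```

Even with made-up numbers, it shows why moving code into a slower task helps: the same code costs far less per millisecond when it only runs every 20 ms.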


Hey guys, quick update.

 

So just like Crossbow suggested, we are moving less-critical code into task 17 which is set to 20ms scan time.  This has our primary task scanning at a rate of 3ms.  We are still getting EtherCAT comms issues where it drops out and stops all robots.  We are going to be keeping an eye on this one, but maybe with the scan time change it will go away?

They have released a new version (1.45) of Sysmac Studio that will help prevent the software from crashing.

 

I will update further with any extra findings.

 

Thanks for your suggestions.


Ok, all seems good now.  Scan time is less than 3ms and the screens are stable now.  We were still seeing the EtherCAT dropouts, but we think it's a Cisco switch issue and nothing to do with Sysmac Studio.

 

Thanks for all your help.


Was the EtherCAT crash due to the Cisco switch?  


There shouldn't be a Cisco switch on the EtherCAT.  No standard Ethernet devices should be connected.  If you need to split the cable, you need a dedicated EtherCAT splitter.
