Using System Logs to troubleshoot a PLC momentary loss of functionality

project2501 · October 10, 2023, 8:30pm

Problem statement: I am notified that the PLC is “down”, but on connecting to it, I see that it is up. From the SCADA connections I can see that we lost OPC UA connectivity on ETH0, and from the VFD failures and the corresponding error messages on those drives I can see that we lost the EtherNet/IP scanner on ETH1. So something certainly happened. I am trying to determine whether the controller itself stopped or restarted, or whether it lost power.

I am in the logs now. Is there a clear process, or a specific error, I can use to determine what took place?

Thanks!

Beno · October 10, 2023, 8:40pm

It will be a multifaceted approach for sure.
The key would be having a pretty good idea of the time.
If you know that, review each of the logs (or pull the whole lot and filter with an external program of your choice) and take a look around that time.
You should be able to build up a profile of what happened by collecting snippets from each of the processes around that time.
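
If you go the external-program route, the filtering step could look something like the rough Python sketch below. It keeps only the lines within a few minutes of the event and merges them from several logs into one time-ordered list. The timestamp layout (`2023-10-10 14:32:07` at the start of each line) and the function names are assumptions for illustration, not something taken from the actual log files, so adjust `TS_FORMAT` and `TS_LENGTH` to match what your logs really contain.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout at the start of each line: "2023-10-10 14:32:07".
# This is a guess for illustration; change both values to fit the real logs.
TS_FORMAT = "%Y-%m-%d %H:%M:%S"
TS_LENGTH = 19


def parse_timestamp(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    try:
        return datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
    except ValueError:
        return None


def lines_near(lines, source, event_time, window):
    """Collect (timestamp, source, text) for lines within +/- window of event_time."""
    hits = []
    for line in lines:
        ts = parse_timestamp(line)
        if ts is not None and abs(ts - event_time) <= window:
            hits.append((ts, source, line.rstrip("\n")))
    return hits


def build_timeline(logs, event_time, window=timedelta(minutes=5)):
    """Merge matching lines from several logs into one time-ordered list.

    `logs` maps a source label to an iterable of lines, such as an open file.
    """
    timeline = []
    for source, lines in logs.items():
        timeline.extend(lines_near(lines, source, event_time, window))
    # Sort by timestamp only, so entries from different logs interleave.
    timeline.sort(key=lambda entry: entry[0])
    return timeline
```

To use it, pass a dictionary of open file handles keyed by whatever label you like for each log, plus the `datetime` of the event, then print the entries in order. Lines without a recognizable timestamp (stack traces, wrapped messages) are skipped in this sketch, so it is worth going back to the raw file once you have narrowed down the interesting minute.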

There is always our support group to help you go through the logs as well.

project2501 · October 10, 2023, 8:55pm

Thank you. I do know what time it happened, and I’ve been taking that approach, using the time to decipher the logs. Rather than pasting logs here, I may open a ticket with the support team.
Thanks!

Beno · October 10, 2023, 8:59pm

No question there are some variables here.
The two main ones are how much experience you have reading Linux log files,
and how much time you have spent poking around the EPIC log files.
Both will take some time to dig into the first time, but like anything, the more you look with the intent to learn, the faster and more confident you will become.
It will always be a balance of learning and getting support.