Client: Automatically recreating Subscriptions with MonitoredItems after Reconnect
July 1, 2020
11:00, EEST
Seb
Member
Members
Forum Posts: 8
Member Since:
July 1, 2020

Hi all,

I’m working with the latest Prosys OPC UA Client SDK for Java. My Client connects to an OPC UA Server that regularly loses power, gets disconnected from the network, …

Therefore, I implemented my Client with autoReconnect=true, but the Client only reestablishes the connection, without the Subscription and MonitoredItems, which is not very useful.

Is there a simple way to tell the SDK to recreate Subscriptions/MonitoredItems from the previous session, or do I have to do this manually?

Regards
Seb

July 2, 2020
12:42, EEST
Matti Siponen
Moderator
Members

Moderators
Forum Posts: 346
Member Since:
February 11, 2020

Hello,

Could you verify that the version of the Prosys OPC UA SDK for Java you’re using is 4.3.0? Reconnecting to a Server with automatic reconnection should preserve Subscriptions and MonitoredItems. Could you show Client logs starting from when the connection to the Server is lost to when the Client reconnects to the Server?

July 2, 2020
17:19, EEST
Seb

Hello Matti,

thanks for the quick feedback. I’m using v4.3.0-1075.

For logs, please see https://pastebin.com/1bwrzBje

Thanks!

July 3, 2020
9:26, EEST
Matti Siponen

On line 66 of the log, onCreate of SubscriptionAliveListener is called, which suggests that the Client has recreated the Subscription and its MonitoredItems.

Is it possible that after a restart something on the Server has changed making the previously valid NodeIds invalid? For example, if the order of the Server’s NamespaceArray has changed after a restart, NamespaceIndices of NodeIds of MonitoredItems would need to be changed to match the new order. This NamespaceArray is visible to Clients in NamespaceArray Property of the Server object.

July 3, 2020
10:18, EEST
Seb

That might be the issue. We also occasionally had to re-subscribe to the same nodes in other clients (UaExpert, for example).

Do you know if the OPC UA Spec defines a behaviour for this?

I will be able to check next Tuesday and get back to this thread with the results.

July 3, 2020
12:10, EEST
Bjarne Boström
Moderator
Moderators
Forum Posts: 1026
Member Since:
April 3, 2012

Hi,

If the namespace URIs’ indexes changing on restart is the only problem, you can try the MonitoredDataItem constructors that take ExpandedNodeIds instead of NodeIds, and then use ExpandedNodeIds that carry the URI instead of the index. Note, however, that any error in the conversion we do internally (the service request expects NodeIds) will surface as a runtime exception, which might be hard to notice. Let us know if this helped.

This is sort of an ongoing problem with OPC UA, a mistake that almost everyone seems to make (SDK vendors and users alike). Basically, current best practice is to never use the index (and thus NodeId) at all, rely only on the URI (ExpandedNodeId; but read below about UaNodeId, which is what should really be used, though that is specific to our SDK), and convert it to an index at API boundaries. This can be done with client.getNamespaceTable(). However, note that this obviously does not work for monitored items, since the resulting NodeId would be specific to the session in which it was made. That is why we tried adding the ExpandedNodeId alternatives (though error handling is still somewhat of a problem).

It took us too long to notice that NodeId should not be used in APIs, so we are sort of stuck with them for now. Personally I would like to eventually “purge” them from all public APIs, but that is a long effort. However, where possible, we try to build all new APIs on UaNodeId, which is sort of our hybrid between NodeId and ExpandedNodeId: it is always a “local reference” NodeId, but it does not contain the index at all (with ExpandedNodeId, .equals is a problem, since it can hold both a URI and an index, and you need the NamespaceTable to compare URI against index). That makes it usable without an active connection to the server; the index is basically calculated for each request from the NamespaceTable (or at least it should be). In combination with UaNamespace (once we implement interning) we should get very close to NodeId performance without its problems (there are some edge cases, e.g. if a server fails to provide a namespace URI for an index, which is an error of course, but still something we need to handle).

So this time it is not an error in the spec (other than that this is perhaps hard to realize), just our implementation still being sub-optimal.
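
To illustrate the URI-to-index conversion at the API boundary, here is a minimal plain-Java sketch. It deliberately does not use the SDK: NamespaceTableSketch and its methods are hypothetical stand-ins for whatever namespace table the client exposes, and the example URIs are made up. It only shows why an identifier keyed by namespace URI survives a reordered NamespaceArray, while a stored index does not.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for one session's NamespaceArray; not an SDK class. */
final class NamespaceTableSketch {
    private final List<String> uris;

    NamespaceTableSketch(String... uris) {
        this.uris = Arrays.asList(uris);
    }

    /** Returns the index of the URI in this session, or -1 if it is not listed. */
    int indexOf(String namespaceUri) {
        return uris.indexOf(namespaceUri);
    }

    /** Builds an index-based NodeId string (e.g. "ns=2;s=Pump1.Speed") for this session only. */
    String toSessionNodeId(String namespaceUri, String identifier) {
        int index = indexOf(namespaceUri);
        if (index < 0) {
            // Fail loudly instead of silently pointing at the wrong namespace.
            throw new IllegalArgumentException("Unknown namespace URI: " + namespaceUri);
        }
        return "ns=" + index + ";s=" + identifier;
    }
}
```

The idea is to store only the URI and the identifier, and to call something like toSessionNodeId again after every reconnect, so that a changed namespace order results in a different index rather than a stale NodeId.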

July 7, 2020
17:20, EEST
Seb

Hi Bjarne,

thanks for the comprehensive explanation. +1 on not using NodeIds in public APIs; the resolution of the namespace ids should be hidden from the caller.

We just tried again restarting the server multiple times and had a look at the namespace ids: they seem to be stable. If it’s the changing namespace ids, then it’s a somewhat rare race condition on the server’s side. Sadly, I do not have access to the server’s source code.

I also increased logging verbosity in our client: we found out that our MonitoredDataItemListener stops getting called on loss of connection and never recovers (despite the SDK claiming to be connected and subscribed again).

Next we will try our setup with another OPC UA (demo) server to narrow down the problem to the client or server side.

I hope we can find a solution to this problem, it’s really holding us back. 🙁

July 8, 2020
10:52, EEST
Bjarne Boström

Could you use https://www.prosysopc.com/blog/opc-ua-wireshark/ to check which requests we send and what the responses are? You will need to use either NONE or SIGN (only) as the MessageSecurityMode, but if that is possible it is the best way to test, since it shows what is actually transmitted at the network/byte level. If you want, you can also send the captures to uajava-support@prosysopc.com.

(I only now checked your log personally, sorry.)

There is one differentiating factor from the usual, so to say. On line 8 we see Bad_ServerHalted. I have not typically seen servers use that StatusCode (but of course my view of the world of all OPC UA Servers is somewhat limited). The description text of that code (also seen in the logs) is “The server has stopped and cannot process any requests.” Basically, in this situation the Subscriptions cannot be remade (the server does not let us). However, if the UaClient fails to make some Subscriptions when it reconnects, it will reattempt at every status check interval (which is one second by default). So it shouldn’t matter in this case, but I still would not completely rule out some issue here, since it would mean that reconnecting succeeds but requests after that fail. Note that since it came from UaClient$PublishTask, it was a PublishRequest that the server could not process, so that is at least one reason why there were no notifications (they would come with PublishResponses) in that particular instance.

Since there is a call to onAfterCreate, it should mean the Subscription was successfully made on the server. Could you configure logging so that the entire SDK is on DEBUG? Since there will be a lot of output, it needs to be routed to a file and preferably sent by email to uajava-support@prosysopc.com (if you could combine this with the Wireshark capture, that would be the best information possible). Alternatively, you could try to look through the logs yourself. There should be a DEBUG line “createMonitoredItems: ” from the Subscription class.

July 8, 2020
13:35, EEST
Seb

Hi Bjarne,

thanks for the feedback. We just tried another server and the behavior was as expected: After loss of connection, the client reconnected, recreated the subscription and received data in the listener. I suspect a faulty server implementation.

I will also try to improve logging and get back to you with more information.

July 9, 2020
12:46, EEST
Seb

Hello everyone,

we found something new, please see attached logs: https://pastebin.com/SK3WYCeM

Why would the SDK report all monitoredItemIds as 0?

July 9, 2020
13:56, EEST
Bjarne Boström

That is something the server returns to us, i.e. the server did return 0 (check with Wireshark). That is not a valid id: 0 is not, while all other UInt32 values are. OPC UA UInt32 does not have a null value; in the Java world an UnsignedInteger can be null within the JVM, but in UA binary that is the same as 0. Most likely the individual operation of creating the monitored item failed, and not the entire service call (UA has two levels, the entire service call versus the individual operations within the call, so a call can partially succeed). Note that the error code is unfortunately not logged, but if you have the MonitoredDataItem, you can call .getErrorCode() on it to see the StatusCode, which should indicate what the problem is (if it is GOOD, then I would say it is a server bug). Alternatively, if you were capturing with Wireshark, check the CreateMonitoredItemsResponse (the status code of the entire service call plus each individual operation’s own status code).

You can do the error checking in SubscriptionAliveListener.onAfterCreate for this investigation, but you might also need to implement actual logic there that removes the item from the Subscription, and then it is up to you what to do (it depends on the error code). Also, sorry, it seems that in this case the SDK cannot yet handle redoing the item. Basically, if the entire call does not fail and only the individual item failed, it would most likely keep failing on any future calls, so should those even be attempted? This is a bit more complicated and is beyond the level of automation the SDK can currently provide for you. Most typically the situation is that, e.g., the NodeId does not exist (which is in turn another gray area: the server is allowed to create such an item but does not have to, and if the node is deleted while the item exists and is later added back, it should work again according to the specification).
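
As a rough illustration of the two-level check described above (service call versus individual operation), here is a small plain-Java sketch. It does not use the SDK types: ItemResult and CreateResultCheck are hypothetical names, and the raw status code is held in a long purely for the example. It treats an item as failed when its operation-level StatusCode has Bad severity or when the returned monitored item id is 0.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; in real code the values would come from the per-item create results. */
final class CreateResultCheck {

    /** Illustrative per-item result; field names are assumptions, not SDK fields. */
    static final class ItemResult {
        final String nodeId;
        final long statusCode;      // raw 32-bit OPC UA StatusCode, stored in a long
        final long monitoredItemId; // 0 is not a valid id

        ItemResult(String nodeId, long statusCode, long monitoredItemId) {
            this.nodeId = nodeId;
            this.statusCode = statusCode;
            this.monitoredItemId = monitoredItemId;
        }
    }

    /** True when the two severity bits (bits 31..30) are binary 10, i.e. Bad. */
    static boolean isBad(long statusCode) {
        return ((statusCode >>> 30) & 0x3L) == 0x2L;
    }

    /** Returns the items whose individual operation failed although the service call succeeded. */
    static List<ItemResult> failed(List<ItemResult> results) {
        List<ItemResult> out = new ArrayList<>();
        for (ItemResult r : results) {
            if (isBad(r.statusCode) || r.monitoredItemId == 0L) {
                out.add(r);
            }
        }
        return out;
    }
}
```

An item that reports a Good status together with id 0 is flagged as well, since that combination would point at a server-side problem rather than an ordinary operation failure.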

July 13, 2020
11:02, EEST
Seb

Hi all,

please see new logs regarding recreating subscriptions/monitoreditems: https://pastebin.com/t5nsi76h

The server allows connections before the expected nodes exist in the address space. The nodes show up after a few seconds/minutes, but at that point the Java SDK has already failed to add all nodes as MonitoredItems.

Is there a feature to “retry creating subscriptions/Monitoreditems” analogous to the “retry connection” feature?

July 13, 2020
13:48, EEST
Bjarne Boström

This at least partially solves the issue, in the sense that it is not (as I would define it) an SDK bug. Though I think eventually we should get to a point where this would not matter.

Anyway, for now this is a problematic situation. There is no automatic, out-of-the-box way to re-attempt creation of the items (other than losing the connection again and thus getting another attempt via the reconnect logic). But depending on what you would consider an acceptable solution, there are ways to work around this.

If you call Subscription.removeItem(…) and then add the item again with Subscription.addItem(…), it behaves the same way as adding it for the first time (this keeps the MonitoredDataItemListener). Note, though, that the add behaves differently depending on whether we are connected. If we are not, the item is just added and is then created when the client connects (though it could fail on the individual level). If we are connected, errors are thrown as ServiceException (service level) or StatusException (individual-operation level), and the call can also return the MonitoredItemCreateResult (which is null if we are not connected).

For bulk operations, addItems can take multiple items, but it can only throw on service-level errors; the individual operation results must be checked from the returned MonitoredItemCreateResult[].

If you subclass the Subscription class, you could call “protected void createItems()”, which is what is called internally on (re)connect, though this is only useful if all items failed.

So in basically all cases you will need to implement some kind of watchdog to retry the failed items, since you have out-of-band information that they will eventually succeed. I would recommend removing the failed items in SubscriptionAliveListener.onAfterCreate(…), putting them in a queue, retrying them periodically, and removing them from the queue on success. This is sort of how a collection of UaClients is handled to get the initial connection in some of our own applications (I’m aware that this would also need some improvements).
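
The watchdog idea can be sketched roughly as follows in plain Java. This is only a sketch under assumptions: FailedItemRetrier is a made-up class, and the tryCreate predicate is a hypothetical callback that would wrap the actual remove-and-add calls against the Subscription and return true once the item was created successfully. Nothing here is part of the SDK.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

/** Hypothetical retry queue for items whose creation failed; T is whatever identifies an item. */
final class FailedItemRetrier<T> {
    private final Queue<T> pending = new ConcurrentLinkedQueue<>();
    private final Predicate<T> tryCreate; // assumed callback: true means the item now exists

    FailedItemRetrier(Predicate<T> tryCreate) {
        this.tryCreate = tryCreate;
    }

    /** Call this for each failed item, e.g. from the listener that runs after (re)creation. */
    void enqueue(T item) {
        pending.add(item);
    }

    int pendingCount() {
        return pending.size();
    }

    /** One pass over the items queued at the start of the pass; failures are re-queued. */
    void retryOnce() {
        int n = pending.size();
        for (int i = 0; i < n; i++) {
            T item = pending.poll();
            if (item == null) {
                break;
            }
            boolean created;
            try {
                created = tryCreate.test(item);
            } catch (RuntimeException e) {
                created = false; // treat an exception like a failed attempt
            }
            if (!created) {
                pending.add(item);
            }
        }
    }

    /** Runs retryOnce periodically; the caller shuts the returned scheduler down when done. */
    ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(this::retryOnce, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

Using a fixed delay rather than a fixed rate means a slow pass never overlaps with the next one. The period is worth choosing conservatively, since the retries go over the same connection as the application’s own requests.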

Note that if the nodes are something the server should know to be valid, under the current specification version it could create the items: https://reference.opcfoundation.org/v104/Core/docs/Part4/5.12.2/#5.12.2.1 “If a NodeId is known to be valid by a Server but the corresponding Node Attributes are currently not available, the Server may allow the creation of a MonitoredItem and return an appropriate Bad StatusCode in the Publish response.” You could also try contacting the server’s manufacturer to see if they would implement that.

Hopefully some day we will have something that works out of the box for this kind of situation. Note that the situation is not exactly the same as with reconnecting (though the same principles sort of apply). With reconnecting, it sort of doesn’t matter how many times we try to connect, since no requests originating from the SDK user can be attempted (there is no connection). But once we do have the connection (and have fetched some data from the address space), we have one channel for sending requests, and it is the same one the SDK user uses. Therefore we need to be at least mindful of how much traffic we put there automatically, so as not to delay user requests too much (this matters with low connection speeds and/or a lot of failing items, e.g. tens of thousands of them). But we should probably have some logic in place to allow this in the future (it is just that if we have no knowledge that the operation would ever succeed, making those calls would be a bit wasteful).

July 17, 2020
9:09, EEST
Seb

Hello Bjarne,

thanks for the great information. Meanwhile, our server vendor was able to reproduce the issue on their side and agreed to fix it; our requirements for their server included the nodes being available from the very beginning.

However, I like the solution of putting failed items into a queue and retrying them periodically the most: items that work keep working, and items that failed will work eventually. I will keep that one in mind.

Forum Timezone: Europe/Helsinki
