client loses connection to gateway peer and cannot recover without a restart when endorsement policy is changed #332
@andrew-coleman what is your opinion?
@denyeart Since this is a highly contrived error scenario test, we don't believe this should block the 2.4 release. For the next release, we propose the following enhancement:
Mismatched proposal responses are a server-side issue related to smart contracts, so I think the key thing is to ensure that mismatched responses are logged at the server end, where they can be investigated by whoever has operational responsibility for the Fabric nodes. There's certainly no harm in sending some specific information about the reason for the endorsement failure to the client so the issue can be diagnosed correctly at that end, but I don't think it's strictly necessary to send proposal response content back to the client. In some respects this might even be considered bad security practice, since it gives any attacker additional information about the internal behaviour of the network.
My change hyperledger/fabric#3012 is for the CLI, and I'm not aware of the gateway. I agree with the two enhancements proposed by @andrew-coleman.
The gateway has been reusing the protoutil.CreateTx function to build the transaction envelope from the set of endorsements. This function changed recently to add extra information into the error message if the proposal responses don’t all match. This change was identified as causing a problem for the gateway (hyperledger/fabric-gateway#332). This commit reverts that change, and instead writes the base64 response payloads to the log. It also refactors the functions slightly to remove duplicate unmarshalling of protos when invoked from the gateway. Signed-off-by: andrew-coleman <[email protected]>
With the latest images I can't recreate this problem, and I still see the mismatches being reported and logged.
First, I will note that the client could probably recover, but it would have to be coded to detect the scenario and create a new client and gateway. The problem I see reports 2 errors
Either way, no further submitted transaction succeeds until the client is restarted.
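The recovery mentioned above (detect the scenario, then create a new client and gateway) could be sketched roughly as follows. This is a generic illustration and does not use the real fabric-gateway client API: `Conn`, `Recoverer`, `fakeConn`, and the consecutive-failure threshold are all hypothetical. The sketch only drops and re-creates the connection; it never resubmits a transaction on its own, so the application decides whether a retry is safe.

```go
package main

import (
	"errors"
	"fmt"
)

// Conn is a hypothetical stand-in for a client connection plus gateway.
type Conn interface {
	Submit(tx string) error
	Close() error
}

// Recoverer discards its connection after `threshold` consecutive failed
// submits, so the next Submit call dials a fresh one.
type Recoverer struct {
	dial      func() (Conn, error)
	threshold int
	conn      Conn
	failures  int
}

// Submit sends tx on the current connection, dialing first if needed.
// The error from a failed submit is always returned to the caller.
func (r *Recoverer) Submit(tx string) error {
	if r.conn == nil {
		c, err := r.dial()
		if err != nil {
			return err
		}
		r.conn = c
	}
	err := r.conn.Submit(tx)
	if err == nil {
		r.failures = 0
		return nil
	}
	r.failures++
	if r.failures >= r.threshold {
		_ = r.conn.Close() // best effort; the connection is discarded anyway
		r.conn = nil
		r.failures = 0
	}
	return err
}

// fakeConn simulates a connection that fails every submit once broken.
type fakeConn struct {
	broken bool
	closed bool
}

func (c *fakeConn) Submit(tx string) error {
	if c.broken {
		return errors.New("connection unusable")
	}
	return nil
}

func (c *fakeConn) Close() error {
	c.closed = true
	return nil
}

func main() {
	conns := []Conn{&fakeConn{broken: true}, &fakeConn{}}
	dials := 0
	r := &Recoverer{threshold: 2, dial: func() (Conn, error) {
		c := conns[dials]
		dials++
		return c, nil
	}}
	// Two failures on the broken connection, then a success on a fresh one.
	for i := 0; i < 3; i++ {
		fmt.Println(r.Submit("tx"))
	}
}
```

A consecutive-failure count is only one possible trigger; a real client might instead inspect the specific error it receives.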
Unfortunately, client-side gRPC logging doesn't tell me much; it only repeats what the error message says. The peer logs don't provide anything helpful, and if I turn on too much logging I struggle to recreate the problem.
I recreated the scenario without an endorsement policy change by having a long-running transaction create MVCC read conflicts, so it looks related to multiple long-running transactions failing when they have a large write set. In my case each transaction writes to 2000 keys, but only small amounts of data.
A recent change to Fabric, hyperledger/fabric#3012, changed the output produced when proposal responses don't match: it now includes a base64 version of the protobuf bytes in the error, which subsequently ends up in the logs. If I revert this change, I am no longer able to hit the problem. It looks like the change was intended for when the peer is used as a CLI, but it also takes effect when the peer is running as a server.
If I don't use long-running transactions with a large number of keys, the problem doesn't occur, so it is specific to the contrived application being run to test the scenario and is not likely to be a common problem. I will leave it up for discussion whether the recent change to output base64 when proposals don't match is of value, and whether it should be output when the peer is running in server mode.