Socket closed error in Google Storage SDK in DataFlow pipeline - google-cloud-storage

I am using google-cloud-storage (1.54.0) in my Dataflow pipeline (2.29.0) to write files to Google Cloud Storage.
I see the error below at random:
Error message from worker: java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: com.google.cloud.storage.StorageException: Socket closed
org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:187)
org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner$1.outputWindowedValue(GroupAlsoByWindowFnRunner.java:108)
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.ReduceFnRunner.lambda$onTrigger$1(ReduceFnRunner.java:1058)
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.ReduceFnContextFactory$OnTriggerContextImpl.output(ReduceFnContextFactory.java:445)
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SystemReduceFn.onTrigger(SystemReduceFn.java:130)
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.ReduceFnRunner.onTrigger(ReduceFnRunner.java:1061)
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.ReduceFnRunner.emit(ReduceFnRunner.java:932)

When running a distributed system, especially at scale, you need to be able to handle transient errors like this one, and to make your writes idempotent so they are safe to retry.
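One starting point is to raise the client's built-in retry budget so transient failures are retried with backoff before surfacing as a StorageException. A minimal sketch for the Java client; the specific values are illustrative, not a recommendation:

import com.google.api.gax.retrying.RetrySettings;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
import org.threeten.bp.Duration;

// Retry transient failures for up to 5 attempts with exponential backoff;
// project and credentials are resolved from the environment.
Storage storage = StorageOptions.newBuilder()
        .setRetrySettings(RetrySettings.newBuilder()
                .setMaxAttempts(5)
                .setInitialRetryDelay(Duration.ofSeconds(1))
                .setRetryDelayMultiplier(2.0)
                .setMaxRetryDelay(Duration.ofSeconds(30))
                .build())
        .build().getService();

Idempotence then matters because a retried write may run twice; writing each output to a deterministic object name (rather than appending) keeps retries safe.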

Related

The listener for Azure function was unable to start - Microsoft.Azure.EventHubs.Processor Encountered error

I am getting the below error whilst running my Python Azure Function on the local machine in VSCode.
For clarification, the message is:
The listener for function 'Functions.IoT_Data-Handler' was unable to
start. Microsoft.Azure.EventHubs.Processor: Encountered error while
fetching the list of EventHub PartitionIds. System.Private.CoreLib: A
connection attempt failed because the connected party did not properly
respond after a period of time, or established connection failed
because connected host has failed to respond.
This error never occurred in the time I have been using VSCode for Azure Functions (since last September). The only thing that has changed recently is that I now deploy this function within an Azure Functions Premium resource, but that really should not matter in the dev environment.
For information, this function is hooked up to an Azure IoT Hub endpoint and simply reads and processes the uplink data before saving it to an Azure SQL database.
Can anyone offer any advice?
Check whether my findings below help fix your issue:
As @PeterBons said, check that the connection string is given correctly in local.settings.json:
Whatever name the Event Hub endpoint / IoT Hub endpoint connection string is given in local.settings.json, that property name should be referenced in the function.json file (a sketch of the mapping follows below).
Try the IoT Hub connection string without the consumer group name, as mentioned in GitHub Issue #5512.
I also found similar issues on SO (1 & 2) which may help fix your issue.
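As a rough illustration of that mapping (a minimal sketch; the setting name IoTHubEventHubConnection and the placeholder values are hypothetical):

local.settings.json:
{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "UseDevelopmentStorage=true",
    "FUNCTIONS_WORKER_RUNTIME": "python",
    "IoTHubEventHubConnection": "<Event Hub-compatible endpoint connection string>"
  }
}

function.json ("connection" must name the setting above):
{
  "bindings": [
    {
      "type": "eventHubTrigger",
      "direction": "in",
      "name": "event",
      "eventHubName": "<event-hub-compatible-name>",
      "connection": "IoTHubEventHubConnection"
    }
  ]
}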

Challenge in data from REST API using Azure Data Factory - access issue

We are trying to reach an API hosted on our company network using the REST connector in ADF (a SHIR is used). The linked service connection succeeds, but the dataset is unable to read the data, and the copy activity fails as well with the error below. Please suggest your thoughts on resolving this.
Failure happened on 'Source' side. ErrorCode=UserErrorFailToReadFromRestResource,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=An error occurred while sending the request.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.Net.Http.HttpRequestException,Message=An error occurred while sending the request.,Source=mscorlib,''Type=System.Net.WebException,Message=Unable to connect to the remote server,Source=System,''Type=System.Net.Sockets.SocketException,Message=A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond ,Source=System,'
This error is most often caused by firewall issues. You may want to verify your network firewall settings to allow the API request through.
Also, verify that your API call works as expected using other API testing tools. If the issue persists, you can raise a support ticket so engineers can investigate the issue further.
If you are able to preview data in your source, then check your sink connection, as this issue can also occur when the sink in the copy activity is behind a firewall. I was getting the same issue, and when I tried copying to a container without a firewall it worked. It is odd that the error points at the source while the issue is with the sink.

Redis ECONNRESET from Cloud Function

I am getting a lot of this message:
Error: Redis connection to x.x.x.x:6379 failed - read ECONNRESET
I know what it means, but I don't know how it happens and how to troubleshoot it.
My setup: I have a Redis server running on a Compute Engine instance for caching purposes. I have an app running on Cloud Functions that uses cachegoose to cache queries to my MongoDB, and it is set to use that Redis server. As you can tell, I see that error message in the Console.
It seems that requests the app sends to Redis sometimes go through and sometimes don't; I can tell because I can see some data going in and out of the Redis db by checking with KEYS *.
My question is: how does the ECONNRESET issue occur? Is there an issue with the connection between Cloud Functions and Compute Engine? If so, how do I troubleshoot it?

Timeout in uploading a big file to google cloud storage

I'm having trouble uploading large files to Google Cloud Storage. I successfully uploaded a 700MB file, but when I tried a 5GB text file, it threw the following exception. I was unable to find a solution with a Google search.
The problem occurs in the main method of a simple Java class.
Exception in thread "main" java.lang.RuntimeException: java.net.SocketTimeoutException: Read timed out
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:150)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:312)
at sun.security.ssl.InputRecord.read(InputRecord.java:350)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:893)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:850)
......
As described in "Getting java.net.SocketTimeoutException: Connection timed out in android", it looks like you may need to increase your connection timeout setting. That link is for Android, but the same thing applies here, and it's implemented in exactly the same way.
With larger files, especially on mobile devices and wireless connections, you're much more likely to have your uploads interrupted by a broken connection. The solution to this is to make your upload resilient against broken connections. Google Cloud Storage handles this using a technique called Resumable Uploads. You'll need to make use of this technique so your application can recover from network issues.
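With the Java client, one way to use resumable uploads is the WriteChannel returned by Storage#writer, which streams the file in chunks over a resumable upload session. A minimal sketch, assuming default credentials are configured; the bucket, object, and file path are hypothetical placeholders:

import com.google.cloud.WriteChannel;
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;

public class ResumableUploadSketch {
    public static void main(String[] args) throws Exception {
        Storage storage = StorageOptions.getDefaultInstance().getService();
        BlobInfo blobInfo = BlobInfo.newBuilder("my-bucket", "big-file.txt").build();

        // storage.writer(...) opens a resumable upload session and uploads chunk
        // by chunk, instead of pushing all 5GB through one long-lived request.
        try (WriteChannel writer = storage.writer(blobInfo);
             FileChannel in = FileChannel.open(Paths.get("/path/to/big-file.txt"))) {
            ByteBuffer buffer = ByteBuffer.allocate(1024 * 1024); // 1 MiB per write
            while (in.read(buffer) >= 0 || buffer.position() > 0) {
                buffer.flip();
                writer.write(buffer);
                buffer.compact();
            }
        }
    }
}

The channel's progress can also be captured with writer.capture() and restored later, which is what makes it possible to continue an interrupted upload rather than starting over.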
In the Java SDK we also have the option to change the connection timeout and retry behavior:

import com.google.api.gax.retrying.RetrySettings;
import com.google.cloud.http.HttpTransportOptions;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;

// Raise the HTTP connect/read timeouts to 60 seconds (values are milliseconds).
HttpTransportOptions transportOptions = StorageOptions.getDefaultHttpTransportOptions();
transportOptions = transportOptions.toBuilder()
        .setConnectTimeout(60000).setReadTimeout(60000).build();

// maxAttempts counts the initial call, so 2 means at most one retry.
Storage storage = StorageOptions.newBuilder()
        .setRetrySettings(RetrySettings.newBuilder().setMaxAttempts(2).build())
        .setTransportOptions(transportOptions)
        .setProjectId("project_id").build().getService();

ClientSession is closed by HornetQ

We encountered the following exception in HornetQ (HornetQ 2.2.5 GA with JBoss 4.3.3, using the InVM connector; both the client and the server are on the same machine):
hornetq-failure-check-thread,Connection failure has been detected: Did not receive data from invm:0.
The error code is 3 (which is HornetQException.CONNECTION_TIMEDOUT).
This causes the RemotingServiceImpl.FailureCheckAndFlushThread to run, which writes the following log entry multiple times:
Client connection failed, clearing up resources for session 95406085-7b3a-11e2-86d3-005056b14e26
Note that in our application we reuse our ClientSessions. We have one ClientSession instance per connection (we open multiple connections, one per client), and the above problem caused one of the sessions to be closed.
After reading this post: Connection timeout issues - Connection failure has been detected
I understood that we need to configure the following on our ServerLocator instance (which is used to create the ClientSessionFactory that creates our ClientSessions):
// Effectively disable the client-side failure check and the connection TTL.
ServerLocator locator = HornetQClient.createServerLocatorWithoutHA(connectorConfig);
locator.setClientFailureCheckPeriod(Long.MAX_VALUE);
locator.setConnectionTTL(-1);
This configuration solved the problem, and the above error was not reproduced.
Our problem is the following: in case the sessions are closed again by HornetQ for some other reason, how can we create new sessions to replace the closed ones?
I'm asking this because after we found the session was closed (and before we set the clientFailureCheckPeriod and connectionTTL values), we tried to create a new session by calling createSession(false, true, true) on the ClientSessionFactory instance (we create that instance only once, at system startup, and reuse it since), and it failed with the following error:
HornetQException[errorCode=0 message=Failed to create session]
So we did not succeed in creating new sessions, and the only solution was restarting JBoss.
Note that we can't restart our application at the client site, so we need to find a way to create new sessions in case the old ones were closed for some reason.
Instead of doing that, you should probably configure reconnection retries with a proper value; that way your connection will be reconnected automatically.
But since you're using InVM, as long as you don't stop the server you should be fine with that configuration. However, if you intend to restart just the server, you could use infinite reconnection retries (reconnect attempts set to -1) and the session would be reattached or recreated seamlessly for you, as sketched below.
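A minimal sketch of that reconnection configuration on the ServerLocator (assuming the HornetQ 2.2.x client API; connectorConfig is the TransportConfiguration you already use):

ServerLocator locator = HornetQClient.createServerLocatorWithoutHA(connectorConfig);
locator.setReconnectAttempts(-1);        // -1 = keep retrying forever
locator.setRetryInterval(2000);          // wait 2s between attempts
locator.setRetryIntervalMultiplier(1.5); // back off after each failure
locator.setMaxRetryInterval(30000);      // cap the backoff at 30s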
Anyway, I would recommend moving to a newer version than 2.2.5.