XmlReader blocks after first read from a NetworkStream - deserialization

I am having problems deserializing data from a network stream. Once the socket is opened, the first read statement succeeds, but a subsequent attempt to deserialize the data blocks without an error (although it will eventually time out).
I have checked that the correct data is being sent and that it is correctly formed. To confirm this I tried the same code and same data but using a file stream. I don't get the same behavior - the deserialization step does not block.
I realize that file streams and network streams have some differences but I would have expected the behavior to be the same.
TcpClient client = new TcpClient();
client.Connect(Server, Port);
NetworkStream stream = client.GetStream();
XmlReaderSettings settings = new XmlReaderSettings();
settings.ConformanceLevel = ConformanceLevel.Fragment;
XmlReader reader = XmlReader.Create(stream, settings);
MyData receivedData;
XmlSerializer xmlSerializer = new XmlSerializer(typeof(MyData));
while (reader.Read())
{
    if (reader.NodeType == XmlNodeType.Element && reader.Name == "MyRoot")
    {
        // The following statement blocks
        receivedData = (MyData)xmlSerializer.Deserialize(reader);
    }
}
After some additional testing I noted that the deserialization step will resume correctly if I force the server to resend the same message. However, the subsequent read statement will also work (picking up the duplicate message) and then block again on the deserialization step.
It might also be worth mentioning the message is a single element. Something like the following:
<MyRoot xmlns="http://www.mydomain.com/mydata" someattribute="123" />

Error handling in Spring Cloud Kafka Streams

I'm using Spring Cloud Stream with Kafka Streams. Let's say I have a processor which is a Function which converts a KStream of Strings to a KStream of CityProgrammes. It invokes an API to find the City by name and another transformation which finds any events near that city.
Now the problem is that when any error happens during the transformation, the whole application stops. I want to send that one particular message to a DLQ and move along. I've been reading for days and everyone suggests handling errors within the called services, but that is nonsense in my opinion; plus I still need to return a KStream: how do I do that within a catch?
I also looked at UncaughtExceptionHandler, but it is not aware of the message and is only able to restart the processing, which won't skip this invalid message.
This might sound like an A-B problem so the question rephrased: how do I maintain the flow in a KStream when an exception occurs and send the invalid item to the DLQ?
When it comes to the application-level errors you have, it is up to the application itself how the error is handled. Kafka Streams and the Spring Cloud Stream binder mainly support deserialization and serialization errors at the framework level. Although that is the case, I think your scenario can be handled. If you are using Kafka Client prior to 2.8, here is an SO answer I gave before on something similar: https://stackoverflow.com/a/66749750/2070861
If you are using Kafka/Streams 2.8, here is an idea that you can use. However, the code below should only be used as a starting point. Adjust it according to your use case. Read more on how branching works in Kafka Streams 2.8. The branching API is significantly refactored in 2.8 from the prior versions.
public Function<KStream<?, String>, KStream<?, Foo>> convert() {
    // Single-element holder so the lambdas below can write to it
    Foo[] foo = new Foo[1];
    return input -> {
        final Map<String, ? extends KStream<?, String>> branches =
            input.split(Named.as("foo-")).branch((key, value) -> {
                try {
                    foo[0] = new Foo(); // your API call for the CityProgramme conversion here, possibly.
                    return true;
                }
                catch (Exception e) {
                    // Conversion failed: send the record to the DLT and keep it out of this branch
                    Message<?> message = MessageBuilder.withPayload(value).build();
                    streamBridge.send("to-my-dlt", message);
                    return false;
                }
            }, Branched.as("bar"))
            .defaultBranch();
        // Branch names are the split prefix followed by the branch name
        final KStream<?, String> kStream = branches.get("foo-bar");
        return kStream.map((key, value) -> new KeyValue<>("", foo[0]));
    };
}
The default branch is ignored in this code because that only contains the records that threw exceptions. Those were handled by the catch statement above in which we send the records to a DLT programmatically. Finally, we get the good records and map them to a new KStream and send it through the outbound.
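The routing idea described above (convert each record, set failures aside, keep going) can be sketched without any Kafka dependency. The following is only a plain-Java analogy, not Kafka Streams code: the class DeadLetterSketch, the Result holder and the convertAll method are hypothetical names, and the deadLetters list stands in for the DLT send. One design difference worth noting is that each conversion result is stored alongside its record instead of in a shared array.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class DeadLetterSketch {
    /** Holds the converted records and the inputs that failed conversion. */
    public static final class Result<I, O> {
        public final List<O> converted = new ArrayList<>();
        public final List<I> deadLetters = new ArrayList<>();
    }

    /**
     * Applies the converter to every input. A failing record is set aside
     * (the stand-in for a DLT send) and processing continues with the next one.
     */
    public static <I, O> Result<I, O> convertAll(List<I> inputs, Function<I, O> converter) {
        Result<I, O> result = new Result<>();
        for (I input : inputs) {
            try {
                result.converted.add(converter.apply(input));
            } catch (RuntimeException e) {
                result.deadLetters.add(input);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical converter: number parsing stands in for the city lookup call.
        Result<String, Integer> r = convertAll(List.of("1", "oops", "3"), Integer::parseInt);
        System.out.println(r.converted + " / " + r.deadLetters); // [1, 3] / [oops]
    }
}
```

The bad record ends up in deadLetters while the two good records are still converted, which is the behaviour the question asks for.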

Kafka Producer : Handle Exception in Async Send with Callback

I need to catch exceptions in the case of an async send to Kafka. The Kafka producer API comes with a function send(ProducerRecord record, Callback callback). But when I tested this against the following two scenarios:
Kafka Broker Down
Topic not pre created
The callbacks are not getting called. Rather, I am getting a warning for the unsuccessful send (as shown below).
Questions :
So are the callbacks called only for specific exceptions ?
When does Kafka Client try to connect to Kafka broker while async send : on every batch send or periodically ?
Kafka Warning Image
Note : I am also using linger.ms setting of 25 sec to batch send my records.
import java.io.IOException;
import java.util.Properties;

import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.serialization.StringSerializer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ProducerDemo {
    static KafkaProducer<String, String> producer;

    public static void main(String[] args) throws IOException {
        final Logger logger = LoggerFactory.getLogger(ProducerDemo.class);
        Properties properties = new Properties();
        properties.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "127.0.0.1:9092");
        properties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.setProperty(ProducerConfig.ACKS_CONFIG, "1");
        properties.setProperty(ProducerConfig.LINGER_MS_CONFIG, "30000");
        producer = new KafkaProducer<String, String>(properties);
        String topic = "first_topic";
        for (int i = 0; i < 5; i++) {
            String value = "hello world " + Integer.toString(i);
            String key = "id_" + Integer.toString(i);
            ProducerRecord<String, String> record = new ProducerRecord<String, String>(topic, key, value);
            producer.send(record, new Callback() {
                public void onCompletion(RecordMetadata recordMetadata, Exception e) {
                    // executes every time a record is successfully sent or an exception is thrown
                    if (e == null) {
                        // No Exception
                    } else {
                        // Exception Handling
                    }
                }
            });
        }
        producer.close();
    }
}
You will get those warnings for a non-existing topic as a resilience mechanism provided by KafkaProducer. If you wait a bit longer (it should be 60 seconds by default), the callback will eventually be called:
Here's my snippet:
So, when something goes wrong and async send is not successful, it will eventually fail with a failed future or/and a callback with exception.
If you are not running it transactionally, it can still mean that some messages from the batch have found their way to the broker, while others haven't.
It will most certainly be a problem if you need a blocking-style acknowledgement to the upstream system (like an HTTP ingestion interface, etc.) for every message that is sent to Kafka. The only way to do that is by blocking on every message with the future's get, as described in the documentation:
In general, I've noticed a lot of questions related to KafkaProducer delivery semantics and guarantees. It can definitely be documented better.
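To make the "block on the future's get" idea concrete, here is a minimal sketch that uses only the JDK. It does not call the Kafka client: sendAsync is a hypothetical stand-in for an asynchronous send that returns a Future (as KafkaProducer.send does), and the failure for empty payloads is simulated purely for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BlockingAckSketch {
    /** Hypothetical stand-in for an async send: fails for empty payloads. */
    static Future<String> sendAsync(String payload) {
        return CompletableFuture.supplyAsync(() -> {
            if (payload.isEmpty()) {
                throw new IllegalStateException("simulated delivery failure");
            }
            return "acked:" + payload;
        });
    }

    /** Blocks until the send completes and reports success or failure to the caller. */
    static boolean sendAndWait(String payload, long timeoutMs) {
        try {
            sendAsync(payload).get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (ExecutionException | TimeoutException e) {
            return false; // the send failed or did not finish in time
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(sendAndWait("hello", 5000));
        System.out.println(sendAndWait("", 5000));
    }
}
```

Blocking per message gives the upstream caller a definite answer for each record, at the cost of giving up most of the batching benefit.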
One more thing, since you mentioned linger.ms:
Note that records that arrive close together in time will generally
batch together even with linger.ms=0 so under heavy load batching will
occur regardless of the linger configuration
For the first question, here is the answer.
As per the Apache Kafka documentation, you can capture the exceptions listed at the link below using the onCompletion method when you implement the Callback interface:
https://kafka.apache.org/25/javadoc/org/apache/kafka/clients/producer/Callback.html
For the second question, the combination of the properties below controls when the records are sent and, as far as I understand, it is the same for synchronous and asynchronous calls.
linger.ms
max.block.ms
https://kafka.apache.org/documentation/#linger.ms
So are the callbacks called only for specific exceptions ?
Yes, that's how it works. From documentation (2.5.0):
* Fully non-blocking usage can make use of the {@link Callback} parameter to provide a callback that
* will be invoked when the request is complete.
Notice the important part: when the request is complete, which means that the producer must have accepted the record and sent the ProduceRequest to the Kafka broker. Without digging too deep into internals, this means that broker metadata must be present and the partition must exist.
For a formal specification, you'd need to take a good look at send()'s Javadoc and possibly at KafkaProducer's implementation of the doSend method. There you will see that several exceptions can be thrown directly from the submitting call (instead of returning a future and invoking the callback), e.g.:
if broker metadata is not available in timeout given,
if data could not be serialized,
if serialized form was too large, etc.
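The practical consequence of the list above is that calling code may need to handle two failure paths: an exception thrown by the send call itself, and an exception delivered later to the callback. The sketch below illustrates that split with a hypothetical sender written in plain Java. It is not the Kafka client, and which real failures take which path depends on the client version, so treat the serialization check and the simulated timeout as assumptions.

```java
import java.util.function.BiConsumer;

public class SendFailurePathsSketch {
    /**
     * Hypothetical sender, not the Kafka client. A record that cannot be
     * serialized is rejected with an exception thrown to the caller; a
     * failure after the record was accepted is reported to the callback.
     */
    static void send(Object record, boolean brokerUp, BiConsumer<String, Exception> callback) {
        if (!(record instanceof String)) {
            throw new IllegalArgumentException("cannot serialize record");
        }
        if (brokerUp) {
            callback.accept("acked:" + record, null);
        } else {
            callback.accept(null, new RuntimeException("simulated delivery timeout"));
        }
    }

    /** Sends one record and names the path that reported the outcome. */
    static String trySend(Object record, boolean brokerUp) {
        StringBuilder outcome = new StringBuilder();
        try {
            send(record, brokerUp, (ack, error) ->
                    outcome.append(error == null ? "callback-ok" : "callback-error"));
        } catch (IllegalArgumentException e) {
            outcome.append("thrown");
        }
        return outcome.toString();
    }

    public static void main(String[] args) {
        System.out.println(trySend("hello", true));  // callback-ok
        System.out.println(trySend("hello", false)); // callback-error
        System.out.println(trySend(42, true));       // thrown
    }
}
```

With the real producer, the equivalent would be a try/catch around send(...) in addition to the null check on the exception inside onCompletion.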

How to close InputStream which fed into Response(jax.rs)

@GET
@Path("/{id}/content")
@Produces({ "application/octet-stream" })
public Response getDocumentContentById(@PathParam("id") String docId) {
    InputStream is = getDocumentStream(); // some method which gives stream
    ResponseBuilder responseBuilder = Response.ok(is);
    responseBuilder.header("Content-Disposition", "attachment; filename=" + fileName);
    return responseBuilder.build();
}
How can I close the InputStream is here? Or does something (JAX-RS) close it automatically? Please give me some information. Thank you.
When you're wanting to stream a custom response, the most reliable way I've found is to return an object that contains the InputStream (or which can obtain the stream in some other way at some point), and to define a MessageBodyWriter provider that will do the actual streaming at the right time.
For example, this code is part of Apache Taverna, and it streams back the zipped contents of a directory. All that the main code needs to do to use it is to return a ZipStream as the response (which can be packaged in a Response or not) and to ensure that it is dealing with returning the application/zip content type. The final point to note is that since this is dealing with CXF, you need to manually register the provider; unlike with Glassfish, they are not automatically picked up. This is a good thing in sophisticated scenarios, but it does mean that you need to do the registration.
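Whichever mechanism does the streaming, the core of "streaming at the right time" is a copy loop that owns the source stream and closes it when the copy finishes or fails. Below is a minimal, JDK-only sketch of that step; the class and method names are hypothetical. In a JAX-RS resource, the same call could plausibly form the body of a StreamingOutput.write(OutputStream) implementation, which is an assumption about how you would wire it in rather than something taken from the Taverna code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class StreamAndCloseSketch {
    /** Copies the source to the output and always closes the source afterwards. */
    static long streamAndClose(InputStream source, OutputStream output) throws IOException {
        try (InputStream in = source) {
            return in.transferTo(output); // returns the number of bytes copied
        }
    }

    /** Demo helper: returns true if the source was closed once streaming finished. */
    static boolean closesSource() {
        boolean[] closed = {false};
        InputStream in = new ByteArrayInputStream("abc".getBytes(StandardCharsets.UTF_8)) {
            @Override
            public void close() throws IOException {
                closed[0] = true;
                super.close();
            }
        };
        try {
            streamAndClose(in, new ByteArrayOutputStream());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return closed[0];
    }

    public static void main(String[] args) {
        System.out.println(closesSource()); // true
    }
}
```

Because the try-with-resources block owns the stream, it is closed even if the client disconnects and the write throws part-way through.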

ASP.NET Web Api: Delegate after Request

I have a problem with streams and the web api.
I return the stream which is consumed by the web api. Currently, I put the socket into a pool after getting the stream, but this causes some errors.
Now, I must put the socket into the pool AFTER the request has ended. (The stream was consumed and is now closed).
Is there a delegate for this or some other best practice?
Example code:
public HttpResponseMessage Get(int fileId)
{
    HttpResponseMessage response = null;
    response = new HttpResponseMessage(HttpStatusCode.OK);
    Stream s = GetFile(fileId);
    response.Content = new StreamContent(s);
    return response;
}

Stream GetFile(int id)
{
    FSClient fs = GetFSClient();
    Stream s = fs.GetFileStream(id);
    // Problem: the client goes back to the pool while the stream may still be in use
    AddFSToPool(fs);
    return s;
}
GetFile uses a self-programmed FileServer-Client.
It has an option to reuse FileServer connections. These connections are stored in a pool. (The pool contains only unused FileServer connections). If the next request calls GetFSClient(), it gets a connected one from the pool (and removes it from the pool).
But if another request comes in and uses a FileServer connection which is in the pool (because it is unused), there is still the problem that the Stream is possibly in use.
Now I want to do the "put the FSClient into the pool" step after the request has ended and the stream is fully consumed.
Is there an entry point for that?
Stream is seen as a volatile/temporary resource - no wonder it implements IDisposable.
Also, Stream is not thread-safe since it has a Position: once it has been read to the end it must be reset to the start, and if two threads read the same stream they will most likely read different chunks.
As such, I would not even attempt to solve this problem. Re-using streams on a web site (inherently multi-user / multi-threaded) is not recommended.
UPDATE
As I said, I still think that the best option is to re-think the solution, but if you need to register something that runs after the request finishes, use RegisterForDispose on the request:
public HttpResponseMessage Get(HttpRequestMessage req, int fileId)
{
    ....
    req.RegisterForDispose(myStream);
}
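The underlying pattern is "run a release action when the stream is finally closed", so that the connection returns to the pool only after the content has been consumed. The sketch below shows that pattern in Java (used here for consistency with the other sketches in this thread dump); ReleasingInputStream and the counter standing in for the pool are hypothetical. In .NET the analogous shape would be a Stream wrapper whose Dispose returns the client to the pool, or an IDisposable handed to RegisterForDispose.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class ReleaseOnCloseSketch {
    /** Wraps a stream and runs a release action exactly once when it is closed. */
    static final class ReleasingInputStream extends FilterInputStream {
        private final Runnable onClose;
        private final AtomicBoolean released = new AtomicBoolean(false);

        ReleasingInputStream(InputStream delegate, Runnable onClose) {
            super(delegate);
            this.onClose = onClose;
        }

        @Override
        public void close() throws IOException {
            try {
                super.close();
            } finally {
                // Guard against double release if close() is called more than once
                if (released.compareAndSet(false, true)) {
                    onClose.run(); // e.g. hand the connection back to the pool
                }
            }
        }
    }

    /** Demo helper: closes a wrapped stream twice and returns how often release ran. */
    static int releasesAfterDoubleClose() {
        AtomicInteger releases = new AtomicInteger();
        InputStream s = new ReleasingInputStream(
                new ByteArrayInputStream(new byte[0]), releases::incrementAndGet);
        try {
            s.close();
            s.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return releases.get();
    }

    public static void main(String[] args) {
        System.out.println(releasesAfterDoubleClose()); // 1
    }
}
```

The once-only guard matters for pooling: returning the same connection twice would let two requests share it, which is the original problem.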

Help with a Windows Service/Scheduled Task that must use a web browser and file dialogs

What I'm Trying To Do
I'm trying to create a solution of any kind that will run nightly on a Windows server, authenticate to a website, check a web page on the site for new links indicating a new version of a zip file, use new links (if present) to download a zip file, unzip the downloaded file to an existing folder on the server, use the unzipped contents (sql scripts, etc.) to build an instance of a database, and log everything that happens to a text file.
Forms App: The Part That Sorta Works
I created a Windows Forms app that uses a couple of WebBrowser controls, a couple of threads, and a few timers to do all that except the running nightly. It works great as a Form when I'm logged in and run it, but I need to get it (or something like it) to run on its own like a Service or scheduled task.
My Service Attempt
So, I created a Windows Service that ticks every hour and, if the System.DateTime.Now.Hour >= 22, attempts to launch the Windows Forms app to do its thing. When the Service attempts to launch the Form, this error occurs:
ActiveX control '8856f961-340a-11d0-a96b-00c04fd705a2' cannot be instantiated because the current thread is not in a single-threaded apartment.
which I researched and tried to resolve by either placing the [STAThread] attribute on the Main method of the Service's Program class or using some code like this in a few places including the Form constructor:
webBrowseThread = new Thread(new ThreadStart(InitializeComponent));
webBrowseThread.SetApartmentState(ApartmentState.STA);
webBrowseThread.Start();
I couldn't get either approach to work. In the latter approach, the controls on the Form (which would get initialized inside InitializeComponent) don't get initialized and I get null reference exceptions.
My Scheduled Task Attempt
So, I tried creating a nightly scheduled task using my own credentials to run the Form locally on my dev machine (just testing). It gets farther than the Service did, but gets hung up at the File Download Dialog.
Related Note: To send the key sequences to get through the File Download and File Save As dialogs, my Form actually runs a couple of vbscript files that use WScript.Shell.SendKeys. Ok, that's embarrassing to admit, but I tried a few different things including SendMessage in the Win32 API and referencing IWshRuntimeLibrary to use SendKeys inside my C# code. When I was researching how to get through the dialogs, the Win32 API seemed to be the recommended way to go, but I couldn't figure it out. The vbscript files were the only thing I could get to work, but I'm worried now that this may be the reason why a scheduled task won't work.
Regarding My Choice of WebBrowser Control
I have read about the System.Net.WebClient class as an alternative to the WebBrowser control, but at a glance, it doesn't look like it has what I need to get this done. For example, I needed (or I think I needed) the WebBrowser's DocumentCompleted and FileDownload events to handle the delays in pages loading, files downloading, etc. Is there more to WebClient that I'm not seeing? Is there another class besides WebBrowser that is more Service-friendly and would do the trick?
In Summary
Geez, this is long. Sorry! It would help to even have a high level recommendation for a better way to do what I'm trying to do, because nothing I've tried has worked.
Update 10/22/09
Well, I think I'm closer, but I'm stuck again. I should end up with a decent-sized zip file with several files in it, but the zip file resulting from my code is empty. Here's my code:
// build post request
string targetHref = "http://wwwcf.nlm.nih.gov/umlslicense/kss/login.cfm";
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(targetHref);
request.Method = "POST";
request.ContentType = "application/x-www-form-urlencoded";
// encoding to use
Encoding enc = Encoding.GetEncoding(1252);
// build post string containing authentication information and add to post request
string poststring = "returnUrl=" + fixCharacters(targetDownloadFileUrl);
poststring += getUsernameAndPasswordString();
poststring += "&login2.x=0&login2.y=0";
// convert to required byte array
byte[] postBytes = enc.GetBytes(poststring);
request.ContentLength = postBytes.Length;
// write post to request
Stream postStream = request.GetRequestStream();
postStream.Write(postBytes, 0, postBytes.Length);
postStream.Close();
// get response as stream
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
Stream responseStream = response.GetResponseStream();
// writes stream to zip file
FileStream writeStream = new FileStream(fullZipFileName, FileMode.Create, FileAccess.Write);
ReadWriteStream(responseStream, writeStream);
response.Close();
responseStream.Close();
The code for ReadWriteStream looks like this.
private void ReadWriteStream(Stream readStream, Stream writeStream)
{
    // taken verbatim from http://www.developerfusion.com/code/4669/save-a-stream-to-a-file/
    int Length = 256;
    Byte[] buffer = new Byte[Length];
    int bytesRead = readStream.Read(buffer, 0, Length);
    // write the required bytes
    while (bytesRead > 0)
    {
        writeStream.Write(buffer, 0, bytesRead);
        bytesRead = readStream.Read(buffer, 0, Length);
    }
    readStream.Close();
    writeStream.Close();
}
The building of the post string is taken from my previous forms app that works. I compared the resulting values in poststring for both sets of code (my working forms app and this one) and they're identical.
I'm not even sure how to troubleshoot this further. Anyone see anything obvious as to why this isn't working?
Conclusion 10/23/09
I finally have this working. A couple of important hurdles I had to get over. I had some problems with the ReadWriteStream method code that I got online. I don't know why, but it wasn't working for me. A guy named JB in Claudio Lassala's Virtual Brown Bag meeting helped me to come up with this code which worked much better for my purposes:
private void WriteResponseStreamToFile(Stream responseStreamToRead, string zipFileFullName)
{
    // responseStreamToRead will contain a zip file, write it to a file in
    // the target location at zipFileFullName
    FileStream fileStreamToWrite = new FileStream(zipFileFullName, FileMode.Create);
    int readByte = responseStreamToRead.ReadByte();
    while (readByte != -1)
    {
        fileStreamToWrite.WriteByte((byte)readByte);
        readByte = responseStreamToRead.ReadByte();
    }
    fileStreamToWrite.Flush();
    fileStreamToWrite.Close();
}
As Will suggested below, I did have trouble with the authentication. The following code is what worked to get around that issue. A few comments inserted addressing key issues I ran into.
string targetHref = "http://wwwcf.nlm.nih.gov/umlslicense/kss/login.cfm";
HttpWebRequest firstRequest = (HttpWebRequest)WebRequest.Create(targetHref);
firstRequest.AllowAutoRedirect = false; // this is critical, without this, NLM redirects and the whole thing breaks
// firstRequest.Proxy = new WebProxy("127.0.0.1", 8888); // not needed for production, but this helped in order to debug the http traffic using Fiddler
firstRequest.Method = "POST";
firstRequest.ContentType = "application/x-www-form-urlencoded";
// build post string containing authentication information and add to post request
StringBuilder poststring = new StringBuilder("returnUrl=" + fixCharacters(targetDownloadFileUrl));
poststring.Append(getUsernameAndPasswordString());
poststring.Append("&login2.x=0&login2.y=0");
// convert to required byte array
byte[] postBytes = Encoding.UTF8.GetBytes(poststring.ToString());
firstRequest.ContentLength = postBytes.Length;
// write post to request
Stream postStream = firstRequest.GetRequestStream();
postStream.Write(postBytes, 0, postBytes.Length); // Fiddler shows that post and response happen on this line
postStream.Close();
// get response as stream
HttpWebResponse firstResponse = (HttpWebResponse)firstRequest.GetResponse();
// create new request for new location and cookies
HttpWebRequest secondRequest = (HttpWebRequest)WebRequest.Create(firstResponse.GetResponseHeader("location"));
secondRequest.AllowAutoRedirect = false;
secondRequest.Headers.Add(HttpRequestHeader.Cookie, firstResponse.GetResponseHeader("Set-Cookie"));
// get response to second request
HttpWebResponse secondResponse = (HttpWebResponse)secondRequest.GetResponse();
// write stream to zip file
Stream responseStreamToRead = secondResponse.GetResponseStream();
WriteResponseStreamToFile(responseStreamToRead, fullZipFileName);
responseStreamToRead.Close();
sl.logScriptActivity("Downloading update.");
firstResponse.Close();
I want to underscore that setting AllowAutoRedirect to false on the first HttpWebRequest instance was critical to the whole thing working. Fiddler showed two additional requests that occurred when this was not set, and it broke the rest of the script.
You're trying to use UI controls to do something in a windows service. This will never work.
What you need to do is just use the WebRequest and WebResponse classes to download the contents of the webpage.
var request = WebRequest.Create("http://www.google.com");
var response = request.GetResponse();
var stream = response.GetResponseStream();
You can dump the contents of the stream, parse the text looking for updates, and then construct a new request for the URL of the file you want to download. That response stream will then have the file, which you can dump on the filesystem and etc etc.
Before you wonder, GetResponse will block until the response returns, and the stream will block as data is being received, so you don't need to worry about events firing when everything has been downloaded.
You definitely need to re-think your approach (as you've already begun to do) to eliminate the Forms-based application approach. The service you're describing needs to operate with no UI at all.
I'm not familiar with the details of System.Net.WebClient, but since it "provides common methods for sending data to and receiving data from a resource identified by a URI," it will probably be your answer.
At first glance, WebClient.DownloadFile(...) or WebClient.DownloadFileAsync(...) will do what you need.
The only thing I can add is that once you have scraped your screen and have the fully qualified name of the file you want to download, you could pass it along to the Windows/DOS command 'get' which will fetch files via HTTP. You can also script a command-line FTP client if desired. It's been a long time since I tried something like this in Windows, but I think you're almost there. Once you have fetched the correct file, building a batch file to do everything else should be pretty easy. If you are more comfortable with Unix, google "unix services for windows"; just keep an eye on the services they start running (DHCP, etc). There are some nice utilities which will let you treat DOS as a Unix-like shell (ls -l, grep, etc). Finally, you could try another language like Perl or Python, but I don't think that's the kind of advice you were looking for. :)