PipedInputStream / PipedOutputStream, ImageIO and ffmpeg

I have the following code in Scala:
val pos = new PipedOutputStream()
val pis = new PipedInputStream(pos)
Future {
  LOG.trace("Start rendering")
  generateFrames(videoRenderParams.length) {
    img ⇒ ImageIO.write(img, "PNG", pos)
  }
  pos.flush()
  IOUtils.closeQuietly(pos)
  LOG.trace("Finished rendering")
} onComplete {
  case Success(_) ⇒
    LOG.trace("Complete successfully")
  case Failure(err) ⇒
    LOG.error("Can't render stuff", err)
    IOUtils.closeQuietly(pis)
    IOUtils.closeQuietly(pos)
}
val prc = (ffmpegCli #< pis).!(logger)
The Future simply writes the generated images one by one to the OutputStream. The ffmpeg process reads the input images from stdin and converts them to an MP4 file.
That works pretty well, but sometimes, for some reason, I get the following stack trace:
I/O error Pipe closed for process: <input stream>
java.io.IOException: Pipe closed
at java.io.PipedInputStream.checkStateForReceive(PipedInputStream.java:260)
at java.io.PipedInputStream.receive(PipedInputStream.java:226)
at java.io.PipedOutputStream.write(PipedOutputStream.java:149)
at scala.sys.process.BasicIO$.loop$1(BasicIO.scala:236)
at scala.sys.process.BasicIO$.transferFullyImpl(BasicIO.scala:242)
at scala.sys.process.BasicIO$.transferFully(BasicIO.scala:223)
at scala.sys.process.ProcessImpl$PipeThread.runloop(ProcessImpl.scala:159)
at scala.sys.process.ProcessImpl$PipeSource.run(ProcessImpl.scala:179)
At the same time I'm getting the following error from another stream:
javax.imageio.IIOException: I/O error writing PNG file!
at com.sun.imageio.plugins.png.PNGImageWriter.write(PNGImageWriter.java:1168)
at javax.imageio.ImageWriter.write(ImageWriter.java:615)
at javax.imageio.ImageIO.doWrite(ImageIO.java:1612)
at javax.imageio.ImageIO.write(ImageIO.java:1578)
at
So it seems that the streams were broken somewhere in between: ffmpeg cannot read the data, and ImageIO cannot write it.
What is even more interesting is that the problem is reproducible only on one particular Linux server (Amazon). It works flawlessly on other Linux boxes. So I wonder if somebody could point me to possible causes of this error.
What I've tried so far:
using Oracle JDK 8 and OpenJDK
using different versions of FFmpeg
Nothing has worked so far.

The problem was kind of predictable and weird at the same time. There were ten concurrent ffmpeg processes scheduled to handle the input, and the input was a set of hundreds of Full HD pictures. That takes a lot of resources, memory in particular, so the kernel's out-of-memory killer shut down ffmpeg processes at random, which made the Java wrapper report a broken input pipe and a broken output pipe at the same time.
Accordingly, /var/log/messages contained many entries like the ones below:
Out of memory: Kill process 25778 (java) score 159 or sacrifice child
Killed process 25931 (ffmpeg) total-vm:2337040kB, anon-rss:966340kB, file-rss:104kB
Reducing the number of concurrent ffmpeg processes solved the issue.
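One way to cap the number of simultaneous ffmpeg runs is a fixed-size thread pool. The following is only a sketch in Java, not the code from the post: BoundedJobs and runAll are hypothetical names, and each job stands in for whatever launches one ffmpeg process and waits for it to finish.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: runs jobs with a hard cap on how many run at once.
final class BoundedJobs {
    /** Runs every job, never more than maxConcurrent at a time; results keep the input order. */
    static <T> List<T> runAll(List<Callable<T>> jobs, int maxConcurrent) throws Exception {
        // The pool has exactly maxConcurrent worker threads, so extra jobs wait in its queue.
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        try {
            List<T> results = new ArrayList<>();
            // invokeAll blocks until all jobs have completed (or failed).
            for (Future<T> future : pool.invokeAll(jobs)) {
                results.add(future.get()); // rethrows a job failure as ExecutionException
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

A suitable value for maxConcurrent depends on how much memory one ffmpeg run needs compared to what the machine has, so it is best made configurable per host.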

Related

Mirth is reading too slow from disk

I am using Mirth 3.0.1. I am reading a file (using File Reader) that has 34,000 records. Every record has 45 columns, separated by pipes (|). Mirth takes too much time reading the file from disk. Mirth is installed on the same server where the file is located. Earlier, I ran into the Java heap space issue, which I resolved by setting -Xms1024m -Xmx4096m in the files mcserver.vmoptions and mcservice.vmoptions. Now I have to solve the read performance issue. Please find the channel attached.

The answer to this problem depends heavily on the solution itself. For example, if your benchmark includes transformations, the problem might not be reading the file at all, but rather the massive amount of filtering and transformation done in Mirth. Since Mirth converts everything you configure into basically one gigantic JavaScript that executes on the server, that might just as well be what causes the performance problem. Pre-processor scripts might also create a problem if you do something that causes Mirth to read the whole file.
It might also be that the 34,000 lines in the file contain huge quantities of information, simply making the file very big and expensive to process. If every record in the file is supposed to create a new message within Mirth, you might also want to check the batch settings for the reader.
In addition, the performance of read operations from disk is of course affected a lot by the infrastructure and hardware of the platform itself. You mentioned that you are reading the files locally and that you had to increase the memory for Mirth. Each of these could be a problem in itself. To make a benchmark, you would want to compare this to something else. Maybe write a small Java program that just reads the file, to compare performance outside of Mirth.
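A minimal sketch of such a comparison program, assuming a UTF-8, pipe-delimited text file with one record per line. ReadBenchmark and scan are made-up names, and the file path is taken from the first command-line argument.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical benchmark: reads a pipe-delimited file outside of Mirth and times it.
final class ReadBenchmark {
    /** Reads the file line by line and returns {record count, total column count}. */
    static long[] scan(Path file) throws IOException {
        long records = 0;
        long columns = 0;
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                records++;
                // Limit -1 keeps trailing empty columns, so "a|b|" counts as three columns.
                columns += line.split("\\|", -1).length;
            }
        }
        return new long[] {records, columns};
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ReadBenchmark <file>");
            return;
        }
        long start = System.nanoTime();
        long[] counts = scan(Paths.get(args[0]));
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(counts[0] + " records, " + counts[1] + " columns in " + millis + " ms");
    }
}
```

If this finishes far faster than the channel does, the time is most likely spent in Mirth's processing rather than in disk reads.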

Thanks for the suggestions.
I have used router.routeMessage('channelName','PartOfMsg') to route the records, 5000 at a time, from one channel to a second channel, out of the file of 34,000 records. This made reading from the file faster, with the records being processed at the same time.
For the Mirth community, below is the code to route a message from one channel to another. This solution also applies if you have a large number of records to process in batches.
In the Source Transformer:
debug = "ON";
XML.ignoreWhitespace = true;
logger.debug('Inside source transformer "SplitFileIntoFiles" of channel: SplitFile');
var
    subSegmentCounter = 0,
    xmlMessageProcessCounter = 0,
    singleFileLimit = 5000,
    isError = false,
    xmlMessageProcess = new XML(<delimited><row><column1></column1><column2></column2></row></delimited>),
    newSubSegment = <row><column1></column1><column2></column2></row>,
    totalPatientRecords = msg.children().length();
logger.debug('Total number of records found in the patient input file are: ');
logger.debug(totalPatientRecords);
try {
    for each (seg in msg.children())
    {
        xmlMessageProcess.appendChild(newSubSegment);
        xmlMessageProcess['row'][xmlMessageProcessCounter] = msg['row'][subSegmentCounter];
        if (xmlMessageProcessCounter == singleFileLimit -1)
        {
            logger.debug('Now sending the 5000 records to the next channel from channel DOR Batch File Process IHI');
            router.routeMessage('DOR SendPatientsToMedicare',xmlMessageProcess);
            logger.debug('After sending the 5000 records to the next channel from channel DOR Batch File Process IHI');
            xmlMessageProcessCounter = 0;
            delete xmlMessageProcess['row'];
        }
        subSegmentCounter++;
        xmlMessageProcessCounter++;
    }// End of FOR loop
}// End of try block
catch (exception)
{
    logger.error('The exception has been raised in source transformer "SplitFileIntoFiles" of channel: SplitFile');
    logger.error(exception);
    globalChannelMap.put('isFailed',true);
    globalChannelMap.put('errDesc',exception);
    return true;
}
if (xmlMessageProcessCounter > 1)
{
    try
    {
        logger.debug('Now sending the remaining records to the next channel from channel DOR Batch File Process IHI');
        router.routeMessage('DOR SendPatientsToMedicare',xmlMessageProcess);
        logger.debug('After sending the remaining records to the next channel from channel DOR Batch File Process IHI');
        delete xmlMessageProcess['row'];
    }
    catch (exception)
    {
        logger.error('The exception has been raised in source transformer "SplitFileIntoFiles" of channel: SplitFile');
        logger.error(exception);
        globalChannelMap.put('isFailed',true);
        globalChannelMap.put('errDesc',exception);
        return true;
    }
}
return true;
// End of JavaScript
Hope this helps.
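For readers who want the batching idea without the E4X details, here is a plain Java sketch. It does not use any Mirth API: Batcher and partition are hypothetical names, and the batch size of 5000 simply mirrors singleFileLimit from the transformer. Each returned batch would be handed to the next stage in one call, the way the transformer hands a group of rows to router.routeMessage.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a list of records into consecutive fixed-size batches.
final class Batcher {
    /** Returns batches of at most batchSize records; only the last batch may be smaller. */
    static <T> List<List<T>> partition(List<T> records, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < records.size(); start += batchSize) {
            int end = Math.min(start + batchSize, records.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(records.subList(start, end)));
        }
        return batches;
    }
}
```

With 34,000 records and a batch size of 5000, this yields six full batches and a final batch of 4000.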

Google cloud storage gsutil tool with Java

If we have around 30 GB of files (ranging from 50 MB to 4 GB each) that need to be uploaded to Google Cloud Storage every day, then according to the Google docs, gsutil might be the only suitable choice, isn't it?
I want to call the gsutil command from Java, and the code below works. But if I delete the while loop, the program stops immediately after runtime.exec(command); the python process is started but does no uploading, and it is soon killed. I wonder why.
Reading from the stderr stream was inspired by "Pipe gsutil output to file".
I decide whether gsutil has finished executing by reading until the last line of its status output, but is that a reliable way? Is there a better way to detect in Java whether gsutil has finished?
String command="python c:/gsutil/gsutil.py cp C:/SFC_Data/gps.txt"
        + " gs://getest/gps.txt";
try {
    Process process = Runtime.getRuntime().exec(command);
    System.out.println("the output stream is "+process.getErrorStream());
    BufferedReader reader=new BufferedReader(new InputStreamReader(process.getErrorStream()));
    String s;
    while ((s = reader.readLine()) != null){
        System.out.println("The inout stream is " + s);
    }
} catch (IOException e) {
    e.printStackTrace();
}

There is certainly more than one way to upload 30 GB worth of data per day to GCS. Since you are working in Java, have you considered using the Cloud Storage API Java client library?
https://developers.google.com/api-client-library/java/apis/storage/v1
As for the specific questions about calling gsutil from Java using Runtime.exec(), I suspect that without the while loop, the program exits immediately after creating the sub-process, causing the "process" variable to be GC'ed, which might kill the sub-process.
I think you should wait for the sub-process to complete, which is effectively what the while loop is doing. Or you can just call waitFor() and check exitValue() if you don't care about the output:
http://docs.oracle.com/javase/7/docs/api/java/lang/Process.html
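A minimal sketch of the waitFor() approach, assuming Java 7 or later. ProcessRunner and its Result type are hypothetical names, and the command is passed in as a list, so the gsutil invocation from the question would be supplied by the caller. Standard error is merged into standard output and drained before waiting, so the child cannot stall on a full pipe.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: runs a command, collects its output, and waits for it to exit.
final class ProcessRunner {
    static final class Result {
        final int exitCode;
        final List<String> lines;

        Result(int exitCode, List<String> lines) {
            this.exitCode = exitCode;
            this.lines = lines;
        }
    }

    static Result run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout: one stream to drain
        Process process = builder.start();

        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            // readLine() returns null once the child closes its output, normally at exit.
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        // waitFor() blocks until the process has terminated and returns its exit code.
        int exitCode = process.waitFor();
        return new Result(exitCode, lines);
    }

    public static void main(String[] args) throws Exception {
        Result result = run(Arrays.asList(args));
        for (String line : result.lines) {
            System.out.println(line);
        }
        System.out.println("exit code: " + result.exitCode);
    }
}
```

By convention an exit code of 0 means success, which is a sturdier completion signal than matching the last line of the status output.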

I drew the following picture based on Zhihong Yao's explanation. I hope it helps anyone with the same question.

Why does this code work successfully with Enumerator.fromFile?

I wrote the file transferring code as follows:
val fileContent: Enumerator[Array[Byte]] = Enumerator.fromFile(file)
val size = file.length.toString
file.delete // (1) THE FILE IS TEMPORARY SO SHOULD BE DELETED
SimpleResult(
  header = ResponseHeader(200, Map(CONTENT_LENGTH -> size, CONTENT_TYPE -> "application/pdf")),
  body = fileContent)
This code works successfully, even if the file is rather large (2.6 MB),
but I'm confused, because my understanding is that .fromFile() is a wrapper around fromCallBack() and that SimpleResult actually reads the file in a buffered way, yet the file is deleted before that happens.
My naive assumption is that java.io.File.delete waits until the file is released after the chunked reading completes, but I have never heard of the Java File class behaving that way.
Or .fromFile() has already loaded the whole file into the Enumerator instance, but I think that goes against the fromCallBack() spec.
Does anybody know how this mechanism works?

I'm guessing you are on some kind of Unix system, OS X or Linux for example.
On a Unix-like system you can actually delete a file that is open. Any filesystem entry is just a link to the actual file, and so is the file handle you get when you open a file. The file contents won't become unreachable or be deleted until the last link to them is removed.
So the file will no longer show up in the filesystem after you call file.delete, but you can still read it using the InputStream that was created in Enumerator.fromFile(file), since that created a file handle. (On Linux you can actually find it through the special /proc filesystem, which, among other things, contains the file handles of each running process.)
On Windows I think you will get an error though, so if it is to run on multiple platforms you should probably test your webapp on Windows as well.
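The behaviour is easy to try out in isolation. Below is a small Java sketch, with DeleteWhileOpen and readAfterDelete as made-up names: it writes a temporary file, opens it, deletes it while the stream is still open, and then reads the content through the open stream. It prints what delete() returned instead of assuming the result, since that is the part that differs between platforms.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical demo: read a file through a stream opened before the file was deleted.
final class DeleteWhileOpen {
    static String readAfterDelete(byte[] content) throws IOException {
        File file = File.createTempFile("delete-while-open", ".tmp");
        Files.write(file.toPath(), content);
        try (InputStream in = new FileInputStream(file)) {
            // On Unix-like systems this removes the directory entry only;
            // the data stays reachable through the open stream.
            boolean deleted = file.delete();
            System.out.println("deleted=" + deleted + ", still listed=" + file.exists());

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            // Clean up on platforms where the delete above was refused.
            file.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAfterDelete("hello".getBytes(StandardCharsets.UTF_8)));
    }
}
```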

Limiting gstreamer pipeline throughput to simulate live source

I'm developing an RTSP server that should emulate a live source, while streaming the data from a file.
What I currently have is mostly based on gst-rtsp-server example test-readme.c, only with the following pipeline:
gst_rtsp_media_factory_set_launch(factory, "( "
"filesrc location=stream.mkv ! matroskademux name=demuxer "
"demuxer. ! queue ! rtph264pay name=pay0 pt=96 "
"demuxer. ! queue ! rtpmp4gpay name=pay1 pt=97 "
")");
This works very well, except for one problem: when the RTSP client (which uses RTSP/TCP interleave transport) is not able to receive data, the whole pipeline locks up until the client is ready again, and then resumes at the original position without any jump.
Since I want to emulate a live source, which cannot buffer its video indefinitely, the desired behavior in this case is to continue playing the file, so when the client blocks for 5 seconds, it will lose 5 seconds of the recording.
I've attempted to achieve this by limiting the queue sizes and making the queues leaky (by setting them as queue max-size-bytes=1000000 max-size-time=1000000000 leaky=upstream), which should provide a buffer of about 1 second of video, but no more. This did not work entirely as I hoped: the source and demuxer filled the queue and then completely emptied themselves in 0.1 sec.
I figured I need some way to throttle pipeline throughput before the queue, either by limiting the demuxer to real-time demuxing, or by finding or making a GStreamer filter that lets through 1 second of data per 1 second of real time.
Do you have any hints on how to do this?

So it seems that while a leaky queue and a limiter can be done, they don't help much here, because the GStreamer RTSP implementation has its own queue for outgoing TCP data. What appears to work is keeping the pipeline unchanged and patching the gst-rtsp-server module to limit the length of that queue (to 1 MB in this case; recent versions also limit the message count to 100):
--- gst-rtsp-server-1.4.5/gst/rtsp-server/rtsp-client.c 2014-11-06 11:20:28.000000000 +0100
+++ gst-rtsp-server-1.4.5-r1/gst/rtsp-server/rtsp-client.c 2015-04-28 14:25:14.207888281 +0200
@@ -3435,11 +3435,11 @@
   gst_rtsp_client_set_send_func (client, do_send_message, priv->watch,
       (GDestroyNotify) gst_rtsp_watch_unref);
   /* FIXME make this configurable. We don't want to do this yet because it will
    * be superceeded by a cache object later */
-  gst_rtsp_watch_set_send_backlog (priv->watch, 0, 100);
+  gst_rtsp_watch_set_send_backlog (priv->watch, 1000000, 100);
   GST_INFO ("client %p: attaching to context %p", client, context);
   res = gst_rtsp_watch_attach (priv->watch, context);
   return res;
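To make the effect of the two arguments concrete, here is a toy model in Java of a send backlog with a byte limit and a message limit. It is an illustration only, not GStreamer's implementation: SendBacklog, offer and poll are hypothetical names, and the rule that a limit of 0 means "unlimited" is my reading of the documented behaviour of gst_rtsp_watch_set_send_backlog. The exact boundary check in the real code may differ.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of an outgoing message backlog with byte and message limits.
final class SendBacklog {
    private final long maxBytes;   // 0 means no byte limit (assumption, see lead-in)
    private final int maxMessages; // 0 means no message limit (assumption, see lead-in)
    private final Deque<byte[]> queue = new ArrayDeque<>();
    private long queuedBytes;

    SendBacklog(long maxBytes, int maxMessages) {
        this.maxBytes = maxBytes;
        this.maxMessages = maxMessages;
    }

    /** Queues the message, or returns false if doing so would exceed a limit. */
    boolean offer(byte[] message) {
        boolean tooManyBytes = maxBytes != 0 && queuedBytes + message.length > maxBytes;
        boolean tooManyMessages = maxMessages != 0 && queue.size() + 1 > maxMessages;
        if (tooManyBytes || tooManyMessages) {
            return false; // the caller decides what to do, for example drop the data
        }
        queue.addLast(message);
        queuedBytes += message.length;
        return true;
    }

    /** Removes the oldest queued message (as if it had been written to the socket), or returns null. */
    byte[] poll() {
        byte[] message = queue.pollFirst();
        if (message != null) {
            queuedBytes -= message.length;
        }
        return message;
    }
}
```

In this model, a byte limit of 0 lets a stalled client accumulate an unbounded number of bytes as long as the message count stays within its limit, while a non-zero byte limit makes the sender find out quickly that the client is not keeping up.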

Weird Winsock recv() slowdown

I'm writing a little VoIP app like Skype, which works quite well right now, but I've run into a very strange problem.
In one thread, inside a while(true) loop, I call the Winsock recv() function twice per iteration to get data from a socket.
The first call gets 2 bytes, which are cast to a (short), while the second call gets the rest of the message, which looks like:
Complete Message: [2 Byte Header | Message, length determined by the 2Byte Header]
About 49 of these packets arrive per second, which is roughly 3000 bytes/sec.
The content of these packets is audio data that gets converted into wave.
After each "message" I receive (2 bytes + data), I use ioctlsocket() to determine whether there is more data on the socket. If there is something on the socket right after I received a message within the while(true) loop of the thread, the message is received but thrown away, to keep latency from building up.
This concept works very well, but here's the problem:
While my VoIP program is running and I download a file in parallel (e.g. via a browser), too much data always piles up on the socket, because the recv() loop actually seems to slow down during the download. This happens with any upload or download running alongside the VoIP traffic itself.
I don't know where this behaviour comes from, but when I cancel every upload and download other than the VoIP traffic of my application, my app works perfectly again.
When the program runs perfectly, the ioctlsocket() function writes 0 into the bytesLeft variable, which is defined in the class that the receive function belongs to.
Does somebody know where this comes from? My receive function is below:
std::string D_SOCKETS::receive_message(){
    recv(ClientSocket,(char*)&val,sizeof(val),MSG_WAITALL);
    receivedBytes = recv(ClientSocket,buffer,val,MSG_WAITALL);
    if (receivedBytes != val){
        printf("SHORT: %d PAKET: %d ERROR: %d",val,receivedBytes,WSAGetLastError());
        exit(128);
    }
    ioctlsocket(ClientSocket,FIONREAD,&bytesLeft);
    cout<<"Bytes left on the Socket:"<<bytesLeft<<endl;
    if(bytesLeft>20)
    {
        // message gets received, but ignored/thrown away to throw away
        return std::string();
    }
    else
        return std::string(buffer,receivedBytes);
}

There is no need to use ioctlsocket() to discard data. That would indicate a bug in your protocol design. Assuming you are using TCP (you did not say), there should not be any leftover data if your 2-byte header is always accurate. After reading the 2-byte header and then reading the specified number of bytes, the next bytes you receive constitute your next message and should not be discarded simply because they exist.
The fact that ioctlsocket() reports more bytes available means that you are receiving messages faster than you are reading them from the socket. Make your reading code run faster; don't throw away good data because your reader is slow.
Your reading model is not efficient. Instead of reading 2 bytes, then X bytes, then 2 bytes, and so on, you should use a larger buffer to read more raw data from the socket at one time (use ioctlsocket() to find out how many bytes are available, then read at least that many bytes at once and append them to the end of your buffer), and then parse as many complete messages as are in the buffer before reading more raw data from the socket again. The more data you can read at a time, the faster you can receive data.
To help speed up the code even more, don't process the messages inside the loop directly, either. Do the processing in another thread instead. Have the reading loop put complete messages in a queue and go back to reading, and then have a processing thread pull from the queue whenever messages are available for processing.
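The buffering and parsing step can be sketched independently of Winsock. The following Java sketch is an illustration under assumptions, not the asker's code: FrameParser and feed are hypothetical names, and the 2-byte header is treated as an unsigned little-endian length (the byte order a short has in memory on x86 Windows). Raw chunks of any size go in, and every complete [2-byte header | payload] message comes out. A partial message stays buffered until the rest arrives.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for messages framed as [2-byte length | payload].
final class FrameParser {
    // Kept in "write mode" between calls: position marks the end of the buffered data.
    private ByteBuffer pending = ByteBuffer.allocate(4096).order(ByteOrder.LITTLE_ENDIAN);

    /** Appends length bytes from chunk and returns every message that is now complete. */
    List<byte[]> feed(byte[] chunk, int length) {
        ensureCapacity(length);
        pending.put(chunk, 0, length);
        pending.flip(); // switch to reading the buffered bytes

        List<byte[]> messages = new ArrayList<>();
        while (pending.remaining() >= 2) {
            pending.mark();
            int payloadLength = pending.getShort() & 0xFFFF; // treat the header as unsigned
            if (pending.remaining() < payloadLength) {
                pending.reset(); // payload incomplete: put the header back and wait
                break;
            }
            byte[] payload = new byte[payloadLength];
            pending.get(payload);
            messages.add(payload);
        }
        pending.compact(); // keep the unparsed tail and return to write mode
        return messages;
    }

    private void ensureCapacity(int extra) {
        if (pending.remaining() >= extra) {
            return;
        }
        int needed = Math.max(pending.capacity() * 2, pending.position() + extra);
        ByteBuffer bigger = ByteBuffer.allocate(needed).order(ByteOrder.LITTLE_ENDIAN);
        pending.flip();
        bigger.put(pending);
        pending = bigger;
    }
}
```

The reading loop would call feed after each socket read and put the returned messages on a queue for the processing thread, so the socket is drained as fast as data arrives.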
