Managing multiple events for the same file change generated by pdflatex with a DispatchQueue - swift

I am currently using a DispatchQueue and a DispatchSourceFileSystemObject to track changes to a PDF file generated by pdflatex. The problem is that when pdflatex generates the PDF, it triggers several '.write' events instead of one. The handler's job is to update the view where the PDF is displayed, and I want to avoid updating the view several times for what is basically the same change (the handler is called every time an event with the monitored flag is received). I want the handler to run only once, when the last '.write' event of a given PDF generation occurs. For example, if pdflatex produces 10 '.write' events, the handler should be called only after the tenth event has been received.
I have tried to:
use a flag to remember that an event was received and ignore later events with the same flag, then sleep for a few seconds until all the '.write' events have arrived. This is not a solution, because the time pdflatex needs to finish depends on the PDF being generated.
read the modification date of the file. With several '.write' events, the date is sometimes the same for every event and sometimes differs by a second or so, so using the date to decide when to call the handler is not a good idea.
I am using a serial queue, so the operations in that queue are not concurrent. If possible, I would like to keep using the DispatchQueue; only if there is no solution with it would I also appreciate an implementation with OperationQueue and Operation or BlockOperation.

I solved the problem this way: every time I receive a '.write' event, I try to build a PDFDocument from the file's URL. If the document is nil, the pdflatex process is still writing the file and more '.write' events will follow. On the last '.write' event I can successfully create the PDFDocument because the file is complete.
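A minimal sketch of that check, assuming PDFKit is available (macOS or iOS). The helper name loadIfComplete is hypothetical; the only API used is the failable PDFDocument(url:) initializer, and whether it returns nil for every partially written file is something to verify against your own pdflatex output.

```swift
import Foundation
import PDFKit

/// Returns the parsed document if PDFKit can open the file at `url`,
/// or nil if the file is missing, unreadable, or not (yet) a valid PDF.
func loadIfComplete(_ url: URL) -> PDFDocument? {
    // PDFDocument(url:) is a failable initializer: it yields nil when
    // the data at `url` cannot be parsed as a PDF document.
    return PDFDocument(url: url)
}
```

In the event handler this would be called on every '.write' event, and the view would be updated only when the result is non-nil.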

Related

Activiti, how to add listener for timer ending?

Hello, I have encountered a problem adding a listener to the end event of a timer. I used an intermediate catch event timer to wait for a certain period (5 min). After 5 min, the flow goes to task2.
I want to update data in another table (some Java code), so I need a listener that listens for the end event of the timer. However, the methods I tried failed. Would you mind showing me a feasible and easy way to do that?
Thanks!
Why not simply add a Task Listener to the complete event of your timer step?
https://www.activiti.org/javadocs/org/activiti/engine/delegate/tasklistener

Background Process as NSOperation or Thread to monitor and update File

I want to check whether a PDF file has changed and, if it has, update the corresponding view. I don't know whether it is more suitable to run this background task as a Thread or as an NSOperation. The Apple Documentation says: "Examples of tasks that lend themselves well to NSOperation include network requests, image resizing, text processing, or any other repeatable, structured, long-running task that produces associated state or data. But simply wrapping computation into an object doesn’t do much without a little oversight".
Also, if I understood the documentation correctly, a Thread cannot be stopped during its execution once it has started, while an NSOperation can be paused or stopped, and operations can also use dependencies to wait for the completion of another task.
The workflow of this task should be more or less this diagram:
Task workflow
I managed to get the handler working when a notification of type .write is sent. If I monitor, for example, a *.txt file, everything works as expected and I receive only one notification. But I am monitoring a PDF file that is generated from the terminal by pdflatex, and with '.write' I receive nearly 15 notifications. If I change to '.attrib' I get 3 notifications. I need the handler to be called only once, not 15 or 3 times. Do you have any idea how I can do this, or is it not possible with a Dispatch Source? Maybe there is a way to execute a DispatchWorkItem only once?
I have tried to implement it like this (this is inside a FileMonitor class):
func startMonitoring()
{
    ....
    let fileSystemRepresentation = fileManager.fileSystemRepresentation(withPath: fileStringURL)
    let fileDescriptor = open(fileSystemRepresentation, O_EVTONLY)
    let newfileMonitorSource = DispatchSource.makeFileSystemObjectSource(fileDescriptor: fileDescriptor,
                                                                         eventMask: .attrib,
                                                                         queue: queue)
    newfileMonitorSource.setEventHandler(handler:
    {
        self.queue.async
        {
            print(" \n received first write event, removing handler..." )
            // The source is stored in the fileMonitorSource property;
            // newfileMonitorSource is only a local constant.
            self.fileMonitorSource?.setEventHandler(handler: nil)
            self.test()
        }
    })
    self.fileMonitorSource = newfileMonitorSource
    fileMonitorSource!.resume()
}
func test()
{
    // Cancel the current source and create a new one
    fileMonitorSource?.cancel()
    print(" restart monitoring ")
    startMonitoring()
}
I have tried to reassign the handler in test(), but it is not working (if I regenerate the PDF file, the code inside the new handler is not executed), and doing it this way feels like a lot of boilerplate to me. I have also tried the following things:
suspending the DispatchSource in the setEventHandler of startMonitoring() (passing nil), but when I resume it, I get the remaining .write events.
cancelling the DispatchSource object and calling startMonitoring() again, as you can see in the code above. This way I create and destroy the DispatchSource object every time I receive an event, which I don't like, because in my case cancel() should be called only when the user decides to disable the feature I am implementing.
I will try to describe the intended workflow of the app better, so you have a clearer idea of what I am doing:
When the app starts, a function sets the default values of some checkboxes in the preferences window. The user can modify these checkboxes. When the user opens a PDF file, the idea is to launch the following task in a background thread:
I create a new queue, call it A, and asynchronously launch an infinite while loop in which I check the value of the UserDefaults checkbox (the one I use to reload and update the PDF file). Two things can happen:
if the user set the value to off and the PDF document has been loaded, there are two situations:
if the file is not currently being monitored (when the app starts): continue to check the checkbox value
if the file is currently being monitored: stop the monitoring
if the user set the value to on and the PDF document has been loaded, then in this background thread (the same queue A) I create an instance of a Monitor class (which could be a subclass of NSThread or a class that uses DispatchSourceFileSystemObject as above) and call startMonitoring(), which checks the date or the .write events and calls the handler when there is a change. Basically, this handler should switch back to the main thread (the main queue), check whether the file can be loaded or is corrupted, and update the view if it can be loaded.
Note: the infinite while loop (which should run in the background) that checks the UserDefaults value for the feature I am implementing is launched when the user opens the PDF file.
Because of the problem above (multiple handler calls), I would like to call cancel() only when the user sets the checkbox to off, and not create/destroy the DispatchSource object every time I receive a .write event.
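One common way to collapse a burst of events into a single handler call, without cancelling and recreating the source, is debouncing: every event cancels the pending DispatchWorkItem and schedules a new one, so the work runs only after the events have stopped for a quiet interval. Unlike a fixed sleep, the timer restarts with each event. The sketch below is an illustration, not code from the question: the Debouncer class and the quiet interval are hypothetical, and a suitable interval depends on how pdflatex writes the file.

```swift
import Foundation

/// Runs an action once after a burst of calls has gone quiet.
/// `schedule` must always be called from the same serial queue
/// (for example, from the dispatch source's event handler).
final class Debouncer {
    private let queue: DispatchQueue
    private let delay: TimeInterval
    private var pending: DispatchWorkItem?

    init(queue: DispatchQueue, delay: TimeInterval) {
        self.queue = queue
        self.delay = delay
    }

    func schedule(_ action: @escaping () -> Void) {
        // Drop the work scheduled by the previous event, if it has not run yet.
        pending?.cancel()
        let item = DispatchWorkItem(block: action)
        pending = item
        // Run the new item only if no further event arrives within `delay`.
        queue.asyncAfter(deadline: .now() + delay, execute: item)
    }
}
```

With this, the event handler set in startMonitoring() would only call schedule with the view update, and the source itself would stay alive until the user disables the feature.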

flink streaming window trigger

I have a Flink stream and I am calculating a few things over a time window of, say, 30 seconds.
What happens is that the result I get also aggregates the previous windows.
Say for the first 30 seconds I get a result of 10.
For the next thirty seconds I want a fresh result; instead I get the last window's result plus the new one,
and so on.
So my question is: how do I get a fresh result for each window?
You need to use a purging trigger. What you want is FIRE_AND_PURGE (emit and remove the window content); what the default Flink trigger does is FIRE (emit and keep the window content).
input
    .keyBy(...)
    .timeWindow(Time.seconds(30))
    // The important part: wrap the default non-purging ProcessingTimeTrigger in a PurgingTrigger
    .trigger(PurgingTrigger.of(ProcessingTimeTrigger.create()))
    .reduce(...)
For a more in depth explanation have a look into Triggers and FIRE vs FIRE_AND_PURGE.
A Trigger determines when a window (as formed by the window assigner) is ready to be processed by the window function. Each WindowAssigner comes with a default Trigger. If the default trigger does not fit your needs, you can specify a custom trigger using trigger(...).
When a trigger fires, it can either FIRE or FIRE_AND_PURGE. While FIRE keeps the contents of the window, FIRE_AND_PURGE removes its content. By default, the pre-implemented triggers simply FIRE without purging the window state.
The functionality you describe can be found in Tumbling Windows: https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/windows.html#tumbling-windows
A bit more detail and/or code would help :)
I'm a little late to this question, but I encountered the same issue as the OP. What I found out later was a bug in my own code. FYI, my mistake could be a good reference for your problem.
// Old code (modified to be an example):
val tenSecondGrouping: DataStream[MyCustomGrouping] = userIdsStream
    .keyBy(_.somePartitionedKey)
    .window(TumblingProcessingTimeWindows.of(Time.of(10, TimeUnit.SECONDS)))
    .trigger(ProcessingTimeTrigger.create())
    .aggregate(new MyCustomAggregateFunc(new MyCustomGrouping()))
The bug was at new MyCustomGrouping: I unintentionally created a single MyCustomGrouping object and reused it in MyCustomAggregateFunc. As more tumbling windows were created, the later aggregation results grew out of control! The fix was to create a new MyCustomGrouping each time MyCustomAggregateFunc is triggered. So:
// New code, problem solved
...
    .aggregate(new MyCustomAggregateFunc(() => new MyCustomGrouping()))
    // passing in a func to create new object per trigger

Moving from file-based tracing session to real time session

I need to log trace events during boot so I configure an AutoLogger with all the required providers. But when my service/process starts I want to switch to real-time mode so that the file doesn't explode.
I'm using TraceEvent and I can't figure out how to do this move correctly and atomically.
The first thing I tried:
const int timeToWait = 5000;
using (var tes = new TraceEventSession("TEMPSESSIONNAME", @"c:\temp\TEMPSESSIONNAME.etl") { StopOnDispose = false })
{
    tes.EnableProvider(ProviderExtensions.ProviderName<MicrosoftWindowsKernelProcess>());
    Thread.Sleep(timeToWait);
}
using (var tes = new TraceEventSession("TEMPSESSIONNAME", TraceEventSessionOptions.Attach))
{
    Thread.Sleep(timeToWait);
    tes.SetFileName(null);
    Thread.Sleep(timeToWait);
    Console.WriteLine("Done");
}
Here I wanted to make sure that I can transfer the session to real-time mode. But instead, the file I got contained events from a 15s period instead of just 10s.
The same happens if I use new TraceEventSession("TEMPSESSIONNAME", @"c:\temp\TEMPSESSIONNAME.etl", TraceEventSessionOptions.Create) instead.
It seems that the following will cause the file to stop being written to:
using (var tes = new TraceEventSession("TEMPSESSIONNAME"))
{
    tes.EnableProvider(ProviderExtensions.ProviderName<MicrosoftWindowsKernelProcess>());
    Thread.Sleep(timeToWait);
}
But here I must re-enable all the providers, and according to the documentation "if the session already existed it is closed and reopened (thus orphans are cleaned up on next use)". I don't understand the last part about orphans. Obviously some events might occur in the time between closing, opening and subscribing to the events. Does this mean I will lose these events, or will I get them later?
I also found the following in the documentation of the library:
In real time mode, events are buffered and there is at least a second or so delay (typically 3 sec) between the firing of the event and the reception by the session (to allow events to be delivered in efficient clumps of many events)
Does this make the above code alright (well, unless the improbable happens and for some reason my thread is delayed for more than a second between creating the real-time session and starting processing the events)?
I could close the session and create a new different one but then I think I'd miss some events. Or I could open a new session and then close the file-based one but then I might get duplicate events.
I couldn't find online any examples of moving from a file-based trace to a real-time trace.
I managed to contact the author of TraceEvent and this is the answer I got:
Re the exception of the 'auto-closing and restarting' feature, it is really questions about the OS (TraceEvent simply calls the underlying OS API). Just FYI, the deal about orphans is that it is EASY for your process to exit but leave a session going. This MAY be what you want, but often it is not, and so to make the common case 'just work' if you do Create (which is the default), it will close a session if it already existed (since you asked for a new one).
Experimentation of course is the touchstone of 'truth', but frankly, the expectation that unusual combinations just work is generally NOT true.
My recommendation is to keep it simple. You need to open a new session and close the original one. Yes, you will end up with duplicates, but you CAN filter them out (after all they are IDENTICAL timestamps).
The other possibility is to use SetFileName in its intended way (from one file to another). This certainly solves your problem of file size growth, and often is a good way to deal with other scenarios (after all, you can start up your processing and start deleting files even as new files are being generated).

libspotify C sending zeros at the end of track

I'm using libspotify SDK, C library for win32.
I think my setup is right; every session callback is registered. I don't understand why I never receive the end_of_track call, while music_delivery continues to be called with zero-padded deliveries 22050 frames long.
To start playing, I first load the track with sp_session_load; as long as it returns SP_ERROR_IS_LOADING, I post a message to my message queue (my synchronization method, using the PostMessage Win32 API) in order to call sp_session_load again. As soon as it returns SP_ERROR_OK I call sp_session_play and music_delivery starts immediately, with correct frames.
I don't know why, at the end of the track, the libspotify runtime then starts sending zero-padded frames instead of calling the end_of_track callback.
Under other conditions it works perfectly: when I use an sp_track obtained from an album browse, the track is already fully loaded at the moment I load it into the current session for playing, and end_of_track is called correctly. In the case with the padding error, I search for the track using its Spotify URI and get the results; here the track metadata is not ready yet at the play attempt, so I use that kind of "polling" on sp_session_load with PostMessage.
Can anybody help me?
I ran into the same problem and I think the issue was that I was consuming the data too fast without giving other threads time to do any work since I was spending all of my time in the music_delivery callback. I found that if I add some throttling and notify the main thread that it can wake up to do some processing, the extra zeros at the end of track is reduced to one delivery of 22,050 frames (or 500ms at 44.1kHz).
Here is an example of what I added to my callback, heavily borrowed from the jukebox.c example provided with the SDK:
/* Buffer 1 second of data, then notify the main thread to do some processing */
if (g_throttle > format->sample_rate) {
    pthread_mutex_lock(&g_notify_mutex);
    g_notify_do = 1;
    pthread_cond_signal(&g_notify_cond);
    pthread_mutex_unlock(&g_notify_mutex);
    // Reset the throttle counter
    g_throttle = 0;
    return 0;
}
As I said, there was still 22,050 frames of zeros delivered before the track stopped, but I believe libspotify may purposely do this to ensure that the duration calculated by the number of frames received (song_duration_ms = total_frames_delivered / sample_rate * 1000) is greater than or equal to the duration reported by sp_track_duration. In my case, the track I was trying to stream was 172,000ms in duration, without the extra padding the duration calculated is 171,796ms, but with the padding it was 172,296ms.
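The arithmetic above can be checked with a short sketch. It is written in Swift rather than C, and the function name durationMs is hypothetical. It multiplies before dividing because, with integer types, the formula as written (frames / sample_rate * 1000) would truncate to whole seconds. The 171,796 ms and 172,000 ms figures are the ones quoted in this answer, not values computed from a real frame count.

```swift
/// Converts a frame count to milliseconds. Multiplying before dividing
/// keeps the sub-second part that integer division would otherwise drop.
func durationMs(frames: Int, sampleRate: Int) -> Int {
    return frames * 1000 / sampleRate
}

// One padding delivery: 22,050 frames at 44.1 kHz.
let paddingMs = durationMs(frames: 22_050, sampleRate: 44_100)

let withoutPaddingMs = 171_796                    // measured duration quoted above
let withPaddingMs = withoutPaddingMs + paddingMs  // duration including the padding
let reportedMs = 172_000                          // value from sp_track_duration

// Without padding the stream is shorter than reported; with it, it is not.
print(withoutPaddingMs < reportedMs, withPaddingMs >= reportedMs)
```

The padding works out to 500 ms, which is the difference between the two measured durations.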
Hope this helps.