Gstreamer appsrc seeking video being grabbed over http by chunks - streaming

I am writing a media player for watching videos recorded on a server (which I also wrote).
The videos are large daily .ts files, so sending a whole file is not an option. Instead, I added an HTTP request that returns a response with about 20 seconds of video data.
The HTTP request contains a byte offset into the video, so the server can seek and read quickly.
On the client side, the data is pushed to an appsrc and displayed by the pipeline.
The appsrc's properties:
duration is set correctly (with an error of less than half a second),
stream-type is set to GST_APP_STREAM_TYPE_SEEKABLE so I can perform seeks ('seek-data' signals).
The appsrc has its 'need-data' and 'seek-data' signals connected, and the current offset is remembered between calls.
'need-data' uses the offset to request the next chunk of video (and adds the size of the received data to the offset so the following chunk can be requested later).
'seek-data' changes the offset when a seek is requested.
If I watch the video from the start, everything is fine and the chunks are fetched one after another. The problems start when I try to perform a seek.
For example, seek function:
// pipeline is defined outside of the function
void Seek(gint64 offset_ns)
{
    // current position, duration of the video and seek position
    gint64 pos_ns, dur_ns, seek_ns;
    dur_ns = GetCurrentVideoDuration();
    gst_element_query_position(pipeline, GST_FORMAT_TIME, &pos_ns);
    seek_ns = pos_ns + offset_ns;
    if (seek_ns < 0)
        seek_ns = 0;
    gst_element_seek(pipeline, 1, GST_FORMAT_TIME,
                     GST_SEEK_FLAG_ACCURATE | GST_SEEK_FLAG_FLUSH,
                     GST_SEEK_TYPE_SET, seek_ns,
                     GST_SEEK_TYPE_SET, dur_ns);
}
After this function is called, the 'seek-data' callback is invoked:
gboolean seek_data_callback (GstElement * appsrc, guint64 offset, gpointer udata)
{
    // remember the requested offset
    lastOffset = offset;
    return TRUE;
}
Let's say pos_ns was 5000000000 (5 seconds: 5 * GST_SECOND)
and offset_ns was 30000000000 (30 seconds: 30 * GST_SECOND).
So seek_ns = pos_ns + offset_ns = 35 * GST_SECOND.
With this, I would expect lastOffset to increase, but sometimes it decreases, sometimes it increases, and sometimes it equals the duration, so it looks like I'm missing something.
I'm not sure how GStreamer calculates the offset, and I don't know whether it is possible to calculate it myself.
What could the problem be?
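The seek arithmetic from the example can be checked with a small sketch. It is written in Python only to keep it self-contained. GST_SECOND is redefined locally with the value GStreamer uses (10^9 nanoseconds), seek_target_ns is a hypothetical helper name, and the upper clamp to the duration is an assumption added here, since the Seek() function in the question only clamps at zero.

```python
GST_SECOND = 1_000_000_000  # nanoseconds per second, same value as GStreamer's GST_SECOND


def seek_target_ns(pos_ns, offset_ns, dur_ns):
    """Return the absolute seek position in nanoseconds, clamped to [0, dur_ns]."""
    seek_ns = pos_ns + offset_ns
    # Lower clamp as in Seek(); the upper clamp is an addition for illustration.
    return max(0, min(seek_ns, dur_ns))


# 5 s current position, +30 s relative seek, one-day-long recording (assumed duration).
target = seek_target_ns(5 * GST_SECOND, 30 * GST_SECOND, 24 * 3600 * GST_SECOND)
```

Note that this value is a time in nanoseconds, while the server request in the question is addressed by byte offset.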

Kafka Streams - GroupBy - Late Event - persistentWindowStore - WindowBy with Grace Period and Suppress

My goal is to count successful and failed messages from source to destination per second, and to sum the results on a daily basis.
I had two options for doing that:
Approach 1: stream the events, then group them by time#source#destination
KeyValueBytesStoreSupplier streamStore = Stores.persistentKeyValueStore("store-name");
sourceStream
    .selectKey((k, v) -> v.getDataTime() + KEY_SEPERATOR + SRC + KEY_SEPERATOR + DEST)
    .groupByKey()
    .aggregate(
        DO SOME Aggregation,
        Materialized.<String, AggregationObject>as(streamStore)
            .withKeySerde(Serdes.String())
            .withValueSerde(AggregationObjectSerdes));
After trying this approach, we noticed that the state store keeps growing because the number of unique keys keeps increasing and, if I am correct, because the state topics are only "compact", so their entries never expire.
NumberOfUniqueKeys = 86,400 seconds in a day × SOURCE × DESTINATION
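To get a feeling for how quickly that grows, here is a small arithmetic sketch in Python. The counts of 10 sources and 20 destinations are made-up numbers, unique_keys_per_day is a hypothetical helper name, and the result is an upper bound that assumes every source/destination pair sees at least one event in every second.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def unique_keys_per_day(num_sources, num_destinations):
    """Keys created per day when the key is time#source#destination at 1-second resolution."""
    return SECONDS_PER_DAY * num_sources * num_destinations


# Hypothetical example: 10 sources and 20 destinations.
keys = unique_keys_per_day(10, 20)
```

Under these assumptions the store gains about 17 million new keys per day, and none of them is ever updated again after its second has passed.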
Then we thought that we could reduce the state store size by leaving the time field out of the key, so we tried a windowing operation as the second approach.
Approach 2: use a windowing operation with persistentWindowStore, CustomTimeStampExtractor, WindowBy and Suppress
WindowBytesStoreSupplier streamStore = Stores.persistentWindowStore("store-name", Duration.ofHours(6), Duration.ofSeconds(1), false);
sourceStream
    .selectKey((k, v) -> SRC + KEY_SEPERATOR + DEST)
    .groupByKey()
    .windowedBy(TimeWindows.of(Duration.ofSeconds(1)).grace(Duration.ofSeconds(5)))
    .aggregate(
        {
            DO SOME Aggregation
        },
        Materialized.<String, AggregationObject>as(streamStore)
            .withKeySerde(Serdes.String())
            .withValueSerde(AggregationObjectSerdes))
    .suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()))
    .toStream();
With the second approach the state store size went down, but now we had a problem with late-arriving events. We added a 5-second grace period together with the suppress operation, but that still does not guarantee that all late events are handled. Another side effect of suppress is latency, because it emits the aggregation result only after the window's grace period has passed.
BTW, using the windowing operation also produces a WARNING message like
"WARN 1 --- [-StreamThread-2] o.a.k.s.state.internals.WindowKeySchema : Warning: window end time was truncated to Long.MAX"
I looked for the reason in the source code and found this:
https://github.com/a0x8o/kafka/blob/master/streams/src/main/java/org/apache/kafka/streams/state/internals/WindowKeySchema.java
/**
 * Safely construct a time window of the given size,
 * taking care of bounding endMs to Long.MAX_VALUE if necessary
 */
static TimeWindow timeWindowForSize(final long startMs,
                                    final long windowSize) {
    long endMs = startMs + windowSize;
    if (endMs < 0) {
        LOG.warn("Warning: window end time was truncated to Long.MAX");
        endMs = Long.MAX_VALUE;
    }
    return new TimeWindow(startMs, endMs);
}
But it does not make sense to me how endMs can be lower than 0...
Questions:
If we go with approach 1, how can we reduce the state store size? With approach 1 it was guaranteed that every event is processed and that no event is missed because of latency.
If we go with approach 2, how should I tune my logic to catch late-arriving data and reduce latency?
Why do I get the warning message in approach 2 although all time fields in my model are positive?
What other options can you suggest besides these two approaches?
I need some expert help :)
BR,
Regarding the warning message, here is what I got from the Kafka mailing list.
For the WARNING message "WARN 1 --- [-StreamThread-2] o.a.k.s.state.internals.WindowKeySchema : Warning: window end time was truncated to Long.MAX"
I was told:
You can get this message "o.a.k.s.state.internals.WindowKeySchema :
Warning: window end time was truncated to Long.MAX" when your
TimeWindowedDeserializer is created without a windowSize. There are two
constructors for a TimeWindowedDeserializer, are you using the one with
WindowSize?
https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/kstream/TimeWindowedDeserializer.java#L46-L55
It calls WindowKeySchema with a Long.MAX_VALUE
https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/kstream/TimeWindowedDeserializer.java#L84-L90
With windowSize = Long.MAX_VALUE, the sum startMs + windowSize overflows a Java long for any positive startMs, which is how endMs ends up negative even though all timestamps are positive.
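A minimal sketch of how a sum with Long.MAX_VALUE can go negative. It is written in Python, which has no fixed-width integers, so java_long_add is a hypothetical helper that emulates the wrap-around of a signed 64-bit Java long. time_window_end mirrors only the arithmetic of timeWindowForSize quoted in the question and is not the Kafka code itself.

```python
INT64_MAX = (1 << 63) - 1   # Java's Long.MAX_VALUE
INT64_MIN = -(1 << 63)      # Java's Long.MIN_VALUE


def java_long_add(a, b):
    """Add two integers with signed 64-bit wrap-around, like Java's long addition."""
    total = (a + b) & 0xFFFFFFFFFFFFFFFF
    return total - (1 << 64) if total > INT64_MAX else total


def time_window_end(start_ms, window_size):
    """Same arithmetic as timeWindowForSize: clamp to the maximum when the sum wraps."""
    end_ms = java_long_add(start_ms, window_size)
    if end_ms < 0:
        end_ms = INT64_MAX  # this is the branch that logs the warning
    return end_ms


wrapped = java_long_add(1, INT64_MAX)                  # smallest possible overflow
clamped = time_window_end(1_600_000_000_000, INT64_MAX)  # a positive epoch timestamp
normal = time_window_end(1_000, 1_000)                  # a real 1-second window
```

So a perfectly valid positive startMs is enough to trigger the warning whenever the window size passed in is Long.MAX_VALUE.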

Why is the turtlebot not moving continuously?

if __name__ == '__main__':
    rospy.init_node('gray')
    settings = termios.tcgetattr(sys.stdin)
    pub = rospy.Publisher('cmd_vel', Twist, queue_size=1)
    x = 0
    th = 0
    node = Gray()
    node.main()
We create the publisher (cmd_vel) in the main block and run the main function of the Gray class.
def __init__(self):
    self.r = rospy.Rate(10)
    self.selecting_sub_image = "compressed"  # you can choose image type "compressed", "raw"
    if self.selecting_sub_image == "compressed":
        self._sub = rospy.Subscriber('/raspicam_node/image/compressed', CompressedImage, self.callback, queue_size=1)
    else:
        self._sub = rospy.Subscriber('/usb_cam/image_raw', Image, self.callback, queue_size=1)
    self.bridge = CvBridge()
The __init__ function creates a subscriber, which runs 'callback' whenever it receives data.
def main(self):
    rospy.spin()
Then it runs the spin() function.
v, ang = vel_select(lvalue, rvalue, left_angle_num, right_angle_num, left_down, red_dots)
self.sendv(v, ang)
Inside the callback function, it computes a linear speed and an angular speed, and calls sendv to publish them to the subscribers.
def sendv(self, lin_v, ang_v):
    twist = Twist()
    speed = rospy.get_param("~speed", 0.5)
    turn = rospy.get_param("~turn", 1.0)
    twist.linear.x = lin_v * speed
    twist.angular.z = ang_v * turn
    twist.linear.y, twist.linear.z, twist.angular.x, twist.angular.y = 0, 0, 0, 0
    pub.publish(twist)
Finally, the sendv function sends the command to the turtlebot.
The robot should move continuously, because even if we stop publishing, it should keep moving at the speed from the last publish. Also, the callback function runs every 0.1 seconds, so it keeps sending data.
But it does not move continuously. It stops for a few seconds, moves for a very short time, stops again, moves for a very short time, and so on. The code that selects the speed works correctly, but the code that sends it to the turtlebot does not. Can anyone help?
"Also, callback function runs every 0.1 seconds."
I believe this is incorrect. You create a self.r object but never use it anywhere in the code to achieve an update rate of 10 Hz. If you want the main loop to run every 0.1 seconds, you have to call your commands within the following loop (see rospy-rates) before calling rospy.spin():
self.r = rospy.Rate(10)
while not rospy.is_shutdown():
    # <user commands>
    self.r.sleep()
However, this alone would not help either, since your code publishes to /cmd_vel inside the subscriber callback, which is called only when the subscriber receives data. So /cmd_vel is not published at 10 Hz but at the rate at which data arrives on the subscribed topic ('/raspicam_node/image/compressed'). Since these are image topics, they may take a long time to update, hence the delay in your velocity commands to the robot.
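One way to combine both points is to let the image callback only store the latest command and to publish it from a fixed-rate loop. The sketch below shows that pattern in plain Python so that it runs without ROS. LatestCommand and publish_at_fixed_rate are hypothetical names, the publish argument stands in for something like pub.publish(twist), and time.sleep stands in for rospy.Rate(...).sleep().

```python
import threading
import time


class LatestCommand:
    """Thread-safe holder for the most recent (linear, angular) command."""

    def __init__(self):
        self._lock = threading.Lock()
        self._cmd = (0.0, 0.0)

    def set(self, lin_v, ang_v):
        # Would be called from the image callback whenever a frame was processed.
        with self._lock:
            self._cmd = (lin_v, ang_v)

    def get(self):
        with self._lock:
            return self._cmd


def publish_at_fixed_rate(latest, publish, rate_hz, ticks):
    """Publish the stored command `ticks` times at `rate_hz`, independent of the callback rate."""
    period = 1.0 / rate_hz
    for _ in range(ticks):
        publish(*latest.get())
        time.sleep(period)


latest = LatestCommand()
sent = []
latest.set(0.2, 0.5)  # one slow image callback delivers a command
publish_at_fixed_rate(latest, lambda v, w: sent.append((v, w)), rate_hz=100, ticks=3)
```

Here the same command is sent three times although the "callback" ran only once, which is the behaviour the robot needs when images arrive slowly.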

In Unity, how to segment the user's voice from microphone based on loudness?

I need to collect pieces of speech from a continuous audio stream, so that I can later process the piece the user has just said (not for speech recognition). For now I am focusing only on segmenting the voice based on its loudness.
If, after at least 1 second of silence, the voice becomes loud enough for a while and then goes silent again for at least 1 second, I treat this as a sentence, and the voice should be segmented there.
I only know that I can get raw audio data from the AudioClip created by Microphone.Start(). I want to write some code like this:
void Start()
{
    audio = Microphone.Start(deviceName, true, 10, 16000);
}
void Update()
{
    audio.GetData(fdata, 0);
    for (int i = 0; i < fdata.Length; i++) {
        u16data[i] = Convert.ToUInt16(fdata[i] * 65535);
    }
    // ... Process u16data
}
But what I'm not sure about is:
Every frame, when I call audio.GetData(fdata, 0), do I get the latest 10 seconds of sound data if fdata is big enough, or less than 10 seconds if it is not?
fdata is a float array, and what I need is a 16 kHz, 16-bit PCM buffer. Is it right to convert the data like this: u16data[i] = fdata[i] * 65535?
What is the right way to detect loud moments and silent moments in fdata?
No. You have to read starting at the current position within the AudioClip, which you get from Microphone.GetPosition
Get the position in samples of the recording.
and pass the obtained index to AudioClip.GetData
Use the offsetSamples parameter to start the read from a specific position in the clip
fdata = new float[audio.samples * audio.channels];
var currentIndex = Microphone.GetPosition(deviceName);
audio.GetData(fdata, currentIndex);
I don't understand what exactly you convert this for. fdata will contain
floats ranging from -1.0f to 1.0f (AudioClip.GetData)
so if for some reason you need values between short.MinValue (= -32768) and short.MaxValue (= 32767), then yes, you can do that, but you need a signed conversion into a short array (called s16data here), because Convert.ToUInt16 throws an OverflowException for negative values:
s16data[i] = Convert.ToInt16(fdata[i] * short.MaxValue);
Note however that Convert.ToInt16(float) returns:
value, rounded to the nearest 16-bit signed integer. If value is halfway between two whole numbers, the even number is returned; that is, 4.5 is converted to 4, and 5.5 is converted to 6.
Mathf.RoundToInt rounds halfway values to the even number as well, so if you would rather round 4.5 up, round explicitly first:
s16data[i] = Convert.ToInt16(Mathf.FloorToInt(fdata[i] * short.MaxValue + 0.5f));
Your naming, however, suggests that you actually want unsigned values, i.e. ushort (UInt16). An unsigned type cannot hold negative values, so you have to shift the float values up to map the range (-1.0f | 1.0f) to the range (0.0f | 1.0f) before multiplying by ushort.MaxValue (= 65535):
u16data[i] = Convert.ToUInt16(Mathf.RoundToInt((fdata[i] + 1f) / 2f * ushort.MaxValue));
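For comparison, here is a language-neutral sketch of the signed conversion, written in Python. It assumes that "16 bit PCM" means signed little-endian samples, which is the common layout but is not stated in the question. float_to_pcm16 is a hypothetical helper name, and the clamp is an extra safety step for inputs slightly outside [-1, 1].

```python
import struct


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to signed 16-bit little-endian PCM bytes."""
    ints = []
    for s in samples:
        s = max(-1.0, min(1.0, s))          # clamp, so scaling cannot leave the int16 range
        ints.append(int(round(s * 32767)))  # scale by short.MaxValue as in the answer
    return struct.pack("<%dh" % len(ints), *ints)


pcm = float_to_pcm16([0.0, 1.0, -1.0, 0.5])
decoded = struct.unpack("<4h", pcm)
```

Each sample takes two bytes, so a 10-second clip at 16 kHz mono would produce 320,000 bytes.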
What you receive from AudioClip.GetData are the sample amplitudes of the audio track, between -1.0f and 1.0f.
So a "loud" moment would be where
Mathf.Abs(fdata[i]) >= aCertainLoudThreshold;
and a "silent" moment would be where
Mathf.Abs(fdata[i]) <= aCertainSilentThreshold;
where aCertainSilentThreshold might e.g. be 0.2f and aCertainLoudThreshold might e.g. be 0.8f.
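Single samples cross zero all the time, even in loud speech, so in practice the threshold is usually applied to a short window rather than to one sample. The sketch below, in Python, is one possible way to do the segmentation from the question under some assumptions: it uses a single RMS threshold instead of two, it does not check for the leading second of silence, and segment_by_loudness, window_len and the default values are hypothetical choices, not Unity API.

```python
import math


def rms(window):
    """Root mean square of a non-empty list of samples."""
    return math.sqrt(sum(s * s for s in window) / len(window))


def segment_by_loudness(samples, sample_rate, threshold=0.1,
                        min_silence_s=1.0, window_len=160):
    """Return (start, end) sample indices of loud stretches that are
    followed by at least min_silence_s seconds of silence."""
    min_silent_windows = math.ceil(min_silence_s * sample_rate / window_len)
    segments = []
    start = None        # first sample of the segment being built, if any
    last_loud_end = 0   # index just past the last loud window
    silent_run = 0      # consecutive silent windows since the last loud one
    for begin in range(0, len(samples), window_len):
        window = samples[begin:begin + window_len]
        if rms(window) >= threshold:
            if start is None:
                start = begin
            last_loud_end = begin + len(window)
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silent_windows:
                segments.append((start, last_loud_end))
                start = None
    return segments  # a segment still open at the end of the buffer is not reported


# Synthetic signal at 100 Hz: 1 s silence, 0.5 s loud, 1 s silence, 0.3 s loud, 1 s silence.
signal = [0.0] * 100 + [0.5] * 50 + [0.0] * 100 + [0.5] * 30 + [0.0] * 100
found = segment_by_loudness(signal, sample_rate=100, window_len=10)
```

With microphone data the threshold would have to be tuned to the room noise, and pauses shorter than one second inside a sentence stay within the same segment.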

AVAudioPCMBuffer built programmatically, not playing back in stereo

I'm trying to fill an AVAudioPCMBuffer programmatically in Swift to build a metronome. This is the first real app I'm trying to build, so it's also my first audio app. Right now I'm experimenting with different frameworks and methods of getting the metronome looping accurately.
I'm trying to build an AVAudioPCMBuffer with the length of a measure/bar so that I can use the .Loops option of the AVAudioPlayerNode's scheduleBuffer method. I start by loading my file(2 ch, 44100 Hz, Float32, non-inter, *.wav and *.m4a both have same issue) into a buffer, then copying that buffer frame by frame separated by empty frames into the barBuffer. The loop below is how I'm accomplishing this.
If I schedule the original buffer to play, it plays back in stereo, but when I schedule the barBuffer, I only get the left channel. As I said, I'm a beginner at programming and have no experience with audio programming, so this might be down to my lack of knowledge of 32-bit float channels, or of the data type UnsafePointer<UnsafeMutablePointer<Float>>. When I look at the floatChannelData property in Swift, the description makes it sound like this should be copying two channels.
var j = 0
for i in 0..<Int(capacity) {
    barBuffer.floatChannelData.memory[j] = buffer.floatChannelData.memory[i]
    j += 1
}
j += Int(silenceLengthInSamples)
// loop runs 4 times for 4 beats per bar.
edit: I removed the glaring mistake i += 1, thanks to hotpaw2. The right channel is still missing when barBuffer is played back though.
Unsafe pointers in Swift are pretty weird to get used to.
floatChannelData.memory[j] only accesses the first channel of data. To access the other channel(s), you have a couple choices:
Using advancedBy
// Where current channel is at 0
// Get a channel pointer aka UnsafePointer<UnsafeMutablePointer<Float>>
let channelN = floatChannelData.advancedBy( channelNumber )
// Get channel data aka UnsafeMutablePointer<Float>
let channelNData = channelN.memory
// Get first two floats of channel channelNumber
let floatOne = channelNData.memory
let floatTwo = channelNData.advancedBy(1).memory
Using Subscript
// Get channel data aka UnsafeMutablePointer<Float>
let channelNData = floatChannelData[ channelNumber ]
// Get first two floats of channel channelNumber
let floatOne = channelNData[0]
let floatTwo = channelNData[1]
Using subscript is much clearer, and the step of advancing and then manually accessing memory is implicit.
For your loop, try accessing all channels of the buffer by doing something like this:
for i in 0..<Int(capacity) {
    for n in 0..<Int(buffer.format.channelCount) {
        barBuffer.floatChannelData[n][j] = buffer.floatChannelData[n][i]
    }
    j += 1  // advance the write position once per frame, after all channels are copied
}
Hope this helps!
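The whole bar-building step (click, silence, repeated for every beat, on every channel) can be sketched independently of the pointer types. The version below is in Python, with plain lists standing in for the non-interleaved channel arrays. build_bar and its parameters are hypothetical names, not part of AVFoundation.

```python
def build_bar(click, beats_per_bar, silence_frames):
    """Build one bar from a click sound.

    click is a list of channels, each channel a list of float samples
    (non-interleaved, like floatChannelData[n]).
    """
    bar = [[] for _ in click]
    for _ in range(beats_per_bar):
        for n, channel in enumerate(click):
            bar[n].extend(channel)                  # copy the click into channel n
            bar[n].extend([0.0] * silence_frames)   # then pad with silence
    return bar


# Tiny stereo "click" of two frames, two beats per bar, one frame of silence.
stereo_click = [[0.1, 0.2], [0.3, 0.4]]
bar = build_bar(stereo_click, beats_per_bar=2, silence_frames=1)
```

The point is the same as in the Swift loop: every channel has to be written, and both channels end up with the same length.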
This looks like a misunderstanding of Swift "for" loops. The Swift "for" loop automatically increments the "i" array index. But you are incrementing it again in the loop body, which means that you end up skipping every other sample (the Right channel) in your initial buffer.

MSP430 Music Player Can't Produce Note Higher than Certain Frequency

I'm trying to complete an assignment that requires me to make a music player using the MSP430 microprocessor and Launchpad kit. I have the player completely working, but for some reason when I try to play above a certain note, it outputs rapid clicking instead of the tone.
I know the speaker can produce a higher tone, so I am fairly certain it's an issue with my software, probably creating some sort of math error. Here is my code (at least the part that handles the notes):
asm(" .length 10000");
asm(" .width 132");
#include "msp430g2553.h"
//-----------------------
// define the bit mask (within P1) corresponding to output TA0
#define TA0_BIT 0x02
// define the port and location for the button (this is the built in button)
// specific bit for the button
#define BUTTON_BIT 0x04
#define PLUS_BUTTON 0x08 //Defines the "GO FASTER" button to P1.3
#define MINUS_BUTTON 0x10 //Defines the "SLOW DOWN" button to P1.4
#define SHIFT 0x20
//----------------------------------
// Some global variables (mainly to look at in the debugger)
volatile unsigned halfPeriod; // half period count for the timer
volatile unsigned long intcount=0; // number of times the interrupt has occurred
volatile unsigned soundOn=0; // state of sound: 0 or OUTMOD_4 (0x0080)
volatile int noteCount = 0;
volatile int noteLength = 0;
volatile int deltaHP=1; // step in half period per half period
volatile unsigned int plus_on;
volatile unsigned int minus_on;
volatile double speed = 1;
volatile int shiftkey = 0;
static const int noteArray[] = {800, 1000, 900, 800}; //THESE ARE THE NOTES
static const int noteLengths[] = {200, 500, 500, 500};
void init_timer(void); // routine to setup the timer
void init_button(void); // routine to setup the button
// ++++++++++++++++++++++++++
void main(){
WDTCTL = WDTPW + WDTHOLD; // Stop watchdog timer
BCSCTL1 = CALBC1_1MHZ; // 1Mhz calibration for clock
DCOCTL = CALDCO_1MHZ;
//halfPeriod=noteArray[0]; // initial half-period at lowest frequency
init_timer(); // initialize timer
init_button(); // initialize the button
_bis_SR_register(GIE+LPM0_bits);// enable general interrupts and power down CPU
}
// +++++++++++++++++++++++++++
// Sound Production System
void init_timer(){ // initialization and start of timer
TA0CTL |=TACLR; // reset clock
TA0CTL =TASSEL1+ID_0+MC_2; // clock source = SMCLK, clock divider=1, continuous mode,
TA0CCTL0=soundOn+CCIE; // compare mode, outmod=sound, interrupt CCR0 on
TA0CCR0 = TAR+noteArray[0]; // time for first alarm
P1SEL|=TA0_BIT; // connect timer output to pin
P1DIR|=TA0_BIT;
}
// +++++++++++++++++++++++++++
void interrupt sound_handler(){
TACCR0 += (noteArray[noteCount]); // advance 'alarm' time
if (soundOn){ // change half period if the sound is playing
noteLength++;
if (noteLength >= (speed* noteLengths[noteCount])) {
noteLength=0;
noteCount++;
if (noteCount == sizeof(noteArray)/sizeof(int)) {
//halfPeriod += deltaHP;
noteCount = 0;
//deltaHP=-deltaHP;
}
}
}
TA0CCTL0 = CCIE + soundOn; // update control register with current soundOn
++intcount; // advance debug counter
}
ISR_VECTOR(sound_handler,".int09") // declare interrupt vector
Currently I have just 4 random notes in there with 4 random lengths to demonstrate the error. The strange clicking noise happens somewhere between a note value of 800 and 900. Am I just missing something in my code that would produce an error for a number smaller than 8xx? I don't see any spots for division errors or the like but I could be wrong.
Thank you.
ALSO: I should note that when the error occurs, the clicking lasts a very long time, much longer than the corresponding length for that note, but it isn't permanent. Eventually the player moves on to the next note and plays it normally as long as it's larger than 900 or so.
If the interrupt handler does not execute fast enough, the setting of the next event (TACCR0 += noteArray[...]) will come too late, i.e., after that timer value has already been reached. So the next timer interrupt will fire not after 800 ticks but after 2^16 + 800 ticks.
You might try to optimize the interrupt handler function.
In particular, floating-point emulation can take hundreds of cycles; remove speed.
However, instead of toggling the output in software, you should take advantage of the hardware capabilities and generate the waveform with the PWM function: run the timer in Up mode, and use set/reset output mode for the second CCR (see section 12.2.5.2 of the User's Guide).
(This implies that you need timer interrupts only to start/stop notes, so to fit into the 2^16 limit, you probably want to use a second timer based on a much slower clock.)
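The wrap-around can be illustrated with a small model, written in Python for brevity. It is a simplification that assumes a 16-bit timer in continuous mode and measures update_delay as the number of ticks between the compare match and the moment TACCR0 is rewritten. ticks_between_interrupts is a hypothetical helper name, and the delay values are made up.

```python
TIMER_MODULUS = 1 << 16  # a 16-bit timer wraps after 65536 ticks


def ticks_between_interrupts(half_period, update_delay):
    """Ticks from one compare match to the next.

    The compare register is advanced by half_period, but only update_delay
    ticks after the match. If the counter has already passed the new compare
    value, the match happens one full timer wrap later.
    """
    remaining = (half_period - update_delay) % TIMER_MODULUS
    if remaining == 0:
        remaining = TIMER_MODULUS  # the counter is already at the value, so the match is missed
    return update_delay + remaining


on_time = ticks_between_interrupts(800, 100)   # handler finishes early enough
too_late = ticks_between_interrupts(800, 900)  # handler needs more than the half period
```

With the 1 MHz clock from the question, 2^16 + 800 ticks is roughly 66 ms between output toggles, which would be heard as clicking rather than a tone. It would also explain why the note lasts so long, since its length is counted in interrupts.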