Custom Metadata with Icecast - metadata

I need to add additional metadata to an Icecast stream (beyond Artist and Song Title). I've tried a number of ideas but none seems to yield anything. The situation is made more complicated in that the metadata also passes through Wowza, which is re-streaming the Icecast stream. Is there a canonical list of metadata fields supported by Icecast, and does anyone have experience passing custom metadata as part of an Icecast stream and on to Wowza?

My node-icy module is capable of reading in an arbitrary Icecast stream and intercepting and/or adding "metadata" events to an output stream. You are essentially "proxying" the stream. A good (though slightly complicated) example might be here: proxy.js
Do note though that the "metadata" format is a semicolon-delimited String of key-value pairs, but Icecast clients only react to the StreamTitle value, so just stuff all the information you want into there:
StreamTitle='Pink Floyd - Welcome to the Machine';
I've done things like send a metadata event every second to keep a track position counter (though that may have been a little network-heavy):
StreamTitle='Pink Floyd - Welcome to the Machine (0:12/4:02)';
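For anyone curious what that looks like on the wire, here is a rough Python sketch of one metadata block as I understand the SHOUTcast-style ICY framing used for MP3/AAC streams: a single length byte (block size divided by 16), followed by the key='value'; text padded with NUL bytes. The function names are made up for illustration and this is not part of node-icy.

```python
def encode_icy_metadata(fields):
    """Build one ICY metadata block from a dict of key -> value."""
    # Semicolon-delimited key='value' pairs, e.g. StreamTitle='...';
    text = "".join("%s='%s';" % (key, value) for key, value in fields.items())
    payload = text.encode("utf-8")
    # The text is padded with NUL bytes up to a multiple of 16 bytes ...
    blocks = (len(payload) + 15) // 16
    if blocks > 255:
        raise ValueError("metadata too long for a single ICY block")
    # ... and prefixed with one byte holding (padded length / 16).
    return bytes([blocks]) + payload.ljust(blocks * 16, b"\x00")


def decode_icy_metadata(block):
    """Parse a block produced by encode_icy_metadata back into a dict."""
    length = block[0] * 16
    text = block[1:1 + length].rstrip(b"\x00").decode("utf-8")
    fields = {}
    for pair in text.split("';"):
        if "='" in pair:
            key, value = pair.split("='", 1)
            fields[key] = value
    return fields
```

Since clients generally only look at StreamTitle, anything extra (like the position counter above) still has to be packed into that one value and unpacked again by whatever consumes it.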
Good luck!

Though it's a bit complicated, the Savonet/Liquidsoap scripting language has facilities to modify/set stream metadata. There are a couple of examples in the Liquidsoap wiki at https://wiki.sourcefabric.org/display/LS/UserScripts

Reading & writing text in Scala, getting the encoding right?

I'm reading and writing some text files in Scala. As a complete beginner in the language, I wanted to make sure to find the right way to do it, e.g. get the encoding right.
So most of the stuff I found (also on SO) recommends I use io.Source.fromFile. I tried it out like so, reading a UTF-8 file:
val user_list = Source.fromFile("usernames.txt").getLines.toList
val user_list = Source.fromFile("usernames.txt", enc="UTF8").getLines.toList
I looked at the docs but was left with some questions.
Get the encoding right:
the docs show that I can set an encoding in Source.fromFile as I tried above. Looking at the man on Codec and the types listed there, I was wondering if those are all my codec options - is there e.g. no Utf-16, Big-Endian vs Little-Endian, etc.?
I am slightly obsessed with this since it used to trip me up in Python a lot. Is this less of a concern with Scala for some reason?
Get the reading in right:
All the examples I looked at used the getLines method and postprocessed it with mkString or toList, etc. Is there any advantage to that over just reading in the entire file (my files are small) in one go?
Get the writing out right:
Every source I could find tells me that Scala has no file writing function and to use the Java FileWriter. I was surprised by this - is this still accurate?
Looking at it I feel the question might be a little broad for SO, so I'd be happy to take it back if it does not meet the requirements. At this point, I'm not struggling with specific examples but rather trying to set things up in a way I don't get in trouble later.
Thanks!
Scala only has a basic IO api in the standard library. For the most part you just use the java apis. The fact that a decent api from java exists is probably why the Scala team is not prioritizing having a robust and fully featured IO api.
There are also third-party Scala libraries you could use. I've never used Better Files, but I've heard good things about it as a Scala file API. There is also fs2, which provides functional, streaming IO. I'm sure there are others out there as well.
For encoding, there are many encodings available. Only a couple of the most common ones are exposed as static fields; the rest you typically access through Codec("Encoding Name"). Most APIs will also let you just pass a String directly instead of needing to get a Codec instance first. The Codec is really just a wrapper over java.nio.charset.Charset. You can run java.nio.charset.Charset.availableCharsets() to see all of the encodings available on your system.
As far as reading goes, if the files are small you can load them fully into memory if you prefer. The only reason not to is to avoid the extra memory use of loading the entire file at once when reading through it line by line is enough. You may want to use Vector instead of List for efficiency reasons. Vector is better in many cases and should probably be preferred as a default collection, but old habits die hard and most people and guides seem to default to List. That is a whole other topic, though.

Chords in MIDI?

I'm looking for a way to represent chords in a MIDI file.
Note that I'm not looking to represent chord voicings. That can be trivially done with multiple note-on messages. But if I do that, then I have to do some sort of note-on to chord analysis every time I read the MIDI file back in, and that's a major nuisance especially since I already know the chord structures when I write the file.
Rather, I'm looking for something more akin to guitar tablature or fake books. That is, I want to record "C" or "Cm" or "I" or "I" or "iii7" at a particular point in time.
So my questions...
Is there a standard way to do this? (I'm not finding one, but I don't know the current spec thoroughly.)
Is there a non-standard way of doing this?
I'm considering using the "tag" facility of the lyric/display meta event. It appears as though I can invent {#chord=Cm} and that should be transparent to any reader, past, present, or future, who doesn't understand this usage. Am I reading the standard right? Would this be a reasonable, essentially private, non-standard extension?
The MIDI specification provides for values such as "note on" and "pitch value" (as seen here) which are only represented as integers.
Depending on the MIDI Type (there are 3), you should be able to save the chord values similarly to the way that you suggested. Karaoke files are created this way.
If you are using Windows, you could try something like Noteworthy Composer. The link also contains a suggestion for playback.
You are absolutely right: you can implement a custom meta event and place such events before the groups of NoteOn/NoteOff events that represent a chord. I don't know what programming language you use, but for C# you can take a look at DryWetMIDI. It lets you create custom meta events and read and write them. This article of the library docs shows how to do this.
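If you are not on C#, the idea is small enough to sketch by hand. Below is a rough Python example that writes a format-0 Standard MIDI File whose only events are lyric meta events (FF 05) carrying the {#chord=...} tag proposed in the question, and reads them back. The tag syntax is the asker's private convention, not something I know to be standardized, and the reader is deliberately minimal: it only understands the meta-event-only track written here, so a real reader would also need to handle channel messages and running status.

```python
import struct

LYRIC = 0x05  # meta event type for lyrics


def vlq(value):
    """Encode an integer as a MIDI variable-length quantity."""
    out = [value & 0x7F]
    value >>= 7
    while value:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(out))


def chord_event(delta_ticks, chord):
    """Lyric meta event carrying a private {#chord=...} tag."""
    text = ("{#chord=%s}" % chord).encode("ascii")
    return vlq(delta_ticks) + bytes([0xFF, LYRIC]) + vlq(len(text)) + text


def build_midi(chords, ticks_per_quarter=480):
    """chords: list of (delta_ticks, name). Returns the file as bytes."""
    track = b"".join(chord_event(d, c) for d, c in chords)
    track += b"\x00\xFF\x2F\x00"  # end-of-track meta event
    header = b"MThd" + struct.pack(">IHHH", 6, 0, 1, ticks_per_quarter)
    return header + b"MTrk" + struct.pack(">I", len(track)) + track


def read_vlq(data, pos):
    """Decode a variable-length quantity; returns (value, next_pos)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value, pos


def read_chords(midi_bytes):
    """Return [(absolute_tick, chord)] from a file written by build_midi."""
    pos = 14 + 8  # skip the 14-byte MThd chunk and the 8-byte MTrk header
    tick, chords = 0, []
    while pos < len(midi_bytes):
        delta, pos = read_vlq(midi_bytes, pos)
        tick += delta
        if midi_bytes[pos] != 0xFF:
            raise ValueError("only meta events are handled in this sketch")
        meta_type = midi_bytes[pos + 1]
        length, pos = read_vlq(midi_bytes, pos + 2)
        text = midi_bytes[pos:pos + length]
        pos += length
        if meta_type == LYRIC and text.startswith(b"{#chord=") and text.endswith(b"}"):
            chords.append((tick, text[8:-1].decode("ascii")))
    return chords
```

A reader that does not know the tag should simply see lyric text it can ignore or display, which is the "transparent" behaviour the question is hoping for.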

How would you go about writing a Parser similar to Facebook Graph Search

I've read quite a few articles giving a bit of background information on how Facebook implemented their Graph Search. All of them seem to gloss over the actual implementation details of the parser they are using, such as this one:
https://www.facebook.com/notes/facebook-engineering/under-the-hood-building-graph-search-beta/10151240856103920
From that page:
We combined various parsing techniques to build a substring parser:
suppose a user inputs, say, "friends New York" and that we have
defined a comprehensive set of all the potential page titles our
system can handle. Our parser could then generate exactly the Graph
Search titles that contain the user's input, including things like
"friends who live in New York" and "friends who have visited New
York." If we could find a way to appropriately rank those suggested
titles for the Graph Search typeahead, we would have a good start.
I'm really interested in learning about the methods one would use to tackle this problem. What algorithms / techniques would be used to write such a system?
Any links would be much appreciated too.
I was thinking about implementing something similar. I wanted to ask this question here on SO and found that it had already been asked.
Here is what I have been thinking of starting with:
1. Assume the Facebook search engine "knows" about the underlying data store (a complex graph). So the search engine understands keywords like "Friends", "Relative" and other such relationships, and does not treat them like ordinary words in the English language.
2. In that case, a good idea could be to parse the user input (using client-side JavaScript) into JSON and send it over to the search engine. This has a couple of benefits: the parsing can be done on the client side, network bandwidth is saved by not sending unwanted data, and server-side handling of the parsed input as JSON is much easier.
3. Let's call this JSON fbJSON, because apart from being JSON it adheres to a certain format. You can create a spec for your format, such that the JSON that is sent over to the search engine necessarily contains certain information. This can make life a bit easier, just like we have GeoJSON etc.
4. Use an NLP program to parse the user input into fbJSON [I still have to think about this].
This is the broad approach I am embarking upon. The only bottleneck is point #4, because I do not have much experience with NLP.
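As a toy illustration of the "substring parser" idea quoted in the question (and certainly not how Facebook does it), one could start from an enumerated list of page titles and keep those that contain the user's words in the same order. The title list, the function names and the shortest-first ranking below are all placeholders I made up; a real system would generate titles from a grammar and rank them far more carefully.

```python
def tokens(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def contains_in_order(title, query):
    """True if every query token occurs in the title, in the same order."""
    remaining = iter(tokens(title))
    # `word in remaining` consumes the iterator up to the match, so the
    # next query token can only match later in the title.
    return all(word in remaining for word in tokens(query))


def suggest(query, titles):
    """Titles containing the query tokens in order, shortest first."""
    matches = [t for t in titles if contains_in_order(t, query)]
    return sorted(matches, key=len)


TITLES = [
    "friends who have visited New York",
    "friends who live in New York",
    "photos of my friends",
    "restaurants in New York",
]
```

With these titles, suggest("friends New York", TITLES) keeps only the two "friends who ..." entries, which matches the example in the quoted passage.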

advice on choosing different binary xml tools

My requirement is to compress an XML file into a binary format, transmit it and decompress it (lightning fast) before I start parsing it.
There are quite a few binary XML protocols and tools available. I found EXI (Efficient XML Interchange) better than the others. I tried its open source implementation EXIficient and found it good.
I heard about Google Protocol Buffers and Facebook's Thrift; can anyone tell me if these two can do the job I am looking for?
Or just let me know if there is anything better than EXI I should look for.
Also, there is a good XML parser, VTD-XML (I haven't tried it myself, just googled it and read some articles), that achieves better parsing performance than DOM, SAX and StAX.
I want the best of both worlds, best compression + best parsing performance. Any suggestions?
One more thing regarding EXI: how can EXI claim to be fast at parsing a decoded XML file? Because it is being parsed by DOM, SAX or StAX? I would have believed this to be true if there was another binary parser for reading the decoded version. Correct me if I am wrong.
Also, is there any good open source C++ implementation of the EXI format? A Java version is available as EXIficient, but I am not able to spot an open source C++ implementation.
There is one by agile delta but that's commercial.
You mention protocol buffers (protobuf); this is a binary format, but has no direct relationship to XML. In particular, no member-names (element names / attribute names / namespaces) are encoded - it is just the data (with numeric markers for identifiers).
As such, you cannot reconstruct arbitrary XML from a protobuf stream unless you already know how to map "field 3" etc.
However! If you have an object-model that works with both XML and protobuf, the transform is trivial; deserialize with either - serialize with either. How well this works depends on the implementation; for example, it is trivial with protobuf-net and is actually how I do the codegen (load the binary; write as XML; run the XML through an xslt layer to emit code).
If you actually just want to transfer object data (and XML is just a proposed implementation detail), then I thoroughly recommend protobuf; platform independent, a wide range of implementations, version-tolerant, very small output, and very fast processing at both read and write.
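To make the "numeric markers" point concrete, here is a hand-rolled Python sketch of the protobuf wire encoding for two simple field kinds. It is not the official library, the helper names are made up for illustration, and it only handles non-negative integers and UTF-8 strings. Note that no field name ever reaches the output, only the field number.

```python
def encode_varint(value):
    """Base-128 varint, least significant 7-bit group first."""
    out = bytearray()
    while True:
        low = value & 0x7F
        value >>= 7
        if value:
            out.append(low | 0x80)  # more groups follow
        else:
            out.append(low)
            return bytes(out)


def encode_int_field(field_number, value):
    """Varint field: tag is (field_number << 3) | wire type 0."""
    return encode_varint(field_number << 3) + encode_varint(value)


def encode_string_field(field_number, text):
    """Length-delimited field: wire type 2, then length, then the bytes."""
    data = text.encode("utf-8")
    tag = encode_varint((field_number << 3) | 2)
    return tag + encode_varint(len(data)) + data
```

For example, encode_int_field(1, 150) yields the three bytes 08 96 01. Without a schema saying what "field 1" means, there is nothing in those bytes from which an element or attribute name could be recovered.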
Nadeem,
These are very good questions. You might be new to the domain, but the same questions are frequently asked by XML veterans. I'll try to address each of them.
I heard about google protocol buffers and facebook's thrift, can any one tell me if these two can do the job i am looking for?
As mentioned by Marc, Protocol Buffers and Thrift are binary data formats, but they are not XML formats designed to transport XML data. E.g., they have no support for XML concepts like namespaces, attributes, etc., so mapping between XML and these binary formats would require a fair bit of work on your part.
OR just let me know if there is anything better than EXI i should look for.
EXI is likely your best bet. The W3C completed a pretty thorough analysis of XML format implementations and found the EXI implementation (Efficient XML) consistently achieved the best compactness and was one of the fastest. They also found it consistently achieved better compactness than GZIP compression and even packed binary formats like ASN.1 PER (see W3C EXI Evaluation). None of the other XML formats were able to do that. In the tests I've seen comparing EXI with Protocol Buffers, EXI was at least 2-4 times smaller.
I want best of both worlds, best compression + best parsing performance, any suggestions??
If it is an option, you might want to consider the commercial products. The W3C EXI tests mentioned above used Efficient XML, which is much faster than EXIficient (sometimes >10 times faster parsing and >20 times faster serializing). Your mileage may vary, so you should test it yourself if it is an option.
One more thing regarding EXI, how can EXI claim to be fast at parsing a decoded XML file?
The reason EXI can be smaller and faster to parse than XML is because EXI can be streamed directly to/from memory via the standard XML APIs without ever producing the data in an intermediate XML format. So, instead of serializing your data as XML via a standard API, compressing the XML, sending the compressed XML, decompressing the XML on the other end, then parsing it through one of the XML APIs, ... you can serialize your data directly as EXI via a standard XML API, send the EXI, then parse the EXI directly through one of the XML APIs on the other side. This is a fundamental difference between compression and EXI. EXI is not compression per-se -- it is a more efficient XML format that can be streamed directly to/from your application.
Hope this helps!
Compression is unified with the grammar system in the EXI format. The decoder API generally gives you a sequence of events, such as SAX events, when you let a decoder process an EXI stream. However, decoders do not internally convert EXI back into XML text to feed into another parser. Instead, the decoder does all the convoluted decompression/scanning itself to yield an API event sequence such as SAX. Because EXI and XML are compatible at the event level, it is fairly straightforward to write out XML text given an event sequence.

http live streaming (HLS): mixing streams and playlists in index file

I'm implementing a small HLS playlist parser. I was wondering if a variant playlist could also contain streams.
I.e. Is the following allowed?
#EXTM3U
#EXT-X-TARGETDURATION:8
#EXT-X-MEDIA-SEQUENCE:2680
#EXTINF:8,
https://priv.example.com/fileSequence2680.ts
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1280000
http://example.com/low.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=2560000
http://example.com/mid.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=7680000
http://example.com/hi.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=65000,CODECS="mp4a.40.5"
http://example.com/audio-only.m3u8
It doesn't really make sense but the spec doesn't explicitly say it's not allowed.
If it were allowed, I would expect a player to play fileSequence2680.ts then files from low.m3u8, mid.m3u8, hi.m3u8 or audio-only.m3u8 depending on the bandwidth.
Thanks
Probably not. Passing such a playlist through Apple's mediastreamvalidator is probably the best way to find out if this is supported (I doubt it).
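For the parser itself, it may be simplest to detect the mix and reject it. If I read it correctly, the later published version of the spec (RFC 8216, section 4.1) says a playlist must be either a Media Playlist or a Master Playlist and that anything else is invalid. Here is a rough Python sketch of such a check; the tag lists are deliberately incomplete and the function name is my own.

```python
# Tags that only make sense in a media playlist (incomplete list).
MEDIA_TAGS = ("#EXTINF", "#EXT-X-TARGETDURATION", "#EXT-X-MEDIA-SEQUENCE")
# Tags that only make sense in a master/variant playlist (incomplete list).
MASTER_TAGS = ("#EXT-X-STREAM-INF",)


def classify_playlist(text):
    """Return 'media', 'master', 'mixed' or 'unknown' for an M3U8 string."""
    has_media = has_master = False
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(MEDIA_TAGS):
            has_media = True
        elif line.startswith(MASTER_TAGS):
            has_master = True
    if has_media and has_master:
        return "mixed"
    if has_media:
        return "media"
    if has_master:
        return "master"
    return "unknown"
```

The playlist from the question comes back as 'mixed', which a parser can then treat as an error instead of guessing at playback order.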