Looking for a way to output the top viewed YouTube video for a search term to a text file - perl

I would like to output just the top YouTube video for a particular search term, e.g. Tennis, to a text file. I prefer command-line options but am open to other solutions.

You can fetch the data you need in XML format from YouTube's API.
(Note: The results may differ from those on the HTML website.)
Then parse the XML with anything you want, e.g. Perl's XML::LibXML::XPathContext. It's a bit fiddly though, if you haven't used that module before.
Once you have the video URL, you can pass it to youtube-dl.
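For what it's worth, here is a rough sketch of the parse-and-save step in Python rather than Perl. It assumes the API hands back an Atom-style feed whose first entry is the top result and whose video URL sits in a link element with rel="alternate"; the sample feed, the function names and the output path are all made up for illustration, so check them against the real response before relying on any of it.

```python
import xml.etree.ElementTree as ET

# Namespace prefix map for Atom feeds.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up sample standing in for an API response; not real YouTube output.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Example tennis video</title>
    <link rel="alternate" href="http://www.youtube.com/watch?v=EXAMPLE0001"/>
  </entry>
  <entry>
    <title>Another tennis video</title>
    <link rel="alternate" href="http://www.youtube.com/watch?v=EXAMPLE0002"/>
  </entry>
</feed>"""


def first_video_url(feed_xml):
    """Return the href of the first entry's 'alternate' link, or None."""
    root = ET.fromstring(feed_xml)
    link = root.find("atom:entry/atom:link[@rel='alternate']", ATOM)
    return link.get("href") if link is not None else None


def save_top_video(feed_xml, path):
    """Write the first video URL in the feed to a text file and return it."""
    url = first_video_url(feed_xml)
    if url is None:
        raise ValueError("feed contains no entry with a video link")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(url + "\n")
    return url


if __name__ == "__main__":
    # "top_video.txt" is a hypothetical output file name.
    print(save_top_video(SAMPLE_FEED, "top_video.txt"))
```

The sketch only takes the first entry, so it depends on the feed already being ordered the way you want (for example by view count).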

Related

Writing MP4 tags for M4A or MP4 audio files

I have a strange problem with MP4 tagging. I can figure out two styles of tags: one that works with mp3tag and tagscanner, and another that works with MusicBee. But I can't figure out one that works universally with all of them, so I write two sets of tags into the file,
and even this isn't enough. Players like AIMP and Clementine still can't read MP4 files I tagged this way. I have to open mp3tag, load my files and save them; only then does it write tags that those music players understand. But I can't find good documentation anywhere.
Does anyone know what kind of tags I need to write so that all of them can read the tags? I tried looking at MP4s that work in all of them, but it is no use: I see tags like "Artist", and I already write a tag called "Artist". I mean, it looks like "Artist" in exif also; this is the tag I wrote that MusicBee understands.
I use the AudioGenie Windows Library to write the tags. There are two different methods for writing a tag. One is called an ISLT text frame (I have no idea what that is) and requires an integer code as well as text when writing. The other is called an iTune text frame and requires a string frame ID as well as text.
I also tried to shove MP3 ID3v2 tags into both of those, to see whether that was what the third group of players that can't read my tags wanted. But that didn't work. I only tried this because I read somewhere that ID3v2 tags are widely used in MP4 files (I read this in only one comment on Stack Overflow, so I'm skeptical).
Could someone point me in the right direction?

Extract data from many PDF forms

I regularly receive large numbers of the same PDF form. I want to extract the data from them into a text file. I'd like to do this via a script of some sort. I'm working in a UNIX environment.
Is this possible? I've googled my brains out and can't find anything.
Text in PDF is represented by text elements in page content streams. The streams are commonly compressed. If you have the time and resources you can use ISO 32000-1:2008 or Adobe PDF 1.7 specification to build your own PDF parser. Or it may be more practical to use a 3rd party app as an intermediate translation step.
There are utilities that will decode the stream and give you clear text. One option is PDFtk Server which will work in your environment. Another option is to use the Poppler PDF Rendering Library which has a command line utility "pdftotext" useful for searching for strings in PDFs.
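As an illustration of the PDFtk route for form data specifically, here is a minimal Python sketch. It assumes pdftk is installed and on the PATH, and that its dump_data_fields operation prints one record per form field, separated by "---" lines, with "FieldName: ..." and "FieldValue: ..." entries. The helper names and the sample text are made up, so verify the output format against your own forms first.

```python
import subprocess


def parse_data_fields(dump_text):
    """Turn pdftk dump_data_fields output into a {field name: value} dict."""
    fields = {}
    record = {}
    # A trailing "---" flushes the last record.
    for line in dump_text.splitlines() + ["---"]:
        if line.strip() == "---":
            if "FieldName" in record:
                fields[record["FieldName"]] = record.get("FieldValue", "")
            record = {}
            continue
        key, sep, value = line.partition(": ")
        if sep:
            # Keep the first occurrence of a key within a record.
            record.setdefault(key, value)
    return fields


def dump_fields(pdf_path):
    """Run pdftk on one PDF and return its form fields (requires pdftk)."""
    result = subprocess.run(
        ["pdftk", pdf_path, "dump_data_fields"],
        capture_output=True, text=True, check=True,
    )
    return parse_data_fields(result.stdout)


# Made-up sample of what the dump might look like for a two-field form.
SAMPLE_DUMP = """---
FieldType: Text
FieldName: first_name
FieldValue: Alice
---
FieldType: Text
FieldName: city
FieldValue: Springfield
"""

if __name__ == "__main__":
    print(parse_data_fields(SAMPLE_DUMP))
```

Looping dump_fields over a directory of forms and writing each dict out as a line of text would cover the "many forms" part of the question.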

How to read PDF file from right to left on iOS

I am working on a basic project that reads PDF files from a server and shows them on the screen.
The issue is that I want to page through those files from right to left.
As Massimo Cafaro says:
If you want to extract some content from a pdf file, then you may want to read the following:
Parsing PDF Content
from the Quartz 2D programming guide.
Basically, you will use a CGPDFScanner object to parse the contents, which works as follows. You register a few callbacks that will be automatically invoked by Quartz 2D upon encountering some pdf operators in the pdf stream. After this initial step, you then actually start parsing the pdf stream.
Taking a brief look at your code, it appears that you are not following the steps required to parse the pdf content of the page you get through CGPDFDocumentGetPage(). You first need to set up the callbacks using CGPDFOperatorTableCreate() and CGPDFOperatorTableSetCallback(). Then you get the page, create a content stream from that page (using CGPDFContentStreamCreateWithPage()), instantiate a CGPDFScanner through CGPDFScannerCreate(), and actually start scanning through CGPDFScannerScan().
The "Parsing PDF Content" section of the document pointed out by the above URL gives you all of the information required to implement pdf parsing.
If you haven't tried anything yet, you can start with this project link

How to extract data from a web site and format to raw text - iPhone Dev

I have been looking around for a while and have not found anything useful. I'm also not sure I have worded the question in the clearest fashion, so apologies.
I have a section of an app I am building called 'Company News'. The company in question has a news page on their website which displays a title, an excerpt of text and a read more option.
At the minute in the iPhone application I just have a UIWebView which links to that URL and displays an error if no connection is available. However, if my user clicks a story to read the news, it obviously opens up a new page. I want to avoid having to build in 'back' and 'forward' buttons, and to stay away from it looking like a browser within the app.
With that said, I am looking for a way to just extract that data from the website and just display it in my app as raw text. I am not particularly bothered about rich text formatting or anything fancy. I would just like the title and body of text.
Is this possible?
In essence, then, you are looking for an HTML parser.
Assuming the HTML you wish to parse has a predictable format, the approach I would take is to load the HTML via whatever URL loading system you want - e.g. NSURLConnection, ASIHTTPRequest, etc.
Then you will need to parse the raw HTML. I use XPath. It requires that you learn the syntax, but it should work.
For more details about how you might use XPath for parsing HTML, see the second response to this question. You will need to link to libxml2 in your project then use XPath to extract the nodes of interest.
Scraping web pages in this way is fragile, though, because it depends on the structure of a page you don't control and which could be changed unpredictably.
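To give a feel for the XPath approach, here is a small sketch in Python instead of Objective-C. It swaps libxml2 for the standard library's ElementTree, which only understands a subset of XPath and only parses well-formed markup, so treat it as an illustration of the idea rather than something that will survive real-world HTML. The page structure, the class names and the function name are invented for the example.

```python
import xml.etree.ElementTree as ET

# Invented, well-formed stand-in for the company's news page.
SAMPLE_PAGE = """<html><body>
<div class="news-item">
  <h2>Office move</h2>
  <p class="excerpt">We are moving to a new building.</p>
</div>
<div class="news-item">
  <h2>New product</h2>
  <p class="excerpt">Our latest release is out.</p>
</div>
</body></html>"""


def extract_stories(page):
    """Return a list of (title, excerpt) pairs found in the page."""
    root = ET.fromstring(page)
    stories = []
    # Select every <div class="news-item"> anywhere in the document.
    for item in root.findall(".//div[@class='news-item']"):
        title = item.findtext("h2", default="").strip()
        excerpt = item.findtext("p[@class='excerpt']", default="").strip()
        stories.append((title, excerpt))
    return stories


if __name__ == "__main__":
    for title, excerpt in extract_stories(SAMPLE_PAGE):
        print(title)
        print(excerpt)
        print()
```

The XPath expressions are the part that breaks when the site is redesigned, which is why keeping them in one place is worthwhile.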

List out all videos from a URL

I am trying to list all videos from a URL. For this I am sending a request to YouTube
at the URL "http://www.youtube.com/" and want to list all available videos. But I didn't get anything back from that request. Any ideas or documentation hints?
There are utilities for downloading youtube videos (for example Linux has youtube-dl), but it's not uncommon for sites with large numbers of downloadable files to prevent attempts to simply download everything - and even though you said you wanted to list rather than download all the videos, that's unfortunately what it would suggest to a website administrator.
Besides, files on youtube are not accessed by simple urls like http://www.youtube.com/filename
Something more is required. I don't think you can treat the (what is it?) 11-character alphabet soup as a filename; it's a parameter passed to the software which streams back the video.
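To make the "parameter, not filename" point concrete, here is a short Python sketch that pulls the v query parameter out of a watch URL using the standard library. The function name is made up, and it only handles the watch?v=... URL form shown in this answer.

```python
from urllib.parse import urlparse, parse_qs


def video_id(url):
    """Return the value of the 'v' query parameter, or None if absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("v")
    return values[0] if values else None


if __name__ == "__main__":
    print(video_id("http://www.youtube.com/watch?v=Z1JZ9O15280"))
```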
EDIT: youtube-dl is a command-line program for Linux and probably BSD. You need to know the URL of the YouTube video so you can type (for example):
youtube-dl http://www.youtube.com/watch?v=Z1JZ9O15280
If you had a list of these URLs you could put them in a file and make a bulk download script - but that takes us back to your original question.
In Firefox I would right-click on a link to a Youtube video and choose 'copy link location'. Then paste the URLs one at a time into a text file. But this question is drifting away from mere programming...
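For the bulk download script mentioned above, a minimal Python sketch could look like the following. It assumes youtube-dl is installed and on the PATH and that the text file holds one URL per line; the function names and the file name urls.txt are hypothetical.

```python
import subprocess


def clean_urls(lines):
    """Drop blank lines and '#' comment lines, strip whitespace."""
    urls = []
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            urls.append(line)
    return urls


def build_commands(urls):
    """Build one youtube-dl command (as an argument list) per URL."""
    return [["youtube-dl", url] for url in urls]


def download_all(path):
    """Read URLs from a text file and download them one at a time."""
    with open(path, encoding="utf-8") as fh:
        urls = clean_urls(fh)
    for command in build_commands(urls):
        # check=False: keep going even if one download fails.
        subprocess.run(command, check=False)


if __name__ == "__main__":
    download_all("urls.txt")  # hypothetical file of pasted URLs
```

If I remember right, youtube-dl also has its own batch-file option (-a FILE) that reads URLs from a file, which may make a wrapper like this unnecessary.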