I run a small record label and we have a bunch of audio files stored on Amazon's S3. We want them converted to MP3s at a standard bitrate. I read about the NYTimes converting all their PDFs using EC2, and since I'm a nerdy web programmer, I'm intrigued. Instead of downloading all the files and converting them by hand, I'm wondering what it takes to set up an EC2 instance and get it converting files. I want to be able to control it from my web server with PHP, so is the approach to create a virtual LAMP stack and install the LAME encoder?
If you want to convert your audio files (I'm assuming .wav, since it's a pretty common format before conversion) to MP3, LAME is a solid encoder.
A full-blown LAMP stack is entirely unnecessary for using LAME; a simple shell script will suffice.
This will convert all *.wav files in the current directory to .mp3 files if a converted copy is not already in place (LAME doesn't care about clobbering output files).
#!/bin/bash
# Convert every .wav in the current directory to .mp3,
# skipping files that already have a converted copy.
for file in *.wav; do
    # Replace the trailing "wav" with "mp3" for the output name
    dest="${file%wav}mp3"
    if [[ -e "$file" ]] && [[ ! -e "$dest" ]]; then
        lame "$file" "$dest"
    fi
done
You will want to look through man lame for the conversion options specific to your VBR/CBR/ABR (variable, constant and average bitrate) needs.
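For example (a sketch; pick whichever mode fits your bitrate policy):
# constant bitrate at 192 kbps
lame -b 192 input.wav output.mp3
# variable bitrate, quality level 2 (lower numbers mean higher quality)
lame -V 2 input.wav output.mp3
# average bitrate targeting 192 kbps
lame --abr 192 input.wav output.mp3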
While the above answer would work if you already had the files on the EC2 instance, you'll have to fetch each song from S3 into EC2, either into a pipe for conversion or into a temporary file, and then either pipe the result back up to S3 or store it in a temporary file and upload it back to S3.
I haven't actually used EC2, so I'm not sure what kind of storage you're working with, but you should have plenty of space to store the one temporary MP3.
You would probably also want to create some way of tracking status, probably by doing a listing of your bucket before you start.
A Perl script using the S3 module would probably be better suited, but I'm too lazy to type that all in here :).
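If you'd rather stay in the shell, here's a rough sketch using the AWS CLI instead of Perl (the bucket names are hypothetical, and it assumes the aws tool and lame are installed on the instance, and that your keys contain no spaces):
#!/bin/bash
# fetch each .wav from the source bucket, convert it, upload the .mp3
for key in $(aws s3 ls s3://my-label-masters/ | awk '{print $4}' | grep '\.wav$'); do
    aws s3 cp "s3://my-label-masters/$key" /tmp/in.wav
    lame /tmp/in.wav /tmp/out.mp3
    aws s3 cp /tmp/out.mp3 "s3://my-label-mp3s/${key%wav}mp3"
done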
You could use Elastic MapReduce for this. Although you'd have to play around a bit to get it to spit out separate files as output.
In a Perl script, I am trying to convert SVG files to PDF. This works great by just calling out to Inkscape:
system "inkscape -D -z --file=$in --export-pdf=$out";
But it is enormously slow, even for small 100 KB files; it can take minutes per file, causing the script to fail when run under a time-out constraint, e.g. on a web server.
To speed things up, I have read about svg2pdf as a standalone tool, but I never found a binary for Win7 or managed to compile it, even with the libcairo DLLs present.
My last idea now is to use the CPAN module Cairo. I am hoping it can convert an SVG file to PDF, but in the documentation I only find drawings and surfaces, and no method to write/convert.
Has anyone experience with that?
Making my comment an answer: you could try rsvg-convert, which is part of the librsvg library. It's probably faster than Inkscape, but it's still an external command.
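For example, something like this (using rsvg-convert's format and output options):
rsvg-convert --format=pdf --output=out.pdf in.svg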
I am looking for a tool that would ease the modification of text configuration files for tasks like:
Set ForwardAgent yes in /etc/ssh/ssh_config
Append HGUSER to AcceptEnv in /etc/ssh/sshd_config (that's more complex, as AcceptEnv takes several params; if yours is not already there it should be added)
Most important:
running it several times should have no side effects.
if something looks weird, it should complain (for example if you find the same line several times in a file, or if the expected syntax does not match).
Is there any Linux tool that can easily be used to automate things like this?
The whole point is to be able to write these config patches somewhere so you can deploy them on several machines or on a new machine when needed.
I would certainly do this with bash scripting. Here is a great tutorial.
http://linuxconfig.org/Bash_scripting_Tutorial
To change a line in a file you could do something like:
- check the file exists
- grep for the value you want to change, and error out if it appears multiple times
- use sed to change that line

To append something to a file:
- check the file exists
- grep to ensure it hasn't been appended already
- echo whatever >> file (the double greater-than appends to a file)

With each of these I would make a backup copy of the file first, in case something goes wrong; see the sketch after this list.
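A minimal sketch of that recipe, using the ForwardAgent example from the question (adjust the file and setting to taste):
#!/bin/bash
# Idempotently set "ForwardAgent yes" in ssh_config.
file=/etc/ssh/ssh_config
setting="ForwardAgent yes"

[[ -f "$file" ]] || { echo "missing $file" >&2; exit 1; }

count=$(grep -c '^ForwardAgent' "$file")
if (( count > 1 )); then
    # something looks weird: complain rather than guess
    echo "multiple ForwardAgent lines in $file; fix by hand" >&2
    exit 1
elif (( count == 1 )); then
    # -i.bak edits in place but keeps a backup copy first
    sed -i.bak "s/^ForwardAgent.*/$setting/" "$file"
else
    cp "$file" "$file.bak"
    echo "$setting" >> "$file"
fi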
You might want to have a look at the Unified Configuration Interface (UCI) used in embedded Linux systems. If you have the flexibility to adapt your config files to the UCI format, this is pretty similar to what you are looking for.
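For instance, on an OpenWrt-style system the workflow looks roughly like this (the option path and value are illustrative):
# read the current value
uci get system.@system[0].hostname
# set it and persist the change; running this twice has no side effects
uci set system.@system[0].hostname='studio-box'
uci commit system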
I would like to view my CSV files in a column-aligned format from the command line, with something like less, but my CSV files are sometimes gigabytes big, and I'm using a little computer (Netbook, 1GB RAM, 8GB HD, 1GHz processor), so I don't want to waste a lot of memory or processing power viewing the file.
I mention that I'd like to use something like less because I would like to be able to navigate around within the file.
cat FILE | column -s, -t | less is one thought, but cat is still going to try to print the whole file and I'm not sure how much buffering the pipes will use (if any) or what sort of caching less employs.
This question is similar to this other question, but I'm specifically interested in viewing large files using minimal resources preferably already on the machine. I don't presently use VI or EMACS, and think they'd both be overkill here. VI, for instance, would be a 27MB install for a utility acting merely as a viewer.
First of all, less can open oversized files. Second, both vim (which I use with the Largefile plugin and with files over 8 GB) and emacs can do it.
But... most of the time, viewing a big file in an 80x40 (or slightly bigger) terminal is useless, so you should filter it with something like (f)grep or process it with awk. If you want only the start or end, there are head and tail.
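For example, something like this keeps memory use down by paging only what you ask for (a sketch; adjust the row count and filter):
# align and page just the first 1000 rows
head -n 1000 big.csv | column -s, -t | less -S
# or filter down to the rows you care about first
grep 'Some Artist' big.csv | column -s, -t | less -S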
HTH
Check the tail / head commands.
Or even better, download the VIM source and compile it. That should be easy enough. The version 5.8 source is 1 MB before decompressing (4 MB after). Enjoy.
I need to quickly join a number of separate m4a files into one large one. Is there any way to do so via the CLI in Mac OS X?
I think I found a way myself without any transcoding necessary, with some inspiration from Coxy's answer. MP4Box lets me do it: MP4Box -cat file1.m4a -cat file2.m4a output.m4a. It doesn't retain any metadata, but for my purpose it's just fine.
Now, given Coxy's answer, are there any pitfalls with the file I just haven't discovered yet?
Unfortunately AAC files cannot be joined by simple concatenation, which is what I assume you mean when you mention doing it via the command line?
If you installed something like ffmpeg then you could certainly build up a process to take in AAC audio files, convert them to the uncompressed wave data, join them all up, then export the resulting file back to AAC again.
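A sketch of that pipeline, assuming ffmpeg and sox are both installed (the filenames are placeholders):
# decode each AAC file to uncompressed wave data
ffmpeg -i file1.m4a file1.wav
ffmpeg -i file2.m4a file2.wav
# sox concatenates wave files in the order given
sox file1.wav file2.wav joined.wav
# re-encode the joined file back to AAC
ffmpeg -i joined.wav -c:a aac joined.m4a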
Alternately, if using the CLI is not a hard requirement then you could do this with iTunes and Audacity.
Usually both files are available for running some diff tool, but I need to find the differences between 2 binary files when one of them resides on the server and the other is on the mobile device. Then only the differing parts can be sent to the server and the file updated.
There is the bsdiff tool. Debian has a bsdiff package, too, and there are high-level programming language interfaces like python-bsdiff.
I think that a jailbroken iPhone, Android or similar mobile device can run bsdiff, but you may have to compile the software yourself.
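Basic usage looks like this (a sketch; the filenames are placeholders):
# produce a patch describing old -> new
bsdiff old.bin new.bin changes.patch
# apply the patch to the old file to reconstruct the new one
bspatch old.bin new.bin changes.patch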
But note: if you use the binary diff only to decide which parts of the file to update, you may be better off with rsync. rsync has a built-in binary diff algorithm.
You're probably using the name generically, because diff expects its arguments to be text files.
If given binary files, it can only say they're different, not what the differences are.
But you need to update only the modified parts of binary files.
This is how the open-source program rsync works, but I'm not aware of any version running on mobile devices.
To find the differences, you must compare. If you cannot compare, you cannot compute the minimal differences.
What kind of changes do you do to the local file?
Inserts?
Deletions?
Updates?
If only updates, i.e. the size and location of unchanged data are constant, then a block-type checksum solution might work: split the file into blocks, compute the checksum of each, and compare against a list of previous checksums. Then you only have to send the modified blocks.
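A quick sketch of that block-checksum idea in shell (the 4 KiB block size is an arbitrary choice, and checksums.old is assumed to be kept from a previous run):
# split the file into fixed-size blocks, then checksum each one
split -b 4096 file.bin /tmp/block_
md5sum /tmp/block_* > checksums.new
# lines that differ mark the blocks that need resending
diff checksums.old checksums.new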
Also, if possible, you could store two versions of the file locally, the old and modified.
Sounds like a job for rsync. See also librsync and pyrsync.
The cool thing about the rsync algorithm is that you don't need both files to be accessible on the same machine.
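For example (host and paths are placeholders; -z additionally compresses what does get sent):
# only the changed parts of file.bin travel over the wire
rsync -avz file.bin user@server:/path/to/file.bin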