Create .nfo-files recursively - find

I have a collection of concert-media (audio and video). They're all organised using the same pattern:
./ARTISTNAME/[YYYY-MM-DD] VENUE, CITY
Basically, I want to write a script which goes through the folders, looks for the [YYYY-MM-DD] folders, gets the information about the artist (one folder level above), date, venue and location (from the name of the folder it just found), and writes that information into a .nfo file which it saves inside the found folder.
I have already done quite a bit of research on this topic and found a similar script, but I am stuck because it searches for files instead of folders:
#!/bin/bash
find . -path "*[????-??-??]*" | while read folder; do # the script is supposed to look for the concert-folders
-> band_name=`echo $file_name | sed 's/ *-.*$//'` # Get rid of song name
-> band_name=`echo $band_name | sed 's/^.*\///'` # Get rid of file path
-> song_name=`echo $file_name | sed 's/^.*- *//'` # Get rid of band name
-> song_name=`echo $song_name | sed 's/.avi//'` # Get rid of file extension
-> new_file_name=`echo $file_name | sed 's/.avi/.nfo/'` # Make new filename
-> echo "Making $new_file_name..."
echo -en "<musicvideo>\n<venue>$venue_name</venue>\n<city>$city_name</city>\n<date>$date</date>\n<artist>$band_name</artist>\n</musicvideo>\n" > "$new_file_name"
done
After changing the first part of the script (making it look for the folders containing "[YYYY-MM-DD]"), I understand that the second part of the script assigns the "tags" (such as artist, date, location, etc.), but I don't know how to make the script take those tags from the folder names. Basically, help is needed on the lines marked with "->".
In the last part, the script is supposed to write the collected information for the folder into a .nfo file (e.g. FOLDERNAME.nfo).

Here is a simple Python example to get you started. If you want to port it to shell you can.
First, my setup:
test
test/BOB
test/BOB/[3011-01-01] Lollapalooza 213, Saturn Base 5
test/THE WHO
test/THE WHO/[1969-08-17] Woodstock, Woodstock
The code:
#!/usr/bin/env python
import os
import os.path
import re
import sys

def handle_concert(dirname, artist, date, venue, city):
    """Create an NFO file in the directory."""
    # The {} are replaced by each argument in turn, like printf().
    # A triple-quoted string is a lot like a here document in shell, i.e.
    # cat <<EOF
    # foo
    # EOF
    nfo_data = """<musicvideo>
<venue>{}</venue>
<city>{}</city>
<date>{}</date>
<artist>{}</artist>
</musicvideo>
""".format(venue, city, date, artist)
    nfo_file = "[{}] {}, {}.nfo".format(date, venue, city)
    # When the with statement is done, the file is closed and fully
    # written.
    with open(os.path.join(dirname, nfo_file), "w") as fp:
        fp.write(nfo_data)

# This is a regular expression which matches:
# */FOO/[YYYY-MM-DD] VENUE, CITY
# Where possible, white space is left intact
concert_re = re.compile(r'.*/(?P<artist>.+)/\[(?P<date>\d{4}-\d{2}-\d{2})\]\s+(?P<venue>.+),\s+(?P<city>.+)')

def handle_artist(dirname):
    """Found an ARTIST directory. Look for concerts.

    If a subdirectory is found, see if it matches a concert.
    When a concert is found, handle it.
    """
    for i in os.listdir(dirname):
        subdir = os.path.join(dirname, i)
        m = concert_re.match(subdir)
        if m:
            print(subdir)  # to watch the progress
            handle_concert(subdir, m.group("artist"), m.group("date"),
                           m.group("venue"), m.group("city"))

def walk_collection(start_dir):
    """Examine contents of start_dir.

    If a directory is found, assume it is an ARTIST.
    """
    for i in os.listdir(start_dir):
        # os.path.join ensures paths are right regardless of OS
        dirname = os.path.join(start_dir, i)
        if os.path.isdir(dirname):
            print(dirname)  # to watch the progress
            handle_artist(dirname)

if __name__ == "__main__":
    collection_dir = sys.argv[1]  # equiv of $1
    walk_collection(collection_dir)
How to run it:
$ nfo-creator.py /path/to/your/collection
The results:
<musicvideo>
<venue>Woodstock</venue>
<city>Woodstock</city>
<date>1969-08-17</date>
<artist>THE WHO</artist>
</musicvideo>
and
<musicvideo>
<venue>Lollapalooza 213</venue>
<city>Saturn Base 5</city>
<date>3011-01-01</date>
<artist>BOB</artist>
</musicvideo>
You can add more handle_ functions to handle differing formats. Just define a new regular expression and a handler for it. I kept this really simple for learning purposes, with plenty of shell-oriented comments.
This could totally be done in the shell. But writing this Python code was way easier and allows for more growth in the future.
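For instance, if the collection ever grows deeper than the fixed ARTIST/concert two-level layout, the same regular expression could be driven by os.walk instead of the two listdir passes. A minimal sketch (standard library only; it just prints what it finds, and handle_concert from the script above would be called where the comment sits):
#!/usr/bin/env python
import os
import re
import sys

concert_re = re.compile(r'.*/(?P<artist>.+)/\[(?P<date>\d{4}-\d{2}-\d{2})\]\s+(?P<venue>.+),\s+(?P<city>.+)')

for root, dirs, files in os.walk(sys.argv[1]):
    for d in dirs:
        path = os.path.join(root, d)
        m = concert_re.match(path)
        if m:
            # here you would call handle_concert(path, m.group("artist"),
            # m.group("date"), m.group("venue"), m.group("city"))
            print(path)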
Enjoy, and feel free to ask questions.

Related

Change one word in lots of HTML pages with another word from a list of words

I have about 2000 HTML pages (all pages have the same content except for the city name). I have the list of city names, and I need each page to have one city name.
How can I change the City name in each page?
city name list: birmingham
montgomery
mobile
huntsville
tuscaloosa
hoover.. etc...
and I need to make each page like this:
title: birmingham,
next page;
title: montgomery,
and so on.
I need the change to happen in the title: Example (City Name)
and in two other h2 tags.
Thank you very much for your attention!
Update:
This script is for the existing files. It will recursively find all the index.html files under the current directory and replace the "string_to_replace" string with the name of that file's parent directory, which is the city name in your case. It will also capitalize that name before the replacement.
Feel free to update the template_string variable value in the script so that it matches the string used in your index.html files in place of the city name.
#!/bin/bash
template_string="string_to_replace"
current_dir=$(pwd)
find "$current_dir" -name 'index.html' | while read -r file; do
    dir=$(basename "$(dirname "$file")")
    city="$(tr '[:lower:]' '[:upper:]' <<< "${dir:0:1}")${dir:1}"
    sed -i -e "s/$template_string/$city/g" "$file"
done
Initial answer:
My initial suggestion is to use a bash script (e.g. script.sh) similar to this:
#!/bin/bash
file="./cities.txt"
template="./template.html"
template_string="string_to_replace"
while IFS= read -r line
do
    cp "$template" "$line.html"
    sed -i -e "s/$template_string/$line/g" "$line.html"
    echo "$line"
done <"$file"
and run it from bash terminal:
$ source script.sh
What you need to have:
cities.txt with the list of city names, e.g.
London
Yerevan
Berlin
template.html with the HTML template you need to have in each file. Make sure the city name is set as "string_to_replace" in it, e.g. title: string_to_replace
Since you did not mention anything related to the file names, the files will be named like London.html, Yerevan.html,...
Let me know in case you don't need to create new files and instead need to make the replacement in the existing ones. In that case we'll need to update the script a bit, once you tell me how to know which string is used in each file.

Separate and process odd-named text files from even-named text files in Python

I am new to programming and Python and am trying to write a program to process astronomical data. I have a huge list of files with names like ww_12m_no0021.spc, ww_12m_no0022.spc and so on. I want to move all the odd-numbered files and even-numbered files into two separate folders.
import shutil
import os
for file in os.listdir("/Users/asifrasha/Desktop/python_test/input"):
    if os.path.splitext(file)[1] == ".spc":
        print file
        shutil.copy(file, os.path.join("/Users/asifrasha/Desktop/python_test/output", file))
which actually copies all the .spc files to a different folder. I am struggling a bit with how to copy only the odd-numbered files (no0021, no0023…) to a separate folder. Any help or suggestions will be much appreciated!!!
import os
import shutil

# Modify these to your need
src_dir = "/Users/asifrasha/Desktop/python_test/input"
odd_dir = "/Users/asifrasha/Desktop/python_test/output/odd"
even_dir = "/Users/asifrasha/Desktop/python_test/output/even"

for filename in os.listdir(src_dir):
    basename, extension = os.path.splitext(filename)
    if extension == ".spc":
        num = basename[-4:]  # Get the numbers (i.e. the last 4 characters)
        num = int(num, 10)   # Convert to int (base 10)
        if num % 2:          # Odd
            dest_dir = odd_dir
        else:                # Even
            dest_dir = even_dir
        dest = os.path.join(dest_dir, filename)
        shutil.copy(os.path.join(src_dir, filename), dest)
Obviously you can simplify it a bit; I'm just trying to be as clear as possible.
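For example, the odd/even branch can collapse into a single conditional expression. A compact variant of the same loop (same assumed paths as above; the odd and even output directories still have to exist):
import os
import shutil

src_dir = "/Users/asifrasha/Desktop/python_test/input"
out_dir = "/Users/asifrasha/Desktop/python_test/output"

for filename in os.listdir(src_dir):
    basename, extension = os.path.splitext(filename)
    if extension == ".spc":
        # last 4 characters are the file number; odd goes to "odd", even to "even"
        subdir = "odd" if int(basename[-4:], 10) % 2 else "even"
        shutil.copy(os.path.join(src_dir, filename),
                    os.path.join(out_dir, subdir, filename))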
Assuming your files are named ww_12m_no followed by the number:
if int(os.path.splitext(file)[0][9:]) % 2 == 1:
    # file is oddly numbered, go ahead and copy...
If the length of the first half of the name changes, I would use regex... I didn't test the code, but that's the gist of it. I'm not sure this question belongs here though...
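For instance, a regex that pulls out the digits sitting right before the .spc extension, regardless of how long the prefix is (a sketch, not tested against your real file names):
import re

def is_odd_numbered(filename):
    """True if the digits just before .spc form an odd number."""
    m = re.search(r'(\d+)\.spc$', filename)
    return bool(m) and int(m.group(1)) % 2 == 1

# e.g. is_odd_numbered("ww_12m_no0021.spc") returns True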

perl quoting in ftp->ls with wildcard

Contents of the remote directory mydir:
blah.myname.1.txt
blah.myname.somethingelse.txt
blah.myname.randomcharacters.txt
blah.notmyname.1.txt
blah.notmyname.2.txt
...
In Perl, I want to download all of the files with myname in them.
I am failing really hard with the appropriate quoting. Please help.
Failed code:
my @files;
@files = $ftp->ls( '*.myname.*.txt' );  # finds nothing
@files = $ftp->ls( '.*.myname.*.txt' ); # finds nothing
etc..
How do I put the wildcards so that they are interpreted by the ls, but not by perl? What is going wrong here?
I will assume that you are using the Net::FTP package. Then this part of the docs is interesting:
ls ( [ DIR ] )
Get a directory listing of DIR, or the current directory.
In an array context, returns a list of lines returned from the server. In a scalar context, returns a reference to a list.
This means that if you call this method with no arguments, you get a list of all files from the current directory, else from the directory specified.
There is no word about any patterns, which is not surprising: FTP is just a protocol to transfer files, and this module is only a wrapper around that protocol.
You can do the filtering easily with grep:
my @interesting = grep /pattern/, $ftp->ls();
To select all files that contain the character sequence myname, use grep /myname/, LIST.
To select all files that contain the character sequence .myname., use grep /\.myname\./, LIST.
To select all files that end with the character sequence .txt, use grep /\.txt$/, LIST.
The LIST is either the $ftp->ls or another grep, so you can easily chain multiple filtering steps.
Of course, Perl regexes are more powerful than that, and we could do all the filtering in a single /\.myname\.[^.]+\.txt$/ or something, depending on your exact requirements. If you are desperate for a globbing syntax, there are tools available to convert glob patterns to regex objects, like Text::Glob, or even to do direct glob matching:
use Text::Glob qw(match_glob);
my @interesting = match_glob "*.myname.*.txt", $ftp->ls;
However, that is inelegant, to say the least, as regexes are far more powerful and absolutely worth learning.

Bitwise comparison of two directories (files) in Perl

I am trying to achieve the following using Perl:
A script that performs bitwise comparison of files from two directories
(the directory names are passed as arguments to the script in the command line).
The script should read all files from the first directory and all subdirectories, and
compare them to the corresponding files (e.g. files with the same names) in the
second directory.
The result of the script (PASSED or FAILED) is determined as follows:
The result is FAILED when at least one file from the first directory is not bitwise
equal to the corresponding file in the second directory or the second directory
has no corresponding file.
Otherwise test is PASSED.
So far I have tried the approach in the thread I created - Comparing two directories using Perl. At some point I realized I am essentially trying to simulate "diff -r dir1 dir2", which isn't the goal. How can one perform a bitwise comparison of two directories?
EDIT: Test Case
/dir1 /dir2
-- file1 -- file1
-- file2 -- file2
-- file3
-- ....
-- ...
---/subDir1
--file1
--file2
file1 of dir1 contains :- foo bar
file1 of dir2 contains :- foo
Result - Fail
file1 of dir1 contains :- foo bar
file1 of dir2 contains :- foo bar
Result - Pass.
The script essentially needs to match up files with the same names present in the two directories.
I would do something like this:
Open dir1
Read all filenames into an array
Open dir2
Read all filenames into an array
For any case in which a filename in dir1 matches a filename in dir2 or vice versa, begin compare logic
Use Digest::MD5 here to perform an MD5 comparison of the two files. If even one bit is off, you will get different checksums.
Code example from Digest::MD5...
use Digest::MD5 qw(md5 md5_hex md5_base64);
$digest = md5($data);
$digest = md5_hex($data);
$digest = md5_base64($data);
# OO style
use Digest::MD5;
$ctx = Digest::MD5->new;
$ctx->add($data);
$ctx->addfile(*FILE);
$digest = $ctx->digest;
$digest = $ctx->hexdigest;
$digest = $ctx->b64digest;
Generate an MD5 hash for each file and compare them, then pass or fail accordingly.
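If it helps to see the whole flow end to end, here is the same walk-and-hash idea sketched in Python (hashlib and os.walk), purely as an illustration; a Perl version built around Digest::MD5 would follow the same shape:
import hashlib
import os
import sys

def md5_of(path):
    """Return the MD5 hex digest of a file's contents."""
    h = hashlib.md5()
    with open(path, "rb") as fp:
        for chunk in iter(lambda: fp.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def compare_dirs(dir1, dir2):
    """FAILED if any file under dir1 is missing from dir2 or differs."""
    for root, dirs, files in os.walk(dir1):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), dir1)
            other = os.path.join(dir2, rel)
            if not os.path.isfile(other) or md5_of(os.path.join(dir1, rel)) != md5_of(other):
                return "FAILED"
    return "PASSED"

if __name__ == "__main__":
    print(compare_dirs(sys.argv[1], sys.argv[2]))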

How can I interact with ClearCase from Perl?

My project needs a couple of things to be extracted from ClearCase data using a Perl script into an Excel sheet; those are -
Given two particular timelines or two baselines:
all the activities associated with that baseline (column header "Activity")
Owner's id (column header "Owner")
all the elements associated with a particular activity (column header "Element details")
for each element, the versions associated (column header "Versions")
for each element, the total number of lines of code, total lines of code added, total lines of code deleted and total lines of code changed (column headers "No. of lines of code", "Lines of code added", "Lines of code deleted" & "Lines of code changed")
Please kindly help me on this...
Basically, ClearCase Perl scripting is based on parsed outputs of system and cleartool commands.
The scripts are based on a cleartool command-runner package like CCCmd, and are used like this:
use strict;
use Config;
require "path/to/CCCmd.pm";

sub Main
{
    my $hostname = CCCmd::RunCmd('hostname');
    chomp $hostname;
    my $lsview = CCCmd::ClearToolNoError("lsview -l -pro -host $hostname");
    return 1;
}

Main() || exit(1);
exit(0);
for instance.
So once you have the basic Perl structure, all you need is the right cleartool commands to analyze, based on fmt_ccase directives.
1/ All the activities associated with that baseline (column header "Activity"):
ct descr -fmt "%[activities]CXp" baseline:aBaseline.xyz@\ideapvob
That will give you the list of activities (separated by ',').
For each activity:
2/ Owner's id (column header "Owner"):
ct descr -fmt "%u" activity:anActivityName@\ideapvob
3/ All the elements associated with a particular activity (column header "Element details"):
Not sure: activities can list their versions (see 4/), not easily their elements.
4/ For each element, the versions associated (column header "Versions"):
For a given activity:
ct descr -fmt "%[versions]CQp\n" activity:anActivityName@\ideapvob
5/ For each element, the total number of lines of code, lines of code added, lines of code deleted and lines of code changed (column headers "No. of lines of code", "Lines of code added", "Lines of code deleted" & "Lines of code changed"):
That can be fairly long, but for each version, you can compute the extended path of the previous version and make a diff.
I would advise using for all that a dynamic view, since you can access any version of a file from there (as opposed to a snapshot view).
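To tie steps 1/ and 2/ together in a script, here is a small sketch (in Python, calling cleartool through subprocess, purely as an illustration; the baseline and PVOB names are placeholders, and splitting the activity list on ',' is an assumption based on the comma-separated output described above):
import subprocess

def cleartool(*args):
    """Run a cleartool command and return its output as text."""
    return subprocess.check_output(("cleartool",) + args).decode().strip()

# placeholder selector - replace with your own baseline and PVOB
baseline = r"baseline:aBaseline.xyz@\ideapvob"
activities = cleartool("descr", "-fmt", "%[activities]CXp", baseline)

for act in activities.split(","):
    act = act.strip()
    owner = cleartool("descr", "-fmt", "%u", act)                 # 2/ owner's id
    versions = cleartool("descr", "-fmt", r"%[versions]CQp\n", act)  # 4/ versions
    print(act, owner, versions)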
Also, if you need to use Perl with ClearCase, have a look at the CPAN module ClearCase::CtCmd. I would recommend using this Perl module for invoking ClearCase commands.
For the CCCmd package, I had to remove the double-quotes in the RunCmd and RunCmdNoError subs to get it to work.