How to read a file from the disk if less than X days old, if older, refetch the html file - perl

I wish to read an html file off of the internet and cache it. Then when I go back, because I'm debugging, I don't want to hammer the servers with the numerous requests I'll need. I don't want to get my IP banned for slamming the server over and over again just because I'm debugging. So my code needs to look something like:
if ((file > days_old) || !(file exists))
fetch html file from internet
save file to disk
else
read it from the disk
Because there will be multiple files, I'll need to include a variable name in the file name so the file is unique and I can easily look it up again.
I just learned Perl this semester and we only learned the basics & a bit of regex, once I get this I should be mostly fine.
Thanks!

Use an existing module:
Cache::Cache
HTTP::Cache::Transparent
If you really want to implement your own, you'll want to look at the If-Modified-Since and ETag HTTP headers to determine when to re-fetch a file, rather than an arbitrary days_old number you suck out of your thumb. You will also have to generate a unique filename, preferably with a hash function, while retaining the original URL to cater for hash collisions.

Related

How to replace value in txt file with powershell from GitHub

I want to build a simple script that may be useful for others as well, but I have only very basic programming knowledge and can't do it myself without learning how to write powershell scripts from scratch.
What this script is supposed to do is, open an INI file (really just a txt), look for a variable with an assigned value and replace that value from a txt hosted on GitHub, save and then run a program.
This is for the tracker list of qBittorrent, since that feature still hasn't been implemented and the only other script that I could find that does this is for linux and mac, there seem to be none for windows.
The basic idea is this:
get-content "c:\users\[user]\appdata\roaming\qbittorrent\qbittorrent.ini"
# This is where pseudo code starts
get file from "[github-link.txt]"
save file to cache # keeping it is useless as it gets updated daily
find variable "Session\AdditionalTrackers=" in qbittorrent.ini
replace value of variable with content of cached file # this is what I struggle with most when looking for example code. Everything I could find specified the exact string that needed replacing, which in this case is quite long and may change with every update of the file.
overwrite original file
launch program qbittorrent.exe
end script
Conveniently or most likely deliberately all (most) of the tracker lists on GitHub are already formatted in a way that they can be directly pasted into the file without having to worry about formatting. Example.
I can totally understand if nobody wants to do the work, but I would greatly appreciate it and possibly others that are looking for a stopgap for the lacking feature.
If this already exists, go ahead and call me an idiot and while you're at it drop a link ;)
I just found a little tool called Power Automate and it pretty much does what I was looking for. It's not quite as elegant as a single click script but it does the job. Sadly I can't share the "flow" I built because, well, there is no option for it - thanks Microsoft. So, I'll try my best to write it out.
Not quite a "solution" but pretty to close to it.
Here is the "flow":
get file from web // from github for example
read text from file // read downloaded .txt file
read text from file // read qBittorrent.ini
crop text // crop between flags in qBittorrent.ini use "Session\AdditionalTrackers=" as start and "Session\GlobalMaxRatio=" as end and save to cropVar2
crop text // crop before flag use "Session\AdditionalTrackers=" as flag and save to cropVar1
crop text // crop after flag use cropVar2 as flag and save to cropVar3
replace text // replace cropVar2 with content of downloaded file and save to cropVar2
write text to file // write cropVar1,cropVar2,cropVar3
end flow
Keep in mind that any changes to the qBittorrent.ini may change the order of the entries. Which means you have to check if it's still correct after every update and after every change you make in the options. This is a massive cludge after all...
You can input fail saves so that you won't break anything if the order changed.

System.IO - Does BinaryReader/Writer read/write exactly what a file contains? (abstract concept)

I'm relatively new to C# and am attempting to adapt a text encryption algorithm I designed in wxMaxima into a Binary encryption program in C# using Visual Studio forms. Because I am new to reading/writing binary files, I am lacking in knowledge regarding what happens when I try to read or write to a filestream.
For example, instead of encrypting a text file as I've done in the past, say I want to encrypt an executable or any other form of binary file.
Here are a few questions I don't understand:
When I open a file stream and use binaryreader will it read in an absolute duplicate of absolutely everything in the file? I want to be able to, for example, read in an entire file, delete the original file, then create a new file with the old name and write the entire binary stream back. Will this reproduce the original file exactly or will there be some sort of corruption that must otherwise be accounted for?
Because it's an encryption program, I was hoping to add in a feature that would low-level "format" the original file before deleting it so it would be theoretically inaccessible by combing the physical data of a harddisk. If I use binarywriter to overwrite parts of the original file with gibberish will it be put on the same spot on the harddisk or will the file become fragmented and actually just redirect via the FAT to some other portion of the harddisk? Obviously there's no point in overwriting the original file with gibberish if it's not over-writing the original cluster on the harddisk.
For your first question: A BinaryReader is not what you want. The name is a bit misleading: it "Reads primitive data types as binary values in a specific encoding." You probably want a FileStream.
Regarding the second question: That will not be easy: please see the "How SDelete Works" section of SDelete for an explanation. Brief extract in case that link breaks in the future:
"Securely deleting a file that has no special attributes is relatively straight-forward: the secure delete program simply overwrites the file with the secure delete pattern. What is more tricky is securely deleting Windows NT/2K compressed, encrypted and sparse files, and securely cleansing disk free spaces.
Compressed, encrypted and sparse are managed by NTFS in 16-cluster blocks. If a program writes to an existing portion of such a file NTFS allocates new space on the disk to store the new data and after the new data has been written, deallocates the clusters previously occupied by the file."

How to replace a file inside a zip on iOS?

I need to replace a file on a zip using iOS. I tried many libraries with no results. The only one that kind of did the trick was zipzap (https://github.com/pixelglow/zipzap) but this one is no good for me, because what really do is re-zip the file again with the change and besides of this process be to slow for me, also do something that loads the whole file on memory and make my application crash.
PS: If this is not possible or way to complicated, I can settle for rename or delete an specific file.
You need to find a framework where you can modify how data is read and written. You would then use some form of mmap to essentially read and write small chunks. Searching on NSData and mmap resulted in this Post, however you can use mmap from the posix level too. Ps it will be slower than using pure memory no way around that.
Got it WORKING!! JXZip (https://github.com/JanX2/JXZip) has made exactly what I need, they link to libzip (http://www.nih.at/libzip/) that is a fully equiped library for working with ZIP files and JXZip have all the necessary Objective-C wrapper code. Thanks for all the replys.
For archive purposes, as the author of zipzap:
Actually zipzap does exactly what you want. If you replace an entry within a zip file, zipzap will do the minimum necessary to update it: it will skip writing all entries before the replaced entry, then write out the entry, then write out all entries after the replaced entry without recompressing. At the moment, it does require sufficient memory for the entries after the replaced entry though.

Configuration Key Value Store

I'm in the planning stages of a script/app that I'm going to need to write soon. In short, I'm going to have a configuration file that stores multiple key value pairs for a system configuration. Various applications will talk to this file including python/shell/rc scripts.
One example case would be that when the system boots, it pulls the static IP to assign to itself from that file. This means it would be nice to quickly grab a key/value from this file in a shell/rc script (ifconfig `evalconffile main_interface` `evalconffile primary_ip` up), where evalconffile is the script that fetches the value when provided with a key.
I'm looking for suggestions on the best way to approach this. I've tossed around the idea of using a plain text file and perl to retrieve the value. I've also tossed around the idea of using YAML for the configuration file since there may end up being a use case where we need multiple values for a key and general expansion. I know YAML would make it accessible from python and perl, but I'm not sure what the best way to access it from a quickly access it from a shell/rc script would be.
Am I headed in the right direction?
One approach would be to simply do the YAML as you wanted, and then when a shell/RC wants a key/value pair, they would call a small Perl script (the evalconffile in your example) that would parse YAML on the shell script's behalf and print out the value(s)
SQLite will give you greatest flexibility, since you don't seem to know the scope of what will be stored in there. It appears there's support for it in all scripting languages you mentioned.

How can I limit file types in CGI file uploads in Perl?

I am using CGI to allow the user to upload some files. I just want the just to be able to upload .txt or .csv files. If the user uploads file with any other format then I want to be able to put out an error message.
I saw that this can be done by javascript: http://www.codestore.net/store.nsf/unid/DOMM-4Q8H9E
But is there a better way to achieve this? Is there is some functionality in Perl that allows this?
The disclaimer on the site to you link to is important:
Note: This is not entirely foolproof as people can easily change the extension of a file before uploading it, or do some other trickery, as in the case of the "LoveBug" virus.
If you really want to do this right, let the user upload the file, and
then use something like File::MimeInfo::Magic (or file(1), the
UNIX utility) to guess the actual file type. If you don't like the
file type, delete the file and give the user an error message.
I just want the just to be able to upload .txt or .csv files.
Sounds easy, doesn't it? It's not. And then some.
The simple approach is just to test that the file ends in ‘.txt’ or ‘.csv’ before storing it on the filesystem. This should be part of a much more in-depth validation of what the filename is allowed to contain before you let a user-submitted filename anywhere near the filesystem.
Because the rules about what can go in a filename are complex on some platforms (especially Windows) it's usually best to create your own filename independently with a known-good name and extension.
In any case there is no guarantee that the browser will send you a file with a usable name at all, and even if it does there is no guarantee that name will have ‘.txt’ or ‘.csv’ at the end, even if it is a text or CSV file. (Some platforms simply do not use extensions for file typing.)
Whilst you can try to sniff the contents of the file to see what type it might be, this is highly unreliable. For example:
<html>,<body>,</body>,</html>
could be plain text, CSV, HTML, XML, or a variety of other formats. Better to give the user an explicit control to say what file type they're uploading (or use one file upload field per type).
Now here's where it gets really nasty. Say you've accepted the upload and stored it as /data/mygoodfilename.txt, and the web server is correctly serving it as the Content-Type ‘text/plain’. What do you think the browser interprets it as? Plain text? You should be so lucky.
The problem is that browsers (primarily IE) don't trust your Content-Type header, and instead sniff the contents of the file to see if it looks like something else. Serve the above snippet as plain text, and IE will happily treat it as HTML. This can be a huge problem, because HTML can include client-side scripts that will take over the user's access to the site (a cross-site-scripting attack).
At this point you might be tempted to sniff the file on the server-side, for example using the ‘file’ command, to check it doesn't contain ‘<html>’. But this is doomed to failure. The ‘file’ command does not sniff for all the same HTML tags as IE does, and other browsers sniff differently anyway. It's quite easy to prepare a file that ‘file’ will claim is not HTML, but that IE will nevertheless treat as if it is (with security-disaster implications).
Content-sniffing approaches such as ‘file’ will give you only a false sense of security. This is a convenience tool for loose guessing of filetypes and not an effective security measure.
At this point your last desperate possibilities are things like:
serving all user-uploaded files from a separate hostname, so that a script injection attack can't purloin the credentials of your main site;
serving all user-uploaded files through a CGI wrapper, adding the header ‘Content-Disposition: attachment’ so that browsers won't attempt to display them directly;
only accepting uploads from trusted users.
On unix the easiest way is to do an JRockway suggested. If not on unix then your options are limited. You can examine the file extension and you can examine the contents to verify. I'm assuming for you specific case that you only want "* seperated value" text files. So one of the Text::CSV::* modules may be useful in verifying the file is the type you asked for.
Security for this operation is a whole other ball of wax.
try this:
$file_name = "file.txt";
$file_cmd = "file \"$file_name"\";
$file_type = `$file_cmd`;
return 0 unless($file_type =~ /(ASCII|text)/i)