Why doesn't the restored file contain its previous symbols? - unicode

I restored a text file (file.txt) after accidentally deleting it, using MiniTool Power Data Recovery 8.0.
Although the recovered file is the same size (584 KB), its contents look completely different (unreadable garbage instead of text).
I understand that my file was broken by the recovery. But why doesn't the file contain any of the previous symbols (Latin and Cyrillic characters)?

The software could not fulfill your expectations. What was recovered is not your file but an unrelated piece/sector of the mass storage medium; it contains random binary junk.
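If you want to check this yourself, one quick heuristic (a sketch of mine, not part of the answer above) is to measure what fraction of the recovered file is printable text; genuine text scores close to 1.0, while a random sector scores far lower:

    using System;
    using System.IO;
    using System.Linq;

    // Heuristic check: what fraction of the file is printable ASCII or
    // whitespace? (Cyrillic text in a multi-byte encoding scores lower
    // than pure ASCII, but random sector junk scores lower still.)
    class JunkCheck
    {
        static void Main()
        {
            byte[] data = File.ReadAllBytes("file.txt"); // the recovered file
            int printable = data.Count(b =>
                (b >= 0x20 && b < 0x7F) || b == '\n' || b == '\r' || b == '\t');
            Console.WriteLine($"printable ratio: {(double)printable / data.Length:F2}");
        }
    }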

Related

2 files with same hash, but 1 is corrupted and 1 isn't

I found something very weird on a project.
I have 2 files:
One is the input file, a .bip file which you can open with GIS software like QGIS; this file is provided by the CCSDS and accessible here.
The other is the output after being compressed and decompressed by a lossless compression algorithm (CCSDS 123, by ESA).
These 2 files share the exact same SHA-256 and SHA-1 hashes, so they should be identical:
3226009de97d66589fc58cdc9af377e6315ccc69a7095bec8dc04447bf3cea2e test_ptn_x100y36z17_16u.bip
3226009de97d66589fc58cdc9af377e6315ccc69a7095bec8dc04447bf3cea2e test_ptn_decomp.bip (SHA-256 shown here).
The thing is, while QGIS opens the first file fine, it refuses to open the second one and displays this message (translated): "The file test_ptn_decomp.bip is not a recognized or valid data source."
Is there something I don't understand about hashes? I've tried moving the files to other directories and renaming them, but nothing changes on the QGIS side.
It is highly unlikely that you got different content with the same SHA-256 hash by chance, so I'll assume the files are identical. In any case, it is easy to use any diff program to compare them (see the sketch below the list).
So there must be some other difference; things that come to mind:
The file name might contain some meaningful information needed by QGIS. Try renaming the decompressed file to e.g. decomp_ptn_x100y36z17_16u.bip; maybe the x100y36z17 part is essential.
There may be additional files that must have matching names. Do you have a .hdr file, as explained in the QGIS tutorials?
https://www.qgistutorials.com/en/docs/open_bil_bip_bsq_files.html
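For the byte-level comparison mentioned above, here is a minimal C# sketch (file names taken from the question); if both checks pass, the files really are identical and the problem lies elsewhere, such as the file name or a missing .hdr sidecar:

    using System;
    using System.IO;
    using System.Linq;
    using System.Security.Cryptography;

    // Compare two files by SHA-256 hash and then byte-by-byte.
    class CompareFiles
    {
        static void Main()
        {
            byte[] a = File.ReadAllBytes("test_ptn_x100y36z17_16u.bip");
            byte[] b = File.ReadAllBytes("test_ptn_decomp.bip");

            using var sha = SHA256.Create();
            Console.WriteLine(Convert.ToHexString(sha.ComputeHash(a)));
            Console.WriteLine(Convert.ToHexString(sha.ComputeHash(b)));

            Console.WriteLine(a.SequenceEqual(b)
                ? "files are byte-identical"
                : "files differ");
        }
    }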

HyperCard data file types and start patterns needed

I am trying to recover deleted HyperCard data files.
I need to know both the file extensions and the "start patterns" in the files, so that we can recover the deleted HyperCard data files.
By "data files", do you mean "stack" document files?
File extensions were not relevant to HyperCard "stacks" on Macintosh systems before OS X. The 4-character file type identifier STAK was used by the Mac OS to associate HyperCard files ("stacks") with the HyperCard application, consistent with how file-to-application association worked at the time; the filename, including any extension, had no impact on that association.
Accordingly, the first characters in a HyperCard stack are a run of Mac-specific bytes that includes the substring "STAK" (the rest of those bytes won't paste here).
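That substring can serve as the "start pattern" you asked about. Here is a minimal C# sketch of carving a raw disk image for it; the image name and the idea that "STAK" sits near the start of each stack are my assumptions, so treat every hit as a candidate to inspect by hand:

    using System;
    using System.IO;
    using System.Text;

    // Scan a raw disk image for the ASCII byte pattern "STAK" and
    // report each offset, as a starting point for carving stack files.
    class StakScan
    {
        static void Main()
        {
            byte[] image = File.ReadAllBytes("disk.img"); // placeholder name
            byte[] pattern = Encoding.ASCII.GetBytes("STAK");

            for (int i = 0; i + pattern.Length <= image.Length; i++)
            {
                bool hit = true;
                for (int j = 0; j < pattern.Length; j++)
                    if (image[i + j] != pattern[j]) { hit = false; break; }
                if (hit)
                    Console.WriteLine($"'STAK' found at offset 0x{i:X}");
            }
        }
    }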
Have you considered how you will recover both the data fork and the resource fork? This may involve different challenges depending on the filesystem you are using and on how the original files have been stored until now, because modern filesystems do not store data forks and resource forks in the way that HyperCard "stack" files (and other documents from that era) relied on.
Are you running an old OS in an emulator for this recovery task? That may make it easier.
In case it helps, you can also often extract the "stack script" by opening a HyperCard "stack" file in the text editor of your choice.

Can SAP detect encoding and line endings?

How to read ASCII files with mixed line endings (Windows and Unix) and UTF-16 Big Endian files in SAP?
Background: our ABAP application must read some of our configuration files. Most of them are ASCII files (normal text files) and one is UTF-16 Big Endian. So far, the files were read in ASCII mode and things were fine during our tests.
However, the following happened at customers: the configuration files are located on a Linux system, so they have Unix line endings. People fetch the configuration files via FTP or similar and transfer them to a Windows machine. On the Windows machine, they adapt some of the settings. Depending on the editor, our customers end up with mixed line endings.
Those mixed line endings cause trouble when reading the file in ASCII mode in ABAP: the file is read up to the point where the line endings change, plus a bit more, but not the whole file.
I suggested reading the file in BINARY mode, removing all the CRs, then replacing all the remaining LFs by CR LF. That worked fine, except for the UTF-16 BE file, for which this approach produces a mess. So the whole change was reverted.
I'm not an ABAP developer; I just have to test this. With my background in other programming languages, I must assume there is a solution, and I am inclined to decline a "CAN'T FIX" resolution of this bug.
You can use CL_ABAP_FILE_UTILITIES=>CHECK_FOR_BOM to determine which encoding the file has, and then use the constants of class CL_ABAP_CHAR_UTILITIES to process further.
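The ABAP classes aside, the underlying technique is language-agnostic: sniff the byte-order mark first, and only do byte-level CR/LF surgery on single-byte text. A C# sketch of that idea (the file name is a placeholder; in UTF-16 BE every code unit is two bytes, so deleting or inserting single bytes corrupts it):

    using System.IO;
    using System.Text;

    static class LineEndingFix
    {
        static void Normalize(string path)
        {
            byte[] raw = File.ReadAllBytes(path);

            // UTF-16 BE files start with the BOM 0xFE 0xFF: leave them
            // alone, or decode them properly before touching line endings.
            if (raw.Length >= 2 && raw[0] == 0xFE && raw[1] == 0xFF)
                return;

            // Plain ASCII: drop every CR, then turn every LF into CR LF.
            // This makes mixed Unix/Windows endings uniformly Windows.
            string text = Encoding.ASCII.GetString(raw);
            text = text.Replace("\r", "").Replace("\n", "\r\n");
            File.WriteAllText(path, text, Encoding.ASCII);
        }
    }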

System.IO - Does BinaryReader/Writer read/write exactly what a file contains? (abstract concept)

I'm relatively new to C# and am attempting to adapt a text encryption algorithm I designed in wxMaxima into a binary encryption program in C#, using Windows Forms in Visual Studio. Because I am new to reading/writing binary files, I lack knowledge about what happens when I read from or write to a file stream.
For example, instead of encrypting a text file as I've done in the past, say I want to encrypt an executable or any other form of binary file.
Here are a few things I don't understand:
When I open a file stream and use BinaryReader, will it read in an absolute duplicate of absolutely everything in the file? I want to be able to, for example, read in an entire file, delete the original file, then create a new file with the old name and write the entire binary stream back. Will this reproduce the original file exactly, or will there be some sort of corruption that must otherwise be accounted for?
Because it's an encryption program, I was hoping to add a feature that would low-level "format" the original file before deleting it, so that it would be theoretically inaccessible even by combing the physical data on a hard disk. If I use BinaryWriter to overwrite parts of the original file with gibberish, will the gibberish be put in the same spot on the hard disk, or will the file become fragmented and actually just redirect via the FAT to some other portion of the disk? Obviously there's no point in overwriting the original file with gibberish if it doesn't overwrite the original clusters on the hard disk.
For your first question: A BinaryReader is not what you want. The name is a bit misleading: it "Reads primitive data types as binary values in a specific encoding." You probably want a FileStream.
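A minimal sketch of the round trip asked about, using File.ReadAllBytes / File.WriteAllBytes (thin convenience wrappers over a FileStream); the file name is a placeholder. The bytes come back exactly as stored, so the recreated file is byte-identical:

    using System.IO;

    class RoundTrip
    {
        static void Main()
        {
            // Read every byte of the file into memory: an exact copy.
            byte[] data = File.ReadAllBytes("input.bin");

            File.Delete("input.bin");

            // Recreate the file from the buffer: byte-identical content.
            File.WriteAllBytes("input.bin", data);
        }
    }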
Regarding the second question: That will not be easy: please see the "How SDelete Works" section of SDelete for an explanation. Brief extract in case that link breaks in the future:
"Securely deleting a file that has no special attributes is relatively straight-forward: the secure delete program simply overwrites the file with the secure delete pattern. What is more tricky is securely deleting Windows NT/2K compressed, encrypted and sparse files, and securely cleansing disk free spaces.
Compressed, encrypted and sparse are managed by NTFS in 16-cluster blocks. If a program writes to an existing portion of such a file NTFS allocates new space on the disk to store the new data and after the new data has been written, deallocates the clusters previously occupied by the file."
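For illustration only, here is a naive in-place overwrite in C# (my sketch, not SDelete's method): on an ordinary NTFS file this typically rewrites the same clusters, but for the compressed, encrypted, and sparse cases described in the excerpt, NTFS may allocate new clusters and leave the old data behind.

    using System.IO;
    using System.Security.Cryptography;

    class Shred
    {
        static void Overwrite(string path)
        {
            var junk = new byte[new FileInfo(path).Length];
            RandomNumberGenerator.Fill(junk);   // cryptographically random bytes

            // Open without truncating and overwrite from offset 0.
            using var fs = new FileStream(path, FileMode.Open, FileAccess.Write);
            fs.Write(junk, 0, junk.Length);
            fs.Flush(true);                     // push through OS buffers to disk
        }
    }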

Adding an RCS Header to Binary files

I am using RCS source control and need to check in binary files (a GIF image and a JAR file). How do I add a $Header$ keyword so that the version information is substituted into the file during check-in and revealed when I issue the "ident" command?
For text files like Java, XML, etc., we usually add the RCS header comments and public strings, but I have no idea how to do this for binary files.
Basically, you don't.
Binary file formats don't typically have a way to hold a variable-length chunk of arbitrary data. Even if there's a region of the file that can contain arbitrary data, the length of the expansion can vary from one checkout to another (e.g., if it goes from version 1.9 to 1.10), and that's likely to mess up the file.
For this to work, the binary format would have to tolerate a change in the size of the header string. For example, if the version number changes from 1.9 to 1.10, the RCS co command (which has no knowledge of the binary file format) will replace the string in-place, changing the offset of all data following the string. If the file format has a comment section, and that section's size is stored as a number, co isn't going to update that number.
Compiler-generated object and executable files often have RCS version information in them, but it's usually generated from the source file(s); objects and executables themselves typically aren't stored in a version control system.
Before the initial checkin of a binary file, you should run rcs -i -kb filename, so that the RCS co command doesn't attempt to do keyword replacement (just in case the file happens to accidentally contain something that looks like an RCS keyword).
If you have a binary file that you've checked out of an RCS system, and you want to know which version it is, you'll have to compare it to each of the versions in RCS. (My own get-versions might be useful for this.)
If you have a way of storing textual metadata in the file, you could also consider annotating your binary file with a timestamp. You can then correlate the timestamp with the revision by looking at the RCS log.
You mentioned Excel files. I just tried some experiments. The new .xlsx format is really a zip file; anything you put in the Comment section will be compressed, and not visible to ident. The older .xls format, at least for the small file I tried, does store the Comment section in readable text, so ident works -- but when I checked in a file, RCS expanded the Comment from "$Header:$" to "$Header: /home/kst/2012-12-06/RCS/foo.xls,v 1.1 2012-12-06 11:47:48-08 kst Exp kst $"; when I tried to open it with Excel, I got:
Excel found unreadable content in 'foo.xls'.
and it was unable to recover the contents.
In general you can't, but certain binary formats have an ASCII slot where you can place RCS headers.
For example ZIP files
% zip -z archive.zip
$Header$
And then, after CVS handling (CVS uses the same RCS keyword expansion):
% unzip -l archive.zip
$Header: /cygdrive/c/cvsroot/archive.zip,v 1.2 2020/10/14 13:46:06 omg Exp $
There are dozens of file extensions that are actually zip files where you can do this: odt, jar, ... but use this carefully and prefer short RCS keywords like $Revision$ or $Date$, because RCS doesn't know the slot size and may corrupt the file.