Compare Json Files in Beyond Compare - diff

How can I compare two minified JSON files in Beyond Compare? Is there a built-in file format for JSON? I'm looking to compare pretty-printed representations of the underlying JSON objects.

In this thread a representative says:
While not in the box yet, we do have a JSON sorted format available for download in our Additional File Formats section:
With a link to Scooter Software Downloads

You can achieve this specialized diff functionality by defining a new file format conversion rule in Beyond Compare. This example was carried out on Windows.
Step 0: Create a Python conversion script to render the formatted JSON. Save the following Python script somewhere on your hard drive (the command in Step 4 assumes C:\Source\jsonPrettyPrint.py):
import json
import sys

sourceFile = sys.argv[1]
targetFile = sys.argv[2]

with open(sourceFile, 'r') as file_r:
    # Load json data
    data = json.load(file_r)

# Write formatted json data
with open(targetFile, 'w') as file_w:
    json.dump(data, file_w, indent=4)
Step 1: In the Beyond Compare menu, navigate to: Tools-->File Formats...
Step 2: Create a new file format entry by clicking the + button and selecting Text Format.
Step 3: Enter *.json into the file format's Mask field, along with any description that will help you recall the file format's purpose.
Step 4: Define the file format's conversion settings. Select the Conversion tab and choose External program (unicode filenames) from the drop-down list.
In the Loading field, enter the following shell command:
python C:\Source\jsonPrettyPrint.py "%s" "%t"
Step 5: Press the Save button and, optionally, rename the file format by right-clicking it in the File Formats Name and Mask table.
The JSON output can be tailored further with other json.dump options described in the Python documentation, e.g. sort_keys=True.
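As a rough sketch of that idea, the conversion script could sort keys so that two files whose objects hold the same keys in a different order convert to identical text. The function name pretty_print_json and the UTF-8 encoding are assumptions made for illustration; they are not part of the original script.

```python
import json
import sys


def pretty_print_json(source_path, target_path):
    """Rewrite the JSON in source_path as sorted, indented text in target_path."""
    # Assumption: the input files are UTF-8 encoded.
    with open(source_path, 'r', encoding='utf-8') as file_r:
        data = json.load(file_r)
    with open(target_path, 'w', encoding='utf-8') as file_w:
        # sort_keys=True orders object keys alphabetically, so key order
        # alone never shows up as a difference in the comparison.
        json.dump(data, file_w, indent=4, sort_keys=True, ensure_ascii=False)


# Same command-line contract as the script above: source file, then target file.
if __name__ == '__main__' and len(sys.argv) == 3:
    pretty_print_json(sys.argv[1], sys.argv[2])
```

Whether sorting is desirable depends on the data: it hides reordering of object keys, which is usually noise, but array order is left untouched.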

doc or docx: Is there a safe way to identify the type from 'requests' in Python 3?

1) How can I differentiate doc and docx files from requests?
a) For instance, if I have
url='https://www.iadb.org/Document.cfm?id=36943997'
r = requests.get(url,timeout=15)
print(r.headers['content-type'])
I get this:
application/vnd.openxmlformats-officedocument.wordprocessingml.document
This file is a docx.
b) If I have
url='https://www.iadb.org/Document.cfm?id=36943972'
r = requests.get(url,timeout=15)
print(r.headers['content-type'])
I get this
application/msword
This file is a doc.
2) Are there other options?
3) If I save a docx file as doc, or vice versa, could that cause recognition problems (for instance, when converting to PDF)? Is there a best practice for dealing with this?
The MIME headers you get appear to be the correct ones; see "What is a correct mime type for docx, pptx etc?".
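A minimal sketch of acting on those two header values, assuming the server reports them accurately. The names WORD_MIME_TYPES and word_type_from_content_type are hypothetical helpers, not part of requests.

```python
# The two Content-Type values shown in the question, mapped to a short label.
WORD_MIME_TYPES = {
    'application/msword': 'doc',
    'application/vnd.openxmlformats-officedocument.wordprocessingml.document': 'docx',
}


def word_type_from_content_type(header_value):
    """Return 'doc', 'docx', or None for a Content-Type header value."""
    if not header_value:
        return None
    # Drop optional parameters such as "; charset=utf-8" and normalise case.
    mime = header_value.split(';', 1)[0].strip().lower()
    return WORD_MIME_TYPES.get(mime)
```

With the question's code this could be called as word_type_from_content_type(r.headers.get('content-type')); a result of None means the header was missing or named some other type.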
However, the sending software can only go on what file its user selected – and there still are a lot of people sending files with the wrong extension. Some software can handle this, others cannot. To see this in action, change the name of a PNG image to end with JPEG instead. I just did on my Mac and Preview still is able to open it. When I press ⌘+I in the Finder it says it is a JPEG file, but when opened in Preview it gets correctly identified as a "Portable Network Graphics" file. (Your OS may or may not be able to do this.)
But after the file is downloaded, you can tell a DOC file and a DOCX file apart from their contents, even if the author got the extension wrong.
A DOC file starts with a Microsoft OLE header, which is quite a complicated structure. A DOCX file, on the other hand, is a compound file format containing lots of smaller XML files, compressed together using standard ZIP compression. Therefore, this file type will always start with the two characters PK.
This check is compatible with Python 2.7 and 3.x (only one needs the decode):
import sys

if len(sys.argv) == 2:
    print ('testing file: '+sys.argv[1])
    with open(sys.argv[1], 'rb') as testMe:
        startBytes = testMe.read(2).decode('latin1')
        print (startBytes)
        if startBytes == 'PK':
            print ('This is a DOCX document')
        else:
            print ('This is a DOC document')
Technically it will confidently state "This is a DOC document" for anything that does not start with PK, and, conversely, it will say "This is a DOCX document" for any zipped file (or even a plain text file that happens to start with those two characters). So if you further process the file based on this decision, you may find out it's not a Microsoft Word document after all. But at least you will have tried with the proper decoder.
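The weakness described above can be narrowed somewhat. The following is a sketch, not a definitive detector: it checks the OLE signature explicitly instead of treating everything else as DOC, and it opens the ZIP archive to look for word/document.xml. The function name sniff_word_file is hypothetical, and relying on that member name is an assumption about how Word normally lays out a DOCX package.

```python
import zipfile

OLE_MAGIC = b'\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1'  # OLE compound file signature
ZIP_MAGIC = b'PK\x03\x04'  # local file header at the start of a ZIP archive


def sniff_word_file(path):
    """Return 'docx', 'doc', or None based on the file's content."""
    with open(path, 'rb') as handle:
        head = handle.read(8)
    if head.startswith(ZIP_MAGIC):
        try:
            with zipfile.ZipFile(path) as archive:
                # Assumption: a Word DOCX stores its main part under this name.
                if 'word/document.xml' in archive.namelist():
                    return 'docx'
        except zipfile.BadZipFile:
            pass  # Started like a ZIP but is not a readable archive.
        return None
    if head == OLE_MAGIC:
        # OLE containers also hold XLS, PPT, MSI and other formats,
        # so this result only means "possibly DOC".
        return 'doc'
    return None
```

A None result means the file is neither format as far as this check can tell, which is the case the two-byte test cannot report.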

How do I prompt the user to choose a file that will be loaded in matlab?

I want the user of a script I'm writing to be able to navigate to a file containing their data and to load the data into the workspace. For example, if a csv file contains two cells with the values 1 and 2 respectively, I want the user to simply choose this file and those two values will be assigned to a variable in the workspace.
I have looked at using:
filename = uigetfile('*.xlsx','*.csv')
But that just returns the name of the file. Perhaps I could construct a full path to where the file they choose is found, and then read it in that way (using xlsread or csvread) but I think I'm probably missing something here. It seems that there should be a more straightforward way of doing it.
I believe that you're looking for the uiopen() function. This function will:
Open dialog box for selecting files to load into workspace.
By default, this function displays a file explorer dialog box with the filter set to all MATLAB® files (with file extensions *.m, *.mlx, *.mat, *.fig, *.mdl, and *.slx).
However, you can import data from data files like CSV files and spreadsheets as well. Simply select the (All Files) option for the Files of Type field.
Once you've selected the data file you're interested in, you will be prompted with another GUI object that previews the data you are about to load into MATLAB's workspace. If you're satisfied with the format of the variables presented in the preview, simply hit the green check-mark at the right-side of the tool-box ribbon in the GUI object and, huzzah, all of the data file's contents have been loaded into separate variables (named according to their respective headers).
Alternatively, though this is undeniably a longer-winded and uglier approach, if you'd like to use the filename returned from uigetfile('*.xlsx', '*.csv'), you could use the importdata() function. This will output a struct that contains each of the variables from your data file as a separate field:
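% Each row of the cell array is {filter, description}
[filename, pathname] = uigetfile( ...
    {'*.csv', 'CSV file (*.csv)'; ...
     '*.xlsx', 'Excel Spreadsheet file (*.xlsx)'; ...
     '*.*', 'All Files (*.*)'}, 'Pick a File to Import');
% uigetfile returns 0 when the user cancels the dialog
if isequal(filename, 0)
    error('No file was selected.');
end
full_filename = fullfile(pathname, filename);
[A, delimiterOut] = importdata(full_filename);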

How do I export SAS text string to a Word document using DDE?

I would like to export a string character from SAS to a word document (.docx) using Dynamic Data Exchange (DDE). Is this possible?
The SAS documentation on this is old and suggests I use the following commands:
filename testit dde 'winword|"file_path"!bookmark' notab;
data _null_;
file testit;
put 'insertstuff';
run;
SAS returns an error message:
ERROR: Physical file does not exist
Works for me.
filename testit dde 'winword|"e:\blah.docx"!bookmark' notab;
data _null_;
file testit;
put 'insertstuff';
run;
Steps:
Create a Word document and save it in the specified path.
In the Word document, create a bookmark by going to Insert->Bookmark, give it the name 'Bookmark' and press Add.
Make sure both Word and SAS are open, and that the document is open in Word.
Run the SAS code.
Super late to the party, but there are a few issues that could cause that error:
You don't need "" around the filename; it should just be:
filename testit dde 'winword|file_path!bookmark' notab;
data _null_;
file testit;
put '[Insert "stuff"]';
run;
The file path may be spelled incorrectly
You may not have permissions to the filepath. This is likely if it is a work machine.
To check:
Navigate to the filepath in the file explorer
Right click on the file
Open properties
Look for your username, click on it, and it will show you what permission you have
You have a missing or incorrect file extension (e.g. .doc instead of .docx, etc.)
Hope you were able to figure this out at the time :P

How to open ASCII file in WEKA software

I have converted a .tiff file into ASCII format with the help of ArcGIS, and now I want to open that file in WEKA. WEKA asks for a file in .arff format, but I have no idea how to convert the ASCII file, which is a .TXT file, into that format.
It's difficult to see the issue without some sample data or error message, but it appears that the file can't be read into Weka in its current state.
You could try formatting the dataset to comply with the Attribute-Relation File Format.
Failing this, you could format the dataset as a comma-delimited (CSV) file with header information on the first row and the data underneath. Weka accepts CSV files without trouble.
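To make the Attribute-Relation File Format concrete, here is a small sketch that builds ARFF text for purely numeric data. The function name to_arff and the example attribute names are hypothetical, and the sketch assumes names contain no spaces or special characters, since those would need quoting. Reading the values out of the ArcGIS ASCII file is left out because its layout is not shown in the question.

```python
def to_arff(relation, attribute_names, rows):
    """Build ARFF text for a table in which every column is numeric."""
    lines = ['@relation ' + relation, '']
    # One @attribute declaration per column, all declared numeric.
    for name in attribute_names:
        lines.append('@attribute ' + name + ' numeric')
    lines.append('')
    lines.append('@data')
    # One comma-separated line per instance.
    for row in rows:
        if len(row) != len(attribute_names):
            raise ValueError('row length does not match attribute count')
        lines.append(','.join(str(value) for value in row))
    return '\n'.join(lines) + '\n'


if __name__ == '__main__':
    print(to_arff('pixels', ['band1', 'band2'], [[1, 2], [3, 4]]))
```

The returned string can be written to a file with an .arff extension and opened from Weka's Explorer.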
Hope this Helps!
Considering that you are working with satellite imagery and that you know R, you could try something like this:
library(raster)
library(foreign)
library(RWeka)
dir.satellite <- '../tiffs' # Folder with your satellite TIF files
# Read them from their full paths
bands <- list.files(file.path(dir.satellite), full.names = TRUE,
                    pattern = '\\.TIF$')
stkTIF <- raster::stack(bands) # group them into a rasterStack object
# Write the WEKA arff file
write.arff(as.matrix(stkTIF),
           file = file.path(dir.satellite, 'your_file_name.arff'))

How can I prevent MATLAB from automatically modifying .dat file variable names upon import using the dataset function?

So, I currently have a MATLAB script that does stuff with data and then, using a template .dat file, creates about 20 more .dat files with only a single column being changed (I've been using the dataset and export functions to read and write the files, respectively). The program that will use the .dat files, ExperimentBuilder, requires that the headers have names that start with dollar signs (for example: $image). However, when I use the dataset function in MATLAB to import the template file, I get this warning:
Warning: Variable names were modified to make them valid MATLAB identifiers.
It then replaces all the dollar signs in the variable names with x_ (for example, x_image), which would be fine if it would let me change them back to the $ format. But whenever I try to do so using set, it just gives me this warning again and reverts the names to x_, which is unreadable by ExperimentBuilder.
I know I could just do a quick copy and paste on each file with the original headings, but I would like to know if there's a way to fix this problem in the actual code.
Thanks!
The thing is, the MATLAB dataset class uses the header names to provide access to the columns by name. This is why the header names must be valid identifiers (per isvarname(), a name must start with a letter and contain only letters, digits, and underscores, [a-zA-Z0-9_]).
The easiest solution would be to write the header line yourself (including names starting with $), while separately exporting the data without the headers:
export(ds, ..., 'WriteVarNames',false)
(Note that dataset.export overwrites files by default, so you'll have to export first, then prepend the header line at the beginning of the file. Or, if you're comfortable modifying MATLAB's own functions, edit dataset.export and change the fopen mode from overwrite 'wt' to append 'at'.)
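The "export first, then prepend" step can also be done outside MATLAB. The following is a sketch in Python rather than MATLAB, under the assumption that the exported file is small enough to read into memory; the function name prepend_header and the example header are hypothetical.

```python
def prepend_header(path, header_line, encoding='utf-8'):
    """Insert header_line as the first line of an existing text file."""
    # newline='' keeps the file's existing line endings untouched.
    with open(path, 'r', encoding=encoding, newline='') as handle:
        body = handle.read()
    with open(path, 'w', encoding=encoding, newline='') as handle:
        # Strip any trailing line break from the header, then add exactly one.
        handle.write(header_line.rstrip('\r\n') + '\n' + body)
```

For a tab-delimited file exported without variable names, a call such as prepend_header('trial01.dat', '$image\t$duration') would restore the dollar-sign headers that ExperimentBuilder expects; both the file name and the second column name are made up for this example.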