Converting GB2312 Encoded MIME to Readable String - powershell

I apologize in advance for sounding ignorant when I ask this question, but I'm not very good at conceptualizing the concept of encoding and decoding data.
As an example, I have access to a MIME encoded text with the following value:
=?GBK?B?xqw=?=
I know that (or pretty sure I know that) it's encoded using GB2312. Translated using online decoders tells me it's the word "sheet" in English. Is there a way to decode this into even its source language characters that I can even put into a 3rd party translator to read it in English from PowerShell? I feel like an idiot for asking this question because I'm not even sure I'm asking it in an intelligent way due to my general lack of understanding of all the core pieces involved.
I've tried looking through the Encoding class, but it doesn't have anything from what I can tell that will support this type. Are there any other modules or something available that I'm not aware of that can facilitate this?
Thank you for any assistance someone may provide and appreciate being schooled on this topic.

Easiest way is by using either System.Web.HttpUtility.UrlDecode or System.Net.Mail.Attachment class in Microsoft framework. With the later you can do:
$unicodeString = "=?GBK?B?xqw=?="
$attachment = [System.Net.Mail.Attachment]::CreateAttachmentFromString("", $unicodeString)
Write-Host $attachment.Name
It prints 片 that is "sheet" in Chinese, according to Google Translate.

Related

What is multicodec and how it is related to multihash?

I don't have any background with this subject.
To try to understand them better, I read:
Multihash
CIDv1: Multicodec prefix
From what I understand, the multihash is the algorithm used to hash (one way) the value. so it means, we can't go back (we can't decode the hash to the value).
Questions
I don't understand, in simple words, what is multicodec and if it's related to decoding the hash to a value (which makes no sense).
what is the motivation to multicodec prefix?
The multicodec is related to decoding the value the hash points to, if that makes it easier to understand. Don't worry, no magic hash decoding is happening ;). Remember we're making CIDs, and we can use CIDs to lookup content. However then we have the question of "how do we decode this data we just retrieved?", the multicodec solves that problem for us. Reading From Data to Data Structures might help clear up some confusion.
The multicodec prefix allows IPFS to evolve to support new and different encodings for the data that's actually put into IPFS. This refers to IPLD, and you can actually find the answer you're looking for under Links (with information about the codecs under Codecs):
For links we use a CID. A CID is an extension of multihash, in fact a multihash is part of a CID. We simply add a codec to a multihash that tells us what format the data is in (JSON, CBOR, Bitcoin, Ethereum, etc). This way, we can actually link between data in different formats and any link to data anyone ever gives us can be decoded so that it can become more than just a series of bytes.
CID is a standard that anyone can implement, even people that have no other interest in IPLD beyond the need for hash links to different data types can use it.

Reading & writing text in Scala, getting the encoding right?

I'm reading and writings some text files in Scala. As a complete beginner in the language, I wanted to make sure to find the right way to do it, e.g. get the encoding right.
So most of the stuff I found (also on SO ) recommends I use io.Source.fromFile.However, after trying it out like so, reading a UTF-8 file:
val user_list = Source.fromFile("usernames.txt").getLines.toList
val user_list = Source.fromFile("usernames.txt", enc="UTF8").getLines.toList
I looked at the docs but was left with some questions.
Get the encoding right:
the docs show that I can set an encoding in Source.fromFile as I tried above. Looking at the man on Codec and the types listed there, I was wondering if those are all my codec options - is there e.g. no Utf-16, Big-Endian vs Little-Endian, etc.?
I am slightly obsessed with this since it used to trip me up in Python a lot. Is this less of concern with Scala for some reason?
Get the reading in right:
All the examples I looked at used the getLines method and postprocessed it with MkString or List, etc. Is there any advantage to that over just reading in the entire file (my files are small) in one go?
Get the writing out right:
Every source I could find tells me that Scala has no file writing function and to use the Java FileWriter. I was surprised by this - is this still accurate?
Looking at it I feel the question might be a little broad for SO, so I'd be happy to take it back if it does not meet the requirements. At this point, I'm not struggling with specific examples but rather trying to set things up in a way I don't get in trouble later.
Thanks!
Scala only has a basic IO api in the standard library. For the most part you just use the java apis. The fact that a decent api from java exists is probably why the Scala team is not prioritizing having a robust and fully featured IO api.
There are also third party scala libraries you could use as well however. Better Files I've never used but heard good things about as a Scala file api. As well as fs2 which provides functional, streaming IO. I'm sure there are others out there as well.
For encoding, there are many possible encoding available. It's just that only a couple of the most common ones are available as static fields, the rest you typically access through Codec("Encoding Name"). Most apis will also let you just enter a String directly instead of needing to get a Codec instance first. The codec is really just a wrapper over java.nio.charset.Charset. You can run java.nio.charset.Charset.availableCharsets() to see all of the encodings available on your system.
As far as reading, if the files are small you can load them fully into memory if you prefer that. The only reason not to do so is if you want to avoid the extra memory use of loading the entire file at once if reading through line by line is enough. You may want to use Vector instead of List for efficiency reasons (Vector is better in many cases and should probably be preferred as a default collection, but tradition and old habits die hard and most people/guides seem to default to List, but this is a whole other topic)

How to shift bytes of an NSString?

I have a NSString like #"123456". I want to convert this string into byte array and then I want to shift some bytes using some arithmetic operations. Then I want to apply SHA256Hash on that and finally want to encrypt a string using the final result. I have tried many approaches but still got no success. I am very confused in this.If someone wants to look at code i'll post the code.
Edit:
My actual goal is to encrypt an string using AES256 encryption algorithm. And I want to generate my own key and I want to pass my own IV.
I assume you're trying achieve some kind of security. On the other hand it does not look like you're very familiar with the tools and methods you're using. This is a bad start.
Security is a very difficult thing to do—even for experienced developers. Maybe there's a way to reuse some existing implementation for your security needs.
My advice would be not to reinvent things, especially when they are as hard and as crucial as security.

Approaches to programming application-level protocols?

I'm doing some simple socket programming in C#. I am attempting to authenticate a user by reading the username and password from the client console, sending the credentials to the server, and returning the authentication status from the server. Basic stuff. My question is, how do I ensure that the data is in a format that both the server and client expect?
For example, here's how I read the user credentials on the client:
Console.WriteLine("Enter username: ");
string username = Console.ReadLine();
Console.WriteLine("Enter plassword: ");
string password = Console.ReadLine();
StreamWriter clientSocketWriter = new StreamWriter(new NetworkStream(clientSocket));
clientSocketWriter.WriteLine(username + ":" + password);
clientSocketWriter.Flush();
Here I am delimiting the username and password with a colon (or some other symbol) on the client side. On the server I simply split the string using ":" as the token. This works, but it seems sort of... unsafe. Shouldn't there be some sort of delimiter token that is shared between client and server so I don't have to just hard-code it in like this?
It's a similar matter for the server response. If the authentication is successful, how do I send a response back in a format that the client expects? Would I simply send a "SUCCESS" or "AuthSuccessful=True/False" string? How would I ensure the client knows what format the server sends data in (other than just hard-coding it into the client)?
I guess what I am asking is how to design and implement an application-level protocol. I realize it is sort of unique to your application, but what is the typical approach that programmers generally use? Furthermore, how do you keep the format consistent? I would really appreciate some links to articles on this matter as well.
Rather than reinvent the wheel. Why not code up an XML schema and send and receive XML "files".
Your messages will certainly be longer, but with gigabyte Ethernet and ADSL this hardly matters these days. What you do get is a protocol where all the issues of character sets, complex data structures have already been solved, plus, an embarrassing choice of tools and libraries to support and ease your development.
I highly recommend using plain ASCII text if at all possible.
It makes bugs much easier to detect and fix.
Some common, machine-readable ASCII text protocols (roughly in order of complexity):
netstring
Tab Delimited Tables
Comma Separated Values (CSV) (strings that include both commas and double-quotes are a little awkward to handle correctly)
INI file format
property list format
JSON
YAML Ain't Markup Language
XML
The world is already complicated enough, so I try to use the least-complex protocol that would work.
Sending two user-generated strings from one machine to another -- netstrings is the simplest protocol on my list that would work for that, so I would pick netstrings.
(netstrings will will work fine even if the user types in a few colons or semi-colons or double-quotes or tabs -- unlike other formats that choke on certain commonly-typed characters).
I agree that it would be nice if there existed some way to describe a protocol in a single shared file such that that both the server and the client could somehow "#include" or otherwise use that protocol.
Then when I fix a bug in the protocol, I could fix it in one place, recompile both the server and the client, and then things would Just Work -- rather than digging through a bunch of hard-wired constants on both sides.
Kind of like the way well-written C code and C++ code uses function prototypes in header files so that the code that calls the function on one side, and the function itself on the other side, can pass parameters in a way that both sides expect.
Tell me if you discover anything like that, OK?
Basically, you're looking for a standard. "The great thing about standards is that there are so many to choose from". Pick one and go with it, it's a lot easier than rolling your own. For this particular situation, look into Apache "basic" authentication, which joins the username and password and base64-encodes it, as one possibility.
I have worked with two main approaches.
First is ascii based protocol.
Ascii based protocol is usally based on a set of text commands that terminate on some defined delimiter (like a carriage return or semicolon or xml or json). If your protocol is a command based protocol where there is not a lot of data being transferred back and forth then this is the best way to go.
FIND\r
DO_SOMETHING\r
It has the advantage of being easy to read and understand because it is text based.
The disadvantage (may not be a problem but can be) is that there can be an unknown number of bytes being transferred back and forth from the client and the server. So if you need to know exactly how many bytes are being sent and received this may not be the type of protocol you want.
The other type of protocol is binary based with fixed sized messages that are sent in the header. This has the advantage of knowing exactly how much data the client is expected to receive. It also can potentially save you bandwith depending on what your sending across. Although, ascii can also save you space too, it depends on your application requirements. The disadvantage of a binary based protocol is that it is difficult to understand by just looking at it....requiring you to constantly look at documentation.
In practice, I tend to mix both strategies in protocols I have defined based on my application's requirements.

How safe is information contained within iPhone app compiled code?

I was discussing this with some friends and we began to wonder about this. Could someone gain access to URLs or other values that are contained in the actual objective-c code after they purchase your app?
Our initial feeling was no, but I wondered if anyone out there had definitive knowledge one way or the other?
I do know that .plist files are readily available.
Examples could be things like:
-URL values kept in a string
-API key and secret values
Yes, strings and information are easily extractable from compiled applications using the strings tool (see here), and it's actually even pretty easy to extract class information using class-dump-x (check here).
Just some food for thought.
Edit: one easy, albeit insecure, way of keeping your secret information hidden is obfuscating it, or cutting it up into small pieces.
The following code:
NSString *string = #"Hello, World!";
will produce "Hello, World!" using the strings tool.
Writing your code like this:
NSString *string = #"H";
string = [stringByAppendingString:#"el"];
string = [stringByAppendingString:#"lo"];
...
will show the characters typed, but not necessarily in order.
Again: easy to do, but not very secure.
When you purchase an app it is saved on your hard disk as "FooBar.ipa"; that file is actually in Zip format. You can unzip it and inspect the contents, including searching for strings in the executable. Try it! Constant values in your code are not compressed, encrypted, or scrambled in any way.
I know this has already been answered, but I want to give my own suggestion too.
Again, please remember that all obfuscation techniques are never 100% safe, and thus are not the best, but often they are "good enough" (depending on what you want to obfuscate). This means that a determined cracker will be able to read your strings anyways, but these techniques may stop the "casual cracker".
My other suggestion is to "crypt" the strings with a simple XOR. This is incredibly fast, and does not require any authorization if you are selling the app through the App Store (it does not fall into the categories of algorithms that require authorization for exporting them).
There are many snippets around for doing a XOR in Cocoa, see for example: http://iphonedevsdk.com/forum/iphone-sdk-development/11352-doing-an-xor-on-a-string.html
The key you use could be any string, be it a meaningless sequence of characters/bytes or something meaningful to confuse readers (e.g. use name of methods, such as "stringWithContentsOfFile:usedEncoding:error:").