Is there a zfill-type function in MicroPython?

I am trying to put a Python script on a pyboard that's running MicroPython. Is there an equivalent of Python's .zfill in MicroPython?

There is no built-in zfill, but I use this zfl function:
def zfl(s, width):
    # Pads the provided string with leading 0's to suit the specified 'width'
    # Forces 'width' characters, filling with leading 0's
    return '{:0>{w}}'.format(s, w=width)
This might be useful to you. Just pass the string and the string width you require.

Welcome to SO!
No, MicroPython does not have a zfill method on strings.
If you need a specific width, take the len() of the string and prepend the required number of "0" characters to it.
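A minimal sketch of that len()-and-concatenate approach follows. The function name `zfill` is chosen here for illustration, and the sketch assumes plain digit strings: unlike CPython's `str.zfill`, it does not move a leading sign in front of the padding.

```python
def zfill(s, width):
    """Left-pad s with '0' characters until it is at least `width` long."""
    missing = width - len(s)
    if missing <= 0:
        # Already wide enough: return the string unchanged.
        return s
    return "0" * missing + s


print(zfill("42", 5))
```

It uses only `len()`, string repetition and concatenation, so it should not depend on any string method beyond the basics.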

Related

Convert unicode string into NFC in Rust

Let's say I have a std::String, contents unknown, that, like "Mañana", has combining characters, and I want to convert it to Unicode NFC, a la String.prototype.normalize in JavaScript or unicodedata.normalize in Python.
I found this crate on crates.io but it seems to contain only methods for working with individual characters. How would I convert an entire string? Convert to bytes, iterate pairwise and check for combining characters using the functions in that crate? What would that even look like in Rust?
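You can indeed use the unicode_normalization crate. More specifically, check out the nfc method of its UnicodeNormalization trait: it works on a whole &str and returns an iterator of chars, so s.nfc().collect::<String>() gives you the normalized string.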
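For reference, this is the behaviour the question is after, shown with Python's unicodedata.normalize, which the question itself mentions. It is only an illustration of what NFC does to a whole string, not Rust code, and the sample string is an assumption.

```python
import unicodedata

# "Mañana" written with a combining tilde: 'n' followed by U+0303.
decomposed = "Man\u0303ana"

# NFC composes 'n' + U+0303 into the single code point U+00F1.
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))
print(composed == "Ma\u00f1ana")
```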

write()-ing an encoded string in Python 3.x

I've got a unicode string (s) which I want to write into a file.
In Python 2 I could write:
open('filename', 'w').write(s.encode('utf-8'))
But this fails for Python 3. Apparently, s.encode() returns something of type 'bytes', which the write() function does not accept:
TypeError: must be str, not bytes
Does anyone know how to port the above code to Python 3?
Edit:
Thanks to all of you who proposed using binary mode! Unfortunately, this causes a problem with the \n characters. Is there any way to achieve the same result I had with Python 2 (namely to encode non-ANSI characters in UTF-8 while keeping the OS-specific rendition of \n)?
Thanks!
You do not want to muck around with manually encoding each and every piece of data like that! Simply pass the encoding as an argument to open, like this:
#!/usr/bin/env python3.2
slist = [
    "Ca\N{LATIN SMALL LETTER N WITH TILDE}on City",
    "na\N{LATIN SMALL LETTER I WITH DIAERESIS}vet\N{LATIN SMALL LETTER E WITH ACUTE}",
    "fa\N{LATIN SMALL LETTER C WITH CEDILLA}ade",
    "\N{GREEK SMALL LETTER BETA}-globulin"
]
with open("/tmp/sample.utf8", mode="w", encoding="utf8") as f:
    for s in slist:
        print(s, file=f)
Now if you cat the file you made, you'll see that it says:
$ cat /tmp/sample.utf8
Cañon City
naïveté
façade
β-globulin
And you can see that those are the right code points this way:
$ uniquote -x /tmp/sample.utf8
Ca\x{F1}on City
na\x{EF}vet\x{E9}
fa\x{E7}ade
\x{3B2}-globulin
See how much easier that is? Let the stream object handle any low-level encoding or decoding for you.
Summary: Don't call encode or decode yourself when all you are doing is processing a homogeneous stream that is in a single encoding throughout. That's way too much bother for zero gain. Set the encoding argument once, when you open the file.
Open the file in binary mode, that's the least invasive way in terms of changes.
On the other hand, you could set the output file encoding with open() and avoid explicit string encoding altogether.
You might want to read the manual of the open() function.
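As a small sketch of the second option, the snippet below writes in text mode with an explicit encoding and then reads the raw bytes back. The sample text and the temporary file location are assumptions made for illustration. In text mode, open() encodes the string and also translates "\n" to the platform's line separator, which is the behaviour the question's edit asks for.

```python
import os
import tempfile

text = "Ca\u00f1on City\n"  # sample string, chosen for illustration

# Write into a fresh temporary directory so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "sample.utf8")

# Text mode: open() encodes to UTF-8 and translates "\n" to os.linesep.
with open(path, mode="w", encoding="utf-8") as f:
    f.write(text)

# Read the raw bytes back to see what actually landed on disk.
with open(path, "rb") as f:
    raw = f.read()

print(raw)
```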
Open the file in binary mode:
open('filename', 'wb').write(s.encode('utf-8'))

How to do preg_replace on a long string

I want to be able to find and replace a long line of JavaScript code. The code has a lot of / and \ characters in it too.
Is this even possible?
You can modify the limit manually so PHP will allow you to handle very long strings.
Put the following line somewhere before calling preg_replace.
ini_set('pcre.backtrack_limit', 99999999999);
Even better, if you can modify your php.ini file, you can change the value of pcre.backtrack_limit there so the new limit is available globally.
It depends how long - there is an upper length limit (see http://nz.php.net/manual/en/function.preg-last-error.php for how to detect if you reach it).
You can escape variables going into your pattern with preg_quote if you need to, which takes care of the \ characters, and of the / characters too if you pass "/" as the delimiter argument.
PHP string functions have size limits, and sadly those limits are not specified. You will have to divide the whole string into chunks of smaller strings, run preg_replace on each chunk, then combine the results. That is what I did.

Convert 32-char md5 string to integer

What's the most efficient way to convert an md5 hash to a unique integer to perform a modulus operation?
Since the solution language was not specified, Python is used for this example.
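import os
import hashlib

array = os.urandom(1 << 20)  # 1 MiB of random bytes as sample input
md5 = hashlib.md5()
md5.update(array)
digest = md5.hexdigest()  # 32-character hex string
number = int(digest, 16)  # the hash as one 128-bit integer
YOUR_NUMBER = 1000  # placeholder: replace with the modulus you need
print(number % YOUR_NUMBER)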
You haven't said what platform you're running on, or what the format of this hash is. Presumably it's hex, so you've got 16 bytes of information.
In order to convert that to a unique integer, you basically need a 16-byte (128-bit) integer type. Many platforms don't have such a type available natively, but you could use two long values in C# or Java, or a BigInteger in Java or .NET 4.0.
Conceptually you need to parse the hex string to bytes, and then convert the bytes into an integer (or two). The most efficient way of doing that will entirely depend on which platform you're using.
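In Python, which has arbitrary-precision integers, the two conceptual steps above might look like the sketch below. The digest value is the well-known MD5 of empty input and is used purely as sample data; splitting into two 64-bit halves mirrors the "two long values" idea for platforms without a 128-bit type.

```python
import struct

hex_digest = "d41d8cd98f00b204e9800998ecf8427e"  # MD5 of the empty input

raw = bytes.fromhex(hex_digest)        # 32 hex characters -> 16 bytes
as_one = int.from_bytes(raw, "big")    # one 128-bit integer

# Alternatively, two unsigned 64-bit halves (big-endian).
high, low = struct.unpack(">QQ", raw)

print(as_one == ((high << 64) | low))
```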
There is more data in an MD5 than will fit in even a 64-bit integer, so there's no way (without knowing what platform you are using) to get a unique integer. You can get a somewhat unique one by converting the hex version to several integers' worth of data and then combining them (addition or multiplication). How exactly you would go about that depends on what language you are using, though.
A lot of languages implement either an unpack or an sscanf function, which are good places to start looking.
If all you need is the modulus, you don't actually need to convert it to a 128-bit integer. You can go digit by digit or byte by byte, like this:
mod = 0;
for (i = 0; i < 32; i++)
{
    digit = md5[i]; // I presume you can convert the char to a digit yourself.
    mod = (mod * 16 + digit) % divider;
}
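The same digit-by-digit idea can be sketched in runnable Python. The function name `hex_mod` is made up for this example, and the digest is again the MD5 of empty input, used as sample data. The intermediate value never exceeds 16 * divider, which is the point of the technique on platforms without big integers.

```python
def hex_mod(hex_digest, divider):
    """Compute int(hex_digest, 16) % divider one hex digit at a time."""
    mod = 0
    for ch in hex_digest:
        digit = int(ch, 16)  # '0'-'9', 'a'-'f' -> 0..15
        mod = (mod * 16 + digit) % divider
    return mod


print(hex_mod("d41d8cd98f00b204e9800998ecf8427e", 1000))
```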
You'll need to define your own hash function that converts an MD5 string into an integer of the desired width. If you want to interpret the MD5 hash as a plain string, you can try the FNV algorithm. It's pretty quick and fairly evenly distributed.
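For illustration, here is a sketch of one member of the FNV family, 32-bit FNV-1a, applied to the hex string as plain text. The answer does not say which FNV variant or width it has in mind, so that choice is an assumption; the constants are the published 32-bit offset basis and prime.

```python
FNV_OFFSET_BASIS_32 = 0x811C9DC5
FNV_PRIME_32 = 0x01000193


def fnv1a_32(text):
    """32-bit FNV-1a hash of an ASCII string (e.g. an MD5 hex digest)."""
    h = FNV_OFFSET_BASIS_32
    for byte in text.encode("ascii"):
        h ^= byte                              # mix in the next byte
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF    # multiply, keep 32 bits
    return h


print(fnv1a_32("d41d8cd98f00b204e9800998ecf8427e") % 1000)
```

The result fits in 32 bits, so it is no longer unique per MD5 value, but it is cheap to reduce with a modulus.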

How should I handle digits from different sets of UNICODE digits in the same string?

I am writing a function that transliterates UNICODE digits into ASCII digits, and I am a bit stumped on what to do if the string contains digits from different sets of UNICODE digits. So for example, if I have the string "\x{2463}\x{24F6}" ("④⓶"). Should my function
1. return 42?
2. croak that the string contains mixed sets?
3. carp that the string contains mixed sets and return 42?
4. give the user an additional argument to specify one of the three above behaviours?
5. do something else?
Your current function appears to do #1.
I suggest that you should also write another function to do #4, but only when the requirement appears, and not before.
I'm sure Joel wrote about "premature implementation" in a blog article sometime recently, but I can't find it.
I'm not sure I see a problem.
You support numeric conversion from a range of scripts, which is to say, you are aware of the Unicode codepoints for their numeric characters.
If you find an unknown codepoint in your input data, it is an error.
It is up to you what you do in the event of an error; you may insert a space or underscore, or you may abort conversion. What you would do will depend on the environment in which your function executes; it is not something we can tell you.
My initial thought was #4, strictly based on the fact that I like options. However, I changed my mind when I viewed your function.
The purpose of the function seems to be, simply, to get the resulting digits 0..9. Users may find it useful to send in mixed sets (a feature :)). I'll use it.
If you ever have to handle input in bases greater than 10, you may end up having to treat many variants on the first 6 letters of the Latin alphabet ('ABCDEF') as digits in all their forms.
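As a rough Python illustration of options 1 to 4 from the question (the question itself is about a Perl function, so this is an analogue, not the asker's code): the function name and the `on_mixed` parameter are hypothetical. Detecting a "set" by subtracting the digit value from the code point is a heuristic that assumes each set is a contiguous run starting at its zero, which is not true of every set.

```python
import unicodedata
import warnings


def to_ascii_digits(text, on_mixed="ignore"):
    """Transliterate Unicode digits to ASCII digits.

    on_mixed (hypothetical): "ignore" (option 1), "error" (option 2),
    or "warn" (option 3) when digits from more than one set appear.
    """
    out = []
    set_origins = set()
    for ch in text:
        value = unicodedata.digit(ch, None)
        if value is None:
            raise ValueError("not a digit: %r" % ch)
        out.append(str(value))
        # Heuristic: code point minus digit value is constant within a
        # contiguous digit set, so it serves as a rough set identifier.
        set_origins.add(ord(ch) - value)
    if len(set_origins) > 1:
        if on_mixed == "error":
            raise ValueError("string contains mixed digit sets")
        if on_mixed == "warn":
            warnings.warn("string contains mixed digit sets")
    return "".join(out)


print(to_ascii_digits("\u2463\u24f6"))
```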