How do I extract only numbers from a string in Dart? - flutter

I was trying to sort the string using -
for (var t in creditcards) {
print(t['firstyear'].toString().replaceAll('/[^0-9]/', ''));
}
Its output -
Rs.500
Rs.1000
Rs.500
Rs.0
Rs.499
Rs.499 + applicable taxes
Rs.500
Rs.500
Rs.495 + taxes
Rs.0
I want to remove ' + applicable taxes' from this and parse it as integer.

Without an example of what your strings look like, it's hard to say which patterns need to be handled. This works as long as your strings are always in the format Letters.numbers (any number of letters, then a dot, then any number of digits).
String x = 'Rs.499 + applicable taxes';
String y = ' asdhui Euro.3243 + applicable taxes';
final RegExp firstRegExp = RegExp(r'[a-zA-Z]+(\.[0-9]+)'); // raw string (r'...') so that \. reaches the RegExp as an escaped dot
final RegExp example = RegExp(r'\w+(\.\d+)'); // basically the same as the one above, but \w also accepts digits and underscores before the dot
// consider a static final RegExp to avoid creating a new instance every time a value is checked
print(firstRegExp.stringMatch(x));
print(example.stringMatch(x));
print(example.stringMatch(y));
for (var t in creditcards) {
print(example.stringMatch(t['firstyear'].toString()));
}
The RegExp checks for the pattern [a-zA-Z], which means any letter; + matches the previous item one or more times (so it matches one or more letters, but not spaces, tabs, or any special character). In (\.[0-9]+), the \. checks for a dot (the \ is needed because . is a special character in RegExp and you want to search explicitly for the dot), and [0-9]+ checks for one or more digits after the dot.
RegExp is used to check for patterns; read more about it in the Dart RegExp documentation, along with some examples of special characters in RegExp.
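Since the question asks for an integer rather than the matched text, the pattern can be adjusted so that only the digits are captured. Below is a minimal sketch in Python (used here purely for illustration, not Dart). The name `extract_amount` is hypothetical, and the sketch assumes the Letters.numbers format described above.

```python
import re
from typing import Optional

# Letters, a literal dot, then the digits we want (captured as group 1).
AMOUNT_PATTERN = re.compile(r"[a-zA-Z]+\.(\d+)")


def extract_amount(text: str) -> Optional[int]:
    """Return the number that follows 'Letters.' as an int, or None if absent."""
    match = AMOUNT_PATTERN.search(text)
    return int(match.group(1)) if match else None


for value in ["Rs.500", "Rs.499 + applicable taxes", "Rs.0"]:
    print(extract_amount(value))
```

In Dart, the analogous calls would be firstMatch and group(1) on the match, followed by int.parse.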

Related

Extracting range of unpadded string

I'd like to extract the Range<String.Index> of a sentence within its whitespace padding. For example,
let padded = " El águila (🦅). "
let sentenceRangeInPadded = ???
assert(padded[sentenceRangeInPadded] == "El águila (🦅).") // The test!
Here's some regex that I started with, but it looks like variable-length lookbehinds aren't supported.
let sentenceRangeInPadded = padded.range(of: #"(?<=^\s*).*?(?=\s*$)"#, options: .regularExpression)!
I'm not looking to extract the sentence (could just use trimmingCharacters(in:) for that), just the Range.
Thanks for reading!
You may use
#"(?s)\S(?:.*\S)?"#
See the regex demo.
Details
(?s) - a DOTALL modifier making . match any char, including line break chars
\S - the first non-whitespace char
(?:.*\S)? - an optional non-capturing group matching
  .* - any 0+ chars, as many as possible
  \S - up to the last non-whitespace char.
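The pattern above can be tried outside Swift as well. The following is a rough sketch in Python, where the hypothetical helper `unpadded_span` returns a half-open (start, end) pair instead of a Range<String.Index>. Note that Python indexes strings by code point, so the numbers are not interchangeable with Swift string indices.

```python
import re
from typing import Optional, Tuple

# (?s) lets . match line breaks; \S pins the first and last non-whitespace chars.
UNPADDED = re.compile(r"(?s)\S(?:.*\S)?")


def unpadded_span(text: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the text without its surrounding whitespace."""
    match = UNPADDED.search(text)
    return match.span() if match else None  # None for all-whitespace input


padded = " El águila (🦅). "
span = unpadded_span(padded)
if span is not None:
    start, end = span
    print(repr(padded[start:end]))
```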

How can I obtain only word without All Punctuation Marks when I read text file?

The text file abc.txt is an arbitrary article that has been scraped from the web. For example, it is as follows:
His name is "Donald" and he likes burger. On December 11, he married.
I want to extract only the words (in lower case) and numbers from the article above, without any periods, quotes, or other punctuation. For the example above:
{his, name, is, Donald, and, he, likes, burger, on, December, 11, he, married}
My code is as follows:
filename = 'abc.txt';
fileID = fopen(filename,'r');
C = textscan(fileID,'%s','delimiter',{',','.',':',';','"',''''});
fclose(fileID);
Cstr = C{:};
Cstr = Cstr(~cellfun('isempty',Cstr));
Is there any simple code to extract only alphabetic words and numbers, excluding all symbols?
Two steps are necessary as you want to convert certain words to lowercase.
First, regexprep converts to lower case the first letter of any word that is either at the start of the string or follows a full stop and a space.
In the regexprep function, we use the following pattern:
(?<=^|\. )([A-Z])
to indicate that:
(?<=^|\. ) asserts that the letter of interest is preceded by either the start of the string (^) or (|) a full stop (\.) followed by a space. This type of construct is called a lookbehind.
([A-Z]) matches and captures (stores the match of) an upper case letter (A-Z).
The ${lower($0)} component in the replacement is called a dynamic expression; it replaces the matched letter (the contents of the captured group ([A-Z])) with its lower case form. This syntax is specific to the MATLAB language.
You can check the behaviour of the above expression here.
Once the lower case conversions have occurred, regexp finds all occurrences of one or more digits, lower case and upper case letters.
The pattern [a-zA-Z0-9]+ matches lower case letters, upper case letters and digits.
You can check the behavior of this regex here.
text = fileread('abc.txt')
data = {regexp(regexprep(text,'(?<=^|\. )([A-Z])','${lower($0)}'),'[a-zA-Z0-9]+','match')'}
>>data{1}
13×1 cell array
{'his' }
{'name' }
{'is' }
{'Donald' }
{'and' }
{'he' }
{'likes' }
{'burger' }
{'on' }
{'December'}
{'11' }
{'he' }
{'married' }
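For comparison, the same two-step idea can be sketched in Python. This is an illustration under stated assumptions rather than a translation of the MATLAB code: the function name `extract_words` is hypothetical, and because Python's re module only accepts fixed-width lookbehinds, the start-of-string anchor is moved outside the lookbehind.

```python
import re
from typing import List


def extract_words(text: str) -> List[str]:
    """Lowercase sentence-initial capitals, then return all words and numbers."""
    # Step 1: lowercase a capital letter at the start of the text or after ". ".
    lowered = re.sub(
        r"(?:^|(?<=\. ))([A-Z])",
        lambda m: m.group(1).lower(),
        text,
    )
    # Step 2: collect every run of ASCII letters and digits.
    return re.findall(r"[a-zA-Z0-9]+", lowered)


article = 'His name is "Donald" and he likes burger. On December 11, he married.'
print(extract_words(article))
```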

Access, How can I change lowercase letter of first letter in last name to uppercase

I would like to change the lowercase first letter of a last name to uppercase using code.
My code from the form is:
Option Compare Database
Private Sub Text19_Click()
Text19 = UCase(Text19)
End Sub
But there is no change to my table!
Furthermore, how can I find last names that contain a space, comma, or period and remove the space, comma, or period?
For example:
Moon,
Moon.
[space] Moon
change them to just
Moon
If there is no change to your table, maybe your field is not bound to the recordset? Maybe you need to 'Refresh' your form.
Also, it looks like you are trying to use this code on a TextBox?
Code would be as follows:
Private Sub Text19_DblClick(Cancel As Integer)
    Text19 = Trim(Text19) ' Get rid of leading and trailing spaces.
    If Right(Text19, 1) = "." Or Right(Text19, 1) = "," Then ' Remove comma, period
        Text19 = Left(Text19, Len(Text19) - 1)
    End If
    Text19 = UCase(Left(Text19, 1)) & Mid(Text19, 2)
End Sub
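The three clean-up steps in that handler (trim, drop one trailing period or comma, uppercase the first letter) can be checked in isolation. Here is a small sketch in Python, used only to illustrate the logic; `normalize_last_name` is a hypothetical name and nothing here touches Access.

```python
def normalize_last_name(raw: str) -> str:
    """Trim spaces, drop one trailing '.' or ',', and capitalize the first letter."""
    name = raw.strip()  # remove leading and trailing whitespace
    if name.endswith((".", ",")):  # remove a single trailing period or comma
        name = name[:-1]
    return name[:1].upper() + name[1:]  # uppercase only the first character


for sample in ["Moon,", "Moon.", " Moon", "moon"]:
    print(normalize_last_name(sample))
```

Like the VBA version, this removes at most one trailing punctuation mark and leaves the rest of the name unchanged.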

Matching Unicode punctuation using LPeg

I am trying to create an LPeg pattern that would match any Unicode punctuation inside UTF-8 encoded input. I came up with the following marriage of Selene Unicode and LPeg:
local unicode = require("unicode")
local lpeg = require("lpeg")
local punctuation = lpeg.Cmt(lpeg.Cs(any * any^-3), function(s,i,a)
  local match = unicode.utf8.match(a, "^%p")
  if match == nil then
    return false
  else
    return i+#match
  end
end)
This appears to work, but it has three problems. It will miss punctuation characters that are a combination of several Unicode codepoints (if such characters exist), because I am reading only 4 bytes ahead. It probably kills the performance of the parser. And it is undefined what the library match function will do when I feed it a string that contains a runt UTF-8 character (although it appears to work now).
I would like to know whether this is a correct approach or if there is a better way to achieve what I am trying to achieve.
The correct way to match UTF-8 characters is shown in an example on the LPeg homepage. The first byte of a UTF-8 character determines how many more bytes are a part of it:
local cont = lpeg.R("\128\191") -- continuation byte
local utf8 = lpeg.R("\0\127")
+ lpeg.R("\194\223") * cont
+ lpeg.R("\224\239") * cont * cont
+ lpeg.R("\240\244") * cont * cont * cont
Building on this utf8 pattern we can use lpeg.Cmt and the Selene Unicode match function kind of like you proposed:
local punctuation = lpeg.Cmt(lpeg.C(utf8), function (s, i, c)
  if unicode.utf8.match(c, "%p") then
    return i
  end
end)
Note that we return i; this is in accordance with what Cmt expects:
The given function gets as arguments the entire subject, the current position (after the match of patt), plus any capture values produced by patt. The first value returned by function defines how the match happens. If the call returns a number, the match succeeds and the returned number becomes the new current position.
This means we should return the same number the function receives, that is the position immediately after the UTF-8 character.
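To make the byte-level logic concrete, here is a sketch of the same idea in Python rather than Lua: the first byte decides the character length, the character is decoded, and the position just past it is returned when it is punctuation. The helper names are hypothetical, positions are 0-based (LPeg's are 1-based), and treating Unicode general categories starting with "P" as punctuation is an assumption about what %p covers in Selene Unicode.

```python
import unicodedata
from typing import Optional


def utf8_char_len(lead: int) -> Optional[int]:
    """Length in bytes of a UTF-8 character, judged by its first byte."""
    if lead <= 0x7F:
        return 1
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return None  # continuation byte or invalid lead byte


def match_punctuation(data: bytes, pos: int) -> Optional[int]:
    """Return the position just past a punctuation character at pos, else None."""
    if pos >= len(data):
        return None
    length = utf8_char_len(data[pos])
    if length is None:
        return None
    try:
        # decode() rejects truncated or malformed sequences
        char = data[pos:pos + length].decode("utf-8")
    except UnicodeDecodeError:
        return None
    if unicodedata.category(char).startswith("P"):
        return pos + length
    return None


sample = "¡Hola!".encode("utf-8")
print(match_punctuation(sample, 0), match_punctuation(sample, 2))
```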

How to read a specific number (or word) from an answer

I have an .nc file I'm reading in MATLAB, and I'm getting info out of the time variable.
The code looks like this:
>> ncreadatt(model_list{3},'T','units')
ans =
'months since 1850-01-01'
What I want to do is get just the '1850' out of the answer.
Regular expressions are a very powerful tool for parsing and manipulating strings.
MATLAB has the regexp command:
line = 'months since 1850-01-01';
res = regexp( line, '\s(\d+)-', 'tokens', 'once');
year = str2double(res{1})
And the result is:
year =
1850
The regular expression used, '\s(\d+)-', means:
'\s' - look for a single whitespace character (the space before 1850).
'(\d+)' - look for one or more digits ('\d+'); the parentheses mean that all characters matching here will be saved as a "token".
'-' - look for a single '-' after the digits.
You can play with it on ideone.
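The same pattern behaves identically in other regex engines. As a quick cross-check, here is a minimal sketch in Python (not MATLAB); the function name `extract_year` is hypothetical.

```python
import re
from typing import Optional


def extract_year(units: str) -> Optional[int]:
    """Return the digits that sit between a whitespace character and a hyphen."""
    # \s(\d+)- : one whitespace char, one or more digits (captured), then '-'
    match = re.search(r"\s(\d+)-", units)
    return int(match.group(1)) if match else None


print(extract_year("months since 1850-01-01"))  # prints 1850
```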