How to use numbers as delimiters in MATLAB strsplit function - matlab

As the title suggests I'm looking to detect where the numbers are in a string and then to just take the substring from the larger string. EG
If I have say zero89 or eight78, I would just like zero or nine returned. When using the strsplit function I have:
strsplit('zero89', but what do I put here?)

Interested in regexp that will provide you more options to explore with?
Extract numeric digits -
regexp('zero89','\d','match')
Extract anything other than digits -
regexp('zero89','\d+','Split')

strsplit('zero89', '\d', 'DelimiterType', 'RegularExpression')
Or just using regexp:
regexp('zero89','\D+','match')
I got the \D+ from here

Assuming you mean this strsplit?
strsplit('zero89', '8')

Related

finding a comma in string

[23567,0,0,0,0,0] and other value is [452221,0,0,0,0,0] and the value should be contineously displaying about 100 values and then i want to display only the sensor value like in first sample 23567 and in second sample 452221 , only the these values have to display . For that I have written a code
value = str2double(str(2:7));see here my attempt
so I want to find the comma in the output and only display the value before first comma
As proposed in a comment by excaza, MATLAB has dedicated functions, such as sscanf for such purposes.
sscanf(str,'[%d')
which matches but ignores the first [, and returns the next (i.e. the first) number as a double variable, and not as a string.
Still, I like the idea of using regular expressions to match the numbers. Instead of matching all zeros and commas, and replacing them by '' as proposed by Sardar_Usama, I would suggest directly matching the numbers using regexp.
You can return all numbers in str (still as string!) with
nums = regexp(str,'\d*','match')
and convert the first number to a double variable with
str2double(nums{1})
To match only the first number in str, we can use the regexp
nums = regexp(str,'[(\d*),','tokens')
which finds a [, then takes an arbitrary number of decimals (0-9), and stops when it finds a ,. By enclosing the \d* in brackets, only the parts in brackets are returned, i.e. only the numbers without [ and ,.
Final Note: if you continue working with strings, you could/should consider the regexp solution. If you convert it to a double anyways, using sscanf is probably faster and easier.
You can use regexprep as follows:
str='[23567,0,0,0,0,0]' ;
required=regexprep(str(2:end-1),',0','')
%Taking str(2:end-1) to exclude brackets, and then removing all ,0
If there can be values other than 0 after , , you can use the following more general approach instead:
required=regexprep(str(2:end-1),',[-+]?\d*\.?\d*','')

Can I directly load text with numbers in CCC,CC format ? (K4)

I have input with floats stored like 1000,50, ie. the decimal points are replaced by commas.
Is there an option in K to load these numbers directly into floats ?
When using
data:("SFF" ;";",";") 0:. filename
I get 0ns, of course, because the numbers are not recognized as floats.
I load them as strings now, and convert them using ssr like
c:.:' .q.ssr'[data;",";"."]
but that is extremely slow.
Is there an option somewhere to have K load these numbers in CCC,CC format as floats directly ? Normal format and ccc,cc format are not mixed, any file has just one of them.
If there is not, I imagine that it must by quite easy to replace a "." somewhere in the Q-binary where the load-function sits, with a ",", to get a version which loads these numbers. Has anybody tried that ? Or any other tip to load big files with these numbers in reasonable time ?
Cheers,
Co
If ssr' is slow for your task you may find this tiny function useful:
c2p:{c:-1_sums count each x;p:ss[r:raze x;","];r[p]:".";(0,c) _ r}
Update: an alternative version:
c2p:{p:ss[r:raze x;","];r[p]:".";(0,-1_sums count'[x])_r}
It concatenates all strings into a single long string, finds positions of commas, replaces commas with periods then splits that long string:
q)N:1000000
q)s:string[N?100000],'",",'string N?1000
q)\t r1:ssr'[s;",";"."]
4284
q)\t r2:c2p s
242
q)r1~r2
1b
I was thinking something like find (?) combined with indexing/applying
q)N:1000000
q)s:string[N?100000],'",",'string N?1000
q)\ts {s[x;y]:"."}./:flip(til count s;s?\:",")
967 52972144
q)s
"93912.794"
"57144.788"
"77809.659"
"7839.47"
"6363.523"
"44761.244"
"65699.712"
It's not perfect but that's the general idea. I'm sure there is an easier way...

Getting Variables from a data structure and creating a matrix from those variables

I have a data structure which has data points named Vel1 to Vel1520. However, when I apply Uorder = orderfields(MeanU_Velocity); the variables put in the order Vel1 Vel10 Vel100 Vel1000 Vel1001 Vel1002 etc. Is there any way to sort the data structure such that it lists the variables from 1 to 1520 in ascending order? Regards, Jer
An easy fix to this is to always use the same number of digits. 0001, 0002, ..., 0010, ..., 1520
instead of num2str(42), try sprintf('Vel%04d', 42). This prints formatted text to a string. %04d is a special code that says: fill with zeros, reserve 4 places, print an integer number. Have a look at the documentation and look at matlabs formatted strings tutorial for more comprehensive examples.

reading via matlab a number after a specific string in a txt file

I re explain my pb in a large a.txt file i have
Amount of Food is 1
Desired Travel is 5
I need to read the 1 after the 'Amount of Food is ' expression and the 5 after the 'Desired Travel is' expression, Thanks again
You can have a look at this: with regexpi you can simply look for numbers in your strings.
The syntax is as simple as this:
startIndex = regexpi(str,expression)
where the expression parameter is a regex expression (i.e. '\d*' to retrieve consecutive digits).
In your specific case a way to perform this with regular expressions would be:
First you have to decide what strings are valid in your search
for example:
firstpar = 'First parameter is [0-9]+';
means that you are looking for a string 'First parameter is '
that ends with a sequence of digits.
Then you could use regexp or regexpi in the following way:
results = regexp(mystring, firstpar, 'match');
Where mystring is the text you perform the search on and 'match' means that you want parts of the text as output, not indexes.
Now, results is a cell matrix with each cell containing a string that appeared in your text and fulfilled your firstpar definition. In order to extract just the numbers from cell matrix of strings you could use regexp again, but now helping yourself with cellfun, which iteratively applies your command to all cells of a cell matrix:
numbers = cellfun(#(x) str2num(regexp(x, '[0-9]+', 'match', 'once')), results);
numbers is an array of numbers that you were looking for.
You can do the same for different string patterns - if you want to have a more general string definitions (instead of straightforward firstpar that we used here) read matlab documentation about regular expressions (alexcasalboni pasted it in his comment), scroll down to Input Arguments and expand 'expressions'.
The difference between regexp and regexpi is that the latter is case insensitive.

Trouble determining the pattern for NSRegularExpression...?

i am relatively new to NSRegularExpression and just can't come up with a pattern to find a string within a string....
here is the string...
##$294#001#[12345-678[123-456-7#15665#2
I want to extract the string..
#001#[12345-678[123-456-7#
for more info I know that there will be 3 digits(like 001) between two # 's and 20 characters between the last two # 's..
I have tried n number of combinations but nothing seem to work. any help is appreciated.
How about something like this:
#[0-9]{3}#.{20}#
If you know that the 20 characters will always consist of digits, [ and -, your pattern would become:
#[0-9]{3}#[0-9\[\-]{20}#
Be careful with the backslashes: When you use create the pattern with a string literal (#"..."), you need to add an extra backslash before each backslash.
You can test NSRegularExpression patterns without recompiling each time by using RegexTester https://github.com/liyanage/regextester