Suppose I have a char variable in Matlab like this:
x = 'hello my name $ is Sean $ Daley.';
I want to replace the first '$' with the symbol '&', and the second '$' with the symbol '#'.
Furthermore, if I have a more complicated char such that pairs of '$' repeat many times, I want to repeat the same pattern. So the following:
y = 'hello $ my $ name is $ Sean $ Daley $.$.';
would be transformed into:
'hello & my # name is & Sean # Daley &.#.'
I have tried coding this manually via for loops and while loops, but the code is just so ugly. Are there any simple functions that I can use?
Since you're dealing with single characters and non-nested pairs of flags, you can easily do this with a simple call to find and some indexed replacement:
y = 'hello $ my $ name is $ Sean $ Daley $.$.';
index = find(y == '$');
y(index(1:2:end)) = '&';
y(index(2:2:end)) = '#';
And the result:
y =
'hello & my # name is & Sean # Daley &.#.'
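For readers outside MATLAB, the same idea (number the occurrences of the marker, then give odd and even occurrences different replacements) can be sketched in Python. This is only an illustration; the function name `alternate_replace` and its parameters are hypothetical, not part of the answer above.

```python
def alternate_replace(text, marker="$", replacements=("&", "#")):
    """Replace successive occurrences of marker, cycling through replacements."""
    out = []
    seen = 0  # how many markers have been replaced so far
    for ch in text:
        if ch == marker:
            # 1st, 3rd, ... marker gets replacements[0]; 2nd, 4th, ... gets replacements[1]
            out.append(replacements[seen % len(replacements)])
            seen += 1
        else:
            out.append(ch)
    return "".join(out)


print(alternate_replace("hello $ my $ name is $ Sean $ Daley $.$."))
# prints: hello & my # name is & Sean # Daley &.#.
```

As with the indexed assignment in MATLAB, this assumes a single-character marker and non-nested pairs.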
Suppose I have a char variable in Matlab like this:
x = 'hello ### my $ name is Sean Daley.';
I want to replace the first '###' with the char '&', and the first '$' with the char '&&'.
Note that the character groups I wish to swap have different lengths [e.g., length('###') is 3 while length('&') is 1].
Furthermore, if I have a more complicated char such that pairs of '###' and '$' repeat many times, I want to implement the same swapping routine. So the following:
y = 'hello ### my $ name is ### Sean $ Daley ###.$.';
would be transformed into:
'hello & my && name is & Sean && Daley &.&&.'
I have tried coding this (for any arbitrary char) manually via for loops and while loops, but the code is absolutely hideous and does not generalize to arbitrary character group lengths.
Are there any simple functions that I can use to make this work?
y = replace(y,["###" "$"],["&" "&&"])
The function strrep is what you are looking for; call it once for each pattern you want to replace.
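The replacement of several patterns of different lengths can also be sketched in Python, assuming a plain dictionary from old substrings to new ones. The helper name `replace_many` is hypothetical. All patterns are matched in a single pass, so text produced by one replacement is never rescanned by another.

```python
import re


def replace_many(text, mapping):
    """Replace every key of mapping found in text with its value, in one pass."""
    # Try longer keys first so that a key which is a prefix of another cannot shadow it.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: mapping[m.group(0)], text)


y = "hello ### my $ name is ### Sean $ Daley ###.$."
print(replace_many(y, {"###": "&", "$": "&&"}))
# prints: hello & my && name is & Sean && Daley &.&&.
```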
I have a file with multiple lines; but a specific line contains tons of information, with several repeated expressions. I'm trying to extract some specific values. I first tried some commands with sed, for instance, but with no success. So, I was wondering if you could give me some insights.
Here is a fragment of the single line of the document I mentioned:
[...]6[&length_range={0.19
[... a lot of more information here in between ...]
0.01},habitat.set.prob={0.01,0.03,0.56,0.01,0.01,0.34,0.01,0.01,0.01},DLOOP.rate_median=0.04131395026396427,length=
[...]
10[&length_range={0.19
[... a lot of more information here in between ...]
0.01},habitat.set.prob={0.21,0.33,0.56,0.01,0.01,0.33,0.01,0.01,0.61},DLOOP.rate_median=0.04131395026396427,length=
[...]
My aim is first to extract all the values that are between the braces after "habitat.set.prob={" and put them on a single line in a text file.
It is also important to extract the numbers that appear just before the expression "[&length_range=", which in this case are "6" and "10". They are the labels of the sets of numbers after "prob={".
So the set of numbers I want to extract always appears between "habitat.set.prob={" and "},DLOOP.rate_median", while the other number (the label) is always right before "[&length_range=". What comes before the label is not a fixed expression; it is actually a random number.
The goal is to end up with a file with the following characteristics:
6 0.01,0.03,0.56,0.01,0.01,0.34,0.01,0.01,0.01
10 0.21,0.33,0.56,0.01,0.01,0.33,0.01,0.01,0.61
and so on …
What do you think? Is this possible?
I started with this very basic command, at least to try to extract the set of numbers, but it didn't work:
sed -n "/habitat.set.prob={/,/},DLOOP.rate_median=/ p"
Well... I got some improvement.
I was able to get the values at least:
awk '{gsub("habitat.set.prob={","\n");printf"%s",$0}' filename | awk -F'},' '{print $1"}"}' | grep -iv "TREE" > stats.txt
Many thanks in advance.
Cheers,
Luiz
Something like this:
sed -rn '/.*[0-9]+\[&length_range=\{/,/habitat.set.prob=\{/{s/.*\b([0-9]+)\[&length_range.*/\1/p; s/.*habitat.set.prob=\{([^D]+)\},DLOOP.rate.*/\1/p}' habitat
6
0.01,0.03,0.56,0.01,0.01,0.34,0.01,0.01,0.01
10
0.21,0.33,0.56,0.01,0.01,0.33,0.01,0.01,0.61
The first part, '/.a./,/.b./', selects the range of lines from a line matching pattern a through the next line matching pattern b. The -n option tells sed not to print lines by default.
In '/.a./,/.b./{s/.c./.d./p; s/.e./.f./p}'
there are two substitution commands in curly braces, each with the p (print) flag.
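The same extraction can be sketched in Python with one regular expression that captures the label and the value set together, which also puts each label and its values on one line as the question asks. The `sample` string below is a hypothetical, shortened stand-in for the real data (the elided parts are simply left out), and the sketch assumes that a non-digit character comes immediately before each label.

```python
import re

# Hypothetical, shortened stand-in for the real line; the elided parts are omitted.
sample = (
    "(6[&length_range={0.19,0.01},"
    "habitat.set.prob={0.01,0.03,0.56},DLOOP.rate_median=0.04,length=1.0],"
    "10[&length_range={0.19,0.01},"
    "habitat.set.prob={0.21,0.33,0.56},DLOOP.rate_median=0.04,length=2.0])"
)

RECORD = re.compile(
    r"(\d+)\[&length_range="          # label right before "[&length_range="
    r".*?habitat\.set\.prob=\{"       # skip lazily to the next probability set
    r"([^}]*)\},DLOOP\.rate_median",  # values up to the closing brace
    re.DOTALL,                        # let "." cross line breaks, if there are any
)


def extract_records(text):
    """Return one 'label values' string per record found in text."""
    return [f"{label} {values}" for label, values in RECORD.findall(text)]


for record in extract_records(sample):
    print(record)
# prints:
# 6 0.01,0.03,0.56
# 10 0.21,0.33,0.56
```

If the character before a label can itself be a digit, the pattern would need an extra anchor to tell where the label starts.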
I am not sure how much you have dug into this yourself, so I am not providing the complete answer, but I hope this helps:
For the first part, getting the number (which you call the label): you didn't mention whether there is a specific pattern, so try this (data is the file which contains the actual input). You will need to work on how to capture the number and tweak the regular expression a bit:
sed -n 's/.*\([0-9][0-9]*\).*length_range.*/\1/p' data
For the other part, which gives the numbers between habitat and DLOOP:
sed -n 's/.*habitat.set.prob=\(.*\),DLOOP.*/\1/pg' data | tr '{' ' ' | tr '}' ' '
Now, try to take this as a starter and work on your output to get your desired result!
To explain a bit:
In the first section, I am trying to capture the digits between anything (.*) and (.*)length_range [you can escape the characters [ and & by putting \ in front of them].
In the second section, I am capturing the pattern between habitat.set.prob and DLOOP and then using tr to remove the braces.
#include <iostream>
#include <string>
using namespace std;
int main()
{
    string p = "1:2:3:4"; // input string: single digits separated by ':'
    int arr[4] = {}; // zero-initialized array to hold the extracted integers
    for (size_t i = 0, j = 0; i < p.length() && j < 4; i++) { // loop over the string to extract integers
        if (p[i] == ':') { continue; } // skip the ':' separators
        arr[j] = p[i] - '0'; j++; // convert the digit character to its integer value and store it
    }
    cout << "String={" << arr[0] << " " << arr[1] << " " << arr[2] << " " << arr[3] << "}"; // print the array
    return 0;
}
I wonder if it is possible to read a .csv file that looks like this:
0,0530,0560,0730,....
90,15090,15290,157....
I should get:
0,053 0,056 0,073 0,...
90,150 90,152 90,157 90,...
When using dlmread(path, ''), MATLAB throws an error saying:
Mismatch between file and Format character vector.
Trouble reading 'Numeric' field from file (row 1, field number 2) ==> ,053 0,056 0,073 ...
I also tried using "0," as the delimiter, but MATLAB does not allow this.
Thanks,
jonnyx
str= importdata('file.csv',''); %importing the data as a cell array of char
for k=1:length(str) %looping till the last line
str{k}=myfunc(str{k}); %applying the required operation
end
where
function new=myfunc(str)
old = str(1:regexp(str, ',', 'once')); %finding the characters till the first comma
%old is the pattern of the current line
new=strrep(str,old,[' ',old]); %adding a space before that pattern
new=new(2:end); %removing the space at the start
end
and file.csv :
0,0530,0560,073
90,15090,15290,157
Output:
>> str
str=
'0,053 0,056 0,073'
'90,150 90,152 90,157'
You can actually do this using textscan without any loops and using a few basic string manipulation functions:
fid = fopen('no_delim.csv', 'r');
C = textscan(fid, ['%[0123456789' 10 13 ']%[,]%3c'], 'EndOfLine', '');
fclose(fid);
C = strcat(C{:});
output = strtrim(strsplit(sprintf('%s ', C{:}), {'\n' '\r'})).';
And the output using your sample input file:
output =
2×1 cell array
'0,053 0,056 0,073'
'90,150 90,152 90,157'
How it works...
The format string specifies 3 items to read repeatedly from the file:
A string containing any number of characters from 0 through 9, newlines (ASCII code 10), or carriage returns (ASCII code 13).
A comma.
Three individual characters.
Each set of 3 items is concatenated, then all sets are printed to a string separated by spaces. The string is split at any newlines or carriage returns to create a cell array of strings, and any spaces on the ends are removed.
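The "digits, comma, three characters" reading rule described above can be sketched in Python as well. This is only an illustration; the function name `split_fused` is hypothetical, and the sketch assumes every value has exactly three digits after the decimal comma.

```python
import re


def split_fused(line):
    """Split a line of fused values such as '0,0530,056' into '0,053 0,056'."""
    # One value = integer digits, a comma, then exactly three decimal digits.
    values = re.findall(r"\d+,\d{3}", line)
    return " ".join(values)


for row in ["0,0530,0560,073", "90,15090,15290,157"]:
    print(split_fused(row))
# prints:
# 0,053 0,056 0,073
# 90,150 90,152 90,157
```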
If you have access to a GNU / *NIX command line, I would suggest using sed to preprocess your data before feeding it into MATLAB. In this case the command would be: sed 's/,[0-9]\{3\}/& /g'
$ echo "90,15090,15290,157" | sed 's/,[0-9]\{3\}/& /g'
90,150 90,152 90,157
$ echo "0,0530,0560,0730,356" | sed 's/,[0-9]\{3\}/& /g'
0,053 0,056 0,073 0,356
You can also easily change the decimal commas to decimal points:
$ echo "0,053 0,056 0,073 0,356" | sed 's/,/./g'
0.053 0.056 0.073 0.356
I have files with entries of the form:
$$
y = x^2
$$
I'm looking for a way (specifically using sed) to convert them to:
\begin{equation}
y = x^2
\end{equation}
The solution should not rely on the form of the equation (which may also span multiple lines) nor on the text preceding the opening $$ or following the closing $$.
Thanks for the help.
sed '
/^\$\$$/ {
x
s/begin/&/
t use_end_tag
s/^.*$/\\begin{equation}/
h
b
: use_end_tag
s/^.*$/\\end{equation}/
h
}
'
Explanation:
sed maintains two buffers: the pattern space (pspace) and the hold space (hspace). It operates in cycles, where during each cycle it reads a line and executes the script for that line. pspace is usually auto-printed at the end of each cycle (unless the -n option is used), and then deleted before the next cycle. hspace holds its contents between cycles.
The idea of the script is that whenever $$ is seen, hspace is first checked to see if it contains the word "begin". If it does, then substitute the end tag; otherwise substitute the begin tag. In either case, store the substituted tag in the hold space so it can be checked next time.
sed '
/^\$\$$/ { # if line contains only $$
x # exchange pspace and hspace
s/begin/&/ # see if "begin" was in hspace
t use_end_tag # if it was, goto use_end_tag
s/^.*$/\\begin{equation}/ # replace pspace with \begin{equation}
h # set hspace to contents of pspace
b # start next cycle after auto-printing
: use_end_tag
s/^.*$/\\end{equation}/ # replace pspace with \end{equation}
h # set hspace to contents of pspace
}
'
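The toggle that the script keeps in the hold space can be sketched in Python with a single boolean. This is an illustration of the idea rather than a replacement for the sed script; the function name `convert_delimiters` is hypothetical, and like the /^\$\$$/ address it only treats lines consisting of exactly $$ as markers.

```python
def convert_delimiters(lines):
    """Replace alternating '$$' lines with \\begin{equation} / \\end{equation}."""
    inside = False  # plays the role of the hold space: are we inside an equation?
    out = []
    for line in lines:
        if line.rstrip("\n") == "$$":
            out.append("\\end{equation}" if inside else "\\begin{equation}")
            inside = not inside  # remember the state for the next marker
        else:
            out.append(line.rstrip("\n"))
    return out


print("\n".join(convert_delimiters(["$$", "y = x^2", "$$"])))
# prints:
# \begin{equation}
# y = x^2
# \end{equation}
```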
This might work for you (GNU sed):
sed -r '1{x;s/^/\\begin{equation}\n\\end{equation}/;x};/\$\$/{g;P;s/(.*)\n(.*)/\2\n\1/;h;d}' file
Prime the hold space with the required strings. On encountering the marker print the first line and then swap the strings in anticipation of the next marker.
I can not help you with sed, but this awk should do:
awk '/\$\$/ && !f {$0="\\begin{equation}";f=1} /\$\$/ && f {$0="\\end{equation}";f=0}1' file
\begin{equation}
y = x^2
\end{equation}
The f=0 is not needed if the pattern is not repeated.
I have the following code, and I want to store the entire line that contains the matching expression, but currently I am only able to store the expression itself.
expr='\hello';
fileread = regexp(filetext, expr, 'match');
fid = fopen('data.txt', 'wt');
fprintf(fid, '%s\n',fileread{:});
Suppose my file contains:
Hello,my name is X
X hello
Not this line
My file data.txt stores
hello
hello
instead of the entire line containing the expression.
Desired data.txt:
Hello,my name is X
X hello
What am I doing wrong?
Based on the way you are interacting with the regexp function I will assume you have all the file text in a single variable. Let's imagine that variable takes the following form:
my name is hello there
Hello,my name is X
X hello
Not this line
For your reference, I've constructed this variable using sprintf
string = sprintf('my name is hello there\nHello,my name is X\n X hello\n Not this line')
You can extract the lines which have hello with the following regexp:
[~,~,~,d] = regexp(string, '.*?[Hh]ello.*?\n')
The results can be retrieved from the cell array with:
>> d{1}
ans =
my name is hello there
>> d{2}
ans =
Hello,my name is X
>> d{3}
ans =
X hello
Note that I used a couple of lazy quantifiers .*?, check out Laziness Instead of Greediness at this link if you would like to learn more: http://www.regular-expressions.info/repeat.html
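For comparison, the same whole-line extraction can be sketched in Python, where "." does not cross line breaks by default and the MULTILINE flag anchors ^ and $ at each line. The function name `lines_containing_hello` is hypothetical; the sample text is the one built with sprintf above.

```python
import re


def lines_containing_hello(text):
    """Return every full line of text that contains 'hello' or 'Hello'."""
    # With re.MULTILINE, ^ and $ match at the start and end of each line.
    return re.findall(r"^.*[Hh]ello.*$", text, flags=re.MULTILINE)


text = "my name is hello there\nHello,my name is X\n X hello\n Not this line"
for line in lines_containing_hello(text):
    print(line)
# prints:
# my name is hello there
# Hello,my name is X
#  X hello
```

Because the pattern is anchored per line, a matching last line is returned even when it has no trailing newline.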
What you're doing wrong is not using the MATLAB regexp function correctly. If you look under "Return Substrings using 'match' Keyword" on this site, you will see that the result you got is what is expected for what your code stated (it returns the parts of the input string that match the regular expression you supplied). I was going to post a suggestion, but someone beat me to it ;-). Good luck.