Okay, so I have a string of data:
data= "DTG: 20191005/0925Z VAAC: LONDON VOLCANO: ASKJA"
and I want to turn it into a dictionary like
{DTG:'20191005/0925Z',VAAC:'LONDON',VOLCANO:'ASKJA'}
I used the split function to split it up into a list, but I can't seem to get any further:
print(data.split(":"))
I need to end up with something like:
{DTG:'20191005/0925Z',VAAC:'LONDON',VOLCANO:'ASKJA'}
Can anyone help?
OK, so if you can do the above, how about data from this file?
https://www.sendspace.com/file/zg0kmh
To begin with, {DTG:'20191005/0925Z',VAAC:'LONDON',VOLCANO:'ASKJA'}
is called a dictionary and not a list.
Now, to create that dictionary, split the string on whitespace rather than on ":". Then zip alternate elements together to make pairs of keys and values, and convert that zip to a dict.
Note that the solution below assumes that each individual word is either a key or a value. If a key or a value contains a space, like hello world, this logic won't work; in that case another way of splitting the string (a different delimiter instead of whitespace, or a regex split) would be more suitable.
data= "DTG: 20191005/0925Z VAAC: LONDON VOLCANO: ASKJA"
#Split string on whitespace
arr = data.split()
#Zip alternate elements together
arr_zip = zip(arr[::2], arr[1::2])
#Convert zip to dict and print it
dct = dict(arr_zip)
print(dct)
The output will be {'DTG:': '20191005/0925Z', 'VAAC:': 'LONDON', 'VOLCANO:': 'ASKJA'}
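The keys in that output keep their trailing colon. A minimal sketch of one way to strip it, assuming every key ends in ":" as in the sample string:

```python
data = "DTG: 20191005/0925Z VAAC: LONDON VOLCANO: ASKJA"

# Split on whitespace, then pair every even-indexed word (key)
# with the odd-indexed word that follows it (value).
arr = data.split()

# rstrip(":") removes the trailing colon from each key.
dct = {key.rstrip(":"): value for key, value in zip(arr[::2], arr[1::2])}
print(dct)
```

Like the zip version above, this still assumes that no key or value contains whitespace.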
Assuming that your data is in exactly this format, you could use the following code:
data = "DTG: 20191005/0925Z VAAC: LONDON VOLCANO: ASKJA"
parts = data.split()
output = {
'DTG': parts[1],
'VAAC': parts[3],
'VOLCANO': parts[5],
}
However, this breaks if a value contains spaces (e.g. "NEW YORK"), because the positions of the parts then shift.
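If values may contain spaces, a regex-based split is one option. The sketch below assumes that every key is a run of uppercase letters immediately followed by a colon, and that values never contain such a pattern themselves; parse_fields is a hypothetical helper name and the "NEW YORK" input is made up for illustration.

```python
import re

# Key: uppercase letters followed by ':'.
# Value: everything up to the next "<whitespace>KEY:" or the end of the string.
FIELD = re.compile(r"([A-Z]+):\s*(.*?)(?=\s+[A-Z]+:|$)")

def parse_fields(text):
    """Return a dict of key/value pairs; values may contain spaces."""
    return dict(FIELD.findall(text))

result = parse_fields("DTG: 20191005/0925Z VAAC: NEW YORK VOLCANO: ASKJA")
print(result)
```

The lookahead does not consume the next key, so each match starts cleanly at the following field.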
Using dict.update:
data = "DTG: 20191005/0925Z VAAC: LONDON VOLCANO: ASKJA"
d = dict()
d.update(s.split(':') for s in data.replace(': ', ':').split())
Output:
{'DTG': '20191005/0925Z', 'VAAC': 'LONDON', 'VOLCANO': 'ASKJA'}
data= "DTG: 20191005/0925Z VAAC: LONDON VOLCANO: ASKJA"
data = data.split()
dic = {}
for d in data:
    if ":" in d:
        key = d.replace(":","")
    else:
        dic[key] = d
print(dic)
Output:
{'DTG': '20191005/0925Z', 'VAAC': 'LONDON', 'VOLCANO': 'ASKJA'}
How to sort String ArrayList divided by "," separator?
The list holds strings, and each element is stored as below.
someList[0] = "abc,xxx,1"
someList[1] = "abc,xxx,3"
someList[2] = "abc,xxx,2"
someList[3] = "abc,xxx,5"
someList[4] = "abc,xxx,4"
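The problem is that I want to split each string and rearrange the list based on the last number (1,3,2,5,4 -> 1,2,3,4,5). How could I achieve this? I would really appreciate an answer.
You can try this, which parses the last field as an integer so that multi-digit values also sort correctly:
someList.sort((a, b) => int.parse(a.split(',').last).compareTo(int.parse(b.split(',').last)));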
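For comparison, a small Python sketch of why the last field is better compared as a number than as text. The sample list with a two-digit value is made up for illustration, and it assumes the last field is always an integer.

```python
some_list = ["abc,xxx,1", "abc,xxx,10", "abc,xxx,2"]

# Comparing the last field as text puts "10" before "2".
by_text = sorted(some_list, key=lambda s: s.rsplit(",", 1)[1])

# Converting the last field to int gives the numeric order.
by_number = sorted(some_list, key=lambda s: int(s.rsplit(",", 1)[1]))

print(by_text)
print(by_number)
```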
I have a measurement device, a PCE-VDL, which gives me measurements in the CSV format shown below, and I need to import them into OCTAVE for further investigation.
In particular, I need to import the last 3 columns with the xyz acceleration data.
The file is in CSV format with a semicolon ";" as the delimiter.
I have tried:
A_1 = importdata ("file.csv", ";", 3);
but I received
error: missing_idx(10): out of bound 9
The CSV file looks like this:
#PCE-VDL X - TableView series
#2020.16.11
#Date;Time;Duration [s];t [°C];RH [%];p [mbar];aX [g];aY [g];aZ [g];
2020.28.10;16:16:32:0000;00:000;;;;0,0195;-0,0547;1,0039;
2020.28.10;16:16:32:0052;00:005;;;;0,0898;-0,0273;0,8789;
2020.28.10;16:16:32:0104;00:010;;;;0,0977;-0,0313;0,9336;
2020.28.10;16:16:32:0157;00:015;;;;0,1016;-0,0273;0,9297;
The numbers in the last 3 columns also use a decimal comma rather than a decimal point, so some conversion is probably needed as well.
Thank you very much for any help.
Regards
EDIT: 18.11.2020
Thanks for the help. I have now tried the following:
A_1_str = fileread ("file.csv");
A_1_str_m = strrep (A_1_str, ".", "-");
A_1_str_m = strrep (A_1_str_m, ",", ".");
save "A_1_str_m.csv" A_1_str_m;
A_1 = importdata ("A_1_str_m.csv", ";", 8);
and I still receive the error: file_content(140): out of bound 139
There is probably some problem with the time format in the first columns, which I do not need to read. I just need the last three columns.
After my conversion, the file looks like this:
# Created by Octave 5.1.0, Wed Nov 18 21:40:52 2020 CET <zdenek@ASUS-F5V>
# name: A_1_str_m
# type: sq_string
# elements: 1
# length: 7849
#PCE-VDL X - TableView series
#2020-16-11
#Date;Time;Duration [s];t [°C];RH [%];p [mbar];aX [g];aY [g];aZ [g];
2020-28-10;16:16:32:0000;00:000;;;;0.0195;-0.0547;1.0039;
2020-28-10;16:16:32:0052;00:005;;;;0.0898;-0.0273;0.8789;
2020-28-10;16:16:32:0104;00:010;;;;0.0977;-0.0313;0.9336;
Thanks for support!
You can first read the data with fileread, which stores the data as a string. Then you can manipulate the string like this:
new_string = strrep(string, ",", ".");
strrep replaces all occurrences of a pattern within a string. Afterwards you save this data as a separate file or you overwrite the existing file with the manipulated data. When this is done you proceed as you have tried before.
EDIT: 19.11.2020
To avoid the additional heading lines in the new file, you can save it like this:
fid = fopen("A_1_str_m.csv", "w");
fputs(fid, A_1_str_m);
fclose(fid);
fputs will just write the string to the file.
Then you can read the new file with dlmread.
A1_buf = dlmread("A_1_str_m.csv", ";");
A1_buf = real(A1_buf); # keep the real part of the complex values
A1_buf(1:3, :) = []; # remove the header lines
A1 = A1_buf(:, end-3:end-1); # get only the 3 columns you're looking for
This will give you the three columns you're looking for, but the date and time data will be ignored.
EDIT 20.11.2020
Replaced abs with real, so the sign of the value will be kept.
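For readers who can preprocess the file outside Octave, here is a hedged Python sketch of the same idea: skip the "#" header lines, split on ";", and convert the decimal commas in the aX, aY and aZ fields. The sample text is copied from the question; read_acceleration is a hypothetical helper name, and the column positions (fields 7 to 9) are assumed from the header line.

```python
import csv
import io

sample = """#PCE-VDL X - TableView series
#2020.16.11
#Date;Time;Duration [s];t [°C];RH [%];p [mbar];aX [g];aY [g];aZ [g];
2020.28.10;16:16:32:0000;00:000;;;;0,0195;-0,0547;1,0039;
2020.28.10;16:16:32:0052;00:005;;;;0,0898;-0,0273;0,8789;
"""

def read_acceleration(stream):
    """Return a list of [aX, aY, aZ] floats, one per data row."""
    rows = []
    for record in csv.reader(stream, delimiter=";"):
        # Skip blank lines and the header lines that start with '#'.
        if not record or record[0].startswith("#"):
            continue
        # Fields 6..8 hold aX, aY, aZ; the trailing ';' adds an empty field 9.
        rows.append([float(v.replace(",", ".")) for v in record[6:9]])
    return rows

accel = read_acceleration(io.StringIO(sample))
print(accel)
```

For a real file, the StringIO object would be replaced by an opened file handle; only the three acceleration fields are converted, so the dots in the date column are left alone.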
Use csv2cell from the io package.
I have a string of variable names as shown below:
{'"NORM TIME SEC, SEC, 9999997" "ROD FORCE, LBS, 3000118" "ROD POS, DEG, 3000216" P_ext_chamber_press P_ret_chamber_press "GEAR#1 POS INCH" 388821 Q_valve_gpm P_return 3882992 "COMMAND VOLTAGE VOLT"'}
The double quotes enclose variable names that contain spaces or special characters between the words; a single-word variable name doesn't have any quotes around it. The variables are separated by one space. Some variable names are just numbers.
At the end, I want to create a cell with strings as follows:
{'NORM_TIME_SEC_SEC_9999997','ROD_FORCE_LBF_3000118','ROD_POS_DEG_3000216','P_ext_chamber_press','P_ret_chamber_press','GEAR#1_POS_INCH','3388821','Q_valve_gpm','P_return','3882992','COMMAND_VOLTAGE_VOLT'}
You can use regexp to first split it into groups and then replace all spaces with _
data = {'"abc def ghi" "jkl mno pqr" "stu vwx" yz"'};
% Get each piece within the " "
pieces = regexp(data{1}, '(?<=\"\s*)([A-Za-z0-9]+\s*)*(?=\"\s*)', 'match');
% 'abc def ghi' 'jkl mno pqr'
% Replace any space with _
names = regexprep(pieces, '\s+', '_');
% 'abc_def_ghi' 'jkl_mno_pqr'
Update
Since your last variable isn't surrounded by quotes, you could do something like the following
pieces = strtrim(regexp(data, '[^\"]+(?=\")', 'match'));
pieces = pieces{1};
pieces = pieces(~cellfun(@isempty, pieces));
% Replace spaces with _
regexprep(pieces, '\s+', '_')
I ended up forcing myself to study regular expressions a little, and the following seems to work well for what I'm trying to do:
regexp(str,'(\"[\w\s\,\.\#]+\"|\w+)','match')
Probably not as robust as I want since I'm specifically singling out a certain set of special characters only, but so far, I haven't seen other special characters other than those in the data sets I have.
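A hedged Python sketch of a less character-specific variant of the same idea: match either anything between a pair of double quotes or a run of non-space characters, then collapse commas and whitespace into underscores. The shortened input string and the name to_names are illustrative assumptions, not part of the original data set.

```python
import re

# Either a quoted name (captured without the quotes) or a bare token.
TOKEN = re.compile(r'"([^"]*)"|(\S+)')

def to_names(text):
    """Split into variable names and replace commas/whitespace with '_'."""
    names = []
    for quoted, bare in TOKEN.findall(text):
        name = quoted or bare
        names.append(re.sub(r"[,\s]+", "_", name.strip()))
    return names

line = ('"NORM TIME SEC, SEC, 9999997" "ROD FORCE, LBS, 3000118" '
        'P_ext_chamber_press "GEAR#1 POS INCH" 388821')
names = to_names(line)
print(names)
```

Because the quoted alternative accepts any character except a double quote, other special characters inside the quotes do not need to be listed explicitly.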
str = {'"abc def ghi" "jkl mno pqr" "stu vwx" yz"'};
Then
str_u = strrep(str,' ','_');
[str_q rest] = strtok(str_u,'"');
str_u = rest;
while ~strcmp(rest,'')
    [token rest] = strtok(str_u,'"');
    if ~(strcmp(token,'_')||strcmp(token,''))
        if strcmp(token{1,1}(1),'_')
            token{1,1} = strrep(token{1,1},'_','');
        end
        str_q = [str_q, token];
    end
    str_u = rest;
end
The resultant cell array is str_q, which will give the names of the variables
str_q = 'abc_def_ghi' 'jkl_mno_pqr' 'stu_vwx' 'yz'
I need to split a string into two components. As an example I have the string:
s = 'Hello1_1000_10_1_data'
and I want to split it into the two strings
str1 = 'Hello1_1000_10_1'
and
str2 = '_data'
The important point is that I can't be too sure of the format of the first string; the only thing that is certain is that the 'suffix' to be read into the second string is always '_data'. What is the best way to do this? I looked up the documentation on strtok and regexp, but they do not seem to offer what I want.
If you always know the length of the suffix, you could just use that:
s = 'Hello1_1000_10_1_data'
str1 = s(1:end-5)
Or otherwise:
s = 'Hello1_1000_10_1_data'
suffix = length('_data')
str1 = s(1:end-suffix)
You can use:
s = 'Hello1_1000_10_1_data';
str = regexp(s, '(.*)(_data)', 'tokens'){1};
str{1} %// Hello1_1000_10_1
str{2} %// _data
If _data occurs several times in the file name, this will still work.
You can also use strsplit() as follows:
s = 'Hello1_1000_10_1_data';
suffix = '_data';
str = strsplit(s,suffix);
str1 = str{1}
In addition, you can use strsplit() with multiple delimiters.
You can use strfind():
s = 'Hello1_1000_10_1_data';
suffix = '_data';
i = strfind(s,suffix);
if any(i)
i = i(end);
prefix = s(1:i-1);
end
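The same "cut off a known suffix" logic, sketched in Python for comparison. It assumes the suffix is non-empty and always sits at the very end of the string; split_suffix is a hypothetical helper name.

```python
def split_suffix(name, suffix="_data"):
    """Split name into (prefix, suffix); only the final occurrence is cut."""
    if not name.endswith(suffix):
        raise ValueError(f"{name!r} does not end with {suffix!r}")
    # Slice by length so earlier occurrences of the suffix stay in the prefix.
    return name[:len(name) - len(suffix)], suffix

str1, str2 = split_suffix("Hello1_1000_10_1_data")
print(str1)
print(str2)
```

Slicing by length rather than splitting on the suffix keeps the prefix intact even when "_data" also appears earlier in the name.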
I have to read a text file which contains a list of company codes. The format of the text file is:
[1233A12; 1233B88; 2342Q85; 2266738]
Once I have read the file, is it possible to compare these codes with regular numbers? I have the codes from two different databases: one of them has regular firm numbers (no characters), and the other has characters inside the firm numbers.
Btw the file is big (50+mb).
Edit: I have added an additional number in the example because not all the numbers have a character inside
If you want to compare part of a string with a number, you could do it as follows:
combiString = '1234AB56'
myNumber= 1234
str2num(combiString(1:4))==myNumber
str2num(combiString(7:8))==myNumber
You can achieve this result by using regular expressions. For example, if str = '1233A12' you can write
nums = regexp(str, '(\d+)[A-Z]*(\d+)', 'tokens');
str1 = nums{1}(1);
num1 = str2num(str1{1});
str2 = nums{1}(2);
num2 = str2num(str2{1});
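A hedged Python sketch of the same token idea applied to the whole bracketed list, including codes that contain no letter at all. It assumes each code is digits, optionally followed by letters and more digits; parse_codes and split_code are hypothetical helper names.

```python
import re

# Leading digits, optional letters, optional trailing digits.
CODE = re.compile(r"(\d+)([A-Za-z]*)(\d*)")

def parse_codes(text):
    """Turn '[a; b; c]' into a list of code strings."""
    return [c.strip() for c in text.strip().strip("[]").split(";") if c.strip()]

def split_code(code):
    """Return (leading number, letters, trailing number or None)."""
    match = CODE.fullmatch(code)
    if match is None:
        raise ValueError(f"unexpected code format: {code!r}")
    head, letters, tail = match.groups()
    return int(head), letters, int(tail) if tail else None

codes = parse_codes("[1233A12; 1233B88; 2342Q85; 2266738]")
parts = [split_code(c) for c in codes]
print(parts)
```

The integer parts can then be compared directly with the plain firm numbers from the other database.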