Copying every nth line and duplicating it on its following line - matlab

I am trying to make test files for a project, and I figured I could make a bradycardia test file from an example file of a normal ECG.
To do that, I need to copy every third line and insert the copy as the next line.
For example:
a = [
1
2
3
4
5
6
7
8
9
10]
and I want:
b = [
1
2
3
3
4
5
6
6
7
8
9
9
10]
and so on... but since the file is 6000 characters long, I obviously cannot copy it manually, and I need it to be 9000 characters long. I've tried looking online for how to do this and am having no luck.
Any suggestions?

b=zeros(floor(4/3*length(a)),1);
b(1:4:end)=a(1:3:end);
b(2:4:end)=a(2:3:end);
b(3:4:end)=a(3:3:end);
b(4:4:end)=a(3:3:end);

Another way:
b = a(sort([1:numel(a) 3:3:numel(a)]))

And here is a third, faster and simpler method (it relies on MATLAB's round rounding halves away from zero, so 2.5 becomes 3):
b = a(round(1:0.75:numel(a)))

This only works if length(a) is a multiple of 3, but seems to be faster than the other answers, at least for large vectors:
b = reshape([reshape(a,3,[]); a(3:3:end).'],[],1);
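
For anyone generating the same kind of test data outside MATLAB, here is a minimal Python sketch that produces the same result as the answers above. The function name duplicate_every_nth is made up for illustration. Note that the round(1:0.75:numel(a)) trick does not port directly, because Python's round sends 2.5 to 2 rather than 3, so this sketch uses a plain loop instead.

```python
def duplicate_every_nth(values, n=3):
    """Return a new list in which every n-th element appears twice in a row."""
    result = []
    for position, value in enumerate(values, start=1):
        result.append(value)
        if position % n == 0:
            # Positions 3, 6, 9, ... get an extra copy right after the original.
            result.append(value)
    return result


a = list(range(1, 11))
b = duplicate_every_nth(a, 3)
print(b)
```

Unlike the reshape answer, this does not require the length of the input to be a multiple of 3.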

SPSS Merging Data with duplicate Keys

I am currently attempting to join 2 datasets using SPSS syntax but am struggling as I have duplicate values on the keys. I would like for the joined data to be duplicated for each instance of the key on the source dataset (or other way round as it doesn't matter which is the source).
The datasets are like the following -
Data1 (3rd column placeholder)
batch run date
A 1 1
A 2 1
A 3 1
B 1 1
C 1 1
C 2 1
D 1 1
E 1 1
Data2
batch Value1 Value2
A 1 21
A 2 22
A 3 23
A 4 24
B 5 25
B 6 26
B 7 27
B 8 28
C 9 29
C 10 30
C 11 31
C 12 32
D 13 33
D 14 34
D 15 35
D 16 36
E 17 37
E 18 38
E 19 39
E 20 40
Current attempt
What I have just now is a method where I run CASESTOVARS on Data1 before matching it onto Data2, and then VARSTOCASES to expand it out. This works perfectly with my test data but, unfortunately, it requires that I know exactly how many 'runs' there will be. That will not be known in production; it could be 1 or more.
Is there a method to join these datasets while expanding the joined data into the multiple cases in the source?
I am open to using macros but am not able to utilise Python solutions for this (which would probably be easier!).
Edit: unfortunately, extensions are also not possible for me to use.
CASESTOVARS
/ID = batch .
DATASET ACTIVATE data2 .
MATCH FILES
/FILE = *
/TABLE = data1
/BY batch .
EXECUTE .
VARSTOCASES
/MAKE run FROM BATCH_RUN_ID.1 TO BATCH_RUN_ID.3 .
EXECUTE .
If Python and the dependent extension commands are not available, here's an idea for handling the dynamic list length in the VARSTOCASES phase.
Basically, you create a new dataset with the maximum number of runs possible, append your real dataset to it, and then set VARSTOCASES to go up to that maximum number of runs (blank rows are dropped automatically):
dataset name orig.
data list free/throwthisrow (f1) BATCH_RUN_ID.1 to BATCH_RUN_ID.50 (50F8.2) .
begin data
1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
end data.
add files /file=* /file=orig .
EXECUTE.
select if missing(throwthisrow).
VARSTOCASES
/MAKE run FROM BATCH_RUN_ID.1 TO BATCH_RUN_ID.50 /drop throwthisrow.
EXECUTE .
To complete your present approach you can use the SPSSINC SELECT VARIABLES extension command. You use it to automatically create a list of the variables you want to name in your VARSTOCASES command, so that the syntax adapts itself to the number of runs in the data.
So after CASESTOVARS and MATCH FILES:
spssinc select variables macroname="!from" /properties pattern = "BATCH_RUN_ID".
VARSTOCASES /MAKE run FROM !from .
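
Python is not an option inside SPSS for the asker, but a small standalone sketch can still serve as a reference for what the expanded join should contain when checking the SPSS output. This assumes rows are plain tuples in the column order shown in the question; the function name expand_join and the shortened sample data are made up for illustration.

```python
from collections import defaultdict


def expand_join(data1, data2):
    """Join data2 rows to every data1 row that shares the same batch key."""
    # Collect all (run, date) pairs for each batch in data1.
    runs_by_batch = defaultdict(list)
    for batch, run, date in data1:
        runs_by_batch[batch].append((run, date))

    # Emit one output row per (data2 row, matching run) combination.
    joined = []
    for batch, value1, value2 in data2:
        for run, date in runs_by_batch.get(batch, []):
            joined.append((batch, run, date, value1, value2))
    return joined


data1 = [("A", 1, 1), ("A", 2, 1), ("B", 1, 1)]
data2 = [("A", 1, 21), ("A", 2, 22), ("B", 5, 25)]
rows = expand_join(data1, data2)
for row in rows:
    print(row)
```

The number of runs per batch never has to be known in advance here, which is the property the VARSTOCASES workaround above is trying to reproduce.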

Scalding: Create list from column in Pipe

I need to take a pipe that has a column of labels with associated values, and pivot that pipe so that there is a column for each label with the correct values in each column. So for example, if I have this:
Id Label Value
1 Red 5
1 Blue 6
2 Red 7
2 Blue 8
3 Red 9
3 Blue 10
I need to turn it into this:
ID Red Blue
1 5 6
2 7 8
3 9 10
I know how to do this using the pivot command, but I have to explicitly know the values of the labels. How can I dynamically read the labels from the "label" column into a list that I can then pass into the pivot command? I have tried to create a list with:
pipe.groupBy('id) {_.toList('label) }
, but I get a type mismatch saying it found a symbol but is expecting (cascading.tuple.Fields, cascading.tuple.Fields). Also, from reading online, it sounds like using toList is frowned upon. The number of things in 'label is finite and not that big (30-50 items maybe), but may be different depending on what sample of data I am working with.
Any suggestions you have would be great. Thanks very much!
I think you're on the right track, you just need to map the desired values to Symbols:
val newHeaders = lines
.map(_.split(" "))
.map(a=>a(1))
.distinct
.map(f=>Symbol(f))
.toList
The Execution type will help you to combine with the subsequent pivot, for performance reasons.
Note that I'm using a TypedPipe for the lines variable.
If you want your code to be super-concise, you could combine lines 1 & 2, but it's just a stylistic choice:
map(_.split(" ")(1))
Try using Execution to get the list of values from the data. More info on executions: https://github.com/twitter/scalding/wiki/Calling-Scalding-from-inside-your-application
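
To make the two-pass idea concrete (first collect the distinct labels, then pivot using that list), here is a small sketch in plain Python rather than Scalding. It only illustrates the logic on in-memory rows; the function name pivot and the tuple layout are assumptions for the example, not part of the Scalding API.

```python
def pivot(rows):
    """Pivot (id, label, value) rows into one row per id with a column per label."""
    # Pass 1: collect distinct labels in order of first appearance.
    labels = []
    for _, label, _ in rows:
        if label not in labels:
            labels.append(label)

    # Pass 2: group values by id, keyed by label.
    by_id = {}
    for row_id, label, value in rows:
        by_id.setdefault(row_id, {})[label] = value

    header = ["ID"] + labels
    # Missing label/id combinations become None.
    table = [[row_id] + [cells.get(label) for label in labels]
             for row_id, cells in by_id.items()]
    return header, table


rows = [(1, "Red", 5), (1, "Blue", 6), (2, "Red", 7),
        (2, "Blue", 8), (3, "Red", 9), (3, "Blue", 10)]
header, table = pivot(rows)
print(header)
for line in table:
    print(line)
```

With 30 to 50 labels the first pass is cheap; in Scalding that first pass is what the Execution is used for, so that the label list is available before the pivot is built.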

Convert Matlab Console output to new expression

In order to debug a very complex set of functions, I want to isolate a subfunction from the workspace so that I can run different tests. Therefore I need selected values from the function workspace to be defined already. By setting a breakpoint at the specific position I can "look" into the current workspace by displaying the values in the console, like the variable HF33
HF33 =
1.0777 0.0865 0.0955
-0.1891 0.8110 -0.1889
0.0935 0.0846 1.0755
Is there some function / script that could convert this result to a new Matlab expression that can be pasted somewhere else (for example at the head of a new script), e.g.:
HF33 = [ 1.0777, 0.0865, 0.0955;
-0.1891, 0.8110, -0.1889;
0.0935, 0.0846, 1.0755 ];
With that I could test the subfunction and its behavior by easily changing the given values and seeing what's happening, without having the huge debug workspace running.
Is there some easy function like res2exp(HF33)?
First: Create this function to get the variable name
function out = varname(var)
out = inputname(1);
end
You can print it directly to the console:
fprintf('%s =%s\n',varname(varToSave),mat2str(varToSave));
Or use fopen and fprintf to write it to a file:
fop = fopen('filename','w');
fprintf(fop,'%s = %s' ,varname(varToSave),mat2str(varToSave));
fclose(fop);
I think this will help you
It might be a function like mat2str() you are looking for but it will not give exactly the printout you are asking for. Here is an example of how it could be used:
>> A = magic(4)
A =
16 2 3 13
5 11 10 8
9 7 6 12
4 14 15 1
>> B = mat2str(A)
B =
[16 2 3 13;5 11 10 8;9 7 6 12;4 14 15 1]
And if you want the output to be totally copy/paste-able you could use:
disp(['C = ',mat2str(A)])
C = [16 2 3 13;5 11 10 8;9 7 6 12;4 14 15 1]
I made this up just now. It is not formatted beautifully, but it achieves what you are trying to do - if I understand you correctly.
a = [ 2 3 4 5
4 5 5 6
3 4 5 6];
fprintf('\nb = [\n\n');
disp(a);
fprintf(']\n\n');
Copy and paste this and see if it does what you want. It's also very simple code, so you could modify it if the spacing and newline characters aren't where you want them.
You could also make a small function out of this if you wanted to.
If you want me to make a function of it, let me know... I can do it tomorrow. But you can probably figure it out.
Ehh, I just made the function. It didn't take long.
function reprint_matrix(matrix)
var_name = inputname(1);
fprintf('\n%s = [\n\n', var_name);
disp(matrix);
fprintf(']\n\n');
end
I'm not sure what you are looking for, but I think this will help you:
http://www.mathworks.com/matlabcentral/fileexchange/24447-generate-m-file-code-for-any-matlab-variable/content/examples/html/gencode_example.html
Did not use it because I use mat-files to transfer data.
You can combine it with the clipboard function:
clipboard('copy',gencode(ans))
Though there are several ways to write variables to text, saving variables as text is definitely bad practice if it can be avoided. Hence, the best advice I can give you is to solve your problem in a different way.
Suppose you want to use HF33 in your subfunction, then here is what I would recommend:
First of all, save your variable of interest:
save HF33 HF33
Then when you are in the function where you want to use this variable:
load HF33
This assumes that your working directory (not workspace) is the same in both cases, but otherwise you can simply add the path in your save or load command. If you want to display it you can now simply call the variable HF33 without a semicolon (this is probably the only safe way to display it exactly the way you expect in all cases).
Note that this method can easily be adapted to transfer multiple variables at once.

How to skip a column using txt2mat

I'm trying to import some csv files in MATLAB, but csvread is too slow.
I'm using txt2mat, but I don't know how to skip the first column in the import.
This is what I'm trying:
myimportedfile = txt2mat(myfile,'ReadMode','block',1) % I'm skipping the headers too.
The reason I need to skip it is because the first column is non-numerical data.
Is there a way to do this with txt2mat, or is there a better way?
Thanks in advance.
textscan gives you the ability to skip columns. It reads in data using an fprintf-like format string.
Example file:
Val1 Val2 Val3
1 2 3
4 5 6
7 8 9
Code:
fid = fopen('example.txt');   % textscan needs a file identifier, not a file name
tmp = textscan(fid, '%d %*d %d', 'HeaderLines', 1) % the * marks fields to skip
fclose(fid);
tmp{:}

How to extract certain columns from a big Notepad text file?

I have a big text file and the data in it are in 5 columns, but I need just the first and the last column of that.
It will take many days, and I will probably make mistakes, if I copy the data of these two columns one by one into another file.
Is there a fast way to do this?
For example:
1 1.0000000000000000 0.0000000000 S {0}
2 1.5000000000000000 0.3010299957 C {2}
3 1.7500000000000000 0.6020599913 S {0,2}
4 2.0000000000000000 0.7781512504 C {3}
5 2.3333333333333333 1.0791812460 C {3,2}
6 2.5000000000000000 1.3802112417 S {3,0,2}
7 2.5277777777777778 1.5563025008 S {0,3}
8 2.5833333333333333 1.6812412374 S {3,0,0,2}
9 2.8000000000000000 1.7781512504 C {5,2}
10 3.0000000000000000 2.0791812460 C {5,0,2}
I need the first column (numbering) and the last inside { }.
ALT + Left Mouse Click puts you in Column Mode Select. It's quite a useful shortcut that may help you.
In Notepad++, you can use a regular expression to do the replacement.
The find and replace patterns are:
^( +\d+).+\{([\d,]+)\}$
\1 \2
This changes:
1 1.0000000000000000 0.0000000000 S {0}
2 1.5000000000000000 0.3010299957 C {2}
3 1.7500000000000000 0.6020599913 S {0,2}
4 2.0000000000000000 0.7781512504 C {3}
5 2.3333333333333333 1.0791812460 C {3,2}
6 2.5000000000000000 1.3802112417 S {3,0,2}
7 2.5277777777777778 1.5563025008 S {0,3}
8 2.5833333333333333 1.6812412374 S {3,0,0,2}
9 2.8000000000000000 1.7781512504 C {5,2}
10 3.0000000000000000 2.0791812460 C {5,0,2}
to:
1 0
2 2
3 0,2
4 3
5 3,2
6 3,0,2
7 0,3
8 3,0,0,2
9 5,2
10 5,0,2
If you do not want the leading space, move it outside the first group:
^ +(\d+).+\{([\d,]+)\}$
\1 \2
will change to:
1 0
2 2
3 0,2
4 3
5 3,2
6 3,0,2
7 0,3
8 3,0,0,2
9 5,2
10 5,0,2
You could use awk or gawk, which is also available on Windows. Use gawk "{print $1,$5}" inpfile > outfile. I copied your data into a file named 'one'; the output below consists of the 1st and 5th columns of your file.
>gawk "{print $1, $5}" one
1 {0}
2 {2}
3 {0,2}
4 {3}
5 {3,2}
6 {3,0,2}
7 {0,3}
8 {3,0,0,2}
9 {5,2}
10 {5,0,2}
You can import it into Excel and manipulate it there.
If you are using .NET, FileHelpers may save you a lot of time. From your post we can't tell what technology you are hoping to use to accomplish this.
Ultraedit has a tool for selecting columns and opens large files (I tried a 900 Mb file on a 2008 desktop and it opened in 3 minutes). I think it has a demo version fully operational.
Excel could work if you do not have too many rows.
Cheers,
One more way is to copy the data into an MS Word file.
Then use
{Alt + left mouse click}
Then you can drag on the selected column and you can see only a single column is selected.
Copy and paste wherever you want.
There is only one way to convolve ungodly amounts of data. That is with the command prompt.
$ cat text.txt | sed 's/{.*,//;s/  */ /g;s/[{}]//g' | awk '{print $1","$5}' > clean_text.csv
This 15 second fix is not available in Windows OS. It will take you less time to download and install Linux on that old dead computer in your closet than it will to get your data in and out of Excel.
Happy coding!
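
If a scripting language is available, the same first-and-last-column extraction can be sketched in a few lines of Python that behave the same on Windows and Linux. This assumes, as in the sample data, that columns are separated by whitespace and that the braces contain no spaces; the function name first_and_last is made up for illustration.

```python
def first_and_last(line):
    """Return the first field and the last field with its braces removed."""
    fields = line.split()  # splits on any run of whitespace
    return fields[0], fields[-1].strip("{}")


sample = """  1 1.0000000000000000 0.0000000000 S {0}
  3 1.7500000000000000 0.6020599913 S {0,2}
 10 3.0000000000000000 2.0791812460 C {5,0,2}"""

for line in sample.splitlines():
    if line.strip():  # skip blank lines
        number, members = first_and_last(line)
        print(number, members)
```

For a real file, replace the sample string with a loop over open(path) and write each pair to an output file instead of printing it.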