I want to save the cluster outputs to files. For example, I want to save the points of cluster 1 into c1.txt, the points of cluster 2 into c2.txt, and so on.
I am using ELKI release 0.7:
java -jar elki.jar -dbc.in ./f1 -dbc.out ./dir1 -algorithm clustering.DBSCAN -dbscan.epsilon 5 -dbscan.minpts 10
but it fails with this error:
the following parameter were not processed: [-dbc.out,./dir1]
So the command must be incorrect. How can I save the clusters?
To save to a file, use
-resulthandler ResultWriter
-out folder/
To limit which files are written, you can use -out.filter.
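Putting that together with the command from the question, the full invocation would look roughly like this (a sketch against ELKI 0.7; ResultWriter writes each cluster as a separate text file inside the output folder):
java -jar elki.jar -dbc.in ./f1 -algorithm clustering.DBSCAN -dbscan.epsilon 5 -dbscan.minpts 10 -resulthandler ResultWriter -out ./dir1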
I use Graphite tagged metrics with Grafana and Whisper, but http://graphite/tags/delSeries removes something, yet not the .wsp files.
Also, untagged metrics create .wsp files with human-readable names in the Whisper data folder, while tagged metrics create only hash-named folders and .wsp files in the _tagged directory.
Like so:
/whisper
/data
/Players
registrations.wsp
today_registrations.wsp
/Gaming
playing_count.wsp
/_tagged
/f58
/010
f58010d4cef67599a31f4daaab4a53c4d7fd85a9faea546282d2058c40c7e7b9.wsp
/f56
/031
f56031052aec89dc9cc38e44dbe71b2eb08fb513a3e60d515eb1dc23f5b929d1.wsp
How can I find the .wsp file associated with my tagged metric?
I just ran into that problem as well: how to map the actual metric path/tags to the corresponding hashed .wsp file.
I don't think you can recover the actual metric name from the hash, but you can go the other way around, using Graphite's encoding methods.
I've quickly written a Python script just for lab purposes:
- It takes several metric names on standard input and prints the mapping.
Just log into your Graphite host and create a Python script in /opt/graphite/webapp/graphite/tags:
#!/opt/graphite/bin/python3
import sys

from utils import TaggedSeries

for line in sys.stdin:
    paths = line.split()
    for path in paths:
        # Normalize first
        parsed = TaggedSeries.parse(path)
        print(path + " -> /opt/graphite/storage/whisper/" + TaggedSeries.encode(parsed.path, '/', True) + ".wsp")
You can then pipe a list of metrics:
# echo "users.count;server=s1" |python mapper.py
users.count;server=s1 -> /opt/graphite/storage/whisper/_tagged/b6c/c91/b6cc916d608e4b145b318669606e79118cc41d316f96735dd43621db4fd2bcaf.wsp
You can also get all your tagged metrics and generate a file that you can later cat into the script. In this example I get all metrics associated with the tag 'server':
# curl -s "http://localhost/tags/findSeries?expr=server=~." | sed s/"\", \""/\\n/g > my_metrics
Then cat your metrics:
# cat my_metrics | python mapper.py
That's a starting point. From there you can easily do some simple scripting to delete .wsp files, for example the ones not updated in the last month (see the sketch below).
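As a rough sketch (assuming the default Whisper storage path and GNU find), this lists tagged .wsp files that have not been modified for 30 days; append -delete only once you are happy with the selection:
# find /opt/graphite/storage/whisper/_tagged -name '*.wsp' -mtime +30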
Hi, I would like to have multiple kickstart files that use a central kickstart file for the bulk of the install and a second file for the small differences. I'm building DVDs for distribution.
The first ks file contains a small amount of configuration and has a %include line that points to a common ks file, which should do most of the work.
I'm having trouble with the %include line.
First of all, have I understood what %include is for?
Second, I think I have the syntax wrong, because when I boot I get the following error message:
unable to open input kickstart file: Could not open/read file:///mnt/sysimage/media/dvd/ks/common.cfg
I am installing from a DVD. What is the correct path or syntax for files stored in a subdirectory called /ks/ in the DVD's root?
I have tried the following:
%include /mnt/sysimage/media/dvd/ks/common.cfg
%include cdrom:/ks/common.cfg
Does anyone have any working examples?
Thanks in advance for your support
I eventually found part of the answer:
%include /mnt/stage2/ks/common.cfg
The DVD is mounted as stage2.
However, I now get an error message saying it can't read the %include file.
I can see the file and less it if I hit Ctrl+Alt+F1.
Does anyone have a working simple example of how this should be written?
Open isolinux/isolinux.cfg on the DVD and set the ks file path as below. You can then enter your kickstart option at the boot: prompt of the DVD.
label 1
kernel vmlinuz
append initrd=initrd.img nofb skipddc lang= devfs=nomount ramdisk_size=8192 ks=cdrom:/option1.cfg 1
label 2
kernel vmlinuz
append initrd=initrd.img nofb skipddc lang= devfs=nomount ramdisk_size=8192 ks=cdrom:/option2 2
label 3
kernel vmlinuz
append initrd=initrd.img nofb skipddc lang= devfs=nomount ramdisk_size=8192 ks=cdrom:/option3.cfg 3
Then edit /isolinux/boot.msg and add the details below:
Select installation:
1) option 1
2) option 2
3) option 3
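For the %include part itself, here is a minimal sketch of what a small per-option file such as option1.cfg could look like, assuming (as found above) that the DVD is reachable under /mnt/stage2 at install time:
# option-specific settings go here
lang en_US.UTF-8
keyboard us
# the shared bulk of the configuration, stored in the DVD's /ks directory
%include /mnt/stage2/ks/common.cfg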
I used to check the sizes of layers in an image using the docker history command, but now that shows "missing" instead of layer IDs due to the 1.10 migration to content hashes.
I now retrieve the hashes of all layers in an image through these commands:
docker pull ubuntu
ID=$(docker inspect -f {{.Id}} ubuntu)
sudo jq .rootfs.diff_ids /var/lib/docker/image/aufs/imagedb/content/$(echo $ID|tr ':' '/')
This returns a list of content hashes of all layers in the ubuntu image:
"diff_ids": [
"sha256:2a4049cf895d2384cb93d19f46f0d62560a48b2b202787edad2dc6e4b95a923a",
"sha256:01fbb4b5fa1b76ccdc289de098ea61925c7f8d3364159761720617b096f27bcc",
"sha256:d3492de15d7c87ea9db9ab123214d334f4bcb1e40846b77beebb4c37dd134a45",
"sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef"
],
In /var/lib/docker/image/aufs/layerdb/sha256/ I see information about each layer, such as parent and size, but I noticed that the diff_ids in this folder are not the same as in the above output:
> ls /var/lib/docker/image/aufs/layerdb/sha256/
2088e4744016dbe95308d1920060f1fbc4a095ba5b9517d758745fc3986f2632
2a4049cf895d2384cb93d19f46f0d62560a48b2b202787edad2dc6e4b95a923a
8c63d05abe660a2f3f04d754de3ee3d927a17b3623a8e2be6d727e697f4b1e10
f747ac597de13b7f1ff918874f80bb83004232d7d6d4d45ad8890b58cdc79adc
I then tried inspecting another folder such as /var/lib/docker/aufs/layers:
> ls /var/lib/docker/aufs/layers
58e7ed1f6d4ba047c9c714e66f10c014008ef4aa133d334198b8b1b7673f16e7
c4dd5a81188e36457624849aaeea74d98ef571390db75d4a03efb5bccb8c04e3
d31f918b7f59fcf768a9ae609141152cd5ae63943aac042429e3d2e04d472bcc
e576c6d41b96bd6a47233a6c6ec2f586021aa945aae6bd0e73ab9d4ad051a94e
As you can see, these are four other content hashes again. Can someone tell me how all these hashes are connected, and how I can find the size of each layer of the Ubuntu image? I'd like to match each diff_id from the first output with a size, but I don't know how the diff_ids in these different folders are related.
EDIT: I solved it like this - /var/lib/docker/image/aufs/layerdb/sha256/ also contains a file called "diff" which contains the diff_id corresponding to the output of the first command. I used this output to map the size to the correct diff_id.
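Concretely, a quick sketch of that lookup (assuming the aufs layout from this question; the size file sits next to diff in each layerdb entry, and reading it needs root):
for dir in /var/lib/docker/image/aufs/layerdb/sha256/*; do
  echo "$(cat "$dir/diff")  $(cat "$dir/size") bytes"
done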
I am using DBSCAN for clustering points. As I have more than 1 million points, I also use an R*-tree.
I use ELKI from the command line:
java -cp elki.jar
de.lmu.ifi.dbs.elki.application.KDDCLIApplication
-db.index tree.spatial.rstarvariants.rstar.RStarTreeFactory
-algorithm clustering.DBSCAN
-dbc.in points1.txt
-dbscan.epsilon 20
-dbscan.minpts 10
-out results3/DBSCANeps20min10
For small files it is OK, but for 4 million points this error occurred:
at de.lmu.ifi.dbs.elki.database.ids.integer.DoubleIntegerArrayQuickSort.quickSort(Unknown Source)
This is a known bug in an old version of ELKI, when there are many duplicate distances.
It can be resolved by updating to a current version.
Hi everyone!
I'm trying to parallelize an algorithm that uses mex files from mexopencv (KNearest.m, KNearest_.mexw32).
The program is based on vlfeat (vl_sift.mexw32) + mexopencv (KNearest.m and KNearest_.mexw32). I classify descriptors obtained from images.
All the code is located on the fileshare:
\\LAB-07\untitled\DISTRIB\ (this is the program code)
\\LAB-07\untitled\+cv (mexopencv)
When I run the program with matlabpool closed, everything works well.
Then I open matlabpool (2 computers with 2 cores each, so ultimately 4 workers, but for testing I currently use only 2 workers on one computer) and run the program.
PathDependencies point to the fileshare: \\LAB-07\untitled\DISTRIB\, \\LAB-07\untitled\+cv
Before the parfor loop I train the classifier on the local machine:
classifiers = cv.KNearest
classifiers.train(Descriptors',Labels','MaxK',1)
Then I run the parfor loop:
descr=vlsift(img);
PredictClasses = classifiers.predict(descr');
Error
Error in ==> KNearest>KNearest.find_nearest at 173
Invalid MEX-file '\\LAB-07\untitled\+cv\private\KNearest_.mexw32':
The specified module could not be found.
That is, KNearest.m is found, but KNearest_.mexw32 is not. Because KNearest_.mexw32 is located in the private folder, I changed the code of KNearest.m (everywhere it calls KNearest_() I changed it to cv.KNearest_(), for example this.id = cv.KNearest_()) and placed KNearest.m in the same folder as KNearest_.mexw32. As a result, I get the same error.
Immediately after matlabpool open, I search for the files on the workers:
pctRunOnAll which ('KNearest.m')
'KNearest.m' not found.
'KNearest.m' not found.
'KNearest.m' not found.
pctRunOnAll which ('KNearest_.mexw32')
'KNearest_.mexw32' not found.
'KNearest_.mexw32' not found.
'KNearest_.mexw32' not found.
After cd \\LAB-07\untitled\+cv:
pctRunOnAll which ('KNearest.m')
\\LAB-07\untitled\+cv\KNearest.m
\\LAB-07\untitled\+cv\KNearest.m % cv.KNearest constructor
\\LAB-07\untitled\+cv\KNearest.m
>> pctRunOnAll which ('KNearest_.mexw32')
\\LAB-07\untitled\+cv\KNearest_.mexw32
\\LAB-07\untitled\+cv\KNearest_.mexw32
\\LAB-07\untitled\+cv\KNearest_.mexw32
I also ran it with FileDependencies, but got the same result.
I don't know whether this is related or not, but I display classifiers during the execution of the program.
After training and before parfor:
classifiers =
cv.KNearest handle
Package: cv
Properties:
id: 5
MaxK: 1
VarCount: 128
SampleCount: 9162
IsRegression: 0
Methods, Events, Superclasses
Within parfor, before classifiers.predict:
classifiers =
cv.KNearest handle
Package: cv
Properties:
id: 5
I also tested the file cvtColor.mexw32. I left only two files in the folder, cvtColor.mexw32 and vl_sift:
parfor i=1:2
im1=imread('Copy_of_start40.png');
im_vl = im2single(rgb2gray(im1));
desc=vl_sift(im_vl);
im1 = cvtColor(im1,'RGB2GRAY');
end
The same error; vl_sift works, but cvtColor does not...
If the worker machines can see the code on your shared filesystem, you should not need FileDependencies or PathDependencies at all. It looks like you're using Windows. The most likely problem seems to be file permissions: MDCS workers running under a jobmanager on Windows by default do not run using your own account (they run using the "LocalSystem" account, I think), and so may well simply not have access to files on a shared filesystem. You could try making sure your code is world-readable.
Otherwise, you can add the files to the pool by using something like
matlabpool('addfiledependencies', {'\\LAB-07\untitled\+cv'})
Note that MATLAB interprets directories with a + in the name as defining "packages"; I'm not sure whether this is intentional in your case.
EDIT
Ah, re-reading your original post and your comments below, I suspect the problem is that the workers cannot see the libraries on which your MEX file depends. (That's what the "Invalid MEX-file" message indicates.) You could use http://www.dependencywalker.com/ to work out what the dependencies of your MEX file are and make sure they're available on the workers (I think they need to be on %PATH%, or in the current directory).
Thanks, Edric. The problem was with the PATH for parfor. Using http://www.dependencywalker.com/ I found the missing files and put them in the +cv folder. Only this method works in parfor.
But predict in parfor gives an error:
PredictClasses = classifiers.predict(descr');
??? Error using ==> parallel_function at 598
Error in ==> KNearest>KNearest.find_nearest at 173
Unexpected Standard exception from MEX file.
What() is:..\..\..\src\opencv\modules\ml\src\knearest.cpp:365: error: (-2) The search
tree must be constructed first using train method
I solved this problem by calling train each time within parfor:
classifiers = cv.KNearest
classifiers.train(Descriptors',Labels','MaxK',1)
But it's an ugly solution :)