How do I convert map coordinates into meters for DBSCAN clustering?

I am trying to convert my map coordinates into meters so that I can do DBSCAN clustering with epsilon expressed as a metric distance. Below are some examples of the coordinates I am trying to convert, in [x, y, z] form. Please advise whether it is possible to convert them in the first place.
[1.1571329674332413, 103.91422107258136, 23.09],
[1.157132986240529, 103.91419597847637, 23.09],
[1.15718433426756, 103.91419602755067, 23.09],
[1.157184592013235, 103.91391175497597, 23.09],
[1.156977936505872, 103.91391155596845, 23.09],
[1.1569776750953964, 103.91419753027454, 23.09],
[1.157021604617708, 103.91419757105216, 23.09],
[1.1570215786242606, 103.91422096789638, 23.09],
[1.1571329674332413, 103.91422107258136, 23.09],
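For reference, points this close together (a few hundred meters apart) can be converted with a local equirectangular approximation: pick a reference point, take the offsets in degrees, and scale to meters. A minimal Python sketch, assuming the first value is latitude and the second longitude (which matches the magnitudes of this Singapore-area data); the per-degree factors are standard approximations:

```python
import math

# [x, y, z] samples from the question; the magnitudes suggest
# x = latitude, y = longitude (Singapore area), z = altitude.
points = [
    [1.1571329674332413, 103.91422107258136, 23.09],
    [1.157184592013235, 103.91391175497597, 23.09],
    [1.156977936505872, 103.91391155596845, 23.09],
]

# Local equirectangular approximation, valid for small extents:
# ~110,574 m per degree of latitude, and ~111,320 * cos(lat) m
# per degree of longitude.
lat0, lon0 = points[0][0], points[0][1]
m_per_deg_lat = 110_574.0
m_per_deg_lon = 111_320.0 * math.cos(math.radians(lat0))

# (east, north) offsets in meters from the first point
xy_m = [((p[1] - lon0) * m_per_deg_lon,
         (p[0] - lat0) * m_per_deg_lat) for p in points]
```

Once the coordinates are in meters, DBSCAN's epsilon is a plain metric distance (e.g. eps = 10 for 10 m neighborhoods).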

Related

Errors converting lat and long values to sf coordinates

I have a series of coordinates that I am trying to plot on simple maps. I have two columns containing latitude and longitude for each point, respectively. I am attempting to use the st_as_sf function to convert this data into sf points but I keep getting numerous errors, most recently
"Error in UseMethod("st_as_sf") :
no applicable method for 'st_as_sf' applied to an object of class "c('double', 'numeric')"
I have tried changing the class of the data to numeric, double, integer, etc. and I continue to get this error.
Below is an example of my data
point lat long
1 38.254 -76.712
2 38.123 -76.710
3 38.438 -76.699
4 38.254 -76.712
5 38.232 -76.733
st_as_sf(coords=c(lat, long), crs=4326)
You need to work on the arguments of the sf::st_as_sf() function: the first argument must be your data object, most likely a data frame with columns named lat and long.
The second argument is coords, which should be a character vector of column names contained in your data object - so the names "lat" and "long" are expected in quotation marks (these are not R variables, but column names).
The third argument is a coordinate reference system to make sense of the coordinates provided in the second step; for WGS84 this is 4326.
So consider this piece of code; it is formally correct, but it places your points somewhere in Antarctica (should you flip the coordinates to long-lat, the points would be placed near Washington, DC).
library(sf)
raw_pts <- data.frame(point = 1:5,
                      lat = c(38.254, 38.123, 38.438, 38.254, 38.232),
                      long = c(-76.712, -76.710, -76.699, -76.712, -76.733))
points <- st_as_sf(raw_pts,                   # first argument = data frame with coordinates
                   coords = c("lat", "long"), # names of columns, in quotation marks
                   crs = 4326)                # coordinate reference system to make sense of the numbers
points
# Simple feature collection with 5 features and 1 field
# Geometry type: POINT
# Dimension: XY
# Bounding box: xmin: -76.733 ymin: 38.123 xmax: -76.699 ymax: 38.438
# Geodetic CRS: WGS 84
# point geometry
# 1 1 POINT (38.254 -76.712)
# 2 2 POINT (38.123 -76.71)
# 3 3 POINT (38.438 -76.699)
# 4 4 POINT (38.254 -76.712)
# 5 5 POINT (38.232 -76.733)

I have vibration data (g) in the x and y directions for a ball bearing, in 2 columns. Is there a way to find the Manhattan distance with just this data and time?

I have frequency data in the x and y directions for a ball bearing, and of course the absolute time in another column. Is there a way to find the Manhattan distance with just the frequency and absolute time? Can anyone guide me?
For example the given file is like below
3.54393190998923540E+9 -6.80819749832153320E-2 -1.33635997772216800E-2
3.54393190998923540E+9 -6.80819749832153320E-2 -1.33635997772216800E-2
3.54393190998923540E+9 -6.80819749832153320E-2 -1.33635997772216800E-2
3.54393190998923540E+9 -6.80819749832153320E-2 -1.33635997772216800E-2
Here, the first column is time, and the 2nd and 3rd columns are frequency data in x and y. How do we find the Manhattan distance here?
You can read all of the data in at once using dlmread and then access each of the columns individually:
M = dlmread('datafile.txt');
dlmread will figure out the delimiter and give you the correct number of columns in M:
M =
3.5439e+09 -6.8082e-02 -1.3364e-02
3.5439e+09 -6.8082e-02 -1.3364e-02
3.5439e+09 -6.8082e-02 -1.3364e-02
3.5439e+09 -6.8082e-02 -1.3364e-02
Now you can access column 2, for example, like so:
>> M(:,2)
ans =
-0.068082
-0.068082
-0.068082
-0.068082
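From there, the Manhattan (L1) distance between two samples is just the sum of the absolute coordinate differences. A minimal sketch in Python rather than MATLAB (the second sample q is made up for illustration, since the question's example rows are all identical):

```python
def manhattan(p, q):
    """L1 distance: sum of absolute per-coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

# One (x, y) vibration sample shaped like the question's data...
p = (-6.80819749832153320e-2, -1.33635997772216800e-2)
# ...and a second, hypothetical sample for illustration.
q = (-6.50000000000000000e-2, -1.00000000000000000e-2)

d = manhattan(p, q)
```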
Alternatively, you can preprocess the file with Perl, replacing the spaces with commas and prefixing each line n with an assignment a<n>=:
perl -i -pe 's/ /,/g;s/^/a$.=/' junk.txt

MATLAB phased.URA.step: where is my phase shift?

I want to get the sensor array's complex response ratios for a given angle. I tried to use phased.URA() and its step(freq, angle) method, but it gives me equal numbers for each element at any angle. Where is my phase shift, dudes?
Here is a small example:
ant_array = phased.URA();
disp(ant_array.step(3e8, [45; 0]));
It gives me this result:
1
1
1
1
Can anyone tell me what this means and how to use the angle parameter correctly?
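This is expected from step() alone: with the default phased.URA, the elements are isotropic, and step(freq, angle) returns each element's magnitude response at that frequency and angle - unity for every element - not the inter-element phase shifts. Those come from a steering vector (in MATLAB, a phased.SteeringVector object constructed from the array). The underlying math is easy to check by hand; here is a minimal Python sketch of a 2x2 URA steering vector (the element layout, spacing, and axis conventions are illustrative assumptions, not taken from the toolbox):

```python
import cmath
import math

def ura_steering(freq_hz, az_deg, el_deg, nx=2, ny=2, spacing=0.5, c=3e8):
    """Steering vector for an nx-by-ny URA with given element spacing (m).

    Element k gets phase exp(-1j * 2*pi/lambda * (r_k . u)), where r_k is
    the element position and u the unit direction of arrival.
    """
    lam = c / freq_hz
    k = 2.0 * math.pi / lam
    az, el = math.radians(az_deg), math.radians(el_deg)
    # Unit direction-of-arrival vector (x toward az=0, z up)
    u = (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))
    sv = []
    for iy in range(ny):
        for ix in range(nx):
            # Elements laid out in the y-z plane (an assumed convention)
            r = (0.0, ix * spacing, iy * spacing)
            phase = k * (r[0] * u[0] + r[1] * u[1] + r[2] * u[2])
            sv.append(cmath.exp(-1j * phase))
    return sv

# Same inputs as the question: 3e8 Hz (lambda = 1 m), az 45 deg, el 0 deg
sv = ura_steering(3e8, 45, 0)
```

Every entry has magnitude 1 (that is all step() reports for isotropic elements), but the phases differ across elements and change with the angle.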

Collapse/mean data in Matlab with respect to a different set of data

I have two sets of data, but the sets have different sizes.
Each set contains the measurements themselves (MeasA and MeasB, both double) and the time points (TimeA and TimeB, datenum or Julian date) at which the measuring happened.
Now I want to match the smaller data set to the bigger one. To do this, I want to average the data points of the bigger set around the data resp. time points of the smaller set, to finally do some correlation analysis.
Edit:
A small example of how the data would look:
MeasA = [2.7694 -1.3499 3.0349 0.7254 -0.0631];
TimeA = [0.2 0.4 0.7 0.8 1.3];
MeasB = [0.7147 -0.2050 -0.1241 1.4897 1.4090 1.4172 0.6715 -1.2075 0.7172 1.6302];
TimeB = [0.1 0.2 0.3 0.6 0.65 0.68 0.73 0.85 1.2 1.4];
And now I want to collapse MeasB and TimeB so that I get the mean of the measurements close to the time points in TimeA. For example, TimeB should look like this:
TimeB = [mean([0.1 0.2]) mean([0.3 0.6]) mean([0.65 0.68 0.73]) mean([0.85]) mean([1.2 1.4])]
TimeB = [0.15 0.4 0.69 0.85 1.3]
And then collapse MeasB like this too:
MeasB = [mean([0.7147 -0.2050]) mean([-0.1241 1.4897]) mean([1.4090 1.4172 0.6715]) mean([-1.2075]) mean([0.7172 1.6302])];
MeasB = [0.2549 0.6828 1.1659 -1.2075 1.1737]
The function interp1 is your friend.
You can get a new set of measurements for your set B, at the same time points as set A, by using:
newMeasB = interp1( TimeB , MeasB , TimeA ) ;
The first 2 parameters are the original time and measurement vectors of the set you want to re-interpolate; the last parameter is the new x axis (time, in your example) at which you want the interpolated values to be calculated.
This way you do not end up with different time bases between your 2 sets of measurements, and you can compare them point by point.
Check the documentation of interp1 for more explanations and for options about the interpolation or any potential extrapolation.
edit:
Matlab doc used to have a great illustration of the function but I can't find it online so here goes:
So with the linear method, if the value is interpolated exactly halfway between 2 points, the function returns the exact mean. If the interpolation is done closer to one point than the other, the returned value is proportionally closer to the value of the closest point.
NaNs can appear at the edges (beginning or end of the returned vector) if TimeA was not completely overlapped by TimeB: the function cannot "interpolate" because there is no anchor point on one side. However, the options of interp1 allow you to "extrapolate" outside the input range, or to assign another default value instead of the NaNs.
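To make the mechanics concrete, here is the question's data run through the same linear interpolation, sketched in pure Python since interp1 is MATLAB-specific (the NaN-outside-range behavior mirrors interp1's default):

```python
from bisect import bisect_right

def lin_interp(x_new, xs, ys):
    """Piecewise-linear interpolation, like interp1(xs, ys, x_new, 'linear')."""
    out = []
    for x in x_new:
        i = bisect_right(xs, x)
        if i == 0 or i == len(xs):
            out.append(float('nan'))  # outside the input range: interp1 returns NaN
            continue
        x0, x1 = xs[i - 1], xs[i]
        y0, y1 = ys[i - 1], ys[i]
        out.append(y0 + (y1 - y0) * (x - x0) / (x1 - x0))
    return out

# Data from the question
TimeB = [0.1, 0.2, 0.3, 0.6, 0.65, 0.68, 0.73, 0.85, 1.2, 1.4]
MeasB = [0.7147, -0.2050, -0.1241, 1.4897, 1.4090,
         1.4172, 0.6715, -1.2075, 0.7172, 1.6302]
TimeA = [0.2, 0.4, 0.7, 0.8, 1.3]

newMeasB = lin_interp(TimeA, TimeB, MeasB)
```

At TimeA = 0.2 the result is exactly MeasB's value there (-0.2050), and at TimeA = 1.3, halfway between 1.2 and 1.4, it is the exact mean of the two neighbors.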

Using a Geo Distance Function on ELKI

I am using ELKI to mine some geospatial data (lat,long pairs) and I am quite concerned about using the right data types and algorithms. In the parameterizer of my algorithm, I tried to change the default distance function to a geo function (LngLatDistanceFunction, as I am using x,y data) as below:
params.addParameter (DISTANCE_FUNCTION_ID, geo.LngLatDistanceFunction.class);
However the results are quite surprising: it creates clusters of a single repeated point, such as the example below:
(2.17199922, 41.38190043, NaN), (2.17199922, 41.38190043, NaN), (2.17199922, 41.38190043, NaN), (2.17199922, 41.38190043, NaN), (2.17199922, 41.38190043, NaN), (2.17199922, 41.38190043, NaN), (2.17199922, 41.38190043, NaN), (2.17199922, 41.38190043, NaN), (2.17199922, 41.38190043, NaN), (2.17199922, 41.38190043, NaN)]
This is an image of this example.
Whereas if I use a non-geo distance (for instance Manhattan):
params.addParameter (DISTANCE_FUNCTION_ID, geo.minkowski.ManhattanDistanceFunction.class);
the output is much more reasonable.
I wonder if there is something wrong with my code.
I am running the algorithm directly on the db, like this:
Clustering<Model> result = dbscan.run(db);
And then iterating over the results in a loop, while I construct the convex hulls:
for (de.lmu.ifi.dbs.elki.data.Cluster<?> cl : result.getAllClusters()) {
    if (!cl.isNoise()) {
        Coordinate[] ptList = new Coordinate[cl.size()];
        int ct = 0;
        for (DBIDIter iter = cl.getIDs().iter(); iter.valid(); iter.advance()) {
            ptList[ct] = dataMap.get(DBIDUtil.toString(iter));
            ++ct;
        }
        GeoPolygon poly = getBoundaryFromCoordinates(ptList);
        // compare strings with equals(), not ==
        if ("Polygon".equals(poly.getCoordinates().getGeometryType())) {
            out.write(poly.coordinates.toText() + "\n");
        }
    }
}
To map each ID to a point, I use a hashmap that I initialized when reading the database.
The reason why I am adding this code is that I suspect I may be doing something wrong with the structures that I am passing to / reading from the algorithm.
I thank you in advance for any comments that could help me solve this. I find ELKI a very efficient and sophisticated library, but I have trouble finding examples that illustrate simple cases like mine.
What is your epsilon value?
Geographic distance is in meters in ELKI (if I recall correctly); Manhattan distance would be in latitude + longitude degrees. For obvious reasons, these live on very different scales, and therefore you need to choose a different epsilon value.
In your previous questions, you used epsilon=0.008. For geodetic distance, 0.008 meters = 8 millimeters.
At epsilon = 8 millimeters, I am not surprised that the clusters you get consist only of duplicated coordinates. Any chance that the above coordinates exist multiple times in your data set?
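The scale mismatch is easy to verify: one degree of latitude is roughly 111 km, so an epsilon of 0.008 in degrees spans almost a kilometer, while 0.008 in meters is 8 mm. A quick haversine check in Python (the coordinates are near the Barcelona-area point in the question; 6371000 m is the usual mean Earth radius):

```python
import math

def haversine_m(lat1, lon1, lat2, lon2, r=6371000.0):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2.0 * r * math.asin(math.sqrt(a))

# Two points 0.008 degrees apart in latitude, near the question's data
d = haversine_m(41.3819, 2.1720, 41.3899, 2.1720)
```

The distance comes out close to 890 m: a degree-based epsilon of 0.008 reaches that far, while a meter-based epsilon of 0.008 only reaches exact (or near-exact) duplicates.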