How to measure distance in crs.simple? - leaflet

I have a non-geographic map (a flat image) using CRS.Simple extended by a custom transformation. Everything works fine so far, but now I want to add a distance measurement button. I'm confident I could implement a distance measurement between two markers myself, but the dynamic line drawing and measuring is still a bit above my skills, so I hoped I could use a plugin. None of the ones I found offered this, though. After looking at the plugins page of Leaflet, I tried this fork https://github.com/aprilandjan/leaflet.measure of leaflet.measure (originally by https://github.com/jtreml/leaflet.measure), as it seemed to offer the ability to add custom units - in my case, pixels.
I added this:
L.control.measure({
    // distance formatter, output map units instead of km
    formatDistance: function (val) {
        return Math.round(1000 * val / scaleFactor) / 1000 + 'mapUnits';
    }
}).addTo(map);
Unfortunately, the result is a number far too big compared to the pixel size of the map (4096x4096). distance() returns the expected 1414.213562373095 between a point at 1000,1000 and one at 2000,2000. Calculating distanceTo, however, returns 8009572.105082839. I use this at the beginning of my file:
var yx = L.latLng;
var xy = function(x, y) {
    if (L.Util.isArray(x)) { // When doing xy([x, y]);
        return yx(x[1], x[0]);
    }
    return yx(y, x); // When doing xy(x, y);
};
If I log val to the console, I get things like this:
20411385.176805027
7118674.47741132
20409736.502863288
7117025.8034695815
20409186.004645467
20409736.502863288
That's likely because latLng.distanceTo() always calculates geographic distance on the Earth, ignoring the map's CRS.
Anyone got an idea how to solve this? I feel like it can't be overly difficult, but I don't know exactly where to start.

I found a way to do it, even though it feels a bit 'hacky':
I replaced the line
var distance = e.latlng.distanceTo(this._lastPoint)
in the _mouseMove and the _mouseClick events of leaflet.measure with
var currentPoint = e.latlng;
var lastPoint = this._lastPoint;
var distance = map.distance(currentPoint, lastPoint);
as the distance() method of the map returns meters - or, in the case of a flat image, pixel values. And those we can translate into whatever unit we want in our flat image.
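Under CRS.Simple, map.distance reduces to a plain Euclidean distance in map units, which is why this workaround gives sensible numbers. A minimal sketch (in Python for illustration; the scale factor is a hypothetical stand-in for the one in my map setup) of what the formatter then computes:

```python
import math

def pixel_distance(p1, p2):
    """Euclidean distance in map units (pixels), as map.distance()
    computes it under CRS.Simple."""
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1])

def format_distance(val, scale_factor=1.0):
    """Mirror of the plugin's formatDistance callback: convert a raw
    pixel distance to (hypothetical) map units, rounded to 3 decimals."""
    return str(round(1000 * val / scale_factor) / 1000) + ' mapUnits'

d = pixel_distance((1000, 1000), (2000, 2000))
# d ≈ 1414.2136, matching the expected value from distance()
```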
If anyone has a more elegant way, I'm all ears!

Related

Problem normalizing log_e "Pitch" voiceAnalytics Swift iOS 13

I am retrieving values from SFVoiceAnalytics "pitch". My goal is to transform the data into the raw fundamental frequency. According to the documentation, the values are returned as natural logarithms (log_e).
When I apply exp() to the values returned I get the following ranges:
Male voice: [0.25, 1.85], expected: [85, 180]
Female voice: [0.2,1.6], expected: [165, 255]
For sake of simplicity I am using Apple's sample code "Recognizing Speech in Live Audio."
Thanks for the help!!
Documentation: https://developer.apple.com/documentation/speech/sfvoiceanalytics/3229976-pitch
if let result = result {
    // returned pitch values
    for segment in result.bestTranscription.segments {
        if let pitchSegment = segment.voiceAnalytics?.pitch.acousticFeatureValuePerFrame {
            for p in pitchSegment {
                let pitch = exp(p)
                print(pitch)
            }
        }
    }
    // Update the text view with the results.
    self.textView.text = result.bestTranscription.formattedString
    isFinal = result.isFinal
}
I ran into a similar problem lately and ultimately used another solution to retrieve pitch data.
I went with a pitch detection library for Swift called Beethoven. It detects pitches in real-time, whereas the voice analytics of SFSpeechRecognizer only returns them once the transcription is complete.
Beethoven hasn't been updated to work with Swift 5, but I didn't find it too difficult to get it to work.
Also, upon digging into why the values in voiceAnalytics were as they were, I found out via the documentation that the pitch is a normalized pitch estimate:
The value is a logarithm (base e) of the normalized pitch estimate for each frame.
My interpretation is that the values were normalized by (divided by) some baseline fundamental frequency, so I'm not sure it's possible to use this data to recover the absolute frequencies. It seems best used to convey interval changes from pitch to pitch.
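A small numerical sketch (Python, with a hypothetical baseline f0 that the API does not expose) of why exp() alone can't recover Hz under this interpretation: if the stored value is log(f / f0), exponentiating yields only a unitless ratio, and you would need f0 to get back to absolute frequency:

```python
import math

f0 = 120.0  # hypothetical baseline fundamental frequency in Hz (not provided by the API)
f = 150.0   # true frame frequency in Hz

stored = math.log(f / f0)  # what the API would return under this interpretation
ratio = math.exp(stored)   # exp() recovers only the ratio f/f0 (1.25 here)
recovered = ratio * f0     # absolute Hz requires knowing f0

# ratio ≈ 1.25 falls inside the observed [0.2, 1.85] ranges, while
# recovered == 150.0 is only obtainable with the missing baseline.
```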

Using low pass filter in matlab to get same endpoints of the data

This is an extension of my previous question: https://dsp.stackexchange.com/questions/28095/choosing-low-pass-filter-parameters
I am recording people from an overhead camera. I have tracks of each person's head obtained with tracking software. I want to remove the periodicity from the tracks caused by head wobbling.
I apply a low-pass Butterworth filter. I want the starting and ending points of the filtered tracks to be the same as those of the unfiltered tracks.
Data:
K>> [xcor_i,ycor_i ]
ans =
-101.7000 -77.4040
-102.4200 -77.4040
-103.6600 -77.4040
-103.9300 -76.6720
-103.9900 -76.5130
-104.0000 -76.4780
-105.0800 -76.4710
-106.0400 -77.5660
-106.2500 -77.8050
-106.2900 -77.8570
-106.3000 -77.8680
-106.3000 -77.8710
-107.7500 -78.9680
-108.0600 -79.2070
-108.1200 -79.2590
-109.9500 -80.3680
-111.4200 -80.6090
-112.8200 -81.7590
-113.8500 -82.3750
-115.1500 -83.2410
-116.1500 -83.4290
-116.3700 -83.8360
-117.5000 -84.2910
-117.7400 -84.3890
-118.8800 -84.7770
-119.8400 -85.2270
-121.1400 -85.3250
-123.2200 -84.9800
-125.4700 -85.2710
-127.0400 -85.7000
-128.8200 -85.7930
-130.6500 -85.8130
-132.4900 -85.8180
-134.3300 -86.5500
-136.1700 -87.0760
-137.6500 -86.0920
-138.6900 -86.9760
-140.3600 -87.9000
-142.1600 -88.4660
-144.7200 -89.3210
Code (answer by @SleuthEye):
dataOut_x = xcor_i(1)+filter(b,a,xcor_i-xcor_i(1));
dataOut_y = ycor_i(1)+filter(b,a,ycor_i-ycor_i(1));
Output:
In the above example, the endpoint (to the left) differs between the filtered and unfiltered tracks. How can I ensure it is the same?
Your question is pretty ambiguous, and doesn't really have a specific question. I'm assuming you want to have your filtered data start at the same points as the measured data, but are unsure why this is not happening already, and how to do so.
A low pass filter is a filter which lowers the effect of rapid changes. One way of doing this, and the method which appears to be used here, is by using a rolling average. A rolling average is simply an average (mean) of the previous data points. It looks like you are using a rolling average of 5 data points. Therefore you need five points of raw data before your filter will give you a single data point.
-101.7000 -77.4040 }
-102.4200 -77.4040 } }
-103.6600 -77.4040 } }
-103.9300 -76.6720 } }
-103.9900 -76.5130 } Filter point 1. }
-104.0000 -76.4780 } Filter point 2.
-105.0800 -76.4710
-106.0400 -77.5660
-106.2500 -77.8050
-106.2900 -77.8570
-106.3000 -77.8680
-106.3000 -77.8710
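The windowing effect sketched above can be illustrated with a minimal moving average (Python; a window of 5 is assumed, as in the diagram): N input points yield only N - 4 output points, which is why the filtered track starts "late":

```python
def moving_average(data, window=5):
    """Plain rolling mean: each output point averages `window`
    consecutive inputs, so len(output) == len(data) - window + 1."""
    return [sum(data[i:i + window]) / window
            for i in range(len(data) - window + 1)]

xs = [-101.70, -102.42, -103.66, -103.93, -103.99, -104.00,
      -105.08, -106.04, -106.25, -106.29, -106.30, -106.30]
filtered = moving_average(xs)
# 12 inputs -> 8 outputs; the first filtered point corresponds to
# the 5th raw sample, which shifts the start of the filtered track.
```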
In order to solve this problem, you could just prepend the first data point to the data set four times, as this means that the filter will produce the same number of points. This is a pretty rough solution, however, as you are creating new data. It can be achieved quite simply; for example, if your dataset is called myArray:
firstEntry = myArray(1,:);
myNewArray = [firstEntry; firstEntry; firstEntry; firstEntry; myArray];
This will create four data points equal to your first data point, which should then allow you to apply the low pass filter to your data, and have it start at the same point.
Hope this helps, although it's worth bearing in mind that filtering ALWAYS results in a loss of data.
Because you don't want to implement it but want someone else to:
The theory as above is correct, but instead you need to add 2 values at the end of your vectors:
x_last = xcor_i(end);
y_last = ycor_i(end);
xcor_i = [xcor_i;x_last;x_last];
ycor_i = [ycor_i;y_last;y_last];
This gives the following:
As you can see the ends are pretty close to being the same now.
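The anchoring trick in the code above (subtract the first sample before filtering, add it back afterwards) can be sketched with a minimal one-pole low-pass in Python. The filter here is a simple stand-in for the Butterworth design, which is assumed rather than shown:

```python
def lowpass(x, alpha=0.7):
    """One-pole IIR low-pass with zero initial state:
    y[n] = alpha * y[n-1] + (1 - alpha) * x[n]."""
    y, prev = [], 0.0
    for v in x:
        prev = alpha * prev + (1 - alpha) * v
        y.append(prev)
    return y

def lowpass_anchored(x, alpha=0.7):
    """Subtract the first sample before filtering and add it back
    afterwards, so the filtered track starts exactly at x[0]."""
    x0 = x[0]
    return [x0 + v for v in lowpass([v - x0 for v in x], alpha)]

track = [-101.70, -102.42, -103.66, -103.93, -103.99, -104.00]
out = lowpass_anchored(track)
# out[0] == track[0]: the start points now coincide.
```

Because the shifted input begins at 0 and the filter state starts at 0, the first output is exactly x[0], which is what makes the endpoints line up.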

Trying to balance my dataset through sample_weight in scikit-learn

I'm using RandomForest for classification, and I have an unbalanced dataset: 5830 no, 1006 yes. I tried to balance my dataset with class_weight and sample_weight, but I can't.
My code is:
X_train, X_test, y_train, y_test = train_test_split(arrX, y, test_size=0.25)
cw = 'auto'
clf = RandomForestClassifier(class_weight=cw)
param_grid = {'n_estimators': [10, 50, 100, 200, 300], 'max_features': ['auto', 'sqrt', 'log2']}
sw = np.array([1 if i == 0 else 8 for i in y_train])
CV_clf = GridSearchCV(estimator=clf, param_grid=param_grid, cv=10, fit_params={'sample_weight': sw})
But I don't get any improvement on my ratios TPR, FPR, ROC when using class_weight and sample_weight.
Why? Am I doing anything wrong?
Nevertheless, if I use the function called balanced_subsample, my ratios obtain a great improvement:
def balanced_subsample(x, y, subsample_size):
    class_xs = []
    min_elems = None
    for yi in np.unique(y):
        elems = x[(y == yi)]
        class_xs.append((yi, elems))
        if min_elems is None or elems.shape[0] < min_elems:
            min_elems = elems.shape[0]
    use_elems = min_elems
    if subsample_size < 1:
        use_elems = int(min_elems * subsample_size)
    xs = []
    ys = []
    for ci, this_xs in class_xs:
        if len(this_xs) > use_elems:
            np.random.shuffle(this_xs)
        x_ = this_xs[:use_elems]
        y_ = np.empty(use_elems)
        y_.fill(ci)
        xs.append(x_)
        ys.append(y_)
    xs = np.concatenate(xs)
    ys = np.concatenate(ys)
    return xs, ys
My new code is:
X_train_subsampled, y_train_subsampled = balanced_subsample(arrX, y, 0.5)
X_train, X_test, y_train, y_test = train_test_split(X_train_subsampled, y_train_subsampled, test_size=0.25)
cw = 'auto'
clf = RandomForestClassifier(class_weight=cw)
param_grid = {'n_estimators': [10, 50, 100, 200, 300], 'max_features': ['auto', 'sqrt', 'log2']}
sw = np.array([1 if i == 0 else 8 for i in y_train])
CV_clf = GridSearchCV(estimator=clf, param_grid=param_grid, cv=10, fit_params={'sample_weight': sw})
This is not a full answer yet, but hopefully it'll help get there.
First some general remarks:
To debug this kind of issue it is often useful to have a deterministic behavior. You can pass the random_state attribute to RandomForestClassifier and various scikit-learn objects that have inherent randomness to get the same result on every run. You'll also need:
import numpy as np
np.random.seed(0)  # any fixed seed value
import random
random.seed(0)
for your balanced_subsample function to behave the same way on every run.
Don't grid search on n_estimators: more trees is always better in a random forest.
Note that sample_weight and class_weight have a similar objective: actual sample weights will be sample_weight * weights inferred from class_weight.
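To make the class_weight side concrete: scikit-learn's 'balanced' heuristic assigns each class the weight n_samples / (n_classes * n_c). A pure-Python sketch of that formula using the question's own counts (5830 no, 1006 yes):

```python
from collections import Counter

def balanced_class_weights(y):
    """Weight each class by n_samples / (n_classes * n_c), the same
    heuristic as scikit-learn's class_weight='balanced'."""
    counts = Counter(y)
    n, k = len(y), len(counts)
    return {c: n / (k * n_c) for c, n_c in counts.items()}

y = ['no'] * 5830 + ['yes'] * 1006
w = balanced_class_weights(y)
# The minority class gets ~5.8x the weight of the majority class,
# so the total weight per class is equal:
# w['no'] * 5830 == w['yes'] * 1006
```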
Could you try:
Using subsample_size=1 in your balanced_subsample function. Unless there's a particular reason not to do so, we're better off comparing the results on a similar number of samples.
Using your subsampling strategy with class_weight and sample_weight both set to None.
EDIT: Reading your comment again I realize your results are not so surprising!
You get a better (higher) TPR but a worse (higher) FPR.
It just means your classifier tries hard to get the samples from class 1 right, and thus makes more false positives (while also getting more of those right of course!).
You will see this trend continue if you keep increasing the class/sample weights in the same direction.
There is an imbalanced-learn API that helps with oversampling/undersampling data, which might be useful in this situation. You pass your training set into one of its methods and it outputs the oversampled data for you. See the simple example below:
from imblearn.over_sampling import RandomOverSampler
ros = RandomOverSampler(random_state=1)
x_oversampled, y_oversampled = ros.fit_sample(orig_x_data, orig_y_data)
Here is the link to the API: http://contrib.scikit-learn.org/imbalanced-learn/api.html
Hope this helps!

Minimum distance to plot points on UIMapview

My query is: what's the minimum distance required between points plotted on a UIMapView so that they show up as distinct points?
For example, if there are users in the same apartment, there would be hardly any distance between their latitude/longitude coordinates. So how do I differentiate them?
There are usually two ways - one is clustering, which means you use a marker with a number that indicates how many underlying markers there are. When tapping on that, the user is then shown the separate markers (or a recursive zoom-in that splits the markers up more and more). Superpin (www.getsuperpin.com) is one example, but there are others out there.
Another approach is to actually offset the marker from its real location. For this, you need some kind of distribution algorithm that offsets it just enough - that is, it sets the markers as close together as possible while still giving each enough surface area to be seen/touched. For this, we use the Fibonacci sunflower pattern. What you'd have to do is identify all the annotations that have the same coordinate, group them, and then draw each group in sequence while offsetting one from the other - for example, have some code that iterates along a spiral shape and drops markers along that spiral.
Can put up sample code etc to help if you're wanting to go with the second approach, let me know.
EDIT: I found some code we wrote for this, but it's not objective C. Can you read it?
class SunFlower
{
    static $SEED_RADIUS = 2;
    static $SCALE_FACTOR = 4;
    static $PI2 = 6.28318531; // PI * 2
    static $PHI = 1.61803399; // (sqrt(5)+1) / 2

    public static function calcPos($xc, $yc, $factor, $i)
    {
        $theta = $i * SunFlower::$PI2 / SunFlower::$PHI;
        $r = sqrt($i) * $factor;
        $x = $xc + $r * cos($theta);
        $y = $yc - $r * sin($theta);
        if ($i == 1) {
            $y += ($factor * 0.5);
        }
        return array($x, $y);
    }
}
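For readers who don't read PHP, here is a rough Python port of the same sunflower placement (names mirror the PHP; the special-casing of i == 1 is kept as in the original, and the example offsets are purely illustrative):

```python
import math

PI2 = 2 * math.pi             # full turn
PHI = (math.sqrt(5) + 1) / 2  # golden ratio

def calc_pos(xc, yc, factor, i):
    """Return the i-th marker position on a sunflower spiral
    centred on (xc, yc); `factor` scales the radial spacing."""
    theta = i * PI2 / PHI
    r = math.sqrt(i) * factor
    x = xc + r * math.cos(theta)
    y = yc - r * math.sin(theta)
    if i == 1:
        y += factor * 0.5     # nudge kept from the original code
    return x, y

offsets = [calc_pos(0.0, 0.0, 4.0, i) for i in range(8)]
# Each point sits at radius sqrt(i) * factor from the centre
# (except i == 1, which is nudged), so markers spread without overlapping.
```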

How to use MapReduce for k-Means Spatial Clustering

I'm new to mongodb and map-reduce and want to evaluate spatial data by using a k-means spatial clustering. I found this article which seems to be a good description of the algorithm, but I have no clue how to translate this into a mongo shell script. Assume my data looks like:
{
    _id: ObjectID(),
    loc: {x: <longitude>, y: <latitude>},
    user: <userid>
}
And I can use k = sqrt(n/2), where n is the number of samples.
I can use aggregates to get the bounding extents of the data and the count, etc.
I kind of got lost with the reference to a file of the cluster points, which I assume would just be another collection, and I have no idea how to do the iteration, or whether that would be done in the client or the database.
Ok, I have made a little progress on this, in that I have generated an array of initial random points against which I need to compute the sum of least squares during the map-reduce phase, but I do not know how to pass these to the map function. I took a stab at writing the map function:
var mapCluster = function() {
    var key = -1;
    var sos = 0;
    var pos;
    for (var i = 0; i < pts.length; i++) {
        var dx = pts[i][0] - this.arguments.pos[0];
        var dy = pts[i][1] - this.arguments.pos[1];
        var sumOfSquares = dx * dx + dy * dy;
        if (i == 0 || sumOfSquares < sos) {
            key = i;
            sos = sumOfSquares;
            pos = this.arguments.pos;
        }
    }
    emit(key, pos);
};
In this case the cluster points look like the following, which probably will not work:
var pts = [ [x,y], [x1,y1], ... ];
So for each MR iteration, we compare all the collection points against this array and emit the index of the point we are closest to, along with the location of the collection point. Then in the reduce function, the average of the points associated with each index would be used to create the new cluster point location. Then in the finalize function I can update the cluster document.
I assume I could do a findOne() on the cluster document to load the cluster points in the map function, but do we want to load this document on every call to map? Or is there a way to load it once per iteration?
So it looks like you can do the above using the scope variable like this:
db.main.mapReduce( mapCluster, mapReduce, { scope: { pnts: pnts, ... }} );
You have to be careful about variable names in the scope: as these are placed in the scope of the map, reduce, and finalize functions, they can collide with existing variable names.
What have you tried?
Note that you will need more than one round of mappers.
With the canonical approach of running k-means on MR, you need one mapper/reducer per iteration.
So, can you try to write the map and reduce steps of a single iteration only?
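To make the per-iteration structure concrete, here is a minimal sketch (in Python rather than mongo shell JS, with illustrative data) of a single k-means iteration split into the map step (assign each point to its nearest centre, i.e. the emit) and the reduce step (average the points per centre):

```python
def assign(point, centers):
    """Map step: return the index of the nearest centre
    (minimum sum of squared differences), i.e. the emitted key."""
    return min(range(len(centers)),
               key=lambda i: (centers[i][0] - point[0]) ** 2 +
                             (centers[i][1] - point[1]) ** 2)

def kmeans_iteration(points, centers):
    """One map-reduce round: group points by nearest centre,
    then replace each centre with the mean of its group."""
    groups = {i: [] for i in range(len(centers))}
    for p in points:
        groups[assign(p, centers)].append(p)
    return [
        (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
        if g else c  # keep a centre unchanged if it attracted no points
        for c, g in zip(centers, groups.values())
    ]

pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
centers = kmeans_iteration(pts, [(1, 1), (9, 9)])
# centers -> [(0.0, 0.5), (10.0, 10.5)]
```

Each call to kmeans_iteration corresponds to one full mapReduce pass; you repeat it until the centres stop moving, which is why one mapper/reducer round per iteration is needed.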