Clustering a sequence of numbers - cluster-analysis

I have a small clustering problem. I have this sequence:
349, 1496, 348, 1497, 347, 1503, 1502, 1495, 353, 352, 351, 1501, 354, 1504, 1498, 1500
I want to detect that there are two clusters: one around 350 and the other around 1500. Is there a straightforward way to do this? So far I have tried rounding to the nearest 100, e.g. int(round(x1 / 100.0)) * 100, which does not always work because the numbers may vary. The other option is the silhouette method, which seems like too much for such a small problem.

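Sort the data.
Split at the largest gap between neighbouring values. Everything below the gap is one cluster and everything above it is the other.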
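A minimal sketch of that idea in Python, assuming exactly two clusters and at least two values. The function name split_at_largest_gap is made up for this illustration:

```python
def split_at_largest_gap(values):
    """Split values into two clusters at the largest gap between sorted neighbours."""
    if len(values) < 2:
        raise ValueError("need at least two values to form two clusters")
    ordered = sorted(values)
    # gaps[i] is the distance between ordered[i] and ordered[i + 1].
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    # Cut right after the element that precedes the largest gap.
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]


data = [349, 1496, 348, 1497, 347, 1503, 1502, 1495,
        353, 352, 351, 1501, 354, 1504, 1498, 1500]
low, high = split_at_largest_gap(data)
print(low)   # the values around 350
print(high)  # the values around 1500
```

This only works when the gap between the clusters is larger than any gap inside a cluster, which is clearly the case for the sequence in the question.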

Interrupt a line in Chart.js lines

I have a weird situation in Chart.js (see the picture).
I have a dataset with 4 dates and 4 numbers. All 4 values are 1 (the exact value doesn't matter).
The real data should show just 2 intervals, (1/1/2020 -> 2/2/2020) and (3/4/2021 -> 6/6/2021), without the segment in the middle.
As it stands, Chart.js has no way of knowing that it should not draw that segment, since the value is 1 on all 4 dates.
The only solution I can think of is to subdivide the intervals so that I can place a NaN in the middle and use something like stepped: true for the line. With a lot of data, though, that roughly doubles the number of dates and makes the graph more confusing.
So the question is: is there any way to specify whether a given point is the start or the end of an interval?
Or is there a better approach than a single line dataset?
Thank you.
Just pass 2 datasets:
const labels = Utils.months({count: 7});
const data = {
  labels: labels,
  datasets: [{
    label: 'My First Dataset',
    data: [65, 59, 80],
    fill: false,
    borderColor: 'rgb(75, 192, 192)',
    tension: 0.1
  }, {
    label: 'My 2nd Dataset',
    data: [null, null, null, 81, 56, 55, 40],
    fill: false,
    borderColor: 'rgb(75, 40, 192)',
    tension: 0.1
  }]
};
If you pass the data points as objects (for example {x: ..., y: ...}) instead of plain arrays, you do not even have to pad with nulls.

Flutter - CircularProgressIndicator max value

I'm trying to find a way to change the max value of CircularProgressIndicator:
child: CircularProgressIndicator(
  strokeWidth: 10,
  value: math().valuePercentage,
  backgroundColor: Color.fromARGB(255, 255, 152, 152),
  color: Color.fromARGB(255, 233, 54, 54),
),
I was trying to do it with some math, but I got stuck on this:
class math {
  double valuePercentage = kcalData().Target / 10000;
}
For example, we have 3216 kcal and the maximum of CircularProgressIndicator is 1.0, so I decided to divide the kcal value by 10000. What I actually want is for the maximum to be my Target value. I'm bad at math, so maybe one of you can figure it out...
Any ideas? Or is there another way to do this?
Edit: the maximum of the indicator is 1.0; it can't be set any higher.
value: currentValue / totalValue,
currentValue: the value you want to show on your indicator
totalValue: the max value of your progress indicator
Let's say your target is 10000. This is the maximum. To find what fraction 3216 is of 10000, divide 3216 by 10000, which gives 0.3216. If you want a different maximum each time, divide your value by that maximum.
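The arithmetic can be sketched in a few lines. Python is used here only to show the calculation; in Flutter the same division would go into the value: argument. The helper name progress_fraction and the clamping to the 0.0 to 1.0 range are assumptions added for this illustration, not part of the answers above:

```python
def progress_fraction(current_value, total_value):
    """Return current_value as a fraction of total_value, clamped to [0.0, 1.0]."""
    if total_value <= 0:
        raise ValueError("total_value must be positive")
    # Clamping is an added assumption: it keeps the indicator full
    # when the current value overshoots the target.
    return max(0.0, min(1.0, current_value / total_value))


print(progress_fraction(3216, 10000))  # target of 10000 kcal
print(progress_fraction(3216, 3216))   # target reached exactly
print(progress_fraction(5000, 3216))   # overshoot is capped at a full circle
```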

Pyspark: comparing elements of RDD

Similar to the question I posted here about working with DFs, how can I retrieve the first element in each sequence, but in this situation using RDDs? I want to compare each item to the one before it. Items that repeat later in the sequence are acceptable, i.e. (67, 375, 14:20:14) might appear later in the RDD and should be kept.
Input
(67, 312, 12:09:00)
(67, 375, 12:23:00)
(67, 375, 12:25:00)
(67, 650, 12:26:00)
(75, 650, 12:27:00)
(75, 650, 12:29:00)
(75, 800, 12:30:00)
(67, 375, 14:20:14)
Output
(67, 312, 12:09:00)
(67, 375, 12:23:00)
(67, 650, 12:26:00)
(75, 650, 12:27:00)
(75, 800, 12:30:00)
(67, 375, 14:20:14)
This would work. My only concern is that you cannot rely on the order of the output that an RDD transformation produces. To retain the order, I strongly suggest sorting by a column; fortunately, you have the timestamp column here.
If you are not planning to sort by timestamp, go with the DataFrame windowing approach instead. Even there, you might need sorting :)
from pyspark import SparkContext

sc = SparkContext.getOrCreate()

rdd = sc.parallelize([(67, 312, "12:09:00"),
                      (67, 375, "12:23:00"),
                      (67, 375, "12:25:00"),
                      (67, 650, "12:26:00"),
                      (75, 650, "12:27:00"),
                      (75, 650, "12:29:00"),
                      (75, 800, "12:30:00")])
# Use the first two columns as the key and the timestamp as the value.
rdd_fix_keys = rdd.map(lambda x: ((x[0], x[1]), x[2]))
# For each key, keep the earliest timestamp (zero-padded HH:MM:SS strings sort chronologically).
rdd_pick_first_occurrence = rdd_fix_keys.reduceByKey(lambda x, y: min(x, y))
# Flatten back to (id, value, timestamp) and sort by timestamp.
rdd_pick_first_occurrence.map(lambda x: (x[0][0], x[0][1], x[1])).sortBy(lambda x: x[2]).collect()
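Note: reduceByKey does not preserve the input order, which is why the result is sorted by timestamp at the end. Because rows are grouped by key across the whole RDD, a key that reappears later, such as (67, 375, 14:20:14), is merged into its first occurrence.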
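The rule from the question (drop a row only when it repeats the key of the row immediately before it) can be sketched in plain Python, without Spark, to make the expected result concrete. The function name drop_consecutive_repeats is made up for this illustration, and it assumes the data fits in memory and that the timestamp strings sort chronologically:

```python
from itertools import groupby

rows = [
    (67, 312, "12:09:00"),
    (67, 375, "12:23:00"),
    (67, 375, "12:25:00"),
    (67, 650, "12:26:00"),
    (75, 650, "12:27:00"),
    (75, 650, "12:29:00"),
    (75, 800, "12:30:00"),
    (67, 375, "14:20:14"),
]


def drop_consecutive_repeats(rows):
    """Keep the first row of every run of consecutive rows sharing (col0, col1)."""
    ordered = sorted(rows, key=lambda r: r[2])  # order by timestamp
    # groupby only merges adjacent equal keys, so a key that
    # reappears later starts a new group and is kept.
    return [next(group) for _, group in groupby(ordered, key=lambda r: (r[0], r[1]))]


result = drop_consecutive_repeats(rows)
for row in result:
    print(row)
```

Unlike grouping by key over the whole dataset, this keeps the later (67, 375, "14:20:14") row, which matches the output requested in the question.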

Postgis - ST_within didn't do what I want. How to find a point in a hollow area?

See the screen print.
I ran a spatial query in PostGIS to return the electoral constituency (area) that a point on the map lies in. The query uses the ST_Within function to test whether the point is within a polygon.
As you can see from the screen print, the point is not actually 'in' the polygon area of York Outer, although technically you might say it's 'within' it, or at least PostGIS thinks so. The point actually lies in York Central.
I'm sure PostGIS actually returns both, but since I only fetch the first record from the cursor, this is what I see.
A point can only be in one electoral constituency at a time, and this query has returned the wrong one, or rather I asked the wrong question of the database.
Which function should I use to ensure I always get the correct area for a point when the area may have a hollow interior or be a strange shape?
Thanks
Phil
This should work as you described it. Maybe something is wrong with the data? Could you provide a small repro with polygon and point data?
A fairly common cause of such problems is invalid GIS data. You can check the polygon with PostGIS's ST_IsValid function. If the data is invalid, different tools may interpret it differently, so how the data is drawn might not match what PostGIS thinks it represents, causing more confusion.
Here is a simple repro showing that it works as you expect: a point inside the outer polygon's hole is st_within only the inner polygon, not the outer one:
select st_astext(point), name
from
  (select
     'outer' as name,
     st_geomfromtext('polygon((0 0, 30 0, 30 30, 0 30, 0 0), (10 10, 20 10, 20 20, 10 20, 10 10))') g
   union all
   select
     'inner' as name,
     st_geomfromtext('polygon((10 10, 20 10, 20 20, 10 20, 10 10))') g
  ) shapes
cross join
  (select st_geomfromtext('point(15 15)') point
   union all
   select st_geomfromtext('point(5 5)') point
  ) points
where st_within(point, g)
My results are
1 POINT(5 5) outer
2 POINT(15 15) inner
Provided your polygons and query are the way you described, it should work without any problems. Consider the following geometries, where the point lies only inside the inner polygon. If you run a query with ST_Within using the coordinates of the point, you should get only the inner polygon:
WITH j (geom) AS (VALUES
('POLYGON((35 10, 45 45, 15 40, 10 20, 35 10),(20 30, 35 35, 30 20, 20 30))'),
('POLYGON((26.88 31.08,30.57 31.08,30.57 28.49,26.88 28.49,26.88 31.08))'))
SELECT * FROM j
WHERE ST_Within('POINT(28.46 28.64)',j.geom)
However, if your query is somehow using the bounding box (BBOX) of the polygons instead of their actual area, you will indeed get the outer polygon as well, e.g.:
WITH j (geom) AS (VALUES
('POLYGON((35 10, 45 45, 15 40, 10 20, 35 10),(20 30, 35 35, 30 20, 20 30))'),
('POLYGON((26.88 31.08,30.57 31.08,30.57 28.49,26.88 28.49,26.88 31.08))'))
SELECT * FROM j
WHERE ST_Within('POINT(28.46 28.64)',j.geom::GEOMETRY::BOX2D)
Consider adding a data sample and the query to your question. Hopefully it helps you debug your code.
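To see why a bounding-box test and a true containment test can disagree, here is a rough pure-Python sketch using the same coordinates as the queries above. It uses a textbook ray-casting test and is not how PostGIS implements ST_Within; the function names are made up for this illustration, and points lying exactly on a boundary are not handled the way PostGIS would handle them:

```python
def in_ring(point, ring):
    """Ray-casting test: True if point is inside the closed ring (first vertex repeated last)."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def within_polygon(point, shell, holes=()):
    """Inside the outer ring and not inside any hole."""
    return in_ring(point, shell) and not any(in_ring(point, h) for h in holes)


def within_bbox(point, shell):
    """Bounding-box test only: ignores the real shape and any holes."""
    xs = [vx for vx, _ in shell]
    ys = [vy for _, vy in shell]
    return min(xs) <= point[0] <= max(xs) and min(ys) <= point[1] <= max(ys)


outer_shell = [(35, 10), (45, 45), (15, 40), (10, 20), (35, 10)]
outer_hole = [(20, 30), (35, 35), (30, 20), (20, 30)]
inner_shell = [(26.88, 31.08), (30.57, 31.08), (30.57, 28.49),
               (26.88, 28.49), (26.88, 31.08)]
point = (28.46, 28.64)

print(within_polygon(point, outer_shell, [outer_hole]))  # point sits in the hole
print(within_polygon(point, inner_shell))
print(within_bbox(point, outer_shell))                   # bbox test ignores the hole
```

If a query against real data returns the outer polygon for a point in its hole, checking whether the stored geometry actually contains the hole (or whether only a bounding-box comparison is being applied) is a reasonable first step.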

Changing icon offset based on zoom level

I am trying to offset symbols of a symbol layer so that they don't interfere with a previous symbol layer (i.e. they don't overlap). I need to offset them as in both cases icon-allow-overlap needs to be set to true, as the symbols need to be viewable at all zoom levels. Ideally I'd like to do something like this:
"icon-offset": [
  ["zoom"],
  12, [-16, 0],
  22, [0, 0]
]
but that gives me an error:
array length 2 expected, length 5 found
Is there a way I can do what I want similar to what I was trying above? I know that icon-offset is not transitionable so that is why the above is failing.
Any help would be appreciated.
Thanks for your time.
The answer was to use a function:
"icon-offset": {
  "stops": [
    [12, [-16, 0]],
    [22, [0, 0]]
  ]
}
More info on this can be found here