Mongo multiple array values match - mongodb

I have this data model, based on a Facebook Insights API response; you can assume it is inserted as a document in a Mongo collection.
My problem is:
- The representation of the Insights data from FB is not ideal: the keys could contain "." or other characters that are not Mongo-friendly.
The queries I am trying to achieve are sorted by some of the Insights values, for example:
Get all documents ranked by "Melbourne, VIC, Australia".
Based on that, my question is:
Would you recommend transforming these documents into something more Mongo-friendly, making the queries simpler? Or
would you keep this model representation and work around it with more complex queries?
At the time of editing this question, I have already transformed the Insights data representation, but your opinion is welcome!
Cheers!
{
"id": "155257214638229/insights/page_fans_city/lifetime",
"name": "page_fans_city",
"period": "lifetime",
"values": [
{
"value": {
"Bogotá, Distrito Especial, Colombia": 206,
"São Paulo, Brazil": 102,
"Melbourne, VIC, Australia": 95,
"Cali, Valle del Cauca, Colombia": 76,
"Medellín, Antioquia, Colombia": 72,
"Bangalore, Karnataka, India": 39,
"Cartagena, Bolivar, Colombia": 38,
"Barranquilla, Atlantico, Colombia": 36,
"New Delhi, Delhi, India": 34,
"Ibagué, Tolima, Colombia": 32,
"Calcutta, West Bengal, India": 32,
"Bucaramanga, Santander, Colombia": 31,
"Sydney, NSW, Australia": 29,
"Mumbai, Maharashtra, India": 27,
"Lisbon, Lisboa, Portugal": 27,
"Mexico City, Distrito Federal, Mexico": 26,
"Neiva, Huila, Colombia": 23,
"Chennai, Tamil Nadu, India": 22,
"Hyderabad, Andhra Pradesh, India": 19,
"Chandigarh, India": 17,
"Pasto, Narino, Colombia": 17,
"Cúcuta, Norte de Santander, Colombia": 16,
"Santa Marta, Colombia": 16,
"Salvador, Bahia, Brazil": 16,
"Valledupar, Cesar, Colombia": 15,
"Palmira, Valle del Cauca, Colombia": 15,
"Montería, Cordoba, Colombia": 15,
"Rio de Janeiro, Brazil": 14,
"Lucknow, Uttar Pradesh, India": 14,
"Villavicencio, Meta, Colombia": 13,
"Belo Horizonte, Minas Gerais, Brazil": 13,
"Surat, Gujarat, India": 13,
"Ahmedabad, Gujarat, India": 13,
"Santiago, Region Metropolitana, Chile": 12,
"Brasília, Distrito Federal, Brazil": 12,
"Pereira, Risaralda, Colombia": 12,
"Jalandhar, Punjab, India": 11,
"Barrancabermeja, Santander, Colombia": 11,
"Tuluá, Valle del Cauca, Colombia": 11,
"Buenaventura, Valle del Cauca, Colombia": 11,
"Guadalajara, Jalisco, Mexico": 10,
"Goiânia, Goias, Brazil": 10,
"Adelaide, SA, Australia": 10,
"London, England, United Kingdom": 9,
"Armenia, Quindio, Colombia": 9
},
"end_time": "2015-02-19T08:00:00+0000"
},
{
"value": {
"Bogotá, Distrito Especial, Colombia": 207,
"São Paulo, Brazil": 103,
"Melbourne, VIC, Australia": 95,
"Cali, Valle del Cauca, Colombia": 75,
"Medellín, Antioquia, Colombia": 72,
"Bangalore, Karnataka, India": 40,
"Cartagena, Bolivar, Colombia": 38,
"Barranquilla, Atlantico, Colombia": 36,
"New Delhi, Delhi, India": 34,
"Ibagué, Tolima, Colombia": 32,
"Calcutta, West Bengal, India": 31,
"Bucaramanga, Santander, Colombia": 30,
"Sydney, NSW, Australia": 29,
"Lisbon, Lisboa, Portugal": 27,
"Mumbai, Maharashtra, India": 27,
"Mexico City, Distrito Federal, Mexico": 26,
"Neiva, Huila, Colombia": 24,
"Chennai, Tamil Nadu, India": 22,
"Hyderabad, Andhra Pradesh, India": 19,
"Chandigarh, India": 17,
"Pasto, Narino, Colombia": 17,
"Cúcuta, Norte de Santander, Colombia": 16,
"Santa Marta, Colombia": 16,
"Salvador, Bahia, Brazil": 16,
"Palmira, Valle del Cauca, Colombia": 15,
"Valledupar, Cesar, Colombia": 15,
"Montería, Cordoba, Colombia": 15,
"Rio de Janeiro, Brazil": 14,
"Lucknow, Uttar Pradesh, India": 14,
"Belo Horizonte, Minas Gerais, Brazil": 13,
"Surat, Gujarat, India": 13,
"Ahmedabad, Gujarat, India": 13,
"Villavicencio, Meta, Colombia": 13,
"Santiago, Region Metropolitana, Chile": 12,
"Brasília, Distrito Federal, Brazil": 12,
"Pereira, Risaralda, Colombia": 12,
"Jalandhar, Punjab, India": 11,
"Barrancabermeja, Santander, Colombia": 11,
"Tuluá, Valle del Cauca, Colombia": 11,
"Buenaventura, Valle del Cauca, Colombia": 11,
"Guadalajara, Jalisco, Mexico": 10,
"Goiânia, Goias, Brazil": 10,
"Adelaide, SA, Australia": 10,
"Dehra Dun, Uttarakhand, India": 9,
"San Luis Potosí, San Luis Potosi, Mexico": 9
},
"end_time": "2015-02-20T08:00:00+0000"
},
{
"value": {
"Bogotá, Distrito Especial, Colombia": 206,
"São Paulo, Brazil": 103,
"Melbourne, VIC, Australia": 95,
"Cali, Valle del Cauca, Colombia": 75,
"Medellín, Antioquia, Colombia": 72,
"Bangalore, Karnataka, India": 40,
"Cartagena, Bolivar, Colombia": 38,
"Barranquilla, Atlantico, Colombia": 36,
"New Delhi, Delhi, India": 34,
"Ibagué, Tolima, Colombia": 32,
"Calcutta, West Bengal, India": 31,
"Bucaramanga, Santander, Colombia": 31,
"Sydney, NSW, Australia": 29,
"Lisbon, Lisboa, Portugal": 27,
"Mumbai, Maharashtra, India": 27,
"Mexico City, Distrito Federal, Mexico": 26,
"Neiva, Huila, Colombia": 23,
"Chennai, Tamil Nadu, India": 22,
"Hyderabad, Andhra Pradesh, India": 19,
"Chandigarh, India": 17,
"Pasto, Narino, Colombia": 17,
"Santa Marta, Colombia": 16,
"Salvador, Bahia, Brazil": 16,
"Palmira, Valle del Cauca, Colombia": 15,
"Cúcuta, Norte de Santander, Colombia": 15,
"Rio de Janeiro, Brazil": 15,
"Valledupar, Cesar, Colombia": 15,
"Montería, Cordoba, Colombia": 15,
"Lucknow, Uttar Pradesh, India": 14,
"Belo Horizonte, Minas Gerais, Brazil": 13,
"Surat, Gujarat, India": 13,
"Villavicencio, Meta, Colombia": 13,
"Ahmedabad, Gujarat, India": 13,
"Santiago, Region Metropolitana, Chile": 12,
"Brasília, Distrito Federal, Brazil": 12,
"Pereira, Risaralda, Colombia": 12,
"Jalandhar, Punjab, India": 11,
"Barrancabermeja, Santander, Colombia": 11,
"Tuluá, Valle del Cauca, Colombia": 11,
"Buenaventura, Valle del Cauca, Colombia": 11,
"Guadalajara, Jalisco, Mexico": 10,
"Adelaide, SA, Australia": 10,
"Goiânia, Goias, Brazil": 10,
"Dehra Dun, Uttarakhand, India": 9,
"San Luis Potosí, San Luis Potosi, Mexico": 9
},
"end_time": "2015-02-21T08:00:00+0000"
}
],
"title": "Lifetime Likes by City",
"description": "Lifetime: Aggregated Facebook location data, sorted by city, about the people who like your Page. (Unique Users)"
}
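One possible Mongo-friendly shape, sketched in plain Python under the assumption that one flat document per (end_time, city) pair is acceptable. The city name moves from a key into a value, so no key can contain "." and a query can filter on the city and sort on the count. The function name flatten_insight and the field names source_id, city and fans are hypothetical; they are not part of the Insights API.

```python
def flatten_insight(doc):
    """Turn one Insights document into one flat row per (end_time, city)."""
    rows = []
    for entry in doc["values"]:
        for city, fans in entry["value"].items():
            rows.append({
                "source_id": doc["id"],      # hypothetical field name
                "metric": doc["name"],
                "period": doc["period"],
                "end_time": entry["end_time"],
                "city": city,                # city is now a value, not a key
                "fans": fans,                # hypothetical field name
            })
    return rows


# Trimmed-down copy of the document above (two cities, one day).
sample = {
    "id": "155257214638229/insights/page_fans_city/lifetime",
    "name": "page_fans_city",
    "period": "lifetime",
    "values": [
        {
            "value": {
                "Melbourne, VIC, Australia": 95,
                "Sydney, NSW, Australia": 29,
            },
            "end_time": "2015-02-19T08:00:00+0000",
        }
    ],
}

rows = flatten_insight(sample)
```

With rows like these stored in their own collection, ranking by "Melbourne, VIC, Australia" becomes a filter on city plus a sort on fans.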

Related

Scipy for Heart Rate data processing, Scipy.signal find_peaks returning empty array

I have an array of HR data in the form of BPM (beats per minute) values.
I need to split the data into segments where the HR was increasing, decreasing, or staying the same, in order to find the up/down amplitudes and the associated times, as well as the times when the heart rate did not change.
To do so, I tried using find_peaks to find the peaks and troughs of the HR data.
To find troughs, I multiplied the original HR by -1 and looked for its peaks, which are the troughs of the original HR data.
Calling find_peaks with plateau_size and height specified returns the peak locations along with the plateau edges and the peak heights, for example:
find_peaks(arr, height=0.5, prominence=0.1, plateau_size=0.5)
(array([ 9, 18, 33, 64, 70, 80, 87]),
{'plateau_sizes': array([5, 2, 2, 3, 1, 3, 2]),
'left_edges': array([ 7, 18, 33, 63, 70, 79, 87]),
'right_edges': array([11, 19, 34, 65, 70, 81, 88]),
'peak_heights': array([ 80., 87., 81., 107., 106., 105., 105.]),
'prominences': array([ 2., 11., 2., 18., 3., 8., 8.]),
'left_bases': array([ 3, 3, 29, 43, 68, 74, 74]),
'right_bases': array([13, 40, 40, 98, 98, 98, 98])})
However, when I specify height for the negated heart rate data to find the troughs and their heights, it returns empty arrays. If I remove height=0.5, it returns valid values, but without peak_heights.
find_peaks(trougharr,plateau_size=0.5,height=0.5)
(array([], dtype=int64),
{'plateau_sizes': array([], dtype=int64),
'left_edges': array([], dtype=int64),
'right_edges': array([], dtype=int64),
'peak_heights': array([], dtype=float64)})
Is there something wrong with passing height for arrays of negative numbers?
If there's an easier or simpler method for splitting the data into up, down, and constant portions, that would be much appreciated.
The original sample HR data is:
HR
array([ 77, 77, 77, 76, 77, 78, 79, 80, 80, 80, 80, 80, 79,
78, 78, 79, 83, 85, 87, 87, 86, 86, 86, 85, 83, 81,
80, 79, 79, 79, 80, 80, 80, 81, 81, 80, 79, 79, 79,
74, 69, 69, 69, 69, 70, 70, 70, 70, 70, 71, 72, 80,
82, 89, 92, 95, 97, 99, 100, 102, 103, 105, 106, 107, 107,
107, 105, 104, 103, 105, 106, 102, 100, 97, 97, 98, 101, 102,
104, 105, 105, 105, 104, 104, 104, 104, 104, 105, 105, 104, 104,
104, 104, 98, 96, 93, 92, 90, 89])
-1*HR (the problematic one):
array([ -77, -77, -77, -76, -77, -78, -79, -80, -80, -80, -80,
-80, -79, -78, -78, -79, -83, -85, -87, -87, -86, -86,
-86, -85, -83, -81, -80, -79, -79, -79, -80, -80, -80,
-81, -81, -80, -79, -79, -79, -74, -69, -69, -69, -69,
-70, -70, -70, -70, -70, -71, -72, -80, -82, -89, -92,
-95, -97, -99, -100, -102, -103, -105, -106, -107, -107, -107,
-105, -104, -103, -105, -106, -102, -100, -97, -97, -98, -101,
-102, -104, -105, -105, -105, -104, -104, -104, -104, -104, -105,
-105, -104, -104, -104, -104, -98, -96, -93, -92, -90, -89])
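A likely explanation for the empty result, offered as a reading of how find_peaks treats height rather than a verified diagnosis of this exact call: a scalar height is the minimum required peak height, and every value in -1*HR is negative, so no peak can reach 0.5. A minimal sketch on the first 20 HR samples finds the troughs without height and then reads their values from the original array:

```python
import numpy as np
from scipy.signal import find_peaks

# First 20 samples of the HR array from the question.
hr = np.array([77, 77, 77, 76, 77, 78, 79, 80, 80, 80,
               80, 80, 79, 78, 78, 79, 83, 85, 87, 87])

# height=0.5 means "peak value must be at least 0.5"; -hr is all negative,
# so every candidate peak is filtered out.
rejected, _ = find_peaks(-hr, plateau_size=0.5, height=0.5)

# Drop the height filter and look the trough values up in the original data.
troughs, props = find_peaks(-hr, plateau_size=0.5)
trough_values = hr[troughs]

# Alternative: keep height, but with a lower bound below every negated value.
_, props_with_heights = find_peaks(-hr, plateau_size=0.5, height=-200)
```

In the last call, peak_heights holds the negated values, so multiplying by -1 gives the trough values in BPM. The bound -200 is an arbitrary assumption; any number below the smallest value of -1*HR would do.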
I tried to split the heart rate data into sections where it was increasing, decreasing, or staying constant, to find up_amplitude, up_time, down_amplitude, down_time, and time_constant.
I found find_peaks for this, but it may not be the simplest approach; if there's a simpler method, please point me to it.
I also tried peak_prominences, but it did not detect all prominences for some reason. The code I tried was:
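# x is the HR array from above (a NumPy array)
import matplotlib.pyplot as plt
from scipy.signal import find_peaks, peak_prominences
peaks, _ = find_peaks(x)
# prominence of each detected peak
prominences = peak_prominences(x, peaks)[0]
contour_heights = x[peaks] - prominences
plt.plot(x)
plt.plot(peaks, x[peaks], "x")
plt.vlines(x=peaks, ymin=contour_heights, ymax=x[peaks])
plt.show()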
The prominences missed several locations.
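As a possibly simpler alternative to peak finding, the series can be split by the sign of consecutive differences. The sketch below is plain Python and assumes one sample per time unit, so a segment's duration is just its number of steps. The function name segment_trend and the dictionary keys are hypothetical.

```python
from itertools import groupby


def segment_trend(values):
    """Split a series into runs of rising (+1), falling (-1) or flat (0) steps."""
    # Sign of each consecutive difference: +1, -1 or 0.
    steps = [(b > a) - (b < a) for a, b in zip(values, values[1:])]
    segments = []
    start = 0
    for direction, run in groupby(steps):
        end = start + len(list(run))
        segments.append({
            "direction": direction,
            "start": start,                      # index of first sample
            "end": end,                          # index of last sample
            "duration": end - start,             # number of steps
            "amplitude": values[end] - values[start],
        })
        start = end
    return segments


# First 16 samples of the HR array from the question.
hr = [77, 77, 77, 76, 77, 78, 79, 80, 80, 80, 80, 80, 79, 78, 78, 79]
segments = segment_trend(hr)
```

Rising segments give up_amplitude and up_time, falling segments give the down equivalents, and flat segments give the constant time. A single noisy sample starts a new segment, so real data may need smoothing first.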

unable to enable rabbitmq rabbitmq_auth_backend_oauth2 plugin

RabbitMQ version 3.8.16.
I followed this guide.
I tried enabling the plugin:
sudo rabbitmq-plugins enable rabbitmq_auth_backend_oauth2
However, it throws an error:
** (CaseClauseError) no case clause matching: {:could_not_start, :jose, {:jose, {{:shutdown, {:failed_to_start_child, :jose_server, {{:case_clause, {:ECPrivateKey, 1, <<104, 152, 88, 12, 19, 82, 251, 156, 171, 31, 222, 207, 0, 76, 115, 88, 210, 229, 36, 106, 137, 192, 81, 153, 154, 254, 226, 38, 247, 70, 226, 157>>, {:namedCurve, {1, 2, 840, 10045, 3, 1, 7}}, <<4, 46, 75, 29, 46, 150, 77, 222, 40, 220, 159, 244, 193, 125, 18, 190, 254, 216, 38, 191, 11, 52, 115, 159, 213, 230, 77, 27, 131, 94, 17, ...>>, :asn1_NOVALUE}}, [{:jose_server, :check_ec_key_mode, 2, [file: 'src/jose_server.erl', line: 189]}, {:lists, :foldl, 3, [file: 'lists.erl', line: 1267]}, {:jose_server, :support_check, 0, [file: 'src/jose_server.erl', line: 153]}, {:jose_server, :init, 1, [file: 'src/jose_server.erl', line: 93]}, {:gen_server, :init_it, 2, [file: 'gen_server.erl', line: 423]}, {:gen_server, :init_it, 6, [file: 'gen_server.erl', line: 390]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 226]}]}}}, {:jose_app, :start, [:normal, []]}}}}
(rabbitmqctl 3.8.0-dev) lib/rabbitmq/cli/plugins/plugins_helpers.ex:210: RabbitMQ.CLI.Plugins.Helpers.update_enabled_plugins/2
(rabbitmqctl 3.8.0-dev) lib/rabbitmq/cli/plugins/plugins_helpers.ex:107: RabbitMQ.CLI.Plugins.Helpers.update_enabled_plugins/4
(rabbitmqctl 3.8.0-dev) lib/rabbitmq/cli/plugins/commands/enable_command.ex:121: anonymous fn/6 in RabbitMQ.CLI.Plugins.Commands.EnableCommand.do_run/2
(elixir 1.10.4) lib/stream.ex:1325: anonymous fn/2 in Stream.iterate/2
(elixir 1.10.4) lib/stream.ex:1538: Stream.do_unfold/4
(elixir 1.10.4) lib/stream.ex:1609: Enumerable.Stream.do_each/4
(elixir 1.10.4) lib/stream.ex:956: Stream.do_enum_transform/7
(elixir 1.10.4) lib/stream.ex:1609: Enumerable.Stream.do_each/4
{:case_clause, {:could_not_start, :jose, {:jose, {{:shutdown, {:failed_to_start_child, :jose_server, {{:case_clause, {:ECPrivateKey, 1, <<104, 152, 88, 12, 19, 82, 251, 156, 171, 31, 222, 207, 0, 76, 115, 88, 210, 229, 36, 106, 137, 192, 81, 153, 154, 254, 226, 38, 247, 70, 226, ...>>, {:namedCurve, {1, 2, 840, 10045, 3, 1, 7}}, <<4, 46, 75, 29, 46, 150, 77, 222, 40, 220, 159, 244, 193, 125, 18, 190, 254, 216, 38, 191, 11, 52, 115, 159, 213, 230, 77, 27, 131, ...>>, :asn1_NOVALUE}}, [{:jose_server, :check_ec_key_mode, 2, [file: 'src/jose_server.erl', line: 189]}, {:lists, :foldl, 3, [file: 'lists.erl', line: 1267]}, {:jose_server, :support_check, 0, [file: 'src/jose_server.erl', line: 153]}, {:jose_server, :init, 1, [file: 'src/jose_server.erl', line: 93]}, {:gen_server, :init_it, 2, [file: 'gen_server.erl', line: 423]}, {:gen_server, :init_it, 6, [file: 'gen_server.erl', line: 390]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 226]}]}}}, {:jose_app, :start, [:normal, []]}}}}}
Any pointers or documentation for this configuration would be appreciated.
Thanks,
Sajith
Well, RabbitMQ 3.8.5 seems to work. I assume the plugin built with 3.8.16 has a problem.

Mapbox - Get region from autocomplete result

How can I get the region from the selected autocomplete result?
In the result I am getting, the third object is named region, but it is actually the department, not the region.
Here is an example address:
54b route de brie, 91800 Brunoy, France
Mapbox gives me: Essonne // It's the department, not the region
But the actual region is: Ile-de-France
How do I get the correct region?
Here is my working demo:
https://jsfiddle.net/rv085oL1/
That information isn't included. But if you just need your site to work in France, it would be straightforward to include a lookup table mapping from département to région, using the last two characters of the short_code. Here's one: https://gist.github.com/SiamKreative/f1074ed95507e69d08a0
"regions": {
"alsace": [67, 68],
"aquitaine": [40, 47, 33, 24, 64],
"auvergne": [43, 3, 15, 63],
"basse-normandie": [14, 61, 50],
"bourgogne": [21, 58, 71, 89],
"bretagne": [29, 35, 22, 56],
"centre": [45, 37, 41, 28, 36, 18],
"champagne-ardenne": [10, 8, 52, 51],
"corse": ["2b", "2a"],
"franche-compte": [39, 25, 70, 90],
"haute-normandie": [27, 76],
"languedoc-roussillon": [48, 30, 34, 11, 66],
"limousin": [19, 23, 87],
"lorraine": [55, 54, 57, 88],
"midi-pyrennees": [46, 32, 31, 12, 9, 65, 81, 82],
"nord-pas-de-calais": [62, 59],
"pays-de-la-loire": [49, 44, 72, 53, 85],
"picardie": [2, 60, 80],
"poitou-charentes": [17, 16, 86, 79],
"provences-alpes-cote-dazur": [4, 5, 6, 13, 84, 83],
"rhones-alpes": [38, 42, 26, 7, 1, 74, 73, 69],
"ile-de-france": [77, 75, 78, 93, 92, 91, 95, 94]
},
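The lookup described above can be sketched by inverting the table once, so that each département code points at its région. This is a minimal illustration in Python using three entries from the table; it assumes the short_code looks like "FR-91", which is not confirmed by the question. The names normalise, DEPT_TO_REGION and region_for are hypothetical.

```python
# Three entries copied from the table above; the full table works the same way.
REGIONS = {
    "ile-de-france": [77, 75, 78, 93, 92, 91, 95, 94],
    "corse": ["2b", "2a"],
    "auvergne": [43, 3, 15, 63],
}


def normalise(code):
    """Render a département code as a two-character lowercase string."""
    return str(code).lower().zfill(2)


# Inverted index: département code -> région name.
DEPT_TO_REGION = {
    normalise(dept): region
    for region, depts in REGIONS.items()
    for dept in depts
}


def region_for(short_code):
    """Look up the région from the last two characters of a short_code."""
    return DEPT_TO_REGION.get(short_code[-2:].lower())


brunoy_region = region_for("FR-91")  # assumed short_code format
```

Zero-padding matters because the table stores single-digit départements as bare numbers (3 rather than "03"), and lowercasing covers the Corsican codes.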

polyhedron not filled (openscad)

I have a polyhedron that seems to be well-formed, with no overlap.
I get no error or warning when I press F6, and when I press F12 to check
for misordered faces, no pink faces are displayed from the outside (all faces seen from inside the object are pink, which is consistent).
I have to compute a difference between this object and another one, but my polyhedron is never solid.
Did I misunderstand something?
Thanks a lot for any advice.
points = [[5.01, -10.505, -1.5], [6.5345, -10.5048, -1.5], [8.059, -10.5045, -1.5], [9.5835, -10.5042, -1.5], [11.108, -10.504, -1.5], [12.6325, -10.5038, -1.5], [14.157, -10.5035, -1.5], [15.6815, -10.5033, -1.5], [17.206, -10.503, -1.5], [18.7305, -10.5028, -1.5], [20.255, -10.5025, -1.5], [21.7795, -10.5023, -1.5], [23.304, -10.502, -1.5], [24.8285, -10.5018, -1.5], [26.353, -10.5015, -1.5], [27.8775, -10.5012, -1.5], [29.402, -10.501, -1.5], [30.9265, -10.5008, -1.5], [32.451, -10.5005, -1.5], [33.9755, -10.5002, -1.5], [35.5, -10.5, -1.5], [5, -10.5, -1.5], [5.31731, -10.5, -0.733875], [5.9485, -10.5, -0.081], [6.86244, -10.5, 0.465375], [8.028, -10.5, 0.912], [9.41406, -10.5, 1.26562], [10.9895, -10.5, 1.533], [12.7232, -10.5, 1.72087], [14.584, -10.5, 1.836], [16.5408, -10.5, 1.88512], [18.5625, -10.5, 1.875], [20.6179, -10.5, 1.81238], [22.676, -10.5, 1.704], [24.7056, -10.5, 1.55662], [26.6755, -10.5, 1.377], [28.5547, -10.5, 1.17187], [30.312, -10.5, 0.948], [31.9163, -10.5, 0.712125], [33.3365, -10.5, 0.471], [34.5414, -10.5, 0.231375], [35.5, -10.5, 0], [2.5, -8.5, -1.5], [3.54437, -8.78162, 0.442813], [4.72, -9.028, 2.0875], [6.01562, -9.24137, 3.45844], [7.42, -9.424, 4.58], [8.92188, -9.57812, 5.47656], [10.51, -9.706, 6.1725], [12.1731, -9.80988, 6.69219], [13.9, -9.892, 7.06], [15.6794, -9.95463, 7.30031], [17.5, -10, 7.4375], [19.3506, -10.0304, 7.49594], [21.22, -10.048, 7.5], [23.0969, -10.0551, 7.47406], [24.97, -10.054, 7.4425], [26.8281, -10.0469, 7.42969], [28.66, -10.036, 7.46], [30.4544, -10.0236, 7.55781], [32.2, -10.012, 7.7475], [33.8856, -10.0034, 8.05344], [35.5, -10, 8.5], [-1, -5, -1.5], [0.543563, -5.30675, 0.389375], [2.1685, -5.624, 2.02], [3.86619, -5.94725, 3.41063], [5.628, -6.272, 4.58], [7.44531, -6.59375, 5.54688], [9.3095, -6.908, 6.33], [11.2119, -7.21025, 6.94812], [13.144, -7.496, 7.42], [15.0971, -7.76075, 7.76438], [17.0625, -8, 8], [19.0317, -8.20925, 8.14562], [20.996, -8.384, 8.22], [22.9468, -8.51975, 8.24188], 
[24.8755, -8.612, 8.23], [26.7734, -8.65625, 8.20312], [28.632, -8.648, 8.18], [30.4426, -8.58275, 8.17938], [32.1965, -8.456, 8.22], [33.8852, -8.26325, 8.32063], [35.5, -8, 8.5], [-2, -3, -1.5], [-0.584562, -3.30662, 0.788375], [0.9535, -3.623, 2.722], [2.60181, -3.94387, 4.32863], [4.348, -4.264, 5.636], [6.17969, -4.57812, 6.67188], [8.0845, -4.881, 7.464], [10.0501, -5.16738, 8.04012], [12.064, -5.432, 8.428], [14.1139, -5.66962, 8.65537], [16.1875, -5.875, 8.75], [18.2723, -6.04288, 8.73963], [20.356, -6.168, 8.652], [22.4262, -6.24512, 8.51487], [24.4705, -6.269, 8.356], [26.4766, -6.23438, 8.20312], [28.432, -6.136, 8.084], [30.3244, -5.96863, 8.02638], [32.1415, -5.727, 8.058], [33.8708, -5.40587, 8.20663], [35.5, -5, 8.5], [-3, 0, -1.5], [-1.13556, 0, 1.20887], [0.8455, 0, 3.506], [2.92481, 0, 5.42212], [5.084, 0, 6.988], [7.30469, 0, 8.23438], [9.5685, 0, 9.192], [11.8571, 0, 9.89162], [14.152, 0, 10.364], [16.4349, 0, 10.6399], [18.6875, 0, 10.75], [20.8913, 0, 10.7251], [23.028, 0, 10.596], [25.0792, 0, 10.3934], [27.0265, 0, 10.148], [28.8516, 0, 9.89062], [30.536, 0, 9.652], [32.0614, 0, 9.46287], [33.4095, 0, 9.354], [34.5618, 0, 9.35613], [35.5, 0, 9.5], [-2, 3, -1.5], [-0.584562, 3.30662, 0.788375], [0.9535, 3.623, 2.722], [2.60181, 3.94387, 4.32863], [4.348, 4.264, 5.636], [6.17969, 4.57812, 6.67188], [8.0845, 4.881, 7.464], [10.0501, 5.16738, 8.04012], [12.064, 5.432, 8.428], [14.1139, 5.66962, 8.65537], [16.1875, 5.875, 8.75], [18.2723, 6.04288, 8.73963], [20.356, 6.168, 8.652], [22.4262, 6.24512, 8.51487], [24.4705, 6.269, 8.356], [26.4766, 6.23438, 8.20312], [28.432, 6.136, 8.084], [30.3244, 5.96863, 8.02638], [32.1415, 5.727, 8.058], [33.8708, 5.40587, 8.20663], [35.5, 5, 8.5], [-1, 5, -1.5], [0.543563, 5.30675, 0.389375], [2.1685, 5.624, 2.02], [3.86619, 5.94725, 3.41063], [5.628, 6.272, 4.58], [7.44531, 6.59375, 5.54688], [9.3095, 6.908, 6.33], [11.2119, 7.21025, 6.94812], [13.144, 7.496, 7.42], [15.0971, 7.76075, 7.76438], [17.0625, 8, 
8], [19.0317, 8.20925, 8.14562], [20.996, 8.384, 8.22], [22.9468, 8.51975, 8.24188], [24.8755, 8.612, 8.23], [26.7734, 8.65625, 8.20312], [28.632, 8.648, 8.18], [30.4426, 8.58275, 8.17938], [32.1965, 8.456, 8.22], [33.8852, 8.26325, 8.32063], [35.5, 8, 8.5], [2.5, 8.5, -1.5], [3.54437, 8.78162, 0.442813], [4.72, 9.028, 2.0875], [6.01562, 9.24137, 3.45844], [7.42, 9.424, 4.58], [8.92188, 9.57812, 5.47656], [10.51, 9.706, 6.1725], [12.1731, 9.80988, 6.69219], [13.9, 9.892, 7.06], [15.6794, 9.95463, 7.30031], [17.5, 10, 7.4375], [19.3506, 10.0304, 7.49594], [21.22, 10.048, 7.5], [23.0969, 10.0551, 7.47406], [24.97, 10.054, 7.4425], [26.8281, 10.0469, 7.42969], [28.66, 10.036, 7.46], [30.4544, 10.0236, 7.55781], [32.2, 10.012, 7.7475], [33.8856, 10.0034, 8.05344], [35.5, 10, 8.5], [5, 10.5, -1.5], [5.31731, 10.5, -0.733875], [5.9485, 10.5, -0.081], [6.86244, 10.5, 0.465375], [8.028, 10.5, 0.912], [9.41406, 10.5, 1.26562], [10.9895, 10.5, 1.533], [12.7232, 10.5, 1.72087], [14.584, 10.5, 1.836], [16.5408, 10.5, 1.88512], [18.5625, 10.5, 1.875], [20.6179, 10.5, 1.81238], [22.676, 10.5, 1.704], [24.7056, 10.5, 1.55662], [26.6755, 10.5, 1.377], [28.5547, 10.5, 1.17187], [30.312, 10.5, 0.948], [31.9163, 10.5, 0.712125], [33.3365, 10.5, 0.471], [34.5414, 10.5, 0.231375], [35.5, 10.5, 0], [5.01, 10.505, -1.5], [6.5345, 10.5048, -1.5], [8.059, 10.5045, -1.5], [9.5835, 10.5042, -1.5], [11.108, 10.504, -1.5], [12.6325, 10.5038, -1.5], [14.157, 10.5035, -1.5], [15.6815, 10.5033, -1.5], [17.206, 10.503, -1.5], [18.7305, 10.5028, -1.5], [20.255, 10.5025, -1.5], [21.7795, 10.5023, -1.5], [23.304, 10.502, -1.5], [24.8285, 10.5018, -1.5], [26.353, 10.5015, -1.5], [27.8775, 10.5012, -1.5], [29.402, 10.501, -1.5], [30.9265, 10.5008, -1.5], [32.451, 10.5005, -1.5], [33.9755, 10.5002, -1.5], [35.5, 10.5, -1.5]];
faces = [[0, 21, 1], [21, 22, 1], [1, 22, 2], [22, 23, 2], [2, 23, 3], [23, 24, 3], [3, 24, 4], [24, 25, 4], [4, 25, 5], [25, 26, 5], [5, 26, 6], [26, 27, 6], [6, 27, 7], [27, 28, 7], [7, 28, 8], [28, 29, 8], [8, 29, 9], [29, 30, 9], [9, 30, 10], [30, 31, 10], [10, 31, 11], [31, 32, 11], [11, 32, 12], [32, 33, 12], [12, 33, 13], [33, 34, 13], [13, 34, 14], [34, 35, 14], [14, 35, 15], [35, 36, 15], [15, 36, 16], [36, 37, 16], [16, 37, 17], [37, 38, 17], [17, 38, 18], [38, 39, 18], [18, 39, 19], [39, 40, 19], [19, 40, 20], [40, 41, 20], [21, 42, 22], [42, 43, 22], [22, 43, 23], [43, 44, 23], [23, 44, 24], [44, 45, 24], [24, 45, 25], [45, 46, 25], [25, 46, 26], [46, 47, 26], [26, 47, 27], [47, 48, 27], [27, 48, 28], [48, 49, 28], [28, 49, 29], [49, 50, 29], [29, 50, 30], [50, 51, 30], [30, 51, 31], [51, 52, 31], [31, 52, 32], [52, 53, 32], [32, 53, 33], [53, 54, 33], [33, 54, 34], [54, 55, 34], [34, 55, 35], [55, 56, 35], [35, 56, 36], [56, 57, 36], [36, 57, 37], [57, 58, 37], [37, 58, 38], [58, 59, 38], [38, 59, 39], [59, 60, 39], [39, 60, 40], [60, 61, 40], [40, 61, 41], [61, 62, 41], [42, 63, 43], [63, 64, 43], [43, 64, 44], [64, 65, 44], [44, 65, 45], [65, 66, 45], [45, 66, 46], [66, 67, 46], [46, 67, 47], [67, 68, 47], [47, 68, 48], [68, 69, 48], [48, 69, 49], [69, 70, 49], [49, 70, 50], [70, 71, 50], [50, 71, 51], [71, 72, 51], [51, 72, 52], [72, 73, 52], [52, 73, 53], [73, 74, 53], [53, 74, 54], [74, 75, 54], [54, 75, 55], [75, 76, 55], [55, 76, 56], [76, 77, 56], [56, 77, 57], [77, 78, 57], [57, 78, 58], [78, 79, 58], [58, 79, 59], [79, 80, 59], [59, 80, 60], [80, 81, 60], [60, 81, 61], [81, 82, 61], [61, 82, 62], [82, 83, 62], [63, 84, 64], [84, 85, 64], [64, 85, 65], [85, 86, 65], [65, 86, 66], [86, 87, 66], [66, 87, 67], [87, 88, 67], [67, 88, 68], [88, 89, 68], [68, 89, 69], [89, 90, 69], [69, 90, 70], [90, 91, 70], [70, 91, 71], [91, 92, 71], [71, 92, 72], [92, 93, 72], [72, 93, 73], [93, 94, 73], [73, 94, 74], [94, 95, 74], [74, 95, 75], [95, 96, 75], 
[75, 96, 76], [96, 97, 76], [76, 97, 77], [97, 98, 77], [77, 98, 78], [98, 99, 78], [78, 99, 79], [99, 100, 79], [79, 100, 80], [100, 101, 80], [80, 101, 81], [101, 102, 81], [81, 102, 82], [102, 103, 82], [82, 103, 83], [103, 104, 83], [84, 105, 85], [105, 106, 85], [85, 106, 86], [106, 107, 86], [86, 107, 87], [107, 108, 87], [87, 108, 88], [108, 109, 88], [88, 109, 89], [109, 110, 89], [89, 110, 90], [110, 111, 90], [90, 111, 91], [111, 112, 91], [91, 112, 92], [112, 113, 92], [92, 113, 93], [113, 114, 93], [93, 114, 94], [114, 115, 94], [94, 115, 95], [115, 116, 95], [95, 116, 96], [116, 117, 96], [96, 117, 97], [117, 118, 97], [97, 118, 98], [118, 119, 98], [98, 119, 99], [119, 120, 99], [99, 120, 100], [120, 121, 100], [100, 121, 101], [121, 122, 101], [101, 122, 102], [122, 123, 102], [102, 123, 103], [123, 124, 103], [103, 124, 104], [124, 125, 104], [105, 126, 106], [126, 127, 106], [106, 127, 107], [127, 128, 107], [107, 128, 108], [128, 129, 108], [108, 129, 109], [129, 130, 109], [109, 130, 110], [130, 131, 110], [110, 131, 111], [131, 132, 111], [111, 132, 112], [132, 133, 112], [112, 133, 113], [133, 134, 113], [113, 134, 114], [134, 135, 114], [114, 135, 115], [135, 136, 115], [115, 136, 116], [136, 137, 116], [116, 137, 117], [137, 138, 117], [117, 138, 118], [138, 139, 118], [118, 139, 119], [139, 140, 119], [119, 140, 120], [140, 141, 120], [120, 141, 121], [141, 142, 121], [121, 142, 122], [142, 143, 122], [122, 143, 123], [143, 144, 123], [123, 144, 124], [144, 145, 124], [124, 145, 125], [145, 146, 125], [126, 147, 127], [147, 148, 127], [127, 148, 128], [148, 149, 128], [128, 149, 129], [149, 150, 129], [129, 150, 130], [150, 151, 130], [130, 151, 131], [151, 152, 131], [131, 152, 132], [152, 153, 132], [132, 153, 133], [153, 154, 133], [133, 154, 134], [154, 155, 134], [134, 155, 135], [155, 156, 135], [135, 156, 136], [156, 157, 136], [136, 157, 137], [157, 158, 137], [137, 158, 138], [158, 159, 138], [138, 159, 139], [159, 160, 139], [139, 
160, 140], [160, 161, 140], [140, 161, 141], [161, 162, 141], [141, 162, 142], [162, 163, 142], [142, 163, 143], [163, 164, 143], [143, 164, 144], [164, 165, 144], [144, 165, 145], [165, 166, 145], [145, 166, 146], [166, 167, 146], [147, 168, 148], [168, 169, 148], [148, 169, 149], [169, 170, 149], [149, 170, 150], [170, 171, 150], [150, 171, 151], [171, 172, 151], [151, 172, 152], [172, 173, 152], [152, 173, 153], [173, 174, 153], [153, 174, 154], [174, 175, 154], [154, 175, 155], [175, 176, 155], [155, 176, 156], [176, 177, 156], [156, 177, 157], [177, 178, 157], [157, 178, 158], [178, 179, 158], [158, 179, 159], [179, 180, 159], [159, 180, 160], [180, 181, 160], [160, 181, 161], [181, 182, 161], [161, 182, 162], [182, 183, 162], [162, 183, 163], [183, 184, 163], [163, 184, 164], [184, 185, 164], [164, 185, 165], [185, 186, 165], [165, 186, 166], [186, 187, 166], [166, 187, 167], [187, 188, 167], [168, 189, 169], [189, 190, 169], [169, 190, 170], [190, 191, 170], [170, 191, 171], [191, 192, 171], [171, 192, 172], [192, 193, 172], [172, 193, 173], [193, 194, 173], [173, 194, 174], [194, 195, 174], [174, 195, 175], [195, 196, 175], [175, 196, 176], [196, 197, 176], [176, 197, 177], [197, 198, 177], [177, 198, 178], [198, 199, 178], [178, 199, 179], [199, 200, 179], [179, 200, 180], [200, 201, 180], [180, 201, 181], [201, 202, 181], [181, 202, 182], [202, 203, 182], [182, 203, 183], [203, 204, 183], [183, 204, 184], [204, 205, 184], [184, 205, 185], [205, 206, 185], [185, 206, 186], [206, 207, 186], [186, 207, 187], [207, 208, 187], [187, 208, 188], [208, 209, 188], [189, 210, 190], [210, 211, 190], [190, 211, 191], [211, 212, 191], [191, 212, 192], [212, 213, 192], [192, 213, 193], [213, 214, 193], [193, 214, 194], [214, 215, 194], [194, 215, 195], [215, 216, 195], [195, 216, 196], [216, 217, 196], [196, 217, 197], [217, 218, 197], [197, 218, 198], [218, 219, 198], [198, 219, 199], [219, 220, 199], [199, 220, 200], [220, 221, 200], [200, 221, 201], [221, 222, 201], 
[201, 222, 202], [222, 223, 202], [202, 223, 203], [223, 224, 203], [203, 224, 204], [224, 225, 204], [204, 225, 205], [225, 226, 205], [205, 226, 206], [226, 227, 206], [206, 227, 207], [227, 228, 207], [207, 228, 208], [228, 229, 208], [208, 229, 209], [229, 230, 209], [20, 41, 62, 83, 104, 125, 146, 167, 188, 209, 230], [20, 230, 210, 189, 168, 147, 126, 105, 84, 63, 42, 21, 0]];
polyhedron(points,faces);
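One way to look for a reason the mesh is not treated as solid is to check that it is closed: every directed edge a→b of a face should be matched by exactly one b→a in a neighbouring face. The sketch below is plain Python, outside OpenSCAD, and the function name unmatched_edges is hypothetical. In the face list above, the last face runs directly from vertex 0 to vertex 20 and from 230 to 210, while the triangles along those borders use the intermediate vertices 1 to 19 and 211 to 229. That is the kind of mismatch this check reports, though it is not confirmed as the actual cause here.

```python
from collections import Counter


def unmatched_edges(faces):
    """Return directed edges that lack exactly one opposite-direction partner."""
    directed = Counter()
    for face in faces:
        # Walk the face boundary, including the closing edge last -> first.
        for a, b in zip(face, face[1:] + face[:1]):
            directed[(a, b)] += 1
    return sorted(
        edge for edge, count in directed.items()
        if count != 1 or directed[(edge[1], edge[0])] != 1
    )


# A closed tetrahedron: every edge is shared by two faces.
closed = [[0, 1, 2], [0, 3, 1], [1, 3, 2], [2, 3, 0]]

# Same shape, but one face has an extra vertex 4 on the edge 0-1
# while its neighbour still uses the edge 1-0 directly (a T-junction).
t_junction = [[0, 4, 1, 2], [0, 3, 1], [1, 3, 2], [2, 3, 0]]

closed_report = unmatched_edges(closed)
t_junction_report = unmatched_edges(t_junction)
```

Pasting the real faces list into this check would list every border edge that has no partner; an empty result would point the search elsewhere.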

Spark Scala TF-IDF value sorted vectors

So far I have been able to tokenize all of my documents and use CountVectorizer and IDF from Spark's MLlib. I am trying to get the top 50 words from each document, but I am not sure how to sort the output of IDF.
onePer is a dataframe of document IDs and tokenized documents.
val tf = new CountVectorizer()
.setInputCol("text")
.setOutputCol("features").fit(onePer)
.transform(onePer).select("features").rdd
.map{x:Row => x.getAs[Vector](0)}
tf.cache()
val idf = new IDF().fit(tf)
val tfidf: RDD[Vector] = idf.transform(tf)
This is what my output looks like (number of words in vocab, id of word, word score). I would like to sort by score and get the top k:
(440,[0,2,3,4,5,6,7,8,9,10,12,15,17,18,19,22,23,24,25,26,27,28,30,31,32,33,34,35,39,41,43,45,47,49,51,52,53,55,57,63,66,69,70,71,74,76,79,80,83,84,85,88,94,95,96,97,99,102,106,107,109,111,117,120,121,124,127,128,129,138,142,145,146,149,154,156,164,166,167,170,171,176,187,189,199,203,204,217,218,219,232,234,236,237,238,240,248,250,251,254,259,263,265,267,280,291,296,302,304,309,319,322,328,333,347,361,364,371,375,384,388,393,395,401,403,433,438,439],[1.3559553712291716,3.9422868018213513,0.6369074622370692,7.795697904781566,3.153829441457081,0.0,5.519201522549892,0.3184537311185346,0.3184537311185346,1.3559553712291716,0.4519851237430572,0.4519851237430572,0.6061358035703155,1.0116009116784799,0.4519851237430572,0.7884573603642703,0.4519851237430572,2.0232018233569597,0.7884573603642703,8.523740461192126,0.6061358035703155,0.6061358035703155,0.6061358035703155,0.6061358035703155,0.7884573603642703,0.6061358035703155,0.6061358035703155,0.6061358035703155,0.7884573603642703,0.7884573603642703,1.0116009116784799,1.0116009116784799,2.0232018233569597,0.7884573603642703,0.7884573603642703,3.897848952390783,0.7884573603642703,0.7884573603642703,1.0116009116784799,5.114244276715276,1.0116009116784799,1.0116009116784799,2.5985659682605218,1.2992829841302609,1.2992829841302609,1.0116009116784799,1.0116009116784799,1.0116009116784799,1.0116009116784799,1.0116009116784799,2.5985659682605218,1.0116009116784799,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,3.4094961844768505,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,3.4094961844768505,1.2992829841302609,1.2992829841302609,1.2992829841302609,3.4094961844768505,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.7047480922384253,1.7047480
922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253])
Update
I was able to get this working by doing the following:
tfidf.map(x => x.toSparse).map{x => x.indices.zip(x.values)
.sortBy(-_._2)
.take(10)
.map(_._1)
}
This might help:
scala> val x = (440,Array[Int](0,2,3,4,5,6,7,8,9,10,12,15,17,18,19,22,23,24,25,26,27,28,30,31,32,33,34,35,39,41,43,45,47,49,51,52,53,55,57,63,66,69,70,71,74,76,79,80,83,84,85,88,94,95,96,97,99,102,106,107,109,111,117,120,121,124,127,128,129,138,142,145,146,149,154,156,164,166,167,170,171,176,187,189,199,203,204,217,218,219,232,234,236,237,238,240,248,250,251,254,259,263,265,267,280,291,296,302,304,309,319,322,328,333,347,361,364,371,375,384,388,393,395,401,403,433,438,439),Array[Double](1.3559553712291716,3.9422868018213513,0.6369074622370692,7.795697904781566,3.153829441457081,0.0,5.519201522549892,0.3184537311185346,0.3184537311185346,1.3559553712291716,0.4519851237430572,0.4519851237430572,0.6061358035703155,1.0116009116784799,0.4519851237430572,0.7884573603642703,0.4519851237430572,2.0232018233569597,0.7884573603642703,8.523740461192126,0.6061358035703155,0.6061358035703155,0.6061358035703155,0.6061358035703155,0.7884573603642703,0.6061358035703155,0.6061358035703155,0.6061358035703155,0.7884573603642703,0.7884573603642703,1.0116009116784799,1.0116009116784799,2.0232018233569597,0.7884573603642703,0.7884573603642703,3.897848952390783,0.7884573603642703,0.7884573603642703,1.0116009116784799,5.114244276715276,1.0116009116784799,1.0116009116784799,2.5985659682605218,1.2992829841302609,1.2992829841302609,1.0116009116784799,1.0116009116784799,1.0116009116784799,1.0116009116784799,1.0116009116784799,2.5985659682605218,1.0116009116784799,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,3.4094961844768505,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,3.4094961844768505,1.2992829841302609,1.2992829841302609,1.2992829841302609,3.4094961844768505,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829841302609,1.2992829
841302609,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253,1.7047480922384253))
scala> val (r, indices, values) = x
r: Int = 440
indices: Array[Int] = Array(0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 17, 18, 19, 22, 23, 24, 25, 26, 27, 28, 30, 31, 32, 33, 34, 35, 39, 41, 43, 45, 47, 49, 51, 52, 53, 55, 57, 63, 66, 69, 70, 71, 74, 76, 79, 80, 83, 84, 85, 88, 94, 95, 96, 97, 99, 102, 106, 107, 109, 111, 117, 120, 121, 124, 127, 128, 129, 138, 142, 145, 146, 149, 154, 156, 164, 166, 167, 170, 171, 176, 187, 189, 199, 203, 204, 217, 218, 219, 232, 234, 236, 237, 238, 240, 248, 250, 251, 254, 259, 263, 265, 267, 280, 291, 296, 302, 304, 309, 319, 322, 328, 333, 347, 361, 364, 371, 375, 384, 388, 393, 395, 401, 403, 433, 438, 439)
values: Array[Double] = Array(1.3559553712291716, 3.9422868018213513, 0.6369074622370692, 7.795697904781566, 3.153829441457081, 0.0, 5.519201522549892, 0.3184537311185346, 0.31845373...
scala> val topTermIds = indices.zip(values).sortBy( - _._2).take(50).map(_._1)
topTermIds: Array[Int] = Array(26, 4, 7, 63, 2, 52, 109, 124, 138, 5, 70, 85, 24, 47, 176, 187, 189, 199, 203, 204, 217, 218, 219, 232, 234, 236, 237, 238, 240, 248, 250, 251, 254, 259, 263, 265, 267, 280, 291, 296, 302, 304, 309, 319, 322, 328, 333, 347, 361, 364)
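Now you need to plug the above code into a closure, something like:
val topTermsByScore = rdd.map { v: Vector =>
// rdd is the RDD[Vector] returned by idf.transform (tfidf in the question)
// Vector itself has no indices/values, so convert to a SparseVector first
val sv = v.toSparse
// negate the score to sort in decreasing order
sv.indices.zip(sv.values).sortBy( - _._2).take(50).map(_._1)
}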
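The same top-k selection can be sketched outside Spark in plain Python, here with a heap so that only the k best pairs are kept instead of sorting the whole vector. The sketch also shows the step of mapping term IDs back to words. The names top_terms and vocab are hypothetical, and the vocabulary list is made up; in Spark, the fitted CountVectorizerModel exposes a vocabulary array that could play this role.

```python
import heapq


def top_terms(indices, values, k, vocabulary=None):
    """Return the k term IDs (or words) with the highest scores."""
    best = heapq.nlargest(k, zip(indices, values), key=lambda pair: pair[1])
    ids = [term_id for term_id, _ in best]
    if vocabulary is None:
        return ids
    return [vocabulary[term_id] for term_id in ids]


# First five (index, score) pairs of the sparse vector shown in the question.
indices = [0, 2, 3, 4, 5]
values = [1.3559553712291716, 3.9422868018213513, 0.6369074622370692,
          7.795697904781566, 3.153829441457081]

# Hypothetical vocabulary, indexed by term ID.
vocab = ["w0", "w1", "w2", "w3", "w4", "w5"]

top_ids = top_terms(indices, values, 3)
top_words = top_terms(indices, values, 3, vocab)
```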