I have a set of film ratings data, and I need to calculate the average rating for each film. It's like taking AVG(rating) with GROUP BY movieId in SQL.
I've tried to use aggregateByKey, but I don't know how to use the seqOp and combOp functions. I'm new to PySpark.
Here is a sample of my RDD, with fields [movieId, userId, rating, film]:
[('1', '1', 4.0, 'Toy Story (1995)'),
('1', '5', 4.0, 'Toy Story (1995)'),
('1', '7', 4.5, 'Toy Story (1995)'),
('1', '15', 2.5, 'Toy Story (1995)'),
('1', '17', 4.5, 'Toy Story (1995)'),
('1', '18', 3.5, 'Toy Story (1995)'),
('1', '19', 4.0, 'Toy Story (1995)'),
('1', '21', 3.5, 'Toy Story (1995)'),
('1', '27', 3.0, 'Toy Story (1995)'),
('1', '31', 5.0, 'Toy Story (1995)'),
('1', '32', 3.0, 'Toy Story (1995)'),
('1', '33', 3.0, 'Toy Story (1995)'),
('1', '40', 5.0, 'Toy Story (1995)'),
('1', '43', 5.0, 'Toy Story (1995)'),
('1', '44', 3.0, 'Toy Story (1995)'),
('1', '45', 4.0, 'Toy Story (1995)'),
('1', '46', 5.0, 'Toy Story (1995)'),
('1', '50', 3.0, 'Toy Story (1995)'),
('1', '54', 3.0, 'Toy Story (1995)'),
('1', '57', 5.0, 'Toy Story (1995)')]
I need to calculate the average rating for each film, something like:
[('1', average_ratings_of_film_1, film_name_1),
('2', average_ratings_of_film_2, film_name_2)]
Thank you very much for your help.
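You can convert your RDD to a DataFrame and then use groupby().avg(). The example below groups by the movie title; to keep the ID in the result as well, group by both columns with df.groupby("movie_id", "movie").
from pyspark.sql import SparkSession
# Reuse the active session or create one (in the pyspark shell, spark already exists).
spark = SparkSession.builder.getOrCreate()
data = spark.sparkContext.parallelize(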
[('1', '1', 4.0, 'Toy Story (1995)'),
('1', '5', 4.0, 'Toy Story (1995)'),
('1', '7', 4.5, 'Toy Story (1995)'),
('1', '15', 2.5, 'Toy Story (1995)'),
('1', '17', 4.5, 'Toy Story (1995)'),
('1', '18', 3.5, 'Toy Story (1995)'),
('1', '19', 4.0, 'Toy Story (1995)'),
('1', '21', 3.5, 'Toy Story (1995)'),
('1', '27', 3.0, 'Toy Story (1995)'),
('1', '31', 5.0, 'Toy Story (1995)'),
('1', '32', 3.0, 'Toy Story (1995)'),
('1', '33', 3.0, 'Toy Story (1995)'),
('1', '40', 5.0, 'Toy Story (1995)'),
('1', '43', 5.0, 'Toy Story (1995)'),
('1', '44', 3.0, 'Toy Story (1995)'),
('1', '45', 4.0, 'Toy Story (1995)'),
('1', '46', 5.0, 'Toy Story (1995)'),
('1', '50', 3.0, 'Toy Story (1995)'),
('1', '54', 3.0, 'Toy Story (1995)'),
('1', '57', 5.0, 'Toy Story (1995)')])
df = data.toDF(schema=["movie_id", "user_id", "rating", "movie"])
group = df.groupby("movie").avg("rating")
group.show()
#+----------------+-----------+
#|           movie|avg(rating)|
#+----------------+-----------+
#|Toy Story (1995)|      3.875|
#+----------------+-----------+
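Since the question asked specifically about aggregateByKey, here is a minimal sketch of what seqOp and combOp are expected to do. It runs in plain Python without Spark: simulate_aggregate_by_key is a hypothetical stand-in written for illustration, not part of PySpark, and the two-partition split is an assumption. The accumulator is a (sum, count) pair; seq_op folds one rating into the accumulator within a partition, and comb_op merges accumulators from different partitions.

```python
# Sample rows in the question's layout: (movieId, userId, rating, film).
rows = [
    ('1', '1', 4.0, 'Toy Story (1995)'),
    ('1', '5', 4.0, 'Toy Story (1995)'),
    ('1', '7', 4.5, 'Toy Story (1995)'),
    ('1', '15', 2.5, 'Toy Story (1995)'),
]

ZERO = (0.0, 0)  # accumulator: (running sum of ratings, number of ratings)


def seq_op(acc, rating):
    """Fold one rating into a partition-local accumulator."""
    return (acc[0] + rating, acc[1] + 1)


def comb_op(a, b):
    """Merge two accumulators that were built on different partitions."""
    return (a[0] + b[0], a[1] + b[1])


def simulate_aggregate_by_key(partitions, zero, seq, comb):
    """Hypothetical plain-Python stand-in for aggregateByKey (illustration only)."""
    partials = []
    for part in partitions:
        local = {}
        for key, value in part:
            # Within a partition, each key starts from the zero value.
            local[key] = seq(local.get(key, zero), value)
        partials.append(local)
    merged = {}
    for local in partials:
        for key, acc in local.items():
            # Across partitions, per-key accumulators are combined.
            merged[key] = comb(merged[key], acc) if key in merged else acc
    return merged


# Key by (movieId, film) so the title survives the aggregation.
pairs = [((movie_id, film), rating) for movie_id, _user, rating, film in rows]
partitions = [pairs[:2], pairs[2:]]  # assumption: data split over two partitions

totals = simulate_aggregate_by_key(partitions, ZERO, seq_op, comb_op)
averages = [(movie_id, total / count, film)
            for (movie_id, film), (total, count) in totals.items()]
print(averages)  # [('1', 3.75, 'Toy Story (1995)')]
```

On a real RDD, the same two functions could be passed directly, along the lines of rdd.map(lambda r: ((r[0], r[3]), r[2])).aggregateByKey((0.0, 0), seq_op, comb_op).mapValues(lambda t: t[0] / t[1]). Carrying a (sum, count) pair rather than a running average is what lets partial results be combined in any order.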
Related
I want to build a list that looks like the one in the image, using the data shown below. I haven't tried anything yet because I don't know how to approach it.
Illustration of similar work:
Code:
class Meue {
  int id;
  String desc;
  int parent;

  Meue({required this.id, required this.desc, required this.parent});
}
List<Meue> itemsMenue=[
Meue(id: 1,desc:'number 1' ,parent:0 ),
Meue(id: 2,desc:'number 2' ,parent: 0),
Meue(id: 3,desc:'number 3' ,parent: 0),
Meue(id: 4,desc:'number 4' ,parent: 0),
Meue(id: 5,desc:'number 5' ,parent: 0),
Meue(id: 6,desc:'number 6' ,parent: 1),
Meue(id: 7,desc:'number 7' ,parent: 1),
Meue(id: 8,desc:'number 8' ,parent: 1),
Meue(id: 9,desc:'number 9' ,parent: 1),
Meue(id: 10,desc:'number 10' ,parent:1 ),
Meue(id: 11,desc:'number 11' ,parent: 2),
Meue(id: 12,desc:'number 12' ,parent: 2),
Meue(id: 13,desc:'number 13' ,parent: 2),
Meue(id: 14,desc:'number 14' ,parent: 2),
Meue(id: 15,desc:'number 15' ,parent: 2),
Meue(id: 16,desc:'number 16' ,parent: 2),
Meue(id: 17,desc:'number 17' ,parent: 3),
Meue(id: 18,desc:'number 18' ,parent: 3),
Meue(id: 19,desc:'number 19' ,parent: 3),
Meue(id: 20,desc:'number 20' ,parent: 3),
Meue(id: 21,desc:'number 21' ,parent: 3),
Meue(id: 22,desc:'number 22' ,parent: 3),
Meue(id: 23,desc:'number 23' ,parent: 4),
Meue(id: 24,desc:'number 24' ,parent: 4),
Meue(id: 25,desc:'number 25' ,parent: 4),
Meue(id: 26,desc:'number 26' ,parent: 4),
Meue(id: 27,desc:'number 27' ,parent: 4),
Meue(id: 28,desc:'number 28' ,parent: 30),
Meue(id: 29,desc:'number 29' ,parent: 31),
Meue(id: 30,desc:'number 30' ,parent: 29),
Meue(id: 31,desc:'number 31' ,parent: 28),
];
I have a list of texts with durations for animation:
List<RhymeModel> rhymePhrases = [
RhymeModel(lyricsPhrase: 'Baa, baa', startAt: 0.0, endAt: 0.2),
RhymeModel(lyricsPhrase: 'black sheep', startAt: 0.3, endAt: 0.4),
RhymeModel(lyricsPhrase: 'Have you', startAt: 0.5, endAt: 0.6),
RhymeModel(lyricsPhrase: 'any wool?', startAt: 0.7, endAt: 0.8),
RhymeModel(lyricsPhrase: 'Yes, sir,', startAt: 0.9, endAt: 1.0),
RhymeModel(lyricsPhrase: 'yes, sir,', startAt: 1.1, endAt: 1.2),
RhymeModel(lyricsPhrase: 'Three bags full.', startAt: 1.3, endAt: 1.4),
RhymeModel(lyricsPhrase: 'One for the master,', startAt: 1.5, endAt: 1.6),
RhymeModel(lyricsPhrase: 'And one for', startAt: 1.7, endAt: 1.8),
RhymeModel(lyricsPhrase: 'dame,And one', startAt: 1.9, endAt: 2.0),
RhymeModel(lyricsPhrase: 'for the little', startAt: 2.1, endAt: 2.2),
RhymeModel(lyricsPhrase: 'boy Who lives ', startAt: 2.3, endAt: 2.4),
RhymeModel(lyricsPhrase: 'down the lane.', startAt: 2.5, endAt: 2.6),
];
My objective is to animate each text with a reveal animation based on its duration (similar to how lyrics are synced with audio). How can I animate each text this way?
I am using this Flutter package: https://pub.dev/packages/select_form_field
I am having an issue with searching in "SelectFormFieldType.dialog". I don't know if I have missed something, but the search doesn't seem to work.
Here is my code:
List of items:
List<Map<String, dynamic>> _items = [
{
'value': '',
'label': 'Select Account',
'icon': Icon(Icons.account_balance)
},
{'value': '1', 'label': 'Account 1', 'icon': Icon(Icons.account_balance)},
{'value': '2', 'label': 'Account 2', 'icon': Icon(Icons.account_balance)},
{'value': '3', 'label': 'Account 3', 'icon': Icon(Icons.account_balance)}
];
Here is the SelectFormField:
SelectFormField(
dialogCancelBtn: 'Cancel',
enableSearch: true,
dialogSearchHint: 'Search account',
type: SelectFormFieldType.dialog,
icon: Icon(Icons.account_balance),
labelText: 'Select Account',
hintText: 'Select Account',
items: _items,
onChanged: (val) {
print(val);
},
)
[Screenshot: list of items]
[Screenshot: output when searching for an item]
I have two lists in Dart, as shown below:
final List availableIssueComponents = [
{'id': 1, 'componentName': 'Cash Acceptor'},
{'id': 2, 'componentName': 'Printer'},
{'id': 3, 'componentName': 'PIN Pad'},
{'id': 4, 'componentName': 'Key Board'},
{'id': 5, 'componentName': 'Touch Screen'},
{'id': 6, 'componentName': 'Computer'},
{'id': 7, 'componentName': 'Application'},
{'id': 8, 'componentName': 'Network'},
{'id': 9, 'componentName': 'Power'},
{'id': 10, 'componentName': 'Camera'},
{'id': 11, 'componentName': 'Safe'},
{'id': 13, 'componentName': 'Screen'},
{'id': 14, 'componentName': 'Battery'},
{'id': 15, 'componentName': 'Ports'},
{'id': 16, 'componentName': 'Application'},
{'id': 17, 'componentName': 'Safe'},
{'id': 18, 'componentName': 'Camera'},
{'id': 19, 'componentName': 'Power'},
{'id': 20, 'componentName': 'Key Board'},
{'id': 21, 'componentName': 'PIN Pad'},
{'id': 22, 'componentName': 'Printer'},
{'id': 23, 'componentName': 'Computer'},
{'id': 24, 'componentName': 'Touch Screen'},
{'id': 25, 'componentName': 'Application'},
{'id': 26, 'componentName': 'Network'}
];
final List selectedIssueComponents = [
{'id': 3, 'componentName': 'PIN Pad'},
{'id': 6, 'componentName': 'Computer'},
{'id': 19, 'componentName': 'Power'},
];
From these two lists, I am trying to select all the elements of availableIssueComponents except those already present in selectedIssueComponents.
For example, since the components with ids 3, 6, and 19 appear in both lists, I want a third list that contains every component except those three.
The third list should look like this:
final List availableIssueComponents = [
{'id': 1, 'componentName': 'Cash Acceptor'},
{'id': 2, 'componentName': 'Printer'},
{'id': 4, 'componentName': 'Key Board'},
{'id': 5, 'componentName': 'Touch Screen'},
{'id': 7, 'componentName': 'Application'},
{'id': 8, 'componentName': 'Network'},
{'id': 9, 'componentName': 'Power'},
{'id': 10, 'componentName': 'Camera'},
{'id': 11, 'componentName': 'Safe'},
{'id': 13, 'componentName': 'Screen'},
{'id': 14, 'componentName': 'Battery'},
{'id': 15, 'componentName': 'Ports'},
{'id': 16, 'componentName': 'Application'},
{'id': 17, 'componentName': 'Safe'},
{'id': 18, 'componentName': 'Camera'},
{'id': 20, 'componentName': 'Key Board'},
{'id': 21, 'componentName': 'PIN Pad'},
{'id': 22, 'componentName': 'Printer'},
{'id': 23, 'componentName': 'Computer'},
{'id': 24, 'componentName': 'Touch Screen'},
{'id': 25, 'componentName': 'Application'},
{'id': 26, 'componentName': 'Network'}
];
I tried to do this using Sets, with the following approach:
Set availableComponentsSet = Set.from(availableIssueComponents);
Set issueComponentsSet = Set.from(selectedIssueComponents);
Set resultComponents = availableComponentsSet.difference(issueComponentsSet);
But when I logged resultComponents to the console, it still contained all the components, which is not what I wanted. I also tried nested for loops, and that did not work either.
The component objects are not filtered out when using a Set<Map> because Dart compares Map instances by identity by default: two maps are equal only if they are the same instance, even when their contents match (as @Pat9RB commented).
I would map the selected IDs to a list, then filter out those IDs using List#where(fn):
final availableIssueComponents = [
{'id': 1, 'componentName': 'Cash Acceptor'},
{'id': 2, 'componentName': 'Printer'},
{'id': 3, 'componentName': 'PIN Pad'},
{'id': 4, 'componentName': 'Key Board'},
{'id': 5, 'componentName': 'Touch Screen'},
{'id': 6, 'componentName': 'Computer'},
{'id': 7, 'componentName': 'Application'},
{'id': 8, 'componentName': 'Network'},
{'id': 9, 'componentName': 'Power'},
{'id': 10, 'componentName': 'Camera'},
{'id': 11, 'componentName': 'Safe'},
{'id': 13, 'componentName': 'Screen'},
{'id': 14, 'componentName': 'Battery'},
{'id': 15, 'componentName': 'Ports'},
{'id': 16, 'componentName': 'Application'},
{'id': 17, 'componentName': 'Safe'},
{'id': 18, 'componentName': 'Camera'},
{'id': 19, 'componentName': 'Power'},
{'id': 20, 'componentName': 'Key Board'},
{'id': 21, 'componentName': 'PIN Pad'},
{'id': 22, 'componentName': 'Printer'},
{'id': 23, 'componentName': 'Computer'},
{'id': 24, 'componentName': 'Touch Screen'},
{'id': 25, 'componentName': 'Application'},
{'id': 26, 'componentName': 'Network'}
];
final selectedIssueComponents = [
{'id': 3, 'componentName': 'PIN Pad'},
{'id': 6, 'componentName': 'Computer'},
{'id': 19, 'componentName': 'Power'},
];
final selectedIds = selectedIssueComponents.map((component) => component['id']).toList();
final filtered = availableIssueComponents.where((element) => !selectedIds.contains(element["id"])).toList();
print(filtered);
If you prefer to use Set and difference, you could create sets of the ids instead. The ids are int values, which compare by value rather than by identity, so difference filters as expected:
final availableIssueComponents = [
{'id': 1, 'componentName': 'Cash Acceptor'},
{'id': 2, 'componentName': 'Printer'},
{'id': 3, 'componentName': 'PIN Pad'},
{'id': 4, 'componentName': 'Key Board'},
{'id': 5, 'componentName': 'Touch Screen'},
{'id': 6, 'componentName': 'Computer'},
{'id': 7, 'componentName': 'Application'},
{'id': 8, 'componentName': 'Network'},
{'id': 9, 'componentName': 'Power'},
{'id': 10, 'componentName': 'Camera'},
{'id': 11, 'componentName': 'Safe'},
{'id': 13, 'componentName': 'Screen'},
{'id': 14, 'componentName': 'Battery'},
{'id': 15, 'componentName': 'Ports'},
{'id': 16, 'componentName': 'Application'},
{'id': 17, 'componentName': 'Safe'},
{'id': 18, 'componentName': 'Camera'},
{'id': 19, 'componentName': 'Power'},
{'id': 20, 'componentName': 'Key Board'},
{'id': 21, 'componentName': 'PIN Pad'},
{'id': 22, 'componentName': 'Printer'},
{'id': 23, 'componentName': 'Computer'},
{'id': 24, 'componentName': 'Touch Screen'},
{'id': 25, 'componentName': 'Application'},
{'id': 26, 'componentName': 'Network'}
];
final selectedIssueComponents = [
{'id': 3, 'componentName': 'PIN Pad'},
{'id': 6, 'componentName': 'Computer'},
{'id': 19, 'componentName': 'Power'},
];
final availableIds = availableIssueComponents.map((component) => component['id']).toSet();
final selectedIds = selectedIssueComponents.map((component) => component['id']).toSet();
final filteredIds = availableIds.difference(selectedIds);
final filteredComponents = availableIssueComponents.where((element) => filteredIds.contains(element["id"])).toList();
print(filteredComponents);
I have a Google timeline chart. I need to highlight the block that the current date passes through. How can I implement this? This is my code. Here I need to highlight the 'F' block, which is the one the current date passes through.
function drawChart() {
  var container = document.getElementById('Gateways');
  var chart = new google.visualization.Timeline(container);
  var dataTable = new google.visualization.DataTable();

  dataTable.addColumn({ type: 'string', id: 'Room' });
  dataTable.addColumn({ type: 'string', id: 'Name' });
  dataTable.addColumn({ type: 'date', id: 'Start' });
  dataTable.addColumn({ type: 'date', id: 'End' });
  dataTable.addRows([
    [ '1', 'A', new Date(2011, 3, 30), new Date(2012, 2, 4) ],
    [ '1', 'B', new Date(2012, 2, 4), new Date(2013, 3, 30) ],
    [ '1', 'C', new Date(2013, 3, 30), new Date(2014, 2, 4) ],
    [ '1', 'D', new Date(2014, 2, 4), new Date(2015, 2, 4) ],
    [ '1', 'E', new Date(2015, 3, 30), new Date(2016, 2, 4) ],
    [ '1', 'F', new Date(2016, 2, 4), new Date(2017, 2, 4) ],
    [ '1', 'G', new Date(2017, 2, 4), new Date(2018, 2, 4) ],
    [ '1', 'H', new Date(2018, 2, 4), new Date(2019, 2, 4) ],
    [ '1', 'I', new Date(2019, 2, 4), new Date(2020, 2, 4) ],
    [ '1', 'J', new Date(2020, 2, 4), new Date(2021, 2, 4) ]
  ]);

  var options = {
    timeline: { showRowLabels: false },
    avoidOverlappingGridLines: false
  };

  chart.draw(dataTable, options);
}