I'm trying to filter a list of objects with Guava. For example, I have a class Team and would like to get all the teams with a position of 5 or lower.
Iterable<Team> test = Iterables.filter(teams, new Predicate<Team>() {
    public boolean apply(Team p) {
        return p.getPosition() <= 5;
    }
});
I'm getting two errors: "Predicate cannot be resolved to a type" and "The method filter(Iterable, Predicate) in the type Iterables is not applicable for the arguments (List<Team>, new Predicate<Team>(){})".
I'm able to filter Iterables of type Integer.
Iterable<Integer> t6 = Iterables.filter(set1, Range.open(0, 3));
How do I filter objects based on their members in Guava? I want to use this library in my Android project and have many filtering conditions. Can it be used for class objects, or is it only for simple data types?
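You need a final variable, like range in this example.
This is the way to filter with external parameters: the Predicate is an anonymous inner class, so it can only capture local variables that are final (or effectively final from Java 8 onward).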
final Range range = new IntRange(0, 3);
Iterable<Team> test = Iterables.filter(teams, new Predicate<Team>() {
    public boolean apply(Team p) {
        return range.containsInteger(p.getPosition());
    }
});
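For what it's worth, here is a minimal, self-contained sketch of the same pattern that compiles without Guava on the classpath. It swaps in the JDK's java.util.function.Predicate (method test) for Guava's com.google.common.base.Predicate (method apply). The Team fields, the filter helper and topTeams are made up for illustration. With Guava itself, a "cannot be resolved to a type" error typically means the import of com.google.common.base.Predicate is missing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for the Team class from the question.
class Team {
    private final String name;
    private final int position;

    Team(String name, int position) {
        this.name = name;
        this.position = position;
    }

    String getName() { return name; }
    int getPosition() { return position; }
}

class TeamFilterSketch {
    // Keeps every element the predicate accepts; works for any element type.
    static <T> List<T> filter(Iterable<T> items, Predicate<? super T> predicate) {
        List<T> kept = new ArrayList<>();
        for (T item : items) {
            if (predicate.test(item)) {
                kept.add(item);
            }
        }
        return kept;
    }

    // Filters teams by a member, capturing an external (final) parameter.
    static List<Team> topTeams(List<Team> teams, final int maxPosition) {
        return filter(teams, new Predicate<Team>() {
            @Override
            public boolean test(Team t) {
                return t.getPosition() <= maxPosition;
            }
        });
    }

    public static void main(String[] args) {
        List<Team> teams = new ArrayList<>();
        teams.add(new Team("A", 1));
        teams.add(new Team("B", 7));
        teams.add(new Team("C", 5));
        for (Team t : topTeams(teams, 5)) {
            System.out.println(t.getName() + " " + t.getPosition());
        }
    }
}
```

The point is that the predicate is an ordinary object, so it can inspect any member of Team; nothing restricts filtering to simple types such as Integer.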
Why do I have to define a subclass to get the Type of a superclass's generic parameter? Is this limitation necessary?
I read the source code of Alibaba's Fastjson and tried to figure out why using TypeReference requires creating an anonymous subclass. I found that an object cannot obtain its own generic parameter Type, or even its own Type.
public class TypeReference<T> {
    static ConcurrentMap<Type, Type> classTypeCache
        = new ConcurrentHashMap<Type, Type>(16, 0.75f, 1);
    protected final Type type;
    protected TypeReference() {
        Type superClass = getClass().getGenericSuperclass();
        Type type = ((ParameterizedType) superClass).getActualTypeArguments()[0];
        Type cachedType = classTypeCache.get(type);
        if (cachedType == null) {
            classTypeCache.putIfAbsent(type, type);
            cachedType = classTypeCache.get(type);
        }
        this.type = cachedType;
    }
    // ...
}
Sorry for my poor English. Thanks for your answers.
Because of type erasure.
Consider the following example:
List<String> stringList = new ArrayList<>();
List<Number> numberList = new ArrayList<>();
System.out.println(stringList.getClass() == numberList.getClass());
This will print true. Regardless of the generic type, both instances of ArrayList have the same class and a single Class object. So how could this single Class object return the right Type for both objects?
We can even go a step further:
List<String> stringList = Collections.emptyList();
List<Number> numberList = Collections.emptyList();
System.out.println(stringList == (Object)numberList);
This prints true as well. Objects do not know their generic type. If a collection is immutable and always empty, it can be used to represent arbitrary empty lists. The same applies to stateless functions:
Function<String, String> stringFunction = Function.identity();
Function<Number, Number> numberFunction = Function.identity();
System.out.println(stringFunction == (Object)numberFunction);
Prints true (on most systems; this is not a guaranteed behavior).
Generic types are only retained in some specific cases, like the signatures of field and method declarations and generic super types.
That's why you need to create a subclass: it exploits the fact that the class stores its declared generic supertype. While it would sometimes be useful to construct a Type instance in a simpler way, and a suitable factory method can be regarded as a missing feature, getting the actual generic type of an arbitrary object (or its Class) is not possible in general.
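To make this concrete, here is a minimal sketch of the subclass trick using only the JDK. SimpleTypeRef and TypeRefSketch are hypothetical names for illustration; this is not Fastjson's actual implementation (it leaves out the cache, for instance).

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal type token, modelled on the TypeReference idea above.
abstract class SimpleTypeRef<T> {
    private final Type type;

    protected SimpleTypeRef() {
        // An anonymous subclass records its generic superclass,
        // e.g. SimpleTypeRef<List<String>>, in its class file.
        Type superClass = getClass().getGenericSuperclass();
        if (!(superClass instanceof ParameterizedType)) {
            throw new IllegalStateException("Create an anonymous subclass with a type argument");
        }
        this.type = ((ParameterizedType) superClass).getActualTypeArguments()[0];
    }

    Type getType() { return type; }
}

class TypeRefSketch {
    // The trailing {} creates the anonymous subclass that carries the type argument.
    static Type listOfStringType() {
        return new SimpleTypeRef<List<String>>() {}.getType();
    }

    // Plain instances share one Class object regardless of their type arguments.
    static boolean erasedClassesMatch() {
        List<String> strings = new ArrayList<>();
        List<Number> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(listOfStringType().getTypeName()); // java.util.List<java.lang.String>
        System.out.println(erasedClassesMatch());             // true
    }
}
```

Without the trailing {}, getGenericSuperclass() would describe the superclass of SimpleTypeRef itself, which carries no type argument, so there would be nothing to read.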
I have the following scenario:
case class A(name:String)
class Eq { def isMe(s:String) = s == "ME" }
val a = List(A("ME")).toDS
a.filter(l => new Eq().isMe(l.name))
Does this create a new Eq object for each data point on each executor?
Nice one! I didn't know there is a different filter method for a typed dataset.
In order to answer your question, I will do some deep dive into Spark internals.
filter on a typed Dataset has the following signature:
def filter(func: T => Boolean): Dataset[T]
Note that func is parameterized with T, so Spark needs to deserialize your object A as well as the function. In the logical plan this shows up as a TypedFilter node:
TypedFilter Main$$$Lambda$, class A, [StructField(name,StringType,true)], newInstance(class A)
where Main$$$Lambda$ is a randomly generated function name.
During the optimization phase, it might be eliminated by the EliminateSerialization rule if the following condition is met:
ds.map(...).filter(...) can be optimized by this rule to save extra deserialization, but ds.map(...).as[AnotherType].filter(...) can not be optimized.
If the rule is applicable TypedFilter is replaced by Filter.
The catch here is the Filter's condition. It is in fact another special expression, named Invoke, where:
targetObject is the filter function Main$$$Lambda$
functionName is apply since it is a regular Scala function.
Spark eventually runs in one of two modes: code generation or interpretation. Let's concentrate on the first one, as it is the default.
Here is a simplified stack trace of the method invocations that generate the code:
SparkPlan.execute
//https://github.com/apache/spark/blob/03e30063127fd71bef8a14553381e805fe5b6679/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala#L596
-> WholeStageCodegenExec.execute
[child: Filter]
-> child.execute
[condition Invoke]
-> Invoke.genCode
//https://github.com/apache/spark/blob/branch-2.4/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala#L345
-> doGenCode
Simplified code after generation phase:
final class GeneratedIteratorForCodegenStage1 extends BufferedRowIterator {
    private Object[] references;
    private scala.collection.Iterator input;
    private UnsafeRowWriter writer = new UnsafeRowWriter();
    public GeneratedIteratorForCodegenStage1(Object[] references) {
        this.references = references;
    }
    public void init(scala.collection.Iterator input) {
        this.input = input;
    }
    protected void processNext() throws IOException {
        while (input.hasNext() && !stopEarly()) {
            InternalRow row = (InternalRow) input.next();
            do {
                // Create the A object
                UTF8String value = row.getUTF8String(0);
                A a = new A(value.toString());
                // Filter by A's value: references[0] holds the filter function
                boolean result = (Boolean) ((scala.Function1) references[0]).apply(a);
                if (!result) continue;
                writer.write(0, value);
                append(writer.getRow());
            } while (false);
            if (shouldStop()) return;
        }
    }
}
We can see that the projection is constructed with an array of objects passed in the references variable. But where, and how many times, is the references variable instantiated?
It is created in WholeStageCodegenExec and instantiated only once per partition.
This leads us to the answer: although the filter function is created only once per partition rather than per data point, the Eq and A instances are created per data point.
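The same distinction can be reproduced outside Spark. The following is a plain Java analogy, not Spark code: one function object is created once and applied to every row, while whatever its body allocates is allocated per row. All names here (PerElementAllocationSketch, EQ_INSTANCES and so on) are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

class PerElementAllocationSketch {
    // Counts how many Eq instances have been constructed.
    static final AtomicInteger EQ_INSTANCES = new AtomicInteger();

    // Java counterpart of the Eq class from the question.
    static class Eq {
        Eq() { EQ_INSTANCES.incrementAndGet(); }
        boolean isMe(String s) { return "ME".equals(s); }
    }

    // Applies one predicate object to every row, like the generated loop does.
    static List<String> filter(List<String> rows, Predicate<String> func) {
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            if (func.test(row)) {
                out.add(row);
            }
        }
        return out;
    }

    // Eq is constructed inside the lambda body, so once per row.
    static int countWithEqInsideLambda(List<String> rows) {
        EQ_INSTANCES.set(0);
        Predicate<String> func = name -> new Eq().isMe(name); // function created once
        filter(rows, func);                                    // body runs per row
        return EQ_INSTANCES.get();
    }

    // Eq is constructed once and captured by the lambda.
    static int countWithSharedEq(List<String> rows) {
        EQ_INSTANCES.set(0);
        Eq shared = new Eq();
        Predicate<String> func = name -> shared.isMe(name);
        filter(rows, func);
        return EQ_INSTANCES.get();
    }
}
```

In Spark, hoisting Eq out of the lambda in this way means the closure captures it, so it would have to be serializable to be shipped to the executors.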
If you are curious about where it has been added to the code context:
It happens here
where javaType is scala.Function1
and value is the implementation, Main$$$Lambda$
OK guys, so I have a list of objects and I want to sort it using a comparison function I wrote.
The function:
bool funct(Student &s1, Student &s2)
{
    return s1.calculMedie() < s2.calculMedie();
}
This is my list:
list<Student*> list;
list.push_back(sx[0]);
list.push_back(sx[1]);
list.push_back(sx[2]);
sx comes from this declaration: Student **sx = new Student*[3];
I created 3 objects of class Student.
I want to sort them by calculMedie(), a function that returns their average grade.
double Student::calculMedie()
{
    int nr = 0;
    double s = 0;
    for (auto i : note)
    {
        nr++;
        s = s + i;
    }
    return s / nr;
}
That's how it looks.
When I tried list.sort(list.begin(), list.end(), funct), I got this error: "Invalid initialization of reference type 'Class&' from expression of type 'Class'"
It looks like you mixed up the std::sort algorithm with the list<T>::sort method. std::sort requires random-access iterators, which std::list does not provide, so a list can only be sorted with its own sort method.
There are two overloads of list::sort:
void sort();
template< class Compare >
void sort( Compare comp ); // [2]
If you want to sort with a comparator, write it as follows:
list<Student*> list;
list.sort (funct);
Because the list stores pointers to Student, you need to change the signature of funct: it must take pointers, not references:
bool funct(Student* s1, Student* s2)
{
    return s1->calculMedie() < s2->calculMedie();
}
It is good practice to pass s1 and s2 as pointers to const objects. If you change the parameters to const Student* s1, const Student* s2, you also need to make calculMedie a const member function.
I have simplified the following example from my code and hope there are no obvious compilation errors because of it. Let's say I have the following entities (not what I actually have; please assume I have no EF or schema issues, this is just an example):
public class Company
{
    public string GroupProperty { get; set; }
    public virtual ICollection<PricingForm> PricingForms { get; set; }
}
public class PricingForm
{
    public decimal Cost { get; set; }
}
And I want to query like so:
IQueryable DynamicGrouping<T>(IQueryable<T> query)
{
    Expression<Func<Company, decimal?>> exp = c => c.PricingForms.Sum(fr => fr.Cost);
    string selector = "new (it.Key as Key, @0(it) as Value)";
    IQueryable grouping = query.GroupBy("it.GroupProperty", "it").Select(selector, exp);
    return grouping;
}
I get the following error when calling the groupby/select line:
System.Linq.Dynamic.ParseException: 'Argument list incompatible with lambda expression'
What type is "it" when grouped? I have tried using other expressions that assume it is an IGrouping<string, Company> or an IQueryable<Company>, with the same error. I've also tried just selecting "Cost" and moving the Sum() aggregate into the selector string (i.e. Sum(@0(it)) as Value), and I always seem to get the same error.
I eventually tried something along the lines of:
Expression<Func<IEnumerable<Company>, decimal?>> exp = l => l.SelectMany(c => c.PricingForms).Sum(fr => fr.Cost);
With this one I get further, but when attempting to iterate through the results I get a different error:
The LINQ expression node type 'Invoke' is not supported in LINQ to Entities.
So, with this dynamic grouping and injecting my own select expression, what should I assume the datatype of 'it' is? Will this even work?
The type of it is IGrouping<TKey, TElement>, where TKey is dynamic, based on the keySelector result type, and TElement is the element type of the input IQueryable. Luckily, IGrouping<TKey, TElement> inherits from (is an) IEnumerable<TElement>, so as soon as you know the input element type, you can safely base the selector on IEnumerable<TElement>.
In other words, the last attempt based on Expression<Func<IEnumerable<Company>, decimal?>> is correct.
The new error you are getting occurs because @0(it) generates an Expression.Invoke call, which is not supported by EF. The easiest way to fix that is to use the LINQKit Expand method:
Expression<Func<IEnumerable<Company>, decimal?>> exp = l => l.SelectMany(c => c.PricingForms).Sum(fr => fr.Cost);
string selector = "new (it.Key as Key, @0(it) as Value)";
IQueryable grouping = query.GroupBy("it.GroupProperty", "it").Select(selector, exp);
// This fixes the EF invocation expression error
grouping = grouping.Provider.CreateQuery(grouping.Expression.Expand());
return grouping;
I have a predicate builder that combines a predicate with an inner predicate to build a dynamic filter based on conditions. Say I select one department and get the list of employees under it; once I have that list, I need to load the respective records for every employee who belongs to the selected department.
This was implemented long ago and works fine as long as the department does not have too many employees. Once it goes beyond 500 or 1000, the predicate builder causes a stack overflow. Please see my code snippet below; I am using .NET Framework 4.5.2.
The stack overflow exception occurs when assigning to the inner predicate inside the loop, which iterates over the employee records, once there are more than 500 or 1000 of them.
Expression<Func<EmployeeTable, bool>> predicate = PredicateBuilder.True<EmployeeTable>();
var innerPredicate = PredicateBuilder.False<EmployeeTable>();
case FilterBy.EmployeeName:
    if (!isEmpNameFilterExists)
    {
        foreach (string empName in item.FieldCollection)
        {
            innerPredicate = innerPredicate.Or(x => x.Name.Equals(empName,
                StringComparison.OrdinalIgnoreCase));
        }
        predicate = predicate.And(innerPredicate.Expand());
    }
    break;
This might happen because of the small (though usually sufficient) stack for .NET applications (https://stackoverflow.com/a/823729/2298807). Predicates are usually evaluated as part of lambda functions, and those use the stack. I do not know the predicate library in detail, but I assume it uses recursive functions.
Anyhow: I would suggest to use Contains by building a List<string> containing the names:
Expression<Func<EmployeeTable, bool>> predicate =
    PredicateBuilder.True<EmployeeTable>();
var innerPredicate = PredicateBuilder.False<EmployeeTable>();
case FilterBy.EmployeeName:
    if (!isEmpNameFilterExists)
    {
        List<string> namesList = new List<string>();
        foreach (string empName in item.FieldCollection)
        {
            namesList.Add(empName);
        }
        predicate = predicate.And(x => namesList.Contains(x.Name));
    }
    break;
Note: Please check the syntax as I do not have a VS environment available at the moment.
I ended up writing my own expression builder, which turned out to be a much better way to generate the predicate.
PredicateBuilder works well with LINQ to Objects. With Entity Framework it has an issue, because it generates the lambda methods with the full namespace of the models and keeps adding to them with each additional search criterion. It seems to me that it is limited when a large number of filters is used with Entity Framework. In my case I was passing 728 values for just one field of the model, and it was failing with stack overflow exceptions.
728 lambda methods would be added to the stack, each with its fully qualified namespace.
The custom expression works fine in my case. The source code is below.
var entityType = typeof(Emptable);
var parameter = Expression.Parameter(entityType, "a");
// string.Equals(string) is the method used for the comparison
var equalsMethod = typeof(string).GetMethod("Equals", new[] { typeof(string) });
// Assumed initial value: the original snippet does not show how body is declared.
// Starting from "false" lets each name be OR-ed onto the body.
Expression body = Expression.Constant(false);
// Switch statement for the EmployeeName filter.
case FilterBy.EmployeeName:
    if (!isEmpNameFilterExists)
    {
        var propertyExpression = Expression.Property(parameter, "EmployeeName");
        foreach (string empName in item.FieldCollection)
        {
            var innerExpression = Expression.Call(propertyExpression, equalsMethod, Expression.Constant(empName));
            body = Expression.OrElse(body, innerExpression);
        }
    }
    break;