Spark optimized coding [closed] - pyspark

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 1 year ago.
I am a newbie in PySpark. Consider the following two ways of writing the code:
1st way:
s_df = s_df.withColumn('sum', s_df['Col1'] + s_df['Col2'])
s_df = s_df.withColumn('difference', s_df['Col1'] - s_df['Col2'])
2nd way:
s_df = (s_df.withColumn('sum', s_df['Col1'] + s_df['Col2'])
            .withColumn('difference', s_df['Col1'] - s_df['Col2']))
It is always advisable to use the second one; this has something to do with how Spark works internally. Can anyone please give me a detailed reason for this?

There is no difference between those two "ways" as you describe them. As @mck points out, s_df.explain() will be the same in both cases.
I don't think there is an official or "advisable" way to write the code, as Spark doesn't provide any style guidelines in its documentation. However, I find it easier to write it this way (more readable and maintainable):
s_df = (s_df
.withColumn('sum', s_df['Col1'] + s_df['Col2'])
.withColumn('difference', s_df['Col1'] - s_df['Col2'])
)
Also, it's worth mentioning that although it's totally legitimate to overwrite s_df, you will lose your original dataframe, which you will probably need later.

Using is_numeric to replace a string [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 12 months ago.
In the course of converting some variables from an API, I need to check whether the API is returning a string, such as " N/A", rather than a number such as "824". This is the code that I'm attempting to use: if the variable from the API is a number, leave it alone; otherwise, change it to 0 (zero).
$weather["barometer_min"] = (is_numeric($weewxapi[36]) ? number_format($weewxapi[36],0) : "0");
It does not appear to be working, but it is not throwing any errors either. Can anyone guide me in the right direction?
As shown by Syscall, using 3v4l.org, the code works.

how to modify rules in perlcritic [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
Hi,
This is about Perl::Critic (the source code analyser).
I am new to it. I have studied the documentation and know how to run it, but that's not enough.
I want to modify the rules (i.e. include or exclude rules, or add my own).
I know the .perlcriticrc file can do that,
but I don't know how to do it.
Thank you
According to the documentation for Perl::Critic, you can add a "policy" with the add_policy( -policy => $policy_name, -params => \%param_hash ) method:
-policy is the name of a Perl::Critic::Policy subclass module. The 'Perl::Critic::Policy' portion of the name can be omitted for brevity. This argument is required.
Then, when you look at the linked documentation for the subclass module (emphasis mine):
To work with the Perl::Critic engine, your implementation must behave as described below. For a detailed explanation on how to make new Policy modules, please see the Perl::Critic::DEVELOPER document included in this distribution.
... we see there's a whole document covering exactly what you want to do.
What part of that document are you having trouble with?

Is there a Coffeescript equivalent for Dart [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
The main aspect of CoffeeScript I'd like to see available for Dart, in the form of a different, Dart-based language, is less verbosity, fewer brackets, and less Java-style syntax.
Does such a solution exist?
No.
If you don't want to have your field static you can omit the static keyword.
If you don't want to have your field final you can write var or a concrete type instead of the final keyword.
And if you don't want a loop you can omit for, while, forEach, ...

Is the best Scala convention to invoke collection.size or collection.length [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I understand these two methods are identical (one is defined in terms of the other) according to this previous question:
Scala Buffer: Size or Length?
But is there a reigning best practice or recommended convention? I can think of three options:
(1) Always use size
(2) Always use length
(3) Use size for all collections except Array
I'm leaning towards (1) or (3). The rationale behind (3) is that these methods are inherited from Java. And in Java you'd be invoking collection.size() and array.length. The argument for (1) is that it builds on and simplifies (3). The argument for (2) I'm not really sure about.
They are the same. It makes no difference. Use whatever you want.

How to list all called methods? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
Is there any way to list all called methods in the order in which they are called? For example, right now I do this by putting NSLog(@"MethodName"); in every method.
I want to do this automatically with NSLog. Is it possible?
If you don't have too many methods you can use
NSLog(@"%@", NSStringFromSelector(_cmd));
to log their names. This way you don't have to copy the signatures manually each time.
Create a property NSMutableArray *calledMethods;
And in each of your methods use
[self.calledMethods addObject:NSStringFromSelector(_cmd)];
And whenever you want to print it, NSLog it.
I think you want:
printf("%s\n", __PRETTY_FUNCTION__ ) ;
Which produces (for example)
-[AppDelegate application:didFinishLaunchingWithOptions:]
Or, you can use dtrace. This answer should help: https://stackoverflow.com/a/3874726/210171.
Also check https://stackoverflow.com/a/4604249/210171 (same linked question). Seems there's an environment variable NSObjCMessageLoggingEnabled you can set...