How to give an alias name with a space in sparksql [closed] - scala

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 5 years ago.
I have tried the code below.
Trial 1:
val df2=sqlContext.sql("select concat(' ',Id,LabelName) as 'first last' from p1 order by LabelName desc ");
Trial 2:
val df2=sqlContext.sql("select concat(' ',Id,LabelName) from p1 order by LabelName desc ");
val df3=df2.toDF("first last")
Trial 1 throws an error when I run it. Trial 2 accepts the command, but throws an error when I perform the action below:
scala> df3.write.parquet("/prashanth/a1")

When a column name or alias in a SQL statement contains special characters such as a space, you can wrap it in backticks, for example `first last`.
You cannot use a space in a Parquet column name. You can either rename the column or use another file format, such as CSV.
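The two points above can be sketched as small helpers, written here in Python for illustration only. The function names are hypothetical, and the set of rejected characters (space, comma, semicolon, braces, parentheses, newline, tab, equals sign) is an assumption based on the error message older Spark versions print when writing Parquet; check it against your Spark version.

```python
import re

# Assumption: characters that Spark's Parquet writer has rejected in column names.
_INVALID_PARQUET_CHARS = re.compile(r"[ ,;{}()\n\t=]")


def quote_identifier(name: str) -> str:
    """Wrap a column name or alias in backticks for use in a Spark SQL string.

    A backtick inside the name is escaped by doubling it.
    """
    return "`" + name.replace("`", "``") + "`"


def parquet_safe(name: str, replacement: str = "_") -> str:
    """Replace characters that are assumed to be invalid in Parquet column names."""
    return _INVALID_PARQUET_CHARS.sub(replacement, name)


# Build the query text with a backtick-quoted alias (hypothetical usage).
query = (
    "select concat(' ', Id, LabelName) as "
    + quote_identifier("first last")
    + " from p1 order by LabelName desc"
)
```

The quoted alias is useful for display or CSV output; before writing Parquet, renaming the column to something like `parquet_safe("first last")` avoids the space.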

Related

Perl script to remove new line character and move next line data to previous line [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 1 year ago.
I have input like below
"ID"|"Desc"
"100"|"
The data present in Desc column has new line characters.
So the data came to second line.
Some records of data went to third line. But I need all data to be present in first line."
"101"|"This record desc is correct data which has present in single line. So I need data to present in single line."
I need output like below,
"ID"|"Desc"
"100"|"The data present in Desc column has new line characters.So the data came to second line.Some records of data went to third line. But I need all data to be present in first line."
"101"|"This record desc is correct data which has present in single line. So I need data to present in single line."
Can someone please help with a Perl script that achieves the above requirement?
Use Text::CSV_XS to process the file as it can parse it correctly.
perl -MText::CSV_XS=csv -wE 'csv( in => shift,
always_quote => 1,
sep_char => "|",
eol => "\n",
on_in => sub { $_[1][1] =~ s/\n//g } );
' -- file.csv > newfile.csv
I'm testing this in a Linux shell; you might need a different eol if you're on MS Windows. Also, I don't know what rules PowerShell uses for quoting, so you might need to use a different type of quotes.
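For comparison, the same idea (let a real CSV parser handle the quoted multi-line fields, then strip the newlines per field) can be sketched with Python's standard csv module. This is only an illustration under the assumption that the input is pipe-separated with double-quoted fields, as in the question; the function name is hypothetical.

```python
import csv
import io


def flatten_newlines(text: str) -> str:
    """Remove line breaks embedded in quoted fields of pipe-separated data."""
    # newline="" keeps embedded line breaks intact so the parser sees them.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter="|", quotechar='"')
    out = io.StringIO(newline="")
    writer = csv.writer(
        out,
        delimiter="|",
        quotechar='"',
        quoting=csv.QUOTE_ALL,  # quote every field, like always_quote => 1
        lineterminator="\n",
    )
    for row in reader:
        writer.writerow([field.replace("\r", "").replace("\n", "") for field in row])
    return out.getvalue()


sample = '"ID"|"Desc"\n"100"|"\nfirst line.\nsecond line."\n"101"|"single line."\n'
result = flatten_newlines(sample)
```

When reading from a real file, open it with `newline=""` as well, so that the csv module rather than the file object decides how line breaks are handled.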

val a = month(start_date),year(to-date) [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 6 years ago.
I have a string like the input below, and I want to get a string like the output below. Can anyone please help me?
example 1
val input = "month(start_date),year(to_date),month(to_date)"
output = "start_date,to_date"
example 2
input = "abc(start),xyz(end)"
output = "start,end"
You need a regex to get the values inside the parentheses:
val input = "month(start_date),year(to_date),month(to_date)"
val regex = "(?<=\\()[^)]+(?=\\))".r
val output = regex.findAllIn(input).toSet.mkString(",")
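For an explanation of the regex, see "How do I match the contents of parenthesis in a scala regular expression".
toSet removes the duplicates, and mkString joins the set with commas. Note that a Set does not guarantee iteration order in general; if the order of the names matters, use .toList.distinct instead of .toSet.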
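The same lookbehind/lookahead pattern carries over to other regex engines. As a rough sketch in Python (the function name is hypothetical), with duplicates removed in first-seen order:

```python
import re

# Text after "(" and before ")", without the parentheses themselves.
INSIDE_PARENS = re.compile(r"(?<=\()[^)]+(?=\))")


def inner_names(expr: str) -> str:
    """Return the distinct parenthesised values of expr, joined by commas."""
    matches = INSIDE_PARENS.findall(expr)
    # dict.fromkeys drops duplicates while keeping the first-seen order.
    return ",".join(dict.fromkeys(matches))


example_1 = inner_names("month(start_date),year(to_date),month(to_date)")
example_2 = inner_names("abc(start),xyz(end)")
```

The pattern does not handle nested parentheses; for the inputs shown in the question that is not needed.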

Rename file from xx_02.csv to xx.csv [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 7 years ago.
I have a folder 'a' with about 200 files with names like xx_out_02.csv, and I want to rename them to xx_out.csv, maybe using MATLAB or by running some script. I tried it in cmd, but I had to run the command for each and every file.
Can someone help me here?
Best Regards
Dilip
You can use the movefile function from MATLAB.
Here is an example:
clc
addpath('yourdir')
csvf = dir('yourdir/*.csv');   % list all CSV files in the folder
numberOfcsv = numel(csvf);
for ii = 1:numberOfcsv
    file = csvf(ii).name;
    % Renames to x001_out.csv, x002_out.csv, ...; adapt the target pattern to your naming scheme
    movefile(sprintf('yourdir/%s', file), sprintf('yourdir/x%03d_out.csv', ii), 'f');
end
Your question is unclear. I'm assuming the following:
You want to strip off substrings of the form _ followed by one or more digits right before .csv.
The resulting target names are all different. For example, you have files such as xx_out_02.csv and yy_out_01.csv, but not xx_out_02.csv and xx_out_01.csv.
You are on Windows. For other systems, replace the system line below with the appropriate system command, or better, use movefile as in SamuelNLP's answer.
Code:
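files = dir('*.csv');
names = {files.name};
for n = 1:numel(names)
    name = names{n};
    % Strip "_<digits>" only when it sits right before ".csv"
    name_new = regexprep(name, '_\d+(?=\.csv$)', '');
    if ~strcmp(name, name_new)   % skip files that have no numeric suffix
        system(['ren "' name '" "' name_new '"']);   % Windows command to rename the file; quotes handle spaces
    end
end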
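If running a script outside MATLAB is an option, the same regular expression can be applied in a platform-independent way. The following is a minimal Python sketch under the same assumptions as above (the suffix is an underscore plus digits right before .csv, and the target names do not collide); the function names are hypothetical.

```python
import re
from pathlib import Path

# "_" followed by digits, only when it sits right before the ".csv" extension.
NUMERIC_SUFFIX = re.compile(r"_\d+(?=\.csv$)")


def strip_numeric_suffix(name: str) -> str:
    """Turn 'xx_out_02.csv' into 'xx_out.csv'; leave other names unchanged."""
    return NUMERIC_SUFFIX.sub("", name)


def rename_all(folder: str) -> list:
    """Rename every matching CSV file in folder and return (old, new) name pairs."""
    renamed = []
    for path in sorted(Path(folder).glob("*.csv")):
        new_name = strip_numeric_suffix(path.name)
        target = path.with_name(new_name)
        # Skip files without a suffix and never overwrite an existing file.
        if new_name == path.name or target.exists():
            continue
        path.rename(target)
        renamed.append((path.name, new_name))
    return renamed
```

Calling `rename_all("a")` would process the folder 'a' from the question; trying it on a copy of the folder first is a sensible precaution.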

Extract text, Matlab [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 8 years ago.
I am trying to find a way to extract text in a specific and efficient way
as in this example:
'Hello Mr. Jack Andrew , your number is 894Gfsf , and your Bank ID # 734234'
I want a way to get the Name, the Number and the Bank ID Number.
I want to write software that deals with different text files and get those required values. I may not know the exact order but it must be a template like a bank statement or something.
Thanks!
It's a bit hard to understand what exactly the problem is. If all you need to do is split strings, here's a possible way to do it:
str = 'Hello Mr. Jack Andrew , your number is 894Gfsf , and your Bank ID # 734234';
tokenized = strsplit(str,' ');
Name = strjoin([tokenized(3:4)],' ');
Number = tokenized{9};
Account = tokenized{end};
Alternatively, for splitting you could use regexp(...,'split') or regexp(...,'tokens');
I think you want regular expressions for this. Here's an example:
str = 'Hello Mr. Jack Andrew , your number is 894Gfsf , and your Bank ID # 734234';
matches=regexp(str, 'your number is (\w+).*Bank ID # (\d+)', 'tokens');
matches{1}
ans =
'894Gfsf' '734234'
My suggestion would be to make a whole array of strings with sample patterns that you want to match, then build a set of regular expressions that collectively match all of your samples. Try each regexp in sequence until you find one that matches.
To do this, you will need to learn about regular expressions.
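The "try each regexp in sequence" suggestion can be sketched as follows, in Python for illustration. The first pattern follows the sample sentence from the question; the second template is purely hypothetical and only shows how further patterns would be added.

```python
import re

# One pattern per known template; the first one that matches wins.
PATTERNS = [
    # Template taken from the sample sentence in the question.
    re.compile(
        r"Hello Mr\. (?P<name>[A-Za-z ]+?) , your number is (?P<number>\w+)"
        r" , and your Bank ID # (?P<bank_id>\d+)"
    ),
    # Hypothetical second template with the fields in a different order.
    re.compile(r"Bank ID # (?P<bank_id>\d+).*?number is (?P<number>\w+)"),
]


def extract_fields(text: str):
    """Return a dict of the named fields from the first matching pattern, else None."""
    for pattern in PATTERNS:
        match = pattern.search(text)
        if match:
            return match.groupdict()
    return None


sample = "Hello Mr. Jack Andrew , your number is 894Gfsf , and your Bank ID # 734234"
fields = extract_fields(sample)
```

Named groups keep the result readable when different templates list the fields in different orders.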

How to change the name of the output in matlab function to other than ans [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 9 years ago.
I am new to MATLAB and the answer may be very simple. I have a script that runs a function to return an answer matrix, ans. How can I get MATLAB to return a matrix named J instead?
Here is how I call the function: (myfunction(a,b));
I get the following error if I try to call 'myfunction = J'.
Error using myfunction (line 10)
Not enough input arguments.
Error in myfunction (line 25)
J = myfunction
If I remove the line myfunction = J, I no longer get the error in line 10.
Thanks
The problem was that I was trying to name the output inside the function, whereas I should have defined it in my script.
So instead of calling the function as:
(myfunction(a,b));
it should be:
J=(myfunction(a,b));
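It depends on how you call it. Note that only a function can return a value; a MATLAB script cannot, so myScript below must be defined as a function.
If you do:
> myScript;
in the command window, the result will be stored in the variable ans.
If you do:
> J = myScript;
the result will be stored in J.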
Whatever function you have, it should have at least one output in your case (because you say that you expect something back). For instance, if your function returns one variable, call it as
[T]=myfunction(a,b);
In this case T is the name of your output instead of "ans". You need to write your function in a separate .m file saved under the same name as the function, so in this case myfunction.m. It has to be in the same folder as your main code, or in another folder on the MATLAB path.
See the link below
http://www.mathworks.com/help/matlab/ref/function.html