Find function in HIVE - find

I want to check if a field contains a string.
I want a function that would look like this:
FIND("string_to_find",field_to_search)
My data looks like this:
field_to_search
---------------
"no match in this string"
"record 2 has no matches"
"ahh finally xxxstring_to_findxxx is here"
I am looking for a function that reports whether the specified string is contained and, if so, at what position it starts.
return
------
-1
-1
15

The built-in locate function does almost exactly what you need, except that for your input it would return
return
------
0
0
16
because it indexes from 1 and returns 0 when there is no match. So all you need to do is:
Select locate("string_to_find","ahh finally xxxstring_to_findxxx is here") -1; --returns 15
Select locate("string_to_find","foo") -1; --returns -1
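To make the off-by-one explicit, here is a minimal Python sketch of the two conventions. The locate helper below is a hypothetical stand-in that only mimics the 1-based, 0-for-no-match behaviour described above; it is not Hive code.

```python
def locate(substr: str, s: str) -> int:
    """Hypothetical model of a 1-based locate: 0 means 'not found'."""
    # str.find is 0-based and returns -1 when there is no match,
    # so adding 1 yields the 1-based / 0-for-no-match convention.
    return s.find(substr) + 1


def find(substr: str, s: str) -> int:
    """The position the question asks for: 0-based, -1 when absent."""
    return locate(substr, s) - 1


rows = [
    "no match in this string",
    "record 2 has no matches",
    "ahh finally xxxstring_to_findxxx is here",
]
positions = [find("string_to_find", r) for r in rows]
print(positions)
```

Subtracting 1 maps both cases at once: a hit moves from 1-based to 0-based, and the "not found" value moves from 0 to -1.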

This can be achieved by using instr and if in Hive. Note that instr, like locate, is 1-based, so subtract 1 to get the position asked for:
select if(instr(line,"string_to_find")==0,-1,instr(line,"string_to_find")-1) as position from find_tbl;
where line is your column name.

Related

Returning values based on delimited string entries

In TSQL, the string in the database record is 'A/A/A' or 'A/B/A' (examples). I want to parse the string and for the first instance return '1'; in the 2nd instance, return '2'. That is, if all the values between the separators are the same, return a value; otherwise return another value. What is the best way to do this?
A bit blind answer:
Read the whole value in a variable. Read the first value part in another:
declare @entire nvarchar(max), @single nvarchar(max)
select/set @entire=....
set @single=left(@entire,charindex('/',@entire)-1)
Compare @entire with @single replicated after removing slashes:
set @entire=replace(@entire,'/','')
select case when replicate(@single,len(@entire)/len(@single))=@entire
then 1 else 0 end as [What you want]
Something like this should work:
SELECT
x.*,
CASE
WHEN N > 1 THEN 0
ELSE 1
END Result
FROM (
SELECT
t.Column1,
t.Column2,
t.Column3,
t.SomeColumn,
COUNT(DISTINCT s.value) N
FROM dbo.YourTable t
OUTER APPLY STRING_SPLIT(t.SomeColumn,'/') s
GROUP BY
t.Column1,
t.Column2,
t.Column3,
t.SomeColumn
) x
;
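The idea behind the COUNT(DISTINCT ...) query above, counting the distinct parts after splitting on '/', can be sketched in Python. This is only an illustration of the logic, not T-SQL; the function name all_parts_equal is made up for the example, and it returns 1/0 like the query rather than the 1/2 mentioned in the question.

```python
def all_parts_equal(value: str, sep: str = "/") -> int:
    """Return 1 if every delimited part is identical, else 0."""
    parts = value.split(sep)
    # A single distinct part means all entries between separators match.
    return 1 if len(set(parts)) == 1 else 0


results = {v: all_parts_equal(v) for v in ["A/A/A", "A/B/A"]}
print(results)
```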
Based on your simple example (no edge cases accounted for) the following should work for you:
select string, iif(replace(s,v,'')='',1,0) as Result
from t
cross apply (
values(left(string,charindex('/', string)-1),(replace(string,'/','')))
)s(v,s);

REDSHIFT if value in list

I am trying to set some variables at the top of my query via CTEs to make a long query easier to maintain.
I have extracted an example of what I am trying to achieve. I am not managing to make 'tags' be perceived as a list rather than a whole string. I have tried split_part but have not managed to get what I require.
WITH tmp AS (
SELECT
'tag1, tag2, tag3' as tags
)
select
CASE WHEN 'tag1' in (select tags from tmp) THEN 1 ELSE 0 END matched_tags
Basically, what I need is to take the string 'tag1' and see if it exists in the list 'tag1', 'tag2' or 'tag3'. This should give me 1, as there is a match.
This obviously does not work because 'tag1, tag2, tag3' is treated as one string, so there is no match.
Can anyone help me with this?
The STRPOS() function should do what you want. https://docs.aws.amazon.com/redshift/latest/dg/r_STRPOS.html
Something like this:
WITH tmp AS (
SELECT
'tag1, tag2, tag3' as tags
)
SELECT
CASE WHEN STRPOS(tags, 'tag1') > 0 THEN 1 ELSE 0 END as matched_tags
FROM tmp;
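One caveat worth keeping in mind: STRPOS tests for a substring, so a search value such as 'tag' would also count as a match against 'tag1, tag2, tag3'. The Python sketch below only illustrates that difference between a substring test and a true list-membership test; the helper names are hypothetical and this is not Redshift code.

```python
tags = "tag1, tag2, tag3"


def substring_match(needle: str, haystack: str) -> int:
    """Mimics a 'position > 0' check: 1 if needle occurs anywhere."""
    return 1 if needle in haystack else 0


def list_match(needle: str, haystack: str, sep: str = ",") -> int:
    """Splits on the separator, trims spaces, then tests exact membership."""
    items = [item.strip() for item in haystack.split(sep)]
    return 1 if needle in items else 0


print(substring_match("tag1", tags), list_match("tag1", tags))
print(substring_match("tag", tags), list_match("tag", tags))
```

If the tag names can be prefixes of one another, an exact comparison against the split items is the safer of the two checks.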

string query in a function in kdb

func:{[query] value query};
query is a parameter of my function. I have added some steps such as delete xxx, yyyy from (value query) and some further manipulation. I am not sure why the function doesn't work when I don't use value on query; it says it cannot find the table. So I have to use value query in the function, with query as a parameter. I need to pass "select from tab" to the function.
My question is: how do I pass the query if the filter is a string too?
func["select from tab where a="abc""] <<< this does not work
How can I make string inside a string work?
Also, I am not sure why
func["select from tab where date = max date"] does not work, failing with a length error,
but func["100#select from tab where date = max date"] works?
The whole function is
getTable:{[query]loadHDB[];.Q.view date where date < .z.D-30;tab:(delete xxxx,yyyyy,sub,ID,subID,tID,subTID,text,gID from((value query)));remove[];update {";"sv #[s;where (s:";"vs x) like "cId=*";:;enlist""]}each eData from (update {";"sv #[s;where (s:";"vs x) like "AId=*";:;enlist""]}each eData from tab)};
remove:{[]delete tab from `.};
loadHDB:{[]value "\\l /hdb"};
You can escape the quotes using backslash http://code.kx.com/wiki/Reference/BackSlash#escape
func["select from tab where a like \"abc\""]
Edit:
If tab is a HDB table then this length error could point to a column length issue (which 100# is avoiding). What does the following return?
q)checkPartition:{[dt] a!{c!{count get x} each ` sv' x,/:c:({x where not x like "*#"} key[x])except `.d}each a:(` sv' d,/:key[d:hsym `$string dt])};
q)check:checkPartition last date
q)(where{1<count distinct value x}each check)#check
I like using -3! to build the quoted string and -1 to print the result. If you know what your query should look like when executed from the console, then after you construct your string, use -1 to print it. It should print the query as it would be executed from the console.
q)stst:-3!
q)"select max age by user from tab where col1 like ",stst"Hello"
"select max age by user from tab where col1 like \"Hello\""
q)/then to view how it will be executed, use -1
q)-1"select max age by user from tab where col1 like ",stst"Hello";
select max age by user from tab where col1 like "Hello"
q)/looks good

Trying to get rid of unwanted records in query

I have the following query
Select * from Common.dbo.Zip4Lookup where
zipcode='76033' and
StreetName='PO BOX' and
'704' between AddressLow and AddressHigh and
(OddEven='B' or OddEven = 'E')
The AddressLow and AddressHigh columns are varchar(10) fields.
The records returned are
AddressLow AddressHigh
------------ ------------
1 79
701 711
The second is the desired record. How do I get rid of the first record?
The problem is that SQL is using a string compare instead of a numeric compare. This is because AddressLow/High are varchar and not int.
As long as AddressLow/High contain numbers, this should work:
Select * from Common.dbo.Zip4Lookup where
zipcode='76033' and
StreetName='PO BOX' and
704 between
CAST(AddressLow as INT) and
CAST(AddressHigh as INT) and
(OddEven='B' or OddEven = 'E')
The problem is that your condition matches the first record because the values are compared as strings: '704' starts with 7, so it sorts between '1' and '79'. The easiest way is, IMHO, to change the data type to a numeric one.
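The difference between the two comparisons can be reproduced in a few lines of Python. This is a sketch of the comparison logic only, using the two rows from the question; Python compares strings character by character, which for plain digit strings gives the same ordering as described above.

```python
rows = [("1", "79"), ("701", "711")]  # (AddressLow, AddressHigh) as strings
target = "704"

# String comparison: character by character, so "704" falls between "1" and "79".
string_hits = [(lo, hi) for lo, hi in rows if lo <= target <= hi]

# Numeric comparison: cast first, as the CAST(... as INT) answer does.
numeric_hits = [(lo, hi) for lo, hi in rows if int(lo) <= int(target) <= int(hi)]

print(string_hits)
print(numeric_hits)
```

The string version returns both rows, while the numeric version returns only the 701 to 711 row.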

PostgreSQL, Format Float

I am trying to format a numeric field:
select to_char(12315.83453, 'FM999999999999D9999')
In this case all is OK. The result is 12315.8345
But if the value is between 0 and 1
select to_char(0.83453, 'FM999999999999D9999')
the result is .8345, without the leading 0 (zero), but I need 0.8345.
What format should I pass to the to_char function to get the result I need?
SELECT to_char(0.83453, 'FM999999999990D9999');
I just changed the 9 immediately before the D in the format to 0, which forces that digit position to be printed even when it is zero.