T-SQL Parse text in select statement - tsql

I am currently working on a project to import data from one table to another. I am trying to parse a field that contains a FULLNAME into its parts LAST, FIRST, MI. The names are all in the format "LAST,FIRST MI". I have written a stored procedure that correctly parses and returns the results as necessary, but I am unsure how to incorporate the stored procedure into a single select statement. For instance, currently I have:
SELECT FULLNAME From UserInfo
and what I would like to have is something like this:
SELECT Last, First, MI from UserInfo
Currently my stored procedure takes the form of ParseName(FULLNAME, Last as OUTPUT, First as OUTPUT, MI as OUTPUT). How can I call this procedure and have the output variables split into 3 different columns?

Replace your stored procedure with a table-valued function. You can then apply this function to all the rows.
Below is an example; just put in your own logic for parsing the name:
create FUNCTION dbo.f_parseName(@inFullName varchar(255))
RETURNS
@tbl TABLE (lastName varchar(255), firstName varchar(255), middleName varchar(255))
as
BEGIN
-- put your logic here; SUBSTRING is 1-based, so the first slice starts at 1
insert into @tbl(lastName,firstName,middleName)
select substring(@inFullName,1,10),substring(@inFullName,11,10), substring(@inFullName,21,10)
return
end
Then apply the function:
-- sample data
declare @fullNames table (fullName varchar(255))
insert into @fullNames (fullName) values
('111111111122222222223333333333')
,('AAAAAAAAAABBBBBBBBBBCCCCCCCCCC')
select
fn.fullName
,pn.lastName
,pn.firstName
,pn.middleName
from
@fullNames fn
cross apply dbo.f_parseName(fn.fullName) pn
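The example above leaves the actual parsing logic as a placeholder. As a rough sketch of that logic, written in Python rather than T-SQL, the split for the "LAST,FIRST MI" format could look like the following. The function name `parse_name` is hypothetical, and the rule that a trailing single character (optionally followed by a period) is the middle initial is an assumption, not something stated in the question.

```python
def parse_name(full_name):
    """Split 'LAST,FIRST MI' into (last, first, mi); mi is '' when absent.

    Assumption: the middle initial, if present, is a single trailing
    character (optionally followed by a period) after the last space.
    """
    # Everything before the first comma is the last name.
    last, _, rest = full_name.partition(",")
    rest = rest.strip()
    # Split the remainder at its last space.
    head, sep, tail = rest.rpartition(" ")
    if sep and len(tail.rstrip(".")) == 1:
        first, mi = head.strip(), tail
    else:
        # No space, or the last token is too long to be an initial.
        first, mi = rest, ""
    return last.strip(), first, mi


print(parse_name("Smith,John A"))
print(parse_name("Doe,Jane"))
```

The same steps (find the comma, find the last space, slice between them) map onto CHARINDEX, REVERSE and SUBSTRING inside the function body.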

You could put the results of your stored procedure in a (temporary) table, like this (I added the FULLNAME column to provide a join condition, you would have to adapt your stored procedure to do that):
CREATE TABLE #temp (
FULLNAME NVARCHAR(..)
,Last NVARCHAR(..)
,First NVARCHAR(..)
,MI NVARCHAR(..)
);
INSERT INTO #temp (FULLNAME, Last, First, MI)
EXECUTE MySproc;
If you want to be able to execute SELECT Last, First, MI from UserInfo structurally, you'd have to first add three columns to UserInfo for your name information, and then insert the parsed data that you got from your stored procedure.
EDIT
You mention that you use a SELECT ... INTO ... to put the data in a new table. I'm guessing that the new table does not have the FULLNAME column, and then you would be better off using a table valued function (as this answer suggests). If you keep the FULLNAME column however, you can use that to join the temp table with your new table to update the new table as follows:
UPDATE NUI
SET NUI.Last = T.Last, NUI.First = T.First, NUI.MI = T.MI
FROM NewUserInfo AS NUI
INNER JOIN #temp AS T ON NUI.FULLNAME = T.FULLNAME;
You could use this UPDATE method also with another join condition if you do not have the FULLNAME column in your new table, but make sure you run a good test beforehand to check if the join holds.
Hope this helps, good luck!

You could add computed columns to the table like this:
alter table UserInfo
add firstName as SUBSTRING(fullName, CHARINDEX(',',fullName,0)+1, LEN(fullName)-CHARINDEX(',',fullName,0)-CHARINDEX(' ', REVERSE(fullName),0))
,lastName as SUBSTRING(fullName, 0, CHARINDEX(',',fullName,0))
,middleInitital as REVERSE(SUBSTRING(REVERSE(fullName),0,CHARINDEX(' ', REVERSE(fullName),0)))
But the best solution would be to do it the other way around: normalize the data with real columns for firstName, lastName and middleInitial, and make fullName a computed column.
The expressions in the code above may need a little more work, as I am sure they can be written more efficiently. I only made them work to show the idea.
After creating the computed columns you may do this:
select firstName
,lastName
,middleInitital
from UserInfo

Related

Creating a table function in PostgreSQL

I'm creating a table function in pgAdmin that returns a table with multiple variables from multiple tables. For simplicity, I will only show the function with one variable since that will answer my question just as well.
I have a patients relation with the following attributes: patient_id, address_id, name, gender, dob. For simplicity, say I want to create a table function that takes a patient's ID and returns a table with their name.
CREATE OR REPLACE FUNCTION patientname (patient_id char(8))
RETURNS TABLE (
patient_name varchar(250)) AS
$$
BEGIN
RETURN QUERY
SELECT patient_name
FROM patients
WHERE patient_id = patientname.patient_id;
END
$$
In the examples that I have seen, the variables defined in the RETURNS TABLE() section have the same names as the variables in the tables that the information is pulled from in the RETURN QUERY section. How would I create this function with variable names in the RETURNS TABLE section that differ from the variable names in the original tables that I will be pulling data from? In the example above, the table output from the function should return a variable called patient_name, but this variable in the patients table is only called name.
You can choose any result column name you want, and it doesn't have to be the same as the alias of the corresponding SELECT list entry of the query. The first column of the query result set will become the first column of the function result set, and so on.

How does SELECT INTO work in SAS

I'm new to SAS and I am trying to port my code from Access VBA to SAS.
In Access I often use the SELECT INTO statement, but it seems to me that it does not exist in SAS.
I have two tables. Each day I get new data, and I want to update my table with the new lines. I need to check whether any new lines have appeared and, if so, insert them into the old table.
I tried some code from Stack Overflow and other results from Google, but I didn't find anything that works.
INSERT INTO OLD_TABLE T
VALUES (GRVID = VTGONR)
FROM NEW_TABLE V
WHERE not exists (SELECT V.VTGONR FROM NEW_TABLE V WHERE T.GRVID = V.VTGONR);
Not sure what the purpose of using the VALUES keyword is in your example. PROC SQL uses VALUES() to list static values. Like:
VALUES (100)
SAS just uses normal SQL syntax instead. See for example: https://www.techonthenet.com/sql/insert.php
To specify the observations to insert just use SELECT. You can add a WHERE clause as part of the select to limit the rows that you select to insert. To tell INSERT which columns to insert into list them inside () after the table name. Otherwise it will expect the order that the columns are listed in the select statement to match the order of the columns in the target table.
insert into old_table(GRVID)
select VTGONR from new_table
where VTGONR not in (select GRVID from old_table)
;
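To see the effect of this INSERT ... SELECT pattern on a small data set, here is a minimal sketch that runs the same statement shape through Python's built-in sqlite3 module. SQLite is only a stand-in here, not SAS PROC SQL, and the sample values are made up for illustration.

```python
import sqlite3

# In-memory database with the two tables from the question (sample data is hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE old_table (GRVID TEXT)")
conn.execute("CREATE TABLE new_table (VTGONR TEXT)")
conn.executemany("INSERT INTO old_table VALUES (?)", [("A1",), ("A2",)])
conn.executemany("INSERT INTO new_table VALUES (?)", [("A2",), ("A3",), ("A4",)])

# Insert only the keys from new_table that old_table does not contain yet.
conn.execute(
    """
    INSERT INTO old_table (GRVID)
    SELECT VTGONR FROM new_table
    WHERE VTGONR NOT IN (SELECT GRVID FROM old_table)
    """
)

rows = [r[0] for r in conn.execute("SELECT GRVID FROM old_table ORDER BY GRVID")]
print(rows)
```

One thing to watch with NOT IN: if the subquery returns a NULL, the condition is never true and no rows are inserted, so it is worth making sure the key column in the target table has no missing values.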

Process a row with unknown structure in a cursor

I am new to using cursors for looping through a set of rows. So far I have always known in advance which columns I am about to read.
E.g.
DECLARE db_cursor CURSOR FOR
SELECT Column1, Column2
FROM MyTable
DECLARE @ColumnOne VARCHAR(50), @ColumnTwo VARCHAR(50)
OPEN db_cursor
FETCH NEXT FROM db_cursor INTO @ColumnOne, @ColumnTwo
...
But the tables I am about to read into my key/value table have no specific structure and I should be able to process them one row at a time. How, using a nested cursor, can I loop through all the columns of the fetched row and process them according to their type and name?
TSQL cursors are not really designed to read data from tables of unknown structure. The two possibilities I can think of to achieve something in that direction are:
First read the column names of an unknown table from the Information Schema Views (see System Information Schema Views (Transact-SQL)). Then use dynamic SQL to create the cursor.
If you simply want to get any columns as a large string value, you might also try a simple SELECT * FROM TABLE_NAME FOR XML AUTO and further process the retrieved data for your purposes (see FOR XML (SQL Server)).
SQL is not very good at dealing with sets generically. In most cases you must know column names, data types and much more in advance. But there is XQuery. You can transform any SELECT into XML rather easily and use its powerful abilities to deal with generic structures there. I would not recommend this, but it might be worth a try:
CREATE PROCEDURE dbo.Get_EAV_FROM_SELECT
(
@SELECT NVARCHAR(MAX)
)
AS
BEGIN
DECLARE @tmptbl TABLE(TheContent XML);
DECLARE @cmd NVARCHAR(MAX)= N'SELECT (' + @SELECT + N' FOR XML RAW, ELEMENTS XSINIL);';
INSERT INTO @tmptbl EXEC(@cmd);
SELECT r.value('*[1]/text()[1]','nvarchar(max)') AS RowID
,c.value('local-name(.)','nvarchar(max)') AS ColumnKey
,c.value('text()[1]','nvarchar(max)') AS ColumnValue
FROM @tmptbl t
CROSS APPLY t.TheContent.nodes('/row') A(r)
CROSS APPLY A.r.nodes('*[position()>1]') B(c)
END;
GO
EXEC Get_EAV_FROM_SELECT @SELECT='SELECT TOP 10 o.object_id,o.* FROM sys.objects o';
GO
--Clean-Up for test purpose
DROP PROCEDURE Get_EAV_FROM_SELECT;
The idea in short
The select is passed into the procedure as a string. Within the SP we build a statement dynamically and create XML from it.
The very first column is considered to be the Row's ID, if not (like in sys.objects) we can write the SELECT and force it that way.
The inner SELECT will read each row and return a classical EAV-list.
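The same idea (treat the first column as the row ID and unpivot every other column into key/value pairs without knowing the structure in advance) can be sketched outside the database as well. The following is a minimal illustration in Python with the built-in sqlite3 module as a stand-in for SQL Server; the helper name `rows_as_eav` and the sample table are hypothetical.

```python
import sqlite3


def rows_as_eav(conn, select_sql):
    """Yield (row_id, column_name, value) for every column after the first.

    The column names are read from the cursor metadata, so the caller
    does not need to know the structure of the result set in advance.
    """
    cur = conn.execute(select_sql)
    names = [d[0] for d in cur.description]
    for row in cur:
        row_id = row[0]  # first column is taken as the row's ID
        for name, value in zip(names[1:], row[1:]):
            yield row_id, name, value


# Hypothetical sample table, only for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT, qty INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [(1, "bolt", 10), (2, "nut", None)])

eav = list(rows_as_eav(conn, "SELECT id, name, qty FROM t ORDER BY id"))
for triple in eav:
    print(triple)
```

Client libraries generally expose result-set metadata in a similar way, which is often simpler than nesting cursors inside T-SQL.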

How to update column based on column name in postgres?

I've narrowed it down to two possibilities - DynamicSQL and using a case statement.
However, I've failed with both of these.
I simply don't understand dynamicSQL, and how I would use it in my case.
This is my attempt using case statements; one of many failed variations.
SELECT column_name,
CASE WHEN column_name = 'address' THEN (**update statement gives syntax error within here**)
END
FROM information_schema.columns
WHERE table_name = 'employees';
As an overview, I'm using Axios to talk to my Node server, which is making calls to my Heroku database using Massivejs.
Maybe this isn't the way to go - so here's my main problem:
I've run into trouble because the values I'm planning on using as column names are sent to my server as strings. The exact call that I've been trying to use is
update employees
set $1 = $2
where employee_id = $3;
Once again, I'm passing into those using massive.
I get the error back { error: syntax error at or near "'address'"} because my incoming values are strings. My thought process was that the above statement would allow me to use variables because 'address' is encapsulated by quotes.
But alas, my thought process has failed me.
This seems to be close to answering my question, but I can't seem to figure out what to do in my case if using dynamic SQL.
How to use dynamic column names in an UPDATE or SELECT statement in a function?
Thanks in advance.
I will show you a way to do this by using a function.
First we create the employees table :
CREATE TABLE employees(
id BIGSERIAL PRIMARY KEY,
column1 TEXT,
column2 TEXT
);
Next, we create a function that requires three parameters:
columnName - the name of the column that needs to be updated
columnValue - the new value to which the column needs to be updated
employeeId - the id of the employee that will be updated
By using the format function we generate the update query as a string and use the EXECUTE command to execute the query.
Here is the code of the function.
CREATE OR REPLACE FUNCTION update_columns_on_employee(columnName TEXT, columnValue TEXT, employeeId BIGINT)
RETURNS VOID AS
$$
DECLARE update_statement TEXT := format('UPDATE employees SET %I = %L WHERE id = %L', columnName, columnValue, employeeId);
BEGIN
EXECUTE update_statement;
end;
$$ LANGUAGE plpgsql;
Now, let's insert some data into the employees table:
INSERT INTO employees(column1, column2) VALUES ('column1_start_value','column2_start_value');
So now we currently have an employee with an id value of 1 who has 'column1_start_value' value for the column1, and 'column2_start_value' value for column2.
If we want to update the value of column2 from 'column2_start_value' to 'column2_new_value' all we have to do is execute the following call
SELECT * FROM update_columns_on_employee('column2','column2_new_value',1);
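The underlying rule is that bind parameters such as $1 can stand in for values but not for identifiers, so the column name has to be placed into the statement text itself. If the statement is built on the application side instead of in a database function, one common approach is to check the incoming name against a fixed list of known columns first. A minimal sketch of that approach follows, using Python and sqlite3 as a stand-in for Node, Massive and PostgreSQL; `ALLOWED_COLUMNS` and `update_employee_column` are hypothetical names.

```python
import sqlite3

# Hypothetical whitelist of columns that callers are allowed to update.
ALLOWED_COLUMNS = {"column1", "column2"}


def update_employee_column(conn, column_name, column_value, employee_id):
    """Update one whitelisted column of one employee."""
    if column_name not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown column: {column_name!r}")
    # The identifier is interpolated only after the whitelist check;
    # the values are still passed as bound parameters.
    sql = f'UPDATE employees SET "{column_name}" = ? WHERE id = ?'
    conn.execute(sql, (column_value, employee_id))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, column1 TEXT, column2 TEXT)")
conn.execute(
    "INSERT INTO employees (column1, column2) "
    "VALUES ('column1_start_value', 'column2_start_value')"
)

update_employee_column(conn, "column2", "column2_new_value", 1)
result = conn.execute("SELECT column1, column2 FROM employees WHERE id = 1").fetchone()
print(result)
```

Rejecting unknown names up front keeps arbitrary strings from the client out of the statement text.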

Split Cell using SSIS

I'm looking to use SSIS to transform the data held from a single source table. One of the cells has a string of characters. For example:
##/\/\/\/\/\##HHHHHHBBBB##/\/\/\/\/\
There's also another cell on the same row which contains a date.
Basically I want each character within that string to be transferred to a new table as a row of its own. The first two characters represent the date given in the other cell. The next two characters represent the following day, and so on. So as well as having each character on its own row, I would also want to increment the date and store that too.
Any idea how I would go about doing this or even if SSIS is the correct tool to be using.
Many Thanks
I wonder if you'd be better off running this through a split-string function in SQL first? That way you'll get a row for each character alongside the date, and then you can just output it straight to a destination.
I've created a function to facilitate this:
CREATE FUNCTION [dbo].[udf_SplitStringIntoRows](@text varchar(max))
RETURNS @tbl TABLE ([value] char(1) NOT NULL)
AS
BEGIN
WHILE len(@text) > 0
BEGIN
INSERT INTO @tbl
SELECT left(@text,1)
SET @text = RIGHT(@text,len(@text)-1)
END
RETURN
END
Then, to test this, I created a quick table variable with your data in it:
DECLARE @source as TABLE([value] varchar(max), [date] datetime)
INSERT INTO @source
SELECT '##/\/\/\/\/\##HHHHHHBBBB##/\/\/\/\/\', getdate()
UNION
SELECT '##/\/\/\/\/\##HHHHHHBBBB##/\/\/\/\/\', getdate()+1
UNION
SELECT '##/\/\/\/\/\##HHHHHHBBBB##/\/\/\/\/\', getdate()+2
Then I cross applied the function to this dataset:
SELECT d.[value], s.date
FROM @source s
CROSS APPLY dbo.[udf_SplitStringIntoRows](s.value) d
Which should give you the source dataset you require to further process in SSIS.
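The query above returns the row's original date for every character; the date increment described in the question (every two characters advance by one day) would still need to be added. As a rough sketch of that arithmetic, written in Python rather than T-SQL or SSIS, the mapping could look like this. The function name `expand_codes`, the shortened sample string and the start date are hypothetical.

```python
from datetime import date, timedelta


def expand_codes(text, start):
    """Return (character, date) pairs; every two characters share one day.

    Characters 0-1 get the start date, characters 2-3 the next day, and so on.
    """
    return [(ch, start + timedelta(days=i // 2)) for i, ch in enumerate(text)]


# Shortened sample string and an arbitrary start date, for illustration only.
rows = expand_codes("##/\\HH", date(2024, 1, 1))
for ch, day in rows:
    print(ch, day.isoformat())
```

In the SQL function the same effect would need a position counter stored next to each character, so that the date can be computed as the source date plus (position - 1) / 2 days using integer division.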