Hibernate: StoredProcedure with recursive depthsearch: Mapping/Output Problems - postgresql

I searching for help. I have to map my Postgres 9.4 Database (DB) with Hibernate 5.2, of course it's an study task. The biggest Problem is, that I'm no brain in Hibernate, Java and coding itself XD
It's an SozialNetwork DB. To map the DB with Hibernate doing fine.
Now I should map a stored produce. This Produce should find the shortest friendship path between two persons. In Postgres the produce working fine.
That are the relevant DB-Tables:
For Person:
CREATE TABLE Person (
PID bigint NOT NULL,
firstName varchar(50) DEFAULT NULL,
lastName varchar(50) DEFAULT NULL,
(some more...)
PRIMARY KEY (PID)
);
And for the Relationship between to Persons:
CREATE TABLE Person_knows_Person (
ApID bigint NOT NULL,
BpID bigint REFERENCES Person (PID) (..)
knowsCreationDate timestamp,
PRIMARY KEY (ApID,BpID));
And that is the Stored Produce in short:
CREATE OR REPLACE FUNCTION ShortFriendshipPath(pid bigint, pid2 bigint)
RETURNS TABLE (a_pid bigint, b_pid bigint, depth integer, path2 bigint[], cycle2 boolean)
AS $$
BEGIN
RETURN QUERY
SELECT * FROM (
WITH RECURSIVE FriendshipPath(apid, bpid, depth, path, cycle) AS(
SELECT pkp.apid, pkp.bpid,1,
ARRAY[pkp.apid], false
FROM person_knows_person pkp
WHERE apid=$1 --OR bpid=$1
UNION ALL
SELECT pkp.apid, pkp.bpid, fp.depth+1, path || pkp.apid,
pkp.apid = ANY(path)
FROM person_knows_person pkp, FriendshipPath fp
WHERE pkp.apid = fp.bpid AND NOT cycle)
SELECT *
FROM FriendshipPath WHERE bpid=$2) AS OKOK
UNION
SELECT * FROM (
WITH RECURSIVE FriendshipPath(apid, bpid, depth, path, cycle) AS(
SELECT pkp.apid, pkp.bpid,1,
ARRAY[pkp.apid], false
FROM person_knows_person pkp
WHERE apid=$2 --OR bpid=$1
UNION ALL
SELECT pkp.apid, pkp.bpid, fp.depth+1, path || pkp.apid,
pkp.apid = ANY(path)
FROM person_knows_person pkp, FriendshipPath fp
WHERE pkp.apid = fp.bpid AND NOT cycle)
SELECT *
FROM FriendshipPath WHERE bpid=$1) AS YOLO
ORDER BY depth ASC LIMIT 1;
END;
$$ LANGUAGE 'plpgsql' ;
(Sorry for so much code, but it's for both directions, and before I post some copy+reduce misttakes^^)
The Call in Postgre for example:
SELECT * FROM ShortFriendshipPath(10995116277764, 94);
gives me this Output:
enter image description here
I use the internet for help and find 3 solutions for calling:
direct SQL call
call with NamedQuery and
map via XML
(fav found here)
I faild with all of them XD
I favorite the 1. solution with this call in session:
Session session = HibernateUtility.getSessionfactory().openSession();
Transaction tx = null;
try {
tx = session.beginTransaction();
System.out.println("Please insert a second PID:");
Scanner scanner = new Scanner(System.in);
long pid2 = Long.parseLong(scanner.nextLine());
// **Insert of second ID*/
Query query2 = session.createQuery("FROM " + Person.class.getName() + " WHERE pid = :pid ");
query2.setParameter("pid", pid2);
List<Person> listB = ((org.hibernate.Query) query2).list();
int cnt1 = 0;
while (cnt1 < listB.size()) {
Person pers1 = listB.get(cnt1++);
pid2 = pers1.getPid();
}
// Query call directly:
Query querySP = session.createSQLQuery("SELECT a_pid,path2 FROM ShortFriendshipPath(" + pid + "," + pid2 + ")");
List <Object[]> list = ((org.hibernate.Query) querySP).list();
for (int i=0; i<list.size();i++){
Personknowsperson friendship = (Personknowsperson)result.get(i);
}
} catch (Exception e) { (bla..)}
} finally { (bla....) }
Than I get following Error:
javax.persistence.PersistenceException:
org.hibernate.MappingException: No Dialect mapping for JDBC type: 2003
(..blabla...)
I understand why. Because my output is not of type Personknowsperson. I found an answer: that I have to say Hibernate what is the correct formate. And should use 'UserType'. So I try to find some explanations for how I create my UserType. But I found nothing, that I understand. Second Problem: I'm not sure what I should use for the bigint[] (path2). You see I'm expert -.-
Than I got the idea to try the 3.solution. But the first problem I had was where should I write the xml stuff. Because my Output is no table. So I try in the .cfg.xml but than Hibernate say that
Caused by: java.lang.IllegalArgumentException: org.hibernate.internal.util.config.ConfigurationException: Unable to perform unmarshalling at line number -1 and column -1 in RESOURCE hibernate.cfg.xml. Message: cvc-complex-type.2.4.a: Ungültiger Content wurde beginnend mit Element 'sql-query' gefunden. '{some links}' wird erwartet.
translation:
invalid content found starts with 'sql-query'
Now I'm a nervous wreck. And ask you.
Could someone explain what I have to do and what I did wrong (for dummies please). If more code need (java classes or something else) please tell me. Critic for coding also welcome, cause I want improve =)

Ok, I'm not an expert in postgressql, not hibernate, nor java. (I'm working with C#, SQL Server, NHibernate so ...) I still try to give you some hints.
You probably can set the types of the columns using addXyz methods:
Query querySP = session
.createSQLQuery("SELECT * FROM ShortFriendshipPath(...)")
.addScalar("a_pid", LongType.INSTANCE)
...
// add user type?
You need to create a user type for the array. I don't know how and if you can add it to the query. See this answer here.
You can also add the whole entity:
Query querySP = session
.createSQLQuery("SELECT * FROM ShortFriendshipPath(...)")
.addEntity(Personknowsperson.class)
...;
I hope it takes the mapping definition of the corresponding mapping file, where you can specify the user type.
Usually it's much easier to get a flat list of values, I mean a separate row for each different value in the array. Like this:
Instead of
1 | 2 | (3, 4, 5) | false
You would get:
1 | 2 | 3 | false
1 | 2 | 4 | false
1 | 2 | 5 | false
Which seems denormalized, but is actually the way how you build relational data.
In general: use parameters when passing stuff like ids to queries.
Query querySP = session
.createSQLQuery("SELECT * FROM ShortFriendshipPath(:pid1, :pid2)")
.setParameter("pid1", pid1)
.setParameter("pid2", pid2)
...

Related

Oracle query to Postgress query change ERROR

We are in progress to move our db to Postgres from Orcale.
There is this Oracle query (changed verbiage for privacy concerns):
SELECT * FROM (SELECT somedata.* FROM SNSN.SMS_TXN_SOMEDATA somedata
WHERE ((car= 'tesla' OR car = 'teslaX' OR car = 'teslaY') OR (buyer= 'john' OR buyer = 'rony' OR jim = 'sam'))
AND code = :code
ORDER BY somedata.datetime)
WHERE LIMIT :num
When I hit the endpoint I get this error
ERROR: argument of LIMIT must be type bigint, not type character varying
What would be a suitable/alternative variable to bind this. What can I use instead of variables :code and :num.

PostgreSQL ERROR: invalid input syntax for integer: "1e+06"

The full error message is:
ERROR: invalid input syntax for integer: "1e+06"
SQL state: 22P02
Context: In PL/R function sample
The query I'm using is:
WITH a as
(
SELECT a.tract_id_alias,
array_agg(a.pgid ORDER BY a.pgid) as pgids,
array_agg(a.sample_weight_geo ORDER BY a.pgid) as block_weights
FROM results_20161109.block_microdata_res_joined a
WHERE a.tract_id_alias in (66772, 66773, 66785, 66802, 66805, 66806, 66813)
AND a.bldg_count_res > 0
GROUP BY a.tract_id_alias
)
SELECT NULL::INTEGER agent_id,
a.tract_id_alias,
b.year,
unnest(shared.sample(a.pgids,
b.n_agents,
1 * b.year,
True,
a.block_weights)
) as pgid
FROM a
LEFT JOIN results_20161109.initial_agent_count_by_tract_res_11 b
ON a.tract_id_alias = b.tract_id_alias
ORDER BY b.year, a.tract_id_alias, pgid;
And the shared.sample function I'm using is:
CREATE OR REPLACE FUNCTION shared.sample(ids bigint[], size integer, seed integer DEFAULT 1, with_replacement boolean DEFAULT false, probabilities numeric[] DEFAULT NULL::numeric[])
RETURNS integer[] AS
$BODY$
set.seed(seed)
if (length(ids) == 1) {
s = rep(ids,size)
} else {
s = sample(ids,size, with_replacement,probabilities)
}
return(s)
$BODY$
LANGUAGE plr VOLATILE
COST 100;
ALTER FUNCTION shared.sample(bigint[], integer, integer, boolean, numeric[])
OWNER TO "server-superusers";
I'm pretty new to this stuff, so any help would be appreciated.
Not a problem of the function. Like the error messages says: The string '1e+06' cannot be cast to integer.
Obviously, the columns n_agents in your table results_20161109.initial_agent_count_by_tract_res_11 is not an integer column. Probably type text or varchar? (That info would help in your question.)
Either way, the assignment cast does not work for the target type integer. But it does for numeric:
Does not work:
SELECT '1e+06'::text::int; -- error as in question
Works:
SELECT '1e+06'::text::numeric::int;
If my assumptions hold, you can use this as stepping stone.
Replace b.n_agents in your query with b.n_agents::numeric::int.
It's your responsibility that numbers stay in integer range, or you get the next exception.
If that did not nail it, you need to look into function overloading:
Is there a way to disable function overloading in Postgres
And function type resolution:
PostgreSQL function call
The schema search path is relevant in many related cases, but you did schema-qualify all objects, so we can rule that out.
How does the search_path influence identifier resolution and the "current schema"
Your query generally looks good. I had a look and only found minor improvements:
SELECT NULL::int AS agent_id -- never omit the AS keyword for column alias
, a.tract_id_alias
, b.year
, s.pgid
FROM (
SELECT tract_id_alias
, array_agg(pgid) AS pgids
, array_agg(sample_weight_geo) AS block_weights
FROM ( -- use a subquery, cheaper than CTE
SELECT tract_id_alias
, pgid
, sample_weight_geo
FROM results_20161109.block_microdata_res_joined
WHERE tract_id_alias IN (66772, 66773, 66785, 66802, 66805, 66806, 66813)
AND bldg_count_res > 0
ORDER BY pgid -- sort once in a subquery. cheaper.
) sub
GROUP BY 1
) a
LEFT JOIN results_20161109.initial_agent_count_by_tract_res_11 b USING (tract_id_alias)
LEFT JOIN LATERAL
unnest(shared.sample(a.pgids
, b.n_agents
, b.year -- why "1 * b.year"?
, true
, a.block_weights)) s(pgid) ON true
ORDER BY b.year, a.tract_id_alias, s.pgid;

How to implement raw sql query in Tastypie

I am trying to do this simple query, but it does not work. Thanks.
SELECT * FROM TSimple where (start_date < '2012-04-20' and end_date is null) or
(end_date > '2012-04-20' and start_date < '2012-04-20')
class TSimple (models.Model):
start_date = models.DateTimeField()
end_date = models.DateTimeField(blank=True, null=True)
...
class TSimpleResource(ModelResource):
def dehydrate(self, bundle):
request_method = bundle.request.META['REQUEST_METHOD']
if request_method=='GET':
new_date = bundle.request.GET.get('new_date', '')
qs = TSimple.objects.raw(
'SELECT * FROM TSimple where (start_date<=\'' +
new_date + '\' and end_date>=\'' +
new_date + '\') or (start_date<=\'' + new_date +
'\' and end_date is null)')
ret_list = [row for row in qs]
// NOT WORK. Not able to get correct json data in javascript.
// It needs return bundle. HOW to replace bundle?
// Is this correct way to do it?
return ret_list
else:
// This is ok.
return bundle
I have following questions:
1) (raw sql method) If implementing in dehydrate method is correct way to do it? If it is, above does not work. It should return bundle object. How to construct new bundle?
If above method is ok, I noticed that bundle already constructed .data field with default query(?), which will be thrown away with new query. That raise the questions if this is right way to do it.
2) If there are other raw sql method to do it? Where to execute the sql?
3) How to do it in filter?
4) I know sql and not familiar with complex filter. That's why I am trying to use raw sql method to do quick prototype. What are the draw back? I noticed that using Tastypie has many unnecessary queries which I don't know how to get rid of it. Example, query on table with foreign key trigger query to another table's data, which I don't want to get.
I figure out the filter and it seems worked. But I am still interested in raw sql.
def apply_filters(self, request, applicable_filters):
base_filter = super(TSimpleResource, self).apply_filters(request,
applicable_filters)
new_date = request.GET.get('new_date', None)
if new_date:
qset = (
(
Q(start_date__lte=new_date) &
Q(end_date__gte=new_date)
) |
(
Q(start_date__lte=new_date) &
Q(end_date__isnull=True)
)
)
base_filter = base_filter.filter(qset)
return base_filter

Inserting values into multiple columns by splitting a string in PostgreSQL

I have the following heap of text:
"BundleSize,155648,DynamicSize,204800,Identifier,com.URLConnectionSample,Name,
URLConnectionSample,ShortVersion,1.0,Version,1.0,BundleSize,155648,DynamicSize,
16384,Identifier,com.IdentifierForVendor3,Name,IdentifierForVendor3,ShortVersion,
1.0,Version,1.0,".
What I'd like to do is extract data from this in the following manner:
BundleSize:155648
DynamicSize:204800
Identifier:com.URLConnectionSample
Name:URLConnectionSample
ShortVersion:1.0
Version:1.0
BundleSize:155648
DynamicSize:16384
Identifier:com.IdentifierForVendor3
Name:IdentifierForVendor3
ShortVersion:1.0
Version:1.0
All tips and suggestions are welcome.
It isn't quite clear what do you need to do with this data. If you really need to process it entirely in the database (looks like the task for your favorite scripting language instead), one option is to use hstore.
Converting records one by one is easy:
Assuming
%s =
BundleSize,155648,DynamicSize,204800,Identifier,com.URLConnectionSample,Name,URLConnectionSample,ShortVersion,1.0,Version,1.0
SELECT * FROM each(hstore(string_to_array(%s, ',')));
Output:
key | value
--------------+-------------------------
Name | URLConnectionSample
Version | 1.0
BundleSize | 155648
Identifier | com.URLConnectionSample
DynamicSize | 204800
ShortVersion | 1.0
If you have table with columns exactly matching field names (note the quotes, populate_record is case-sensitive to key names):
CREATE TABLE data (
"BundleSize" integer, "DynamicSize" integer, "Identifier" text,
"Name" text, "ShortVersion" text, "Version" text);
You can insert hstore records into it like this:
INSERT INTO data SELECT * FROM
populate_record(NULL::data, hstore(string_to_array(%s, ',')));
Things get more complicated if you have comma-separated values for more than one record.
%s = BundleSize,155648,DynamicSize,204800,Identifier,com.URLConnectionSample,Name,URLConnectionSample,ShortVersion,1.0,Version,1.0,BundleSize,155648,DynamicSize,16384,Identifier,com.IdentifierForVendor3,Name,IdentifierForVendor3,ShortVersion,1.0,Version,1.0,
You need to break up an array into chunks of number_of_fields * 2 = 12 elements first.
SELECT hstore(row) FROM (
SELECT array_agg(str) AS row FROM (
SELECT str, row_number() OVER () AS i FROM
unnest(string_to_array(%s, ',')) AS str
) AS str_sub
GROUP BY (i - 1) / 12) AS row_sub
WHERE array_length(row, 1) = 12;
Output:
"Name"=>"URLConnectionSample", "Version"=>"1.0", "BundleSize"=>"155648", "Identifier"=>"com.URLConnectionSample", "DynamicSize"=>"204800", "ShortVersion"=>"1.0"
"Name"=>"IdentifierForVendor3", "Version"=>"1.0", "BundleSize"=>"155648", "Identifier"=>"com.IdentifierForVendor3", "DynamicSize"=>"16384", "ShortVersion"=>"1.0"
And inserting this into the aforementioned table:
INSERT INTO data SELECT (populate_record(NULL::data, hstore(row))).* FROM ...
the rest of the query is the same.

Metadata about a column in SQL Server 2008 R2?

I'm trying to figure out a way to store metadata about a column without repeating myself.
I'm currently working on a generic dimension loading SSIS package that will handle all my dimensions. It currently does :
Create a temporary table identical to the given table name in parameters (this is a generic stored procedure that receive the table name as parameter, and then do : select top 0 * into ##[INSERT ORIGINAL TABLE NAME HERE] from [INSERT ORIGINAL TABLE NAME HERE]).
==> Here we insert custom code for this particular dimension that will first query the data from a datasource and get my delta, then transform the data and finally loads it into my temporary table.
Merge the temporary table into my original table with a T-SQL MERGE, taking care of type1 and type2 fields accordingly.
My problem right now is that I have to maintain a table with all the fields in it to store a metadata to tell my scripts if this particular field is type1 or type2... this is nonsense, I can get the same data (minus type1/type2) from sys.columns/sys.types.
I was ultimately thinking about renaming my fields to include their type in it, such as :
FirstName_T2, LastName_T2, Sex_T1 (well, I know this can be type2, let's not fall into that debate here).
What do you guyz would do with that? My solution (using a table with that metadata) is currently in place and working, but it's obvious that repeating myself from the systables to a custom table is nonsense, just for a simple type1/type2 info.
UPDATE: I also thought about creating user defined types like varchar => t1_varchar, t2_varchar, etc. This sounds like something a bit sluggy too...
Everything you need should already be in INFORMATION_SCHEMA.COLUMNS
I can't follow your thinking of not using provided tables/views...
Edit: As scarpacci mentioned, this somewhat portable if needed.
I know this is bad, but I will post an answer to my own question... Thanks to GBN for the help tho!
I am now storing "flags" in the "description" field of my columns. I, for example, can store a flag this way : "TYPE_2_DATA".
Then, I use this query to get the flag back for each and every column :
select columns.name as [column_name]
,types.name as [type_name]
,extended_properties.value as [column_flags]
from sys.columns
inner join sys.types
on columns.system_type_id = types.system_type_id
left join sys.extended_properties
on extended_properties.major_id = columns.object_id
and extended_properties.minor_id = columns.column_id
and extended_properties.name = 'MS_Description'
where object_id = ( select id from sys.sysobjects where name = 'DimDivision' )
and is_identity = 0
order by column_id
Now I can store metadata about columns without having to create a separate table. I use what's already in place and I don't repeat myself. I'm not sure this is the best possible solution yet, but it works and is far better than duplicating information.
In the future, I will be able to use this field to store more metadata, where as : "TYPE_2_DATA|ANOTHER_FLAG|ETC|OH BOY!".
UPDATE :
I now store the information in separate extended properties. You can manage extended properties using sp_addextendedproperty and sp_updateextendedproperty stored procedures. I have created a simple store procedure that help me to update those values regardless if they currently exist or not :
create procedure [dbo].[UpdateSCDType]
#tablename nvarchar(50),
#fieldname nvarchar(50),
#scdtype char(1),
#dbschema nvarchar(25) = 'dbo'
as
begin
declare #already_exists int;
if ( #scdtype = '1' or #scdtype = '2' )
begin
select #already_exists = count(1)
from sys.columns
inner join sys.extended_properties
on extended_properties.major_id = columns.object_id
and extended_properties.minor_id = columns.column_id
and extended_properties.name = 'ScdType'
where object_id = (select sysobjects.id from sys.sysobjects where sysobjects.name = #tablename)
and columns.name = #fieldname
if ( #already_exists = 0 )
begin
exec sys.sp_addextendedproperty
#name = N'Scd_Type',
#value = #scdtype,
#level0type = N'SCHEMA',
#level0name = #dbschema,
#level1type = N'TABLE',
#level1name = #tablename,
#level2type = N'COLUMN',
#level2name = #fieldname
end
else
begin
exec sys.sp_updateextendedproperty
#name = N'Scd_Type',
#value = #scdtype,
#level0type = N'SCHEMA',
#level0name = #dbschema,
#level1type = N'TABLE',
#level1name = #tablename,
#level2type = N'COLUMN',
#level2name = #fieldname
end
end
end
Thanks again