Postgres custom aggregation returns null when parallelized - postgresql

I have created a custom aggregate in Postgres 11.3, and it works when parallel is off. When I mark it as parallel = safe, it returns null.
Could someone point me in the direction of where to start looking or how do I debug a parallel aggregation in postgres? In non parallel aggregation I can insert the state at each record into a temporary table, but inserts are not allowed in parallel queries...
Here's the aggregate:
CREATE OR REPLACE FUNCTION array_sort(ANYARRAY)
RETURNS ANYARRAY LANGUAGE SQL
AS $$
SELECT ARRAY(SELECT unnest($1) ORDER BY 1)
$$;
create type _stats_agg_accum_type AS (
cnt bigint,
q double precision[],
n double precision[],
np double precision[],
dn double precision[]
);
create type _stats_agg_result_type AS (
count bigint,
q25 double precision,
q50 double precision,
q75 double precision
);
create or replace function _stats_agg_p2_parabolic(_stats_agg_accum_type, double precision, double precision)
returns double precision AS '
DECLARE
a alias for $1;
i alias for $2;
d alias for $3;
BEGIN
RETURN a.q[i] + d / (a.n[i + 1] - a.n[i - 1]) * ((a.n[i] - a.n[i - 1] + d) * (a.q[i + 1] - a.q[i]) / (a.n[i + 1] - a.n[i]) + (a.n[i + 1] - a.n[i] - d) * (a.q[i] - a.q[i - 1]) / (a.n[i] - a.n[i - 1]));
END;
'
language plpgsql;
create or replace function _stats_agg_p2_linear(_stats_agg_accum_type, double precision, double precision)
returns double precision AS '
DECLARE
a alias for $1;
i alias for $2;
d alias for $3;
BEGIN
return a.q[i] + d * (a.q[i + d] - a.q[i]) / (a.n[i + d] - a.n[i]);
END;
'
language plpgsql;
create or replace function _stats_agg_accumulator(_stats_agg_accum_type, double precision)
returns _stats_agg_accum_type AS '
DECLARE
a ALIAS FOR $1;
x alias for $2;
k int;
d double precision;
qp double precision;
BEGIN
a.cnt = a.cnt + 1;
if a.cnt <= 5 then
a.q = array_append(a.q, x);
if a.cnt = 5 then
a.q = array_sort(a.q);
end if;
return a;
end if;
case
when x < a.q[1] then
a.q[1] = x;
k = 1;
when x >= a.q[1] and x < a.q[2] then
k = 1;
when x >= a.q[2] and x < a.q[3] then
k = 2;
when x >= a.q[3] and x < a.q[4] then
k = 3;
when x >= a.q[4] and x <= a.q[5] then
k = 4;
when x > a.q[5] then
a.q[5] = x;
k = 4;
end case;
for ii in 1..5 loop
if ii > k then
a.n[ii] = a.n[ii] + 1;
end if;
a.np[ii] = a.np[ii] + a.dn[ii];
end loop;
for ii in 2..4 loop
d = a.np[ii] - a.n[ii];
if (d >= 1 and a.n[ii+1] - a.n[ii] > 1) or (d <= -1 and a.n[ii-1] - a.n[ii] < -1) then
d = sign(d);
qp = _stats_agg_p2_parabolic(a, ii, d);
if qp > a.q[ii-1] and qp < a.q[ii+1] then
a.q[ii] = qp;
else
a.q[ii] = _stats_agg_p2_linear(a, ii, d);
end if;
a.n[ii] = a.n[ii] + d;
end if;
end loop;
return a;
END;
'
language plpgsql;
create or replace function _stats_agg_combiner(_stats_agg_accum_type, _stats_agg_accum_type)
returns _stats_agg_accum_type AS '
DECLARE
a alias for $1;
b alias for $2;
c _stats_agg_accum_type;
BEGIN
c.cnt = a.cnt + b.cnt;
c.q[2] = (a.q[2] + b.q[2]) / 2;
c.q[3] = (a.q[3] + b.q[3]) / 2;
c.q[4] = (a.q[4] + b.q[4]) / 2;
RETURN c;
END;
'
strict language plpgsql;
create or replace function _stats_agg_finalizer(_stats_agg_accum_type)
returns _stats_agg_result_type AS '
BEGIN
RETURN row(
$1.cnt,
$1.q[2],
$1.q[3],
$1.q[4]
);
END;
'
language plpgsql;
create aggregate stats_agg(double precision) (
sfunc = _stats_agg_accumulator,
stype = _stats_agg_accum_type,
finalfunc = _stats_agg_finalizer,
combinefunc = _stats_agg_combiner,
--parallel = safe,
initcond = '(0, {}, "{1,2,3,4,5}", "{1,2,3,4,5}", "{0,0.25,0.5,0.75,1}")'
);
Here's the setup and run code:
--CREATE TABLE temp (val double precision);
--insert into temp (val) select i from generate_series(0, 150000) as t(i);
select (stats_agg(val)).* from temp;
The expected result is as follows; this is what I get with parallel = unsafe:
150001, 37500, 75000, 112500
In parallel = safe I get nulls:
150001, null, null, null

The problem is in the _stats_agg_combiner function. The function definition includes the strict keyword, so there is no need to check for null input values.
In this specific aggregate, the _stats_agg_accum_type includes multiple arrays, and the _stats_agg_combiner function requires each of these arrays to hold at least 5 entries. That assumes every _stats_agg_accum_type instance has processed at least 5 records before it is passed to the _stats_agg_combiner function.
The tests ran on a table with 150k records, on the assumption that each instance would therefore receive at least 5 records. That assumption turned out to be wrong: regardless of the number of workers used (tested with 1-4), there was always at least one instance that had processed exactly 0 records.
The solution was to add support for a _stats_agg_accum_type instance that has processed zero records and therefore has an array of length 0. See the code below.
create or replace function _stats_agg_combiner(_stats_agg_accum_type, _stats_agg_accum_type)
returns _stats_agg_accum_type AS '
DECLARE
a alias for $1;
b alias for $2;
c _stats_agg_accum_type;
addA boolean;
addB boolean;
BEGIN
addA = a.cnt <= 5;
addB = b.cnt <= 5;
if addA and not addB then
c = b;
elsif addB and not addA then
c = a;
else
c.cnt = a.cnt + b.cnt;
for ii in 2..4 loop
c.q[ii] = (a.q[ii] + b.q[ii]) / 2;
end loop;
end if;
for ii in 1..5 loop
if addA and ii <= a.cnt then
c = _stats_agg_accumulator(c, a.q[ii]);
end if;
if addB and ii <= b.cnt then
c = _stats_agg_accumulator(c, b.q[ii]);
end if;
end loop;
RETURN c;
END;
'
language plpgsql strict;
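The failure mode can be reproduced outside the database. The following is a minimal Python sketch, not the PostgreSQL implementation: State, sql_index, sql_avg2, naive_combine and guarded_combine are hypothetical names, and the sketch only models NULL propagation through the q array. It covers only a partial state with zero rows; states with 1 to 4 rows still need the replay logic shown in the PL/pgSQL above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class State:
    """Simplified stand-in for _stats_agg_accum_type (only cnt and q are modelled)."""
    cnt: int = 0
    q: List[Optional[float]] = field(default_factory=list)  # empty until rows arrive


def sql_index(arr, i):
    """1-based lookup that yields None when out of range, like a PostgreSQL array."""
    return arr[i - 1] if 1 <= i <= len(arr) else None


def sql_avg2(x, y):
    """(x + y) / 2 with SQL-style NULL propagation."""
    return None if x is None or y is None else (x + y) / 2


def naive_combine(a: State, b: State) -> State:
    """Mirrors the original combiner: averages q[2..4] without checking the counts."""
    middle = [sql_avg2(sql_index(a.q, i), sql_index(b.q, i)) for i in (2, 3, 4)]
    return State(a.cnt + b.cnt, [None] + middle + [None])


def guarded_combine(a: State, b: State) -> State:
    """Returns the other state when one side has seen no rows (assumed minimal guard)."""
    if a.cnt == 0:
        return b
    if b.cnt == 0:
        return a
    return naive_combine(a, b)
```

With one populated state and one empty state, naive_combine yields None for all three quantiles, which matches the 150001, null, null, null result, while guarded_combine returns the populated state unchanged.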

system verilog: for loop index variable issue

`timescale 1ns / 1ps
module param_right_shifter
# (parameter N = 3)
(
input logic [$clog2(N)-1:0] a, // input
input logic [N-1:0] amt, // shift bits
output logic [$clog2(N)-1:0] y // output
);
logic [$clog2(N)-1:0][N:0] s;
logic [$clog2(N)-1:0] placeholder = a;
localparam bit_num = $clog2(N)-1;
always_comb
begin
for(int i = 0; i < N; i++)
begin
if (i == 0)
begin
s[i] = amt[i] ? {placeholder[i], placeholder[bit_num:2**i]} : placeholder;
placeholder = s[i];
end
else
begin
s[i] = amt[i] ? {placeholder[$clog2(N)-1:0], placeholder[bit_num:2**i]} : placeholder;
placeholder = s[i];
end
end
end
endmodule
I am having an issue with referencing the 'i' variable. It says that 'range must be bounded by constant expressions'. I am unsure of how to resolve.
Try the code below. I am assuming N has a constant value throughout the simulation.
Please note that placeholder is driven by all bits of s so here placeholder will settle to s[N-1].
genvar i;
generate
for(i = 0; i < N; i++)
begin
if (i == 0)
begin
always_comb
begin
s[i] = amt[i] ? {placeholder[i], placeholder[bit_num:2**i]} : placeholder;
placeholder = s[i];
end
end
else
begin
always_comb
begin
s[i] = amt[i] ? {placeholder[$clog2(N)-1:0], placeholder[bit_num:2**i]} : placeholder;
placeholder = s[i];
end
end
end
endgenerate
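For intuition about why the generate loop helps, here is a small software model in Python of a staged (logarithmic) right shifter. It is a sketch under assumptions: staged_right_shift is a hypothetical name, a plain logical right shift is assumed, and it does not reproduce the exact bit slicing of the module above. Stage i shifts by 2**i when bit i of amt is set, so once the loop is unrolled each stage uses a fixed shift distance, which is the property the 'constant expressions' error asks for.

```python
def staged_right_shift(value: int, amt: int, width: int) -> int:
    """Logical right shift built from log2(width) fixed-distance stages."""
    # Number of stages, equivalent to $clog2(width) for width >= 2.
    stages = max(1, (width - 1).bit_length())
    mask = (1 << width) - 1
    result = value & mask
    for i in range(stages):
        # In hardware each iteration becomes its own stage; 2**i is a
        # constant for that stage once the loop has been unrolled.
        if (amt >> i) & 1:
            result >>= (1 << i)
    return result
```

For example, staged_right_shift(0b10110000, 3, 8) applies the 1-bit stage and then the 2-bit stage, giving the same value as 0b10110000 >> 3.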

Speed in Matlab vs. Julia vs. Fortran

I am playing around with different languages to solve a simple value function iteration problem where I loop over a state-space grid. I am trying to understand the performance differences and how I could tweak each code. For posterity I have posted full length working examples for each language below. However, I believe that most of the tweaking is to be done in the while loop. I am a bit confused what I am doing wrong in Fortran as the speed seems subpar.
Matlab ~2.7secs : I am avoiding a more efficient solution using the repmat function for now to keep the codes comparable. Code seems to be automatically multithreaded onto 4 threads
beta = 0.98;
sigma = 0.5;
R = 1/beta;
a_grid = linspace(0,100,1001);
tic
[V_mat, next_mat] = valfun(beta, sigma, R ,a_grid);
toc
where valfun()
function [V_mat, next_mat] = valfun(beta, sigma, R, a_grid)
zeta = 1-1/sigma;
len = length(a_grid);
V_mat = zeros(2,len);
next_mat = zeros(2,len);
u = zeros(2,len,len);
c = zeros(2,len,len);
for i = 1:len
c(1,:,i) = a_grid(i) - a_grid/R + 20.0;
c(2,:,i) = a_grid(i) - a_grid/R;
end
u = c.^zeta * zeta^(-1);
u(c<=0) = -1e8;
tol = 1e-4;
outeriter = 0;
diff = 1000.0;
while (diff>tol) %&& (outeriter<20000)
outeriter = outeriter + 1;
V_last = V_mat;
for i = 1:len
[V_mat(1,i), next_mat(1,i)] = max( u(1,:,i) + beta*V_last(2,:));
[V_mat(2,i), next_mat(2,i)] = max( u(2,:,i) + beta*V_last(1,:));
end
diff = max(abs(V_mat - V_last));
end
fprintf("\n Value Function converged in %i steps. \n", outeriter)
end
Julia (after compilation) ~5.4secs (4 threads (9425469 allocations: 22.43 GiB)), ~7.8secs (1 thread (2912564 allocations: 22.29 GiB))
[EDIT: after adding correct broadcasting and @views it's only 1.8-2.1 seconds now, see below!]
using LinearAlgebra, UnPack, BenchmarkTools
struct paramsnew
β::Float64
σ::Float64
R::Float64
end
function valfun(params, a_grid)
@unpack β, σ, R = params
ζ = 1-1/σ
len = length(a_grid)
V_mat = zeros(2,len)
next_mat = zeros(2,len)
u = zeros(2,len,len)
c = zeros(2,len,len)
@inbounds for i in 1:len
c[1,:,i] = @. a_grid[i] - a_grid/R .+ 20.0
c[2,:,i] = @. a_grid[i] - a_grid/R
end
u = c.^ζ * ζ^(-1)
u[c.<=0] .= typemin(Float64)
tol = 1e-4
outeriter = 0
test = 1000.0
while test>tol
outeriter += 1
V_last = deepcopy(V_mat)
@inbounds Threads.@threads for i in 1:len # loop over grid points
V_mat[1,i], next_mat[1,i] = findmax( u[1,:,i] .+ β*V_last[2,:])
V_mat[2,i], next_mat[2,i] = findmax( u[2,:,i] .+ β*V_last[1,:])
end
test = maximum( abs.(V_mat - V_last)[.!isnan.( V_mat - V_last )])
end
print("\n Value Function converged in ", outeriter, " steps.")
return V_mat, next_mat
end
a_grid = collect(0:0.1:100)
p1 = paramsnew(0.98, 1/2, 1/0.98);
@time valfun(p1,a_grid)
print("\n should be compiled now \n")
@btime valfun(p1,a_grid)
Fortran (O3, mkl, qopenmp) ~9.2secs: I also must be doing something wrong when declaring the openmp variables as the compilation will crash for some grid sizes when using openmp (SIGSEGV error).
module mod_calc
use omp_lib
implicit none
integer, parameter :: dp = selected_real_kind(33,4931), len = 1001
public :: dp, len
contains
subroutine linspace(from, to, array)
real(dp), intent(in) :: from, to
real(dp), intent(out) :: array(:)
real(dp) :: range
integer :: n, i
n = size(array)
range = to - from
if (n == 0) return
if (n == 1) then
array(1) = from
return
end if
do i=1, n
array(i) = from + range * (i - 1) / (n - 1)
end do
end subroutine
subroutine calc_val()
real(dp):: bbeta, sigma, R, zeta, tol, test
real(dp):: a_grid(len), V_mat(2,len), V_last(2,len), &
u(len,len,2), c(len,len,2)
integer :: outeriter, i, sss, next_mat(2,len), fu
character(len=*), parameter :: FILE_NAME = 'data.txt' ! File name.
call linspace(from=0._dp, to=100._dp, array=a_grid)
bbeta = 0.98
sigma = 0.5
R = 1.0/0.98
zeta = 1.0 - 1.0/sigma
tol = 1e-4
test = 1000.0
outeriter = 0
do i = 1,len
c(:,i,1) = a_grid(i) - a_grid/R + 20.0
c(:,i,2) = a_grid(i) - a_grid/R
end do
u = c**zeta * 1.0/zeta
where (c<=0)
u = -1e6
end where
V_mat = 0.0
next_mat = 0.0
do while (test>tol .and. outeriter<20000)
outeriter = outeriter+1
V_last = V_mat
!$OMP PARALLEL DEFAULT(NONE) &
!$OMP SHARED(V_mat, next_mat,V_last, u, bbeta) &
!$OMP PRIVATE(i)
!$OMP DO SCHEDULE(static)
do i=1,len
V_mat(1,i) = maxval(u(:,i,1) + bbeta*V_last(2,:))
next_mat(1,i) = maxloc(u(:,i,1) + bbeta*V_last(2,:),1)
V_mat(2,i) = maxval(u(:,i,2) + bbeta*V_last(1,:))
next_mat(2,i) = maxloc(u(:,i,2) + bbeta*V_last(1,:),1)
end do
!$OMP END DO
!$OMP END PARALLEL
test = maxval(abs(log(V_last/V_mat)))
end do
end subroutine
end module mod_calc
program main
use mod_calc
implicit none
integer:: clck_counts_beg,clck_rate,clck_counts_end
call omp_set_num_threads(4)
call system_clock ( clck_counts_beg, clck_rate )
call calc_val()
call system_clock ( clck_counts_end, clck_rate )
write (*, '("Time = ",f6.3," seconds.")') (clck_counts_end - clck_counts_beg) / real(clck_rate)
end program main
There should be ways to reduce the number of allocations (Julia reports 32-45% gc time!), but for now I am too much of a novice to see them, so any comments and tips are welcome.
Edit:
Adding @views and correct broadcasting to the while loop improved the Julia speed considerably (as expected, I guess), and it now beats the Matlab loop. With 4 threads the code now takes only 1.97secs. Specifically,
@inbounds for i in 1:len
c[1,:,i] = @views @. a_grid[i] - a_grid/R .+ 20.0
c[2,:,i] = @views @. a_grid[i] - a_grid/R
end
u = @. c^ζ * ζ^(-1)
@. u[c<=0] = typemin(Float64)
while test>tol && outeriter<20000
outeriter += 1
V_last = deepcopy(V_mat)
@inbounds Threads.@threads for i in 1:len # loop over grid points
V_mat[1,i], next_mat[1,i] = @views findmax( @. u[1,:,i] + β*V_last[2,:])
V_mat[2,i], next_mat[2,i] = @views findmax( @. u[2,:,i] + β*V_last[1,:])
end
test = @views maximum( @. abs(V_mat - V_last)[!isnan( V_mat - V_last )])
end
The reason the Fortran is so slow is that it uses quadruple precision. I don't know Julia or Matlab, but it looks as though they use double precision. Further, as noted in the comments, some of the loop orders are incorrect for Fortran, and your use of precision in the Fortran code is inconsistent: most of your constants are single precision. Correcting all these leads to the following:
Original: test = 9.83440674663232047922921588613472439E-0005 Time = 31.413 seconds.
Optimised: test = 9.8343643237979391E-005 Time = 0.912 seconds.
Note I have turned off parallelisation for these, all results are single threaded. Code is below:
module mod_calc
!!$ use omp_lib
implicit none
!!$ integer, parameter :: dp = selected_real_kind(33,4931), len = 1001
integer, parameter :: dp = selected_real_kind(15), len = 1001
public :: dp, len
contains
subroutine linspace(from, to, array)
real(dp), intent(in) :: from, to
real(dp), intent(out) :: array(:)
real(dp) :: range
integer :: n, i
n = size(array)
range = to - from
if (n == 0) return
if (n == 1) then
array(1) = from
return
end if
do i=1, n
array(i) = from + range * (i - 1) / (n - 1)
end do
end subroutine
subroutine calc_val()
real(dp):: bbeta, sigma, R, zeta, tol, test
real(dp):: a_grid(len), V_mat(len,2), V_last(len,2), &
u(len,len,2), c(len,len,2)
integer :: outeriter, i, sss, next_mat(2,len), fu
character(len=*), parameter :: FILE_NAME = 'data.txt' ! File name.
call linspace(from=0._dp, to=100._dp, array=a_grid)
bbeta = 0.98_dp
sigma = 0.5_dp
R = 1.0_dp/0.98_dp
zeta = 1.0_dp - 1.0_dp/sigma
tol = 1e-4_dp
test = 1000.0_dp
outeriter = 0
do i = 1,len
c(:,i,1) = a_grid(i) - a_grid/R + 20.0_dp
c(:,i,2) = a_grid(i) - a_grid/R
end do
u = c**zeta * 1.0_dp/zeta
where (c<=0)
u = -1e6_dp
end where
V_mat = 0.0_dp
next_mat = 0.0_dp
do while (test>tol .and. outeriter<20000)
outeriter = outeriter+1
V_last = V_mat
!$OMP PARALLEL DEFAULT(NONE) &
!$OMP SHARED(V_mat, next_mat,V_last, u, bbeta) &
!$OMP PRIVATE(i)
!$OMP DO SCHEDULE(static)
do i=1,len
V_mat(i,1) = maxval(u(:,i,1) + bbeta*V_last(:, 2))
next_mat(i,1) = maxloc(u(:,i,1) + bbeta*V_last(:, 2),1)
V_mat(i,2) = maxval(u(:,i,2) + bbeta*V_last(:, 1))
next_mat(i,2) = maxloc(u(:,i,2) + bbeta*V_last(:, 1),1)
end do
!$OMP END DO
!$OMP END PARALLEL
test = maxval(abs(log(V_last/V_mat)))
end do
Write( *, * ) test
end subroutine
end module mod_calc
program main
use mod_calc
implicit none
integer:: clck_counts_beg,clck_rate,clck_counts_end
!!$ call omp_set_num_threads(2)
call system_clock ( clck_counts_beg, clck_rate )
call calc_val()
call system_clock ( clck_counts_end, clck_rate )
write (*, '("Time = ",f6.3," seconds.")') (clck_counts_end - clck_counts_beg) / real(clck_rate)
end program main
Compilation / linking:
ian@eris:~/work/stack$ gfortran --version
GNU Fortran (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
ian@eris:~/work/stack$ gfortran -Wall -Wextra -O3 jul.f90
jul.f90:36:48:
character(len=*), parameter :: FILE_NAME = 'data.txt' ! File name.
1
Warning: Unused parameter ‘file_name’ declared at (1) [-Wunused-parameter]
jul.f90:35:57:
integer :: outeriter, i, sss, next_mat(2,len), fu
1
Warning: Unused variable ‘fu’ declared at (1) [-Wunused-variable]
jul.f90:35:36:
integer :: outeriter, i, sss, next_mat(2,len), fu
1
Warning: Unused variable ‘sss’ declared at (1) [-Wunused-variable]
Running:
ian@eris:~/work/stack$ ./a.out
9.8343643237979391E-005
Time = 0.908 seconds.
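The point about single-precision constants can be checked with a short Python sketch. It assumes the default real kind is IEEE single precision, so that a literal such as 0.98 without the _dp suffix is rounded to single before being assigned; as_single is a hypothetical helper that emulates that rounding with the standard struct module.

```python
import struct


def as_single(x: float) -> float:
    """Round a Python float (IEEE double) to IEEE single and widen it back."""
    return struct.unpack("f", struct.pack("f", x))[0]


beta_single = as_single(0.98)  # roughly what "bbeta = 0.98" ends up storing
beta_double = 0.98             # what "bbeta = 0.98_dp" stores when dp is double
error = abs(beta_single - beta_double)

print(f"single-precision literal: {beta_single!r}")
print(f"double-precision literal: {beta_double!r}")
print(f"absolute difference:      {error:.3e}")
```

The difference is small but nonzero, so the original and corrected programs are not iterating on exactly the same parameters, which is one reason the two reported test values differ in their later digits.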
What @Ian Bush says in his answer about the precision is correct. Moreover,
You likely do not need OpenMP for the kind of parallelization in your code. Fortran's do concurrent construct lets the compiler parallelize the loop for you (when the code is compiled with the respective compiler's parallel flag).
Also, the where/elsewhere construct can be slow, as it often requires creating a logical mask array and then applying it in a do-loop. You can use do concurrent in place of where, both to avoid the extra temporary array and to parallelize the computation on multiple cores.
Also, when comparing 64-bit precision numbers, it's good to make sure both values are the same type and kind, to avoid an implicit type/kind conversion before the comparison is made.
Also, the calculation of a_grid(i) - a_grid/R in computing the c array is redundant and can be avoided in the subsequent line.
Here is the modified optimized parallel Fortran code without any OpenMP,
module mod_calc
use iso_fortran_env, only: dp => real64
implicit none
integer, parameter :: len = 1001
public :: dp, len
contains
subroutine linspace(from, to, array)
real(dp), intent(in) :: from, to
real(dp), intent(out) :: array(:)
real(dp) :: range
integer :: n, i
n = size(array)
range = to - from
if (n == 0) return
if (n == 1) then
array(1) = from
return
end if
do concurrent(i=1:n)
array(i) = from + range * (i - 1) / (n - 1)
end do
end subroutine
subroutine calc_val()
implicit none
real(dp) :: bbeta, sigma, R, zeta, tol, test
real(dp) :: a_grid(len), V_mat(len,2), V_last(len,2), u(len,len,2), c(len,len,2)
integer :: outeriter, i, j, k, sss, next_mat(2,len), fu
character(len=*), parameter :: FILE_NAME = 'data.txt' ! File name.
call linspace(from=0._dp, to=100._dp, array=a_grid)
bbeta = 0.98_dp
sigma = 0.5_dp
R = 1.0_dp/0.98_dp
zeta = 1.0_dp - 1.0_dp/sigma
tol = 1e-4_dp
test = 1000.0_dp
outeriter = 0
do concurrent(i=1:len)
c(1:len,i,2) = a_grid(i) - a_grid/R
c(1:len,i,1) = c(1:len,i,2) + 20.0_dp
end do
u = c**zeta * 1.0_dp/zeta
do concurrent(i=1:len, j=1:len, k=1:2)
if (c(i,j,k)<=0._dp) u(i,j,k) = -1e6_dp
end do
V_mat = 0.0_dp
next_mat = 0.0_dp
do while (test>tol .and. outeriter<20000)
outeriter = outeriter + 1
V_last = V_mat
do concurrent(i=1:len)
V_mat(i,1) = maxval(u(:,i,1) + bbeta*V_last(:, 2))
next_mat(i,1) = maxloc(u(:,i,1) + bbeta*V_last(:, 2),1)
V_mat(i,2) = maxval(u(:,i,2) + bbeta*V_last(:, 1))
next_mat(i,2) = maxloc(u(:,i,2) + bbeta*V_last(:, 1),1)
end do
test = maxval(abs(log(V_last/V_mat)))
end do
Write( *, * ) test
end subroutine
end module mod_calc
program main
use mod_calc
implicit none
integer:: clck_counts_beg,clck_rate,clck_counts_end
call system_clock ( clck_counts_beg, clck_rate )
call calc_val()
call system_clock ( clck_counts_end, clck_rate )
write (*, '("Time = ",f6.3," seconds.")') (clck_counts_end - clck_counts_beg) / real(clck_rate)
end program main
Compiling your original code with /standard-semantics /F0x1000000000 /O3 /Qip /Qipo /Qunroll /Qunroll-aggressive /inline:all /Ob2 /Qparallel Intel Fortran compiler flags, yields the following timing,
original.exe
Time = 37.284 seconds.
Compiling and running the do concurrent Fortran code above (on at most 4 cores, if parallelization is used at all) yields,
concurrent.exe
Time = 0.149 seconds.
For comparison, this is MATLAB's timing,
Value Function converged in 362 steps.
Elapsed time is 3.575691 seconds.
One last tip: There are several vectorized array computations and loops in the above code that can still be merged together to even further improve the speed of your Fortran code. For example,
u = c**zeta * 1.0_dp/zeta
do concurrent(i=1:len, j=1:len, k=1:2)
if (c(i,j,k)<=0._dp) u(i,j,k) = -1e6_dp
end do
in the above code can be all merged with the do concurrent loop appearing before it,
do concurrent(i=1:len)
c(1:len,i,2) = a_grid(i) - a_grid/R
c(1:len,i,1) = c(1:len,i,2) + 20.0_dp
end do
If you decide to do so, then you can define an auxiliary variable inverse_zeta = 1.0_dp / zeta to use in the computation of u inside the loop instead of using * 1.0_dp / zeta, thus avoiding the extra division (which is more costly than multiplication), without degrading the readability of the code.
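The merged loop described above can be sketched in plain Python. This is an illustration under assumptions rather than a translation of the Fortran: consumption_and_utility is a hypothetical name, nested lists stand in for the arrays, the index order is [current][next], and the income argument plays the role of the + 20.0 term.

```python
def consumption_and_utility(a_grid, R, zeta, income=0.0, penalty=-1e6):
    """Compute consumption c and utility u in a single pass over the grid."""
    inverse_zeta = 1.0 / zeta  # hoisted: one division instead of one per element
    c, u = [], []
    for a_now in a_grid:
        c_row, u_row = [], []
        for a_next in a_grid:
            cons = a_now - a_next / R + income
            c_row.append(cons)
            # The penalty branch replaces the separate masking pass over u.
            u_row.append(cons ** zeta * inverse_zeta if cons > 0 else penalty)
        c.append(c_row)
        u.append(u_row)
    return c, u


c, u = consumption_and_utility([0.0, 1.0, 2.0], R=1.0 / 0.98, zeta=-1.0)
```

Each element of c is computed once and used immediately for u, so no second sweep over the array and no mask are needed.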

OpenModelica complains about a negative value which can't be negative

Following this question I have modified the energy based controller which I have described here to avoid negative values inside the sqrt:
model Model
//constants
parameter Real m = 1;
parameter Real k = 2;
parameter Real Fmax = 3;
parameter Real x0 = 1;
parameter Real x1 = 2;
parameter Real t1 = 5;
parameter Real v0 = -2;
//variables
Real x, v, a, xy, F, vm, K;
initial equation
x = x0;
v = v0;
equation
v = der(x);
a = der(v);
m * a + k * x = F;
algorithm
if time < t1 then
xy := x0;
else
xy := x1;
end if;
K := Fmax * abs(xy - x) + k * (xy^2 - x^2) / 2;
if abs(xy - x) < 1e-6 then
F := k * x;
else
if K > 0 then
vm := sign(xy - x) * sqrt(2 * K / m);
F := Fmax * sign(vm - v);
else
F := Fmax * sign(x - xy);
end if;
end if;
annotation(
experiment(StartTime = 0, StopTime = 20, Tolerance = 1e-06, Interval = 0.001),
__OpenModelica_simulationFlags(lv = "LOG_STATS", outputFormat = "mat", s = "euler"));
end Model;
However, it keeps giving me the error:
The following assertion has been violated at time 7.170000
Model error: Argument of sqrt(K / m) was -1.77973e-005 should be >= 0
Integrator attempt to handle a problem with a called assert.
The following assertion has been violated at time 7.169500
Model error: Argument of sqrt(K / m) was -6.5459e-006 should be >= 0
model terminate | Simulation terminated by an assert at the time: 7.1695
STATISTICS 
Simulation process failed. Exited with code -1.
I would appreciate if you could help me know what is the problem and how I can solve it.
The code you created does event localization to find out when the condition in the if-statements becomes true and/or false. During this search it is possible that the expression in the square-root becomes negative although you 'avoided' it with the if-statement.
Try reading this and applying the solution presented there. Spoiler: it basically comes down to adding a noEvent() statement for your Boolean condition...
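As a rough illustration of the mechanism described above, here is a Python sketch. It assumes the solver keeps the outcome of K > 0 frozen at the value from the last event while it evaluates trial points past the crossing; speed_frozen_branch and speed_guarded are hypothetical names. Note that the guarded variant clamps the argument with max(), which is a different technique from noEvent() and is shown only because it is easy to model here.

```python
import math


def speed_frozen_branch(K_trial: float, m: float, branch_is_positive: bool) -> float:
    """Evaluate the 'K > 0' branch using a condition frozen from the last event."""
    if branch_is_positive:
        # The frozen condition says K > 0, but the trial value may already be negative.
        return math.sqrt(2 * K_trial / m)  # raises ValueError for a negative argument
    return 0.0


def speed_guarded(K_trial: float, m: float, branch_is_positive: bool) -> float:
    """Same branch, but the sqrt argument is clamped so a small overshoot is harmless."""
    if branch_is_positive:
        return math.sqrt(max(2 * K_trial / m, 0.0))
    return 0.0
```

Calling speed_frozen_branch with a slightly negative K_trial and branch_is_positive=True fails in the same way as the assertion in the log, whereas speed_guarded returns 0.0 for that input.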

PL/SQL to PostgreSQL conversion

Please help me to convert this PL/SQL into PostgreSQL. Thank you very much.
Prime Numbers
CREATE TABLE n (n NUMBER);
CREATE OR REPLACE PROCEDURE prime_number (n NUMBER)
IS
prime_count NUMBER := 0;
y VARCHAR2 (1) := 'N';
BEGIN
IF n >= 1
THEN
prime_count := 1;
INSERT INTO n
VALUES (2);
END IF;
IF n >= 2
THEN
prime_count := 2;
INSERT INTO n
VALUES (3);
END IF;
IF n >= 3
THEN
FOR i IN 4 .. n * n * n
LOOP
y := 'N';
FOR j IN 2 .. CEIL (SQRT (i))
LOOP
IF (MOD (i, j) = 0)
THEN
y := 'Y';
EXIT;
END IF;
END LOOP;
IF (y = 'N')
THEN
INSERT INTO n
VALUES (i);
COMMIT;
prime_count := prime_count + 1;
EXIT WHEN prime_count = n;
END IF;
END LOOP;
END IF;
END;
BEGIN
prime_number (1000000);
END;
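To make the intent of the procedure explicit before converting it, here is a minimal reference sketch in Python of the same idea: collect the first n primes by trial division up to the square root. It is not the requested PostgreSQL conversion; first_primes is a hypothetical name, and it returns a list instead of inserting rows into a table.

```python
import math


def first_primes(n: int) -> list:
    """Return the first n prime numbers using trial division."""
    primes = []
    candidate = 2
    while len(primes) < n:
        # A candidate is prime if no integer in 2..floor(sqrt(candidate)) divides it.
        if all(candidate % d for d in range(2, math.isqrt(candidate) + 1)):
            primes.append(candidate)
        candidate += 1
    return primes
```

Because the divisor range is empty for 2 and 3, this version does not need the special cases that the procedure uses for the first two primes.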

How to update the column based on criteria in TSQL?

I have the following columns:
Cost | Rate1 | Rate2 | Rate3 | IsContainRate
100 | 95 | 100 | 105 | Y
105 | 100 | 110 | 120 | N
95 | 95 | 100 | 130 | Y
Basically, I want to update the IsContainRate column based on
IF (Cost = Rate1 OR Cost = Rate2 OR Cost=Rate3) THEN
update IsContainRate=Y
ELSE IsContainRate = N
Thanks
In TSQL, you can use a CASE expression to perform your conditional update.
Here's an example:
UPDATE yourTable
SET isContainRate = CASE
WHEN (Cost = Rate1
OR Cost = Rate2
OR Cost = Rate3)
THEN 'Y'
ELSE
'N'
END
The CASE expression evaluates to either 'Y' or 'N' for each row, depending on whether the condition holds.
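The update can be tried out without a SQL Server instance. The sketch below uses Python's built-in sqlite3 module as a stand-in, on the assumption that this particular UPDATE ... CASE form behaves the same way there; flag_matching_rates is a hypothetical helper, and the rows are the three sample rows from the question.

```python
import sqlite3


def flag_matching_rates(rows):
    """Load rows into an in-memory table, run the CASE update, and return the flags."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE yourTable "
        "(Cost INT, Rate1 INT, Rate2 INT, Rate3 INT, IsContainRate TEXT)"
    )
    conn.executemany(
        "INSERT INTO yourTable (Cost, Rate1, Rate2, Rate3) VALUES (?, ?, ?, ?)",
        rows,
    )
    # Same statement as in the answer above.
    conn.execute(
        "UPDATE yourTable SET IsContainRate = CASE "
        "WHEN (Cost = Rate1 OR Cost = Rate2 OR Cost = Rate3) THEN 'Y' "
        "ELSE 'N' END"
    )
    flags = [r[0] for r in conn.execute(
        "SELECT IsContainRate FROM yourTable ORDER BY rowid")]
    conn.close()
    return flags


sample = [(100, 95, 100, 105), (105, 100, 110, 120), (95, 95, 100, 130)]
flags = flag_matching_rates(sample)
```

For the sample rows the flags come out in the same order as the IsContainRate column shown in the question.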