A system wraps lines in a log file if they exceed X characters. I am trying to extract various data from the log, but first I need to combine all the split lines so gawk can parse the fields as a single record.
For example:
2012/11/01 field1 field2 field3 field4 fi
eld5 field6 field7
2012/11/03 field1 field2 field3
2012/12/31 field1 field2 field3 field4 fi
eld5 field6 field7 field8 field9 field10
field11 field12 field13
2013/01/10 field1 field2 field3
2013/01/11 field1 field2 field3 field4
I want to return
2012/11/01 field1 field2 field3 field4 field5 field6 field7
2012/11/03 field1 field2 field3
2012/12/31 field1 field2 field3 field4 field5 field6 field7 field8 field9 field10 field11 field12 field13
2013/01/10 field1 field2 field3
2013/01/11 field1 field2 field3 field4
The actual max line length in my case is 130. I'm reluctant to test for that length and use getline to join the next line, in case there is an entry that is exactly 130 chars long.
Once I've cleaned up the log file, I'm also going to want to extract all the relevant events, where "relevant" may involve criteria like:
'foo' is anywhere in any field in the record
field2 ~ /bar|dtn/
if field1 ~ /xyz|abc/ && field98 == "0001"
I'm wondering if I will need to run two successive gawk programs, or if I can combine all of this into one.
I'm a gawk newbie and come from a non-Unix background.
$ awk '{printf "%s%s",($1 ~ "/" ? rs : ""),$0; rs=RS} END{print ""}' file
2012/11/01 field1 field2 field3 field4 field5 field6 field7
2012/11/03 field1 field2 field3
2012/12/31 field1 field2 field3 field4 field5 field6 field7 field8 field9 field10 field11 field12 field13
2013/01/10 field1 field2 field3
2013/01/11 field1 field2 field3 field4
Now that I've noticed you don't actually want to just print the recombined records, here's an alternative that's more amenable to running tests on the recombined record ("s" in this script):
$ awk 'NR>1 && $1~"/"{print s; s=""} {s=s $0} END{print s}' file
Now, with that structure, instead of just printing s you can perform tests on it, for example (note the "foo" in the 3rd record):
$ cat file
2012/11/01 field1 field2 field3 field4 fi
eld5 field6 field7
2012/11/03 field1 field2 field3
2012/12/31 field1 field2 foo field4 fi
eld5 field6 field7 field8 field9 field10
field11 field12 field13
2013/01/10 field1 field2 field3
2013/01/11 field1 field2 field3 field4
$ awk '
function tst(rec,   flds, nf, i) {
    nf = split(rec, flds)
    if (rec ~ "foo") {
        print rec
        for (i = 1; i <= nf; i++)
            print "\t", i, flds[i]
    }
}
NR>1 && $1~"/" { tst(s); s="" }
{ s=s $0 }
END { tst(s) }
' file
2012/12/31 field1 field2 foo field4 field5 field6 field7 field8 field9 field10 field11 field12 field13
1 2012/12/31
2 field1
3 field2
4 foo
5 field4
6 field5
7 field6
8 field7
9 field8
10 field9
11 field10
12 field11
13 field12
14 field13
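To answer the "one program or two?" part of the question: the same accumulate-and-test structure can do the joining and the filtering in a single pass. A minimal sketch, assuming the question's field2 ~ /bar|dtn/ criterion applies to the 2nd whitespace-separated field of the joined record (adjust the field number to your real layout):

```shell
# Join wrapped lines, then test each recombined record before printing.
# The /bar|dtn/ test on f[2] stands in for the question's criteria.
joined=$(printf '%s\n' \
    '2012/11/01 bar field2 fi' \
    'eld3 field4' \
    '2012/11/03 baz field2' |
  awk '
    function emit(rec,   f) {
      split(rec, f)
      if (f[2] ~ /bar|dtn/) print rec   # criterion from the question
    }
    NR > 1 && $1 ~ "/" { emit(s); s = "" }
    { s = s $0 }
    END { emit(s) }
  ')
echo "$joined"
```

Only the first (joined) record survives the filter, so no second gawk program is needed.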
gawk '{ gsub( "\n", "" ); printf "%s", $0 RT }
END { print }' RS='\n[0-9][0-9][0-9][0-9]/[0-9][0-9]/[0-9][0-9]' input
This can be somewhat simplified with:
gawk --re-interval '{ gsub( "\n", "" ); printf "%s", $0 RT }
END { print }' RS='\n[0-9]{4}/[0-9]{2}/[0-9]{2}' input
This might work for you (GNU sed):
sed -r ':a;$!N;\#\n[0-9]{4}/[0-9]{2}/[0-9]{2}#!{s/\n//;ta};P;D' file
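A quick check of the sed version against a shortened sample (GNU sed assumed, since -r and the \n in the address are GNU extensions):

```shell
# N appends the next input line; while the embedded newline is NOT
# followed by a yyyy/mm/dd date, the newline is removed and the loop
# repeats; P prints the completed record and D restarts the cycle.
joined=$(printf '%s\n' \
    '2012/11/01 field1 field2 fi' \
    'eld3 field4' \
    '2012/11/03 field1' |
  sed -r ':a;$!N;\#\n[0-9]{4}/[0-9]{2}/[0-9]{2}#!{s/\n//;ta};P;D')
echo "$joined"
```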
Here's a slightly bigger Perl solution which also handles the additional filtering (as you tagged this perl as well):
root@virtualdeb:~# cat combine_and_filter.pl
#!/usr/bin/perl -n
if (m!^2\d{3}/\d{2}/\d{2} !) {
    print $prevline if $prevline =~ m/field13/;
    $prevline = $_;
} else {
    chomp($prevline);
    $prevline .= $_;
}
END { print $prevline if $prevline =~ m/field13/ }  # also test the final record
root@virtualdeb:~# perl combine_and_filter.pl < /tmp/in.txt
2012/12/31 field1 field2 field3 field4 field5 field6 field7 field8 field9 field10 field11 field12 field13
This may work for you:
awk --re-interval '/^[0-9]{4}\//&&s{print s; s=""} {s=s $0} END{print s}' file
test with your example:
kent$ echo "2012/11/01 field1 field2 field3 field4 fi
eld5 field6 field7
2012/11/03 field1 field2 field3
2012/12/31 field1 field2 field3 field4 fi
eld5 field6 field7 field8 field9 field10
field11 field12 field13
2013/01/10 field1 field2 field3
2013/01/11 field1 field2 field3 field4"|awk --re-interval '/^[0-9]{4}\//&&s{print s; s=""} {s=s $0} END{print s}'
2012/11/01 field1 field2 field3 field4 field5 field6 field7
2012/11/03 field1 field2 field3
2012/12/31 field1 field2 field3 field4 field5 field6 field7 field8 field9 field10 field11 field12 field13
2013/01/10 field1 field2 field3
2013/01/11 field1 field2 field3 field4
I have a table like this (for example):
Field1 Field2 Field3 Field4 .....
1      a      c      c
1      a      x      c
1      a      c      c
2      a      y      j
2      b      y      k
2      b      y      l
I need to select rows by one field and one value, and compare all fields across the selected rows, like SELECT * WHERE Field1=1.....COMPARE
I would like to have a result like:
Field1 Field2 Field3 Field4 .....
true   true   false  true
This should work for fixed columns and if there are no NULL values:
SELECT
COUNT(DISTINCT t.col1) = 1,
COUNT(DISTINCT t.col2) = 1,
COUNT(DISTINCT t.col3) = 1,
...
FROM mytable t
WHERE t.filter_column = 'some_value';
If you have some nullable columns, you could try something like this instead of the COUNT(DISTINCT t.<colname>) = 1:
BOOL_AND(NOT EXISTS(
SELECT 1
FROM mytable t2
WHERE t2.filter_column = 'some_value'
AND t2.<colname> IS DISTINCT FROM t.<colname>
))
If you do not have fixed columns, you should build a dynamic query in a function that takes the table name, the name of the filter column, and the filter value as parameters.
Another remark: if you remove the filter (the condition t.filter_column = 'some_value') and instead add t.filter_column as an output column and GROUP BY it, you should be able to receive the result of this query for all distinct values in your filter column at once.
I am trying to join two paired RDDs, as per the answer provided here
Joining two RDD[String] -Spark Scala
I am getting an error
error: value leftOuterJoin is not a member of org.apache.spark.rdd.RDD[
The code snippet is as below.
val pairRDDTransactions = parsedTransaction.map
{
case ( field3, field4, field5, field6, field7,
field1, field2, udfChar1, udfChar2, udfChar3) =>
((field1, field2), field3, field4, field5,
field6, field7, udfChar1, udfChar2, udfChar3)
}
val pairRDDAccounts = parsedAccounts.map
{
case (field8, field1, field2, field9, field10 ) =>
((field1, field2), field8, field9, field10)
}
val transactionAddrJoin = pairRDDTransactions.leftOuterJoin(pairRDDAccounts).map {
case ((field1, field2), (field3, field4, field5, field6,
field7, udfChar1, udfChar2, udfChar3, field8, field9, field10)) =>
(field1, field2, field3, field4, field5, field6,
field7, udfChar1, udfChar2, udfChar3, field8, field9, field10)
}
In this case, field1 and field2 are my keys, on which I want to perform the join.
Joins are defined for RDD[(K, V)] (an RDD of Tuple2 objects). In your case, however, you have arbitrary tuples (Tuple4[_, _, _, _] and Tuple9[_, _, _, _, _, _, _, _, _]) - this just cannot work.
You should
... =>
((field1, field2),
 (field3, field4, field5, field6, field7, udfChar1, udfChar2, udfChar3))
and
... =>
((field1, field2), (field8, field9, field10))
respectively.
I have a table structured as below:
FIELD1 FIELD2 FIELD3 FIELD4
ID001 AB 1 R
ID001 CD 2 R
ID002 AB 1 R
ID002 CD 3 R
ID002 EF 4 R
ID003 AB 1 R
ID003 CD 2 R
ID003 PQ 4 R
ID004 PQ 1 R
ID004 RS 2 R
The input I am getting from the other source is like this:
Field2, field3 and field4 will be the input. Field2 and field3 will be sent in combination; field4 will be sent once.
Input 1-((AB,1,CD,2),R)
Input 2-((AB,1,CD,2,PQ,4),R)
For this I should get field1 as the output.
For input 1, it should return ID001
For input 2, it should return ID003.
Can anybody help me out with this?
The whole requirement is to get field1 from the other fields.
This works using the XML aggregation capabilities of DB2, with the input parameters passed as a concatenated filter string:
select field1 from
(
select
field1,
xmlcast(xmlgroup(field2 || field3 as a) as varchar(15)) as fields23,
field4
from
your_table
group by
field1, field4
)
where
fields23 = 'AB1CD2' and field4 = 'R';
For the "Input 2" case use this filter:
...
where
fields23 = 'AB1CD2PQ4' and field4 = 'R';
Based on this blog entry: https://www.ibm.com/developerworks/community/blogs/SQLTips4DB2LUW/entry/aggregating_strings42?lang=en
I have a problem. Right now, a file that was supposed to be tab-delimited is missing a few "newlines"... My file looks something like this right now
Field1 Field2 Field3
Field1 Field2 Field3 Field1 Field2 Field3 Field1 Field2 Field3
Field1 Field2 Field3 Field1 Field2 Field3
Field1 Field2 Field3
Field1 Field2 Field3 Field1 Field2 Field3
Field1 Field2 Field3
I want to make it look uniform, with each "field1" starting at a new line
Field1 Field2 Field3
Field1 Field2 Field3
Field1 Field2 Field3
Field1 Field2 Field3
Field1 Field2 Field3
The problem is, each of these columns has a unique set of data, so I can't find a familiar place to split it into a new line. Any help is greatly appreciated!
PS: doing this in sed or tr would be greatly appreciated
PS: there can be up to 150 columns, not just 6 or 9 or any other multiple of 3
This might work for you:
sed 's/\s/\n/3;P;D' file
Explanation:
The third whitespace character (space or tab) is replaced by a newline: s/\s/\n/3
The string up to the first newline is printed: P
The string up to the first newline is deleted: D
The D command has a split personality: if there is no newline, it deletes the pattern space and reads in the next line; if a newline exists, it deletes the string up to the newline and restarts the cycle on the remainder, until no newlines are left.
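The cycle is easy to watch on a line holding two fused records (a sketch; GNU sed assumed for \s):

```shell
# s/\s/\n/3 breaks the line after the 3rd field, P prints the first
# record, D restarts the cycle on the remainder until nothing is left.
fixed=$(printf 'A1 A2 A3 B1 B2 B3\n' | sed 's/\s/\n/3;P;D')
echo "$fixed"
```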
This will work on the example you gave...
sed -e 's/\([^\t ]* [^\t ]* [^\t ]*\)[\t ]/\1\n/g'
We just recently moved our DB from 9i to 10G
(yes, better late than never, and no, moving to 11g is currently not an option :-))
Details of my Oracle 10G DB are :-
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
CORE 10.2.0.1.0 Production
I am faced with a very weird problem since that move.
A query that was and still is working fine with 9i just won't work on 10G.
I did search through other SO questions related to rownum but couldn't really find anything similar.
SQL Query is :-
SELECT * FROM
( SELECT field1, field2 , field3, field4, field5, field6, field7, to_char(rownum) field8
FROM
( SELECT
field1,
field2,
field3,
field4,
field5,
field6,
field7,
''
FROM
.......REST OF MY COMPLEX INNER QUERY
)
)
WHERE field8 BETWEEN 21 AND 30;
Basically, 21 and 30 are the indexes of the records requested for pagination, and in 9i this query works as expected and returns only the specified set of records.
However in 10G, this same query does not work at all - always returns 0 records.
If I comment out the rownum-related parts of the query:
to_char(rownum) field8 and
WHERE field8 BETWEEN 21 AND 30;
then I get the entire result set, and that's great.
But since my intention is to do pagination using the rownum, the entire purpose is defeated.
Does anyone know of any reason why this query has stopped working with 10G?
I tried looking for updates to the rownum implementation but haven't been able to come across anything that helps.
EDIT :-
While debugging, I have come across something that makes no sense to me.
I am putting the entire query below, as I can't explain without it.
SELECT * FROM
( SELECT field1, field2 , field3, field4, field5, field6, field7, to_char(rownum) field8 from
( SELECT PM.POLICY_NO field1
,PM.INSURED_CODE field2
,PM.INSURED_NAME field3
,TO_CHAR(PM.POLICY_EFFECTIVE_DATE,'DD/MM/YYYY') field4
,TO_CHAR(PM.POLICY_EXPIRATION_DATE,'DD/MM/YYYY') field5
,'' field6
,'' field7
,'' field8
FROM POLICY_MAIN PM
,POLICY_ENDORSEMENT_MAIN PEM
,MASTER_UW_LOB_CLASS MAS
WHERE PM.POLICY_NO = PEM.POLICY_NO
AND PM.POLICY_NO LIKE UPPER('%%')
AND PM.INSURED_CODE LIKE UPPER('%%')
AND PM.SOURCE_OF_BUSINESS LIKE UPPER('%%')
AND PM.POLICY_TYPE IS NULL
AND PM.POLICY_STATUS = 'POST'
AND PM.POLICY_LOB = MAS.UW_LOB_CODE
AND MAS.UW_CLASS_CODE LIKE UPPER('AUTO')
AND PEM.POLICY_ENDORSEMENT_NO =
(SELECT MAX(PEM2.POLICY_ENDORSEMENT_NO)
FROM POLICY_ENDORSEMENT_MAIN PEM2
WHERE PEM.POLICY_NO = PEM2.POLICY_NO
***AND PEM.ENDORSEMENT_STATUS = 'POST'***
)
***order by 1 ASC***
)
)
WHERE field8 BETWEEN 21 AND 40
Refer to the lines marked between *** in the innermost subquery.
If I comment out this line, the query works fine:
AND PEM.ENDORSEMENT_STATUS = 'POST'
If I instead comment out this line and everything else remains unchanged from the original, the query works fine too:
order by 1 ASC
The earlier points related to rownum still hold true, but commenting out either of these lines individually seems to make the rownum issue irrelevant, and the entire query works fine (except for the fact that the results are logically different now).
I am confused, to say the least!
EDIT 2:
Adding the execution plan for the above query
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=19 Card=1 Bytes=114)
1 0 VIEW (Cost=19 Card=1 Bytes=114)
2 1 COUNT
3 2 FILTER
4 3 VIEW (Cost=17 Card=1 Bytes=128)
5 4 SORT (ORDER BY) (Cost=17 Card=1 Bytes=130)
6 5 TABLE ACCESS (BY INDEX ROWID) OF 'POLICY_ENDORSEMENT_MAIN' (TABLE) (Cost=2 Card=1 Bytes=39)
7 6 NESTED LOOPS (Cost=16 Card=1 Bytes=130)
8 7 NESTED LOOPS (Cost=14 Card=1 Bytes=91)
9 8 TABLE ACCESS (FULL) OF 'POLICY_MAIN' (TABLE) (Cost=14 Card=1 Bytes=82)
10 8 INDEX (UNIQUE SCAN) OF 'PK_MASTER_UW_LOB_CLASS' (INDEX (UNIQUE)) (Cost=0 Card=1 Bytes=9)
11 7 INDEX (RANGE SCAN) OF 'PK_POLICY_ENDORSEMENT_MAIN' (INDEX (UNIQUE)) (Cost=1 Card=1)
12 3 SORT (AGGREGATE)
13 12 FILTER
14 13 INDEX (RANGE SCAN) OF 'PK_POLICY_ENDORSEMENT_MAIN' (INDEX (UNIQUE)) (Cost=2 Card=2 Bytes=68)
EDIT 3:
Exact same query as above but if i remove the
ORDER BY 1 ASC
clause, then the results are retrieved as expected.
The PLAN for this query without the order by is below
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=18 Card=1 Bytes=114)
1 0 VIEW (Cost=18 Card=1 Bytes=114)
2 1 COUNT
3 2 FILTER
4 3 TABLE ACCESS (BY INDEX ROWID) OF 'POLICY_ENDORSEMENT_MAIN' (TABLE) (Cost=2 Card=1 Bytes=39)
5 4 NESTED LOOPS (Cost=16 Card=1 Bytes=130)
6 5 NESTED LOOPS (Cost=14 Card=1 Bytes=91)
7 6 TABLE ACCESS (FULL) OF 'POLICY_MAIN' (TABLE) (Cost=14 Card=1 Bytes=82)
8 6 INDEX (UNIQUE SCAN) OF 'PK_MASTER_UW_LOB_CLASS' (INDEX (UNIQUE)) (Cost=0 Card=1 Bytes=9)
9 5 INDEX (RANGE SCAN) OF 'PK_POLICY_ENDORSEMENT_MAIN' (INDEX (UNIQUE)) (Cost=1 Card=1)
10 3 SORT (AGGREGATE)
11 10 FILTER
12 11 INDEX (RANGE SCAN) OF 'PK_POLICY_ENDORSEMENT_MAIN' (INDEX (UNIQUE)) (Cost=2 Card=2 Bytes=68)
Note that the only real difference between the two plans is that the one that is not working has the following two additional steps after step 3 where as these steps are not present in the query without the order by - which is working fine.
As expected, step 5 is the step where the ordering of the data is being done.
4 3 VIEW (Cost=17 Card=1 Bytes=128)
5 4 SORT (ORDER BY) (Cost=17 Card=1 Bytes=130)
It seems that step 4 is an additional view being created due to the ordering.
WHY this should prevent the rownum logic from working is what I am still trying to grasp.
Any help appreciated!
EDIT 4 - Original Query plan from 9i environment
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 VIEW
2 1 COUNT
3 2 VIEW
4 3 SORT (ORDER BY)
5 4 FILTER
6 5 TABLE ACCESS (BY INDEX ROWID) OF 'POLICY_MAIN'
7 6 NESTED LOOPS
8 7 NESTED LOOPS
9 8 TABLE ACCESS (FULL) OF 'POLICY_ENDORSEMENT_MAIN'
10 8 INDEX (RANGE SCAN) OF 'PK_MASTER_UW_LOB_CLASS' (UNIQUE)
11 7 INDEX (RANGE SCAN) OF 'PK_POLICY_MAIN' (UNIQUE)
12 5 SORT (AGGREGATE)
13 12 FILTER
14 13 INDEX (RANGE SCAN) OF 'PK_POLICY_ENDORSEMENT_MAIN' (UNIQUE)
As Adam has suggested, the subquery is filtering the results after the sort and ROWNUM are applied.
I think you need to force that subquery to be filtered earlier, by using the PUSH_SUBQ hint:
SELECT * FROM
( SELECT field1, field2 , field3, field4, field5, field6, field7,
ROWNUM field8 from
( SELECT PM.POLICY_NO field1
,PM.INSURED_CODE field2
,PM.INSURED_NAME field3
,TO_CHAR(PM.POLICY_EFFECTIVE_DATE,'DD/MM/YYYY') field4
,TO_CHAR(PM.POLICY_EXPIRATION_DATE,'DD/MM/YYYY') field5
,'' field6
,'' field7
,'' field8
FROM POLICY_MAIN PM
,POLICY_ENDORSEMENT_MAIN PEM
,MASTER_UW_LOB_CLASS MAS
WHERE PM.POLICY_NO = PEM.POLICY_NO
AND PM.POLICY_NO LIKE UPPER('%%')
AND PM.INSURED_CODE LIKE UPPER('%%')
AND PM.SOURCE_OF_BUSINESS LIKE UPPER('%%')
AND PM.POLICY_TYPE IS NULL
AND PM.POLICY_STATUS = 'POST'
AND PM.POLICY_LOB = MAS.UW_LOB_CODE
AND MAS.UW_CLASS_CODE LIKE UPPER('AUTO')
AND PEM.POLICY_ENDORSEMENT_NO =
(SELECT /*+ PUSH_SUBQ*/
MAX(PEM2.POLICY_ENDORSEMENT_NO)
FROM POLICY_ENDORSEMENT_MAIN PEM2
WHERE PEM.POLICY_NO = PEM2.POLICY_NO
AND PEM.ENDORSEMENT_STATUS = 'POST'
)
order by 1 ASC
)
)
WHERE field8 BETWEEN 21 AND 40
I've also removed the TO_CHAR from the ROWNUM - you want to use numbers for that range comparison.
EDIT
Try #2 - use CTE instead:
WITH q AS
( SELECT /*+MATERIALIZE*/
field1, field2 , field3, field4, field5, field6, field7,
ROWNUM field8 from
( SELECT PM.POLICY_NO field1
,PM.INSURED_CODE field2
,PM.INSURED_NAME field3
,TO_CHAR(PM.POLICY_EFFECTIVE_DATE,'DD/MM/YYYY') field4
,TO_CHAR(PM.POLICY_EXPIRATION_DATE,'DD/MM/YYYY') field5
,'' field6
,'' field7
,'' field8
FROM POLICY_MAIN PM
,POLICY_ENDORSEMENT_MAIN PEM
,MASTER_UW_LOB_CLASS MAS
WHERE PM.POLICY_NO = PEM.POLICY_NO
AND PM.POLICY_NO LIKE UPPER('%%')
AND PM.INSURED_CODE LIKE UPPER('%%')
AND PM.SOURCE_OF_BUSINESS LIKE UPPER('%%')
AND PM.POLICY_TYPE IS NULL
AND PM.POLICY_STATUS = 'POST'
AND PM.POLICY_LOB = MAS.UW_LOB_CODE
AND MAS.UW_CLASS_CODE LIKE UPPER('AUTO')
AND PEM.POLICY_ENDORSEMENT_NO =
(SELECT MAX(PEM2.POLICY_ENDORSEMENT_NO)
FROM POLICY_ENDORSEMENT_MAIN PEM2
WHERE PEM.POLICY_NO = PEM2.POLICY_NO
AND PEM.ENDORSEMENT_STATUS = 'POST'
)
order by 1 ASC
)
)
SELECT * from q
WHERE field8 BETWEEN 21 AND 40
It sounds like Oracle is merging the inline view into the main query so that field8 (based on ROWNUM) is calculated too late. I haven't seen that happen myself, but if that is what is happening you could try adding a NO_MERGE hint like this:
SELECT /*+ NO_MERGE(vw) */ * FROM
( SELECT field1, field2 , field3, field4, field5, field6, field7, to_char(rownum) field8
FROM
( SELECT
field1,
field2,
field3,
field4,
field5,
field6,
field7,
''
FROM
.......REST OF MY COMPLEX INNER QUERY
)
) vw
WHERE field8 BETWEEN 21 AND 30;
(Incidentally, why the TO_CHAR on ROWNUM when you are treating it as a number in the WHERE clause anyway?)
Try this:
SELECT field1, field2 , field3, field4, field5, field6, field7, to_char(rn) field8 from
(SELECT PM.POLICY_NO field1
,PM.INSURED_CODE field2
,PM.INSURED_NAME field3
,TO_CHAR(PM.POLICY_EFFECTIVE_DATE,'DD/MM/YYYY') field4
,TO_CHAR(PM.POLICY_EXPIRATION_DATE,'DD/MM/YYYY') field5
,'' field6
,'' field7
,rownum as rn
FROM POLICY_MAIN PM
inner join POLICY_ENDORSEMENT_MAIN PEM
on PM.POLICY_NO = PEM.POLICY_NO
inner join MASTER_UW_LOB_CLASS MAS
on PM.POLICY_LOB = MAS.UW_LOB_CODE
WHERE PM.POLICY_NO LIKE UPPER('%%')
AND PM.INSURED_CODE LIKE UPPER('%%')
AND PM.SOURCE_OF_BUSINESS LIKE UPPER('%%')
AND PM.POLICY_TYPE IS NULL
AND PM.POLICY_STATUS = 'POST'
AND MAS.UW_CLASS_CODE = 'AUTO'
AND PEM.ENDORSEMENT_STATUS = 'POST'
AND PEM.POLICY_ENDORSEMENT_NO =
(SELECT MAX(PEM2.POLICY_ENDORSEMENT_NO)
FROM POLICY_ENDORSEMENT_MAIN PEM2
WHERE PEM.POLICY_NO = PEM2.POLICY_NO
)
order by pm.policy_no ASC)
WHERE rn BETWEEN 21 AND 40
Changes:
Restructured joins to use ANSI syntax to differentiate joins from filters.
Changed LIKE UPPER('AUTO') to = 'AUTO'
Removed unnecessary level of nesting.
Changed order by to use an expression instead of positional notation
Moved filtering criteria PEM.ENDORSEMENT_STATUS = 'POST' from correlated subquery to main query, which may correct wrong results issue.
Changed the pagination condition to use a numeric expression rather than a character one, because:
select * from dual where '211' between '21' and '40';
select * from dual where 211 between 21 and 40;
do not return the same results.
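The last point is worth spelling out: '211' sorts between '21' and '40' as a string, but 211 is not between 21 and 40 as a number, so a character-typed field8 silently admits wrong rows. A shell sketch of the same two comparisons:

```shell
# String comparison: '211' sorts after '21' and before '40'.
if [ "211" \> "21" ] && [ "211" \< "40" ]; then str=match; else str=no; fi
# Numeric comparison: 211 lies outside [21, 40].
if [ 211 -ge 21 ] && [ 211 -le 40 ]; then num=match; else num=no; fi
echo "string: $str  numeric: $num"
```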
Explain plan should help you identify the problem. As Tony has stated, any merging of the inner query into the outer query will break your query. Any query where the condition ROWNUM > 1 unconditionally applies will fail.
Queries such as the one you are building may require building the entire result set and then filtering out the rows for the page. You may want to consider building a key set for the desired rows in the inner query and then adding the additional columns in the outer query. A CARDINALITY hint on the query selecting on rownum may help.
Try using "row_number() over (order by 1) rn" to generate the order. (I assume the order is sometimes different from 1.) Add a "/*+ FIRST_ROWS(20) */" hint to the first inner query. See http://www.oracle.com/technology/oramag/oracle/07-jan/o17asktom.html for more help.