Oracle Techniques by Sameer Wadhwa ( Analytic Functions)

borrowed from : http://www.samoratech.com/topicofinterest/swAnalyticalFuntions.htm#Ranked_func_bk

xmlns:o=”urn:schemas-microsoft-com:office:office”
xmlns:w=”urn:schemas-microsoft-com:office:word”
xmlns=”http://www.w3.org/TR/REC-html40″>

 

id=”_x0000_t136″ coordsize=”21600,21600″ o:spt=”136″ adj=”10800″ path=”m@7,0l@8,0m@5,21600l@6,21600e”>

o:connectangles=”270,180,90,0″/>

strokeweight=”1.5pt”>

fitpath=”t” string=”Analytical functions”/>
     

style=’mso-tab-count:13′>                                                                                                                                                     

 

                                                                                    style=’color:maroon’>SAMEER WADHWA

                                                                        style=”mso-spacerun: yes”>       Wadhwa_S@Hotmail.com

                                                           

In this article I have 
tried to aware you about some of the analytic functions provided by oracle
8i.These funtions are very powerful and ease to use.

 

style=’font-size:11.0pt;mso-bidi-font-size:12.0pt;font-family:Symbol;
color:blue;mso-bidi-font-weight:bold’>·        
href=”http://www.samoratech.com/topicofinterest/swAnalyticalFuntions.htm#Rollup_and_cube_bk”>ROLLUP
AND CUBE AGGREGATE FUNCTIONS
style=’mso-bidi-font-weight:normal’>·        
href=”http://www.samoratech.com/topicofinterest/swAnalyticalFuntions.htm#Ranked_func_bk”>RANKED
FUNCTION
style=’mso-bidi-font-weight:normal’>·        
href=”http://www.samoratech.com/topicofinterest/swAnalyticalFuntions.htm#Case_bk”>CASE
style=’font-size:11.0pt;mso-bidi-font-size:12.0pt;font-family:Symbol;
color:blue;mso-bidi-font-weight:bold’>·        
href=”http://www.samoratech.com/topicofinterest/swAnalyticalFuntions.htm#Lag_and_lead_bk”>LAG
AND LEAD FUNCTION
style=’font-size:11.0pt;mso-bidi-font-size:12.0pt;font-family:Symbol;
color:blue;mso-bidi-font-weight:bold’>·        
href=”http://www.samoratech.com/topicofinterest/swAnalyticalFuntions.htm#Ratio_to_report_bk”>RATIO_TO_REPORT
 
 
 

                                    name=”Rollup_and_cube_bk”>ROLLUP AND CUBE
AGGREGATE FUNCTIONS

 

To understand the power of ROLLUP and CUBE functions ,consider the
following SQL statement :-

 

ora816 SamSQL :> compute sum of totsal on deptno

ora816 SamSQL :> break on deptno

ora816 SamSQL :> select deptno,job,sum(sal) totsal from emp group
by deptno,job;

 

style=”mso-spacerun: yes”>    DEPTNO JOB           TOTSAL

———-
——— ———-

style=”mso-spacerun: yes”>        10 CLERK           1300

style=”mso-spacerun: yes”>           MANAGER         2450

style=”mso-spacerun: yes”>           PRESIDENT       5000

********** style=”mso-spacerun: yes”>           ———-

sum style=”mso-spacerun: yes”>                        8750

style=”mso-spacerun: yes”>        20 ANALYST         6000

style=”mso-spacerun: yes”>           CLERK           1900

style=”mso-spacerun: yes”>           MANAGER         2975

********** style=”mso-spacerun: yes”>           ———-

sum style=”mso-spacerun: yes”>                       10875

style=”mso-spacerun: yes”>        30 CLERK            950

style=”mso-spacerun: yes”>           MANAGER         2850

style=”mso-spacerun: yes”>           SALESMAN        5600

********** style=”mso-spacerun: yes”>           ———-

sum style=”mso-spacerun: yes”>                        9400

 

Now see the use of ROLLUP Function

 

ora816 SamSQL :> select deptno,job,sum(sal) totsal from emp
group by ROLLUP(deptno,job);

 

style=”mso-spacerun: yes”>    DEPTNO JOB           TOTSAL

———-
——— ———-

style=”mso-spacerun: yes”>        10 CLERK           1300

style=”mso-spacerun: yes”>        10 MANAGER         2450

id=”_x0000_t88″ coordsize=”21600,21600″ o:spt=”88″ adj=”1800,10800″ path=”m0,0qx10800@0l10800@2qy21600@11,10800@3l10800@1qy0,21600e”
filled=”f”>

textboxrect=”0,@4,7637,@5″/>

style=’font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:”Courier New”;
color:#3366FF’>        10 PRESIDENT style=”mso-spacerun: yes”>       5000

style=”mso-spacerun: yes”>        10                 8750   Total of Deptno 10

style=”mso-spacerun: yes”>        20 ANALYST         6000

style=”mso-spacerun: yes”>        20 CLERK           1900

style=”mso-spacerun: yes”>        20 MANAGER         2975

style=”mso-spacerun: yes”>        20                 style=’font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:”Courier New”;
color:red’>10875

style=”mso-spacerun: yes”>        30 CLERK            950

style=”mso-spacerun: yes”>        30 MANAGER         2850

style=”mso-spacerun: yes”>        30 SALESMAN        5600

type=”#_x0000_t88″ style=’position:absolute;left:0;text-align:left;
margin-left:297pt;margin-top:5.55pt;width:9pt;height:18pt;z-index:6′>

style=’font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:”Courier New”;
color:#3366FF’>        30 style=”mso-spacerun: yes”>                 9400

style=”mso-spacerun: yes”>                          style=’font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:”Courier New”;
color:red’>29025   Grand Total

 

 

So if you compare the two output you will notice that you are getting
the same output.  By using rollup
you can avoid compute and break clausesfrom SQL
. style=”mso-spacerun: yes”> 
This will mostly helpful in style=”mso-spacerun: yes”>  PL/SQL 
.  We do not have to put logic
for computing values on break of groups.

 

 

Now see the use of
CUBE  Function

 

ora816
SamSQL :> select deptno,job,sum(sal) totsal from emp group by
CUBE(deptno,job);

 

Fri Mar 23

style=”mso-spacerun:
yes”>                                                         
NuGenesis
Report

 

style=”mso-spacerun: yes”>    DEPTNO JOB           TOTSAL

———-
——— ———-

style=”mso-spacerun: yes”>        10 CLERK           1300

style=”mso-spacerun: yes”>        10 MANAGER        
2450

type=”#_x0000_t88″ style=’position:absolute;left:0;text-align:left;
margin-left:4in;margin-top:8.25pt;width:9pt;height:18pt;z-index:4′>

style=’font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:”Courier New”;
color:#3366FF’>        10 PRESIDENT style=”mso-spacerun: yes”>       5000

style=”mso-spacerun: yes”>        10                 8750   Total of Deptno 10

style=”mso-spacerun: yes”>        20 ANALYST         6000

style=”mso-spacerun: yes”>        20 CLERK           1900

style=”mso-spacerun: yes”>        20 MANAGER         2975

style=”mso-spacerun: yes”>        20                10875

style=”mso-spacerun: yes”>        30 CLERK            950

style=”mso-spacerun: yes”>        30 MANAGER         2850

style=”mso-spacerun: yes”>        30 SALESMAN        5600

style=”mso-spacerun: yes”>        30                 9400

type=”#_x0000_t88″ style=’position:absolute;left:0;text-align:left;
margin-left:4in;margin-top:2.95pt;width:18pt;height:54pt;z-index:2′>

style=’font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:”Courier New”;
color:#3366FF’>           ANALYST style=”mso-spacerun: yes”>         6000 style=’font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:”Courier New”;
color:#3366FF’>

style=”mso-spacerun: yes”>           CLERK           4150 style=’font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:”Courier New”;
color:#3366FF’>

style=”mso-spacerun: yes”>           MANAGER         8275    Total w.r.t JOB

style=”mso-spacerun: yes”>           PRESIDENT        style=’font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:”Courier New”;
color:maroon’>5000

style=”mso-spacerun: yes”>           SALESMAN        5600 style=’font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:”Courier New”;
color:#3366FF’>

id=”_x0000_t87″ coordsize=”21600,21600″ o:spt=”87″ adj=”1800,10800″ path=”m21600,0qx10800@0l10800@2qy0@11,10800@3l10800@1qy21600,21600e”
filled=”f”>

textboxrect=”13963,@4,21600,@5″/>

style=’font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:”Courier New”;
color:#3366FF’>            style=’font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:”Courier New”;
color:red’>Grand Total   29025

 

Cube also do a total with respect to second group
for example JOB in our case . Also at end you will see the grand total

 

 

Conclusion : Rollup and
Cube are the aggregate function which allows developers and dbas to avoid
compute and break clauses and simplify logic of programming

 

name=”Ranked_func_bk”>Ranked Function in 8i (816) style=’font-size:12.0pt;mso-bidi-font-size:10.0pt;font-family:”Times New Roman”;
mso-fareast-font-family:”MS Mincho”;color:red’>

 

Suppose you have a data in table which you want to rank in a specified
order for example you have a table test and you want to rank a value of repcol.

 

                                               style=’font-size:12.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’>Ora816 SamSQL
> select * from test;

 

REPCOL         
VALUE

———-   
———-

A                
100

A            style=”mso-spacerun: yes”>     200

A                
300

B               
1000

B                
900

B                
800

A                
500

B                
400

B                
500

 

Ora816
SamSQL
> select repcol,value, style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:red’>rank() over ( partition by repcol

style=”mso-spacerun: yes”>                2 style=”mso-spacerun: yes”>  order
by value desc )
ranked_value

style=”mso-spacerun: yes”>           style=”mso-spacerun: yes”>     3 
from test;

 

REPCOL         
VALUE RANKED_VALUE

———- ———- ————

A                
500            1

A                
300            2

A                
200            3

A                
100            4

B  style=”mso-spacerun: yes”>              1000            1

B                
900            2

B                
800            3

B                
500            4

B                
400            5
style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’>

 

The
above value is ranked by the rank function provided by 8.1.6
style=’font-size:12.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’>

 

 

style=’font-size:12.0pt;mso-bidi-font-size:10.0pt;font-family:”Times New Roman”;
mso-fareast-font-family:”MS Mincho”;color:red’>Use of Case in SELECT
style=’font-size:12.0pt;mso-bidi-font-size:10.0pt;font-family:”Times New Roman”;
mso-fareast-font-family:”MS Mincho”;color:red’>

 

Case Statement are similar to decode , it is more
flexible and gives better performace

 

Ora816
SamSQL
>  ed

Wrote
file afiedt.buf

 

style=”mso-spacerun: yes”>  1 
select sum(case style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> when style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> repcol=’A’ then style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> value else style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> 0 end style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’>) sum_of_A,

style=”mso-spacerun: yes”>           style=”mso-spacerun: yes”> 2        
sum(Case style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> when style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> repcol=’B’ then style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> value else style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> 0 end style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’>) Sum_of_B,

style=”mso-spacerun: yes”>  3        
sum(case style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> when style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’>value = 500 then style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> 1 else style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> 0 end style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’>) Value_Eq_500,

style=”mso-spacerun: yes”>  4        
sum(case style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> when style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> value > 100 then style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’>1 else style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> 0 end style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’>) Value_Gre_100

style=”mso-spacerun: yes”>  5* 
from test

Ora816
SamSQL
>  /

 

  SUM_OF_A style=”mso-spacerun: yes”>   SUM_OF_B VALUE_EQ_500 VALUE_GRE_100

———- ———- ———— ————-

      1100 style=”mso-spacerun: yes”>       3600            2             8

 

Use of Group by in CASE

 

Ora816
SamSQL
> ed

Wrote
file afiedt.buf

 

style=”mso-spacerun: yes”>  1 
select repcol,sum(case when style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> repcol=’A’ then style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> value else style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> 0 end style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’>) sum_of_A,

style=”mso-spacerun: yes”>  2        
sum(Case when style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> repcol=’B’ then style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> value else style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> 0 end style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’>) Sum_of_B,

style=”mso-spacerun: yes”>  3        
sum(case when style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> value = 500 then style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> 1 else style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’>0 end style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’>) Value_Eq_500,

style=”mso-spacerun: yes”>  4        
sum(case when style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> value > 100 then style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’>1 else style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> 0 end style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’>) Value_Gre_100

style=”mso-spacerun: yes”>  5  
from test

style=”mso-spacerun: yes”>  6* group
by
repcol

Ora816 SamSQL> /

 

REPCOL      
SUM_OF_A   SUM_OF_B VALUE_EQ_500
VALUE_GRE_100

———- ———- ———- ————
————-

A               
1100          0 style=”mso-spacerun: yes”>           
1             3

B                  
0       3600 style=”mso-spacerun: yes”>           
1             5

 

Ora816
SamSQL
> ed

Wrote
file afiedt.buf

 

style=”mso-spacerun: yes”>  1 
select (case style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> when style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> value between 100 and 300 then style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> ‘100-300’

style=”mso-spacerun: yes”>  2      
when style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> value between 400 and 700 then style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> ‘400-700’

style=”mso-spacerun: yes”>  3      
when style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> value between 800 and 900 then style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> ‘800-900’

style=”mso-spacerun: yes”>  4      
when style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> value > 900 then style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> ‘>900’ end style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’>) VALUE_RANGE,

style=”mso-spacerun: yes”>  5       
count(*) as VALUE_COUNT

style=”mso-spacerun: yes”>  6      
from test

style=”mso-spacerun: yes”>  7 
group by

style=”mso-spacerun: yes”>  8 
(case when value between 100 and 300 then ‘100-300’

style=”mso-spacerun: yes”>  9       
when value between 400 and 700 then ‘400-700’

style=”mso-spacerun: yes”> 10       
when value between 800 and 900 then ‘800-900’

style=”mso-spacerun: yes”> 11*      
when value > 900 then ‘>900’ end )

Ora816
SamSQL
> /

 

VALUE_R VALUE_COUNT

——- ———–

100-300          
3

400-700          
3

800-900          
2

>900             
1

 

Ora816
SamSQL
> ed

Wrote
file afiedt.buf

 

style=”mso-spacerun: yes”>  1  select
(case when value between 100 and 300 then ‘100-300’

style=”mso-spacerun: yes”>  2      
when value between 400 and 700 then ‘400-700’

style=”mso-spacerun: yes”>  3      
when value between 800 and 900 then ‘800-900’

style=”mso-spacerun: yes”>  4      
when value > 900 then ‘>900’ end) VALUE_RANGE,value

style=”mso-spacerun: yes”>  5*     
from test

Ora816
SamSQL
> /

 

VALUE_R     
VALUE

——- ———-

100-300       
100

100-300       
200

100-300       
300

>900         
1000

800-900       
900

800-900       
800

400-700       
500

400-700       
400

400-700       
500

 

9 rows selected.

 

style=”mso-spacerun: yes”> Lag and Lead
Functions

 

 

Ora816
SamSQL
> ed

Wrote
file afiedt.buf

 

style=”mso-spacerun: yes”>  1 
select repcol,

style=”mso-spacerun: yes”>  2 
value,

style=”mso-spacerun: yes”>  3  style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:red’>lag(value, style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:red’>1) style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:red’>over ( style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:red’>order by repcol) lag_value ,

style=”mso-spacerun: yes”>  4  style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:red’>lead(value, style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:red’>1) style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:red’>over ( style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:red’>order by repcol) lead_value

style=”mso-spacerun: yes”>  5* from test

Ora816
SamSQL
> /

 

REPCOL         
VALUE  LAG_VALUE LEAD_VALUE

———- ———- ———- ———-

A                
100                   200

A                
200        100 style=”mso-spacerun: yes”>       
500

A                
500        200 style=”mso-spacerun: yes”>       
300

A                
300        500 style=”mso-spacerun: yes”>      
1000

B              style=”mso-spacerun: yes”>  1000       
300        500

B                
500       1000 style=”mso-spacerun: yes”>       
900

B                
900        500 style=”mso-spacerun: yes”>       
400

B                
400        900 style=”mso-spacerun: yes”>       
800

B                
800        400

 

9 rows selected.

 

The LAG function provides access
to a row at a given offset prior to the position and the LEAD function provides
access to a row at a given offset after the current position.

The functions have the
following syntax
:

{LAG | LEAD}   (, [ [, ]]) OVER       ([PARTITION BY [,…]]      ORDER BY  [collate clause>]      [ASC | DESC] [NULLS FIRST | NULLS LAST] [,…]) 

style=’font-size:11.0pt;mso-bidi-font-size:12.0pt;font-family:”Courier New”‘>is
an optional parameter and defaults to 1. is an optional parameter and
is the value returned if the falls outside the bounds of the table or
partition
.

 

style=’font-size:12.0pt;mso-bidi-font-size:14.0pt;color:red’>Ratio_to_report style=’font-size:12.0pt;mso-bidi-font-size:14.0pt;color:red’>

The RATIO_TO_REPORT
function computes the ratio of a value to the sum of a set of values. If the
expression
value expression style=’font-size:11.0pt;mso-bidi-font-size:12.0pt;font-family:”Courier New”‘>
evaluates to NULL, RATIO_TO_REPORT also evaluates to NULL, but it is treated as
zero for computing the sum of values for the denominator. Its syntax is:

RATIO_TO_REPORT() OVER        ([PARTITION BY [,…]]) 

style=’font-size:11.0pt;mso-bidi-font-size:12.0pt;font-family:”Courier New”‘>where

style=’font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:Symbol;
mso-bidi-font-weight:bold;mso-bidi-font-style:italic’>· style=’font-size:7.0pt;mso-bidi-font-weight:bold;mso-bidi-font-style:italic’>        
and can be any valid
expression involving column references or aggregates.
style=’font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:Symbol;
mso-bidi-font-style:italic’>·         style=’font-size:11.0pt;mso-bidi-font-size:12.0pt;font-family:”Courier New”‘>The
PARTITION BY clause defines the groups on which the RATIO_TO_REPORT function is
to be computed. If the PARTITION BY clause is absent, then the function is
computed over the whole query result set
style=’font-size:10.0pt;mso-bidi-font-size:12.0pt;font-family:”Courier New”‘>. style=’font-size:11.0pt;mso-bidi-font-size:12.0pt;font-family:”Courier New”‘>

 

Ora816
SamSQL
> ed

Wrote
file afiedt.buf

 

style=”mso-spacerun: yes”>  1 
select repcol,

style=”mso-spacerun: yes”>  2 
value,

style=”mso-spacerun: yes”>  3 
sum(value) over() style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:blue’> sumofvalue,

style=”mso-spacerun: yes”>  4  style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:red’>ratio_to_report (sum(value)) style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:red’>over() RATIO

style=”mso-spacerun: yes”>  5 
from test

style=”mso-spacerun: yes”>  6* group by repcol ,value

Ora816
SamSQL
> /

 

REPCOL         
VALUE SUMOFVALUE      RATIO

———- ———- ———- ———-

A                
100       4700 .021276596

A                
200       4700 .042553191

A                
300       4700 .063829787

A                
500       4700 .106382979

B                
400       4700 .085106383

B                
500       4700 .106382979

B                
800       4700 .170212766

B            style=”mso-spacerun: yes”>     900      
4700 .191489362

B               
1000       4700 .212765957

 

9 rows selected.

 

Ora816
SamSQL
> ed

Wrote
file afiedt.buf

 

style=”mso-spacerun: yes”>  1 
select repcol,

style=”mso-spacerun: yes”>  2 
value,

style=”mso-spacerun: yes”>  3  style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:red’>sum(value) style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:red’>over(partition by repcol) sumofvalue,

style=”mso-spacerun: yes”>  4  style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:red’>ratio_to_report (sum(value)) style=’font-size:11.0pt;mso-bidi-font-size:10.0pt;mso-fareast-font-family:”MS Mincho”;
color:red’>over(partition by repcol) RATIO

style=”mso-spacerun: yes”>  5 
from test

style=”mso-spacerun: yes”>  6* group by repcol ,value

Ora816
SamSQL
> /

 

REPCOL         
VALUE SUMOFVALUE      RATIO

———- ———- ———- ———-

A                
100       1100 .090909091

A                
200       1100 .181818182

A                
300       1100 .272727273

A                
500       1100 .454545455

B                
400       3600 .111111111

B                
500       3600 .138888889

B                
800       3600 .222222222

B                
900       3600        .25

B               
1000       3600 .277777778

 

9 rows selected.

 

Conclusion : These analytic functions are very powerful and very
useful.  Thanks to Oracle to provides us
such a great tool.

 

 

References:

 

Oracle Server concept manual

Oracle Datawarehouse manual.

 

Please feel free to send  comments or  feedback at style=’mso-fareast-font-family:”MS Mincho”;color:red’> href=”mailto:wadhwa_s@hotmail.com”>wadhwa_s@hotmail.com style=’font-size:12.0pt;mso-bidi-font-size:10.0pt;font-family:”Times New Roman”;
mso-fareast-font-family:”MS Mincho”;color:red’>.

Thanks for reading this article style=’font-size:12.0pt;mso-bidi-font-size:10.0pt;font-family:”Times New Roman”;
mso-fareast-font-family:”MS Mincho”;color:blue’>.

SAMEER WADHWA

 

Copyright  2001 style=”mso-spacerun: yes”>  Sameer Wadhwa (All right reserved)

 

 

style=”mso-spacerun: yes”> 

height=1 id=”_x0000_i1030″ src=”swAnalyticalFuntions_files\serv.gif” alt=1>

 

 

Enhanced Aggregation, Cube, Grouping and Rollup | Oracle FAQ


borrowed from the site: http://www.orafaq.com/node/56

 

 

 
 

User login

Enhanced Aggregation, Cube, Grouping and Rollup


 

Shouvik Basu's picture

Much of the OLAP reporting feature embedded in Oracle SQL is ignored. People turn to expensive OLAP reporting tools in the market – even for simple reporting needs. This article outlines some of the common OLAP reporting needs and shows how to meet them by using the enhanced aggregation features of Oracle SQL.

The article is divided in two sections. The first introduces the GROUP BY extensions of SQL, and the second uses them to generate some typical reports. A section at the end introduces the common OLAP terminologies.

The enhanced SQL aggregation features are available across all flavors of Oracle including Oracle Standard Edition One. It might be worth mentioning here, that Oracle OLAP, the special OLAP package of Oracle, is not available with Oracle Standard Edition and Standard Edition One. Enhanced aggregation features discussed here have been tested on Oracle 9i and Oracle 10g.

Advanced Aggregation Extensions of GROUP BY

GROUPING SETS clause, GROUPING function and GROUPING_ID function

The fundamental concept of enhanced aggregation features of Oracle is that of GROUPING SETS. All other aggregation features can be expressed in terms of it. With GROUPING SETS clause comes the functions GROUPING, GROUPING_ID and GROUP_ID.

The GROUPING SETS clause in GROUP BY allows us to specify more than one GROUP BY options in the same record set. All GROUPING clause query can be logically expressed in terms of several GROUP BY queries connected by UNION. Table-1 shows several such equivalent statements. This is helpful in forming the idea of the GROUPING SETS clause. A blank set ( ) in the GROUPING SETS clause calculates the overall aggregate.

Table 1 – GROUPING SET queries and the equivalent GROUP BY queries

Set A – Aggregate Query with GROUPING SETS Set B – Equivalent Aggregate Query with GROUP BY A1. SELECT a, b, SUM(c) FROM tab1 GROUP BY GROUPING SETS ( (a,b) ) B1. SELECT a, b, SUM(c) FROM tab1 GROUP BY a, b A2. SELECT a, b, SUM( c ) FROM tab1 GROUP BY GROUPING SETS ( (a,b), a) B2. SELECT a, b, SUM( c ) FROM tab1 GROUP BY a, b UNION SELECT a, null, SUM( c ) FROM tab1 GROUP BY a A3. SELECT a,b, SUM( c ) FROM tab1 GROUP BY GROUPING SETS (a,b) B3. SELECT a, null, SUM( c ) FROM tab1 GROUP BY a UNION SELECT null, b, SUM( c ) FROM tab1 GROUP BY b A4. SELECT a, b, SUM( c ) FROM tab1 GROUP BY GROUPING SETS ( (a, b), a, b, ( ) ) B4. SELECT a, b, SUM( c ) FROM tab1 GROUP BY a, b UNION SELECT a, null, SUM( c ) FROM tab1 GROUP BY a, null UNION SELECT null, b, SUM( c ) FROM tab1 GROUP BY null, b UNION SELECT null, null, SUM( c ) FROM tab1
Example (Table-1 Set 4) is like a superset of all the above cases and also includes an overall aggregate by the use of ( ). We will see latter that this result is similar to that of CUBE (a, b). The first 3 columns of Table-2 show the result of a query of this type.

GROUPING clause uses a single scan to compute all the required aggregates. So the performance is better than its logical equivalent of several GROUP BY and UNION.

The general syntax of a SQL with GROUPING SETS is –

SELECT <grouping_columns>, <aggregate_functions> FROM <table_list> WHERE <where_condition> GROUP BY GROUPING SETS (<column_set_1>, … , <column_set_N>
The “column sets” can have none, one or more “grouping column” from SELECT. However, all columns from the select should be present in at least one of the column sets. In mathematical terms –

UNION UNION should be

equal to

So the following two queries below will return error –

(1) SELECT a, b, c, SUM(d ) FROM tab1 GROUP BY GROUPING SETS ( (a,b), b) — Reason (a,b) U ( b ) is not equal to (a,b,c) (2) SELECT a, b, SUM( c ) FROM tab1 GROUP BY GROUPING SETS (a, ( ) ) — Reason (a) U ( ) is not equal to ( a, b )

Table 2 – A GROUPING SET query with GROUPING and GROUPING_ID Function on EMP

SELECT deptno, job, SUM(sal), GROUPING(deptno) GDNO, GROUPING (job) GJNO, GROUPING_ID(deptno, job) GID_DJ, GROUPING_ID(job, deptno) GID_JD FROM EMP GROUP BY GROUPING SETS ( (deptno, job), deptno, job, ( )) DEPTNO JOB SUM(SAL) GDNO GJNO GID_DJ GID_JD ———- ——— ———- ———- ———- ———- ———- 10 CLERK 1300 0 0 0 0 10 MANAGER 2450 0 0 0 0 10 PRESIDENT 5000 0 0 0 0 20 CLERK 1900 0 0 0 0 20 ANALYST 6000 0 0 0 0 20 MANAGER 2975 0 0 0 0 30 CLERK 950 0 0 0 0 30 MANAGER 2850 0 0 0 0 30 SALESMAN 5600 0 0 0 0 10 8750 0 1 1 2 20 10875 0 1 1 2 30 9400 0 1 1 2 ANALYST 6000 1 0 2 1 CLERK 4150 1 0 2 1 MANAGER 8275 1 0 2 1 PRESIDENT 5000 1 0 2 1 SALESMAN 5600 1 0 2 1 29025 1 1 3 3 18 rows selected.

GROUPING Function and GROUPING_ID Function

From Table-2 we see that when aggregates are displayed for a column its value is null. This may conflict in case the column itself has some null values. There needs to be some way to identify NULL in column, which means aggregate and NULL in column, which means value. GROUPING function is the solution to that.

This function returns a flag “1” for a row in the result set if that column has been aggregated in that row. Otherwise the value is “0”. There can be only one column expression as the argument of the GROUPING function and that column should also be in the SELECT. GROUPING function can be used to substitute the NULL value, which usually appears in columns at the aggregation level by something meaningful like Total.

GROUPING function has the general syntax of GROUPING ( ). It is used only in SELECT clause. It takes only a single column expression as argument.

GROUPING_ID takes a set of columns. It applies the GROUPING function on each column in its argument and composes a bit vector with the “0” and “1” values. It returns the decimal equivalent of the bit vector. The columns GID_DJ and GID_JD show the use of GROUPING_ID function and also show how interchanging the order of the columns inside the GROUPING_ID function might impact the result.

CUBE

This is the most generalized aggregation clause. The general syntax is CUBE ( ). It is used with the GROUP BY only. CUBE creates a subtotal of all possible combinations of the set of column in its argument. Once we compute a CUBE on a set of dimension, we can get answer to all possible aggregation questions on those dimensions. Table-3 shows a cube building.

It might be also worth mentioning here that

GROUP BY CUBE( a, b, c) is equivalent to

GROUP BY GROUPING SETS ( (a, b, c), (a, b), (b, c), (a, c), (a), (b), (c), ( )).

ROLLUP

ROLLUP clause is used with GROUP BY to compute the aggregate at the hierarchy levels of a dimension.

ROLLUP(a, b, c) assumes that the hierarchy is “a” drilling down to “b” drilling down to “c”.

ROLLUP (a, b, c) is equivalent to GROUPING SETS ( (a, b, c), (a, b), (a), ( )).

The general syntax of ROLLUP is ROLLUP( )

Composite Columns

A composite column is a collection of columns that can be used in CUBE or ROLLUP. They are treated as unit before computing the aggregate.Composite columns usage in CUBE and ROLLUP and the equivalent GROUPING SETS –

. CUBE( (a, b), c) is equivalent to GROUPING SETS ( (a, b, c), (a, b) , c, ( )) . ROLLUP ( a, (b, c) ) is equivalent to GROUPING SETS ( (a, b, c), ( a ), ( ) )

Partial GROUPING SETS, CUBE or ROLLUP

If any column appears in GROUP BY but outside the aggregation clauses discussed above. It can be thought of as being first column of the resulting GROUPING SET equivalent. The following examples make this clear.

GROUP BY a, CUBE( b, c) is equivalent to GROUP BY GROUPING SETS ( (a, b, c), (a, b), (a, c), (a) ) GROUP BY a, ROLLUP( b, c) is equivalent to GROUP BY GROUPING SETS ( (a, b, c), (a, b), (a) )

OLAP Reporting using enhanced aggregation features

While the queries on the EMP table are used to illustrate the GROUPING SETS they will be poor examples for discussing the next sections. The reason is that the tables are not in a Star-Schema format. Please run the script (Script A) to get a simple Star Schema.

The tables of the Sample Schema are

Product(prdid, prd_name, prd_family)

TimeByDay(datekey, td_month, td_quarter, td_year)

Location( Loc_id, City, State, Country)

Customer(cust_id, cust_name, cust_type);

Sales(sales_id, cust_id, loc_id, prdid, sales_date, amount);

The schema is about a fictitious Art Trader that supplies remakes of statues of famous historical figures (like ALEXANDER, BUDDHA, etc) or landscape paintings of places (like SIKKIM, etc). They sell to museums, resellers or individuals.

The dimensions are Product, TimeByDay, Location and Customer. The fact is Sales.

The hierarchies are –

(1) Product_Name (prd_name) -> Product Family (prd_family)

(2) Date (datekey) -> Month (td_month) -> Quarter (td_quarter) -> Year (td_year)

(3) City -> State -> Country

(4) Customer_Name (cust_name) -> Customer_Type (cust_type)

The two approaches used for generating OLAP reports are as follows –

(1) Get the most generalized possible CUBE built with the dimensions, or

(2) Use on the fly aggregation queries to get the real-time report.

Using generalized pre-built CUBE for CUBE, ROLLUP, Drill Down and Slicing Queries.

This approach consists of building a table or a materialized view with the CUBE of the dimensions. Table-3 shows the SQL to build such a cube. The generalized CUBE keeps all possible meaningful aggregation pre-computed. We need to query some of the rows of the CUBE to get the desired values. Since the CUBE stores all possible permutations of the dimensions there is a chance that the number or records in the cube itself might be large. Intelligent use of composite columns might help a great deal here. Note the use of composite columns (City, State) in the CUBE. This is because each state has got only one city with the office of our demo organization.

The GROUPING_ID function helps to achieve the ROLLUP. For example, take the combination (cust_name, cust_type). It is meaningless to make the cube perform aggregations for customer types across customer names. So we include only the bit vectors (1,1), (1, 0) and (0,0) that is GROUPING_ID of 3, 2 and 0 on the customer dimension.

Table 3 – Building a cube

CREATE TABLE sales_cube AS SELECT prd_name, prd_family, datekey, td_month, td_quarter, td_year, cust_name, cust_type, city, state, country, GROUPING_ID (prd_name, prd_family) GID_product, GROUPING_ID (datekey, td_month, td_quarter, td_year) GID_DATE, GROUPING_ID (cust_name, cust_type) GID_CUST, GROUPING_ID (city, state, country) GID_LOC, sum(amount) amount FROM sales, product, timebyday, location, customer WHERE sales.cust_id = customer.cust_id and sales.loc_id = location.loc_id and sales.sales_date = timebyday.datekey and sales.prdid = product.prdid GROUP BY CUBE ( prd_name, prd_family, datekey, td_month, td_quarter, td_year, cust_name, cust_type, (city, state), country ) HAVING ( GROUPING_ID (prd_name, prd_family) = 0 or GROUPING_ID (prd_name, prd_family) = 2 or GROUPING_ID (prd_name, prd_family) = 3) and ( GROUPING_ID (datekey, td_month, td_quarter, td_year) = 0 or GROUPING_ID (datekey, td_month, td_quarter, td_year) = 8 or GROUPING_ID (datekey, td_month, td_quarter, td_year) = 12 or GROUPING_ID (datekey, td_month, td_quarter, td_year) = 14 or GROUPING_ID (datekey, td_month, td_quarter, td_year) = 15) and ( GROUPING_ID (cust_name, cust_type) = 0 or GROUPING_ID (cust_name, cust_type) = 2 or GROUPING_ID (cust_name, cust_type) = 3) and ( GROUPING_ID (city, state, country) = 0 or GROUPING_ID (city, state, country) = 6 or GROUPING_ID (city, state, country) = 7 );The next table (Table-4) shows a typical crosstab query of sales for Product and Location. It shows the query and also how to generate a crosstab report out of it by using the function CROSSTAB (Script-B). The next examples show the query and cross-tab report and skips the PLSQL portion. The WHERE condition is determined by the bit vectors. We need –

  • Details of product and details of customer – Both Product and Customer dimensions are all details. So GID_Product = bit vector (0,0) = 0. Same for GID_Cust.
  • Summary of product and details of customer – Product is summarized fully, so GID_Product = bit vector (1,1) = 3.
  • Details of product and summary of customer – Customer is summarized fully, GID_Customer = bit vector (1,1) = 3.
  • Summary of product and summary of customer – Both Customer and products are summarized.
  • Along with any of the above 4 conditions we need full summary or the rest of the dimensions. So GID_date = bit vector (1,1, 1,1) = 15 and GID_Loc = bit vector (1,1,1) = 7.

Table-4 Crosstab Query on Product and Location (Query, Generation Routine and Result)

/*********** The Query ***********/ SELECT prd_name, cust_name, amount FROM sales_cube WHERE ((GID_Product = 0 and GID_Cust = 0) or (GID_Product = 0 and GID_Cust = 3) or (GID_Product = 3 and GID_Cust = 0) or (GID_Product = 3 and GID_Cust = 3)) and GID_date = 15 and GID_LOC = 7; /*********** Generating the crosstab report ********/ set serveroutput on set lines 120 var tempstr varchar2(500) exec :tempstr := ”||- ‘SELECT cust_name, prd_name, amount’||chr(10)||- ‘FROM sales_cube’||chr(10)||- ‘WHERE ((GID_Product = 0 and GID_Cust = 0) or’||chr(10)||- ‘ (GID_Product = 0 and GID_Cust = 3) or’||chr(10)||- ‘ (GID_Product = 3 and GID_Cust = 0) or’||chr(10)||- ‘ (GID_Product = 3 and GID_Cust = 3)) and’||chr(10)||- ‘ GID_date = 15 and’||chr(10)||- ‘ GID_LOC = 7’; exec crosstab(:tempstr); *Customers * *———————- Products —————————–* ************ ALEXANDER BUDDHA CHANDRAGUPTA PURI BEACH SIKKIM –Total– ART HOUSE 0 0 0 500 750 1250 BARKER 5100 0 0 0 0 5100 JONES 0 0 0 2050 3500 5550 MAHAJATI 0 0 0 1000 0 1000 RATAN 0 5000 0 0 4000 9000 SMITH 9500 9000 900 0 0 19400 STONEWORK 850 800 6000 0 0 7650 –Total– 15450 14800 6900 3550 8250 48950Tables Table-5, Table-6 and Table-7 show TimeSales report and drill-down to the quarters of year 2003. Slicing is achieved by including WHERE condition in the query with desired values of the dimensions. Drill down is achieved by selection of proper value of GID_ type columns and by deciding the proper GROUPING value of all the dimensions at the particular level of drill down. Dicing is achieved by merely interchanging the first two columns of SELECT.

Table-5 Year-Product Sales Report: Main (Query and Result)

SELECT prd_name, td_year, amount FROM Sales_cube WHERE ((GID_Product = 0 and GID_date = 14) or (GID_Product = 0 and GID_date = 15) or (GID_Product = 3 and GID_date = 14) or (GID_Product = 3 and GID_date = 15)) and GID_Cust = 3 and GID_Loc = 7; ************ 2002 2003 –Total– ALEXANDER 5100 10350 15450 BUDDHA 6800 8000 14800 CHANDRAGUPTA 0 6900 6900 PURI BEACH 3550 0 3550 SIKKIM 0 8250 8250 –Total– 15450 33500 48950Table-6 Year-Product Sales Report:Drill Down to Quarters and Dicing Product and Time Dimensions (Query and Result)

SELECT td_year||td_quarter, prd_name, amount FROM Sales_cube WHERE ((GID_Product = 0 and GID_date = 12) or (GID_Product = 0 and GID_date = 15) or (GID_Product = 3 and GID_date = 12) or (GID_Product = 3 and GID_date = 15)) and GID_Cust = 3 and GID_Loc = 7; ************ ALEXANDER BUDDHA CHANDRAGUPTA PURI BEACH SIKKIM –Total– 2002Q1 0 1000 0 2050 0 3050 2002Q2 5100 0 0 500 0 5600 2002Q4 0 5800 0 1000 0 6800 2003Q1 10350 0 6000 0 0 16350 2003Q2 0 0 0 0 750 750 2003Q3 0 8000 0 0 3500 11500 2003Q4 0 0 900 0 4000 4900 –Total– 15450 14800 6900 3550 8250 48950Table-6 Year-Product Sales Report:Slice of year 2003, Quarter level drill down (Query and Result)

SELECT td_year||td_quarter, prd_name, amount FROM Sales_cube WHERE ((GID_Product = 0 and GID_date = 12) or (GID_Product = 0 and GID_date = 14) or (GID_Product = 3 and GID_date = 12) or (GID_Product = 3 and GID_date = 14)) and GID_Cust = 3 and GID_Loc = 7 and td_year = 2003; ************ ALEXANDER BUDDHA CHANDRAGUPTA SIKKIM –Total– 2003 10350 8000 6900 8250 33500 2003Q1 10350 0 6000 0 16350 2003Q2 0 0 0 750 750 2003Q3 0 8000 0 3500 11500 2003Q4 0 0 900 4000 4900Using on-the-fly aggregation queries for CUBE, ROLLUP, Drill Down and Slicing

While using on-the-fly aggregation queries the cube is not pre-computed and we get the real time summary. However the performance is slower than querying pre-computed cubes. Several features (CUBE, ROLLUP, Composite Columns) discussed here can be used to generate the required aggregation levels. An important thing to ensure during executing on-the-fly queries is that, the query should not perform any useless aggregation. Proper use of the GROUPING functions is important.

Conclusion

Most of the OLAP tools will provide several additional features other than just reporting. There are user-friendly drag and drop interfaces which make drill-down, rollup, slicing, dicing happen on a mouse-click. Report generation and formatting is easier for someone who is not familiar with SQL. There are security features that restrict specific users from drilling down specific sections of the cube or viewing some specific cubes.

When the requirements are just few canned OLAP reports or when simple custom GUI can be made to mask the SQLs, use of the enhanced aggregation features can be really effective. A large portion of the requirements do fall in the second category.

Glossary and Scripts

Script A Script to create Sample Tables for Reporting examples

drop table product; drop table timebyday; drop table location; drop table customer; drop table sales; create table product (prdid number(2), prd_name varchar2(12), prd_family varchar2(10)); create table timebyday (datekey date, td_month char(3), td_quarter char(2), td_year char(4)); create table location (loc_id number(2), city varchar2(10), state varchar2(10), country varchar2(10)); create table customer (cust_id number(2), cust_name varchar2(10), cust_type varchar2(4)); create table sales (sales_id number(6), cust_id number(2), loc_id number(2), prdid number(2), sales_date date, amount number); insert into product values (1,’BUDDHA’,’STATUE’); insert into product values (2,’ALEXANDER’,’STATUE’); insert into product values (3,’CHANDRAGUPTA’,’STATUE’); insert into product values (4,’SIKKIM’,’PAINTING’); insert into product values (5,’PURI BEACH’,’PAINTING’); insert into location values (1, ‘KOLKATA’, ‘WB’,’INDIA’); insert into location values (2, ‘MUMBAI’,’MH’,’INDIA’); insert into location values (3, ‘SYDNEY’,’SA’,’AUSTRALIA’); insert into location values (4, ‘CHICAGO’,’IL’,’USA’); insert into customer values (1, ‘SMITH’, ‘INDV’); insert into customer values (2, ‘JONES’, ‘INDV’); insert into customer values (3, ‘BARKER’,’INDV’); insert into customer values (4, ‘ART HOUSE’,’ORGN’); insert into customer values (5, ‘STONEWORK’,’ORGN’); insert into customer values (6, ‘MAHAJATI’,’ORGN’); insert into customer values (7, ‘RATAN’,’INDV’); insert into sales values (1, 1, 4, 1, ’05-JAN-02′, 1000); insert into sales values (2, 2, 3, 5, ‘1-MAR-02′, 2050); insert into sales values (3, 3, 4, 2, ’14-MAY-02′, 5100); insert into sales values (4, 4, 3, 5, ’22-JUN-02′, 500); insert into sales values (5, 5, 3, 1, ’15-OCT-02′, 800); insert into sales values (6, 6, 1, 5, ’10-NOV-02′, 1000); insert into sales values (7, 7, 1, 1, ’25-DEC-02′, 5000); insert into sales values (8, 1, 4, 2, ’18-FEB-03′, 9500); insert into sales values (9, 5, 3, 2, ’28-FEB-03′, 850); insert into sales values (10, 5, 4, 3, ’15-MAR-03′,6000); insert into sales values (11, 4, 4, 4, ’12-JUN-03′, 750); insert into sales values (12, 1, 4, 1, ’18-JUL-03’,8000); insert into sales values (13, 2, 3, 4, ‘5-AUG-03’,3500); insert into sales values (14, 1, 4, 3, ‘8-NOV-03′, 900); insert into sales values (15, 7, 2, 4, ’23-NOV-03′,4000); insert into timebyday select distinct sales_date, to_char(sales_date,’MON’), ‘Q’||to_char(sales_date,’Q’), to_char(sales_date,’YYYY’) from sales; commit;Script B Script to create procedure CROSSTAB

drop table temp_tab1; create table temp_tab1 (col1 varchar2(12), col2 varchar2(12), col3 number); create or replace procedure crosstab(ip_qstr varchar2) is type c1_rec is record (col1 varchar2(14), col2 varchar2(14), col3 number); type c1_tab is table of c1_rec; c1_tab1 c1_tab; varstr1 varchar2(200) := ”; ho_value number; begin dbms_output.enable(1000000); execute immediate(‘truncate table temp_tab1′); execute immediate ip_qstr bulk collect into c1_tab1; for ii in 1..c1_tab1.LAST loop insert into temp_tab1 values (nvl(c1_tab1(ii).col1,’–Total–‘),nvl(c1_tab1(ii).col2,’–Total–‘),c1_tab1(ii).col3); end loop; varstr1 := ‘************ ‘; for jj in (select distinct nullif(col2,’–Total–‘) col2 from temp_tab1 order by 1 nulls last) loop varstr1 := varstr1||lpad(nvl(jj.col2,’–Total–‘), 12)||’ ‘; end loop; dbms_output.put_line(varstr1); for ii in (select distinct nullif(col1,’–Total–‘) col1 from temp_tab1 order by 1 nulls last) loop varstr1 := rpad(nvl(ii.col1,’–Total–‘),12); for jj in (select distinct nullif(col2,’–Total–‘) col2 from temp_tab1 order by 1 nulls last) loop begin select col3 into ho_value from temp_tab1 where col1 = nvl(ii.col1,’–Total–‘) and col2 = nvl(jj.col2,’–Total–‘); exception when no_data_found then ho_value := 0; end; varstr1 := varstr1||’ ‘||lpad(ho_value,12); end loop; dbms_output.put_line(varstr1); end loop; return; end crosstab; /OLAP Terminologies

I am not considering in this article familiarity with OLAP terms as a prerequisite. Readers are advised to skip this section in case they are familiar with OLAP terminologies.

  • Data Warehouse Huge Store of historical transaction data from various sources. Gradually keeps on growing as more and more data is loaded into it (history increases). Logical Table structure of a data warehouse is designed keeping in mind the parameters we would like to analyze them with. As such, the logical design has special forms like Star Schema or Snowflake Schema. The physical design is done kept in mind the ease of query. As such warehouse table structures are de-normalized and heavily indexed.
  • ETL Stands Extract, Transform and Load. This is the technique to load the huge store of point (1) from the various sources.
  • OLAP Engine OLAP stands for Online Analytical Processing, the technique of reporting from the data warehouse.
  • Dimension and Measures Take the function F (a, b, c) = (x, y, z) where a, b, c, x, y, and z are several attributes. Function F can be thought of as a business behavior where (x, y, z) are measures of that behavior. The business needs to know how the measure of behavior F, i.e. x, y, z, varies with categories (a, b, c). Probably, with that knowledge the business will try to manipulate (a,b,c) to get the desired (x, y, z), or it might take several other decisions not expressible in such simple terms. (a, b, c) are relatively static well defined categories known as Dimensions. The measures (x, y, z) are also known as Facts.
  • Dimension Table and Hierarchies Each dimension has a fixed (or slowly changing) domain of values, which is stored in the dimension table. Each dimension can have aggregate attributes defined at several levels known as hierarchy. Hierarchies of a given dimension are also stored in dimension tables. Let “a” be a dimension, let “a1” be a collection of “a”, let “a2” be a collection of “a1”. The dimension table can be (a, a1, a2).
  • Fact Table The table where dimensions are linked to the measures is known as Fact table. In our example we can take the fact table as (a, b, c, x, y, z). The dimension attributes mentioned in the fact tables are the ones with lowest granularity. So the fact table, in our example, will not have a1 or a2, which are aggregates.
  • Cube Cube stores pre-computed aggregations, such as sums and counts, across all dimensions. It is useful in answering multidimensional questions by avoiding frequent summary from the flood of detailed data.
  • Drill Down Frequently the business user will start with the highest aggregation of the hierarchy and ask for details on one or some selected dimensions. This is known as drill down.
  • Rollup The requirement might also be to start from the lower granularity of a particular dimension and then move to the higher aggregates. This activity is known as rollup. If the requirement is to view the aggregation across all the level of hierarchies of any given dimension, then such a report is also known as rollup report.
  • Slice Selecting some values of a dimension based on a filtering condition and then applying aggregation is known as slicing.
  • Dice Changing the outlook of the report table, i.e., interchanging the row labels and column labels is known as dicing. It might be mentioned here that dicing does not affect the values of the aggregate columns.

This is an extremely good & useful article. However I have a few points to make. I have personally worked in a few OLAP tools like Cognos. In Cognos the Transformer is used for creating a multidimensional cube. The rolled up data is same as that using Oracle Provided CUBE function. However the difference is that it uses multipass queries instead of a single query. There are several complex business rules which are either impossible or really difficult to implement in a single query. Also OLAP tools have lots of other reporting features including dashboard, graphs, drill through etc.

This is a very good artilcle. Any one looking for OLAP should try to use the oracle provided features before they look for other expensive OLAP softwares in the market.

Indeed Arnab, I do not disagree with you. You can refer the last line of my article. But the tradeoff here is the cost of Cognos vs. the cost of native SQL PLSQL crosstab and graph in XL.

“Taken the bull by the

“Taken the bull by the horn” – Indeed

Explanation on extension to sql for aggregate

EXCELLENT!

You should do the same work with other obscure topics!

Grazie.

a.c.