Oracle SQL for analysis and reporting ROCKS!
Data stored in traditional OLTP systems isn't always easily usable in analytic reporting systems - the data is almost always geared towards operational reporting systems. Oracle SQL for analysis and reporting gives you options to take your operational data and convert it into meaningful information using analytic functions.
One of the frequent requirements in a reporting system is to fill in the gaps i.e. create data where it doesn't exist, but if it did, it would make the data more consistent and the reporting would be much easier. In BI speak, this is known as data densification - the process of converting sparse data into dense form.
This is easier to explain with an example. Consider the following data set which is a result of the SQL query
select salesperson, period, item, amount
from items_sold
where salesperson = 'S1'
and period between P1 and P3;
Figure 1
The output sought is:
Figure 2
i.e. "Show me everything sales person S1 sold in periods P1, P2 and P3 even if he sold 0 amount of an item".
Step 1: Group the data by period and item to get the total amount
select salesperson, period, item, sum(amount)
from items_sold
where salesperson = 'S1'
and period between P1 and P3
group by salesperson, period, item;
Step 2: Densify the data - fill in the gaps using Partitioned Outer Joins
Assuming that the period values are stored in the periods table -
select q1.salesperson, q2.period, q1.item, nvl(q1.amt,0)
from
(
select salesperson, period, item, sum(amount) amt
from items_sold
where salesperson = 'S1'
and period between P1 and P3
group by salesperson, period, item
) q1
PARTITION BY (q1.salesperson, q1.item)
RIGHT OUTER JOIN
(
select period from periods
where period between P1 and P3 ) q2
on ( q1.period = q2.period )
So what did I just do? Basically, I took the query from step 1, created an independent sub set of rows using the PARTITION clause - effectively the set of all items S1 sold between P1 and P3
Figure 3
I outer join this subset with a static set of periods (using the inline view q2). Run the new query and you get the desired output as shown in Figure 2.
Densify away!
Data stored in traditional OLTP systems isn't always easily usable in analytic reporting systems - the data is almost always geared towards operational reporting systems. Oracle SQL for analysis and reporting gives you options to take your operational data and convert it into meaningful information using analytic functions.
One of the frequent requirements in a reporting system is to fill in the gaps i.e. create data where it doesn't exist, but if it did, it would make the data more consistent and the reporting would be much easier. In BI speak, this is known as data densification - the process of converting sparse data into dense form.
This is easier to explain with an example. Consider the following data set which is a result of the SQL query
select salesperson, period, item, amount
from items_sold
where salesperson = 'S1'
and period between P1 and P3;
Figure 1
Sales Person | Period | Item | Amount |
S1 | P1 | W1 | 100 |
S1 | P1 | W2 | 200 |
S1 | P1 | W4 | 400 |
S1 | P1 | W4 | 200 |
S1 | P2 | W3 | 200 |
S1 | P2 | W5 | 100 |
S1 | P2 | W1 | 20 |
S1 | P3 | W5 | 400 |
S1 | P3 | W2 | 200 |
The output sought is:
Figure 2
Sales Person | Period | Item | Total |
S1 | P1 | W1 | 100 |
S1 | P1 | W2 | 200 |
S1 | P1 | W3 | 0 |
S1 | P1 | W4 | 600 |
S1 | P1 | W5 | 0 |
S1 | P2 | W1 | 20 |
S1 | P2 | W2 | 0 |
S1 | P2 | W3 | 200 |
S1 | P2 | W4 | 0 |
S1 | P2 | W5 | 100 |
S1 | P3 | W1 | 0 |
S1 | P3 | W2 | 200 |
S1 | P3 | W3 | 0 |
S1 | P3 | W4 | 0 |
S1 | P3 | W5 | 400 |
i.e. "Show me everything sales person S1 sold in periods P1, P2 and P3 even if he sold 0 amount of an item".
Step 1: Group the data by period and item to get the total amount
select salesperson, period, item, sum(amount)
from items_sold
where salesperson = 'S1'
and period between P1 and P3
group by salesperson, period, item;
Step 2: Densify the data - fill in the gaps using Partitioned Outer Joins
Assuming that the period values are stored in the periods table -
select q1.salesperson, q2.period, q1.item, nvl(q1.amt,0)
from
(
select salesperson, period, item, sum(amount) amt
from items_sold
where salesperson = 'S1'
and period between P1 and P3
group by salesperson, period, item
) q1
PARTITION BY (q1.salesperson, q1.item)
RIGHT OUTER JOIN
(
select period from periods
where period between P1 and P3 ) q2
on ( q1.period = q2.period )
So what did I just do? Basically, I took the query from step 1, created an independent sub set of rows using the PARTITION clause - effectively the set of all items S1 sold between P1 and P3
Figure 3
S1 | W1 |
S1 | W2 |
S1 | W3 |
S1 | W4 |
S1 | W5 |
I outer join this subset with a static set of periods (using the inline view q2). Run the new query and you get the desired output as shown in Figure 2.
Densify away!
No comments:
Post a Comment