CoSORT
Accelerates Aggregation in Oracle Environments
Creating
aggregates is another way to dramatically improve
Oracle query and runtime speed. CoSORT’s
award-winning sort control language (SortCL)
program can aggregate data several times faster
than using equivalent SQL ‘Group by’ functionality,
depending on the volume of data involved. All SQL
aggregate functions are supported by SortCL, including
sum, average, count, maximum and minimum, but SortCL
is more efficient in that it can sort on multiple
keys and produce aggregate results for one or more
output files in the same pass through the data.
CoSORT's
SortCL can derive aggregate values based on inter-
and intra-record breaks (similar to SQL’s 'having’
clause) to produce sophisticated EIS summary, or
"drill-down" analysis on multiple classes
of field data. It can also provide running (accumulating)
aggregates, and aggregates on cross-calculated values.
Using FACT, CoSORT,
and SQL*Loader, you
can create a piped ETL
solution that maximizes the speed advantage over
SQL aggregation done exclusively in Oracle.
Consider
this simple SQL example:
SELECT
department, SUM (sales) as "Total sales"
FROM order_details
GROUP BY department
HAVING SUM (sales) > 1000; |
Once
data from 'order_details' is extracted, from FACT
for example, you could run the following SortCL
script to achieve the same results, but with much
faster performance – and with the flexibility to
perform a wider array of aggregation and/or create
additional reports in the same pass through the
data. Consider the following SortCL job script example
replicating the above SQL logic:
/INFILE=order_details
# FACT-unloaded table
/FIELD=(department,POS=1,SIZE=10)
/FIELD=(sales,POS=11,SIZE=10.2,NUMERIC)
/INCLUDE WHERE sales > 1000 #
= SQL HAVING clause
/SORT
/KEY=department # the
sort key (used in GROUP BY)
/OUTFILE=totals
# results file, optionally
re-loaded
/FIELD=(department,POS=1,SIZE=10)
/FIELD=(total_sales,POS=13,SIZE=12.2,NUMERIC)
/SUM total_sales FROM sales BREAK department
# BREAK is equivalent
to GROUP BY in SQL
/OUTFILE=totals_department.internal
# add' report
.
.
.
etc.
|
SortCL's
BREAK condition is a considered a 'unary' change
break, such that the aggregate values are reset
every time the value of the department field changes.
This is equivalent to GROUP BY in SQL. Click
here for related information.
NOTE: The input field layouts shown above are for
example purposes only. They can instead be referenced
in a .ddf (data definition file) which is produced
automatically by FACT, and based on the table description.
You can also use FACT to reduce the field width
if desired.

|