
CoSORT: Record Sort Speed, Best for Data Warehouse Staging

"Almost all processing in the staging area is either sorting or simple sequential processing."
Ralph Kimball, "The Foundations for Modern Data Warehousing" Intelligent Enterprise Magazine, Data Warehouse Designer

CoSORT's parallel coroutine sort engine (which directly exploits multiple CPUs) is the fastest way to collate large volumes of data.


CoSORT's record-breaking sort performance is available throughout the CoSORT suite: the engine is central to all of CoSORT's standalone utilities (including the Sort Control Language program, SortCL), its API libraries, and its third-party (plug-and-play) sort replacements.

CoSORT users can specify any number of fixed and/or floating key fields and collating sequences. The sorted results are immediately available for faster database reloads, cross-table joins, aggregations, and other data warehouse processing.
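To illustrate what sorting on more than one key field means, here is a minimal Python sketch. It is not CoSORT code: the record layout (a 3-character region code followed by a 5-digit amount) and the sample data are made up for the example.

```python
# Hypothetical fixed-width records: columns 1-3 hold a region code,
# columns 4-8 hold a zero-padded amount. Layout and data are illustrative only.
records = [
    "EAS00120",
    "WES00045",
    "EAS00300",
    "WES00210",
]


def region(rec: str) -> str:
    """Major key: the 3-character region code."""
    return rec[0:3]


def amount(rec: str) -> int:
    """Minor key: the 5-digit amount, compared numerically."""
    return int(rec[3:8])


# Python's sort is stable, so sorting on the minor key first and the major
# key second gives a two-key ordering: region ascending, amount descending.
by_amount_desc = sorted(records, key=amount, reverse=True)
sorted_records = sorted(by_amount_desc, key=region)

for rec in sorted_records:
    print(rec)
```

Each key has its own position, size, comparison rule, and direction, which is the same idea as declaring several key fields with their own collating sequences.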

In a CoSORT merge operation, records from two or more identically formatted input files are interleaved according to the key(s). The input files must already be ordered on those same keys. Because the inputs are presorted, a merge is faster than a sort. CoSORT's join functionality, which is sometimes also called a merge, is described separately.
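The merge described above can be sketched in Python with the standard library's `heapq.merge`. This is a generic k-way merge, not CoSORT's implementation; the three in-memory "files" and the 4-character key are assumptions made for the example.

```python
import heapq


def key_of(rec: str) -> str:
    """Hypothetical key: the first 4 characters of each record."""
    return rec[0:4]


# Three inputs with the same record format, each already sorted on the key.
file_a = ["0001alpha", "0004delta", "0007golf"]
file_b = ["0002bravo", "0003charlie", "0009india"]
file_c = ["0005echo", "0006foxtrot"]

# heapq.merge reads the inputs lazily and always emits the smallest pending
# record, so no input is ever re-sorted.
merged = list(heapq.merge(file_a, file_b, file_c, key=key_of))

for rec in merged:
    print(rec)
```

With k presorted inputs holding n records in total, a merge of this kind needs on the order of n log k comparisons, compared with n log n for a full sort, which is why presorted inputs make the operation cheaper.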

Oracle vs. CoSORT Sort Benchmark
HP RP5450 server with four (4) Itanium2 CPUs, HP-UX 11i, Oracle 9i

Table-to-Table
SQL*Plus ordered a 50 million-row table into a new table (SELECT * FROM table ORDER BY column_name) in 1 hour 38 minutes. CoSORT performed the identical one-key sort in a data staging area outside the database, as a piped ETL operation (fact | sortcl | sqlldr), in 18 minutes. Extracting the table, sorting it with CoSORT, and reloading the sorted flat file into a new table was therefore more than five times faster than sorting inside Oracle.

Table-to-File / File-to-File
SQL*Plus ordered a 30 million-row table and wrote the output to a file. CoSORT sorted the same input and wrote the same output file more than seven times faster:

30 million 50-byte rows (1.4 GB)
CoSORT: 6 mins
Oracle: 44 mins
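
The figures quoted in this benchmark can be checked with a few lines of arithmetic. The sketch below uses only numbers stated on this page; the variable names are illustrative.

```python
# Table-to-file benchmark: 30 million rows of 50 bytes each.
rows = 30_000_000
row_bytes = 50                      # 13 + 7 + 30 bytes per the field layout
total_bytes = rows * row_bytes      # 1,500,000,000 bytes
total_gib = total_bytes / 2**30     # about 1.4 when counted in units of 2**30 bytes

# Elapsed times in minutes, as reported.
cosort_file_min, oracle_file_min = 6, 44
cosort_table_min, oracle_table_min = 18, 98   # 1 hour 38 minutes = 98 minutes

speedup_file = oracle_file_min / cosort_file_min      # about 7.3
speedup_table = oracle_table_min / cosort_table_min   # about 5.4

print(f"data volume: {total_gib:.2f} GiB")
print(f"table-to-file speedup: {speedup_file:.1f}x")
print(f"table-to-table speedup: {speedup_table:.1f}x")
```

Both ratios agree with the "more than seven times" and "more than five times" claims.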

CoSORT SortCL script

/INFILE=medload.dat
/FIELD=(MED1,POS=1,SIZE=13)
/FIELD=(MED2,POS=14,SIZE=7)
/FIELD=(MED3,POS=21,SIZE=30)
/SORT
/KEY=MED1
/OUTFILE=medload.sorted
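
For readers unfamiliar with SortCL, the following Python sketch shows the kind of operation the script above describes: fixed-position fields are declared, and records are ordered on the first field. It is an illustration only, not how CoSORT works internally. It assumes POS is 1-based, as the layout suggests, and the sample records and helper names are made up.

```python
# Field layout copied from the script: name -> (1-based position, size).
FIELDS = {"MED1": (1, 13), "MED2": (14, 7), "MED3": (21, 30)}


def field(rec: str, name: str) -> str:
    """Extract a fixed-position field, converting POS to a 0-based slice."""
    pos, size = FIELDS[name]
    return rec[pos - 1 : pos - 1 + size]


def sort_records(lines, key_field="MED1"):
    """Return the records ordered on one fixed-position key field."""
    return sorted(lines, key=lambda rec: field(rec, key_field))


def make_record(med1: str, med2: str, med3: str) -> str:
    """Build a hypothetical 50-byte record by padding each field to its size."""
    return f"{med1:<13}{med2:<7}{med3:<30}"


sample = [
    make_record("C300", "x", "third"),
    make_record("A100", "y", "first"),
    make_record("B200", "z", "second"),
]

for rec in sort_records(sample):
    print(field(rec, "MED1").strip(), field(rec, "MED3").strip())
```

An in-memory `sorted` call like this only suits small inputs; a 1.4 GB file would normally call for an external sort that works in chunks and merges the results.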

Oracle SQL*Plus script

set timing on
set trimspool on
set pagesize 0
set heading off
set feedback off
set termout off
spool joinload.txt
SELECT * FROM medfixload order by med1;
spool off
set timing off


© 2007 CoSORT Brasil / IRI Innovative Routines International, Inc.
mkt@cosort.com.br | Legal Notice