Hitesh Shah
2nd November 2006, 15:58
Certain programs, typically process or print sessions, sometimes take an inexplicably long time to run. To speed up such programs, a generic performance dll has been written. The performance improvement can be as high as 90%.
How does one identify programs where the performance dll can be used?
1. When such a program runs, other users also experience a slow system.
2. In the process list / task manager on the server, CPU and memory utilization shoots up for the user running the program.
3. The database driver eats up a substantial portion of the program's run time. This happens due to repeated reads / updates on certain transaction tables in the program, causing a lot of heavy but slow IO.
4. Such a program runs for many hours.
A typical characteristic of such programs is that they mainly perform sorting, grouping and aggregation operations.
For that purpose they rely heavily on repeated IO: a lot of selects on transaction tables, a lot of rprt_send() calls, and report sorting and printing. They do not use the much faster memory operations that dynamic arrays make possible. Using dynamic arrays in such a program can improve performance drastically.
What contributes most to the performance improvement through the use of this dll?
1. There should be clear knowledge of the business requirement and of the data structures involved.
2. The possible options for grouping on the various sort fields should be explored with a view to using a dynamic array. The possible array length and the length of the group variable (along with any numeric index included in it) must be known. This gives a rough idea of the maximum array size at runtime. Though the system allows an array size of 5 MB, the goal should be to keep the maximum array size to a lower value, say 1 MB, to allow for future expansion and to gain the most from the performance improvement.
3. Substantial performance gains come from shifting disk IO operations to memory IO. Disk IO mainly consists of heavy repeated database driver selects on big transaction tables, heavy, expensive and repeated database driver updates on big transaction tables, a lot of rprt_send() calls, sorting and reading of the tmp report file, and the actual generation of the report file.
4. Reducing the number of select / update jobs on big transaction tables, e.g. 1000 select jobs on an indexed table returning 1 row at a time are costlier than 1 select job returning the same 1000 rows.
As an example, suppose a program is grouping on
1. division (enum), converted to a 3-character string
2. date (date), converted to a 6-character string
3. workcenter (3-character string)
4. year/month (year * 100 + month, a long converted to a 6-character string)
Integers, enums and date fields can also be used in the grouping strings.
Say a run of this program can have at most 11 divisions * 31 days * 20 work centers * 1 year/month at a time = 6820 combinations.
So the maximum size of a dynamic string array with an index for the accumulators is (division 3 + date 6 + workcenter 3 + yearmonth 6 + index 5 = 23 characters) * 6820 = 156860 bytes = 0.15 MB. So this is a technically viable case for performance improvement in the circumstances described above. At runtime the array size could be lower, depending on the runtime data.
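The estimate above can be reproduced with a few lines of arithmetic. The sketch below is in Python purely for illustration (it is not part of the dll); the dictionary names and the helper function are made up for this example, and the numbers are the ones from the post.

```python
# Rough upper bound on the group-string array size for the example above.
# Field widths in characters; "index" is the accumulator index stored after the key.
FIELD_WIDTHS = {"division": 3, "date": 6, "workcenter": 3, "yearmonth": 6, "index": 5}
# Maximum number of distinct values per grouping field in one run.
MAX_DISTINCT = {"division": 11, "date": 31, "workcenter": 20, "yearmonth": 1}


def max_array_bytes(widths, distinct):
    """Return (element length, max combinations, max bytes) for the group array."""
    element_len = sum(widths.values())
    combos = 1
    for count in distinct.values():
        combos *= count
    return element_len, combos, element_len * combos


element_len, combos, total = max_array_bytes(FIELD_WIDTHS, MAX_DISTINCT)
print(element_len, combos, total, round(total / (1024 * 1024), 2))
# prints: 23 6820 156860 0.15
```

If the result comes anywhere near the 5 MB limit mentioned earlier, the grouping fields or their widths should be reconsidered before using the dll.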
There may be 2 quantities to aggregate:
1. Loss
2. Weight
So 2 separate based double arrays are required to keep track of the double values.
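To make the layout concrete, here is a minimal sketch of the idea in Python (again only an illustration, not the dll code): a sorted array of group strings, each ending in an index that points into two parallel accumulator arrays. The names `make_key` and `accumulate` and the sample rows are assumptions made up for this example, and the index here is 0-based, unlike Baan arrays.

```python
import bisect

SEARCH_LEN = 18  # division 3 + date 6 + workcenter 3 + yearmonth 6
INDEX_LEN = 5    # digits reserved for the accumulator index

grpstr = []  # kept sorted: 18-char group key followed by a 5-digit index
loss = []    # accumulator 1, addressed by the index stored in grpstr
weight = []  # accumulator 2, addressed by the same index


def make_key(division, date, workcenter, yearmonth):
    """Encode the grouping fields as one fixed-width string."""
    key = f"{division:03d}{date:6.6}{workcenter:3.3}{yearmonth:06d}"
    assert len(key) == SEARCH_LEN
    return key


def accumulate(key, loss_qty, weight_qty):
    """Add one row to its group, creating the group on first sight."""
    pos = bisect.bisect_left(grpstr, key)  # binary search, no disk IO
    if pos < len(grpstr) and grpstr[pos][:SEARCH_LEN] == key:
        idx = int(grpstr[pos][SEARCH_LEN:])
    else:
        idx = len(loss)
        loss.append(0.0)
        weight.append(0.0)
        grpstr.insert(pos, f"{key}{idx:0{INDEX_LEN}d}")
    loss[idx] += loss_qty
    weight[idx] += weight_qty


rows = [
    (1, "021106", "WC1", 200611, 2.5, 100.0),
    (2, "021106", "WC1", 200611, 1.0, 50.0),
    (1, "021106", "WC1", 200611, 0.5, 20.0),
]
for division, date, wc, ym, loss_qty, weight_qty in rows:
    accumulate(make_key(division, date, wc, ym), loss_qty, weight_qty)

for element in grpstr:
    idx = int(element[SEARCH_LEN:])
    print(element[:SEARCH_LEN], loss[idx], weight[idx])
```

Each transaction row is read once and folded into memory, so the report only has to loop over the (much smaller) group array at the end.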
An example of how the dll functions are to be called, together with the actual dll code, is attached.
This dll can be used in all versions from Baan IV to ERP LN. Earlier versions need to create an include for this dll, and can then use it in the same way.
Do post your results with array memory size, accumulators, data statistics (like distinct groups / total records), database driver, and the kind of disk IO shifted to memory grouping, so that others too can benefit from such information.
Hitesh Shah
12th November 2006, 18:33
The dll has been uploaded again with minor corrections, enhancements and proper disclaimers.
Also, summaries produced through this dll are faster than summaries produced through a disk-based Baan report. This is elaborated with an example in the attached Word doc.
The user also needs to do a proper memory calculation before using this dll. So an easy-to-use Excel calculator is uploaded, with illustrations and the requisite editable cells.
I have certain ideas to make this dll more general, but it will take some time to put them forward. In the meantime, it would be great if other developers made some contributions to make it more general.
Happy speeding.
Hitesh Shah
19th December 2006, 16:22
Here is an extra function in the dll to do top-down grouping with memory-based operations at the speed of light.
function extern long top.down.grouping(ref domain tcmcs.s256 grpstr(),ref double accumulator(),
domain tcbool ascend , long numofelmt ,long search_len , long grplen)
{
|PURPOSE
|After the summary values are calculated, find the top / bottom distinct elements
|ranked by the value of a particular accumulator.
|WHEN TO CALL THIS FUNCTION
|After the group strings are sorted, the accumulated values are in the accumulators and there is no data
|remaining to be grouped, call this function to sort the double quantities in ascending or descending order.
|Then run a for-endfor loop from 1 to the required top / down number.
|After the function executes, the accumulator values will be sorted and the grpstr elements will
|be rearranged in accordance with the accumulator values.
|VARIABLE ARGUMENTS
|grpstr - sorted string array containing a finite number (numofelmt) of unique strings, distinct up to the first
|search_len characters, followed by an index value pointing to the double index. So this function can be used
|easily in conjunction with the other functions in this dll.
|accumulator - double array containing the accumulated values corresponding to the index values in the grpstr array
|              for which top down is required
|ascend - whether to sort the doubles in ascending (true) or descending (false) order
|numofelmt - number of elements in the string / double array
|search_len - length of the string to search, excluding the characters for the index; fixed throughout
|             the program. The index value in grpstr starts after search_len
|grplen - length of a single grpstr element, including the characters for the index; fixed throughout
|         the program
|RETURN VALUE
|Returns -1 for an unsuccessful operation and 0 for successful top down grouping
|accumulator - will contain numerically sorted values
|grpstr - this array will follow the accumulator order (i.e. it may become unsorted
|alphanumerically)
    long dblstrlen , arrctr ,dblindex , srcpos,tmpnum

    | topdown(1,n) holds a copy of the accumulator values, topdown(2,n) the grpstr position
    if alloc.mem(topdown,2,numofelmt) < 0 then
        message("Insufficient memory, cannot go ahead")
        free.mem(topdown)
        stop()
    endif
    if set.mem(topdown,0) < 0 then
        message("Error initializing memory")
        return(-1)
    endif
    copy.mem(topdown,accumulator)
    for arrctr = 1 to numofelmt
        dblindex = lval(grpstr(search_len +1,arrctr))
|       dblstr(1,dblindex) = edit$( accumulator(dblindex),dblformat ) &
|                            edit$(arrctr,idxformat )
        topdown(2,dblindex) = arrctr
    endfor

    | sort the accumulator values in the requested direction
    qss.start(search_criteria,1,1)
    qss.type(search_criteria,1,db.double)
    if ascend = true then
        qss.way(search_criteria,1,qss.up)
    else
        qss.way(search_criteria,1,qss.down)
    endif
    e = qss.sort(accumulator,search_criteria)
    if e < 0 then
        on case e
        case -1:
        case -11:
            message("Definition not of type long or table not array")
            break
        case -13:
            message("Depth must be positive")
            break
        case -14:
            message("Def arg not declared correctly")
            break
        case -15:
            message("Search argument does not fit")
            break
        case -16:
            message("QSS.Type not correct")
            break
        case -17:
            message("Table arg not correct")
            break
        case -18:
            message("No definition found in def")
            break
        endcase
        return(-1)
    endif

    | rearrange grpstr so that it follows the sorted accumulator order
    if alloc.mem(tmpstr , grplen) < 0 then
        message("Insufficient memory, cannot go ahead")
        free.mem(tmpstr)
        stop()
    endif
    for arrctr = 1 to numofelmt
        dblindex = topdown(2,arrctr)
|       dblstr(1,dblindex) = edit$( accumulator(dblindex),dblformat ) &
|                            edit$(arrctr,idxformat )
        srcpos = qss.search(qss.equal + qss.src.is.sorted+qss.src.dupl.allowed ,topdown(1,arrctr) , accumulator,search_criteria)
        on case srcpos
        case -1:
            message("Data error! Cannot go ahead")
            return(-1)
            break
        case -11:
        case -12:
        case -13:
        case -14:
        case -15:
        case -16:
        case -17:
        case -18:
        case -21:
        case -22:
        case -23:
            message("QSS.SEARCH error %d" , srcpos)
            return(-1)
            break
        default:
            | swap the two group strings and update their recorded positions
            tmpstr = grpstr(1,srcpos)
            grpstr(1,srcpos) = grpstr(1,dblindex)
            grpstr(1,dblindex) = tmpstr
            topdown(2, lval(grpstr(search_len +1,dblindex) ))=dblindex
            topdown(2,lval(grpstr(search_len +1,srcpos) )) = srcpos
            break
        endcase
    endfor
    free.mem(topdown)
    free.mem(tmpstr)
    return(0)
}
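For readers who do not know the qss functions, the intent of the function can be sketched in a few lines of Python. This is only an illustration of the top-down idea under simplified assumptions, not a port of the Baan code: the Python function name, the 0-based index and the sample data are made up, and instead of sorting the arrays in place it returns the ranked list.

```python
def top_down_grouping(grpstr, accumulator, ascend, search_len):
    """Rank the groups by their accumulated value.

    grpstr      - strings of the form <key><index>, where the index (0-based here)
                  points into accumulator
    accumulator - accumulated double values, one per group
    ascend      - True for smallest first, False for largest first
    search_len  - length of the key part of each grpstr element
    """
    def value_of(element):
        return accumulator[int(element[search_len:])]

    ordered = sorted(grpstr, key=value_of, reverse=not ascend)
    return [(element[:search_len], value_of(element)) for element in ordered]


grpstr = ["AAA00000", "BBB00001", "CCC00002"]
accumulator = [5.0, 12.5, 7.0]

# Top 2 groups by value, largest first.
top2 = top_down_grouping(grpstr, accumulator, ascend=False, search_len=3)[:2]
print(top2)
# prints: [('BBB', 12.5), ('CCC', 7.0)]
```

As in the dll, the caller then loops over only the first N entries to print the top / bottom N groups.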
mbdave
17th January 2007, 07:25
Great work Sir
Hitesh Shah
17th January 2007, 16:52
Thanks Mahesh for the feedback .
Something more is still needed in this dll so that its usability and adaptability increase substantially. The developer still has to spend some effort to use memory-based grouping instead of disk-based grouping. It must be possible to do this easily, quickly and accurately.
Hitesh Shah
30th October 2007, 17:26
Speed is very important in data transformation, and in-memory data analysis is a major criterion in the evaluation of BI vendors (http://mediaproducts.gartner.com/reprints/microsoft/vol7/article3/article3.html). In Baan, the QSS functions are the vehicle for doing in-memory data transformations very fast. The performance dll written through this thread brings a higher level of generalization to data transformation, hiding the complexities of the qss functions.
Still, the developer has to take care of a lot of things, like calculating the array length, encoding and decoding the strings for the array, error handling, taking care of data-specific issues etc. Even these complexities have been taken care of through a code generator (http://www.erpjewels.com/Code%20Generator%20for%20data%20%20transformation.htm) which one can download. It includes a good report which tells the user which reports can be improved with the use of this code generator.
The documented memory limit for an array is 5 MB. Though array variables of a larger size can work on many OSes and later porting sets, 5 MB is already a very high limit, and good error handling is needed to take care of it. In fact, one can also say that if the array is larger than 5 MB, it is not a good case for summary reporting.
steventay
31st October 2007, 17:23
Do I add the "function extern long top.down.grouping(ref domain tcmcs...." at the bottom of performance dll.txt?
I am new to Baan. How do I add this function to my Baan server?
Hitesh Shah
1st November 2007, 14:41
You need to create a dll in ttadv2131m000 with performance dll.txt and append the function text to the dll. The dll should be compiled and then used in the program with the statement #pragma used dll dll_name .
If you are new to Baan programming, I would recommend that you get familiar with Baan arrays, data type conversions in Baan, the qss functions, the reporting structure, and the circumstances in which this dll can help improve speed (this dll is not a grand panacea for all kinds of performance issues).
klixy23
5th November 2007, 11:44
I tried to compile the dll, but the declarations of topdown and tmpstr are missing. Where should these variables be declared?
Have a nice day!
Hitesh Shah
5th November 2007, 11:58
string tmpstr(1) based
double topdown(1,1) based |2 dimensional array for top down grouping
NirajKakodkar
2nd March 2009, 08:08
Really great work. I think I am really late in checking out this post.
I will try out the dll.
Regards,
Niraj
Hitesh Shah
2nd March 2009, 14:29
Thanks. You finally got some time to look at this thread.
Do post your feedback after testing, with enhancements if any, for the benefit of others.