Performance: Rules of thumb
When we can send everything to the database in one large SELECT query, we say the calculation is possible in preprocessing. When we need to visit the database for each row of a dataset/iterator, this is generally slower; we call this postprocessing. In addition, there is placeholder processing: the column/field cannot be calculated inside the SELECT query itself, but the necessary data can be retrieved there, so we no longer have to visit the database after the large query.
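A minimal sketch of the three processing modes, using sqlite3 and a hypothetical `orders` table; the table, columns, and the `format_money` helper are illustrative assumptions, not platform APIs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, vat_rate REAL);
    INSERT INTO orders (amount, vat_rate) VALUES (100.0, 0.21), (50.0, 0.09);
""")

# Preprocessing: the derived column is computed inside one large SELECT,
# so the database is visited exactly once.
pre = conn.execute("SELECT id, amount * vat_rate AS vat FROM orders").fetchall()

# Postprocessing: the database is visited again for every row of the
# dataset/iterator -- generally much slower.
post = []
for (order_id,) in conn.execute("SELECT id FROM orders").fetchall():
    vat = conn.execute(
        "SELECT amount * vat_rate FROM orders WHERE id = ?", (order_id,)
    ).fetchone()[0]
    post.append((order_id, vat))

def format_money(amount):
    # Stand-in for a calculation the SELECT query cannot express.
    return f"EUR {amount:,.2f}"

# Placeholder processing: the value cannot be calculated in the SELECT
# itself, but the data it needs is retrieved there, so everything after
# the large query happens in memory, without further database visits.
placeholder = [
    (order_id, format_money(amount * rate))
    for order_id, amount, rate in conn.execute(
        "SELECT id, amount, vat_rate FROM orders"
    )
]
```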
Datasets
Dataset columns
What we often have difficulty with:
- data types: multivalue and multilanguage are usually not possible; in sorting/conditions, money, JSON, file, polygon2D, and point2D are not possible either
- picklist labels (see the sketch after this list)
- various template functions, such as util., iterators, json, taskmeta., spacialmath, xml, hasrole, file, datefunctions, and dataset column
- calculate dataset is only possible if the conditions and columns of the underlying dataset are possible
- different data types that have a non-trivial conversion
- for math functions, check the data type
- platform attributes with different definitions (e.g. some calculated and some not).
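A sketch of placeholder processing for picklist labels: the labels are not available to the SELECT query, so the raw picklist value is retrieved in the large query and mapped to its label in memory afterwards. The `STATUS_LABELS` table and the row shape are hypothetical.

```python
STATUS_LABELS = {1: "Open", 2: "In progress", 3: "Done"}

def with_labels(rows):
    # rows: (taskid, status_value) pairs as returned by the large query.
    # In-memory lookup, so there is no extra database visit per row.
    return [
        (task_id, STATUS_LABELS.get(status, "Unknown"))
        for task_id, status in rows
    ]

print(with_labels([(10, 1), (11, 3)]))  # [(10, 'Open'), (11, 'Done')]
```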
Dataset conditions
- The child column must be available.
- The data types must match or be trivially convertible
- A number of restrictions on the data type apply (see the chapter on dataset columns)
- The operator must (usually) be "simple": ==, <, >, <=, >=, !=, notlike, beginswith, endswith, notbeginswith, notendswith, stringcontains, notstringcontains. Operators that are not simple: in, notin, anyin, notanyin, multivaluecontains, notmultivaluecontains, is_textual_equal_picklist.
- If, in an 'AND' condition list, one of the conditions can be executed via the database and another cannot, we can still execute the database-capable conditions via the database and apply the rest afterwards (of course this does not help with an 'OR'); see the sketch below.
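A sketch of why an 'AND' list can be split between the database and memory while an 'OR' list cannot. Conditions are modeled as hypothetical (column, operator, value) tuples; the operator set mirrors the "simple" list above.

```python
SIMPLE_OPERATORS = {
    "==", "<", ">", "<=", ">=", "!=", "notlike",
    "beginswith", "endswith", "notbeginswith", "notendswith",
    "stringcontains", "notstringcontains",
}

def split_and_conditions(conditions):
    """Push simple conditions into the SQL WHERE clause and keep the
    rest for in-memory filtering. This is sound for 'AND' because each
    extra condition only ever narrows the result set further."""
    db_part = [c for c in conditions if c[1] in SIMPLE_OPERATORS]
    memory_part = [c for c in conditions if c[1] not in SIMPLE_OPERATORS]
    return db_part, memory_part

# For 'OR' this split is unsound: a row rejected by the database part
# might still satisfy an in-memory condition, so one non-simple
# condition forces the whole 'OR' list out of the database.
db_part, memory_part = split_and_conditions([
    ("status", "==", 1),
    ("tags", "anyin", ["urgent", "vip"]),
])
print(db_part)      # [('status', '==', 1)]  -> into the WHERE clause
print(memory_part)  # [('tags', 'anyin', ...)] -> filtered after the query
```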
Dataset sorts
The underlying column must be available, and a number of restrictions on the data type apply (see the chapter on dataset columns).
Dataset aggregations
It is important that the underlying dataset can execute all conditions and the relevant columns in preprocessing, among other requirements.
Extremely large datasets (millions of rows)
For extremely large datasets that can be retrieved but are still slow, everything is often already calculated at the database level. Some tips:
- If you want to see recent results, sort by id (such as taskid) rather than by datecreated; there is an index on id (see the sketch after this list)
- If you use standard grids/data requests, the total number of rows is counted for every request. If you do not need that count, open your dataset in the studio as a meta-developer, run the activity "Action(s) > Edit no full count", and enable the checkbox; the dataset count will then no longer be returned by default.
- Put the easiest/most important conditions at the top
- OR conditions are difficult for the database to optimize
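A sketch of the "sort by id, not datecreated" tip, assuming ids are assigned in ascending order so the newest rows have the highest ids; the table and column names are hypothetical.

```python
# Without an index on datecreated, the database must sort the whole
# table before it can return the first 50 rows.
RECENT_BY_DATE = """
    SELECT taskid, title FROM tasks
    ORDER BY datecreated DESC LIMIT 50
"""

# Sorting on the indexed id column lets the database walk the index
# backwards and stop after 50 rows.
RECENT_BY_ID = """
    SELECT taskid, title FROM tasks
    ORDER BY taskid DESC LIMIT 50
"""
```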
Templates
If a single template is very slow, this is often due to slow datasets; if you use (item) iterations or dataset calculations, there is a good chance that these are the cause.
Attributes
Calculated attributes can cause performance problems, see Calculated attributes.