Andyblg's Blog

May 5, 2010

Categorization of Measures

Filed under: DW,OLAP — andyblg @ 09:29

Measures can be organized into three categories based on the kind of aggregate function used:

  • distributive,
  • algebraic,
  • holistic.

An aggregate function is distributive if it can be computed in a distributed manner. Suppose the data are partitioned into n sets. We apply the function to each partition, resulting in n aggregate values. If the result derived by applying the function to the n aggregate values is the same as that derived by applying the function to the entire data set (without partitioning), the function can be computed in a distributed manner.

For example, count() can be computed for a data cube by first partitioning the cube into a set of subcubes, computing count() for each subcube, and then summing up the counts obtained for each subcube. Hence, count() is a distributive aggregate function. For the same reason, sum(), min(), and max() are distributive aggregate functions.

A measure is distributive if it is obtained by applying a distributive aggregate function. Distributive measures can be computed efficiently because the computation can be split across partitions and the partial results combined.
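As a minimal sketch of the idea above, the following Python snippet partitions a small list and combines per-partition results. The `partition` helper and the sample data are hypothetical, chosen only for illustration.

```python
def partition(values, n):
    """Split values into at most n contiguous chunks (hypothetical helper)."""
    size = -(-len(values) // n)  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]

data = [7, 3, 9, 1, 4, 8, 2, 6]
parts = partition(data, 3)

# count(): the per-partition counts are combined by summing them
total_count = sum(len(p) for p in parts)

# sum(), min(), max(): the same function combines the partial results
total_sum = sum(sum(p) for p in parts)
overall_min = min(min(p) for p in parts)
overall_max = max(max(p) for p in parts)

# Each combined result matches the result on the unpartitioned data
assert total_count == len(data)
assert total_sum == sum(data)
assert overall_min == min(data)
assert overall_max == max(data)
```

Each partition contributes a single number per function, so the partitions could be processed independently before the final combine step.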

An aggregate function is algebraic if it can be computed by an algebraic function with m arguments (where m is a bounded positive integer), each of which is obtained by applying a distributive aggregate function.

For example, avg() (average) can be computed by sum()/count(), where both sum() and count() are distributive aggregate functions. Similarly, it can be shown that min_N() and max_N() (which find the N minimum and N maximum values, respectively, in a given set) and standard_deviation() are algebraic aggregate functions.

A measure is algebraic if it is obtained by applying an algebraic aggregate function.
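A rough sketch of why avg() and the standard deviation are algebraic: each partition can be summarized by a fixed number of distributive values (here m = 3: count, sum, and sum of squares), and the final result is an algebraic function of the merged values. The function names are hypothetical, and the population form of the standard deviation (dividing by n) is an assumption.

```python
import math

def summarize(partition):
    """Fixed-size sub-aggregate: (count, sum, sum of squares)."""
    return (len(partition), sum(partition), sum(x * x for x in partition))

def merge(a, b):
    """Combine two sub-aggregates component-wise; each component is distributive."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def avg(agg):
    n, s, _ = agg
    return s / n

def std_dev(agg):
    """Population standard deviation (divides by n; an assumption)."""
    n, s, sq = agg
    mean = s / n
    # Clamp at zero to guard against tiny negative values from rounding
    return math.sqrt(max(sq / n - mean * mean, 0.0))

parts = [[2, 4, 4], [4, 5], [5, 7, 9]]
total = (0, 0, 0)
for p in parts:
    total = merge(total, summarize(p))

print(avg(total), std_dev(total))
```

However large a partition is, its summary stays three numbers, which is the bounded m in the definition.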

An aggregate function is holistic if there is no constant bound on the storage size needed to describe a subaggregate. That is, there does not exist an algebraic function with m arguments (where m is a constant) that characterizes the computation.

Common examples of holistic functions include median(), mode(), and rank().

A measure is holistic if it is obtained by applying a holistic aggregate function.
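To see why median() does not fit the earlier categories, here is a small sketch showing that combining per-partition medians does not, in general, give the median of the whole data set. The sample data are made up for illustration.

```python
from statistics import median

parts = [[1, 2, 3], [4, 5, 100], [6, 7]]
all_values = [x for p in parts for x in p]

# Combine the per-partition medians by taking their median
median_of_medians = median(median(p) for p in parts)

# Median of the unpartitioned data
true_median = median(all_values)

# The two values differ for this data
print(median_of_medians, true_median)
```

A single number per partition is not enough here; in the general case the combine step needs access to the values themselves, so the sub-aggregate has no constant size bound.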

Categories of aggregate functions

  • Distributive: Count(), Minimum(), Maximum()
  • Algebraic: MaxN() (N largest values), MinN() (N smallest values), CenterOfMass()
  • Holistic: MostFrequent(), Rank()




  1. A great online tool for shopping better on the web
    Our system is so simple and effective that we are always
    thinking… How did we not think of
    Things we contribute to make online shopping much
    better (and much more fun):


    One of the points of our algorithm is that it takes
    the cost of what you want to buy very much into account.

    WHENEVER you are going to buy something, make sure you check the price here … We will surprise you no matter what

    If our users are happy with their purchases,
    we are even happier!! That is why another point of the algorithm is that it only shows
    products based on comments from people who are happy with
    what they buy. Don't believe it? Look at some examples!

    These are the best online shopping searches our users have made
    today

    Comment by superventas summer edition 2015 — December 24, 2016 @ 23:07 | Reply

  2. […] Categorization of Measures […]

    Pingback by ce633jp — January 16, 2016 @ 10:53 | Reply

  3. 2. Data Warehousing, OLAP and Data cube computation

    (a) The standard deviation of n observations x1, x2, …, xn is defined as

    Where x̄ is the average (i.e., mean) value of x1, …, xn.

    i. What kind of measure is standard deviation: distributive, algebraic, or holistic? Justify your answer.


    The standard deviation is an algebraic measure.

    ii. Outline an efficient algorithm that computes an iceberg cube with standard deviation as the measure, where the iceberg condition is n ≥ 100 and σ ≥ 2.


    (b) It is desirable to construct an AlbumCube to facilitate multidimensional search through digital photo collections, such as by date, photographer, location, theme, content, color, etc.

    i. What should be the dimensions and measures for such a data cube?


    ii. What analytical functions can you provide?


    iii. What are the major challenges in implementing AlbumCube, and how would you propose to handle them?


    Comment by robin — December 8, 2015 @ 06:34 | Reply

  4. There’s a further level of categorization, namely whether the function is also reversible. So max() is distributive when you’re adding elements, but it’s holistic when you happen to remove an element that is the max value. Count, mean, and standard deviation are all algebraic for both inserts and deletes. This is nice for implementing a sliding window because you don’t have to iterate over all the values in the window to recalculate the average each time you add a new value and remove the oldest value.

    Comment by Anonymous — August 10, 2012 @ 01:26 | Reply

  5. nice explanation



    Comment by Anonymous — March 29, 2012 @ 07:38 | Reply

  6. Just want to say what a great blog you got here!
    I’ve been around for quite a lot of time, but finally decided to show my appreciation of your work!

    Thumbs up, and keep it going!


    Comment by Kesecoedusa — May 16, 2010 @ 16:52 | Reply
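The sliding-window point raised in comment 4 can be sketched as follows. This is only an illustration under assumed names (`SlidingMean`, `push`), not code from the post or the commenter: the running sum is updated on each insert and delete, so the mean never requires a pass over the window.

```python
from collections import deque

class SlidingMean:
    """Mean over the last `size` values, updated in O(1) per new value."""

    def __init__(self, size):
        self.size = size
        self.window = deque()  # kept only to know which value leaves next
        self.total = 0.0

    def push(self, x):
        # Insert: add the new value to the running sum
        self.window.append(x)
        self.total += x
        # Delete: subtract the oldest value once the window is full
        if len(self.window) > self.size:
            self.total -= self.window.popleft()
        return self.total / len(self.window)

sm = SlidingMean(3)
means = [sm.push(x) for x in [1, 2, 3, 4, 5]]
print(means)
```

The same trick would not work for max(): when the value that leaves the window is the current maximum, the remaining values have to be examined again.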
