Click here to Skip to main content
15,881,089 members
Articles / Database Development / SQL Server
Article

Persisted Permutations

, ,
Rate me:
Please Sign up or sign in to vote.
5.00/5 (10 votes)
4 May 200419 min read 36.5K   345   25  
An article on how to persist permutations of items in relational databases.

Introduction

The common way of fetching ordered sets of data from a relational database is to use the SQL clause ORDER BY. This, used in conjunction with a SELECT statement, orders the entire set of result data based on the values in the columns indicated by the ORDER BY clause. Let’s consider the following table (named STUDENT) as an example:

NameCoefficient
John10
Michael12
George8
Steve8

A query like “SELECT * FROM STUDENT ORDER BY coefficient DESC” will return the data sorted after the value of the coefficient, in descending order:

Michael12
John10
George8
Steve8

This method of sorting data is very simple and easy to use, and has the advantage of being part of the SQL standard and thus available on virtually any relational database provided out of the box. It has one significant drawback though: it can only be effectively used when the values in the column(s) based on which we want to sort the data contain the information needed for the sorting process. In real life situations, we may desire to obtain data in an order that has nothing to do with the data itself. It may be that the user entered it in a certain order and he/she expects to see it coming back in the same order, or any other situation in which we may want to be able to obtain the records in a table in a certain order and that order cannot be deduced (calculated) from the data itself. In other words, a simple ORDER BY clause would not be enough in these cases. Let’s take an example.

(Table MASTERTABLE)
ItemGroup (PK)
48E9405C45EE4BB3A3DCD3918898A815
40AB772255184B20A4A8EDD89B59536F
0826DE86308148F89CACEBD8A043F4DC
(Table DETAILTABLE)
ItemGroup (FK to MASTER.ItemGroup)Item (PK)
48E9405C45EE4BB3A3DCD3918898A8156BF71715E18640419BA9023585E55B55
48E9405C45EE4BB3A3DCD3918898A815C41908F6E896409A95ABD3EC81550B0D
48E9405C45EE4BB3A3DCD3918898A8154D95851F6F7D4CEB8C4AC7ABDAC429D9
48E9405C45EE4BB3A3DCD3918898A8154D467B250ECD46D3B99857CAE00BC967
40AB772255184B20A4A8EDD89B59536F6F9B9D06515D450E990CD9AEBEE9DCCF
40AB772255184B20A4A8EDD89B59536F6BF71715E18640419BA9023585E55B55
0826DE86308148F89CACEBD8A043F4DC63E3F1F9229C41F889C05F66EB0E441
0826DE86308148F89CACEBD8A043F4DC13329EB5387649C6B42DC2FDDE1E5700

All columns are of type RAW(16) and the values that you can see inside are GUID (Globally Unique Identifier) values. They have no meaning in the real world by themselves, they are just blind identifiers. This database has a number of categories (in the MASTERTABLE table), and for each of these categories, we can have zero or more items (in the DETAILTABLE table). We have no additional identifying information except these identifiers. We can assume that such information is external to this database, or, even if not, it may still not be feasible or desirable to sort based on it. Let’s assume we want to maintain an order for all the items inside a given item group. How do we do that?

SELECT * FROM DETAILTABLE WHERE ItemGroup = <SOME_ITEMGROUP> ORDER BY Item” will not cause any errors, but will do nothing useful (it will sort the values lexicographically, which is not what we are after).

We propose a number of techniques that will help solve this and some related problems as follows.

Method #1. Additional column with dispersed integer values

This method implies adding to the DETAILTABLE table a new column, with meta information about the Item IDs. Let’s call this column Rank. This meta information is simply a positive integer number, such that “SELECT … WHERE ITEMGROUP = <SOME_ITEMGROUP> ORDER BY Rank” will produce the desired order. At insert time, a rank value will be chosen so that the item just inserted occupies the desired position in the ordered list of items for that particular item group. In our example, we are only interested in sorting items inside their own item group (and an item, being a PK, will only be part of one item group), so these rank numbers have meaning at an item group scope rather than at the scope of the DETAILTABLE table. The technical details of the stored procedure we propose for this technique are described as follows. There shall be an alternate key for the columns (ItemgGroup, Rank) to enforce uniqueness amongst the ranks of the same item group.

First, let MAXINT denote the maximum integer value supported by the particular DB you are using. For portability issues, one may want to choose a conservative value for MAXINT, one that will be guaranteed to work on all relational databases.

Let us assume that the item ITEM<SUB>1</SUB> is the first item inserted in the DETAILTABLE table for the item group ITEMGROUP<SUB>M</SUB>. Its rank shall be chosen as the value of ROUND (MAXINT / 2).

Let us now assume that we have a series of N items entered for the item group ITEMGROUP<SUB>M</SUB> in the DETAIL table, ITEM<SUB>1</SUB>, ITEM<SUB>2</SUB>ITEM<SUB>N</SUB> and that their ranks are RANK<SUB>1</SUB>, RANK<SUB>2</SUB>RANK<SUB>N</SUB>, respectively. We wish to position this new ITEM as the direct successor of ITEM<SUB>K</SUB>, 0 <= K < N (0 as a special case, when we want this new item to be the first, while RANK<SUB>N+1</SUB> is considered to be MAXINT). We shall choose its rank as the value of:

ROUND ((RAN<SUB>K</SUB> + (RANK<SUB>K+1</SUB> - RANK<SUB>K</SUB>) / 2)), where RANK<SUB>0</SUB> is defined to be 0. If we want to insert the new item as the first one, its rank shall be ROUND (RANK<SUB>1</SUB> / 2), and if we want it to be the last one, its rank shall be ROUND ((MAXINT – RANK<SUB>N</SUB>) / 2).

It can be easily seen that RANK<SUB>K</SUB> < RANK<SUB>P</SUB> for all K < P.

If a large number of items are inserted in a particular place in the list (e.g., all are inserted, one by one, as the immediate successor of the same existing item), we may arrive at a point where RANK<SUB>K+1</SUB> - RANK<SUB>K</SUB> = 1 for some K (ITEM<SUB>K</SUB> being the preferred immediate predecessor we mentioned earlier). It effectively means that no new ITEM<SUB>P</SUB> can be inserted such that RANK<SUB>K</SUB> < RANK<SUB>P</SUB> < RANK<SUB>K+1</SUB>. We fix the situation by redistributing the rank values over all the items in the item group. Let us call global redistribution the following simple method of rank redistribution:

SQL
ITEMGROUP // user given argument
N //number of items for ITEMGROUP
 
RANK[] RANKS = "SELECT RANK, ITEM FROM DETAIL 
                 WHERE DETAIL.ITEMGROUP = ITEMGROUP 
                 ORDER BY RANK ASC"
integer index = 1;
integer PREVIOUS = RANKS [1]; //first rank
for each RANK R in RANKS
{
    R = ROUND ((MAXINT / (N+1))) * index;
    update R to db;
    index++;
}

Example:

Let us assume we have N = 8 items and their ranks are: 3, 5, 7, 7, 7, 10, 11 and 14. They will get redistributed to:

ROUND (MAXINT / 9), ROUND ((MAXINT / 9)) * 2, ROUND ((MAXINT / 9)) * 3, ROUND ((MAXINT / 9)) * 3, ROUND ((MAXINT / 9)) * 3, ROUND ((MAXINT / 9)) * 6, ROUND ((MAXINT / 9)) * 7, ROUND ((MAXINT / 9)) * 8.

It is observed immediately that equal ranks get redistributed to equal ranks, and that if the order of multiplicity of rank K is M<SUB>1</SUB>, and the order of multiplicity of rank P is M<SUB>2</SUB>. After redistribution, we shall have:

ROUND ((RANK<SUB>K+1</SUB> - RANK<SUB>K</SUB>) / (RANK<SUB>P+1</SUB> - RANK<SUB>P</SUB>)) = ROUND (M<SUB>1</SUB> / M<SUB>2</SUB>); (with the limit case of RANK<SUB>K+1</SUB> being MAXINT if K is the maximum rank).

The advantage of this redistribution method is that it guarantees a reasonable “widening” of the “space” between ranks, without knowing or caring about where the rank value density is higher than the average. Unfortunately, this simple redistribution strategy does not use any information that might be available about the expected rank distribution over the items.

Of course, the redistribution methods can be implemented inside or outside the database. As an example, the article provides a PL/SQL implementation for future reference. The body of the stored procedure that handles the rank redistribution is printed below:

SQL
PROCEDURE 

redistranksglobally (itemgroupguid IN RAW)
IS
    n NUMBER (38, 0);
    f NUMBER (38, 0);
    ceils NUMBER (38, 0);
    newrank NUMBER (38, 0);
BEGIN
SELECT COUNT (1)
    INTO n
    FROM detailtable_disp_total
WHERE detailtable_disp_total.itemgroup = itemgroupguid;

IF n > 0
THEN
    f := FLOOR (common.maxint / (n + 1));
    ceils := common.maxint - f * (n + 1);
    newrank := 0;

    FOR r IN ranks (itemgroupguid)
    LOOP
        newrank := newrank + f;

        IF ceils > 0
        THEN
            newrank := newrank + 1;
            ceils := ceils - 1;
        END IF;

        UPDATE detailtable_disp_total
        SET detailtable_disp_total.RANK = newrank
        WHERE CURRENT OF ranks;
    END LOOP;
END IF;
END;

Partially ordered ranks

If one only desires partial order amongst the ranks, he/she can choose to do that. The alternate key for the columns (ItemgGroup, Rank) shall have to be removed and additional insertion and redistribution semantics shall have to be devised. An example on how this can be done is present in the reference implementation attached to this article.

The main advantage of this method is that, by starting with [1 ... MAXINT] as our interval for rank values and by distributing the ranks far from one another, many inserts of new items in the DB can happen before any rank redistribution is needed, and many more will happen before a new redistribution is needed. These properties make this technique appropriate for highly mutable tables.

The main drawback presented by this method consists of the introduction of a sorting complexity for finding out the nth item from a given item group.

The next technique is somehow complementary of this one, in that it makes ranks meaningful as indexes, but will require at least partial rank redistribution for the vast majority of insert operations. Also, the next technique does not allow partial ordering.

Method #2. Additional column with consecutive integer values

This method implies adding a new column to the DETAILTABLE table, with meta information about the Item IDs. Let’s call this column Rank. This meta information is simply a positive integer number, such that “SELECT … WHERE ITEMGROUP = <SOME_ITEMGROUP> ORDER BY Rank” will produce the desired order. In this case, the ranks start from 1 and have consecutive values. It follows immediately that the maximum rank for a given item group is the item group's item count.

At insert time, a rank value will be chosen so that the item just inserted will occupy the desired position in the ordered list of items for a particular item group. In our example, we are only interested in sorting items inside their item group (and an item, being a PK, will only be part of one item group), so these rank numbers have meaning at an item group scope, but not at the scope of the whole DETAILTABLE table. Total order is required, which means that no two different items can have the same rank.

The first inserted item for an item group will have the rank set to 1.

Let us now assume that we have a series of N items belonging to the item group ITEMGROUP<SUB>M</SUB> in the DETAIL table, ITEM<SUB>1</SUB>, ITEM<SUB>2</SUB>ITEM<SUB>N</SUB> and that their ranks are RANK<SUB>1</SUB>, RANK<SUB>2</SUB>RANK<SUB>N</SUB>, respectively. We know that RANK<SUB>K</SUB> = RANK<SUB>K-1</SUB> + 1, with the exception of RANK<SUB>1</SUB>, which has no predecessor. If we desire to insert the new item at the end of the list, that is done by setting its rank to N + 1. If not, let us assume that the desired predecessor for the newly inserted item has the rank K, with 0 <= K < N where 0 means we want it first on the list. This is achieved in two simple steps. First, we increment all the ranks which are strictly larger than K. Second, we insert the new item with the rank of K + 1.

We follow with a description of a set of database constraints that have the role of assuring that the ranks have positive consecutive integer values, starting with 1. Let us return to the previously defined table, DETAILTABLE.

ItemGroup (FK to MASTER.ItemGroup)Item (PK)RankRankMinus1
48E9405C45EE4BB3A3DCD3918898A8156BF71715E18640419BA9023585E55B551 
48E9405C45EE4BB3A3DCD3918898A815C41908F6E896409A95ABD3EC81550B0D21
48E9405C45EE4BB3A3DCD3918898A8154D95851F6F7D4CEB8C4AC7ABDAC429D932
48E9405C45EE4BB3A3DCD3918898A8154D467B250ECD46D3B99857CAE00BC96743
40AB772255184B20A4A8EDD89B59536F6F9B9D06515D450E990CD9AEBEE9DCCF1 
40AB772255184B20A4A8EDD89B59536F6BF71715E18640419BA9023585E55B5521
0826DE86308148F89CACEBD8A043F4DC63E3F1F9229C41F889C05F66EB0E4411 
0826DE86308148F89CACEBD8A043F4DC13329EB5387649C6B42DC2FDDE1E57002 

We add the columns Rank and RankMinus1, which have an integer data type. For each rank, the corresponding value in the RankMinus1 column is rank – 1. We shall return to why this is this way at a later point. When we have a rank of 1, the corresponding entry in the RankMinus1 column is NULL (to be noted that RankMinus1 is nullable).

A constraint shall be added to the table DETAIL, in the following form:

SQL
((RANK =1 AND rankminusone IS NULL)OR RANK>1 AND rankminusone=RANK-1)

The constraint will be made deferrable and initially deferred, so as to allow for manipulation of ranks during the insertion of new items.

A foreign key from RankMinus1 shall be added to point to Rank, to make sure that no bogus values will be inserted in the RankMinus1 column. Also, alternate keys will be introduced to ensure that the combinations (ItemGroup, Rank) and (ItemGroup, RankMinus1) are unique.

With the addition of this constraint mechanism, new steps will have to be added when inserting new items. Specifically, the values in the column RankMinus1 also have to be updated. This is trivial to do, and no detailed explanation shall be given here.

The advantage of this method is that the ranks provide not only order, but also information on the index of a certain item. One can find out “the nth element in the table” by doing “SELECT * FROM DETAIL WHERE Rank = n”.

The disadvantage is that an insertion of a new item with a rank of K will require for the updating of N – K ranks and corresponding N – K values from the RankMinus1 column.

An example on how this can be done is present in the reference implementation attached to this article.

Isolating the ordering information in the master table

We have seen that keeping the ordering information in the detail table results in lots of update operations when that order changes. In the following sections, we introduce a way of storing that information in the master table. We also analyze when the proposed method is feasible and when not, and we discuss a less efficient way of achieving similar results, but without so strong limitations on the item count.

Background - The factorial number system

Definition 1 Factorial

The factorial n! is defined as n! = n(n-1)...1 where n is a natural number. The factorial n! is the number of ways in which n distinct objects can be permuted. Because an empty set has a single permutation, it is considered that 0! = 1.

In order to indicate how big the factorial numbers actually are, we introduce some limits that are a consequence of Stirling's approximation:

Proposition 1 [Robbins 1955, Feller 1968]

The factorial function is bounded by the following double inequality:


Ö
 

2p
 
nn+[1/2]e-n+[1/(12n+1)] < n! <
Ö
 

2p
 
nn+[1/2]e-n+[1/12n]

Proposition 2 [Gosper]

For n large it holds true that:

n! »  æ
Ö

(2n+1

3
)p
 
nne-n

As opposed to our traditional geometric radix number systems where the denominations of successive "places" form geometric series, e.g., 1, 10, 100, ... or 1, 2, 4, 8, 16, ... the factorial number system's denominations are the factorial numbers 1, 2, 6, 24, 120, ..... Thus the factorial number system belongs to the item group of mixed-radix representation systems.

The following proposition is very important in the existence of the factorial number system:

Proposition 3

The identity 0 * 0! + 1 * 1! + 2 * 2! + 3 * 3! + ... + k * k! = (k + 1)! - 1 holds true for all k ≥ 0.

Proposition 4

Let t ≥ 2 be a natural number. Every integer in the range 0 ≤ f ≤ t! can be uniquely written in the form:

f = (...(c<SUB>t<FONT face=symbol>-</FONT>1</SUB>(t<FONT face=symbol>-</FONT>1)+c<SUB>t<FONT face=symbol>-</FONT>2</SUB>)(t<FONT face=symbol>-</FONT>2)+...+c<SUB>2</SUB>)2+c<SUB>1</SUB>

Expressed in another form,

f = (t<FONT face=symbol>-</FONT>1)!c<SUB>t<FONT face=symbol>-</FONT>1</SUB>+(t<FONT face=symbol>-</FONT>2)!c<SUB>t<FONT face=symbol>-</FONT>2</SUB>+...+2!c<SUB>2</SUB>+1!c<SUB>1</SUB>

where the "digits" c<SUB>j</SUB> are integers satisfying 0 ≤ c<SUB>j</SUB> ≤ j, for 1 ≤ j < t.

The factorial number system enables us to represent the ordering of a set of t elements as a single natural number less than t!. This is optimal in the sense that there are exactly t! ways to arrange the elements of a t cardinality set.

Method #3. Permutations packed as natural numbers

Let (U<SUB>1</SUB>, ..., U<SUB>t</SUB>) be a sequence of t distinct identifying keys for the items to represent where U<SUB>i</SUB> <FONT face=symbol>Î</FONT> U. These can constitute a primary key or a unique index in the detail table. We assume that the database imposes a total order on the set U which we call a "natural ordering", and that is generally true, be them GUIDs, an autonumber field, or anything else that can be used for identification purposes in the most widely used relational databases.

The ORDER BY SQL clause always orders the items based on the database's natural ordering imposed on the column's data type. The sequence (U<SUB>1</SUB>, ..., U<SUB>t</SUB>) can be represented starting from the database's natural ordering and applying a permutation, and vice versa. The following algorithm computes the inverse permutation as a composition of transpositions and represents the permutation as a natural number. More details about variants of the following two algorithms can be found in Art of Computer Programming, Volume 2: Seminumerical Algorithms by Donald E. Knuth, section 3.3.2:

Algorithm 1 [Analyze a permutation]

We compute a bijective map f : P<SUB>t</SUB> <FONT face=symbol>®</FONT> [0, t!) <FONT face=symbol>Ç</FONT> Z where P<SUB>t</SUB> is the set of all the possible permutations of (1, ..., t). Let <FONT face=symbol>s</FONT> denote such a permutation.

  • P1 Let r <FONT face=symbol>¬</FONT> t, f <FONT face=symbol>¬</FONT> 0.
  • P2 Find the (unique) s position such that <FONT face=symbol>s</FONT><SUB>s</SUB> = r. Set f <FONT face=symbol>¬</FONT> r * f + s - 1.
  • P3 Exchange <FONT face=symbol>s</FONT><SUB>r</SUB> with <FONT face=symbol>s</FONT><SUB>s</SUB>.
  • P4 Decrement r. If r > 1 then go to step P2.

Note that the <FONT face=symbol>s</FONT> permutation becomes the identity permutation when the algorithm stops. To prove that the function is indeed bijective, we run the algorithm backwards:

Algorithm 2 [Backwards]

for r = 2, 3, ..., t

  • s <FONT face=symbol>¬</FONT> f mod r + 1
  • f = floor([f/r])
  • Exchange <FONT face=symbol>s</FONT><SUB>s</SUB> with <FONT face=symbol>s</FONT><SUB>r</SUB>.

The relationship between the factorial number system and algorithm 1 is the identity c<SUB>r - 1</SUB> = s every time step P2 is performed for any value of r.

Note that these two algorithms can run directly on the sequence (U<SUB>1</SUB>, ..., U<SUB>t</SUB>) rather than using the permutation <FONT face=symbol>s</FONT>. In this case, instead of looking for <FONT face=symbol>s</FONT><SUB>s</SUB> = r, one has to find out the maximum element of the sequence (U<SUB>1</SUB>, ..., U<SUB>t</SUB>).

If the items were indexed by an autonumber column, then appending an item to the end of the sequence could simply be achieved using:

PERMUTATION <FONT face=symbol>¬</FONT> t! t + PERMUTATION

In the source associated to this article, a stored function with the signature:

SQL
FUNCTION arrangecategoryitems (initems IN rawtable) RETURN NUMBER

is provided that computes the number representation of the permutation. The implementation closely follows the algorithm described above:

SQL
items := initems;
i := 0;
s := 1;
t := items.COUNT;
r := t;
f := 0;

LOOP
   s := 1;
   i := 0;

   LOOP
      i := i + 1;
      EXIT WHEN i > r;

      IF items (i) > items (s)
      THEN
         s := i;
      END IF;
   END LOOP;

   aux := items (s);
   items (s) := items (r);
   items (r) := aux;
   f := r * f + s - 1;
   r := r - 1;
   EXIT WHEN r <= 1;
   NULL;
END LOOP;

RETURN f;

A stored function that extracts the ordering information from the PERMUTATION column is also included:

SQL
FUNCTION reversearrangecategoryitems (
   permutation   IN   NUMBER,
   items         IN   rawtable
)
   RETURN rawtable

The implementation closely follows the backwards algorithm:

SQL
RESULT := items;
i := 0;
s := 1;
t := items.COUNT;
f := permutation;
r := 1;

LOOP
   r := r + 1;
   EXIT WHEN r > t;
   s := MOD (f, r);
   f := FLOOR (f / r);
   aux := RESULT (s + 1);
   RESULT (s + 1) := RESULT (r);
   RESULT (r) := aux;
END LOOP;

LOOP
   i := i + 1;
   EXIT WHEN i > RESULT.COUNT;
END LOOP;

RETURN RESULT;

Limitations of this method

As stated above, this method permits the storage of the permutation information in the master table rather than the detail table. The ordering information for at most t items is stored as an integer number less than t!. When the ordering changes, there is a single field in the master table to change, and this is much cheaper than the other methods presented herein.

But let's find out what is the maximum number of items supported using the integer data types supported by (some) databases. We add the PERMUTATION column to the master table. In Oracle, we recommend using the NUMBER(38, 0) data type. In SQL Server, the equivalent is DECIMAL(38, 0). In both cases, MAXINT = 10<SUP>39</SUP> - 1 while the greatest number t such that t! < MAXINT is t = 34. So, we can only support 34 detail items for each master record on Oracle and SQL Server using this method.

Method #4. Overpassing the above limitations

Rather than using an integer, one can store the sequence c<SUB>1</SUB>,c<SUB>2</SUB>, ..., c<SUB>t-2</SUB>, c<SUB>t-1</SUB> packed into a byte sequence data type, that is RAW on Oracle and VARBINARY on SQL Server. Because 0 <FONT face=symbol>£</FONT> c<SUB>j</SUB> <FONT face=symbol>£</FONT> j, one can use a very simple byte packing strategy to represent the sequence as a byte sequence data type: every c<SUB>j</SUB> will occupy ceil(log<SUB>2</SUB>(j+1)) bits in the PERMUTATION column. The number of bits needed to express the sequence packed like described above is:

St = t - 1
å
j = 1 
ceil(log2(j + 1)).

Some loose bounds for this sum appear immediately if we use the double inequality log2(t!) ≤ St ≤ t + log2(t!):

1

2
(1 + log2(p)) + (t + 1

2
) log2(t) - (t -1

12 t + 1
)log2(e) £ St
St £ t - 1 + 1

2
(1 + log2(p)) + (t+1

2
) log2(t) - (t -1

12 t
)log2(e)

Using the upper bound, one can easily compute how many bits (rather than bytes) to allocate for the PERMUTATION column. In order to compute the byte size, one has to divide the upper bound by 8 and compute the ceiling of the result.

But we can also compute the exact size required to store the sequence c<SUB>1</SUB>, c<SUB>2</SUB>, ..., c<SUB>t - 2</SUB>, c<SUB>t - 1</SUB>. Let's define:

m(t) = the greatest integer m such that 2<SUP>0</SUP> + 2<SUP>1</SUP> + ... + 2<SUP>(m - 1)</SUP> <FONT face=symbol>£</FONT> t - 1

An equivalent expression for m(t) is:

m(t) = floor(log2(t))

Now, let's compute the sum.

St = m(t)
å
j = 1 
j 2j - 1 + (m(t) + 1)(t - 2m(t))

In Concrete Mathematics by Ronald L. Graham, Donald E. Knuth and Oren Patashnik, the chapter of sums contains, amongst others, the following sum:

n
å
k = 0 
>k 2k = (n - 1) 2n+1 + 2.

Using the above formula, we get:

St = (m(t) - 1) 2<SUP>m(t)</SUP> + 1 + (m(t) + 1)(t - 2<SUP>m(t)</SUP>)

or

S<SUB>t</SUB> = t (m(t) + 1) + 1 - 2<SUP>m(t) + 1</SUP>

Roughly speaking, S<SUB>t</SUB> = O(t log(t)).

Using the Oracle RAW(2000) data type, the maximum number of items supported is t = 1640, while using SQL Server with BINARY(8000) or VARBINARY(8000), the maximum number of items supported is t = 5553.

For demonstrative purposes, we include below a table of computed sizes expressed in bytes:

tsize(PERMUTATION) in bytes
104
209
3015
4023
5030
6038
7046
8055
9063
10072
200169
300274
400387
500499
600623
700748
800873
900998
10001123
20002495
30003989
40005489
50007102
60008727
700010352
800011977
900013703
1000015453

In the source code associated with this article, a stored function with the signature:

SQL
FUNCTION arrangecategoryitems (initems IN rawtable)
   RETURN RAW

that computes the PERMUTATION column's value is included. Note that the implementation uses the UTL_RAW Oracle package for byte manipulation:

SQL
RESULT := UTL_RAW.copies ('00', 2000);
bitposition := 0;
bitsize := 1;
samesizecount := 1;
samesizeindex := 1;
items := initems;
t := items.COUNT;
r := t;

LOOP
   s := 1;
   i := 0;

   LOOP
      i := i + 1;
      EXIT WHEN i > r;

      IF items (i) > items (s)
      THEN
         s := i;
      END IF;
   END LOOP;


   auxitem := items (s);
   items (s) := items (r);
   items (r) := auxitem;
   x := (s - 1) * POWER (2, (24 - bitsize - (bitposition MOD 8)));
   xasraw := UTL_RAW.SUBSTR (UTL_RAW.cast_from_binary_integer (x), 2, 3);
   curent := FLOOR (bitposition / 8);

   IF curent > 0
   THEN
      aux := UTL_RAW.copies ('00', curent);
   ELSE
      aux := NULL;
   END IF;

   aux := UTL_RAW.CONCAT (aux, xasraw);
   RESULT := UTL_RAW.bit_or (RESULT, aux);
   bitposition := bitposition + bitsize;
   samesizeindex := samesizeindex + 1;

   IF (samesizeindex > samesizecount)
   THEN
      BEGIN
         samesizeindex := 1;
         samesizecount := samesizecount * 2;
         bitsize := bitsize + 1;
      END;
   END IF;

   r := r - 1;
   EXIT WHEN r <= 1;
   NULL;
END LOOP;

A stored function that extracts the ordering information from the PERMUTATION column is also implemented with the signature:

SQL
FUNCTION reversearrangecategoryitems (permutation IN 

RAW, items IN rawtable)
      RETURN rawtable

Again, a lot of bit manipulation was needed to implement this function:

SQL
bitposition := 0;
bitsize := 1;
samesizecount := 1;
samesizeindex := 1;
cj := NULL;
RESULT := items;
r := 0;

LOOP
   r := r + 1;
   EXIT WHEN r > items.COUNT - 1;
   curent := FLOOR (bitposition / 8);
   xasraw := UTL_RAW.SUBSTR (permutation, curent + 1, 3);
   MASK :=
      CASE bitposition MOD 8
         WHEN 0
            THEN 'FFFFFF'
         WHEN 1
            THEN '7FFFFF'
         WHEN 2
            THEN '3FFFFF'
         WHEN 3
            THEN '1FFFFF'
         WHEN 4
            THEN '0FFFFF'
         WHEN 5
            THEN '07FFFF'
         WHEN 6
            THEN '03FFFF'
         WHEN 7
            THEN '01FFFF'
      END;
   xasraw := UTL_RAW.bit_and (xasraw, MASK);
   x := UTL_RAW.cast_to_binary_integer (xasraw);
   s := FLOOR (x / POWER (2, (24 - bitsize - (bitposition MOD 8))));

   IF cj IS NULL
   THEN
      cj := auxvarray (s + 1);
   ELSE
      cj.EXTEND ();
      cj (cj.COUNT) := s + 1;
   END IF;

   bitposition := bitposition + bitsize;
   samesizeindex := samesizeindex + 1;

   IF (samesizeindex > samesizecount)
   THEN
      BEGIN
         samesizeindex := 1;
         samesizecount := samesizecount * 2;
         bitsize := bitsize + 1;
      END;
   END IF;
END LOOP;

r := cj.COUNT () + 1;

LOOP
   r := r - 1;
   EXIT WHEN r <= 0;
   aux := items (cj (r));
   RESULT (cj (r)) := RESULT (RESULT.COUNT - r + 1);
   RESULT (RESULT.COUNT - r + 1) := aux;
END LOOP;

Note that one can also implement these algorithms in the programming language of choice. We implemented it in PL/SQL in order to show that it is possible to implement them in a procedural language.

Conclusions

The article makes an attempt to present some methods that should be able to solve the often appearing problem of ordering items in a relational database. We look forward to hearing your opinion about these procedures. We also encourage anyone who might have other/better approaches to share them with us all.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here


Written By
Web Developer
Romania Romania
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.

Written By
Software Developer (Senior)
Romania Romania
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.

Written By
Romania Romania
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.

Comments and Discussions

 
-- There are no messages in this forum --