RC Coding

Introduction

This article describes an alternative approach to minimum redundancy coding (fixed-length) without using tree data structures. The main advantages are simplicity and flexibility, allowing for continuous improvement by integrating new techniques from selection/sorting algorithms. Many different implementations are possible, with time complexity ranging from O(n2) to O(n).

Example source code is included, so you can try it out and see if it works for your purposes.

Requirements

It is assumed the reader is familiar with the traditional algorithms for generating minimum redundancy codes, Huffman variants in particular. Experience with general computer programming is also necessary, to properly understand some of the concepts and to follow the example source code, mostly written in C# but easily adaptable to any C based language.

Circular chains

The common feature and primary innovation of the algorithms is the use of a data structure named circular chain. In this context, it's defined as a collection of elements where each element points to the next, with the last element pointing to the initial element and therefore closing the chain:

cc1

Implementation is done by maintaining an array of ID/index values for each element:

Symbol Next
A B
B C
C D
D A

There are some similarities between this data structure and the well-known linked list, but the latter is usually implemented by allocating multiple blocks of memory and assigning the address of each block as the Next pointer. Considering this, a circular chain can be viewed as an optimized and specialized circularly-linked list.

Navigation

Iterating through the elements of the circular chain consists of starting with an arbitrary element and following the Next pointer until the initial element is found again (the chain closed).

Element removal

Removing an element D from the circular chain consists of finding the element pointing to D and setting the corresponding Next pointer to the value D is pointing to:

cc2
Symbol Next
A B
B C
C A
D A

Merging

Combining 2 (or more) circular chains consists of exchanging the Next pointer of an arbitrary element from one circular chain with the Next pointer of an arbitrary element of another circular chain.

Before:

cc3
Symbol Next
A B
B C
C D
D A
E F
F E

After:

cc4
Symbol Next
A B
B C
C E
D A
E F
F D

Identification

An interesting property of circular chains is that a single arbitrary element can identify and reference the entire collection of associated elements, thus becoming its representative.

Base algorithm

The base algorithm is an adaptation of standard Huffman coding and consists of the following steps:

  1. Initialize data structures by assigning a circular chain to each symbol: Each chain has a single element that points to itself and gets an associated frequency count corresponding to the respective symbol.
  2. Find the 2 circular chains X and Y with the lowest frequency counts.
  3. Traverse chain X and add a bit prefix with value 0 to the code of each symbol, incrementing the respective code length.
  4. Traverse chain Y and add a bit prefix with value 1 to the code of each symbol, incrementing the respective code length.
  5. Merge the two circular chains into a new one (swap two Next pointers between them). Assign it the combined frequency count.
  6. Repeat steps 2-5 until there is only one circular chain left, which includes all symbols.

Evolution of the data structures during the coding process for the stream of symbols "ADAACBBAA" is demonstrated in the following tables.

Iteration 0 (after initialization):

Symbol Next Frequency Code
A A 5
B B 2
C C 1
D D 1

Iteration 1:

Symbol Next Frequency Code
A A 5
B B 2
C D 2 0
D C 1

Iteration 2:

Symbol Next Frequency Code
A A 5
B D 4 0
C B 10
D C 11

Iteration 3:

Symbol Next Frequency Code
A D 9 1
B A 00
C B 010
D C 011

If only the code lengths are necessary, it can be simplified even further:

  1. Initialize data structures by assigning a circular chain to each symbol: Each chain has a single element that points to itself and gets an associated frequency count corresponding to the respective symbol.
  2. Find the 2 circular chains X and Y with the lowest frequency counts.
  3. Merge the two circular chains into a new one (swap two Next pointers between them). Assign it the combined frequency count.
  4. Traverse the new chain and increment the code length of each symbol.
  5. Repeat steps 2-4 until there is only one circular chain left, which includes all symbols.

Specific versions of the base algorithm will vary depending on the implementation of step 2, which also determines the overall processing time performance.

RC1

If we consider the most obvious approach to finding the 2 elements with the lowest values in a list by performing a linear scan, the corresponding C# implementation could be something like this:

Show source code
 internal static FixedLengthCode[] RC1CalculateCodes(uint[] frequencies) {
  // Initialize circular chains data
  int[] circular_chains = new int[frequencies.Length];
  for (int i = 0; i < circular_chains.Length; i++) circular_chains[i] = i;

  FixedLengthCode[] codes = new FixedLengthCode[frequencies.Length];
  for (int i = 0; i < codes.Length; i++) {
   codes[i].Length = 0;
   codes[i].Data = new byte[FixedLengthCode.MaxCodeSize]; // Filled with 0 values by default
  }

  while (true) {
   // Find the 2 elements with the lowest frequency count
   int x_pos = 0, y_pos = 0; // Initialization not necessary but forced by C# compiler
   uint x_freq = uint.MaxValue, y_freq = uint.MaxValue;
   for (int i = 0; i < frequencies.Length; i++) {
    uint f = frequencies[i];
    if (f < x_freq) {
     x_pos = i;
     x_freq = f;
     if (x_freq < y_freq) {
      // Swap values
      uint t1 = x_freq;
      x_freq = y_freq;
      y_freq = t1;
      int t2 = x_pos;
      x_pos = y_pos;
      y_pos = t2;
     }
    }
   }
   if (x_freq == uint.MaxValue) break; // Exit condition

   // Increment code lengths for elements in chain X
   int curpos = x_pos;
   do {
    codes[curpos].Length++;
    curpos = circular_chains[curpos]; // Next element
   } while (curpos != x_pos); // Exit when chain closed

   // Increment code length and add 1 prefix to elements in chain Y
   curpos = y_pos;
   do {
    byte[] ba = codes[curpos].Data;
    int l = codes[curpos].Length++;
    ba[ba.Length - 1 - (l >> 3)] |= unchecked((byte) (1 << (l & 7)));
    curpos = circular_chains[curpos]; // Next element
   } while (curpos != y_pos); // Exit when chain closed

   // Merge the 2 circular chains
   int t3 = circular_chains[x_pos];
   circular_chains[x_pos] = circular_chains[y_pos];
   circular_chains[y_pos] = t3;

   // Assign the new representative the combined frequency count
   frequencies[x_pos] += y_freq;
   // Mark the other representative so it's not taken into account in further iterations
   frequencies[y_pos] = uint.MaxValue;
  }
  return codes;
 }


struct FixedLengthCode {
 internal const int MaxCodeSize = 7; // Maximum code size in bytes

 internal byte Length; // Number of bits
 internal byte[] Data;
 // A more efficient approach (but requires unsafe code) would be:
 // internal fixed byte Data[MaxCodeSize];

 // This implementation of ToString() is concise but very inefficient
 public override string ToString() {
  StringBuilder sb = new StringBuilder(Length);
  for (int i = Length - 1; i >= 0; i--) {
   sb.Append((Data[Data.Length - 1 - (i >> 3)] & (1 << (i & 7))) != 0 ? "1" : "0");
  }
  return sb.ToString();
 }
}

The frequencies array is modified during the code generation process. Apart from the input and output arrays, there's a single additional circular_chains array involved that stores data temporarily. This array can usually be stored in the same memory block that will receive the final compressed data, so the algorithm can have virtually null memory footprint if applied properly.

Note the source code still considers symbols with a 0 frequency count. It's generally preferable to exclude these from the code generation process and some possible options to achieve this are:

Maximum code size is defined as 7 bytes in this particular case, which is unnecessarily large for practical applications but shows how easily it can scale. To minimize the space needed to store the codes, we can assign the prefix 1 to the circular chain with the lowest maximum code length. Codes generated this way and stored as a 32-bit unsigned integer are defined here as "compact":

Show source code
 internal static uint[] RC1CalculateCompactCodes(uint[] frequencies, out byte[] codeLengths) {
  // Initialize circular chains data
  int[] circular_chains = new int[frequencies.Length];
  for (int i = 0; i < circular_chains.Length; i++) circular_chains[i] = i;
  uint[] codes = new uint[frequencies.Length]; // Filled with 0 values by default
  codeLengths = new byte[frequencies.Length]; // Filled with 0 values by default

  while (true) {
   // Find the 2 elements with the lowest frequency count
   int x_pos = 0, y_pos = 0; // Initialization not necessary but forced by C# compiler
   uint x_freq = uint.MaxValue, y_freq = uint.MaxValue;
   for (int i = 0; i < frequencies.Length; i++) {
    uint f = frequencies[i];
    if (f < x_freq) {
     x_pos = i;
     x_freq = f;
     if (x_freq < y_freq) {
      // Swap values
      uint t1 = x_freq;
      x_freq = y_freq;
      y_freq = t1;
      int t2 = x_pos;
      x_pos = y_pos;
      y_pos = t2;
     }
    }
   }
   if (x_freq == uint.MaxValue) break; // Exit condition

   // This check reduces the space needed to store the codes
   if (codeLengths[x_pos] < codeLengths[y_pos]) {
    int t3 = x_pos;
    x_pos = y_pos;
    y_pos = t3;
   }

   // Increment code lengths for elements in chain X
   int curpos = x_pos;
   do {
    codeLengths[curpos]++;
    curpos = circular_chains[curpos]; // Next element
   } while (curpos != x_pos); // Exit when chain closed

   // Increment code length and add 1 prefix to elements in chain Y
   curpos = y_pos;
   do {
    codes[curpos] |= unchecked((uint) (1 << codeLengths[curpos]++));
    curpos = circular_chains[curpos]; // Next element
   } while (curpos != y_pos); // Exit when chain closed

   // Merge the 2 circular chains
   int t4 = circular_chains[x_pos];
   circular_chains[x_pos] = circular_chains[y_pos];
   circular_chains[y_pos] = t4;

   // Assign the new representative the combined frequency count
   frequencies[x_pos] += y_freq;
   // Mark the other representative so it's not taken into account in further iterations
   frequencies[y_pos] = uint.MaxValue;
  }
  return codes;
 }

Improving speed

Performing a linear scan to find the lowest values can significantly increase the processing time as the number of symbols grows, making RC1 unfeasible for large numbers of symbols. To improve speed, the same techniques used by good sorting algorithms can be applied.

RC2

We can use a binary heap to allow efficient retrieval of the minimum values, as the next implementation demonstrates. The customized min heap compares the frequencies associated with each symbol (the representative of a circular chain) to determine where to store its reference in the internal nodes list. A good introduction to heaps can be found in [1].

Show source code
 internal static uint[] RC2CalculateCompactCodes(uint[] frequencies, out byte[] codeLengths) {
  // Initialize circular chains data
  int[] circular_chains = new int[frequencies.Length];
  for (int i = 0; i < circular_chains.Length; i++) circular_chains[i] = i;
  uint[] codes = new uint[frequencies.Length]; // Filled with 0 values by default
  codeLengths = new byte[frequencies.Length]; // Filled with 0 values by default

  MinHeap min_heap = new MinHeap(frequencies);
  while (min_heap.Count >= 2) {
   // Find the 2 elements with the lowest frequency count
   int x_pos = min_heap.Remove(), y_pos = min_heap.Remove();

   // This check reduces the space needed to store the codes
   if (codeLengths[x_pos] < codeLengths[y_pos]) {
    int t1 = x_pos;
    x_pos = y_pos;
    y_pos = t1;
   }

   // Increment code lengths for elements in chain X
   int curpos = x_pos;
   do {
    codeLengths[curpos]++;
    curpos = circular_chains[curpos]; // Next element
   } while (curpos != x_pos); // Exit when chain closed

   // Increment code length and add 1 prefix to elements in chain Y
   curpos = y_pos;
   do {
    codes[curpos] |= unchecked((uint) (1 << codeLengths[curpos]++));
    curpos = circular_chains[curpos]; // Next element
   } while (curpos != y_pos); // Exit when chain closed

   // Merge the 2 circular chains
   int t2 = circular_chains[x_pos];
   circular_chains[x_pos] = circular_chains[y_pos];
   circular_chains[y_pos] = t2;

   // Assign the new representative the combined frequency count
   frequencies[x_pos] += frequencies[y_pos];
   // Add new representative to the heap
   min_heap.Add(x_pos);
  }
  return codes;
 }

 struct MinHeap {
  internal MinHeap(uint[] frequencies) {
   count = 0;
   nodes = new int[frequencies.Length];
   this.frequencies = frequencies;
   for (int i = 0; i < frequencies.Length; i++) Add(i);
  }

  internal int Count { get { return count; } }

  internal void Add(int key) {
   int i;
   for (i = count++; i > 0; ) {
    int j = (i - 1) >> 1;
    if (frequencies[nodes[j]] <= frequencies[key]) break;
    nodes[i] = nodes[j];
    i = j;
   }
   nodes[i] = key;
  }

  internal int Remove() {
   int res = nodes[0];
   int ck = nodes[--count];
   if (count > 0) {
    int i;
    for (i = 0; i < (count >> 1); ) {
     int j = (i << 1) + 1;
     if (j < (count - 1) && frequencies[nodes[j]] > frequencies[nodes[j + 1]]) j++;
     if (frequencies[ck] < frequencies[nodes[j]]) break;
     nodes[i] = nodes[j];
     i = j;
    }
    nodes[i] = ck;
   }
   return res;
  }

  readonly int[] nodes;
  readonly uint[] frequencies;
  int count;
 }

Some possible improvements could be speeding up the initial heap creation and combining the last Remove() and Add(x_pos) function calls into a single operation, but tests show these don't have a significant impact on performance.

RC3

If the frequencies are pre-sorted, we can generate the codes with a single pass. This requires maintaining a first-in-first-out list (FIFO queue - [2]) with references to representatives of the processed circular chains:

Show source code
 internal static FixedLengthCode[] RCDemo1CalculateCodes(uint[] frequencies) {
  // Initialize circular chains data
  int[] circular_chains = new int[frequencies.Length];
  for (int i = 0; i < circular_chains.Length; i++) circular_chains[i] = i;

  FixedLengthCode[] codes = new FixedLengthCode[frequencies.Length];
  for (int i = 0; i < codes.Length; i++) {
   codes[i].Length = 0;
   codes[i].Data = new byte[FixedLengthCode.MaxCodeSize]; // Filled with 0 values by default
  }

  Queue<int> processing_queue = new Queue<int>();
  for (int i = 0; (frequencies.Length - i + processing_queue.Count) >= 2; ) {
   // Find the 2 elements with the lowest frequency count
   int x_pos, y_pos;
   if (i >= frequencies.Length) {
    x_pos = processing_queue.Dequeue();
    y_pos = processing_queue.Dequeue();
   } else if (processing_queue.Count == 0) {
    x_pos = i++;
    y_pos = i++;
   } else {
    if (frequencies[i] < frequencies[processing_queue.Peek()]) {
     x_pos = i++;
     if (i >= frequencies.Length || frequencies[i] >= frequencies[processing_queue.Peek()]) {
      y_pos = processing_queue.Dequeue();
     } else {
      y_pos = i++;
     }
    } else {
     x_pos = processing_queue.Dequeue();
     if (processing_queue.Count == 0 || frequencies[i] < frequencies[processing_queue.Peek()]) {
      y_pos = i++;
     } else {
      y_pos = processing_queue.Dequeue();
     }
    }
   }

   // Increment code lengths for elements in chain X
   int curpos = x_pos;
   do {
    codes[curpos].Length++;
    curpos = circular_chains[curpos]; // Next element
   } while (curpos != x_pos); // Exit when chain closed

   // Increment code length and add 1 prefix to elements in chain Y
   curpos = y_pos;
   do {
    byte[] ba = codes[curpos].Data;
    int l = codes[curpos].Length++;
    ba[ba.Length - 1 - (l >> 3)] |= unchecked((byte) (1 << (l & 7)));
    curpos = circular_chains[curpos]; // Next element
   } while (curpos != y_pos); // Exit when chain closed

   // Merge the 2 circular chains
   int t1 = circular_chains[x_pos];
   circular_chains[x_pos] = circular_chains[y_pos];
   circular_chains[y_pos] = t1;

   // Assign the new representative the combined frequency count
   frequencies[x_pos] += frequencies[y_pos];
   // Add new representative to processing queue
   processing_queue.Enqueue(x_pos);
  }
  return codes;
 }

The implementation shown works but relies on a Queue class, which introduces the following problems:

To solve these problems we have to get rid of the Queue and customize the alternative structure to match our specific needs. In order to achieve this we have to consider what data is necessary to implement a FIFO:

If we notice the changes in the frequencies array during processing, we can see that as circular chains are merged the new representative gets assigned the combined frequency count, but the frequencies of the other associated elements are no longer relevant. We can take advantage of this unused space to store the Next reference of each element in the new FIFO structure.

Iteration 0 (position = A, FIFO count = 0):

Symbol Next Frequency Code
A A 3
B B 5
C C 5
D D 5
E E 6
F F 7

Iteration 1 (position = C, FIFO count = 1, FIFO first = A, FIFO last = A):

Symbol Next Frequency Code
A B 8 0
B A 1
C C 5
D D 5
E E 6
F F 7

Iteration 2 (position = E, FIFO count = 2, FIFO first = A, FIFO last = C):

Symbol Next Frequency Code
A B 8 0
B A C 1
C D 10 0
D C 1
E E 6
F F 7

Iteration 3 (position = invalid, FIFO count = 3, FIFO first = A, FIFO last = E):

Symbol Next Frequency Code
A B 8 0
B A C 1
C D 10 0
D C E 1
E F 13 0
F E 1

Iteration 4 (position = invalid, FIFO count = 2, FIFO first = E, FIFO last = A):

Symbol Next Frequency Code
A D 18 00
B A 01
C B 10
D C 11
E F 13 0
F E A 1

Iteration 5 (position = invalid, FIFO count = 1, FIFO first = E, FIFO last = E):

Symbol Next Frequency Code
A F 100
B A 101
C B 110
D C 111
E D 31 00
F E 01

The corresponding implementation then becomes:

Show source code
 internal static uint[] RCDemo2CalculateCompactCodes(uint[] frequencies, out byte[] codeLengths) {
  // Initialize circular chains data
  int[] circular_chains = new int[frequencies.Length];
  for (int i = 0; i < circular_chains.Length; i++) circular_chains[i] = i;
  uint[] codes = new uint[frequencies.Length]; // Filled with 0 values by default
  codeLengths = new byte[frequencies.Length]; // Filled with 0 values by default

  OptimizedQueue processing_queue = new OptimizedQueue(frequencies, circular_chains);
  for (int i = 0; (frequencies.Length - i + processing_queue.Count) >= 2; ) {
   // Find the 2 elements with the lowest frequency count
   int x_pos = int.MaxValue; int y_pos = x_pos;
   do {
    int t1 = i < frequencies.Length && (processing_queue.Count == 0 || frequencies[i] < frequencies[processing_queue.Peek()]) ? i++ : processing_queue.Dequeue();
    if (x_pos == int.MaxValue) x_pos = t1;
    else y_pos = t1;
   } while (y_pos == int.MaxValue);

   // This check reduces the space needed to store the codes
   if (codeLengths[x_pos] < codeLengths[y_pos]) {
    int t2 = x_pos;
    x_pos = y_pos;
    y_pos = t2;
   }

   // Increment code lengths for elements in chain X
   int curpos = x_pos;
   do {
    codeLengths[curpos]++;
    curpos = circular_chains[curpos]; // Next element
   } while (curpos != x_pos); // Exit when chain closed

   // Increment code length and add 1 prefix to elements in chain Y
   curpos = y_pos;
   do {
    codes[curpos] |= unchecked((uint) (1 << codeLengths[curpos]++));
    curpos = circular_chains[curpos]; // Next element
   } while (curpos != y_pos); // Exit when chain closed

   // Merge the 2 circular chains
   int t3 = circular_chains[x_pos];
   circular_chains[x_pos] = circular_chains[y_pos];
   circular_chains[y_pos] = t3;

   // Assign the new representative the combined frequency count
   frequencies[x_pos] += frequencies[y_pos];
   // Add new representative to processing queue
   processing_queue.Enqueue(x_pos);
  }
  return codes;
 }

 struct OptimizedQueue {
  internal OptimizedQueue(uint[] frequencies, int[] circularChains) {
   this.frequencies = frequencies;
   this.circularChains = circularChains;
   count = first = last = 0;
  }

  internal int Count {
   get { return count; }
  }

  internal int Peek() {
   return first;
  }

  internal void Enqueue(int newValue) {
   if (count++ == 0) {
    first = newValue;
   } else {
    frequencies[circularChains[last]] = unchecked((uint) newValue); // Link new element
   }
   last = newValue;
  }

  internal int Dequeue() {
   int res = first;
   if (--count > 0) first = unchecked((int) frequencies[circularChains[first]]); // Set to next element
   return res;
  }

  int count, first, last;
  uint[] frequencies;
  int[] circularChains;
 }

Additional optimizations can be done by removing the function calls inside the loops, basically converting them to C++ style inline functions:

Show source code
 internal static uint[] RC3CalculateCompactCodes(uint[] frequencies, out byte[] codeLengths) {
  // Initialize circular chains data
  int[] circular_chains = new int[frequencies.Length];
  for (int i = 0; i < circular_chains.Length; i++) circular_chains[i] = i;
  uint[] codes = new uint[frequencies.Length]; // Filled with 0 values by default
  codeLengths = new byte[frequencies.Length]; // Filled with 0 values by default

  int temp;
  int fifo_count = 0;
  int fifo_first = 0, fifo_last = 0; // Initialization not necessary but forced by C# compiler
  for (int i = 0; (frequencies.Length - i + fifo_count) >= 2; ) {
   // Find the 2 elements with the lowest frequency count
   int x_pos = int.MaxValue; int y_pos = x_pos;
   do {
    if (i < frequencies.Length && (fifo_count == 0 || frequencies[i] < frequencies[fifo_first])) {
     temp = i++;
    } else {
     temp = fifo_first;
     if (--fifo_count > 0) fifo_first = unchecked((int) frequencies[circular_chains[fifo_first]]); // Set to next element
    }
    if (x_pos == int.MaxValue) x_pos = temp;
    else y_pos = temp;
   } while (y_pos == int.MaxValue);

   // This check reduces the space needed to store the codes
   if (codeLengths[x_pos] < codeLengths[y_pos]) {
    temp = x_pos;
    x_pos = y_pos;
    y_pos = temp;
   }

   // Increment code lengths for elements in chain X
   temp = x_pos;
   do {
    codeLengths[temp]++;
    temp = circular_chains[temp]; // Next element
   } while (temp != x_pos); // Exit when chain closed

   // Increment code length and add 1 prefix to elements in chain Y
   temp = y_pos;
   do {
    codes[temp] |= unchecked((uint) (1 << codeLengths[temp]++));
    temp = circular_chains[temp]; // Next element
   } while (temp != y_pos); // Exit when chain closed

   // Merge the 2 circular chains
   temp = circular_chains[x_pos];
   circular_chains[x_pos] = circular_chains[y_pos];
   circular_chains[y_pos] = temp;

   // Assign the new representative the combined frequency count
   frequencies[x_pos] += frequencies[y_pos];

   // Add new representative to processing queue
   if (fifo_count++ == 0) {
    fifo_first = x_pos;
   } else {
    frequencies[circular_chains[fifo_last]] = unchecked((uint) x_pos); // Link new element
   }
   fifo_last = x_pos;
  }
  return codes;
 }

Integration

Here are some guidelines that should be followed when integrating RC coding in a real application:

Appendix A - Files

Download full source code with this link.

Appendix B - Machine code optimizations

Some interesting observations can be made from the analysis of the RC1 x86 assembly language implementation that generates the code lengths:

Show source code
; EBX = Pointer to frequencies array (input data, modified during processing)
; ECX = Number of elements in frequencies array
; ESI = Pointer to circular_chains array (temporary storage)
; EDI = Pointer to code_lengths array (receives output data)

 ; Initialize circular chains data (may be removed if circular_chains array already contains values in ascending order)
 mov eax, ecx
init_cc:
 dec eax
 mov dword ptr [esi + eax * 4], eax
 jnz init_cc

 mov ebp, ecx
 push edi
 ; Initialize code lengths (may be removed if code_lengths array already contains null values)
 ; xor eax, eax - not necessary since EAX already contains 0
 rep stosb

next_iteration:
 push esi
 ; Find the 2 elements with the lowest frequency count
 xor esi, esi
 dec esi
 mov edi, esi
 mov ecx, ebp
find_l2:
 dec ecx
 cmp esi, dword ptr [ebx + ecx * 4]
 jbe skip_freq
 mov esi, dword ptr [ebx + ecx * 4]
 mov eax, ecx
 cmp esi, edi
 jae skip_freq
 xchg esi, edi
 xchg eax, edx
skip_freq:
 jecxz found_l2
 jmp find_l2
found_l2:
 inc esi
 pop esi
 jz exit ; Exit condition

 ; Assign the new representative the combined frequency count
 add dword ptr [ebx + eax * 4], edi
 ; Mark the other representative so it's not taken into account in further iterations
 dec ecx
 mov dword ptr [ebx + edx * 4], ecx

 ; Merge the 2 circular chains
 mov ecx, dword ptr [esi + eax * 4]
 xchg ecx, dword ptr [esi + edx * 4]
 mov dword ptr [esi + eax * 4], ecx

 pop edi
 push edi
 ; Increment code lengths for elements in new chain
 mov ecx, eax
inc_length:
 inc byte ptr [edi + ecx]
 mov ecx, dword ptr [esi + ecx * 4] ; Next element
 cmp ecx, eax
 jne inc_length ; Exit when chain closed
 jmp next_iteration
exit:
 pop edi

The XCHG instruction is used to exchange 2 values atomically, something not possible to specify with common high-level languages. Machine code size in this case is 80 bytes, which should be very close to the minimum possible to achieve an equivalent result.

References

  1. Heaps and Heapsort
  2. Queues (FIFO)

SourceForge.net Logo