Editing Big numbers

 
===Little-endian vs. big-endian===
 
The ''byte'' is the fundamental addressable unit of memory for a given processor. This is distinct from the ''word'', which is the natural unit of data for a given processor. For example, the Intel 386 processor had 8 bits to a byte but 32 bits to a word. Each byte in memory may be addressed individually by a pointer, but the individual bits within a byte cannot be addressed. When a 32-bit machine word is written to memory, then, there are two ways it could be laid out. Suppose the number {{hex|CAFEBABE}} is stored at memory location {{hex|DEADBEEF}}. It will occupy four bytes of memory, and they must be contiguous so that the processor can read and write them as a unit. The important question is whether the most significant byte (in this case {{hex|CA}}) comes first, at the lowest address (big-endian), or last (little-endian). The following table shows where each byte ends up in each scheme.
<!-- The CSS for this table is at the end of the page. -->
{| class="endian_table"
!|
!| {{hex|DEADBEEF}}
!| {{hex|DEADBEF0}}
!| {{hex|DEADBEF1}}
!| {{hex|DEADBEF2}}
|-
|| '''Big-endian'''
|| {{hex|CA}}
|| {{hex|FE}}
|| {{hex|BA}}
|| {{hex|BE}}
|-
|| '''Little-endian'''
|| {{hex|BE}}
|| {{hex|BA}}
|| {{hex|FE}}
|| {{hex|CA}}
|}
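The layout in the table can be observed directly on whatever machine the code runs on. The following is a minimal sketch; the function names <code>bytes_in_memory</code> and <code>host_is_little_endian</code> are made up for this illustration. It copies a 32-bit word into a byte array and reports which byte sits at the lowest address.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Returns the four bytes of `word` in the order in which this machine
// stores them in memory (index 0 is the lowest address).
// Hypothetical helper, named for this example only.
std::array<std::uint8_t, 4> bytes_in_memory(std::uint32_t word)
{
    std::array<std::uint8_t, 4> bytes{};
    std::memcpy(bytes.data(), &word, sizeof word);
    return bytes;
}

// True if the least significant byte is stored at the lowest address.
bool host_is_little_endian()
{
    return bytes_in_memory(0xCAFEBABEu)[0] == 0xBE;
}
```

On a little-endian machine the array returned for <code>0xCAFEBABE</code> matches the second row of the table; on a big-endian machine it matches the first.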
 
One faces a similar choice when storing bignums: does the most significant part get stored in the first or the last position of the array? Almost every processor is either consistently little-endian or consistently big-endian, but this does not affect the programmer's ability to choose either little-endian or big-endian representations for bignums as the application requires. The importance of this is discussed in the next section.
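As a rough illustration of the two choices, the sketch below stores the decimal digits of an ordinary integer in an array both ways. The function names are assumptions made for this example, and radix 10 is chosen only for readability.

```cpp
#include <algorithm>
#include <vector>

// Decimal digits of `value`, least significant digit first (little-endian).
// Zero is represented by the single digit 0.
std::vector<int> digits_little_endian(unsigned long long value)
{
    std::vector<int> digits;
    do {
        digits.push_back(static_cast<int>(value % 10));
        value /= 10;
    } while (value != 0);
    return digits;
}

// The same digits, most significant digit first (big-endian).
std::vector<int> digits_big_endian(unsigned long long value)
{
    std::vector<int> digits = digits_little_endian(value);
    std::reverse(digits.begin(), digits.end());
    return digits;
}
```

For the number 1234, the little-endian array holds 4, 3, 2, 1 and the big-endian array holds 1, 2, 3, 4.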
 
  
 
On the other hand, sometimes it is not so easy to determine in advance the size of the numbers we might be working with, or a problem might have bundled test cases and a strict time limit, forcing the programmer to make the small cases run more quickly than the large ones. When this occurs it is a better idea to use ''dynamic'' bignums, which expand or shrink to fit the number of digits they currently hold. Dynamic bignums are trickier to code than fixed-width ones: when we add them, for example, we have to take into account that they might not be of the same length. We might then treat all the missing digits as zeroes, but in any case this requires extra code. When using dynamic bignums the difference between the little-endian and big-endian representations becomes significant. If we store the bignums little-endian and add them, alignment is free: the first entry of each array is the ones' place of its number, so entries with equal indices always have equal place values. With a big-endian representation, the first entries of two bignums of different lengths have different place values, so the indices must be offset before digits can be combined. The code presented in this article will assume the little-endian representation.
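Two small helpers capture most of the extra bookkeeping that dynamic little-endian bignums need. This is only a sketch: the names <code>digit_at</code> and <code>trim</code> are invented here, and the convention that zero is stored as a single digit is an assumption of this example.

```cpp
#include <cstddef>
#include <vector>

// Digit of x at place i, treating places beyond the stored length as zero.
int digit_at(const std::vector<int>& x, std::size_t i)
{
    return i < x.size() ? x[i] : 0;
}

// Shrinks x by dropping zero digits from the most significant end,
// always leaving at least one digit so that zero is stored as {0}.
void trim(std::vector<int>& x)
{
    while (x.size() > 1 && x.back() == 0)
        x.pop_back();
}
```

Because the representation is little-endian, the most significant digit is the last element, so both growing and shrinking happen at the back of the vector, where they are cheap.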
 
===Operations===
 
This section describes how to actually manipulate bignums. We assume a dynamic, zero-based, little-endian array representation but leave many details to the programmer.
 
====Addition====
 
The schoolbook addition algorithm first adds the ones places, then adds the tens (with a carry, if necessary), then the hundreds, and so on. We will likewise start by adding the ones places and proceed to more significant digits.
 
<pre>
input bignums x,y
n &larr; length of x
m &larr; length of y
p &larr; max(m,n)
carry &larr; 0
for i &isin; [0..p)
    z[i] &larr; (x[i]+y[i]+carry) mod radix
    if x[i]+y[i]+carry &ge; radix
          carry &larr; 1
    else
          carry &larr; 0
if carry
    z[p] &larr; 1
    p &larr; p+1
</pre>
 
After this code has completed, <code>z</code> will hold the sum of <code>x</code> and <code>y</code>, and <code>p</code> will be the length of <code>z</code> (the number of digits it occupies). There are two caveats, though:
 
* When adding bignums of unequal length, the loop counter <code>i</code> will grow beyond the length of the shorter of the two. What does <code>x[i]</code> mean when <code>i</code> &ge; <code>n</code>? To make this code work, it should be treated as zero. In a working implementation, we must take care to avoid out-of-bounds array access.
 
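* If the radix used is the same size as a machine word, then we cannot actually check whether <code>x[i]+y[i]+carry &ge; radix</code> as shown, because the sum itself wraps around. Instead, using unsigned arithmetic, add in two steps (first <code>x[i]+y[i]</code>, then the carry) and after each step check whether the result is smaller than the value that was just added to. If it is, an overflow has occurred (and the carry bit should be set). A single comparison of the final result against <code>x[i]</code> and <code>y[i]</code> is not enough: when both digits are at their maximum and the incoming carry is 1, the result equals both digits even though an overflow has occurred.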
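The second caveat can be made concrete with a sketch that uses radix 2<sup>32</sup>, so that each digit fills a 32-bit unsigned word. The function name <code>add_words</code> and the choice of <code>std::uint32_t</code> are assumptions of this example. It relies on the fact that unsigned arithmetic in C++ wraps around modulo 2<sup>32</sup>, and it detects overflow in two steps, once for the digit sum and once for the incoming carry.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Adds two little-endian bignums whose digits are full 32-bit words.
std::vector<std::uint32_t> add_words(const std::vector<std::uint32_t>& x,
                                     const std::vector<std::uint32_t>& y)
{
    std::size_t p = std::max(x.size(), y.size());
    std::vector<std::uint32_t> z(p);
    std::uint32_t carry = 0;
    for (std::size_t i = 0; i < p; i++)
    {
        std::uint32_t a = i < x.size() ? x[i] : 0; // missing digits are zero
        std::uint32_t b = i < y.size() ? y[i] : 0;
        std::uint32_t s = a + b;       // wraps around modulo 2^32
        std::uint32_t c1 = s < a;      // wrapped iff the sum is below an operand
        std::uint32_t t = s + carry;   // now add the incoming carry
        std::uint32_t c2 = t < s;      // wrapped iff s was all ones and carry was 1
        z[i] = t;
        carry = c1 | c2;               // at most one of the two steps can wrap
    }
    if (carry)
        z.push_back(1);
    return z;
}
```

The two checks cannot both fire for the same digit: if the first addition wraps, the sum is at most 2<sup>32</sup>&minus;2, so adding a carry of 1 cannot wrap again.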
 
Nevertheless, this shows the basic idea behind the addition of bignums. Here is a sample of C++ code as it might actually appear, using radix 10:
 
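<pre>
#include <algorithm>
#include <vector>
using namespace std;

// x, y, and z are little-endian vectors of decimal digits; z receives x+y
void add(vector<int>& z, const vector<int>& x, const vector<int>& y)
{
    int n = x.size();
    int m = y.size();
    int p = max(n,m);
    z.resize(p);
    int carry = 0;
    for (int i=0; i<p; i++)
    {
          int t=carry;
          if (i<n) t+=x[i];   // digits past the end of x count as zero
          if (i<m) t+=y[i];   // likewise for y
          z[i]=t%10;
          carry=t/10;
    }
    if (carry)
          z.push_back(1);     // the sum is one digit longer than the inputs
}
</pre>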
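To try a routine like <code>add</code> on real input, the digit vectors have to be built from, and printed as, ordinary decimal strings. The helpers below are a sketch of one way to do this; their names are invented for this example, and <code>from_decimal_string</code> assumes its argument is non-empty and contains only the characters 0 to 9.

```cpp
#include <string>
#include <vector>

// Builds a little-endian digit vector from a decimal string such as "907".
// Assumes s is non-empty and consists only of the characters '0'..'9'.
std::vector<int> from_decimal_string(const std::string& s)
{
    std::vector<int> digits;
    for (auto it = s.rbegin(); it != s.rend(); ++it)
        digits.push_back(*it - '0');   // last character is the ones' place
    return digits;
}

// Converts a little-endian digit vector back to a decimal string.
std::string to_decimal_string(const std::vector<int>& x)
{
    std::string s;
    for (auto it = x.rbegin(); it != x.rend(); ++it)
        s.push_back(static_cast<char>('0' + *it));
    return s;
}
```

Both conversions walk the data in reverse, since a written number is big-endian while the arrays used in this article are little-endian.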
 
  
 
===Error conditions===
 
 
The other method is suitable only for fixed-size bignums and is based on the ''two's complement'' convention used in almost all modern processors for built-in integer types. We suppose that the radix is <math>b</math> and that the bignum contains <math>n</math> digits. Then, the largest positive integer we can represent is <math>b^n-1</math>. Notice that when
 
-->
 
 
{{#css:
  .endian_table
  {
    border-collapse: collapse;
  }
  .endian_table th
  {
    border: 1px solid #888;
    font-weight: bold;
  }
  .endian_table td
  {
    border: 1px solid #888;
  }
}}
 
