very large numbers

i am writing a program for factoring very large numbers(1 to 650 digits apprximately) i have a large number class:
//=================INCLUDES============
#include <stdlib.h>
#include <iostream.h>
#include <string.h>
//====================================

//=================CONSTANTS===========
#define sz 700
//====================================

//=================CLASSES=============
class verylong
{
private:
char vlstr[sz];// string that holds the individual digits
int vlen; // length of verylong string
verylong multdigit(int);
verylong mult10(verylong);
public:
verylong()
{
vlstr[0] = '\0';
vlen = 0;
}
verylong(char s[sz])
{
strcpy(vlstr,s);
vlen = strlen(s);
}
verylong(unsigned long n)
{
ltoa(n,vlstr,10);
strrev(vlstr);
vlen = strlen(vlstr);
}
void putvl(); // displays verylong
void getvl(); // input verylong from user
verylong operator + (verylong); // adding verylongs
verylong operator * (verylong); // multiply verylongs
verylong operator - (verylong); // subtract verylongs
};
//====================================

void verylong::putvl()
{
char temp[sz];
strcpy(temp,vlstr);
cout << strrev(temp);
}

void verylong::getvl()
{
cin >> vlstr;
vlen = strlen(vlstr);
strrev(vlstr);
}

verylong verylong::operator + (verylong v)
{
char temp[sz];
int maxlen = (vlen > v.vlen) ? vlen : v.vlen;
int carry = 0;
int j;
for (j = 0;j < maxlen;j++)
{
int d1 = (j > vlen - 1) ? 0 : vlstr[j] - '0';
int d2 = (j > v.vlen - 1) ? 0 : v.vlstr[j] - '0';
int digitsum = d1 + d2 + carry;
if (digitsum >= 10)
{
digitsum -= 10;
carry = 1;
}
else
carry = 0;
temp[j] = digitsum + '0';
}
if (carry == 1)
{
temp[j++] = '1';
}
temp[j] = '\0';
return verylong(temp);
}

verylong verylong::operator * (verylong v)
{
verylong pprod;
verylong tempsum;
for (int j = 0;j < v.vlen;j++)
{
int digit = v.vlstr[j] - '0';
pprod = multdigit(digit);
for (int k = 0;k < j;k++)
{
pprod = mult10(pprod);
}
tempsum = tempsum + pprod;
}
return tempsum;
}

verylong verylong::mult10(verylong v)
{
char temp[sz];
for (int j = v.vlen - 1;j >= 0;j--)
{
temp[j + 1] = v.vlstr[j];
temp[0] = '0';
temp[v.vlen + 1] = '\0';
}
return verylong(temp);
}

verylong verylong::multdigit(int d2)
{
char temp[sz];
int carry = 0;
int j;
for (j = 0;j < vlen;j++)
{
int d1 = vlstr[j] - '0';
int digitprod = d1 * d2;
digitprod += carry;
if (digitprod >= 10)
{
carry = digitprod / 10;
digitprod -= carry * 10;
}
else
carry = 0;
temp[j] = digitprod + '0';
}
if (carry != 0)
{
temp[j++] = carry + '0';
}
temp[j] = '\0';
return verylong(temp);
}

need help with subtraction, devision, modular devision
[3494 byte] By [ScreechPowers] at [2007-11-18 19:26:44]
# 1 Re: very large numbers
I did something like this a long time ago in Pascal...
-the basic idea for subtraction is the same way you would do subtraction on paper:
-compare the right-most/least significant digits first; if the minuend (the base number) digit is less than the subtrahend (the amount subtracted) digit, you need to borrow...take one away from the next digit up in the minuend, and treat the current minuend as a "teen" instead of it's current value (i.e. 15 instead of 5, 13 instead of 3, 10 instead of 0). Do the "small value" subtraction, and you now have your first digit of your difference. Iterate to the next higher digit and do the same thing.
-something to look out for: What if I need to borrow, but the next higher digit is a 0? [I'm not going to do all your work for you ;-) ]

-Large integer division: using your subtract function, do repeated subtraction by your divisor value until your difference is less than your divisor. The number of iterations this took is your integer quotient.

-Large integer modulus (remainder), do the same thing as integer division, but report the amount left instead of the number of iterations it took to get there.

if you haven't already, it would probably also be good to implement operator=(), operator++(), operator>(), and operator<() .
jefranki at 2007-11-9 0:32:37 >
# 2 Re: very large numbers
In the same file, with division, it should be noted that repeated subtraction will not work. The numbers i am using are HUGE:

25195908475657893494027183240048398571429282126204
03202777713783604366202070759555626401852588078440
69182906412495150821892985591491761845028084891200
72844992687392807287776735971418347270261896375014
97182469116507761337985909570009733045974880842840
17974291006424586918171951187461215151726546322822
16869987549182422433637259085141865462043576798423
38718477444792073993423658482382428119816381501067
48104516603773060562016196762561338441436038339044
14952634432190114657544454178424020924616515723350
77870774981712577246796292638635637328991215483143
81678998850404453640235273819513786365643912120103
97122822120720357

or sometimes larger...also, the work will be done by a network of computers, but still with numbers this large, time is valuble...
ScreechPowers at 2007-11-9 0:33:35 >
# 3 Re: very large numbers
Just brainstorming (I need more time to think before any definites):
-Maybe use some sort of partitioned division? (A little more like long division by hand.)
jefranki at 2007-11-9 0:34:41 >
# 4 Re: very large numbers
It sounds like you are working on a serious project. Not to offend anyone who has posted ideas, but I wouldn't bet on finding expert advice for this particular problem here. The best advice I think you can use is either (a) purchase a library meant for this type of use or (b) spend some time researching in a library or possibly on the web about how to perform division and multiplication with large numbers. One reference is this chapter from "Numerical Recipes in C" (http://www.library.cornell.edu/nr/bookcpdf/c20-6.pdf). It's not C++, and I believe there are more efficient FFT algorithms out there, but this chapter will at least give you an idea about how to best perform these operations.

- Kevin

BTW, I did some rough number crunching given the outlines of complexity in the chapter from Numerical Recipes. If you did byte-wise operations, NR's algorithms would require 28 times less operations than doing it by with "familliar elementary school" methods using byte-wise operations. Using 32-bit operations would provide you with even better performance. As you said, time is valuable.
KevinHall at 2007-11-9 0:35:34 >
# 5 Re: very large numbers
Here's my idea I cranked out since my last post. This isn't a full algorthm for division/remainder, just kinda the outline of one. It is based on long division (the stuff you did and hated in elementary school). I think it should have a significantly shorter runtime than the brute force multiple-subtract, but I didn't go prove it with "Big-O". An illustration of the process is at the bottom. It took me long enough to work out something on paper, so no code yet, but if you decide to use my basic method, please credit me (i.e. Thanks to).

First, break up the "verylong" into chunks of (signed)16-bit ints (shorts on 32-bit systems), based on "ten thousands" (10000<32767). (so an array of shorts, rather than char or enum). It's definitely possible to use int's instead, but I'm trying to think ahead to using Altivec or any x86 vector processing unit equivilent. Altivec uses single precision, and (int)MAX_SHORT*(int)MAX_SHORT < MAX_INT.

It may or may not be easier to just redo your class on this basis, and then convert text in and out. I'll continue to explain as if you did...I'm sure you can figure the appropriate work arounds if you don't.

Then compare the two verylongs you are dividing to make sure your dividend is <= your divisor (reason should be obvious). Check the length of your two arrays of shorts, subtract the length of the divisor from length the dividend, this will be the length of your quotient.

Estimate the most significant digit of your quotient by dividing your dividend by (divisor+1). Use the multiply operator you built, multiply the divisor by this estimate, then subtract this new value from the subset of your dividend that has same length as your divisor. If this result is less than your divisor, "bring down" the next digit and repeat the process for the next quotient digit. If the result is >= your divisor, repeat the process using this left over and add the value to the current quotient digit. Repeat the process until you are out of digits, your quotient should be complete, and the final "leftover result" will be your remainder.

//code flag used for fixed width font only

1| 2469| 9876 //first estimate i.e. 1==12/(9+1)
|+ 30|+ 122 //second estimate
| |+ 1 //third estimate
(1| 2499| 9999) //final total
____________________________
9|8765|4321 \ 12| 3456| 7899| 8765| 4321
- 9| 8765| 4321| //12/(9+1)==1, estimate by 1*9|8765|4321
================
2| 4691| 3578| //result<divisor
================
2| 4691| 3578| 8765| //so bring down
- 2| 4385| 1851| 8549| //24691/10==2469, estimate by 2469*9|8765|4321
======================
0| 0306| 1727| 0216| //result>=divisor, so do a second estimate and add to quotient
- 296| 2962| 9630| //306/10==30, estimate by 30*9|8765|4321
======================
9| 8764| 0586| //result<divisor
===================
9| 8764| 0586| 4321 //so bring down
- 9| 7540| 7407| 4196 //98764/10==9876, estimate by 9876*9|8765|4321
========================
0| 1223| 3179| 0125 //result>=divisor, so do a second estimate and add to quotient
- 1204| 9382| 7162 //1223/10==122, estimate by 122*9|8765|4321
========================
18| 3796| 2963 //result>=divisor, so do a third estimate and add to quotient
- 9| 8765| 8642 //18/10==1, estimate by 1*9|8765|4321
========================
8| 5029| 8642 //result<divisor, no digits left, so this is the remainder
jefranki at 2007-11-9 0:36:40 >
# 6 Re: very large numbers
Check out this library ( http://www.shoup.net/ntl/).
KevinHall at 2007-11-9 0:37:36 >
# 7 Re: very large numbers
Hi,

I've done a lot of work developing and using arbitrary precision packages.

Here is a link to a previous post pertaining to this topic.

http://www.dev-archive.com/forum/showthread.php?threadid=246419

Do a search of the forum for 'arbitrary precision' from dude_1967.

It is really a lot of work to set up a goog multiple precision package. You might try to use a library. I prefer my own packages which unfortunately are not yet available as freeware. As far as commercial packages are concerned: Mathematica is the front runner in my opinion.

Look at the other posts.

Good luck.

Sincerely, Chris.

:)
dude_1967 at 2007-11-9 0:38:40 >
# 8 Re: very large numbers
And another note...

The best algorithm for computing the factorial of huge numbers is called the method of binary splitting. With binary splitting huge numbers are broken down and adjacent pairs are multiplied first, successively increasing the precision with following stages.

Sincerely, Chris.

:)

P.S. It might take a while to study up on all this stuff.
dude_1967 at 2007-11-9 0:39:41 >
# 9 Re: very large numbers
Well, for huge numbers, you should use GMP available from the GNU website. It has the best available algorithms and comes with fine-tuned assembler primitives for many platforms. If you want to have access to more advanced functions (different factoring methods included), then have a look at pari/gp.

Mathematica is very good for symbolic calculations but quite slow (comparatively) for computations on big integers.

For GMP, see their homepage (http://www.swox.com/gmp/) and for the description of their algorithms see the manual (http://www.swox.com/gmp/gmp-man-4.1.2.pdf)
Yves M at 2007-11-9 0:40:38 >
# 10 Re: very large numbers
thanks for the help, subtractions works(kind of) here is current code:

// when completed, will have { + , - , * , / , > , < , == , <= , >= , %}

//=================INCLUDES============
#include <stdlib.h>
#include <iostream.h>
#include <string.h>
//====================================

//=================CONSTANTS===========
#define sz 10000
//====================================

//=================CLASSES=============
class verylong
{
private:
char vlstr[sz];// string that holds the individual digits
int vlen; // length of verylong string
verylong multdigit(int);
verylong mult10(verylong);
public:
verylong()
{
vlstr[0] = '\0';
vlen = 0;
}
verylong(char s[sz])
{
strcpy(vlstr,s);
vlen = strlen(s);
}
verylong(unsigned long n)
{
ltoa(n,vlstr,10);
strrev(vlstr);
vlen = strlen(vlstr);
}
void putvl(); // displays verylong
void getvl(); // input verylong from user
void borrow(int, verylong); // borrow for subtraction
verylong operator + (verylong); // adding verylongs
verylong operator * (verylong); // multiply verylongs
verylong operator - (verylong); // subtract verylongs
verylong operator / (verylong); // devide verylongs
int compare (verylong); // compares verylongs to see which is greater
};
//====================================

void verylong::putvl()
{
char temp[sz];
strcpy(temp,vlstr);
cout << strrev(temp);
}

void verylong::getvl()
{
cin >> vlstr;
vlen = strlen(vlstr);
strrev(vlstr);
}

verylong verylong::operator + (verylong v)
{
char temp[sz];
int maxlen = (vlen > v.vlen) ? vlen : v.vlen;
int carry = 0;
int j;
for (j = 0;j < maxlen;j++)
{
int d1 = (j > vlen - 1) ? 0 : vlstr[j] - '0';
int d2 = (j > v.vlen - 1) ? 0 : v.vlstr[j] - '0';
int digitsum = d1 + d2 + carry;
if (digitsum >= 10)
{
digitsum -= 10;
carry = 1;
}
else
carry = 0;
temp[j] = digitsum + '0';
}
if (carry == 1)
{
temp[j++] = '1';
}
temp[j] = '\0';
return verylong(temp);
}

verylong verylong::operator - (verylong v)
{
char temp[sz];
int maxlen = (vlen > v.vlen) ? vlen : v.vlen;
int j;
for (j = 0;j < maxlen;j++)
{
int d1 = (j > vlen - 1) ? 0 : vlstr[j] - '0';
int d2 = (j > v.vlen - 1) ? 0 : v.vlstr[j] - '0';
if (d1 - d2 < 0)
{
borrow(j, v);
}
int difference = (d1 - d2) + '0';
temp[j] = difference;
}
temp[j] = '\0';
return verylong(temp);
}

verylong verylong::operator * (verylong v)
{
verylong pprod;
verylong tempsum;
for (int j = 0;j < v.vlen;j++)
{
int digit = v.vlstr[j] - '0';
pprod = multdigit(digit);
for (int k = 0;k < j;k++)
{
pprod = mult10(pprod);
}
tempsum = tempsum + pprod;
}
return tempsum;
}

verylong verylong::mult10(verylong v)
{
char temp[sz];
for (int j = v.vlen - 1;j >= 0;j--)
{
temp[j + 1] = v.vlstr[j];
temp[0] = '0';
temp[v.vlen + 1] = '\0';
}
return verylong(temp);
}

verylong verylong::multdigit(int d2)
{
char temp[sz];
int carry = 0;
int j;
for (j = 0;j < vlen;j++)
{
int d1 = vlstr[j] - '0';
int digitprod = d1 * d2;
digitprod += carry;
if (digitprod >= 10)
{
carry = digitprod / 10;
digitprod -= carry * 10;
}
else
carry = 0;
temp[j] = digitprod + '0';
}
if (carry != 0)
{
temp[j++] = carry + '0';
}
temp[j] = '\0';
return verylong(temp);
}

void verylong::borrow(int whatel, verylong v)
{
int d1;
int d2;
int x;// for compare();
d1 = vlstr[whatel] - '0';
d2 = vlstr[whatel + 1] - '0';
if (d2 > 0)
{
d1 += 10;
d2 -= 1;
}
else
{
borrow(whatel + 1, v);
}
vlstr[whatel] = d1 + '0';
vlstr[whatel + 1] = d2 + '0';
d1 = v.vlstr[whatel] - '0';
d2 = v.vlstr[whatel + 1] - '0';
if (d2 > 0)
{
d1 += 10;
d2 -= 1;
}
else
{
borrow(whatel + 1, v);
}
v.vlstr[whatel] = d1 + '0';
v.vlstr[whatel + 1] = d2 + '0';
d1 = vlstr[whatel] - '0';
d2 = vlstr[whatel + 1] - '0';
if (d2 > 0)
{
d1 += 10;
d2 -= 1;
}
else
{
borrow(whatel + 1, v);
}
vlstr[whatel] = d1 + '0';
vlstr[whatel + 1] = d2 + '0';
}

int verylong::compare(verylong v)// probably overload ==, <, >, <=, >= later
{
int maxlen = (vlen > v.vlen) ? vlen : v.vlen;
if (vlen > v.vlen)
{
return 1; // greater value is this class rather than object passed
}
if (vlen < v.vlen)
{
return 2; // greater value is the value passed
}
int y = maxlen;
if ((vlen == v.vlen) && y > 0)
{
int x = 0;
y = maxlen;
while (x == 0)
{
x = v.vlstr[y] - vlstr[y];
y++;
}
if (x < 0)
{
return 1;
}
if (x > 0)
{
return 2;
}
if (x == 0)
{
return 0;
}
}
}

i appreciate the libraries of large numbers, but as what i am doing requires me to write the library myself
ScreechPowers at 2007-11-9 0:41:43 >
# 11 Re: very large numbers
I've created a class for very long integers.
I didn't put much efforts for increasing the speed,
so I'm sure I can get better results.

the performance for multiplication of 500*500 digits is 1 sec
and for 1000*1000 digits is 4.2 sec.

system spec:
celeron 1.7, 512 MB (DDR 266), win XP pro.

is it considered as a reasonable performance?
Guysl at 2007-11-9 0:42:39 >
# 12 Re: very large numbers
Well, there is an easy test. Install pari/gp on your machine and test it with numbers of that range. I would say that 1 second for 500 digits is quite slow. But depending on what your goal is, it may be OK.
Yves M at 2007-11-9 0:43:43 >
# 13 Re: very large numbers
Your mulitplication scheme is O(n^2) -- you quadrupled the time by doubling the number of digits. You can do better -- O(n*log(n)*log(log(n))). Just factoring in the number of operations, you should be able to improve your time for 500 digit numbers by at least a factor of 14 and your calculations of 1000 digit numbers by at least a factor of 23.
KevinHall at 2007-11-9 0:44:50 >
# 14 Re: very large numbers
Thabk guys.

KevinHall,
I knew it would be ended with math skills, which unfortunately
I lack of... those 'O(n*log(n)...' are beyond my skills.
could you please explain yourself in more simple terms?

by the way, Yves, there is a table of performance example of pari that I don't quiet understand, could you tell me please where
falls a 500*500 digits in there?

http://www.math.u-psud.fr/~belabas/pari/doc/faq.html#largeint

I won't even try to compete the speed of a solution made by
mathematicians/algorithms-engeneers nor use assembler routines, I did it for practice.
On the other hand, I don't mind to improve it as long as I don't need to rewrite it from scratch.

Thanks in advance,
Guy
Guysl at 2007-11-9 0:45:52 >
# 15 Re: very large numbers
That table shows multiplication times for 2^n bits and you see n in the top row. So when n = 9, then 2^n = 512. Depending on what your "digits" mean, this may be it. If you are using a decimal representation, then 500 decimal digits would mean 10^500 which is approximately 1661 binary digits, so you should look between n=10 and n=11.

You don't really know on which computer the pari/gp results were computed, but in any case, they are way faster (by a factor of 500) than yours. If you understand a bit of maths, then look at the GMP manual which I linked to above, it contains the descriptions of algorithms which scale much better than the usual multiplication.
Yves M at 2007-11-9 0:46:52 >
# 16 Re: very large numbers
Guysl,

Statements such as "<some algorithm> scales as O(log(n))" represent how complex a certain algorithm is computationally -- thus it is given the term "complexity". (This is not how complex an algorithm is to understand ;) -- just how complex it is for a computer to crunch). Remember this one, because it is very important to the idea of algorithms -- and it works itself into many places in modern programming such as the choice of an STL containter.

OK, now for the explanation. Let's say we have three algorithms A, B, and C that all try to accomplish the same thing. And we have three data sets x(1 element), y(10 elements), and z(100 elements). For the sake of multiplying large numbers, these sets could represent the number of decimal digits. Now let's say that all three algorithms compute set x with one computation. But for set y, the algorithms are not equally efficient. A must make 100 computations, B does much better with only 10 compuations, and C does it in a speedy 2 computations. For data set z, A makes 10000 compuations, B 100 computations, and C 3 computations. Let's make a table to summarize:
| 1 | 10 | 100
-+--+---+---
A| 1 | 100 | 10000
B| 1 | 10 | 100
C| 1 | 2 | 3
You may notice a pattern. For A, the number of compuations is equal to the square of the number of elements. For B, it is equal to the number of elements. For C it is equal to 1 + log10(# of elements). Or in summary:
let n = the number of elements.

computations_A(n) = n^2
computations_B(n) = n
computations_C(n) = 1+log10(n)
This is where we get the complexity. A, has the complexity of O(n^2). B has the complexity of O(n) [also known as linear complexity], and C has the complexity of O(log(n)).

That explains the basics. Complexity is best used to tell how an algorithm responds to the growth of a data set and is actually very poor for making absolute statements about speed. I am about to explain why. If you want to stop reading here, feel free to.

Let's analyze a few other imaginary algorithms D,E,F, and G. And here is the equations for the number of compuations for each:
let n = the number of elements.

computations_D(n) = 5*n^2-4*n
computations_E(n) = 10*n+3*log(n)+5
computations_F(n) = 100*log10(n)+50000
computations_G(n) = 100000

How do we describe these? D and E have multiple n's in the equation. Well, we measure complexity by the worst set with n in it. So D has O(n^2) complexity. Why not O(5*n^2)? Well, without going further into detail, it just isn't. The coefficients just aren't listed. So E has O(n) complexity, F has O(log(n)), and G has constant complexity. Now which would you rather use for small numbers? The complexities would suggest that we use G and that D would be worst. However, since we have the equations for for the number of calculations, we would find that for very small numbers D is best, and for small to large numbers E is best. Only when n becomes very large does F become a preferable choice. And only when n is a behemoth is G the best. I didn't write this last part to confuse you. The coefficients don't swing things that much in every day algorithms. But it does reinforce my original statement. That complexity tells how an algorithm behaves when its data set grows -- not how well it actually performs.
KevinHall at 2007-11-9 0:47:44 >
# 17 Re: very large numbers
Yves thanks alot.

KevinHall, thank you for patience, I read your post later
when I get home.

...
read it - that was a good lesson, KevinHall. cheers.

Regards,
Guy
Guysl at 2007-11-9 0:48:53 >