Le Forum de l'Alliance Francophone

Nouvelles:

Auteur Sujet: NFS@Home  (Lu 94713 fois)

0 Membres et 3 Invités sur ce sujet

Hors ligne Carlos Pinho

  • Boinc'eur Junior
  • **
  • Messages: 101
Réponse #250 le: 27 June 2016 à 22:02
Hi,

Can I request support from you all to run lasieve5f application where we queued more than 1M wus to sieve 6,383-, 7,353+, 7,353-, and 2,1009+composites.

Thank you in advance,

Carlos
NFS@Home Moderator



Hors ligne Carlos Pinho

  • Boinc'eur Junior
  • **
  • Messages: 101
Réponse #251 le: 04 July 2016 à 12:39
Name: Need Fast Ships, Pirates Ahoy NFS@Home Challenge
Status:Upcoming
Project:NFS@Home
Issued by Anguillan Pirates
Start time: 2016-07-10 00:00 UTC
End time:2016-07-17 00:00 UTC

http://boincstats.com/en/stats/challenge/team/chat/799



Hors ligne Carlos Pinho

  • Boinc'eur Junior
  • **
  • Messages: 101
Réponse #252 le: 04 July 2016 à 16:31
More than 2M wus to process. Come and join the Pirates.



Hors ligne modesti

  • CàA
  • Boinc'eur devant l'éternel
  • *****
  • Messages: 19044
  •   
    • Brocantes en Bourbonnais
    • E-mail
Réponse #253 le: 04 July 2016 à 21:21
L'AF is already in ;)
Anyway, thanks a lot for the invitation :jap:


Viendez chez nous, cause qu'on est les meilleur(e)s :D


In memoriam Jip - In memoriam Cocagne


Hors ligne Carlos Pinho

  • Boinc'eur Junior
  • **
  • Messages: 101
Réponse #254 le: 08 July 2016 à 16:46
We have a lot of wus from all three applications ready to be processed.
Have fun on the challenge.



Hors ligne Carlos Pinho

  • Boinc'eur Junior
  • **
  • Messages: 101
Réponse #255 le: 01 October 2016 à 20:47
We have a set of 750k 16e wus (lasieve5f application) to immediately start crushing and more 2M to come afterwards. Every help is appreciated.

Go to:

https://escatter11.fullerton.edu/nfs/prefs.php?subset=project

and set:

lasieved - app for RSALS subproject, uses less than 0.5 GB memory: no
lasievee - work nearly always available, uses up to 0.5 GB memory: no
lasievef - used for huge factorizations, uses up to 1 GB memory: no
lasieve5f - used for huge factorizations, uses up to 1 GB memory: yes

Kind Regards,

Carlos
NFS@Home Moderator



naz

  • Invité
Réponse #256 le: 03 October 2016 à 16:20
Hi Carlos,

It would be possible that you speaks to us about NFS! The utility of the results etc. And that they are the evolutions. As for example a 17th continuation Lattice Sieve, 18th Lattice Sieve, etc.

Best regard

Naz

ps : Sory for my bad English, I am French  :D
« Modifié: 03 October 2016 à 19:41 par naz »



Hors ligne Carlos Pinho

  • Boinc'eur Junior
  • **
  • Messages: 101
Réponse #257 le: 03 October 2016 à 21:31
7/10-14/10 Need Faster Ships, Pirates Ahoy part deux NFS@Home Challenge

http://boincstats.com/en/stats/challenge/team/chat/815

Name Need Faster Ships, Pirates Ahoy part deux NFS@Home Challenge
Status Upcoming
Project NFS@Home
Issued by Anguillan Pirates
Start time 2016-10-07 00:00 UTC
End time 2016-10-14 00:00 UTC
Late entrants allowed? Yes
Number of teams participating » 7
Number of users participating 0



Hors ligne Carlos Pinho

  • Boinc'eur Junior
  • **
  • Messages: 101
Réponse #258 le: 03 October 2016 à 21:36
Hi Naz,

The goal of NFS@Home is to factor large numbers using the Number Field Sieve algorithm. After setting up two polynomials and various parameters, the project participants "sieve" the polynomials to find values of two variables, called "relations," such that the values of both polynomials are completely factored. Each workunit finds a small number of relations and returns them. Once these are returned, the relations are combined together into one large file then start the "postprocessing." The postprocessing involves combining primes from the relations to eliminate as many as possible, constructing a matrix from those remaining, solving this matrix, then performing square roots of the products of the relations indicated by the solutions to the matrix. The end result is the factors of the number.

NFS@Home is interested in the continued development of open source, publicly available tools for large integer factorization. Over the past couple of years, the capability of open source tools, in particular the lattice sieve of the GGNFS suite and the program msieve, have dramatically improved.

Integer factorization is interesting both mathematical and practical perspectives. Mathematically, for instance, the calculation of multiplicative functions in number theory for a particular number require the factors of the number. Likewise, the integer factorization of particular numbers can aid in the proof that an associated number is prime. Practically, many public key algorithms, including the RSA algorithm, rely on the fact that the publicly available modulus cannot be factored. If it is factored, the private key can be easily calculated. Until quite recently, RSA-512, which uses a 512-bit modulus (155 digits), was used. As recently demonstrated by factoring the Texas Instruments calculator keys, these are no longer secure.

NFS@Home BOINC project makes it easy for the public to participate in state-of-the-art factorizations. The project interests is to see how far we can push the envelope and perhaps become competitive with the larger university projects running on clusters, and perhaps even collaborating on a really large factorization.

The numbers are chosen from the Cunningham project. The project is named after Allan Joseph Champneys Cunningham, who published the first version of the factor tables together with Herbert J. Woodall in 1925. This project is one of the oldest continuously ongoing projects in computational number theory, and is currently maintained by Sam Wagstaff at Purdue University.

Concerning target size there are four three sievers applications available:

lasieved - app for RSALS subproject, uses less than 0.5 GB memory: yes or no
lasievee - work nearly always available, uses up to 0.5 GB memory: yes or no
lasievef - used for huge factorizations, uses up to 1 GB memory: yes or no
lasieve5f - used for huge factorizations, uses up to 1 GB memory: yes or no

How's the credit distributed per target wu?

lasieved - 36
lasievee - 44
lasievef - 130 (not anymore)
lasieve5f - 130

Why the difference in credits?

The more valuable calculation gets more credit. Especially for 16e (lasievef+lasieve5f), the extra credit also awards for the large memory usage.

What project uses what application?

lasieved - Oddperfect, n^n+(n+1)^(n+1), Fibonacci, Lucas, Cunningham, Cullen and Woodall for SNFS difficulty below 250.
lasievee - Cunningham, Oddperfect, Fibonacci, Lucas or other for SNFS difficulty above 250 to ~280.
lasievef - push the state of art for very difficulty factorizations, above SNFS difficulty of 280
lasieve5f - push the state of art for very difficulty factorizations, above SNFS difficulty of 280

The limits depends upon the boundaries chosen for the poly and characteristics of the number being factored. It's advanced math related.

For a (much) more technical description of the NFS, see:
- Wikipedia article (http://en.wikipedia.org/wiki/Number_field_sieve)
- Briggs' Master's thesis (http://scholar.lib.vt.edu/theses/available/etd-32298-93111/unrestricted/etd.pdf)
« Modifié: 03 October 2016 à 21:38 par Carlos Pinho »



Hors ligne Carlos Pinho

  • Boinc'eur Junior
  • **
  • Messages: 101
Réponse #259 le: 03 October 2016 à 21:37
Factoring an integer using NFS - from msieve readme.nfs file

Factoring an integer using NFS has 3 main steps:

1. Select Polynomial
2. Collect Relations via Sieving (NFS@Home is dedicated to this step)
3. Combine Relations


1. Polynomial Selection


Step 1 of NFS involves choosing a polynomial-pair (customarily shortened to 'a polynomial') to use in the other NFS phases. The polynomial is completely specific to the number you need factored, and there is an  effectively infinite supply of polynomials that will work. The quality of the polynomial you select has a dramatic effect on the sieving time; a *good* polynomial can make the sieving proceed two or three times faster compared to an average polynomial. So you really want a *good* polynomial, and for large problems should be prepared to spend a fair amount of time looking for one.

Just how long is too long, and exactly how you should look for good polynomials, is currently an active research area. The approximate consensus is that you should spend maybe 3-5% of the anticipated sieving time looking for a good polynomial.

We measure the goodness of a polynomial primarily by its Murphy E score; this is the probability, averaged across all the  possible relations we could encounter during the sieving, that an 'average' relation will be useful for us. This is usually a very small number, and the E score to expect goes down as the number to be factored becomes larger. A larger E score is better.

Besides the E score, the other customary measure of polynomial goodness is the 'alpha score', an approximate measure of how much of an average relation is easily 'divided out' by dividing by small primes. The E score computation requires that we know the approximate alpha value, but alpha is also of independent interest. Good alpha values are negative, and negative alpha with large aboslute value is better. Both E and alpha were first  formalized in Murphy's wonderful dissertation on NFS polynomial selection.

With that in mind, here's an example polynomial for a 100-digit input of no great significance:

R0: -2000270008852372562401653
R1:  67637130392687
A0: -315744766385259600878935362160
A1:  76498885560536911440526
A2:  19154618876851185
A3: -953396814
A4:  180
skew 7872388.07, size 9.334881e-014, alpha -5.410475, combined = 1.161232e-008

As mentioned, this 'polynomial' is actually a pair of polynomials, the Rational polynomial R1 * x + R0 and the 4-th degree Algebraic polynomial

   A4 * x^4 + A3 * x^3 + A2 * x^2 + A1 * x + A0

The algebraic polynomial is of degree 4, 5, or 6 depending on the size of the input. The 'combined' score is the Murphy E value for this polynomial, and is pretty good in this case. The other thing to note about this polynomial-pair is that the leading algebraic coefficient is very small, and each other coefficient looks like it's a fixed factor larger than the next higher- degree coefficient. That's because the algebraic polynomial expects the sieving region to be 'skewed' by a factor equal to the reported skew above.
The polynomial selection determined that the 'average size' of relations drawn from the sieving region is smallest when the region is 'short and wide' by a factor given by the skew. The big advantage to skewing the  polynomial is that it allows the low-order algebraic coefficients to be large, which in turn allows choosing them to optimize the alpha value. The modern algorithms for selecting NFS polynomials are optimized to work when the skew is very large.

NFS polynomial selection is divided into two stages. Stage 1 chooses the leading algebraic coefficient and tries to find the two rational polynomial coefficients so that the top three algebraic coefficients are small. Because stage 1 doesn't try to find the entire algebraic polynomial, it can use powerful sieving techniques to speed up this portion of the search. When stage 1 finds a 'hit', composed of the rational and the leading algebraic polynomial coefficient, Stage 2 then finds the complete polynomial pair and tries to optimize both the alpha and E values. A single stage 1 hit can generate many complete polynomials in stage 2. You can think of stage 1 as a very compute-intensive net that tries to drag in something good, and stage 2 as a shorter but still compute-intensive process that tries to polish things.

2. Sieving for Relations

The sieving step is not the theoretically most complex part of the algorithm of factorization, but it is the most time consuming part because it iterates over a large domain with some expensive calculations like division and modulo, although some of these can be avoided by using logarithms.
In general optimization of the sieving step will give the biggest reduction in actual running time of the algorithm. It is easy to use a large amount of memory in this step, and one should be aware of this and try to reuse arrays and use the smallest possible data types. The factor bases can for record factorizations contain millions of elements, so one should try to obtain the best on-disk/in-memory tradeoff.

The purpose of the sieving step is to find usable relations, i.e. elements (a, b) with the following properties
• gcd(a, b) = 1
• a + bm is smooth over the rational factor base
• b^deg(f)*f(a/b) is smooth over the algebraic factor base

Finding elements with these properties can be done by various sieving methods like the classical line sieving or the faster lattice sieving, the latter being used at NFS@Home.

The lattice sieving was proposed by John Pollard in "Lattice sieving, Lecture Notes in Mathematics 1554 (1991), 43–49.". The factor bases are split into smaller sets and then the elements which are divisible by a large prime q are sieved. The sizes of the factor bases have to be determined empirically, and are dependent on the precision of the sieving code, if all smooth elements are found or if one skips some by using special-q methods.

One advantage the lattice siever has is the following. The yield rate for the line siever decreases over time because the norms get bigger as the sieve region moves away from the origin. The lattice siever brings the sieve region "back to the origin" when special-q's are changed. This might be its biggest advantage (if there is one).

3. Combine Relations

The last phase of NFS factorization is a group of tasks collectively referred to as 'NFS postprocessing'. You need the factor base file described in the sieving section (only the polynomial is needed, not the actual factor base entries), and all of the relations from the sieving. If you have performed sieving in multiple steps or on multiple machines, all of the relations that have been produced need to be combined into a single giant file. And by giant I mean *really big*; the largest NFS jobs that I know about currently have involved relation files up to 100GB in size.
Even a fairly small 100-digit factorization generates perhaps 500MB of disk files, so you are well advised to allow plenty of space for relations. Don't like having to deal with piling together thousands of files into one? Sorry, but disk space is cheap now.

With the factor base and relation data file available, is is time to perform NFS postprocessing.However, for larger jobs or for any job where data has to be moved from machine to machine, it is probably necessary to divide the postprocessing into its three fundamental tasks. These are described below:

NFS Filtering
-------------

The first phase of NFS postprocessing is the filtering step. This analyzes the input relation file, sets up the rest of the filtering to ignore relations  that will not be useful (usually 90% of them or more), and produces  a 'cycle file' that describes the huge matrix to be used in the next  postprocessing stage.

To do that, every relation is assigned a unique number, corresponding to its line number in the relation file. Relations are numbered starting from zero, and part of the filtering also adds 'free relations' to the dataset. Free relations are so-called because it does not require any sieving to find them; these are a unique feature of the number field sieve, although there will never be very many of them. Filtering is a very complex process. If you do not have enough relations for filtering to succeed, no output  is produced other than complaints to that effect. If there are 'enough'  relations for filtering to succeed, the result is a 'cycle file'.

How many relations is 'enough'? This is unfortunately another hard question, and answering it requires either compiling large tables of factorizations of similar size numbers, running the filtering over and over again, or  performing simulations after a little test-sieving. There's no harm in  finding more relations than you strictly need for filtering to work at  all, although if you mess up and find twice as many relations as you need  then getting the filtering to work can also be difficult. In general the  filtering works better if you give it somewhat more relations than it stricly needs, maybe 10% more. As more and more relations are added, the size of the generated matrix becomes smaller and smaller, partly because the filtering can throw away more and more relations to keep only the 'best' ones.

NFS Linear Algebra
------------------

The linear algebra step constructs the matrix that was generated from the filtering, and finds a group of vectors that lie in the nullspace of that matrix. Finding nullspace vectors for a really big matrix is an enormous amount of work. To do the job, Msieve uses the block Lanczos algorithm with a large number of performance optimizations. Even with fast code like that, solving the matrix can take anywhere from a few minutes (factoring a 100-digit input leads to a matrix of size ~200000) to several months  (using the special number field sieve on 280-digit numbers from the  Cunningham Project usually leads to matrices of size ~18 million). Even worse, the answer has to be *exactly* correct all the way through; there's no throwing away intermediate results that are bad, like the other NFS phases can do. So solving a big matrix is a severe test of both your computer and your patience.

Multithreaded Linear Algebra
----------------------------

The linear algebra is fully multithread aware. Note that the solver is primarily memory bound, and using as many threads as you have cores on your multicore processor will probably not give the best performance. The best number of threads to use depends on the underlying machine; more recent processors have much more powerful memory controllers and can continue speeding up as more and more threads are used. A good rule of thumb to start off is to try two threads for each physical package on your motherboard; even if it's not the fastest choice, just two or four threads gets the vast majority of the potential speedup for the vast majority of machines.

Finally, note that the matrix solver is a 'tightly parallel' computation, which means if you give it four threads then the machine those four threads run on must be mostly idle otherwise. The linear algebra will soak up most of the memory bandwidth your machine has, so if you divert any of it away to something else then the completion time for the linear algebra will suffer.

As for memory use, solving the matrix for a 512-bit input is going to  require around 2GB of memory in the solver, and a fast modern processor running  the solver with four threads will need about 36 hours. A slow, less modern processor that is busy with other stuff could take up to a week!

NFS Square Root
---------------

With the solution file from the linear algebra in hand, the last phase of  NFS postprocessing is the square root.
'an NFS square root' is actually two square roots, an easy one over the integers and a very complex one over the algebraic number field described by the NFS polynomial we selected. Traditionally, the best algorithm for the algebraic part of the NFS square root is the one described by Montgomery and Nguyen, but that takes some quite sophisticated algebraic number theory smarts.
Every solution generated by the linear algebra is called 'a dependency', because it is a linearly dependent vector for the matrix we built.
The square root in Msieve proceeds one dependency at a time; it requires all the relations from the data file, the cycle file from the filtering, and the dependency file from the linear algebra. Technically the square root can be speed up if you process multiple dependencies in parallel, but doing one at a time makes it possible to adapt the precision of the numbers used to save a great deal of memory toward the end of each dependency.
Each dependency has a 50% chance of finding a nontrivial factorization of the input.

msieve is the client used for post-processing phase



Hors ligne Carlos Pinho

  • Boinc'eur Junior
  • **
  • Messages: 101
Réponse #260 le: 03 October 2016 à 21:40
The Cunningham factorizations are published here:

http://homes.cerias.purdue.edu/~ssw/cun/

More precisely at  http://homes.cerias.purdue.edu/~ssw/cun/page132.

Reserved integers for NFS@Home grid at: http://homes.cerias.purdue.edu/~ssw/cun/who



En ligne JeromeC

  • CàA
  • Boinc'eur devant l'éternel
  • *****
  • Messages: 31064
  •   
Réponse #261 le: 03 October 2016 à 23:59
Ouch :eek: y'a de la traduction dans l'air !! mais ça va faire mal !!  :bouh:

A quoi bon prendre la vie au sérieux, puisque de toute façon nous n’en sortirons pas vivants ? (Alphonse Allais)



naz

  • Invité
Réponse #262 le: 04 October 2016 à 13:09
thank you for taking the time to all these details Carlos  :jap:
So now...  :hyperbon: :hyperbon: :hyperbon: :)
« Modifié: 04 October 2016 à 17:05 par naz »



Hors ligne Carlos Pinho

  • Boinc'eur Junior
  • **
  • Messages: 101
Réponse #263 le: 05 October 2016 à 22:39
You guys can join even after the challenge, there's plenty of work to be done.



Hors ligne [AF>Libristes] nico8313

  • Boinc'eur devant l'éternel
  • *****
  • Messages: 8027
  •   
Réponse #264 le: 05 October 2016 à 23:07
Hello Carlos  :hello:

it's possible to create a GPU application?
and Thank you for the added work
best regards
« Modifié: 05 October 2016 à 23:11 par [AF>Libristes] nico8313 »



Hors ligne Carlos Pinho

  • Boinc'eur Junior
  • **
  • Messages: 101
Réponse #265 le: 06 October 2016 à 21:09
Hello Carlos  :hello:

it's possible to create a GPU application?
and Thank you for the added work
best regards

Hi nico,

For the moment there's no need to create a GPU application despite the fact that the Polynomial Selection stage uses GPU (Nvidia CUDA) capability for the GNFS numbers.
There was a discussion about this at mersenneforum.org but stayed in standby. It's not high priority task at the moment.

Carlos



naz

  • Invité
Réponse #266 le: 08 October 2016 à 18:23
Hi Carlos,

Have you a graph of the same style as the yoyo even our progress?

http://stats.distributed.net/project/ogr_status.php?project_id=28



Hors ligne Carlos Pinho

  • Boinc'eur Junior
  • **
  • Messages: 101
Réponse #267 le: 09 October 2016 à 09:23
Hi Carlos,

Have you a graph of the same style as the yoyo even our progress?

http://stats.distributed.net/project/ogr_status.php?project_id=28

Hi Naz,

It's difficult to forecast the number of tasks for a lasieve5f factorisation therefore there won't be any progress bar on the webpage. This follows the fact that there are a lot of variables that influence the build of the matrix for the linear algebra phase, this one not done by the NFS users, it's done on clusters. If you follow the tasks sometimes we are processing one number and then after we are processing other number, and then the former one again which is pushed a little further. This behave has to do with the failure on building the matrix and hence the need for more relations from sieve.

Carlos



naz

  • Invité
Réponse #268 le: 09 October 2016 à 09:29
OK! I understand! Thanks  :jap:



Hors ligne Carlos Pinho

  • Boinc'eur Junior
  • **
  • Messages: 101
Réponse #269 le: 09 October 2016 à 11:59
Sometimes even on the small applications as lasieved and lasievee you need to add more work for a particular number after the initial number of tasks is per-determinated.
This case was real with me when I was trying to post-process a sieved lasieved integer. I managed to build the matrix but the LA on 4 cores would take 200 hours, adding more tasks to sieve we manage to decrease the LA to 80 hours.

All of this to tell you when we have a grant to use the cluster it's wise to get the smallest matrix so it can use less CPU time on the cluster so we can you use more often instead of using the grant in one single LA. For this case adding more tasks in steps until the matrix is optimal is the way to go and impossible to forecast how many tasks/wus are needed.
Hope this helps answering your questions. Any queries please let me know.

BTW, I hope to see you all soon crunching some of the lasieve5f tasks.



naz

  • Invité
Réponse #270 le: 09 October 2016 à 12:58
Je ne reçois que des taches 14,15 et 16eme lattice. Pas de lasieve5f :??:



Hors ligne Xe120

  • CàA
  • Boinc'eur devant l'éternel
  • *****
  • Messages: 1524
  •   
    • E-mail
Réponse #271 le: 09 October 2016 à 13:18
lasieved = 14
lasievee = 15
lasieve5f = 16

C'est juste que le nom est différent.




naz

  • Invité
Réponse #272 le: 09 October 2016 à 14:27
Oups! Ok merci  :kookoo:



Hors ligne Carlos Pinho

  • Boinc'eur Junior
  • **
  • Messages: 101
Réponse #273 le: 27 October 2016 à 13:19
If you are looking for 14e badge please serve yourself because we would like to reach 1000 14e (lasieved) factorizations by the end of the year. The ''Now Sieving'' should be completed as well as the ones in ''Queued''. I am processing C187_121_102 which is to be completed within 9 hours. Then I have reserved P223 (284688...)+1 which I don't know if I will be able to build a matrix because last time I tried to process it it was consuming more than 14GB of memory on my laptop!!!

Overall 300k 14e (lasieved) needs to be sieved by NFS@Home and then all of them post-processed in the back stage to achieve 1000 factorizations.

Carlos



En ligne JeromeC

  • CàA
  • Boinc'eur devant l'éternel
  • *****
  • Messages: 31064
  •   
Réponse #274 le: 27 October 2016 à 17:22
:gno: :gni: :gnu:

A quoi bon prendre la vie au sérieux, puisque de toute façon nous n’en sortirons pas vivants ? (Alphonse Allais)