BOINC - golemfactory/golem-rd GitHub Wiki

BOINC is software for volunteer computing. Golem can be described, in short, as BOINC with a market. Project site.

Interesting information

From Boincintro

If you have an existing application, figure on about three man-months to create the project: one month of an experienced sys admin, one month of a programmer, and one month of a web developer (these are very rough estimates). Once the project is running, budget a 50% FTE (mostly system admin) to maintain it. In terms of hardware, you'll need a mid-range server computer (e.g. Dell PowerEdge) with plenty of memory and disk. Budget about $5,000 for this. You'll also need a fast connection to the commercial Internet (T1 or faster).

Cost comparison

Suppose you need a lot of computing power - say, 100 TeraFLOPS for 1 year. Here are some ways you can get it:

  • Use Amazon's Elastic Computing Cloud: 75 Million. Based on .10 per node/hour.
  • Build a cluster: 2.4 Million. This includes power and air-conditioning infrastructure, network hardware, computing hardware, storage, electricity, and sysadmin personnel.
  • Use BOINC: 25,000. Based on the average throughput and budget of the 6 largest volunteer computing projects.

How to estimate number of floating point operations

The computational size parameters (rsc_fpops_est, rsc_fpops_bound) are expressed as a number of floating-point operations. For example, suppose a job J takes 1 hour to complete on a machine with a Whetstone benchmark of 1 GFLOPS; then the "size" of J is 3.6e12 FLOPs (3,600 seconds times 1e9 operations per second).

To get an initial estimate of job size, run several typical jobs on your own computer, see how long they take, and multiply by the Whetstone score of the computer (to find this, run BOINC on the computer and look at the event log).
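The estimate described above is runtime multiplied by benchmark speed. A minimal sketch of that arithmetic follows; the function name `estimate_fpops` and the choice to average several runs are illustrative assumptions, not part of BOINC.

```python
from statistics import mean


def estimate_fpops(runtimes_seconds, whetstone_flops):
    """Estimate job size in floating-point operations.

    runtimes_seconds: wall-clock times of a few typical jobs (assumed input).
    whetstone_flops:  Whetstone score of the test machine, in FLOPS.
    """
    return mean(runtimes_seconds) * whetstone_flops


# The example from the text: 1 hour on a 1 GFLOPS machine.
size = estimate_fpops([3600.0], 1e9)
print(f"{size:.1e}")  # 3.6e+12
```

The result is a starting value for rsc_fpops_est; how much headroom to allow in rsc_fpops_bound is a separate decision that this sketch does not cover.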

Creating applications

BOINC Wrapper

An existing application (or sequence of applications) can be run under BOINC using a wrapper program supplied by BOINC. The wrapper runs the applications as subprocesses, and handles all communication with the BOINC client (e.g., to report CPU time and fraction done).

There are also special wrappers for applications that run in virtual machines and for Rappture applications.

Native BOINC applications

With some minor source code modifications, you can run an application directly without need for the wrapper. The changes are:

  • Add calls to BOINC initialization and finalization routines.
  • Precede each fopen() call with a BOINC function that maps logical to physical names.
  • Link it with the BOINC runtime library.

The BOINC runtime library is implemented in C++ and is easiest to use from C/C++ programs. However, it also has a FORTRAN binding, and it is possible to run Java and Lisp programs under BOINC as well.

PyBOINC

PyBOINC is a wrapper and a set of predefined libraries that lets you package a Python program as a BOINC application.

The PyBOINC interpreter also includes a custom module for interfacing with the BOINC client API under the namespace "boinc". Most common BOINC API functions have been implemented.

To run Python scripts as work units, users can also use the PyMW framework.

Job submission

Job = executable + files. Jobs for an application are submitted to the BOINC server. They may be submitted locally, through web pages, or remotely. Specific XML templates describe the jobs and their output files.

BOINC vocabulary vs. Golem vocabulary

  • subtask = job
  • subtask_state = result
  • comptaskdef = workunit
  • task = application
  • collector = assimilator

Security issues mentioned in BOINC security

  • Result falsification. Attackers return incorrect results.
  • Credit falsification. Attackers return results claiming more CPU time than was actually used.
  • Malicious executable distribution. Attackers break into a BOINC server and, by modifying the database and files, attempt to distribute their own executable (e.g. a virus program) disguised as a BOINC application.
  • Overrun of data server. Attackers repeatedly send large files to BOINC data servers, filling up their disks and rendering them unusable.
  • Theft of participant account information by server attack. Attackers break into a BOINC server and steal email addresses and other account information.
  • Theft of participant account information by network attack. Attackers exploit the BOINC network protocols to steal account information.
  • Theft of project files. Attackers steal input and/or output files.
  • Intentional abuse of participant hosts by projects. A project intentionally releases an application that abuses participant hosts, e.g. by stealing sensitive information stored in files.
  • Accidental abuse of participant hosts by projects. A project releases an application that unintentionally abuses participant hosts, e.g. deleting files or causing crashes.

Verification

Each job has a redundancy level (the number of replicas sent out) and a number of results that must be the same to reach consensus. Results that arrive after consensus are still verified, but they cannot change the consensus.
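The consensus rule can be sketched as a vote among replicated results. This is a simplified illustration and not BOINC's validator code: `find_consensus`, `check_late_result`, and the `quorum` parameter are hypothetical names, and results are assumed to be directly comparable values.

```python
from collections import Counter


def find_consensus(results, quorum):
    """Return the value reported by at least `quorum` results, or None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count >= quorum else None


def check_late_result(result, consensus):
    """Verify a result that arrived after consensus against the agreed value."""
    return result == consensus


consensus = find_consensus(["a", "a", "b"], quorum=2)
print(consensus)                          # a
print(find_consensus(["a", "b"], 2))      # None
print(check_late_result("b", consensus))  # False
```

With a quorum of 2, two matching results out of three are enough; a later mismatching result is simply marked invalid.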

Basic validators

There are three basic validators:

  • trivial - job is valid if output files are present
  • substr - searches for specific strings in stderr
  • bitwise - compares replicated results byte by byte.
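The three checks can be illustrated with a few lines of Python. These are hypothetical stand-ins written for this page, not the validators shipped with BOINC. In particular, the list above does not say whether a string found in stderr marks a result valid or invalid, so the `reject_if_present` switch here is an assumption that covers both readings.

```python
import tempfile
from pathlib import Path


def trivial_validator(output_paths):
    """Valid if every expected output file is present."""
    return all(Path(p).is_file() for p in output_paths)


def substr_validator(stderr_text, needle, reject_if_present=False):
    """Search stderr for `needle`; the flag decides whether a hit means invalid."""
    found = needle in stderr_text
    return not found if reject_if_present else found


def bitwise_validator(path_a, path_b):
    """Valid if two replicated outputs are byte-for-byte identical."""
    return Path(path_a).read_bytes() == Path(path_b).read_bytes()


# Demo on throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    out_a, out_b, out_c = (Path(tmp) / n for n in ("a.out", "b.out", "c.out"))
    out_a.write_bytes(b"result 42\n")
    out_b.write_bytes(b"result 42\n")
    out_c.write_bytes(b"result 43\n")
    present = trivial_validator([out_a, out_b, out_c])
    missing = trivial_validator([Path(tmp) / "missing.out"])
    same = bitwise_validator(out_a, out_b)
    different = bitwise_validator(out_a, out_c)

print(present, missing, same, different)  # True False True False
```

A bitwise comparison only works when replicas are expected to produce identical bytes, which is what motivates homogeneous redundancy.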

Homogeneous redundancy

BOINC has a feature called homogeneous redundancy (HR). It was created to deal with the fact that applications can produce different outcomes for a given workunit depending on the machine architecture, operating system, compiler, and compiler flags. HR divides hosts into "numerical equivalence classes". Replicas of a given subtask are sent only to hosts from the same class. There are 3 HR types: no HR, divide by OS and CPU architecture (Intel, PPC, ARM), or divide by OS and CPU classification (Celeron, Pentium, etc.).
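The grouping into equivalence classes can be sketched as follows. The `Host` fields and the HR type labels "none", "os_arch", and "os_model" are names invented for this illustration; BOINC's own configuration values and host classification logic are not reproduced here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    os: str
    cpu_arch: str   # e.g. "Intel", "PPC", "ARM"
    cpu_model: str  # e.g. "Celeron", "Pentium"


def hr_class(host, hr_type):
    """Return the equivalence-class key of a host for the given HR type."""
    if hr_type == "none":
        return ()  # a single class that contains every host
    if hr_type == "os_arch":
        return (host.os, host.cpu_arch)
    if hr_type == "os_model":
        return (host.os, host.cpu_model)
    raise ValueError(f"unknown HR type: {hr_type}")


def group_hosts(hosts, hr_type):
    """Map each equivalence class to the names of the hosts in it."""
    classes = defaultdict(list)
    for host in hosts:
        classes[hr_class(host, hr_type)].append(host.name)
    return dict(classes)


hosts = [
    Host("h1", "Linux", "Intel", "Celeron"),
    Host("h2", "Linux", "Intel", "Pentium"),
    Host("h3", "Windows", "Intel", "Pentium"),
]
print(group_hosts(hosts, "os_arch"))
```

Under this scheme, all replicas of one subtask would be dispatched to hosts that share a single key, so their outputs can be compared directly.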

Communication

Client with server

Communication takes place via RPC messages.

Client with application

The BOINC client tells the application when to suspend, resume, and quit. The application tells the client its current CPU time and when it has checkpointed. An application run through the BOINC wrapper communicates with the wrapper using files. An XML file describes the application and the communication files it uses. There are files for stdin, stdout, stderr, checkpoint, and fraction_done.
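File-based progress reporting can be sketched as follows. The application side writes its progress to a file and the wrapper side reads it. The file layout (a single number between 0 and 1) and both function names are assumptions made for this illustration; consult the wrapper documentation for the actual format.

```python
import tempfile
from pathlib import Path


def report_fraction_done(path, done, total):
    """Application side: record progress as a fraction between 0 and 1."""
    Path(path).write_text(f"{done / total:.4f}\n")


def read_fraction_done(path):
    """Wrapper side: read the latest progress, or 0.0 if none is available."""
    try:
        return float(Path(path).read_text().strip())
    except (FileNotFoundError, ValueError):
        return 0.0


with tempfile.TemporaryDirectory() as tmp:
    progress_file = Path(tmp) / "fraction_done"
    before = read_fraction_done(progress_file)    # file not written yet
    report_fraction_done(progress_file, 25, 100)  # 25 of 100 steps finished
    after = read_fraction_done(progress_file)

print(before, after)  # 0.0 0.25
```

Communicating through files means the application does not need to link against the BOINC runtime library, which is the point of using the wrapper.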

Possible integration with Golem

See #20