notes.txt

problems
host 2191
    first rpc jan 22 2019
    GPU flops 4e16
    too-large CPU times?
        maybe because of existing (pre-SU) totals?
====================
Allocation
--------------
The way in which computing power is divided among projects and users.
"Hard allocation": one that is guaranteed, based on assumptions
about the volunteer pool.
--------------
goals: multi-objective
    honor hard allocations
    respect soft allocations
    respect user prefs
    maximize throughput
--------------
We can estimate the total throughput of the system,
   and we can break it down by resource type,
   and for a given set of keywords we can see how much is not vetoed.

Goal: someone comes to SU with an allocation request, which includes
- the available app version types
    (CPU, NVIDIA, AMD, platform, VM)
- the set of keywords
- # FLOPs
We can tell them whether this can be granted, and if so starting when

Parameters:
- what % of resources to use for allocations (50?)
- max % for a single allocation (10?)

Projects also get an unallocated share,
with a fraction determined by the popularity of their keywords

to process an allocation request:
select hosts that aren't ruled out by keywords
total their flops for eligible resources
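
The two steps above can be sketched as follows. This is only an illustration:
the Host fields, the representation of keywords as integer IDs, and the
resource-type names are assumptions, not the SU schema.

```python
from dataclasses import dataclass, field

@dataclass
class Host:
    # Hypothetical record: FLOPS per resource type, e.g. {"cpu": 1e10}
    flops: dict
    # Keyword IDs the host's owner has vetoed ("no" keywords)
    no_keywords: set = field(default_factory=set)

def eligible_flops(hosts, request_keywords, resource_types):
    """Total FLOPS of hosts not ruled out by keywords, over usable resources."""
    wanted = set(request_keywords)
    total = 0.0
    for h in hosts:
        if h.no_keywords & wanted:
            continue  # host is ruled out by a vetoed keyword
        total += sum(h.flops.get(r, 0.0) for r in resource_types)
    return total

hosts = [
    Host({"cpu": 1e10, "nvidia": 5e12}, {3}),   # vetoes keyword 3
    Host({"cpu": 2e10}),
    Host({"cpu": 1e10, "amd": 1e12}),
]
total = eligible_flops(hosts, request_keywords=[3, 7], resource_types=["cpu", "amd"])
```

Comparing this total (times the allocation percentage) with the requested
# FLOPs gives a rough answer to "can this be granted, and starting when".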
------------
SU serves as a scheduler at the level of project.
It keeps track of targets, allocations, balances, host assignments.
SU allocation is at the granularity of project.
If a project needs app- or user-level granularity it can either
    - run multiple BOINC projects, 1 per user/app
    - use BOINC's built-in allocation system

An "allocation" consists of
- an initial balance
- the rate of balance accrual
- the interval of balance accrual
where "balance" is in terms of FLOPs
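
A minimal sketch of such an allocation record. It assumes "rate" means the
FLOPs added once per accrual interval; the notes don't pin this down, and the
field names are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Allocation:
    initial_balance: float    # FLOPs
    accrual_rate: float       # FLOPs added per interval (assumed meaning)
    accrual_interval: float   # seconds between accruals

    def balance_at(self, elapsed, used=0.0):
        """Balance after `elapsed` seconds, given `used` FLOPs consumed."""
        intervals = int(elapsed // self.accrual_interval)
        return self.initial_balance + intervals * self.accrual_rate - used

# Example: 1e15 FLOPs up front, plus 1e14 FLOPs per day
alloc = Allocation(1e15, 1e14, 86400)
```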

There are two measures of work done:
1) REC, as reported directly by clients
2) credit, as reported by projects

2) is cheat-resistant, 1) is not
1) is up-to-date, 2) is not (long jobs, validation delay)

Approach:
use 1) for SU scheduling, but don't show it to volunteers
(eliminate incentive for cheating)

show 2) to volunteers.

?? what if SU assigns a host to a project, and the project doesn't have work?
	SU should work in a reasonable way for sporadic workloads
    BOINC client:
    if devices are idle, and scheduler RPC returns nothing
    do an AM RPC, but only every so often
    (min AM RPC interval?)
Need way for SU to check if project has work?
--------------
give projects more allocation if they supply computing?
    No.  nanoHub users are going to attach to nanoHub, not SU
--------------
========================================
projects
tacc
    http://129.114.6.131/tacctest/
    Biology
    United States
nanohub:
    https://devboinc.nanohub.org/boincserver/
    Nanotechnology
    United States
SETI@home
    Astronomy, SETI
    United States, University of California
BOINC Test Project
    http://boinc.berkeley.edu/test/
Einstein@home
    Astronomy, Physics
    Germany, United States
Rosetta
    Biology, Medicine
    United States, University of Washington

-----------------
categorize keywords as major or minor
    major: astro, bio, env, physics, math
        US, Europe, Asia
categorize projects as auto or on-demand
    auto: account creation initiated on SU account creation

====================
User Web site
functions:
    new user experience
    return user experience
    catalog of research projects
        where does this come from?
    news from projects
--------------
main page (index.php)
    If not logged in
        call to action; describe VC, science; pictures; safety
        big Join button, small login link

    if logged in
        if never got RPC from this user
            need to download; link
        if last RPC is old or client version is old
            advise to download; link
        if problem accounts, link to problem account page
        progress/accounting stuff
            show graph of recent EC
            "In the last 24 hours your computers have contributed
            CPU time/EC, GPU time/EC, #jobs success
            You are contributing to science projects doing x, x, x
            located in y, y, y"
        link to Account page
        Add new device
            Android:
            log in from device,

    menu bar (all pages)
        Your account (if logged in)
            projects
            science prefs
            computing prefs
            computing stats
            certificate?
        Computing
            projects
            leaderboards
        Community
            message boards
            teams
            profiles
            community prefs
            user search
            user of the day
        Site
            search
            languages
            about SU
            help
--------------------
Join page (su_join.php)
    create account info
        email
        screen name
        password
    basic prefs
    send cookie
    goes to...
--------------------
Download page (su_download.php)
    says you need to install BOINC
    Download link (direct)
        goes via concierge; sets cookies
        When user runs app, welcome screen says
            You are now running SU as user Fred
            No further action is needed.
            SU will run science jobs in the background.
            OK
    small print: if BOINC version >= 7.10 and VBox already installed:
        In BOINC Manager
            select Tools / Use account manager
            choose SU
            enter the email address and password of your SU account
    goes to...
--------------------
Getting started (su_intro.php)
    Welcome, thanks etc.
    add to your other devices too
    tell your family and friends
    Explain
        computing prefs
        keyword prefs
    join a team
--------------------
User account page (su_home.php)
    like BOINC home page:
    left:
        problem accounts
        graphs of recent work
        link to list of hosts
        links to prefs
            keywords
            computing prefs
            community prefs
    right:
        account info
        social functions
--------------------
host list (su_hosts.php)
    columns:
    name
    boinc version
    last RPC
    active projects
--------------------
keyword prefs edit page (su_prefs.php)
    show list of major keywords, yes/maybe/no radio buttons
    "more" button shows all keywords
    if user has prefs for non-major keywords, show all in that category
--------------------
global prefs edit page
    if min/med/max
        show radio buttons
        link to all prefs
--------------------
connect account page (su_connect.php)
    explain the problem
    show list of problem accounts; click for details
    details page:
        let the user enter password of existing account.
        OK: contacts project, tell user if fail
---------------------
Accounting pages (su_acct.php)
    Totals: tables and graphs of totals
    projects: table of projects (su_projects_acct.php; merge)
    project: graphs for 1 project
    users: list of top users
    user: graphs for 1 user
---------------------
news items
    would be good to show news items from project
    easiest way: each project exports RSS feed
    news items are tagged with keywords
    SU aggregates these
    if logged in, filter/order by keyword
    show combined stream to others

    Give projects more allocation if they supply news?
forums
    for discussion of projects, keywords?
==================
RPC handler

when are accounts created?
on initial registration:
    pick N best projects based on prefs
        (same logic as in RPC handler)
    start account creation for those projects
on RPC:
    pick N best projects
    if not registered with any of them
        return highest-score project we are registered with
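
A sketch of the on-RPC selection. The notes only say what to do when the host
is registered with none of the N best; returning the registered subset of the
top N otherwise is an assumption, as are the function and argument names.

```python
def choose_projects(scores, registered, n):
    """scores: {project: score}; registered: set of projects with accounts."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    usable = [p for p in ranked[:n] if p in registered]
    if usable:
        return usable  # assumed behavior: use whichever top-N we can
    # Not registered with any of the N best:
    # fall back to the highest-score project we are registered with.
    for p in ranked:
        if p in registered:
            return [p]
    return []

scores = {"A": 3.0, "B": 2.0, "C": 1.0}
```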
==================
client changes

represent user keyword sets as two vector<int>s

The following have keyword sets:
AM_ACCOUNT

Scheduler request: include

Eventually (to show keywords in GUI),
client needs to have master keyword list (keywords.xml)
    bundle with installer;
    refresh every 2 weeks;
    refresh if see unknown keyword ID
The following have keyword sets:
PROJECT
WORKUNIT

==================
server changes
workunit table has keywords as varchar 256
WU_RESULT has keywords as string
scheduler: maintain array of vector<int> corresponding to cache
    if look at job, keywords nonempty, vector empty: parse
    when remove from cache, clear vector
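
The lazy-parse idea, sketched in Python for brevity (the scheduler itself
would use vector<int>). The space-separated ID format of the keyword string
and the class name are assumptions.

```python
class KeywordCache:
    """Parsed keyword IDs, one list per job-cache slot."""

    def __init__(self, nslots):
        self.parsed = [[] for _ in range(nslots)]

    def keywords(self, slot, kw_string):
        # Parse only if the job has keywords and we haven't parsed them yet
        if kw_string and not self.parsed[slot]:
            self.parsed[slot] = [int(x) for x in kw_string.split()]
        return self.parsed[slot]

    def remove(self, slot):
        # Job leaves the cache: clear so the slot can be reused
        self.parsed[slot] = []

cache = KeywordCache(2)
```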

remote job submission:
batch can have keywords
each job in batch can have keywords

local job submission:
create_work takes kw ids
    (multi-job as well as single)
==================
keyword-based scheduling

submission: batch has keywords
copied to results (new DB field, 256 chars)

each volunteer reports yes/no keywords in sched request
	(AM can supply "project opaque" text, passed to projects)

score-based scheduling:
	"no" keyword match returns -1
	"yes" match adds 1

ensure non-starvation of jobs in cache
	add job age to score
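
A sketch of the scoring rule above. The age term's weight and units (days)
are placeholders; the notes only say that age is added to the score.

```python
def job_score(job_keywords, yes_keywords, no_keywords,
              age_days=0.0, age_weight=0.1):
    """Score a job against a volunteer's yes/no keyword sets."""
    job = set(job_keywords)
    if job & set(no_keywords):
        return -1.0                      # any "no" match rules the job out
    score = float(len(job & set(yes_keywords)))  # +1 per "yes" match
    return score + age_weight * age_days  # older jobs drift upward

s_veto = job_score([1, 2], yes_keywords=[2], no_keywords=[1])
s_old = job_score([1, 2, 3], yes_keywords=[1, 2], no_keywords=[9], age_days=5)
```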

Project keywords include a fraction flag
	(estimate fraction of work from that project with that keyword).
	100 means all work from project will have that keyword

If a user has a "no" keyword matching a project "all" keyword,
don't attach to that project.

Can compute:
	- the fraction of work from project that user can do
	- the fraction of work from project that user wants to do
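
A sketch of the attach rule, plus one quantity that follows directly from the
fraction flags: an upper bound on the fraction of work the user can do. An
exact figure would need to know how keywords overlap across jobs, which the
flags don't say. Function names are hypothetical.

```python
def can_attach(project_kw_percent, user_no):
    """project_kw_percent: {keyword_id: percent of work, 0..100}."""
    # A "no" keyword that covers all of the project's work rules it out
    return not any(project_kw_percent.get(k, 0) >= 100 for k in user_no)

def doable_fraction_upper_bound(project_kw_percent, user_no):
    # At least the largest vetoed share of work is off limits
    vetoed = max((project_kw_percent.get(k, 0) for k in user_no), default=0)
    return 1.0 - vetoed / 100.0

proj = {1: 100, 2: 30}   # keyword 1 on all work, keyword 2 on ~30%
```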
==================
Project plan
- web site for register and prefs

- admin web interface for editing projects, apps, keywords, quotas

- RPC stubs for account creation, setting prefs

- integrated client download
==================
docs
Overview
    SU targets a different population than the current BOINC user base:
        largely young (20s, 30s)
        ~half female
        primary motivation: science goals
        non-motivation: competition
            show user their own participation level over time
            no comparisons, leader boards
        science-oriented
        non-technical
            don't use words like FLOPS, CPU, or GPU
            OK: computer, job
    Simplicity
        hide information; TMI is a turn-off for target users
        hide features
    Graphical rather than textual display
    Keep the notion of "project" in the background.
    SU vs BOINC
        Ideally, we'd like an SU-branded version of BOINC.
        But until then, we explain that BOINC is software used by SU

instructions for alpha testers

Design doc
==================
testing and roll-out plan

- get scienceunited.berkeley.edu, su.berkeley.edu; get SSL working

- make 7.9 for win/mac;
    project list includes SU and boinc test

- announce SU to alpha testers; start testing

When SU is stable:
add SU to project list, non-test
fork/test/release 7.10

------------
project stuff

Github
    learn project interface

communication channels

---------------------------

project app version info

need to know whether a host
a) can do work for a project at all
b) can use its GPU(s) for the project

where to store project AV info?

digest projects.xml and for each project produce a list of triples
(platform, GPU type, vbox flag)

store these as JSON, XML, or in serialized form

function host_project_compatible() returns whether
    - host can do work for project at all
    - it can do work w/ vbox
    - it can do work using a GPU
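
One possible shape for host_project_compatible(), using the triples above.
The host-side arguments (platform set, GPU-type set, vbox flag) are
assumptions about what SU knows about a host.

```python
from typing import NamedTuple

class AppVersion(NamedTuple):
    platform: str    # e.g. "x86_64-pc-linux-gnu"
    gpu_type: str    # "" for CPU-only versions
    vbox: bool       # True if the version needs VirtualBox

def host_project_compatible(host_platforms, host_gpu_types, host_has_vbox,
                            app_versions):
    """Return (can_do_work, can_use_vbox, can_use_gpu) for one project."""
    can_do = can_vbox = can_gpu = False
    for av in app_versions:
        if av.platform not in host_platforms:
            continue
        if av.vbox and not host_has_vbox:
            continue
        if av.gpu_type and av.gpu_type not in host_gpu_types:
            continue
        can_do = True                 # at least one usable app version
        can_vbox = can_vbox or av.vbox
        can_gpu = can_gpu or bool(av.gpu_type)
    return can_do, can_vbox, can_gpu

avs = [
    AppVersion("windows_x86_64", "", False),
    AppVersion("x86_64-pc-linux-gnu", "nvidia", False),
    AppVersion("x86_64-pc-linux-gnu", "", True),
]
```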
-------------------
notion of backup project

Send 1-2 extra projects w/ zero resource share as backups,
in case higher-score project has no work, is down,
doesn't work for this host, etc.

More generally: we should notice if a host can't get work done
for a particular project, and treat it differently.
===========
New allocation model

share/total_share = fraction of total FLOPS this project gets
balance = FLOPs owed to this project

daily accounting:
    x = total FLOPs done in last day
    for each project p
        p.balance += x*(p.share/total_share)
        record balance, share in project accounting record

RPC
    for each project p
        p.balance -= reported EC
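
The same accounting as a runnable sketch. It assumes reported EC has already
been converted to FLOPs, and keeps projects in a plain dict; the real thing
would be DB records.

```python
def daily_accounting(projects, total_flops_last_day):
    """Credit each project its share of yesterday's computing."""
    total_share = sum(p["share"] for p in projects.values())
    records = []
    if total_share <= 0:
        return records  # nothing to divide up
    for name, p in projects.items():
        p["balance"] += total_flops_last_day * p["share"] / total_share
        records.append((name, p["balance"], p["share"]))  # accounting record
    return records

def on_rpc(projects, reported_ec):
    """Debit each project by the work a client reports for it."""
    for name, ec in reported_ec.items():
        projects[name]["balance"] -= ec

projects = {
    "a": {"share": 3.0, "balance": 0.0},
    "b": {"share": 1.0, "balance": 0.0},
}
records = daily_accounting(projects, 4e15)
on_rpc(projects, {"a": 1e15})
```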

projected balance
    The idea is that there will be a ~24-hour period from
    when we start attaching a project to when
    the first reports of EC come in, and its balance declines.
    During that period we might over-attach it.

    So we could maintain a separate "projected balance",
    reset periodically to the balance,
    and decremented each time we attach the project by,
    say, half the estimated EC over the next day

    I'm not sure we need this; let's not do it for now.
================
Admin page
30-day graphs
    FLOPS
    jobs success/fail
    #active hosts
    #active users
Text:
    24-hour accounting
    total accounting

links to
    project list
-------
project detail page
    30-day graphs
        FLOPS
        jobs success/fail
    text:
        24-hours
        total


================
Use of email

email prefs:
- send status emails: never / weekly / monthly
    default: weekly
    store this in user.send_email
    seti_last_result_time: when to send next email

status email
    if some device has been active in last week/month
        show totals for that period
    for devices not active in that period
        show
    if no device has ever been active
        link to help

=============
Message boards
News
Questions
    get help
Problems
    report things that are broken
Suggestions
    suggest changes or additions
=============
on home page:
"Science projects" links to projects page
user projects page:
    show all projects
        put ones w/ work at top
    fix "since"
    new cols:
    account status (created, pending, problem, none)
    allowed by prefs?
    opt-out checkbox

========================
starvation

Some projects may be down, or have no jobs, for arbitrary periods.
If we attach a client to such projects, they'll starve.

This applies only to dynamic AMs;
use "send_rec" flag as proxy for this

Questions:

- When should the client make a "starved" RPC to SU?
    use exponential backoff 10 min .. 1 day to prevent flooding SU
    when a resource is idle
        we know this after doing a rr simulation
        write a func to do it fast!
            scan minimal # of jobs
    keep track of:
        first_starved: start of starvation interval, or zero
        starved_rpc_backoff: delay between starved RPCs, or zero
        starved_rpc_min_time: earliest time to do a starved RPC
    if using dynamic AM
        every 1 min
            check idle rsc
            if none
                first_starved = 0
            else
                if first_starved == 0
                    first_starved = now
                    starved_rpc_backoff = 10 min
                    starved_rpc_min_time = now + 10 min
                else if now > starved_rpc_min_time
                    do AM RPC
                    starved_rpc_backoff = min(starved_rpc_backoff*2, 1 day)
                    starved_rpc_min_time = now + starved_rpc_backoff

    A starved RPC may not return any new projects (e.g. because of prefs)

- What additional info should be in the AM RPC request?
    for each project: flags saying we didn't get work
    we expected from this project, namely:
    - <sched_req_failed/>: last sched request failed
        get this from PROJECT::nrpc_failures
    - <sched_req_no_work>rsc</sched_req_no_work>
        last sched request requested work for rsc and didn't get any
        (not counting "too soon" error)
    set this stuff in PROJECT after scheduler req

- What changes to the scheduling algorithm are needed?
    in scan over projects:
    
    if last request failed or zero jobs, add to list but
        don't mark any resources
    if didn't get work for some resources, don't mark those resources
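
The client-side backoff from the first question, as a runnable sketch. The
three state variables are the ones named above; poll() and its arguments are
hypothetical, and `now` is assumed to be a nonzero timestamp since zero is
used as the "not starved" marker.

```python
from dataclasses import dataclass

MIN_BACKOFF = 10 * 60      # 10 min
MAX_BACKOFF = 24 * 3600    # 1 day

@dataclass
class StarvationState:
    first_starved: float = 0.0         # start of starvation interval, or zero
    starved_rpc_backoff: float = 0.0   # delay between starved RPCs, or zero
    starved_rpc_min_time: float = 0.0  # earliest time to do a starved RPC

def poll(state, now, have_idle_resource):
    """Call every minute; returns True when an AM RPC should be done."""
    if not have_idle_resource:
        state.first_starved = 0.0
        return False
    if state.first_starved == 0:
        # Starvation just began: wait the minimum backoff before the first RPC
        state.first_starved = now
        state.starved_rpc_backoff = MIN_BACKOFF
        state.starved_rpc_min_time = now + MIN_BACKOFF
        return False
    if now > state.starved_rpc_min_time:
        # Do the RPC, then double the delay up to one day
        state.starved_rpc_backoff = min(state.starved_rpc_backoff * 2, MAX_BACKOFF)
        state.starved_rpc_min_time = now + state.starved_rpc_backoff
        return True
    return False
```

With continuous starvation this gives RPCs at roughly 10 min, then 20 min
later, then 40 min later, and so on, capped at one per day.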
=============================
Anonymous accounts

Project accounts created by SU should be anonymous:
- User name is "Science United user random-str"
- email address is su_randomstring@nonexistent.domain
    different per project?
    I think so.
- password hash is random string (use new crypt code)
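
A sketch of generating these fields, called once per (user, project) so the
email differs per project. The string lengths are arbitrary choices, and the
random hex value stands in for whatever the "new crypt code" produces.

```python
import secrets

def make_anonymous_account():
    """Return name/email/hash for one anonymous project account."""
    return {
        "user_name": f"Science United user {secrets.token_hex(4)}",
        "email_addr": f"su_{secrets.token_hex(8)}@nonexistent.domain",
        # Nobody knows a password that hashes to this, so it can't be guessed
        "passwd_hash": secrets.token_hex(16),
    }

acct1 = make_anonymous_account()
acct2 = make_anonymous_account()
```

If a project answers "email already exists", calling the function again
yields a fresh address to retry with.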

Purposes:
- simplify GDPR implementation
- reduce security risk (password guessing)
- eliminate stuff for resolving passwords
- let projects see how much computing comes from SU users

Implementation
DB:
add email_addr, passwd_hash to su_account

Web code:
change account creation
    if "email already exists", try new one
remove password resolution code

make script for handling existing accounts
===================
Group accounts

Projects may stop supporting account-create RPC.
So instead manually create accounts on each project.
Name "Science United",
email "scienceunited.org@gmail.com"

Add to project table
authenticator
weak_authenticator
Populate these for E@h, WCG

When creating account records for these projects,
create them immediately
====================
when a project changes from http: to https:

AM RPC:
if request includes a project w/ http: that is in our DB as https:,
tell it to detach from that project.
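
A sketch of that check, assuming project URLs are compared as plain strings
and that only the scheme changed. The function name and example URLs are
made up.

```python
def projects_to_detach(request_urls, db_urls):
    """URLs in the AM RPC request that we now list under https:."""
    db = set(db_urls)
    detach = []
    for url in request_urls:
        if url.startswith("http://"):
            https_url = "https://" + url[len("http://"):]
            if https_url in db:
                detach.append(url)  # client should drop the http: attachment
    return detach

stale = projects_to_detach(
    ["http://a.example.org/p/", "https://b.example.org/"],
    ["https://a.example.org/p/", "https://b.example.org/"],
)
```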
=================
tables or graphs for progress report:
per month for last X years:
    # of active volunteers
    # of active hosts
        # CPU cores
        # GPUs
    CPU TeraFLOPS
    GPU TeraFLOPS
    # jobs