Commit

Merge pull request #145 from StarfilesFileSharing/alpha
Alpha
QuixThe2nd authored Nov 4, 2024
2 parents ac228d7 + a5e1546 commit 73b00b5
Showing 16 changed files with 115 additions and 74 deletions.
17 changes: 12 additions & 5 deletions README.md
@@ -10,30 +10,37 @@
If you're a developer, you can check our issues for ideas on how to help. Otherwise, check <a href="#contribute-to-hydrafiles">here</a> on how else you can contribute.
</p>

**Reading this on GitHub?** Check our docs at any Hydrafiles node such as [hydrafiles.com](https://hydrafiles.com).

## What is Hydrafiles?

Hydrafiles is a peer-to-peer network that enables anonymous upload and download of files, and anonymous hosting and use of APIs. Peers can host and serve static files or dynamic backends over HTTP and/or WebRTC without revealing their identity.

## What environments does Hydrafiles run in?

Hydrafiles runs in both browsers and desktop/server environments. A JS library is available for both, and an executable or Docker container is available for non-web environments.

P.S. **Using web nodes**, you can **serve APIs** and static files over WebRTC that are **accessible via HTTP**. Yes, you read that right.

## How is it anonymous?

![I'm Spartacus!](public/i-am-spartacus.gif)

TLDR: This scene ^

When someone requests a file or calls an endpoint, the request is sent to all peers. If a peer has the file or controls the requested endpoint, it will serve it. If not, it will forward the request to known peers and mirror the response. If no one has the file or controls the endpoint, the request will return a 404 once peers time out. Because the request is forwarded by each peer and all peers mirror the response, it is impossible to tell which peer the request or response originated from.
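
The forwarding behaviour described above can be sketched with a small in-memory model. This is an illustration only, not the Hydrafiles implementation: the `Peer` class, its `request` method, and the `seen` set used to stop loops are hypothetical names. Real peers talk over HTTP or WebRTC and time out rather than returning `null`.

```javascript
// Hypothetical in-memory model of request forwarding (not the real Hydrafiles API).
class Peer {
  constructor(name) {
    this.name = name;
    this.files = new Map(); // hash -> content this peer can serve itself
    this.peers = []; // known peers to forward requests to
  }

  // Returns the content, or null if neither this peer nor its peers have it.
  request(hash, seen = new Set()) {
    if (seen.has(this)) return null; // already asked during this request; avoid loops
    seen.add(this);
    if (this.files.has(hash)) return this.files.get(hash); // serve directly
    for (const peer of this.peers) {
      const response = peer.request(hash, seen); // forward the request
      if (response !== null) return response; // mirror the response
    }
    return null; // the caller would turn this into a 404
  }
}

const a = new Peer("a");
const b = new Peer("b");
const c = new Peer("c");
a.peers.push(b);
b.peers.push(a, c);
c.files.set("abc123", "hello");

const found = a.request("abc123"); // served by c, mirrored by b, then by a
const missing = a.request("does-not-exist"); // no peer has it
console.log(found, missing);
```

From the caller's point of view, `a` answered the request. Nothing in the response says the content came from `c`.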

## Who's in charge of Hydrafiles?

No one, anyone, everyone. Hydrafiles doesn't have a head. It's simply an API specification. Anyone can set up a domain, server, S3 bucket, or literally just run a JavaScript library and connect to the network. The more people who decide to do this, the more private the network.

## Where is Hydrafiles?

Hydrafiles is everywhere. With the goal of being cross-border, we ask people like you to contribute to the movement, by setting up cross-border domains and servers.

## What Hydrafiles isn't.

Hydrafiles does **not** hide content. All data on the Hydrafiles network can be seen by all peers. Sensitive content MUST be encrypted before submission to the network.
2 changes: 1 addition & 1 deletion deno.jsonc
@@ -1,6 +1,6 @@
{
"name": "@starfiles/hydrafiles",
"version": "0.7.32",
"version": "0.7.33",
"description": "The (P2P) web privacy layer.",
"main": "src/hydrafiles.ts",
"exports": {
10 changes: 7 additions & 3 deletions docs/1. Why Hydrafiles.md
@@ -1,12 +1,16 @@
# Why Use Hydrafiles Instead of Tor, Bit/WebTorrent, and I2P?

Hydrafiles was originally designed to replace the traditional client-server relationship when serving files with a peer to peer relationship.

The problem it originally aimed to solve was allowing websites and applications to distribute static files via traditional HTTP rails with privacy and P2P redundancy. Standard BitTorrent can't be used on the web, and WebTorrent comes with many limitations, such as a WebRTC dependency and requiring browsers to run JavaScript to download content.

As of v0.4.0, Hydrafiles allows nodes to run APIs/backends through the network, using Hydrafiles as a reverse proxy to preserve privacy over vanilla HTTP. This raises the question: why Hydrafiles over Tor or I2P?

Let's start with a privacy comparison. With Tor & I2P, if you control x% of the network, you have a y% probability of deducing the origin of content. With Hydrafiles, you must control all but one node to know what that node is doing. Even if you control 90% of the network, you don't know where content originates, just that it came from the remaining 10%. This arguably makes Hydrafiles less traceable than Tor & I2P.

Next is availability/accessibility. As with BitTorrent, Tor & I2P require your users to run special software to access the network and retrieve content. With Hydrafiles, clients can leech from the network using normal HTTP requests, optionally running a full node to validate or contribute to the network.

Finally, there is user privacy. Hydrafiles does not provide any user privacy. Hydrafiles is intended for anonymous distribution of content. It is your users' (and your) responsibility to protect their identity if needed.
8 changes: 6 additions & 2 deletions docs/2. Installation.md
@@ -1,13 +1,17 @@
# Install Full Node

## Universal Library

1. Go to [Releases](https://github.com/StarfilesFileSharing/hydrafiles/releases) and download the latest `hydrafiles.js` file. NPM, JSR, and CDN packages will be made available when Hydrafiles is less experimental.
2. Import the library in your project.

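```
// Assumes hydrafiles.js provides the Hydrafiles class as its default export.
// A namespace import (`import * as Hydrafiles`) cannot be called with `new`.
import Hydrafiles from './hydrafiles.js';
const hydrafiles = new Hydrafiles();
```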

3. Start the node.

```
hydrafiles.start();
```
@@ -92,8 +96,8 @@ CDNs cache files which makes download times faster and lowers server load. CDNs

**Option B: Running a Reverse Proxy** is cheaper but less safe, as file integrity can't be verified. It is not recommended to rely on a reverse-proxied peer as an exit point.

To set up the reverse proxy, first choose a Hydrafiles IP from an HTTP node you can find [here](/dashboard.html). Then configure [Nginx](https://www.digitalocean.com/community/tutorials/how-to-configure-nginx-as-a-reverse-proxy-on-ubuntu-22-04), [Caddy](https://caddyserver.com/docs/quick-starts/reverse-proxy), or similar software to point port 80 to the IP.

### Donate Storage

13 changes: 9 additions & 4 deletions docs/3. Web Nodes.md
@@ -1,19 +1,24 @@
## What is a web/full node?

- **Web nodes** run in your browser. Websites can import the Hydrafiles library and run a web node in your browser.
- **Full nodes** run on your computer like actual applications.

## Differences

The browser environment is sandboxed, forcing Hydrafiles to implement some **hacky** solutions. Many of the basic features available to full nodes are normally impossible in browsers.

The core differences include:

- **No Ports:** Before v0.6, Hydrafiles ran purely on HTTP. All peers were required to run an HTTP server to contribute to the network (seed). As of v0.6, Hydrafiles supports WebRTC. Full nodes still support HTTP, but both full and web nodes now support WebRTC. This allows web nodes to contribute to the network (and the anonymity set). Full nodes now also host a WebSocket room to handle signalling in-house.
- **No SQLite:** Hydrafiles uses SQLite, which is not supported in browsers, and there is no modern SQL implementation for the web. Web nodes use IndexedDB as the database instead.
- **No File System:** Websites are unable to access your file system. Modern browsers are starting to support the FileSystem API, which solves this, but [support is lackluster](https://caniuse.com/?search=showDirectoryPicker), with only Chromium-based desktop browsers supporting it. To solve this, we have implemented a virtual file system. Web nodes wrap IndexedDB to make it act like a file system, treating each file write as a row insert and each file read as a row read.

| Feature     | Full Node   | Web Node                               |
| ----------- | ----------- | -------------------------------------- |
| WebRTC      | Seed+Leech  | Seed+Leech                             |
| HTTP        | Seed+Leech  | Leech Only                             |
| WebSocket   | Seed+Leech  | Leech Only                             |
| Database    | SQLite      | IndexedDB                              |
| File System | File System | FileSystem API (or IndexedDB Fallback) |
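
The virtual file system idea from the "No File System" point can be sketched as follows. This is a simplified illustration under stated assumptions: the `VirtualFileSystem` class and its method names are hypothetical, and a `Map` stands in for the IndexedDB object store so the example runs outside a browser. Real IndexedDB calls are asynchronous, so an actual wrapper would return promises.

```javascript
// Hypothetical sketch: a file-system-like wrapper over a row store.
// A Map stands in for the IndexedDB object store so the example runs anywhere.
class VirtualFileSystem {
  constructor() {
    this.rows = new Map(); // path -> { path, data }
  }

  // A file write becomes a row insert (or overwrite).
  writeFile(path, data) {
    this.rows.set(path, { path, data });
  }

  // A file read becomes a row read.
  readFile(path) {
    const row = this.rows.get(path);
    if (row === undefined) throw new Error(`No such file: ${path}`);
    return row.data;
  }

  exists(path) {
    return this.rows.has(path);
  }
}

const vfs = new VirtualFileSystem();
vfs.writeFile("files/abc123", new Uint8Array([1, 2, 3]));
const bytes = vfs.readFile("files/abc123");
console.log(bytes.length, vfs.exists("files/missing"));
```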
3 changes: 2 additions & 1 deletion docs/4. Indexing.md
@@ -1,8 +1,9 @@
# Indexing

Hydrafiles allows for files in the network to be indexed (searchable).

Each peer keeps a list of all files it knows about, identified by their hashes. Whenever a peer discovers (or adds) a file, it is added to their list.

File lists are exchanged between peers periodically. When a peer receives a file list, it will check for specific metadata/columns to add/replace for existing files, as well as new files to add to the list.
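
A minimal sketch of the merge step, assuming each list entry is an object keyed by its hash. The function name `mergeFileList` and the fields `name` and `size` are hypothetical. Which columns get replaced rather than only filled is not specified here, so this sketch only fills fields the local entry is missing.

```javascript
// Hypothetical sketch of merging a received file list into the local index.
// Field names (hash, name, size) are assumptions for illustration.
function mergeFileList(index, received) {
  for (const entry of received) {
    const known = index.get(entry.hash);
    if (known === undefined) {
      index.set(entry.hash, { ...entry }); // new file: add it to the list
      continue;
    }
    // Existing file: copy over metadata the local entry is missing.
    for (const [key, value] of Object.entries(entry)) {
      if (known[key] === undefined || known[key] === null) known[key] = value;
    }
  }
  return index;
}

const index = new Map([["abc123", { hash: "abc123", name: null, size: 3 }]]);
mergeFileList(index, [
  { hash: "abc123", name: "hello.txt", size: 3 },
  { hash: "def456", name: "other.bin", size: 10 },
]);
console.log(index.size, index.get("abc123").name);
```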

To learn how to approximate the importance/relevance of a file, see "Hash Counting".
13 changes: 12 additions & 1 deletion docs/5. Reverse Proxy.md
@@ -1,23 +1,32 @@
# Using Hydrafiles as a Reverse Proxy ‐ Anonymous APIs

Hydrafiles v0.4.0 introduced reverse proxies. This allows clients to host HTTP APIs anonymously.

## How does it work?

This uses the same routing mechanism that Hydrafiles uses for serving static files. The core differences are:

- Instead of requesting by file hash, the peer requests by the endpoint's public key.
- The response checksum isn't validated; instead, the signature is validated.

## Instructions

### 1. Run Node

First, run your node. Your node will automatically generate `public.key` & `private.key` files. Back these up, as they prove ownership of your hostname.

### 2. Get Hostname

Your hostname is then displayed with each summary. Hostnames look like this:

```
9tpmjtjmenb7cqtdedwpctkrctwpyv9de57m6mju9mu64qtqa1gjuxv4c91k0xku6cupe.etm62bb4cwv30rbj9n83gjv6atcq2u9ddnjm8mvt9t25jqv4e557cjjadt3n6d3jb1n7e
```

### 3. Change Config (Optional)

Set `reverseProxy` in your config to the base URL of your endpoint (e.g. http://localhost:81) or, in JavaScript, set a custom request handler:

```
Hydrafiles.rpcServer.handleCustomRequest = (req) => {
  return "Hello World!";
};
```

You can skip this step to test; a "Hello World!" response is shown by default.

### 4. Test Hydrafiles Domain

Check any Hydrafiles node for your hostname. For example:

```
http://localhost/endpoint/9tpmjtjmenb7cqtdedwpctkrctwpyv9de57m6mju9mu64qtqa1gjuxv4c91k0xku6cupe.etm62bb4cwv30rbj9n83gjv6atcq2u9ddnjm8mvt9t25jqv4e557cjjadt3n6d3jb1n7e
```
5 changes: 5 additions & 0 deletions docs/6. Hash Counting.md
@@ -1,11 +1,15 @@
# Hash Counting

The goal of hash counting is to allow peers to anonymously cast votes in a gossip network. This allows peers to rank files based on arbitrary factors the network believes are important.

## Where does Hydrafiles use hash counting?

Hydrafiles peers exchange lists of files. To boost the rank of a file, a peer can cast a vote for it. The peer can choose when to cast a vote, for example on download.

## How does it work?

You can imagine a vote working like this:

```
const fileHash = hash(file)
const nonce = randomFloat()
```

@@ -21,4 +25,5 @@ Other peers then receive a copy of the file's nonce, and if the nonce is higher
To fix diminishing returns after a lot of hashes have been generated, we can implement a "rank by 2nd best vote nonce" rule, for example, or add expiries.

## When/Why do peers vote?

Peers can vote for files they believe are important at any time and as many times as they want. This allows us to rank files based on what the network believes is important.
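
The voting idea can be sketched as below. This is one reading of the description, not the Hydrafiles implementation: the function names are hypothetical, and the "keep the higher nonce" rule is an assumption inferred from the surrounding text. Each vote draws a random nonce, each file keeps only its best nonce, and files are ranked by that value, so a file that receives more votes tends to end up with a higher nonce.

```javascript
// Hypothetical sketch of hash counting; names are illustrative, not the Hydrafiles API.

// Keep a received nonce only if it beats the one already stored for the file.
function receiveVote(votes, fileHash, nonce) {
  const current = votes.get(fileHash) ?? 0;
  if (nonce > current) votes.set(fileHash, nonce);
  return votes.get(fileHash) ?? 0;
}

// Casting a vote draws a random nonce in [0, 1) and applies the same rule locally.
function castVote(votes, fileHash, random = Math.random) {
  return receiveVote(votes, fileHash, random());
}

// Rank file hashes by their best nonce, highest first.
function rank(votes) {
  return [...votes.entries()].sort((x, y) => y[1] - x[1]).map(([hash]) => hash);
}

const votes = new Map();
receiveVote(votes, "popular", 0.42);
receiveVote(votes, "popular", 0.97); // higher nonce replaces 0.42
receiveVote(votes, "popular", 0.1); // lower nonce is ignored
receiveVote(votes, "rare", 0.55);
castVote(votes, "rare", () => 0.2); // fixed "random" value for the example; ignored
const ranking = rank(votes);
console.log(ranking, votes.get("popular"));
```

Because only the maximum is kept, a peer never needs to reveal how many votes it cast or who cast them.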
9 changes: 6 additions & 3 deletions docs/7. Signal Strength.md
@@ -1,11 +1,14 @@
# Signal-Strength Header

The `Signal-Strength` HTTP header is used to approximate the number of hops without breaking privacy. This is used to approximate other network statistics such as redundancy.

It works by simulating radio-wave interference. Nodes directly serving the file return a signal strength of 90-100%. Each subsequent node returns a signal strength 0-10% lower than the previous hop. If a node receives a signal strength of 95% or higher, it defaults to returning a signal strength of 90-100% to further obfuscate the origin.

A simulation with 100k runs provides the following signal strengths:

| Hop # | Minimum | Average | Median | Maximum |
| ----- | ------- | ------- | ------ | ------- |
| 1     | 90      | 95      | 95     | 100     |
| 2     | 81      | 92      | 92     | 100     |
| 3     | 73      | 88      | 88     | 100     |
| ...   | ...     | ...     | ...    | ...     |
| 9     | 45      | 68      | 67     | 100     |
| 10    | 43      | 65      | 64     | 100     |

Based on these averages, we can correlate signal strength with number of hops.
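
The rule can be sketched as a small function. This is an illustration under assumptions, not the Hydrafiles code: the function names are hypothetical, and "0-10% lower" is read as a relative reduction, which matches the minimum of 81 at hop 2 in the table (90 lowered by 10%).

```javascript
// Hypothetical sketch of the Signal-Strength rule; function names are illustrative.
// `received` is the strength reported by the previous hop, or null at the origin.
function nextSignalStrength(received, random = Math.random) {
  // The origin, and any node that received 95 or higher, reports 90-100.
  if (received === null || received >= 95) return 90 + random() * 10;
  // Otherwise report a value 0-10% lower than what was received.
  return received * (1 - random() * 0.1);
}

// Strength reported after `hops` nodes, starting from the node serving the file.
function simulate(hops, random = Math.random) {
  let strength = null;
  for (let i = 0; i < hops; i++) strength = nextSignalStrength(strength, random);
  return strength;
}

// Fixed "random" values make the edge cases visible.
const lowestAtOrigin = nextSignalStrength(null, () => 0); // bottom of the 90-100 range
const lowestSecondHop = nextSignalStrength(90, () => 1); // 90 lowered by the full 10%
const resetNearOrigin = nextSignalStrength(96, () => 0.5); // 96 triggers the 90-100 reset
console.log(lowestAtOrigin, lowestSecondHop, resetNearOrigin, simulate(3));
```

The reset for values of 95 or higher is why the maximum stays at 100 for every hop in the table.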
1 change: 0 additions & 1 deletion package.json

This file was deleted.
