Address comments about pricing and uniqueness docs
choldgraf committed Apr 5, 2022
1 parent 33e45e5 commit c6454a7
Showing 4 changed files with 49 additions and 13 deletions.
9 changes: 5 additions & 4 deletions about/2i2c.md
@@ -5,7 +5,7 @@
```

2i2c provides a **managed, customized JupyterHub service** that is tailored for research and education communities.
We manage entirely non-proprietary, open-source tools that give their user communities the [Right to Replicate](http://2i2c.org/right-to-replicate) this infrastructure with or without 2i2c.
We manage entirely non-proprietary, open-source tools that ensure user communities have the [Right to Replicate](http://2i2c.org/right-to-replicate) this infrastructure with or without 2i2c.
As a part of this service, 2i2c also makes **upstream contributions to open-source communities** as a part of continuously operating and improving this infrastructure.

This page describes why we believe that 2i2c and its service model are uniquely suited for the research and education communities.
@@ -26,6 +26,7 @@ Here are a few of the major projects our team members have been involved in ove
- [The UC Berkeley DataHubs](https://docs.datahub.berkeley.edu/en/latest/) - A collection of university-wide JupyterHubs for education serving many thousands of students.
- [The Binder Project](https://docs.mybinder.org/) - a large public cloud service for reproducible computing environments using JupyterHub, serving nearly 150,000 sessions each week.
- [The Syzygy Project](https://syzygy.ca/) - A network of federated JupyterHubs for more than 15 Canadian Universities running on national infrastructure.
- [The Jupyter Book](https://jupyterbook.org) and [MyST Markdown](https://myst.jupyterbook.org/) projects - A collection of tools and standards for improving scientific and technical communication and authoring with interactive computing.

## 2i2c has expertise in open source workflows and Jupyter

@@ -37,17 +38,17 @@ This makes 2i2c uniquely capable of both utilizing and improving this technology
## 2i2c has expertise with research and education workflows

2i2c has years of experience managing cloud resources specifically for research and education communities.
We have led and contributed to projects like [the Binder Project](https://docs.mybinder.org/), [the Pangeo Project](https://pangeo.io/), [the Syzygy Project](https://syzygy.ca/), and [the UC Berkeley DataHubs](https://docs.datahub.berkeley.edu/en/latest/) to serve thousands of users in the research and education community.
We have led and contributed to projects like [the Binder Project](https://docs.mybinder.org/), [the Pangeo Project](https://pangeo.io/), [the Syzygy Project](https://syzygy.ca/), [the UC Berkeley DataHubs](https://docs.datahub.berkeley.edu/en/latest/), and [the Jupyter Book project](https://jupyterbook.org) to serve thousands of users in the research and education community.
As a non-profit, we have defined our mission in order to serve the research and education sector, and our team and governing body are made up of individuals from this community.
We strive to build an understanding of their needs, to represent their interests in the Jupyter and open source ecosystem, and to collaborate with them in our operations and development.
2i2c is uniquely positioned to serve as a collaborator for research and education via these efforts.

## 2i2c is a transparent, collaborative non-profit

2i2c is a mission-driven non-profit organization that has a commitment to doing its work openly, transparently, and inclusively.
Its mission is to provide researchers and educators with the infrastructure they need to do their work, and to support open source communities that underlie this infrastructure.
Our mission is to provide researchers and educators with the infrastructure they need to do their work, and to support open source communities that underlie this infrastructure.
2i2c is governed by a [Steering Council](tc:structure:steerco) made of members from the research and education community.
It manages all of its work in public spaces, including [all of its infrastructure](http://github.com/2i2c-org/infrastructure) as well as [all of its organizational strategy and practices](http://team-compass.2i2c.org/).
We manage all of our work in public spaces, including [all of our infrastructure](http://github.com/2i2c-org/infrastructure) as well as [all of our organizational strategy and practices](http://team-compass.2i2c.org/).

## The bottom line

26 changes: 19 additions & 7 deletions about/pricing/comparison.md
@@ -1,14 +1,17 @@
# Comparing pricing to similar services
# Comparison to similar services

This page is a guide to 2i2c's services, our pricing, and how this compares with similar kinds of offerings.

:::{tip}
The content on this page can be re-used as a part of "price reasonableness and comparisons" forms when completing contracting for communities.

In each section below, we'll list a few similar companies and services that can be compared with 2i2c.
Their presence and ordering do not constitute an "endorsement" and are not exhaustive - we are merely trying to be transparent and helpful about the other organizations in this space.
:::

## How to think about 2i2c's service and pricing

As a non-profit, we choose our prices as a function of our estimated costs.
As a non-profit, we choose our prices to move forward on a sustainable path to achieve our mission according to [our cost model](costs:human) as well as [our growth model](strategy:growth).
Our service entails developing and managing entirely open-source, vendor-agnostic, and community-driven infrastructure that is customized for research and education.

Our aim with 2i2c's service model is to strike a balance between **scalability** and **flexibility**, with the constraints that we operate **transparently** and **collaboratively** with our communities, by running **open-source** and **community-driven** infrastructure.
@@ -26,13 +29,22 @@ The most common way for organizations to achieve similar services is to staff th
2i2c encourages this, as it is aligned with our commitment to open source, vendor-agnostic tools, and the [Right to Replicate your infrastructure](https://2i2c.org/right-to-replicate).

However, hiring and retaining modern cloud engineers is difficult and costly.
If we assume that an engineer makes `$140,000` in salary, with `30%` benefits, that comes to an annual cost of `$182,000` a year, discounting any other personnel, hiring, and cloud costs.
In 2022, [the median compensation of a Site Reliability Engineer](https://www.levels.fyi/Salaries/Software-Engineer/Site-Reliability/) is roughly `$180,000` a year.

This under-estimates the true cost, as there are a few other risk factors associated with paying a single engineer to manage your cloud infrastructure:

This under-estimates the true cost, as centralizing your organization’s cloud engineering on a single person creates risk associated with having a single point of failure.
Their efficiency will depend heavily on their previous expertise, and they will likely not incorporate enhancement and security fixes as quickly as a distributed team of experts.
- Attracting and hiring people for this very in-demand position requires a significant amount of time and energy.
- Centralizing your organization’s cloud engineering on a single person creates risk associated with having a single point of failure.
- The efficiency of this role will depend heavily on the person's previous expertise with cloud infrastructure and Jupyter, and their capacity to improve the open source tooling will be limited unless they have previous experience in this ecosystem.
- As a sole contributor, they will likely not respond to outages, make improvements, or incorporate enhancements and security fixes as quickly as a distributed team of experts.

If your organization has significant pre-existing expertise in open source, Jupyter, and cloud infrastructure, then it may be more cost effective for you to run your own JupyterHub services.
If you need to build this expertise internally, it is likely much more cost-effective to partner with a non-profit such as 2i2c.
If you need to build this expertise internally, it is likely much more cost-effective to partner with 2i2c.

:::{note}
2i2c primarily aims to be a more cost-effective alternative to this model of service delivery.
We constantly adjust our own prices and team compensation to be responsive to the ecosystem of Cloud and Site Reliability Engineering, and we'll update this information as the field evolves.
:::

## Consulting companies

@@ -63,5 +75,5 @@ Enterprise-level contracts for these platforms can be significantly more expensi
## Bottom line

There is a large ecosystem of vendors and services available for interactive data science.
We are heavily biased towards organizations that use non-proprietary tools, and that commit to services that are vendor-agnostic and respect your [Right to Replicate your infrastructure](https://2i2c.org/right-to-replicate).
2i2c believes that interactive computing is emerging as the vital medium for communications in research and education communities. As a result, we suggest that universities and research communities should build atop non-proprietary tools and commit to services that are vendor-agnostic and respect your [Right to Replicate your infrastructure](https://2i2c.org/right-to-replicate).
You should think about the constraints and principles that you'd like your infrastructure to follow, and choose the right approach for your organization.
5 changes: 3 additions & 2 deletions about/pricing/costs.md
@@ -4,6 +4,7 @@ This page is a short description of the basic costs that go into the fees that o

There are two major factors that go into the cost of 2i2c's services: **human costs** and **cloud costs**. We consider these costs separately below.

(costs:human)=
## Human costs

Our biggest cost is paying salaries for team members that carry out the services we provide.
@@ -27,7 +28,7 @@ The fees for each hub are thus determined by dividing this annual cost by the es

## Cloud costs

We try to pass through cloud costs directly to our communities in a transparent manner.
We pass through cloud costs directly to our communities in a transparent manner.
This encourages us to continually reduce the cloud costs for our communities, and helps them understand how their decisions affect their cloud bill.

### What components make up my cloud bill
@@ -44,7 +45,7 @@ There are some other components that go into your cloud bill (e.g., "networking

### User actions that impact cloud costs

Cloud depend on a few key factors that you and your community has control over.
Cloud costs depend on a few key factors that you and your community have control over.
Here we list some major considerations (in decreasing order of importance):

- **Base user resources needed**: The power and complexity of the user environment is the biggest driver of "base cost per user". This is largely driven by the amount of memory (RAM) each user needs. See below for a more in-depth explanation.
22 changes: 22 additions & 0 deletions about/strategy/index.md
@@ -42,6 +42,28 @@ In the pilot, we will focus on a subset of use-cases that we believe are impactf
- **Scalable research environments** - Communities of Practice that use cloud infrastructure to scale their workflows - either by accessing large datasets or leveraging scalable computing infrastructure from an interactive session. Similar to our experience with the Pangeo project.
- **Community event hubs** - Communities of Practice that have a time-bound event (e.g., a workshop or hackathon) that would benefit from a shared space to do their work and collaborate with one another. Similar to our experience with the NeuroHackademy and Pangeo workshops.

## Our pricing strategy

See [](../pricing/index.md) for information about our pricing and cost strategy.

(strategy:growth)=
## Our growth model

Growing this service will require balancing two aspects of our team:

- Our **capacity** to serve a given number of communities at a certain complexity of use-case.
- Our **commitments** to serve a specific set of communities.

Because we are in a growth phase, we want our commitments to be near (or slightly above) our capacity.
We can increase our capacity by making infrastructure and process improvements, or by growing our team.
In the early phases of this pilot, we will focus on the former, and as our infrastructure and process are refined, we will consider the latter.
In either case, we should choose a pricing model that gives us enough buffer to be able to hire new team members when the right time comes.

To carry this out, we'll take on new communities in "batches" and define pricing models for each that at least cover [our estimated costs](costs.md).
When we take on a new batch of communities, we should feel some tension as it challenges our process, support, and infrastructure in new ways.
As we make process and infrastructure improvements, we will self-assess whether our capacity has grown.
If it has grown enough, we'll decide to bring on more communities.

## Infrastructure strategy for the pilot

We make a few base assumptions about the kind of infrastructure we will focus on in this pilot.