Since the introduction of ChatGPT, the field of Large Language Models (LLMs) has witnessed rapid evolution, with models like GPT-4, Mistral, and Grok-1 emerging one after another, exerting a profound influence on society. However, the environmental impact of LLMs has also sparked heated debates.
The training and operation of LLMs require massive energy consumption, leading to significant carbon emissions, which contradicts global sustainable development goals. Therefore, it is imperative to promote the green development of LLMs.
As a participating team in Carbon Hack 2024, we leverage the capabilities of the Impact Framework to provide a tutorial on estimating the carbon emissions of LLMs during their two main phases: training and inference.
- For detailed information about our project idea, see here.
- For the full content of the tutorial, follow here.
- For comprehensive usage instructions, see the Usage section below.
Install the Impact Framework and the official plugins globally using npm.
```sh
npm install -g @grnsft/if
npm install -g @grnsft/if-plugins
```
Before building your own LLM carbon manifest, you can run the example manifest to get a brief idea of how it works. Download or copy the manifest YAML file from the examples folder as needed.
To calculate the LLM carbon footprint, we employ the basic formula `CO2eq = CO2eq_oper + CO2eq_emb`. The `CO2eq_oper` term represents the operational carbon footprint of running the LLM, while the `CO2eq_emb` term represents the embodied carbon footprint of the hardware it runs on.
The fundamental equation for `CO2eq_oper` is `CO2eq_oper = energy_oper * carb_inten`, where `energy_oper` represents the energy consumed while operating the LLM, and `carb_inten` denotes the carbon intensity of the energy consumed.

To derive `energy_oper`, the Watt-hour formula `energy_oper(Wh) = n * T * TDP * PUE` is employed. Computing `energy_oper` therefore requires the number of GPUs `n`, the training time `T`, the power consumption of each GPU (Thermal Design Power, `TDP`), and the data center's Power Usage Effectiveness (`PUE`).

Substituting, the final equation for the operational footprint is: `CO2eq_oper = n * T * TDP * PUE * carb_inten`
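To make the arithmetic concrete, here is a minimal Python sketch of the operational formula, using the GPT-3-like values from the example manifest shown later in this tutorial:

```python
# Operational carbon: CO2eq_oper = n * T * TDP * PUE * carb_inten
# Values taken from the llm-carbon-basic.yml example below.
n = 10_000          # number of GPUs
T = 355.2           # training time in hours (~14.8 days)
TDP = 0.3           # GPU power draw in kW (300 W), so the result is in kWh
PUE = 1.1           # data center Power Usage Effectiveness
carb_inten = 0.429  # kgCO2eq per kWh in the training region

energy_oper = n * T * TDP * PUE        # kWh consumed, including data center overhead
co2eq_oper = energy_oper * carb_inten  # kgCO2eq
print(f"energy_oper = {energy_oper:,.0f} kWh")     # 1,172,160 kWh
print(f"CO2eq_oper  = {co2eq_oper:,.2f} kgCO2eq")  # 502,856.64 kgCO2eq
```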
The embodied emissions for training an LLM, denoted `CO2eq_emb`, are computed as the sum of the `CO2eq_emb_i` values for each hardware unit involved in the process: `CO2eq_emb = sum(CO2eq_emb_i)`

The embodied emissions of each hardware unit are calculated using the formula `CO2eq_emb_i = (t_i * CO2eq_chip_i) / lifetime_i`, where `t_i` represents the execution duration of the hardware unit (which equals the total time required for training the LLM), `CO2eq_chip_i` denotes the embodied CO2 emissions of the chip, and `lifetime_i` indicates the expected lifespan of the hardware unit. The chip's embodied carbon footprint within a specific hardware unit is calculated as `CO2eq_chip_i = area_i * CPA_i`, where `area_i` is the chip's area (or capacity, for storage devices) and `CPA_i` is the carbon emitted per unit of area (or capacity).

In essence, the hardware units encompass the GPU, CPU, SSD, and DRAM. Thus, the aggregate embodied emissions for training an LLM can be articulated as: `CO2eq_emb = sum(CO2eq_emb_GPU, CO2eq_emb_CPU, CO2eq_emb_SSD, CO2eq_emb_DRAM)`
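A minimal Python sketch of the embodied-carbon formula, again using the unit counts and coefficients from the llm-carbon-basic.yml example below:

```python
# Embodied carbon: CO2eq_emb_i = (t_i * area_i * CPA_i) / lifetime_i, summed over units.
t = 355.2                # training hours
lifetime = 5 * 365 * 24  # expected lifespan in hours (5 years) = 43,800
rate = t / lifetime      # fraction of each device's lifetime consumed by training

# (count, area-or-capacity, carbon-per-area-or-capacity) per hardware unit,
# as in the example manifest: 10,000 GPUs and 1,250 server stacks
# (one CPU, SSD, and DRAM per 8 GPUs).
units = {
    "gpu":  (10_000, 8.15,   1.2),    # cm2, kgCO2/cm2
    "cpu":  (1_250,  1.47,   1.0),    # cm2, kgCO2/cm2
    "ssd":  (1_250,  32_768, 0.024),  # GB, kgCO2/GB
    "dram": (1_250,  256,    0.4),    # GB, kgCO2/GB
}

co2eq_emb = sum(count * rate * area * cpa for count, area, cpa in units.values())
print(f"CO2eq_emb = {co2eq_emb:,.2f} kgCO2eq")  # ~9,818.10 kgCO2eq
```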
Based on these formulas, the LLM `CO2eq` can be computed with the IF Framework using the official IF plugins. Here is a basic manifest (you can find it in the repo's examples folder): `llm-carbon-basic.yml` is an example manifest for calculating the carbon footprint of training an LLM, populated with data from GPT-3. Methods such as `Sum`, `Multiply`, and `Divide` from the official IF plugins are used to construct the formulas.
```yaml
# llm-carbon-basic.yml
name: llm basic operational emissions manifest
description:
  "
  GPT-3 data resource: https://arxiv.org/abs/2005.14165
  CO2eq_oper = n * T * TDP * PUE * carb_inten
  CO2eq_emb = sum(CO2eq_emb_GPU, CO2eq_emb_CPU, CO2eq_emb_SSD, CO2eq_emb_DRAM)
  CO2eq_emb_i = (t_i * area_i * CPA_i) / lifetime_i
  T: training hours (training_hour)
  n: number of GPUs (gpu/num)
  TDP: power consumption of the GPU (gpu/tdp)
  PUE: Power Usage Effectiveness (pue)
  carb_inten: carbon intensity of the energy consumed (carb_inten)
  "
tags:
initialize:
  outputs:
    - yaml
  plugins:
    training-operation-carbon-multiply:
      method: Multiply
      path: '@grnsft/if-plugins'
      global-config:
        input-parameters: ['gpu/num', 'training_hour', 'gpu/tdp', 'pue', 'carb_inten']
        output-parameter: 'operation-carbon'
    device-expected-lifespan-hours-per-year-multiply:
      method: Multiply
      path: '@grnsft/if-plugins'
      global-config:
        input-parameters: ['expected-lifespan', 'days-per-year', 'hours-per-day']
        output-parameter: 'expected-lifespan-duration'
    reserved-device-hour-with-device-expected-lifespan-divide:
      method: Divide
      path: '@grnsft/if-plugins'
      global-config:
        numerator: 'training_hour'
        denominator: 'expected-lifespan-duration'
        output: 'expected-lifespan-rate'
    gpu-embodied-carbon-multiply:
      method: Multiply
      path: '@grnsft/if-plugins'
      global-config:
        input-parameters: ['gpu/num', 'expected-lifespan-rate', 'gpu/cap', 'gpu/area']
        output-parameter: 'gpu-carbon-embodied'
    cpu-embodied-carbon-multiply:
      method: Multiply
      path: '@grnsft/if-plugins'
      global-config:
        input-parameters: ['hardware-unit-num', 'expected-lifespan-rate', 'cpu/cap', 'cpu/area']
        output-parameter: 'cpu-carbon-embodied'
    ssd-embodied-carbon-multiply:
      method: Multiply
      path: '@grnsft/if-plugins'
      global-config:
        input-parameters: ['hardware-unit-num', 'expected-lifespan-rate', 'ssd/cap', 'ssd/area']
        output-parameter: 'ssd-carbon-embodied'
    dram-embodied-carbon-multiply:
      method: Multiply
      path: '@grnsft/if-plugins'
      global-config:
        input-parameters: ['hardware-unit-num', 'expected-lifespan-rate', 'dram/cap', 'dram/area']
        output-parameter: 'dram-carbon-embodied'
    embodied-carbon-sum:
      method: Sum
      path: '@grnsft/if-plugins'
      global-config:
        input-parameters: ['gpu-carbon-embodied', 'cpu-carbon-embodied', 'ssd-carbon-embodied', 'dram-carbon-embodied']
        output-parameter: 'carbon-embodied'
    llm-carbon-sum:
      method: Sum
      path: '@grnsft/if-plugins'
      global-config:
        input-parameters: ['carbon-embodied', 'operation-carbon']
        output-parameter: 'total-carbon'
tree:
  children:
    child:
      pipeline:
        - training-operation-carbon-multiply
        - device-expected-lifespan-hours-per-year-multiply
        - reserved-device-hour-with-device-expected-lifespan-divide
        - gpu-embodied-carbon-multiply
        - cpu-embodied-carbon-multiply
        - ssd-embodied-carbon-multiply
        - dram-embodied-carbon-multiply
        - embodied-carbon-sum
        - llm-carbon-sum
      defaults:
        thousands-per-unit: 0.001
        days-per-year: 365
        hours-per-day: 24
        seconds-per-hour: 3600
        expected-lifespan: 5 # years
      inputs:
        - gpu/num: 10000
          training_hour: 355.2 # 14.8 days
          gpu/tdp: 0.3 # 300 Watts
          pue: 1.1
          carb_inten: 0.429 # kgCO2eq/kWh, the carbon intensity of the training region
          gpu/cap: 1.2 # kgCO2/cm2
          gpu/area: 8.15 # cm2
          cpu/cap: 1 # kgCO2/cm2
          cpu/area: 1.47 # cm2
          ssd/cap: 0.024 # kgCO2/GB
          ssd/area: 32768 # GB (32 TB)
          dram/cap: 0.4 # kgCO2/GB
          dram/area: 256 # GB
          hardware-unit-num: 1250 # gpu/num / 8, assuming one CPU, SSD, and DRAM per 8 GPU/TPU chips (one server stack)
```
You can run the manifest using the IF command-line tool:

```sh
ie --manifest llm-carbon-basic.yml
```

If you want the output written to a YAML file, use the following command:

```sh
ie --manifest llm-carbon-basic.yml --output <result_file_name>
```
The result will be as follows:
```yaml
...
tree:
  children:
    child:
      pipeline:
        - training-operation-carbon-multiply
        - device-expected-lifespan-hours-per-year-multiply
        - reserved-device-hour-with-device-expected-lifespan-divide
        - gpu-embodied-carbon-multiply
        - cpu-embodied-carbon-multiply
        - ssd-embodied-carbon-multiply
        - dram-embodied-carbon-multiply
        - embodied-carbon-sum
        - llm-carbon-sum
      defaults:
        thousands-per-unit: 0.001
        days-per-year: 365
        hours-per-day: 24
        seconds-per-hour: 3600
        expected-lifespan: 5
      inputs:
        - gpu/num: 10000
          training_hour: 355.2
          gpu/tdp: 0.3
          pue: 1.1
          carb_inten: 0.429
          gpu/cap: 1.2
          gpu/area: 8.15
          cpu/cap: 1
          cpu/area: 1.47
          ssd/cap: 0.024
          ssd/area: 32768
          dram/cap: 0.4
          dram/area: 256
          hardware-unit-num: 1250
      outputs:
        - gpu/num: 10000
          training_hour: 355.2
          gpu/tdp: 0.3
          pue: 1.1
          carb_inten: 0.429
          gpu/cap: 1.2
          gpu/area: 8.15
          cpu/cap: 1
          cpu/area: 1.47
          ssd/cap: 0.024
          ssd/area: 32768
          dram/cap: 0.4
          dram/area: 256
          hardware-unit-num: 1250
          thousands-per-unit: 0.001
          days-per-year: 365
          hours-per-day: 24
          seconds-per-hour: 3600
          expected-lifespan: 5
          operation-carbon: 502856.64
          expected-lifespan-duration: 43800
          expected-lifespan-rate: 0.008109589041095891
          gpu-carbon-embodied: 793.1178082191782
          cpu-carbon-embodied: 14.9013698630137
          ssd-carbon-embodied: 7972.050410958906
          dram-carbon-embodied: 1038.0273972602743
          carbon-embodied: 9818.09698630137
          total-carbon: 512674.7369863014
```
We can see that `CO2eq_oper` is 502856.64 kgCO2eq, `CO2eq_emb` is 9818.10 kgCO2eq, and the total carbon footprint of the model, `CO2eq`, is 512674.74 kgCO2eq.
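As a sanity check, the same pipeline can be reproduced in a few lines of Python; this simply replays the manifest's Multiply, Divide, and Sum steps outside of IF:

```python
# Replays llm-carbon-basic.yml's pipeline arithmetic to verify the IF output.
operation_carbon = 10_000 * 355.2 * 0.3 * 1.1 * 0.429  # 502856.64
lifespan_hours = 5 * 365 * 24                          # 43800
rate = 355.2 / lifespan_hours                          # 0.008109589...
gpu = 10_000 * rate * 1.2 * 8.15                       # 793.117...
cpu = 1_250 * rate * 1 * 1.47                          # 14.901...
ssd = 1_250 * rate * 0.024 * 32_768                    # 7972.050...
dram = 1_250 * rate * 0.4 * 256                        # 1038.027...
carbon_embodied = gpu + cpu + ssd + dram               # 9818.096...
print(operation_carbon + carbon_embodied)              # 512674.736... kgCO2eq
```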
Sometimes we want to estimate an existing LLM's emissions and find that we don't have exact values for the training hours. In that case, we can estimate them with the equation `T = C / (n * FLOP_peak * eff)`, where `C` is the total computation required, in floating-point operations, `FLOP_peak` is the peak throughput of the device, and `eff` is the hardware efficiency of the device. Since the operational emissions of an LLM combine both training emissions and inference emissions, we use this equation to estimate the training hours and the inference hours.
For the computation required for training, we can use the approximation `C_train ≈ 6PD`, with parameter count `P` and training dataset size `D` (tokens). For the computation required for inference, we can use `C_inference ≈ 2P * D_inference`, where `D_inference` is the inference dataset size (tokens).
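For example, here is a rough Python estimate for a GPT-3-scale model. The 175B-parameter and 300B-token figures come from the GPT-3 paper cited in the manifest; the GPU count, per-GPU peak throughput, and efficiency are illustrative assumptions, not values from this tutorial:

```python
# Estimate training hours: T = C / (n * FLOP_peak * eff), with C_train ≈ 6 * P * D.
P = 175e9            # parameters (GPT-3)
D = 300e9            # training tokens (GPT-3)
C_train = 6 * P * D  # ≈ 3.15e23 FLOPs

n = 10_000           # number of GPUs (assumed)
FLOP_peak = 125e12   # peak throughput per GPU in FLOP/s (assumed, roughly V100 tensor peak)
eff = 0.3            # hardware utilization efficiency (assumed)

T_seconds = C_train / (n * FLOP_peak * eff)
print(f"T ≈ {T_seconds / 3600:,.1f} hours")  # ≈ 233.3 hours under these assumptions
```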
You can find the corresponding manifest files in the repo's examples folder as `llm-carbon-with-estimated-training-time1.yml` and `llm-carbon-with-estimated-training-time2.yml`.
Depending on your requirements, you can use the manifest template files located in the manifest folder to compute LLM training carbon emissions.