Create a manifest file for training an LLM #63
Labels: content-project, registered (a project which has been registered with the GSF), submitted (the project team has submitted their solution), WINNER!
Type of project
Writing content about Impact Framework
Overview
The idea comes from this discussion.
Large Language Models (LLMs) have evolved rapidly over the past year and have shown great potential in many areas. At the same time, we cannot ignore their impact on the environment.
Therefore, we would like to survey existing LLMs and produce a manifest file that could be used to calculate the energy and carbon consumed by LLM training. The actual models and calculations are not requirements here; it is enough to create a manifest .yaml file that can run, as this will expose gaps in the current stack.
Questions to be answered
At the moment we have no questions.
If there are relevant suggestions or resources, feel free to leave a message.
Have you got a project team yet?
Yes and we aren't recruiting
Project team
Team members: @Xiaoliang1122, @Irenia111, @Jjing-Liang
Terms of Participation
Project Submission
Summary
This tutorial outlines methods for estimating the carbon emissions of Large Language Models (LLMs) during training and inference using the Impact Framework. It offers manifest examples for several levels of estimation detail and helps compare emissions across different LLM configurations, which supports carbon reduction. An updated dataset of papers on LLM carbon emissions and related public data is provided for convenient calculation, with the aim of making AI development more environmentally sustainable.
Problems
Large Language Models (LLMs) have evolved rapidly over the past year and have shown great potential in many areas. At the same time, we cannot ignore their impact on the environment. The current version of the Impact Framework lacks a comprehensive plugin for calculating the carbon emissions generated by LLMs. This gap makes it hard for users to understand and manage the environmental impact of their AI-driven operations. Our proposed solution uses the Impact Framework to define manifests tailored to assessing carbon emissions from LLMs. This will let users see what makes up an LLM's carbon footprint and identify the critical factors influencing it. In addition, our research compiles a readily accessible dataset of public data relevant to LLM computation, so that those who want to evaluate and mitigate the carbon emissions of their AI models can look up and apply the values directly.
Application
Our solution serves as a tutorial guide for calculating the carbon footprint of LLMs, consisting of a collection of manifest files and explanatory articles. Working with the Impact Framework, we illustrate the carbon footprint of LLMs using IF plugins. By supplying simple input data, you can get a comprehensive view of an LLM's carbon emissions. The manifests and accompanying explanatory content help you explore the structure of the LLM carbon footprint and understand each emission component. An updated dataset of papers on LLM carbon emissions and related public data is also provided for convenient calculation.
Prize category
Best content
Judging criteria
Overall Impact:
The tutorial's approach to evaluating LLM carbon emissions stands to bolster sustainability efforts by offering a methodology for assessing AI's environmental footprint. By showing users how to apply the Impact Framework to make eco-conscious decisions, it fosters more sustainable AI development and promotes a shift towards greener technology practices. To realize this impact, the tutorial requires broad outreach to engage the AI and sustainability sectors, integration with the Impact Framework for ongoing support, and continued research to refine the methodology and expand the data.
Clarity:
The tutorial clarifies the complex subject of LLM carbon emissions with well-structured guidance and relatable examples. Its use of visual aids and plain language makes the material comprehensible to a wide range of users, from seasoned professionals to those new to the field, so the information is both accessible and engaging.
Innovation:
Innovative in both concept and execution, the tutorial breaks new ground in AI sustainability by applying the Impact Framework to carbon emission analysis for LLMs. The diverse manifest examples not only streamline the estimation process but also educate users on the nuances of carbon footprint assessment.
Video
https://youtu.be/pOIdXF0N9HQ
Artefacts
https://github.com/Jjing-Liang/LLMCarbon--/blob/main/content.md
Usage
https://github.com/Jjing-Liang/LLMCarbon--/blob/main/README.md
Process
The development process of this tutorial involved thorough research and analysis of the environmental impact of LLMs and of existing carbon emissions evaluation tools. We studied the Impact Framework extensively to understand its functionality and applicability. Based on this research and framework analysis, we designed multiple manifests of different granularity to calculate LLM carbon emissions, considering factors such as server energy consumption, training time, data transfer, and manufacturing costs. The tutorial gives users a simple and comprehensive method to calculate and assess the carbon emissions of their LLMs. It underwent testing and optimization to ensure accuracy and reliability.
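As a rough illustration of how the factors listed above can combine, here is a minimal Python sketch. It is not one of the tutorial's manifests and does not use any Impact Framework plugin; it only assumes the common split into operational emissions (energy multiplied by grid carbon intensity) and embodied emissions (manufacturing emissions amortized over the share of device lifetime used). All names and numbers, including `TrainingRun` and its fields, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TrainingRun:
    """Hypothetical inputs for one LLM training run (illustrative names only)."""
    device_count: int                 # number of accelerators used
    avg_device_power_w: float         # average power draw per device, in watts
    training_hours: float             # wall-clock training time, in hours
    pue: float                        # data-centre power usage effectiveness (>= 1.0)
    grid_intensity_g_per_kwh: float   # grid carbon intensity, in gCO2e per kWh
    device_embodied_kg: float         # manufacturing emissions per device, in kgCO2e
    device_lifespan_hours: float      # expected device lifetime, in hours
    data_transfer_kwh: float = 0.0    # energy attributed to data transfer, in kWh


def operational_kg(run: TrainingRun) -> float:
    """Operational emissions in kgCO2e: energy used times grid carbon intensity."""
    it_energy_kwh = run.device_count * run.avg_device_power_w * run.training_hours / 1000.0
    # Assumption: PUE scales the server energy only; transfer energy is added as given.
    total_energy_kwh = it_energy_kwh * run.pue + run.data_transfer_kwh
    return total_energy_kwh * run.grid_intensity_g_per_kwh / 1000.0


def embodied_kg(run: TrainingRun) -> float:
    """Embodied emissions in kgCO2e: manufacturing cost amortized by time share."""
    lifetime_share = run.training_hours / run.device_lifespan_hours
    return run.device_count * run.device_embodied_kg * lifetime_share


def total_kg(run: TrainingRun) -> float:
    """Total estimated emissions for the run, in kgCO2e."""
    return operational_kg(run) + embodied_kg(run)


# Made-up example values, not measurements from any real model.
example = TrainingRun(
    device_count=8,
    avg_device_power_w=300.0,
    training_hours=100.0,
    pue=1.2,
    grid_intensity_g_per_kwh=400.0,
    device_embodied_kg=150.0,
    device_lifespan_hours=4 * 365 * 24,
)
print(f"operational: {operational_kg(example):.1f} kgCO2e")
print(f"embodied:    {embodied_kg(example):.1f} kgCO2e")
print(f"total:       {total_kg(example):.1f} kgCO2e")
```

Changing one field at a time (for example the grid intensity or the training hours) and comparing `total_kg` gives a quick sense of which factor dominates for a given configuration.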
Inspiration
Our inspiration for developing this tutorial came from three key sources. Firstly, the growing awareness of environmental issues and the need for sustainable development motivated us to address the environmental impact of artificial intelligence, particularly LLMs. Secondly, the widespread application of LLMs in various domains highlighted the potential environmental challenges associated with their resource-intensive nature. Lastly, discovering the Impact Framework's capabilities in environmental assessment inspired us to write a tutorial that uses the framework to evaluate LLM carbon emissions. By combining these motivations, we aimed to contribute to the sustainable development of AI by providing a comprehensive solution for assessing and managing LLM carbon footprints.
Challenges
We encountered several challenges during the process, including data availability and collection complexities for evaluating the environmental impact of LLMs. Understanding the complex operations of LLMs and accurately evaluating their environmental impact required expertise in AI and environmental evaluation methodologies. Integrating the evaluation framework with LLMs proved challenging, necessitating customization to align with their specific requirements. However, through extensive research, collaboration, and optimization, we successfully addressed these challenges, resulting in a solution for assessing and managing the environmental impact of LLMs.
Accomplishments
We are proud of how quickly we built the LLM evaluation and understood its relationship with carbon emissions. Despite starting from scratch, we dedicated substantial time and effort to acquiring the necessary knowledge and applying it effectively. Throughout the process, we overcame technical and theoretical hurdles through extensive research and practical experimentation. Our team adapted and learned quickly, which let us complete the task efficiently. Ultimately, our pride stems from our ability to swiftly build the LLM manifests and understand the complexities of carbon emissions. This accomplishment reflects our team's commitment and competence and gives us a solid foundation for future work.
Learnings
Through hacking, our team has acquired valuable skills and insights. We now understand the factors influencing carbon emissions in large language models. The work has improved our ability to gather and organize information effectively, enabling informed decision-making. Using the Impact Framework (IF), we learned to work through complex challenges with logical statements and conditions. Our problem-solving and critical thinking skills have also sharpened, allowing us to approach challenges creatively. Overall, this journey has provided us with invaluable knowledge and skills.
What’s next
Our solution aims to have a lasting impact on the Impact Framework ecosystem by deepening our understanding of carbon emissions in LLMs, optimizing processes through advanced technologies, and ensuring scalability and replicability.
We hope to conduct continuous research and data analysis to gain a deeper understanding of the factors that influence the carbon footprint of LLMs. This knowledge will enable us to develop targeted strategies for reducing emissions. Additionally, we want to streamline processes and enhance data analytics capabilities by integrating advanced technologies and intelligent tools. This will improve the efficiency and accuracy of data collection, organization, and analysis, allowing participants to monitor and evaluate their carbon emissions effectively. Furthermore, our solution will be designed to be scalable and replicable across different scales, regions, and sectors. This will facilitate widespread adoption, maximizing its impact and encouraging more stakeholders to embrace low-carbon practices.
In summary, our solution will contribute to the long-term development of the Impact Framework ecosystem by deepening our understanding of carbon emissions, optimizing processes, and ensuring scalability and replicability. Through these efforts, we will drive the adoption of sustainable practices and contribute to a more sustainable future.