
Communicating carbon emissions #52

Open
mnot opened this issue Apr 12, 2023 · 22 comments
Labels
adoption Relating to the adoption of a draft

Comments

@mnot
Member

mnot commented Apr 12, 2023

See, eg, https://datatracker.ietf.org/doc/draft-martin-http-carbon-emissions-scope-2/

@mnot mnot added the adoption Relating to the adoption of a draft label Apr 12, 2023
@gregw

gregw commented Apr 12, 2023

This is the response I got from a sustainability consultant friend (Ken Lunty), with some links to further reading:

Hi Greg, an interesting idea, and increasingly relevant as organizations commit to disclosing their Scope 3 emissions (which are the Scope 1 & 2 emissions in their supply chain). It's also likely to become more and more significant with the take-up of AI.

In terms of energy mix, I'm not sure of the best way, but country emission factors are typically publicly available. In Australia, this would be in the National Greenhouse Accounts methods and factors workbook, which gets updated every year to account for increased renewables in our grid.

Electricity emissions are broken down to the state level.

I don't know much about the IT sector, but the boundary of the service would be the more challenging piece, and this would tie into a consistent methodology for everyone.

Some resources would be the GHG Protocol website, but more specifically I would have a look at the Environdec website: https://www.environdec.com/home

This provides environmental declarations for many products and services... we have applied this to building materials for the construction sector, but it can be applied to any product... you will even see tomatoes and pasta on there. It would be interesting to see if it can be applied to a service such as web hosting. The way it works is that you get the product category rules (PCR) approved, which define the methodology for all products within the category, for consistency. This is underpinned by the standard for life cycle assessment. Most importantly, it defines the scope and boundary of the assessment and also the functional unit. For example, for concrete this would be per m3 or ton...

Things can get more complicated in terms of functional units: for a website, would it be per minute browsed? Or per MB transferred? That's probably more your area...

As an example, we are using this framework for roads at the moment, and the functional unit is m2/year of design life, to account for maintenance. It gets really interesting when people can compare products and visualize the carbon reduction through choice; hence the visualization is as important as the calculation precision. Attached is an example of what we have done for a shared user path as part of a major road project.

The pavement on the left is the client reference design, the one on the right is our alternative with 70% lower carbon as it has been designed for purpose rather than tradition.

So the visualization engages the engineers to innovate...which is the best part!

Sorry for all the messages, just had a bit of time on the train. I actually just did a podcast with Roads Australia about the above and believe it can be applied to anything... it just requires sustainability professionals to get out of their ivory tower and work with the people who can make the change...

https://www.linkedin.com/posts/ken-lunty-putting-decarbonisation-at-the-centre-of-activity-7051438588005154816-hZOe?utm_source=share&utm_medium=member_ios

Looking forward to some great discussion on this!

[Attached image: IMG-20230411-WA0001]

@ioggstream

While I support decarbonisation, I think the WG should probably engage with subject matter experts (in carbon emissions and, more generally, in environmental policies) to assess the effectiveness of this kind of solution.

All this, provided that the published data are trustworthy.

@gregw

gregw commented Apr 12, 2023

@bertysentry

Some more background reading from the Green Software Foundation (Linux Foundation):

@bertysentry

Some context around this idea

Note
Reminder: From the definition of scope 3 carbon emissions, when calculating upstream Scope 3 carbon emissions, you must include the Scope 2 emissions from your suppliers and providers.

In IT, we will need to measure the Scope 3 emissions of:

  • an entire infrastructure,
  • a service,
  • an application,
  • or a user.

That means we must include the Scope 2 emissions from any 3rd-party online service we rely on.

Some examples:

  • A user will rely on a Web site, a Google search, a ChatGPT conversation.
  • An application will rely on 3rd-party REST APIs.
  • An infrastructure may rely on SaaS platforms, online S3 storage, etc.

We note that most online providers deliver their services through HTTP (which represents approx. 90% of Internet traffic).

Why would we want to assess our Scope 3 emissions (and thus know the Scope 2 emissions of our 3rd-party suppliers)? Because legal pressure is building up on organizations to do so. And because this allows users and organizations to understand what is responsible for how much CO2, and to change their behavior (human or program) if necessary to reduce their carbon emissions.

Here is a summary of the regulatory obligations for companies to disclose their Scope 3 carbon emissions, worldwide:

Europe: Since 2014, large companies listed on European Union stock exchanges have been required to disclose information about their environmental strategy, including the greenhouse gas (GHG) emissions related to their activities (including Scope 1, 2, and 3 emissions), in their annual report. Additionally, the EU Non-Financial Reporting Directive, adopted in 2014 and revised in 2018, requires large companies to disclose information about their policies, risks, and environmental and social impacts, including GHG emissions, in a separate report.

United States: There is no federal law in the United States that requires companies to disclose their Scope 3 carbon emissions. However, some state regulations may require the disclosure of information about GHG emissions, including Scope 3 emissions (California is a notable example).

Asia: Regulations vary widely across Asian countries. For example, China requires companies to publish sustainability reports that include information about GHG emissions, including Scope 3 emissions. In Japan, companies listed on stock exchanges must disclose information about their GHG emissions, including Scope 1, 2, and 3 emissions, in their annual report.

Warning
This list is not exhaustive and regulations may have changed recently. Companies may also be incentivized to voluntarily disclose information about their Scope 3 carbon emissions in response to growing pressure from consumers, investors, and other stakeholders.

@bertysentry

g vs J

Question: Should we report the CO2 emissions (in grams) associated to processing an HTTP request and building its response, or should we report its energy usage (in Joules)?

Pro g (grams) arguments:

  • Regulations and laws around the world require organizations to report their carbon emissions in metric tons, not their energy usage.
  • It's impossible to calculate the carbon emissions in grams from the energy usage in Joules for a 3rd-party service. You would need to know the carbon intensity of the electricity provider of the 3rd-party, but this information is not public. Or else it would need to be included in the proposed response header.

Pro J (Joules) arguments:

  • It's easy for a program like an HTTP server to calculate its energy usage precisely (e.g. using Intel's RAPL interface). Getting the carbon intensity of the electricity provider requires access to an external Web service.
  • Developers who want to optimize the consumption of 3rd-party Web services need to know the energy usage of the service they consume. Carbon emissions are "energy usage" (in J) multiplied by "carbon intensity of the grid" (in g/J). The carbon intensity factor can vary greatly in time, so it would be difficult for a developer to understand whether a query is more efficient, or the carbon intensity just got lower, because the wind just started blowing somewhere and wind turbines reduced the overall carbon intensity on the grid.
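The relationship in the second point can be made concrete. A minimal sketch, assuming the grid's carbon intensity is expressed in gCO2-eq per kWh (the function name and the example intensity figures are illustrative, not from the draft):

```python
# Carbon emissions = energy usage x grid carbon intensity.
JOULES_PER_KWH = 3_600_000  # 1 kWh = 3.6 MJ

def joules_to_gco2(energy_j: float, intensity_g_per_kwh: float) -> float:
    """Convert an energy reading in joules to grams of CO2-eq,
    given the grid's carbon intensity in gCO2-eq/kWh."""
    return energy_j * intensity_g_per_kwh / JOULES_PER_KWH

# The same 50 J request emits very different amounts of CO2 depending
# on the grid: the intensity factor, not the energy, changed.
fossil_heavy = joules_to_gco2(50, 475)  # illustrative fossil-heavy mix
windy_day = joules_to_gco2(50, 80)      # illustrative wind-heavy mix
```

This is exactly why a joule figure is more stable for comparing two queries: a grams-only value mixes the server's efficiency with a fluctuating intensity factor.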

Proposed conclusion

We probably want to expose both metrics: emissions in grams of CO2-eq, and energy usage in Joules.

As regulations require reporting carbon emissions in grams, this metric should be in a dedicated HTTP response header.

Energy usage in Joules being useful to developers only, this metric could be exposed through a Server-Timing header.
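As a sketch of that conclusion, a response could carry both metrics like this. The Scope 2 header name comes from the draft; the Server-Timing metric name `energy`, and the reuse of its `dur` slot for a joule value, are illustrative assumptions rather than anything specified:

```python
def emission_headers(grams_co2eq: float, joules: float) -> dict:
    """Build illustrative response headers carrying both metrics:
    grams in a dedicated header, joules via Server-Timing."""
    return {
        # Regulated metric: grams of CO2-eq for producing this response.
        "Carbon-Emissions-Scope-2": f"{grams_co2eq:.7f}",
        # Developer metric: energy in joules; the metric name and the
        # reuse of 'dur' for a non-duration value are assumptions.
        "Server-Timing": f"energy;dur={joules:.3f}",
    }
```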

@gregw

gregw commented Apr 18, 2023 via email

@bertysentry

Is HTTP Response the right layer to expose carbon emissions?

@gregw asks this legitimate question, along with concerns about the extra bandwidth this new header requires.

Introduction on carbon emissions in IT

Typical IT departments consider 3 sources of carbon emissions: data centers, terminals, and 3rd-party services (cloud and SaaS):

[image: the three sources of carbon emissions]

Each area has its specificities and its own tools (or lack thereof):

  • For data centers, we can measure their electricity usage.
  • For terminals, we can calculate their embodied carbon footprint.
  • For cloud services, we have some sort of reports that are barely usable at this stage (looking at you, AWS), based on internal calculators.
  • For SaaS platforms (like GitHub, Atlassian, Google Docs, etc.), we have zero information.
  • For networks (Internet and telecom), we have zero information, except for vague estimations.

Too many carbon footprint tools are simple calculators: you enter how many servers you have, how many VMs, how many containers you run, etc., and you get a rough approximation of the carbon footprint of your infrastructure (disclosure: my company develops software that measures the electricity usage and carbon emissions of physical systems in data centers).

Getting actual carbon emission values is, however, key to proper reporting and optimization. You cannot see whether your attempts at optimizing a piece of software (or an architecture) really pay off if the carbon emissions are calculated or estimated instead of measured.

IPv6

Some people think it should be integrated into a lower level (like IPv6).

I agree that exposing carbon emissions associated to the transport of data by the network infrastructure should be done at the network layer, so IPv6.

However, the network layer is not the right place to expose application-related metrics like carbon emissions, because the energy required to perform a task (compute, memory, and storage) is going to be evaluated at the process level (in the operating system), which knows nothing about the network transport layer.

Service/resource level

Others think it should be exposed by the service provider, as "general" information about its services, like a co2.json file placed at the root of the HTTP server, as in the fictitious example below:

{
  "/images/*": "0.00005748",
  "/rest/fictitious": {
    "GET": "0.0000046",
    "POST": "0.00245"
  }
}

Clients would query /co2.json from time to time, and aggregate these values according to the requests that have been sent to the HTTP server.
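A hedged sketch of such a client, aggregating estimates from the fictitious /co2.json document above (the glob matching and the function name are assumptions; no such file is defined by the draft):

```python
from fnmatch import fnmatch

def estimate_gco2(co2_doc: dict, method: str, path: str) -> float:
    """Estimated grams of CO2-eq for one request; 0.0 if unlisted."""
    for pattern, value in co2_doc.items():
        if fnmatch(path, pattern):
            if isinstance(value, dict):            # per-method entries
                return float(value.get(method, 0.0))
            return float(value)
    return 0.0

co2_doc = {
    "/images/*": "0.00005748",
    "/rest/fictitious": {"GET": "0.0000046", "POST": "0.00245"},
}

# Aggregate over the requests this client actually sent.
sent = [("GET", "/images/logo.png"), ("POST", "/rest/fictitious")]
total_gco2 = sum(estimate_gco2(co2_doc, m, p) for m, p in sent)
```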

The limitation is that it only allows clients to calculate their Scope 3 carbon emissions based on estimates, which doesn't encourage actual optimization on the client side. If Google announced in their /co2.json file that a query to /search averages 0.2 grams, it wouldn't tell you whether using double quotes in your query makes it more efficient, etc. So as a developer relying on Google searches in my app (again, this is fictitious), I have no incentive to optimize my queries. I will just limit the number of queries, but maybe make them more complex, which would be counter-productive.

HTTP Response

Getting the actual carbon emissions associated to the production of an HTTP response after an HTTP request is arguably the easiest way for developers, client applications and IT services to measure the carbon emissions associated to the consumption of 3rd-party services, because it will reflect their actual usage of the services, and not based on estimations.

The value exposed in the HTTP response will not include the carbon emissions associated to the network transport, which is a separate concern (and may be addressed in a separate RFC for IPv6).

First implementations of the HTTP header may actually use static estimations (as in the above example with 0.2 grams per Google search). But improvements in energy usage observability and carbon intensity measures will allow future implementations to provide actual values.

Extraneous bandwidth usage

Isn't it counter-productive to transport more data in order to report carbon emissions? Yes, it could be, in very specific cases or with really bad implementations.

However please note that this draft does not mandate that all HTTP responses produce the header. An HTTP server could choose to report the aggregated sum of the carbon emissions of the last 10 requests, and skip the header for these 9 previous requests (for the same client, obviously). Or it could skip the header when the value is not significant. This could help optimize the bandwidth usage, but at the expense of precision and optimization.
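The aggregation strategy described above can be sketched as follows (the header name follows the draft; the class name and the report interval are arbitrary assumptions):

```python
from collections import defaultdict

REPORT_EVERY = 10  # arbitrary: emit the header once per 10 responses

class EmissionReporter:
    """Accumulate per-client emissions and emit an aggregated
    header only on every REPORT_EVERY-th response."""
    def __init__(self) -> None:
        self._pending = defaultdict(float)  # client -> grams not yet reported
        self._count = defaultdict(int)      # client -> responses since report

    def headers_for(self, client: str, grams: float) -> dict:
        self._pending[client] += grams
        self._count[client] += 1
        if self._count[client] < REPORT_EVERY:
            return {}                       # skip the header this time
        total, self._pending[client] = self._pending[client], 0.0
        self._count[client] = 0
        return {"Carbon-Emissions-Scope-2": f"{total:.7f}"}
```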

So, my personal opinion is that this extra HTTP response header is very thin in terms of bandwidth usage, compared to the average HTTP response header length (700 bytes on average, according to the HTTP Archive... according to ChatGPT 😅).

Obviously, the syntax of the header could be changed to improve that, like:

CO2: scope2=0.0004573

@gregw

gregw commented May 5, 2023

@bertysentry Thanks for the extra information.

However, I think you are selling short the capabilities of a Service/resource level. I agree that if it was just a /co2.json resource containing the general figures for the server, then it would have the limitations that you describe.

However, it is entirely possible to design a service that has equivalent resolution to a per response when necessary, so that requests can be tuned, but avoids that expense when not necessary.

I'm thinking of a /CO2/* resource space, similar to the /proc/* space on a Linux system, that allows both general and specific requests. So perhaps:

  • /CO2/summary.json for the "general" information you describe
  • /CO2/history.json?from=yyyy-mm-dd&to=yyyy-mm-dd for the specific information on individual recent requests
  • /CO2/connection/summary.json for the summary information for all requests received over the same connection as the enquiry (i.e. session)
  • /CO2/connection/history.json for the detailed information for specific requests received over the same connection as the enquiry (i.e. session)
  • /CO2/cookie/XYZ/summary.json for the summary information for all requests with the same value as the caller for the XYZ cookie (i.e. session)
  • /CO2/cookie/XYZ/history.json for the detailed information for specific requests with the same value as the caller for the XYZ cookie (i.e. session)
  • /CO2/uri/foo/bar/summary.json for the summary information for requests to the /foo/bar URI
  • /CO2/uri/foo/bar/history.json?from=yyyy-mm-dd&to=yyyy-mm-dd for the detailed information for individual requests to the /foo/bar URI
  • etc.

Something like this style will give equal resolution to a per response header, so an individual response can be identified, but it also gives various useful aggregates and summaries.
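At the least-dynamic end of that space, the summary resource needs nothing more than running totals. A sketch of a possible /CO2/summary.json payload (every field name here is an assumption for illustration, not from any spec):

```python
import json

# Running totals are all a server needs for the summary resource;
# no per-request tracking is required.
summary = {
    "period": {"from": "2023-05-01", "to": "2023-05-05"},
    "requests": 184223,
    "scope2_g_co2eq": 41.7,      # aggregate for the period
    "energy_joules": 312540.0,   # extra detail a header could not carry
}

def summary_json() -> bytes:
    """Body served for GET /CO2/summary.json."""
    return json.dumps(summary).encode("utf-8")
```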

There is also the possibility of a combined solution, with a /CO2/* space service being used to control which responses have a co2 header applied and what details are included in that.

The upsides of this include:

  • A lot more detail can be provided than can be squeezed into a response header
  • scope3 and other info (e.g. joules) can be provided.
  • No per response data overheads
  • Rather than the server needing to heuristically decide which responses to attach a co2 header to, the client can form specific queries that match its exact needs.
  • If a client doesn't need or use the information, then it need not be generated (or can at least be left in raw form).
  • servers that do not have the mechanisms to track individual responses could at least provide the summary details.
  • the embodied costs can be provided in the summary resources
  • I think it will attract a lot less derision from those that are sceptical of efforts to reduce CO2 emissions

The down sides of this (that I can think of) are:

  • less visibility: users that don't know about it won't see it.
  • authentication and authorization will be needed: it is a space that can leak information or be a DoS attack vector. This can be somewhat mitigated by using "same cookie" or "same connection" queries, but careful thought is needed. Also, using /CO2/* to control which responses have the header could be a good mitigation of security concerns.

@bertysentry

@gregw Thank you for the feedback!

The /co2/* namespace would be a great service to users and application developers. However, we cannot expect all HTTP servers to implement and maintain such a complex service. This would typically be implemented as a reverse or forward proxy, like below:

  1. An enterprise forward proxy would collect and aggregate the Carbon-Emissions-Scope-2 header from all responses it processes.
  2. The proxy would expose the collected and aggregated data in /co2/* as you described. Aggregation could even be done by user groups, etc. if this proxy requires authentication.
  3. This would allow large organizations to assess their Scope 3 carbon emissions, associated to the use of external Web services.
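Steps 1 and 2 above could look roughly like this (the per-host aggregation key and the malformed-value handling are assumptions; the header name is the draft's):

```python
from collections import defaultdict

class ProxyAggregator:
    """Forward-proxy side: collect the Carbon-Emissions-Scope-2
    header from relayed responses and aggregate it per upstream host."""
    def __init__(self) -> None:
        self.per_host = defaultdict(float)  # host -> grams of CO2-eq

    def observe(self, host: str, response_headers: dict) -> None:
        value = response_headers.get("Carbon-Emissions-Scope-2")
        if value is None:
            return                           # upstream didn't report
        try:
            self.per_host[host] += float(value)
        except ValueError:
            pass                             # ignore malformed values

    def co2_summary(self) -> dict:
        """Payload the proxy could expose under its own /co2/* space."""
        return {host: f"{g:.7f}" for host, g in self.per_host.items()}
```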

@gregw

gregw commented May 15, 2023

However, we cannot expect all HTTP servers to implement and maintain such a complex service.

I agree that the full /co2/* as I've described is a little complex. However, I think it would be wise, no matter what proposal goes forward, to have several levels of compliance. So for example, the minimal compliance for a /co2/* space might just be the non-dynamic /CO2/summary.json resource.

But ultimately, the space is nowhere near as complex as collecting the data in the first place. I would expect good open-source implementations to be quickly made available... ultimately the space would really just be implemented by something that parses the request log. So to progress this area, standardising how emissions might be logged in standard-format request logs would be a good idea.
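To illustrate the log-parsing idea, here is a sketch that derives per-URI totals from an access log whose last field is grams of CO2-eq. The extended log format shown is invented for illustration; no such standard field exists yet:

```python
from collections import defaultdict

def summarise(log_lines):
    """Return {uri: total gCO2-eq} from log lines whose last field
    is the emissions value (invented common-log-format extension)."""
    totals = defaultdict(float)
    for line in log_lines:
        parts = line.split()
        uri = parts[6]                   # token after the quoted method
        totals[uri] += float(parts[-1])  # trailing emissions field
    return dict(totals)

log = [
    '203.0.113.9 - - [05/May/2023:10:00:00 +0000] "GET /foo/bar HTTP/1.1" 200 512 0.0000046',
    '203.0.113.9 - - [05/May/2023:10:00:01 +0000] "GET /foo/bar HTTP/1.1" 200 512 0.0000046',
]
```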

I also like the proxy idea. So it would be great to specify both the per-response header(s) and the /CO2/* space in a compatible way. Thus there would be flexibility in solutions, which could include:

  • just /CO2/* space
  • just CES2 response header(s)
  • CES2 response headers from application servers stripped & aggregated into a /CO2/* space by enterprise proxy.
  • If you really REALLY wanted to, although it would be a bit inefficient and introduce latency, a /CO2/* space on the application servers that is used to add CES2 headers in a proxy.

However, thinking about the implementation of the last mode made me realise a difficulty with the CES2 response headers: at the time the response headers are committed for a large response, it is unlikely that the emissions will be known in full. Thus any proposal should specify that the header can be carried in a trailer... or just use the /CO2/* space :)

@ahengst

ahengst commented Jul 3, 2023

Greetings / intro:
Hi, my name is Andreas Hengst; I've worked for decades in IT operations (most recently data storage & backups). I have rudimentary network knowledge, i.e. enough to configure and troubleshoot, not enough to design protocols. I am very interested in energy and carbon issues, and fortunate to have encountered @bertysentry's presentation at GrafanaCon recently, which is how I then learned of this thread and the draft RFC. At present I'm monitoring the behaviour of an experimental heating system involving my own 2500 L thermal battery, which allows a heat pump to run at the "best" time, not when the heat demand is high. So I've had experience with network devices of ALL sizes in terms of their energy consumption.

I'm not an official participant in any workgroups ("httpwg") so may have to keep quiet after this post (?).
I intend to do more background study of the network technical details mentioned above.
Until then I may not appreciate how rigorous or experimental or fluid the present draft discussion is.

I do have ideas and thought experiments, I'll share two and I hope they can be of some use.

  1. devices don't know where "our company" starts and ends, and rely on both internal and externally hosted services, so cannot (should not?) be expected to accurately distinguish the different scopes, without requiring and maintaining some sort of IP address range database, or deliberate router configs, complicating management and introducing errors. (Might the RFC define router config settings that can increment "scope hops" as traffic crosses company boundaries?)
    1b) if scope is applied to business transactions between companies, should each company just tally their CO2, then share the relevant data as part of the billing-payment process (rather than network)?

  2. energy use all by itself is a good thing to measure. Any green energy not consumed means (usually) an equivalent reduction of fossil electricity created somewhere else. In other words if the only thing the end user sees is "why so much accumulated electricity for this result?" there's enough knowledge to perhaps dig deeper (make something more efficient) or to choose a more efficient alternative service. Keeping complexity to a minimum, but still achieving improved energy use.
    Multiple easy-to-operate protocols working together might be better able to reliably translate kWh or J to CO2e.

Question: disk arrays with FC connections (not S3 / HTTP) might not have a way to embed energy usage data without other complementary protocols, which is why I previously had pictured a lower-level (more physical) network protocol for this sort of tallying up of energy. Was this ruled out for some reason?

Enough armchair brainstorming for me. I look forward to learning about how this data would be spawned, tallied up, reported or visualized, secured, protected from inappropriate manipulation, etc etc. Fascinating stuff, so much potential!

If you have suggestions for topic areas or homework I'm open to that (thanks!). I see the six links above (April 2023) and have some reading to do.
Thanks
AH
Edmonton, AB, Canada

@bertysentry

The draft is about to expire. So this won't have a chance of ever being adopted? I think it's a shame, given the challenges we're facing in observability and sustainability.

@mnot
Member Author

mnot commented Sep 29, 2023

Based on discussion so far, I'd say that there's interest in the topic, but there's skepticism about the proposed mechanism - it may not be the best way to meet your goals.

@bertysentry

Should I update the draft to take into account some of the comments, or should we organize a sort of discussion on this? I really don't know how you guys usually proceed.

@mnot
Member Author

mnot commented Sep 30, 2023

If you'd like to update it, that would help move discussion forward. Generally, we adopt things that have strong support in the WG -- especially by implementers.

@ahengst

ahengst commented Oct 3, 2023

Hi again, just a quick hello. Nothing like "expiring soon" to reinvigorate interest!
Happy to meet on zoom.
To rephrase my comments July 3...

  • emissions should not be double-counted. If HTTP is used it would be only for interest or troubleshooting or optimizing, not billing, not auditing. (is there an expectation it should be for measurement/billing? I don't yet feel that it would give a complete accounting of energy/emissions)
  • the devices we are dealing with might be capable of identifying energy used, while emissions require additional info. Focus on energy as a first step.
  • many online tools are known to have a high energy footprint. Examples like AI and bitcoin come to mind. There is clear benefit (to some) for revealing these footprints, and many people are curious. I'm hoping for a 'good' end user experience - do we have an idea what that looks like?

Question about RFCs... if observability-of-energy depends on additional components (e.g. front end visualization and device-level measurements) could the protocol specification stand on its own or do we need/want to develop the 'full stack'? I presume it would help to do so...

Andy

@mnot
Member Author

mnot commented Oct 3, 2023

We generally stick to protocol details in RFCs.

@gregw

gregw commented Oct 4, 2023 via email

@bertysentry

About the difficulty to measure the energy and carbon footprint of an HTTP Response

Many responders to the original draft in the mailing list and on this GitHub issue mentioned it's difficult to get the metric in the first place (either in Joules, or in gCO2eq).

It is true the technology is still in its infancy, but we're making a lot of progress. Just to give a few examples:

Others in this discussion were concerned that the values wouldn't be precise enough to be of any use when assessing the carbon footprint of the usage of a given service, served over HTTP.

Most (if not all) organizations assess their carbon emissions with rough estimations, calculated once a year at best. Given the current state of carbon footprint/emissions reporting, adding a little more information, even if not exact or not actually measured, will be helpful to the community.

Legal pressure is building, everywhere. In some countries, all suppliers are legally mandated to report the carbon emissions of their services to each of their customers/users. Therefore, solutions will come up: some startups will implement ways to report the carbon emissions of various Web services, live. Then some larger vendors will catch up. And everybody is going to use their own way of exposing this information.

It will be much easier and much faster if we all agree on a format to communicate this information, rather than trying to reconcile everybody in 2035.

@gregw

gregw commented Oct 4, 2023 via email

@ahengst

ahengst commented Oct 5, 2023

I reviewed the WEF page linked above that describes Scopes. I'm going to assume vendors tallying and reporting their (scope 1) emissions, and then billing their customers, results in customers becoming aware of their Scope 2 (for energy purchases) and Scope 3 (for upstream 'embodied' energy). The proposal is called "scope 2" but for upstream web services it looks more like Scope 3 to me. More significantly, we also identified difficulty knowing where "our" scope 1 ends and "their" scope 3 begins.

The network messaging we are discussing won't be part of that reporting unless it can fully and reliably take the place of 'conventional' reporting methods, and avoid double-reporting or the side effect of making conventional methods more complicated.
I can't yet imagine network based messaging being an accounting/reporting/compliance tool. BUT I do see it having practical value because it reveals something largely invisible to internet users. This makes it really intriguing.
Could this new mechanism be useful even if it's NOT accurate at measuring CO2?
Could this new mechanism be useful even if it's not accurate at measuring KWh? (because that is sounding tricky too)
Is there an even simpler metric, if all we want to see as end users is "this blockchain transaction is like 85 hours of Netflix"? I'm just asking this question, because if this RFC won't be adopted maybe something simpler could be.

In the spirit of kicking tires: once an end user has identified a "high energy" internet provided service, is there a way to learn more about it?

  • was the energy used 'immediately' (compare with GJ on my heating gas bill)?
  • was the energy averaged and includes equipment standby time (compare with Fixed delivery charge on gas bill)?
  • was the energy pro-rated because a model had to be trained for X days ahead of time, then shared by a projected number of customers?

Time is an interesting part of making energy consumption visible. We don't really want to end up with something that looks like my utility bill, but what would we expect to see? Have mockups been done that I haven't seen (since I'm only here on github until now)?
Thanks
Andy
