[Microsoft] Ecommerce web application using eShoponWeb #32
Comments
Please find attached the case study document for Ecommerce web application. Inferences and observations are also documented - https://docs.google.com/document/d/1i2ECykAXiQNwv0eeuavYMSYUtrob9HuY/edit?usp=sharing&ouid=114480843539813321551&rtpof=true&sd=true |
Thank you for putting this together @srini1978 - maybe you can also share it on the Slack channel for those who don't get a chance to check the issue here? |
@atg-abhishek @Henry-WattTime Case study documented updated and complete |
Overview
The application is a web application used by customers in an ecommerce scenario. It is a sample ASP.NET Core reference application from Microsoft that demonstrates a single-process (monolithic) application architecture and deployment model.
Architecture for the system under consideration
The architecture of the application is described in detail at https://github.com/dotnet-architecture/eShopOnWeb. It consists of a monolithic web application backed by a relational database.
Technical details of the components in the architecture
Technical details of the components in the architecture are available at https://github.com/dotnet-architecture/eShopOnWeb
MVC
The front end of the application's web project implements the classic Model-View-Controller pattern. It includes a number of controllers in its Controllers folder, which work with Views to return rendered HTML to the client, typically a browser. The sample also demonstrates the use of alternate flavors of the MVC pattern, including Razor Pages and API Endpoints, described in more detail below.
Sites for Software Sustainability Actions
Energy Efficiency
While energy efficiency is not in the scope of this use case, it can be achieved by additional tuning at all the software boundary components defined below: app server, database server, network components, etc. Energy efficiency techniques include, but are not limited to, adding caching and optimizing the queries fired against the database. However, tuning itself consumes additional energy, which is for now not accounted for in the E value because the tuning effort is done on physical infrastructure that is outside the boundary of how we are measuring R. This is planned to be included in future versions of the SCI specification as per #223.
Hardware Efficiency
Hardware efficiency is achieved by reducing the number of physical resources the software needs to run. From the list of components that are part of the software boundary (below), the app server and database server are where such tuning techniques can be tried out. At the same time, as these tests demonstrate, there is a minimum amount of hardware that needs to be provisioned to ensure the overall SCI value is lowered.
Carbon Awareness
While carbon awareness is not in scope of this use case, we can make the application greener by shifting the application workload to a cloud region that has lower carbon intensity. The electricity grid in India has a higher carbon intensity than the electricity grids in Europe and the Americas. Hence a drastic reduction in the SCI value can be obtained in both runs by moving the workload to a different region (making the application carbon aware).
Procedure
(What) Software boundary
Excluded components
(Scale) Functional unit
The functional unit R remained the same across all the tests, i.e. number of users, but its value changed during the load test. At some points the peak load of 45 users was attained, while at other times there were barely 10 users. Hence when we define R as the number of users, we should call out that this value is taken during peak usage of the application. For example, if we take R to be the “number of concurrent users on the system”, we can define SCI as a value per R at the point when the measurement is taken. The choice of functional unit applies to all components in your software boundary. For example in this case
(How) Quantification method
The quantification is done by a combination technique: measurement of real world data (e.g. CPU utilization of the app servers, thermal design power (TDP) of the processors, number of cores) and an estimation model for the GPU and memory. Since the energy values for memory are much lower than the calculated energy values for processors (CPUs), we consider these values negligible. We did multiple load test runs with different infrastructures and calculated the power consumed in these runs. For these tests, we used number of users as the functional unit. The energy value is calculated using models which take the CPU utilization of the servers as input. The time component is constant: all tests are run for the same time period. The original intent of the test was to prove that the SCI score can be reduced by making the application more hardware efficient, i.e. by using fewer physical resources. As part of the test plan, we intended to prove that the SCI score can be reduced by using an app server with 1 core as opposed to 2 cores or a quad core.
(Quantify) SCI Value Calculation
The energy value is calculated using the formula
P[kWh] = (Pc * Number of cores + Pr + Pg * Number of GPUs) / 1000
where Pc is the power consumed by the CPU, Pr is the power consumed by memory, and Pg is the power consumed by the GPU. There is no GPU, hence Pg = 0. Similarly, in this case we approximate Pr = 0, as we found that the power consumed by 4 GB of memory is close to 1.45 W and that by 8 GB of memory is approximately 2.45 W, both much lower than the power consumed by the processors. Detailed calculations are in the report below (Report) |
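The energy formula in the case study can be sketched in Python so that a reader can check how large the memory term is for their own hardware. This is a minimal sketch, not the calculation used in the report: the function name, the 10 W per-core figure, and the 24-hour duration are hypothetical, and the explicit multiplication by hours is an assumption (dividing watts by 1000 yields kW, so a run time is needed to reach kWh). Only the 1.45 W memory figure comes from the thread.

```python
def energy_kwh(cpu_watts_per_core, cores, hours,
               memory_watts=0.0, gpu_watts=0.0, gpus=0):
    """Energy in kWh: (Pc * cores + Pr + Pg * GPUs) / 1000, times run time in hours."""
    total_watts = cpu_watts_per_core * cores + memory_watts + gpu_watts * gpus
    return total_watts / 1000 * hours


# Hypothetical 2-core app server drawing 10 W per core over a 24 hour run.
cpu_only = energy_kwh(10.0, cores=2, hours=24)

# Same server with the 4 GB memory figure (1.45 W) quoted in the thread.
with_memory = energy_kwh(10.0, cores=2, hours=24, memory_watts=1.45)

memory_share = (with_memory - cpu_only) / with_memory
print(f"CPU only: {cpu_only:.4f} kWh, with memory: {with_memory:.4f} kWh")
print(f"Memory share of total: {memory_share:.1%}")
```

Whether the memory term is negligible depends on the per-core power actually measured, so the share printed here only illustrates the comparison.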
On the above @srini1978 , isn't there a potential that the way the infrastructure is setup on the web server side, it will have impacts on the kind and size of network traffic that is emitted and hence the network traffic could potentially be included as well? Or conversely, the kind of network traffic being emitted from the database will have processing implications on the side of the server? |
There are a couple of ways the infrastructure can be set up on the web server side. The database server can be in the same data center, in a different data center, or in another cloud region. In all cases, if we assume that the traffic goes through the internet backbone (the worst case scenario) rather than the cloud provider's own backbone, studies suggest that the energy cost of network traffic is approx 0.023 kWh/GB. Reference - https://medium.com/teads-engineering/evaluating-the-carbon-footprint-of-a-software-platform-hosted-in-the-cloud-e716e14e060c#3bf5. We can calculate the data in and data out for both the web servers and the database server and use the above approximation to come up with the energy cost of network traffic. Based on preliminary calculations I did, these numbers are not negligible: approx 0.092 kWh for a 24 hour period, compared to a CPU energy cost of 1.8014 kWh for the same period. However, the above calculations hinge on the reference value of 0.023 kWh/GB, and hence we need to ratify this as part of the SCI calculations. @atg-abhishek |
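The network estimate above is a single multiplication, which the following Python sketch makes explicit. It uses only the three figures quoted in the comment (0.023 kWh/GB, 0.092 kWh and 1.8014 kWh per 24 hours); the function and variable names are hypothetical, and the implied data volume is back-calculated here rather than stated in the thread.

```python
NETWORK_KWH_PER_GB = 0.023  # reference value from the thread, still to be ratified


def network_energy_kwh(gb_transferred, kwh_per_gb=NETWORK_KWH_PER_GB):
    """Attribute energy to network traffic with a flat per-GB factor."""
    return gb_transferred * kwh_per_gb


network_kwh = 0.092  # preliminary 24 hour network figure from the thread
cpu_kwh = 1.8014     # 24 hour CPU figure from the thread

# Data volume implied by the network figure and the per-GB factor.
implied_gb = network_kwh / NETWORK_KWH_PER_GB

# Network energy as a fraction of CPU plus network energy.
network_share = network_kwh / (network_kwh + cpu_kwh)

print(f"Implied transfer: {implied_gb:.2f} GB per 24 hours")
print(f"Network share of CPU + network energy: {network_share:.1%}")
```

Because the result scales linearly with the per-GB factor, any change to the 0.023 kWh/GB reference changes the network share by the same proportion.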
Note that figure of |
Hey folks. I've been watching this issue with interest, and the last few comments bring up a good point. I know it's excluded from the initial model, but since it came up, it raises a thorny issue: how to represent the emissions from transfer on a marginal basis for the SCI. We might pay for bandwidth on a per gigabyte basis, but that's because of the business model behind cloud. It doesn't necessarily mean the energy usage and emissions scale the same way. This peer reviewed paper, The real climate and transformative impact of ICT: A critique of estimates, trends, and regulations summarises it pretty succinctly:
This is a fairly accessible dive into how selling on a volumetric basis differs from the actual capacity being reserved - it's referring to the economics here, rather than the energy usage, but I think it gets across the idea that how you charge for transfer can map very differently to the underlying resource usage.
Source: AWS’s Egregious Egress by @cloudflare I've also asked this on linkedin below, to get some numbers back. https://www.linkedin.com/feed/update/urn:li:activity:6893929195408625664/ Given that network transfer is the thing continually coming up in the media, as well as coming up when looking at network traffic between servers as well, I wanted to ask what other valid approaches you've seen that would work for the SCI. Pretty much every approach I've seen uses what I would associate as an attributional approach, of dividing total energy, by total transfer for a given system, usually on an annual basis. Have you folks seen any useful guidance for using a marginal approach in this case for network transfer? |
WG comments:
Do different network types have different emissions? Cell vs. wifi network, and bias action to one.
Chris: It is possible to reduce the energy of networking equipment. Is this a SCI data question? Can the WG provide feedback?
Gadhu: NTT research project is to reduce energy consumption of networking equipment.
Daniel Schien - emissions data from streaming, might have feedback?
Gadhu: Is this number important? Is the energy use/emissions of networking significant compared to CPUs? Is it like RAM and inconsequential?
Navveen: https://www.cloudcarbonfootprint.org/docs/methodology - This takes into account networking and memory usage, might be a useful resource. |
We could work with the CWG on how we applied the SCI to this case study. cc @Henry-WattTime |
@seanmcilroy29 to reach out to Chris about SCI Open Data to calculate network emissions |
We can also coordinate with Chris on the SCI Reporting once that spec is more well written out. |
Excellent work @srini1978! So as I read this, the only component in the software boundary in this case is the "application server" is that correct? Which is fine, the whole point of documenting the software boundary is to make these distinctions clear.
EI =3185.9451 gms of Co2 eq
This can be the start of a great case study, it's very well structured and written up in a format which others should follow I also think a case study using a very well known example app is an excellent idea. If we can expand it out to broaden the software boundary it would be far more useful. Some thoughts on data in the SCI Data project that would be useful to expand this: Networking:
Utilisation to Energy Consumption Curve:
Front End
|
@mrchrisadams my 2c is that overall, reducing traffic, regardless of how that traffic is paid for and block booked, is the only lever that makes sense to pull. Data centers are the same in a way: they buy a fixed capacity but charge you a floating rate. Just because the DC is only being 50% utilised doesn't mean 50% of the emissions go away; the other 50% is just sitting there unused. But overall we know that reducing the number of machines is better, since it reduces the number of servers bought over the long term. |
@jawache @srini1978 looking at this exchange here, something comes to mind for me - we have Discourse which I think could become a good avenue for memorializing case studies and bringing in a broader community to work through the various points of a case study? The reason to propose that is 2-fold:
So, what I'm suggesting is that with every case study, we have a "topic" on Discourse which can be used as a venue for dissecting and improving each of them. Thoughts? |
@atg-abhishek @jawache Great idea. I can create a post there. Another thought is to link to these Discourse posts on our GSF LinkedIn so that we can bring in that traffic. We can work with whoever is moderating our GSF LinkedIn channel to do these posts? Will check with Niloooka and/or @SaraEmilyBergman |
Hia, currently people are engaging here via the comments on the doc and now we have this issue thread so it's already a little challenging to keep track of the conversation? What's the benefit of adding Discourse here as a 3rd channel for communication, esp. since with Discourse you need to register again and everyone already has an account with Google Docs and GitHub? |
@jawache Agree to your observation here.
@jawache Great catch again. I believe I assumed a linear relationship, which I now realize is not valid. I remember seeing a better formula to calculate energy consumption for cloud services, but I don't seem to be able to find it. It would be great if you could point me to it so I can rework the calculations.
|
Agreed - I would like to keep all the discussions here so that they can be referenced later on tied to the work and discussions that have already taken place here. @jawache @srini1978 we should also try and figure out where we stand on having some of the ideas floating in the GDoc separate from this thread then? |
Green-Software-Foundation/sci#236 might help to resolve the above as well. As per the SWG call this week, @jawache is working on a case study and we might now have a PR template instead of the issue template as a way to standardize the case studies |
So would the right approach be
|
I would lean towards option 2 since that seems to make more sense but we can discuss more during the WG call. |
Srini - created an updated version of the case study, added the database server as well, pull request is available for review. Can see diff on old request.
How to calculate client side device emissions? Can do different information/slices:
Should be modeled, don't have control over it, should be held constant.
Front end calc, come up with some kind of approach.
Certain things must be modeled, once you encounter it. Explain what baselines you are using |
@Henry-WattTime @jawache @atg-abhishek The client device calculations have been included and the case study updated. https://docs.google.com/document/d/1McS2-WOTtvubzM5aQV5TrNPye_R1E2yd/edit |
I think there are some access issues with the doc, I just requested access, thanks! |
@atg-abhishek Access given |
In the "SCI Client device calculations" doc (https://docs.google.com/document/d/1McS2-WOTtvubzM5aQV5TrNPye_R1E2yd/edit), the calculation of CPU related energy uses the following formula:
E = Server utilization * Number of hours * Number of cores * TDP * TDP co-efficient
I am confused by the use of BOTH the "Server utilization" and the "TDP co-efficient" in the formula when the TDP co-efficient is already a function of utilization. Per the comments in SCI Data Project “[E] Energy Estimation from Utilization Model” model, the TDP co-efficient is a "model to convert utilization to energy consumption". Therefore, if we are using the TDP co-efficient in the formula then it seems like we should not include the "Server utilization" too. Or am I missing something? |
@njwalk Good catch . I will fix the calculations |
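The point raised above, that utilization should enter the formula only through the TDP co-efficient, can be sketched in Python as follows. This is one possible reading of the fix, not the calculation in the linked doc: the function names and the curve points are placeholder values for illustration and should be replaced with whichever utilization-to-energy model is actually used. The formula keeps the thread's "Number of cores * TDP" product.

```python
# Hypothetical (utilization %, TDP co-efficient) points, for illustration only.
CURVE = [(0, 0.12), (10, 0.32), (50, 0.75), (100, 1.02)]


def tdp_coefficient(utilization_pct, curve=CURVE):
    """Linearly interpolate the TDP co-efficient for a CPU utilization in percent."""
    if not 0 <= utilization_pct <= 100:
        raise ValueError("utilization must be between 0 and 100")
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        if x0 <= utilization_pct <= x1:
            return y0 + (y1 - y0) * (utilization_pct - x0) / (x1 - x0)
    raise ValueError("curve does not cover the requested utilization")


def cpu_energy_kwh(utilization_pct, hours, cores, tdp_watts):
    """E = hours * cores * TDP * TDP co-efficient, converted from Wh to kWh.

    Utilization is not multiplied in separately; it only selects the co-efficient.
    """
    return hours * cores * tdp_watts * tdp_coefficient(utilization_pct) / 1000


# Hypothetical run: 2 cores, 10 W TDP term, 24 hours at 30% utilization.
print(f"{cpu_energy_kwh(30, hours=24, cores=2, tdp_watts=10):.4f} kWh")
```

Multiplying by utilization as well would count it twice, which is the double use questioned in the comment.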
Fixes done to the SCI calculation for Green-Software-Foundation#250, https://github.com/Green-Software-Foundation/software_carbon_intensity/issues/227 and added client device calculations
Once PR Green-Software-Foundation/sci#256 is completed, we will close out this issue. |
Discussed in Green-Software-Foundation/sci#225
Originally posted by srini1978 January 14, 2022
As we are starting to identify case studies, one option is to take a popular open source ecommerce application and calculate SCI for it. eShop is a popular one that is ASP.NET Core based. There are multiple variants: using containers, using managed web services.
https://github.com/dotnet-architecture/eShopOnWeb
SCI = (E*I) +M per R
Proposal
@Henry-WattTime @atg-abhishek
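The SCI formula quoted in the original post, SCI = (E*I) + M per R, can be written as a small Python function. This is a minimal sketch with hypothetical names; in the example call only the 1.8014 kWh energy figure and the peak of 45 users appear in this thread, while the carbon intensity and embodied emissions values are made-up placeholders.

```python
def sci(energy_kwh, intensity_g_per_kwh, embodied_g, functional_units):
    """SCI = ((E * I) + M) per R, in gCO2eq per functional unit."""
    if functional_units <= 0:
        raise ValueError("R must be positive")
    operational_g = energy_kwh * intensity_g_per_kwh  # E * I
    return (operational_g + embodied_g) / functional_units


# E and R are taken from the thread; I and M are hypothetical placeholders.
score = sci(energy_kwh=1.8014, intensity_g_per_kwh=700.0,
            embodied_g=100.0, functional_units=45)
print(f"SCI: {score:.2f} gCO2eq per user")
```

Because R is the divisor, the same run reports a higher score when R is taken at a quiet moment than at peak load, which is why the case study calls for stating when R is measured.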