So I'm going to wrap up and talk about participating in open hardware economies. I appreciate you all closing out your OCP regional summit with me. My name is Sam Kocsis. I'm the Director of Standards and Technology at the Amphenol Corporation, where I lead Amphenol's engagement strategy in industry standards across market domains and segments. We are very active participants in the open community and the OCP community, and we love participating in OCP work streams. But at certain times, participating as an interconnect supplier in open hardware economies becomes a little challenging.
What I want to talk about today is the importance of open hardware and open source in OCP projects and contributions, which is core to OCP's mission. I also want to talk about, from an interconnect supplier's perspective, what the concept of open hardware means and how that impacts contributions to projects and work streams.
We'll start with an overview of open hardware. I'm going to talk a little bit about the OCP specification and document types, and what the commitments of an OCP contribution mean for those documents. I'll also outline the landscape of the open hardware ecosystem within OCP and elsewhere, bring it back to sustainability, and then wrap up with a call to action for participating in these open hardware economies, whatever your perspective or domain may be.
My interest in, and my thinking about, open hardware relative to OCP started at the regional summit last year. Like I said, we've been communicating and participating in OCP projects, presenting our solutions to various work streams and documents. But there was a keynote last year that highlighted some of the pitfalls of open source and talked specifically about the differences between open software and open hardware. Generally, open hardware is defined as the design specifications of a physical object: schematics, blueprints, CAD drawings, that sort of thing, made available for anyone to modify or enhance. And I think one of the goals of open hardware is to eliminate the bottlenecks and limitations in the design and manufacturing of those physical products.
But at what level in the project is the hardware open? And what is the definition of a physical good? I'll try to continue with one of the themes of not only this event but 2024 in general, artificial intelligence and high performance computing, and take a look at some of the data center deployments. In the data center, the physical goods might be considered the switches, servers, and storage appliances. To silicon companies, the physical goods might be the CPUs, switch ASICs, retimers, or even chiplets. And to interconnect companies like Amphenol, the physical goods would be the connectors, components, and cabling technology that connect those boxes together. To support AI hardware, rack architectures are evolving, and that architecture evolution is only driving more and more demand for interconnects. So I think it's really important that we think about how we scope and specify these interconnects in these open hardware economies.
With respect to the OCP specification types, we generally have three document types: a base spec, a design spec, and a product spec. These appear across all aspects of projects and work streams within OCP, and various members of each community come together to support them from their own perspectives. I've tried to outline a few examples of each type here within the OCP environment. These OCP specifications lead to contributions from individual companies, whether accepted directly or inspired by open standards, as well as to participation as a solution provider in the Open Compute Project community.
I'll go into one example of each of these types, starting with the OCP contribution of HGX from NVIDIA. HGX platforms have become the de facto global standard for artificial intelligence training platforms, and from an interconnect perspective there are over two dozen documented connector interfaces on the HGX baseboard. The HGX baseboard really defines everything connecting to the server platform, so what's to the left of the image on the screen there. It's a spec contribution specifically from NVIDIA. The design package includes all of the drawings, CAD, and bill of materials that I described in the open hardware definition. But the suppliers for any of the components in that architecture are not listed as contributors, and that's important because some of the IP that goes into those components is not necessarily offered as a contribution to the open community. There is, however, a means to access all of those components if somebody were to try to build and manufacture the HGX platform.
The HGX platform really fast-tracked the development of a larger project within OCP, the OAI subproject. This project was talked about yesterday, and I think there will be follow-ons as the architecture evolution continues to expand to support the bandwidth needs of AI and HPC in the data center. This project has been going on for over five years across a number of different revisions, with over 30 companies participating in the community. The 2.0 project, completed last year, was the summation of five or so focused work streams. And on this particular platform, there were over three times the number of interconnects that were on the HGX specification. But this spec and this work stream were developed within OCP, so it was drafted a little differently: it was meant to be a contribution from a group rather than from an individual company. So how we manage what is and isn't open, where we draw the line for who owns which pieces, and how to access those pieces is really important as we develop not only the documents but also the work streams that produce those specifications.
I'll call attention here to the OCP contributions page, where you can find the CLA and FSA documents that are used to define the legal boundaries and participation requirements for these specifications. Generally, a project starts with a CLA to bring everybody in under an agreement that things shared are part of the project and that everybody will contribute their piece of the puzzle, so to speak. But it's really important that we protect the IP both during the development process and beyond, once the specification is released. Doing so requires some modifications to the CLA and FSA documents. For instance, nobody expects, on the previous slide, that the accelerators or the CPUs would be given up for everybody to make; they're generally proprietary to a particular company. We would expect the same to be true for some of the interconnect interfaces. Things like pinouts, functional requirements, and how to connect between the boxes I showed a few slides ago are going to be very important to document. But leaving out the internal guts of the interconnect and the cable technology, the same way we treat packaging technology or semiconductor process node technology in the chips, is really important for these projects.
And the landscape of the ecosystem is only going to continue to grow. This has been talked about in a number of different sessions this week: the AI, machine learning, and HPC markets are driving new projects within OCP. We're seeing more subprojects in server and storage, as well as some of the composable architectures that are being put together.
I wanted to talk a little bit about one subproject that has done things a little differently than others and offer it as a potential way forward: the PCIe extended connectivity subproject, which works under the OCP server project. They developed what they call a requirements document without any type of CLA, so everything was done on the understanding that everybody contributes in an open forum. They enabled a quick turnaround on these specifications by leveraging existing standards from other organizations, and they tried to maintain flexibility for future work to take advantage of the document they laid out. This document did a nice job of setting goals and requirements for this type of architectural innovation in the data center without specifying a particular design that has to be adhered to. And I think that also enables some of the flexibility that is going to be needed as we try to keep pace with the rapid deployment of these AI systems.
So what do I mean by leveraging industry standards? This is a pyramid I put together to highlight how the PCIe extended connectivity subproject worked with these other groups. On the left side are groups that develop interconnect form factors, mechanically and functionally; there's definitely a strong collection of different connectors there. Most, if not all, of these are multi-source, so they have embedded within them the ability to get components from multiple vendors, and they come with their own robust IP policies. On the right are groups developing physical layer requirements or protocol specifications, here specifically for memory or storage, which is where we would see the biggest use of extending the PCIe physical layer. These come with a large community of embedded users and customers; they also have robust IP and antitrust policies, and they themselves leverage a lot of the SFF and MSA work that goes on.
We can see that as the OCP effort moves forward and the demand for AI/ML applications within the open community continues to pick up, there is going to be an increased need for composable infrastructure, which, as I mentioned a few minutes ago, will inevitably lead to more interconnect. We're also seeing platform life cycles being reduced to as short as two years. Now, if the projects take four to five years to develop, it's really important that those specifications and projects are flexible enough to adapt to new platform architectures and new interconnects that may come out, and are forward-looking. As best we can, we try to keep the interconnect looking forward multiple generations. But it's also important that the projects themselves draw the line, so that there is a boundary between what has to be included in the specification to be compliant and where we can adapt to future forward-looking technologies. I think identifying some of those details and putting them into the license agreement documents is only going to encourage the folks developing some of these solutions, short-reach optics, advanced copper technologies, to be more eager to participate and contribute in these OCP work streams.
In general, I think top-down design driven by the user community helps speed things up, rather than having one area of the ecosystem, such as interconnect suppliers, drive the specification for the market. We really want this driven by the end customers: they're the ones who will be using these platforms, and they're the ones most likely to determine a platform's life cycle. As much as we can, we should leverage the publicly available contributions from other standards, as I mentioned a few of those earlier, and draft the CLA and FSA requirements to be flexible for future technologies. It is a pain to go back, amend the CLA, and get everybody's signatures again, but I think it's important that we do that to encourage participation and to evolve with a project that may take a couple of years to develop; over that time, new solutions may emerge that it's really important we be able to take advantage of. Specifically for interconnect, I think treating those contributions the same way we treat packaging solutions and silicon solutions is going to be really important to continue promoting participation from interconnect companies, and really limiting the exposure of those design specifications. A requirements document like the one we saw from the PCIe extended connectivity work stream sets the groundwork for a solid base specification that doesn't have to get too specific with a reference design of exact components, but allows companies interested in participating in the open community to develop products and contribute them as OCP solutions. Overall, if we limit the scope of the interconnect contribution, or define it as the mating interface, or reference an existing standard that's out there, that should provide enough of a bridge to get from a requirements document or base spec to the final product spec.
I'll finish with the call to action. The problem to solve is really the definition of open hardware. I didn't speak too much about how it differs from open software, but it should be fairly clear that this is a problem the industry has been wrestling with for a while; it's just a different animal, and I think it gets confusing when we deal with the complexity and scope of OCP projects and contributions. If we make it a priority to continue to update and amend the CLA, and to promote participation without requiring contribution of connector and interconnect IP while still enabling those folks to promote the solutions that are going to be needed to connect these boxes together, it will really help the OCP community evolve and advance with the needs of the AI market. So with that, thank you. And if you have any questions.