Okay, thank you all for attending. My name is Suresh Subramaniam, and I'm from Apex Semiconductors; we're based in Santa Clara, California, USA. I'm going to talk about chiplets, which is quite different from everything you've heard so far, especially in the context of OAM.
So I'll do a very brief overview of UBB and OAM and then introduce the reconfigurable OAM. Then talk a little bit about how to address memory in this scheme and then summarize what we've talked about.
Within the OAI infrastructure, the Open Accelerator Module (OAM) has a whole ecosystem of companies that have developed solutions around it, so it's a critical part of the overall ecosystem.
If you look at the universal baseboard (UBB), it sits inside one of these boxes and can house up to eight OAMs. Here's how they're numbered, and there are ways to scale out as well as to interconnect them internally: an accelerator fabric connects all of these OAMs.
A little detail on the OAM itself: there's a baseboard, and then whatever proprietary silicon you have sits in this example die area, which is about 80 millimeters by 68 millimeters.
So with that as the overview and context, now I can talk about what this reconfigurable OAM is.
I'll show you examples of the SIP later, but essentially the way I've been thinking about it is that it needs to be a full platform. You have the baseboard, and then there is a system in package (SIP), which can be organic-substrate based, or, if you're doing advanced packaging, you need an interposer. On top of that, there is a set of performance-modeling and front-end design and verification tools that you need for the different chiplet models; we'll talk about that in a second. That applies at the overall SIP level, at the module level, and at the universal baseboard level. And then, as others have talked about, you need a way to present all of these resources to the upper layers, to the application software: that's the unified resource view and management layer. The application software itself will be part of what the OAI, or any third-party system integrator, would build.
This is the most important slide if you want to take away anything today. As I said, in the OAM itself there's an 80 by 68 millimeter area where most people build their secret sauce. This figure is to scale: what you see here is a SIP in which each of the chips on the top and bottom is a domain-specific accelerator (DSA) chiplet. There are also two hub chips, and on either side they provide off-package connectivity, which could be PCIe or 112G, matching whatever is in the UBB spec. In this case the total bandwidth is about 2 terabits per second with x16 links, depending on whether you're on Gen 5 or Gen 6. All of the domain-specific accelerators are connected over UCIe die-to-die links, with anywhere from 2 to 4 terabits per second of die-to-die bandwidth.

So what are the key points? The SIP itself will have a standard footprint with which it attaches to the baseboard, and each of these chiplets will have a standard footprint too: one for the hub and one for the DSA. I mentioned IO; it's possible that we might break this hub into a hub chiplet and an IO chiplet, which is why the IO is shown here. It uses a standard die-to-die interface, and the thermal and power envelopes are already defined in the OAM spec. We will honor those requirements, but that thermal budget will be spread across all of these chiplets, and we will also define a power and area budget set by the size of each chiplet. Test interfaces are being designed for die-to-die solutions, and those will all be incorporated into the solution.
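As a quick back-of-the-envelope check of the quoted link numbers: the per-lane rates below are the standard PCIe Gen 5 and Gen 6 signaling rates, but the layout of two hub chips, each with one x16 off-package link, is my reading of the slide, not something the spec dictates.

```python
# Raw per-direction PCIe link bandwidth: GT/s per lane * lane count.
GEN5_GTS = 32   # PCIe Gen 5: 32 GT/s per lane
GEN6_GTS = 64   # PCIe Gen 6: 64 GT/s per lane

def x16_gbps(gts_per_lane):
    """Raw x16 link bandwidth in Gb/s (ignoring encoding overhead)."""
    return gts_per_lane * 16

# Assumed layout: two hub chips, each driving one x16 off-package link.
total_gen5 = 2 * x16_gbps(GEN5_GTS)   # 1024 Gb/s, about 1 Tb/s
total_gen6 = 2 * x16_gbps(GEN6_GTS)   # 2048 Gb/s, about 2 Tb/s
print(total_gen5, total_gen6)
```

Under these assumptions, the "about 2 terabits" figure lines up with Gen 6 x16 on both hubs, and Gen 5 would give roughly half that.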
So what's the benefit of standardizing the SIP as well as all the different chiplet sizes? If you want to start a greenfield chiplet design today, which typically happens in large vertically integrated companies or in companies doing full-stack solutions, you have to go through a shift-left process where you simultaneously juggle all of these different variables and come up with a solution that meets your requirements. If you're not a vertically integrated company but you have a chiplet solution or secret sauce you want to offer, that's a tall order. With the kind of module I showed you on the previous slide, we can narrow that whole solution space down to something very specific, so that you can design to that spec and bring your solution to market.
To put it in full context: that's the SIP module, which fits into the OAM baseboard; that's the module with the heat sink on, which then goes onto the UBB, which goes into the OAI-specified chassis. Essentially, what we are trying to do here is extend the modularity, composability, and scalability that's part of the OAI infrastructure down into the package and the silicon.
These are just cartoon examples of taking such a module and configuring a UBB. In this case, all the modules use the same type of chiplet, so what you end up with is 64 copies of that chiplet, and you can then distribute your workloads across them, because the accelerator fabric is already defined as part of the OCP specification.
Or you could go to the other extreme and use heterogeneous chiplets. Heterogeneous has many definitions: different technology nodes, or different types of accelerators, which could be inference, video transcoding, CPU, whatever the case may be, and you can mix and match. It all goes on the same OAM because the footprints are predefined. In a sense it's very similar to what we do with memories: there is a predefined footprint, and more than one vendor can design the same function, which gives you the option to multi-source a solution.
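The two configurations above can be sketched as a toy resource model. All the names here (`Chiplet`, `populate_ubb`, the chiplet kinds, eight slots per OAM) are hypothetical illustrations of the idea, not anything from the OAM or UBB specs.

```python
from dataclasses import dataclass
from collections import Counter

@dataclass
class Chiplet:
    slot: int   # standardized footprint slot on the SIP
    kind: str   # e.g. "dsa", "inference", "video", "cpu"

def populate_ubb(kinds_per_oam):
    """Build 8 OAMs, each filled from the same per-OAM slot plan."""
    return [[Chiplet(s, k) for s, k in enumerate(kinds_per_oam)]
            for _ in range(8)]

# Homogeneous: every slot gets the same accelerator -> 64 copies
# visible to the workload scheduler over the accelerator fabric.
homo = populate_ubb(["dsa"] * 8)
print(sum(len(oam) for oam in homo))  # 64

# Heterogeneous: mix accelerator types in the same predefined footprints.
hetero = populate_ubb(["inference"] * 4 + ["video"] * 2 + ["cpu"] * 2)
print(Counter(c.kind for oam in hetero for c in oam))
```

The point of the sketch is that because the footprints are fixed, swapping chiplet kinds changes only the slot plan, not the module, baseboard, or fabric.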
If you look at the OAM itself, it doesn't have a mechanism for including memory. I just threw in this example to show you how things are being done at Meta using the accelerator fabric across the OAM modules.
And they specifically say we will use whatever cache there is on these accelerators but we go out to the host memory for any larger storage.
But there are new and interesting things happening in the world of CXL. There is a JEDEC spec on a CXL memory module. What I'm suggesting here is you could extend that to make an OAM module.
If you really need to bring memory closer to compute: the OAM itself talks PCIe, that's the connectivity, but you can run CXL on top of PCIe, and now you can build OAM modules with a CXL controller fronting DRAMs. So that's one idea. The other thing one can do is think about a new type of memory chiplet, UCIe-based, that could fit into one of these slots. That would have very high bandwidth, maybe 256 or 512 gigabytes per second, but it might be limited in capacity by the actual DRAM: a single DDR5 die is about 8 GB. Those could be two options, but there are many other solutions out there with CXL-based modules.
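To size the second option, here is a minimal sketch of the capacity trade-off. The 8 GB per die figure comes from the talk; the four-die footprint limit is a purely hypothetical assumption to make the arithmetic concrete.

```python
def memory_chiplet_capacity(dies, gb_per_die=8):
    """Capacity of a UCIe memory chiplet, bounded by the DRAM dies
    that fit in the standard slot footprint (gb_per_die from the talk)."""
    return dies * gb_per_die

# Option 1: CXL controller on the OAM fronting commodity DRAM -- capacity
# scales with DIMM count, but traffic rides PCIe/CXL off-package.
# Option 2: UCIe memory chiplet -- high bandwidth (256-512 GB/s quoted),
# but capacity limited by the footprint; e.g. assume 4 dies fit:
print(memory_chiplet_capacity(4))  # 32
```

So under these assumptions the UCIe chiplet trades capacity (tens of GB) for bandwidth, while the CXL module trades bandwidth for capacity.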
What I've tried to demonstrate today is that the ROAM module can extend the composability, modularity, and scalability that's already built into the OCP infrastructure into packaging and die. It enables a plug-and-play chiplet ecosystem, it solves many chiplet design and integration challenges, and, more importantly, it reduces time to market because the SIP is already pre-qualified. There are also aspects I don't have time to talk about: you can reduce design NRE for the chiplets themselves, because we have some ideas on composable silicon where we provide you a shell with all the common elements abstracted out, so you can just close timing on that part and get the chip into production.
As a call to action: this is not a single-company effort. We need to work with an ecosystem of partners and collaborators, especially IP vendors, accelerator chiplet vendors, system integrators, and software developers, so that we can all make this work seamlessly within the OCP infrastructure. If you're interested, please reach out to me at this email. Thank you.