Good morning, everyone. We are providing an overview of the OCP Composable Memory Systems (CMS) subproject.
Today's agenda: we'll go through the CMS journey and the swim lanes we are working on, and then focus on the current and future work of the Composable Memory Systems subgroup under OCP Server.
Let's start with the journey. In 2021 we formulated the idea, did a market survey, and came back with the goals we wanted to achieve as a group. In 2022 we formed a small team and defined multiple use cases: where we want to see composable memory systems, what we want to target, and what potential products already exist in the market. Based on that, we defined the different swim lanes, which we'll go through shortly, covering different pools of memory. The CMS subproject itself was created under OCP Server at the end of 2022. In 2023 we started defining specifications around architecture, form factors, access tracking, and so on, and we began evangelizing with multiple industry partners and standards bodies, which we'll also cover. We also did a product demo, which is in our OCP Experience Center today.
So how did we grow from 2021 to 2023, in the span of just two years? We were a small group formulating the idea in 2021. When the subproject was formed in 2022, we started with about 15 people; by 2023 we had over 250 members, with at least 40 attendees on the weekly call where we discuss all the open ideas. Anyone is free to join, present any relevant topic, and take feedback. We have a diverse representation of hyperscalers, product groups, and various industry partners participating and providing feedback as we go along on this journey.
So what are the swim lanes we want to focus on? There is node memory expansion, pooled memory expansion, switched memory fabrics, memory systems, and a new use case coming into focus for next year: near-memory compute. Let's go through what each swim lane focuses on, one by one.

Node memory expansion mostly focuses on direct-attached memory and direct-attached memory expansion, whether over CXL or any other technology that comes forth in the future.

Pooled memory expansion covers a pool of memory attached to a single node, and how we deal with sharing that pool across multiple hosts: how do multiple hosts access the same memory without conflicting over a common address range? We are working on the specifications we need to define, the use cases that can leverage pooled memory, and how those use cases should be tested and validated against the different products coming up in the industry.

Switched memory fabrics are much more complex, and mostly relevant to large AI clusters: we jump to the cluster level instead of the node level or a single rack. Multiple interconnects come into the picture, copper or fiber, as we move to higher speeds in the Ethernet space or any other interconnect. We also get into the complexity of multiple hosts accessing the same address space, where we have to deal with all the manageability aspects, avoid overwriting data, handle the memory pages, and manage cache coherency across multiple hosts or accelerators.

Memory systems may be a set of memory modules or DIMMs developed into a system; we are defining the form factors and architectures around them, whether a chassis or a sled, with multiple work streams and groups working to bring the whole ecosystem together.

Finally, near-memory compute asks: instead of having to move the data to the compute, can we do in-memory or near-memory computation? How close can it be, and what can we offload to the memory so the computation happens there? We have multiple demonstrations in the Experience Center today; please feel free to swing by the CMS area.
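To make the node-memory-expansion lane concrete, here is a minimal sketch of how software might place a buffer on CXL-attached memory on Linux, where a CXL type-3 device typically appears as a CPU-less NUMA node. This is our illustrative sketch using libnuma, not an official CMS reference; the node number is a placeholder (check `numactl --hardware` on the actual system).

```c
/* Sketch: place a buffer on a CXL-attached NUMA node via libnuma.
 * Assumes the CXL type-3 memory appears as a CPU-less NUMA node;
 * node 2 below is a placeholder for this system's CXL node.
 * Build: gcc cxl_alloc.c -o cxl_alloc -lnuma
 */
#include <numa.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "libnuma: NUMA not available\n");
        return EXIT_FAILURE;
    }
    int cxl_node = 2;                 /* hypothetical CXL memory node */
    size_t len = 64UL << 20;          /* 64 MiB */
    void *buf = numa_alloc_onnode(len, cxl_node);
    if (buf == NULL) {
        fprintf(stderr, "allocation on node %d failed\n", cxl_node);
        return EXIT_FAILURE;
    }
    memset(buf, 0, len);              /* touch pages so they get placed */
    printf("64 MiB resident on node %d\n", cxl_node);
    numa_free(buf, len);
    return EXIT_SUCCESS;
}
```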
And we are not alone in this journey. We are defining the architecture and the specifications around it, while partners in the industry help us move along. The CXL Consortium helps us with the CXL specifications; CXL 3.0 and 3.1 are already released, and the specification keeps moving forward with the higher speeds of PCIe, which CXL relies on today. We may also explore other interconnects over which the CXL protocol can be extended. SNIA is defining the form factors and hardware specifications, and since memory is involved, we definitely need JEDEC as a standards-body partner as well. DMTF is defining the manageability: how do we manage pooled and shared memory, and how do we handle the manageability of all the moving parts once CXL switches and other components come into the picture? And of course we have academia, with different universities working on multiple optimization areas, and other OCP groups supporting us in the forum.
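As one hedged illustration of the DMTF side: Redfish is the manageability interface DMTF defines, and a fabric manager or admin tool could start by walking the service's standard fabrics collection. The BMC address and credentials below are placeholders, and the exact resource tree varies by implementation; this sketches the access pattern, not any vendor's API.

```c
/* Sketch: query a BMC's Redfish service for its fabric inventory,
 * the kind of manageability interface DMTF defines for CXL systems.
 * Host, credentials, and path are placeholders; real resource trees
 * differ per platform. Build: gcc redfish_list.c -o redfish_list -lcurl
 */
#include <curl/curl.h>
#include <stdio.h>

int main(void) {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL *curl = curl_easy_init();
    if (curl == NULL) return 1;

    /* /redfish/v1/Fabrics is the standard Redfish fabrics collection */
    curl_easy_setopt(curl, CURLOPT_URL,
                     "https://bmc.example.com/redfish/v1/Fabrics");
    curl_easy_setopt(curl, CURLOPT_USERPWD, "admin:password"); /* placeholder */
    curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0L);        /* lab use only */

    CURLcode rc = curl_easy_perform(curl); /* JSON body goes to stdout */
    if (rc != CURLE_OK)
        fprintf(stderr, "request failed: %s\n", curl_easy_strerror(rc));

    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return rc == CURLE_OK ? 0 : 1;
}
```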
So what have we achieved so far, and what do we plan to do in H2 of 2023? The CMS architecture document is specified and released; you can go to the OCP CMS wiki page and look for the white papers where we define use cases, workloads, and certain architectures. We have white papers on the different form factors our industry partners are working on. We held a day-long CMS day yesterday where multiple form factors and use cases were discussed, spanning AI workloads as well as database and other workloads. We also now have a GitHub repository where we will be publishing more specifications and documents, with the related standards linked from the repository. We have recommendation documents on the workloads and benchmarks needed so that all industry partners can test and validate, and come up with better, more efficient products while the industry develops new products around CXL. We are also evangelizing, as mentioned before, with SNIA, JEDEC, and other standards bodies, and at technical conferences we feel are relevant. And we have great demonstrations: more than 12 companies are participating in today's Experience Center demo, so please feel free to swing by the CMS demo and clarify all your doubts.
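To give a flavor of what those benchmark recommendations cover, here is a minimal latency-microbenchmark sketch: a dependent pointer chase that defeats hardware prefetching, which could be pinned to local DRAM or to a CXL node (for example with `numactl --membind=<node>`) and compared. The buffer size and iteration count are illustrative; this is our sketch, not a benchmark taken from the white paper.

```c
/* Sketch: dependent pointer-chase latency microbenchmark.
 * Sattolo's algorithm builds a single-cycle permutation so every
 * load depends on the previous one, defeating prefetchers.
 * Run under `numactl --membind=<node>` to compare DRAM vs. CXL memory.
 * Build: gcc -O2 chase.c -o chase
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (1UL << 24)   /* 16M entries x 8 B = 128 MiB, well past the LLC */

int main(void) {
    size_t *chain = malloc(N * sizeof *chain);
    if (chain == NULL) return 1;

    for (size_t i = 0; i < N; i++) chain[i] = i;
    srand(42);                               /* reproducible shuffle */
    for (size_t i = N - 1; i > 0; i--) {     /* Sattolo: one big cycle */
        size_t j = (size_t)rand() % i;
        size_t t = chain[i]; chain[i] = chain[j]; chain[j] = t;
    }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    size_t idx = 0;
    for (size_t i = 0; i < N; i++) idx = chain[idx]; /* dependent loads */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    printf("avg load latency: %.1f ns (sink=%zu)\n", ns / N, idx);
    free(chain);
    return 0;
}
```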
So what will we be focusing on in 2024? We will focus on switches and fabrics, and on new explorations like NVM over CXL devices. There are different CXL architectures to work through, especially with more complex AI clusters coming into the picture, peer-to-peer GPU traffic, and further fabric optimizations, while we figure out the memory-expansion challenges in AI clusters. There are multiple emerging technologies, as discussed before, such as near-memory computation. The interesting question is: we have CXL over PCIe today, but can CXL be extended over alternate transports? Is it even a possibility, and how scalable would it be? There are transports that scale better than PCIe; can we explore those with CXL and enable more capabilities? Those are certain areas we are looking into.
And as mentioned, we have the Experience Center; please feel free to swing by to ask more questions or to provide feedback.
As a call to action: we are focusing on different media and form factors, and we have hardware-software co-design work streams where, together, we can optimize the hardware along with the workloads and benchmarks we keep adding. We already have microbenchmarks and application benchmarks like CacheBench in our white paper, and we'll be adding more workloads, such as AI benchmarks, as we go along. Please feel free to chime in and join the CMS work streams. We need the whole industry's help to contribute and move fast on the development of composable memory systems. Thank you very much.