Hey, good morning, everybody. My name is Todd Rosedahl. I work for Jabil. I'm a system architect, and I'm excited to be here to talk to you about the peripherals here.
Tim Lambert from Dell. I work a lot on systems management and a lot of out-of-band management interfaces.
I'm Mike Witkowski from HPE, a system architect working on some of the DC-MHS and CXL work for HPE.
I'm Kasper Wszolek, platform architect from Intel.
All right, so what we're going to talk about today is some of the problems and challenges we have with the out-of-band interfaces. You saw at the keynote that there's a lot of activity with AI, and obviously a lot more memory bandwidth and memory capacity required. There's also a lot going on in the out-of-band interfaces as well; a lot more telemetry is being collected, among other things. So we're going to be talking about some of the challenges that come with that. We want to be clear that as we solve these problems, even though my background is servers, we don't want to just focus on your standard servers. This applies to everything. I think edge is a really interesting space, along with storage and networking. So when we solve the problems, we want to solve them for all of those use cases. There are things going on in all of these various workgroups, and we're just going to give an overview of some of the problems and challenges we're going to be facing, and then dive a little bit deeper into these problem spaces. And obviously the goal of this is to describe these things so that we encourage collaboration. As everyone's been saying up here, we want everybody to be able to participate and help us solve these problems.
So here is the list of problems, sort of a long, long list. We'll be diving into these in more detail today. But before we talk about the problems, I do want to highlight some of the successes, and really one major success. If you've been in here all day, you heard it mentioned once already. So, we have a Jabil chassis with a Jabil HPM and DC-SCM, and ASPEED and AMI worked together. They said, "Hey, what if I take my ASPEED DC-SCM, plug it in, and see if it boots?" And in a few weeks, it was booting. So, you can just plug that in. It detects the HPM, recodes some things, and boots the system. I think that's just an amazing success story that says we're on the right track here. It isn't full function. We didn't have Type 1 yet, which is the definition that locks down the interface between the DC-SCM and the HPM. That wasn't there. But there was no co-design; we were able to just get that working, and that shows that we're, I think, on the right track. So, in general, I'm not going to talk to each of these problems, but as I said, with AI the world is moving on, and we've had a lot of legacy interfaces that worked great for their day, but more bandwidth is needed. All kinds of telemetry is being collected from all these various peripherals. So, you need bandwidth. You need more fan-out. And the question is, do we change some of these legacy interfaces? That's difficult, obviously, because you've got a lot of people depending on them, but at some point you also have to move forward; you have to change some of the interfaces. And, of course, there's just the basic problem that people implemented things differently, as they were discussing up here in the last session. So, different implementations, and some of these things need to get locked down to have greater interoperability between the DC-SCM and the HPM.
Okay, so, starting with this out-of-band management overview and what kinds of interfaces we have available here, it doesn't get simpler than this. The first ones are the discrete signals. We have had them, like, forever. And as was mentioned earlier, the need to support these signals, or to support the different endpoint devices that we will be plugging into the system, creates some issues and some poor pin utilization. So, what we have been doing on DC-MHS since we started with these workgroups is to introduce these different FlexIO capabilities. Right now we are supporting them on the PIC and power connectors, on the control panel, and on the MXIO. The idea of FlexIO is to take a little bit more advantage of those discrete wires that we're already routing between the different subsystems, in order to improve the density of the connector and take better advantage of what we are currently routing from the board to the end form factor. These discrete signals, like power brake or wake or reset, are really quasi-static signals; less than 1% of the time the server is running are we toggling them. So the idea here is really to minimize them where possible. There are some cases where we might get into latency concerns or other kinds of issues where we need them, but the idea is to minimize. FlexIO is also being supported on the EMC right now, as Eric showed in the previous presentation. Also important, FlexIO is right now part of the PCIe base specification, which is the center of truth for the FlexIO discovery and engagement rules we want to follow, to make sure that we are interfacing compatible devices before enabling those FlexIO signals.
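To make the discovery-and-engagement idea concrete, here is a minimal sketch of the general flow: learn what the attached peripheral supports, intersect that with what the HPM can drive, and only then repurpose the shared pins. All names, structures, and hooks here are invented for illustration; they are not taken from the PCIe base specification or any DC-MHS document.

```c
/* Illustrative sketch only: a hypothetical FlexIO engagement flow.
 * flexio_read_peer_caps() and flexio_mux_enable() are assumed platform
 * hooks, not real APIs. */
#include <stdint.h>
#include <stdbool.h>

typedef uint32_t flexio_caps_t;            /* bitmask: one bit per optional function */
#define FLEXIO_FN_PWRBRK   (1u << 0)       /* example optional functions             */
#define FLEXIO_FN_WAKE     (1u << 1)
#define FLEXIO_FN_USB_MGMT (1u << 2)

/* Hypothetical hooks: read the peer's capability record over the
 * always-available default interface, and switch the pin muxing. */
extern bool flexio_read_peer_caps(int connector, flexio_caps_t *caps);
extern void flexio_mux_enable(int connector, flexio_caps_t fns);

/* Enable only the functions both sides agree on; leave everything else in
 * its safe default role. */
static flexio_caps_t flexio_engage(int connector, flexio_caps_t hpm_caps)
{
    flexio_caps_t peer_caps = 0;

    if (!flexio_read_peer_caps(connector, &peer_caps))
        return 0;                          /* unknown peer: keep the defaults */

    flexio_caps_t agreed = hpm_caps & peer_caps;
    flexio_mux_enable(connector, agreed);
    return agreed;
}
```

The point of the intersection step is exactly the "compatible devices" rule above: a pin only changes personality when both ends have advertised support for the same optional function.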
So, moving up a little bit toward the protocol level, but still real-time and low-level, we have these one-wire tunnel interfaces, which are half-duplex, low-bandwidth interfaces, really well supported on the XIO, PIC, SIF, and CRPS. It is identical to the PCIe PESTI that is also in the Chapter 12 base specification. CRPS does not support exactly PESTI, just a derivative version of it, really similar, called DSSI. If anyone is interested, we have a demo running right now at the Experience Center booth, so you can take a look at that. SGPIO and LTPI interfaces are another way to do serialization of sidebands between the HPM and the different peripherals that we're connecting. We have a somewhat higher pin count, but considerably higher bandwidth. SGPIO is currently supported on the DC-SCI, connecting the CPLD of the DC-SCM to the HPM FPGA, and we are also supporting it on the PDB management controller to do some serialization of the sidebands that we require from the power supply all the way to the host processor module. LTPI is a sped-up version of this serial interface. It can support full-duplex, 200-megabit communication, and we can deliver larger real-time payloads. It's also an interface that we are supporting here on the DC-SCI. Tim, do you want to talk about it?
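As a rough illustration of what "serializing sidebands" means, the sketch below tunnels a set of quasi-static discrete signals as virtual wires over a serial link, only transmitting when something changes. The frame layout and the checksum are invented for this example; they are not the LTPI, SGPIO, or PESTI wire formats.

```c
/* Illustrative sketch only: virtual-wire tunneling in the style of
 * SGPIO/LTPI-class interfaces. The frame format is hypothetical. */
#include <stdint.h>

/* Pack 16 discrete wire states plus a sequence number into a small frame,
 * protected by a simple checksum so the receiver can reject corrupt frames. */
struct vw_frame {
    uint8_t  seq;        /* increments on every transmitted frame          */
    uint16_t wires;      /* bit i = current level of virtual wire i        */
    uint8_t  csum;       /* XOR of the preceding bytes (placeholder check) */
};

static uint8_t vw_csum(const struct vw_frame *f)
{
    return (uint8_t)(f->seq ^ (f->wires & 0xff) ^ (f->wires >> 8));
}

/* Sender side: only emit a frame when a wire actually changed, which is the
 * "quasi-static, toggles less than 1% of the time" property discussed above. */
static int vw_maybe_send(uint16_t current, uint16_t *last, uint8_t *seq,
                         void (*tx)(const struct vw_frame *))
{
    if (current == *last)
        return 0;                       /* nothing changed, stay quiet */

    struct vw_frame f = { .seq = (*seq)++, .wires = current };
    f.csum = vw_csum(&f);
    tx(&f);
    *last = current;
    return 1;
}
```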
Yeah, so moving up to the, oops, give me one second. It's not, yeah, two-wire buses. I2C has been around 40 years. Everyone loves it and hates it, but you know all of its problems and how to deal with them. For backward compatibility, a lot of the interfaces like PCIe CEM, EDSFF, all this sort of thing, the new standards that are adopting things like I3C Basic as an optional progression all still have to start in the old mode, and then there's now a common methodology for switching the voltage domain and the protocol domain into I3C. So I2C is still going to be used, still quite prevalent. You still need it, as in the drawing on the left, as you progress through the different tiers of, say, fan-out on the baseboard, and also, say, before you enter a PCIe CEM slot, things like that. It's not going away, and you can still use it as needed. Now, you know, the model use cases are FRUs, EEPROMs, I/O expanders, things like that, and there are quite a few emerging use cases around that in terms of the peripheral subsystems as well. It's very heavily used when PCIe VDM is unavailable; there's a slide later where we talk about some of the issues around PCIe VDM. And then I3C Basic is coming on strong. It's already in wide use with CPU interfaces; DDR5 was the first thing in the server industry to pick it up. And then I/O, PCIe CEM, some of the NVMe interfaces, things like that, are supporting it as an optional interface. Again, you have to start in I2C mode, 3.3V tolerant, before you negotiate up. In the PCIe base spec, there's this out-of-band management chapter trying to coalesce a framework for all of this type of functionality that, as you know, has been "do whatever you want" for 20, 25 years. We're trying to rein a lot of that stuff in, in terms of having common discovery and configuration methods and getting as soon as you can to sort of an MCTP transport level. From a usage guidance perspective with an MHS CPU, it's now ideally point to point; you get a lot better benefits. So I3C, by the way, people love it for in-band interrupt, some of the dynamic addressing, things like that. Now DDR5, it's available when you have it from the CPU domain, obviously, but from the BMC domain the question is when the killer app arrives. DDR6 and others have some security use cases coming where it will be a whole lot more interesting as time goes on. I/O is where it's been the most controversial, because there's a strong push, but there are some challenges with I3C: the 50-picofarad problem, variable voltages, very long distances that it wasn't really designed for in enterprise systems, and then this need to go through voltage negotiation and protocol negotiation at every level, such as when you leave the MXIO connector, which is also the same as the PCIe CopprLink internal cable spec; the same rules apply to the PCIe external copper spec and then the forthcoming PCIe optical spec. And so, yeah, from an I/O perspective, there are some issues with the long buses and turnaround times and things like that. We think one of the enablers, especially with things like the DC-SCI where you can't give everyone point-to-point or as little fan-out as possible, is that there are tools like the USB-IF's USB to I3C bridge specification, which came out a year and a half ago and is one method that gives you a pay-as-you-go model. Instead of plumbing I3C from your BMC throughout the system, you can say, all right, now I'm using managed USB, so now I do have ideal bandwidth and signal integrity right at the destination that I need to go to, such as on a backplane where I need to fan out to a bunch of NVMe drives, for example. Yeah. That's good. All right.
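The "start in the old mode, then negotiate up" ordering described above can be summarized in a few lines. This is only a sketch with hypothetical HAL calls; the actual discovery and voltage-switch mechanics are defined per interface standard (PCIe CEM, EDSFF, and so on). The only real protocol element referenced is the I3C ENTDAA dynamic address assignment command.

```c
/* Illustrative sketch only: the I2C-to-I3C bring-up ordering.
 * bus_probe_i3c_capable(), bus_switch_voltage(), and i3c_do_entdaa() are
 * hypothetical wrappers, not real driver APIs. */
#include <stdbool.h>

extern bool bus_probe_i3c_capable(int bus);      /* ask the endpoint, while still in I2C mode */
extern bool bus_switch_voltage(int bus, int mv); /* move the shared voltage domain            */
extern bool i3c_do_entdaa(int bus);              /* run I3C dynamic address assignment        */

static bool bring_up_sideband(int bus)
{
    /* 1. Power on in legacy I2C mode, 3.3V tolerant: always works. */
    if (!bus_probe_i3c_capable(bus))
        return true;                    /* stay in I2C, still a fully supported path */

    /* 2. Both ends agree: switch the voltage domain (e.g. down to 1.8V). */
    if (!bus_switch_voltage(bus, 1800))
        return false;

    /* 3. Switch the protocol domain and assign dynamic addresses (ENTDAA),
     *    after which features like in-band interrupts become available. */
    return i3c_do_entdaa(bus);
}
```

The cost the speakers point out is that this negotiation has to be repeated at every boundary (connector, cable, tier of fan-out), which is part of why managed USB to the point of use is attractive.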
And then managed USB is coming on pretty strong. It's now in the FlexIO negotiation, again in that PCIe base spec. It's an ECN that just finished its final review phase, so it will be published for PCI-SIG members soon. With managed USB, you can see these kinds of use cases: it's plumbed in MXIO and DPUs, and there are a lot of mainstream ASICs that are now adopting it. The PCIe CEM 6.0 spec, and we might see that brought back to 5.0, also has the ability to support USB as a FlexIO negotiation on the JTAG pins, which aren't used very often in production. So that's another one of those enablers that helps you do things like shared NIC, higher-bandwidth MCTP to the targets, and telemetry, and root of trust for update, measurement, and recovery is another good use case for it as well.
Okay. People might not think that PCIe and CXL VDM is related to out-of-band management protocols, but it is synergistic, and there are a lot of considerations as we move forward in this space, especially when we move to CXL. If you look at the VDM protocol with the BMCs today, they take advantage of it for things like firmware download, large log transfers, and things like that. Typically, that would be the BMC going through the root ports of the switch to the peripheral devices, local in a server. But as we've seen, and if you've seen some of the CXL discussions here, now we're going to be talking about peripherals that are in external shelves. So you might actually disaggregate some of that, either as dedicated external memory or peripherals, or in a shared or pooled environment. These are going to complicate how VDM is used, and where we're primarily going to run into problems is from a security standpoint. Consider, for example, that I'm talking to a shared resource shelf; there are some questions involved. First, when I get outside the chassis, do I need that security? We're now talking about applying IDE and being able to protect the data going across those external connections that are no longer just within the internal parts of the server or the ecosystem. The second aspect of this, especially again when we think about CXL and shared resources, is the whole notion of who has the keys for the VDM messages, the management messages, whether they're being sourced from the processor, from the BMC, or from some other entity that's talking. You may have several of these trying to communicate with a shared resource. So this is going to lead us to having to work on access control and adding access control capabilities to VDM when we start getting to this shared infrastructure. This is an area that we have to keep a close eye on. And wherever we have the data path for PCI Express or CXL, that's where we need to worry about it.
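For readers less familiar with the management traffic being discussed, the sketch below shows the common MCTP packet header that rides inside a PCIe/CXL VDM (per the DMTF MCTP base spec, DSP0236); the surrounding PCIe VDM TLP framing (DSP0238) is omitted. The access-control check at the end is purely a conceptual placeholder for the "who is allowed to manage a shared resource" question raised above, not an existing mechanism.

```c
/* MCTP packet header fields as defined in DMTF DSP0236; shown here only to
 * make the VDM discussion concrete. */
#include <stdint.h>

struct mctp_hdr {
    uint8_t hdr_version;   /* low nibble: MCTP header version               */
    uint8_t dest_eid;      /* destination endpoint ID                       */
    uint8_t src_eid;       /* source endpoint ID                            */
    uint8_t flags_seq_tag; /* SOM | EOM | pkt seq | tag owner | message tag */
};

/* Hypothetical access-control sketch: a shared/pooled device deciding,
 * per source endpoint ID, whether a management request is allowed at all. */
static int vdm_mgmt_allowed(uint8_t src_eid, const uint8_t *allowed, int n)
{
    for (int i = 0; i < n; i++)
        if (allowed[i] == src_eid)
            return 1;
    return 0;
}
```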
Can I make one comment?
Sure.
One comment. Some companies believe in a 100% isolated control plane. So they say PCIe VDM is turned off; I refuse to use it because I don't want any management traffic going over the same path. It's not bandwidth, it's control-plane isolation. So that's why, when you say there are these beautiful benefits of PCIe VDM and MCTP, it's all a very natural way, NVMe has supported it for quite some time, very fast updates, it's fantastic in a lot of ways. But we understand that that's why there's not one interface to rule them all, and that's why you're seeing some of these alternatives for that type of use case.
Thanks, Tim. Good point.
And then another aspect, and Javier talked about it a little bit, and Tim as well, is that we're now moving into bringing in internal and external cables. With DC-MHS, we're able to disaggregate. We don't have to run all the traces through the motherboard anymore. We're now going cabled with the MXIO cables for internal cabling, and as shown in the drawing here, we're going outside. So you can go to floating risers inside, or you can go to a shared resource shelf outside the chassis, and these are now being considered for cabling options. If you look at the standards today, there are actually two copper cable standards coming out of the PCI-SIG electrical group: the internal spec, which is currently at 0.9, and the external spec for the external copper connections, based off of CDFP today; it's the next edition of CDFP cables. These are both at 0.9 right now, and they're very synergistic in that they provide signal directionality requirements, with the notion of a root-complex side and a non-root-complex side, and they both provide equivalent bifurcation modes: x16, x8, x4. So they're very common in that regard. But the key thing that was mentioned is the FlexIO. We're not only supporting FlexIO on some of the internal interfaces, but also in the cables, both internal and external. And we mentioned there's a Chapter 12 ECN in the PCI-SIG base spec that is designed to enable a standardized way of determining what the function of those FlexIO signals is. So it's going to be really crucial for anybody adopting these kinds of technologies to really understand the mechanisms and implement them, so that we don't have a train wreck with lots of different protocols trying to communicate with each other and using those signals in a non-standard way. And then on top of this, as Tim mentioned, there's work going on in the photonics and optical space, because once you get to a certain distance, we're going to need photonics. Well, photonics doesn't have sideband electrical signals anymore. So there have to be considerations of whether we do sideband signals or sub-frequency sideband signaling on the optics. These are all concepts that are being worked on right now, and there are a lot of options. So we've got to keep an eye on that, because we need to be able to provide that same level of out-of-band management capability when we go to the photonics kind of environment.
All right. So I'll reiterate what everybody's been saying up here: come and join us. It's really a lot of fun. There are a lot of very interesting problems and challenges and designs that we are working on, and you have a chance to be a part of something that really is going to make a difference, and is making a difference, in the industry. So come and join us. Reach out to us via these links. We can help you get hooked up with whatever the right project is, depending on what your interest level is. But thank you all.
Questions? I'm hoping there are some questions, because there are definitely some passionate people about some of these things, and some of these topics can be quite controversial in some people's opinions; they're all in on one interface or another. Yes, sir.
Hi. So I have a question. As we're defining out-of-band methods, are you considering these kinds of failure modes within the sideband lanes? And if yes, then what is the thinking on how you detect a failure and how you manage failures?
How do we detect and manage failures on any of these physical types of interfaces? Yeah, I mean, obviously there are wires that don't have anything. PESTI has CRC. Everything else that you saw on here has, or can do, packet error checking, that sort of thing. We're coming from the hardware path and the platform enablement path first and foremost, and we're not specifying recovery algorithms or retries. That's where MCTP 2.0 and things of that nature are going to come in for reliable delivery of information. So, anyone else want to comment on fault management?
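As one concrete example of the packet error checking mentioned here, below is the classic SMBus-style CRC-8 PEC (polynomial x^8 + x^2 + x + 1) that the I2C/SMBus/I3C world commonly uses. The CRC that PESTI itself specifies is whatever its own spec defines, so treat this purely as an illustration of per-packet error detection, not as the PESTI algorithm.

```c
/* SMBus-style CRC-8 PEC, polynomial 0x07, initial value 0. */
#include <stdint.h>
#include <stddef.h>

static uint8_t smbus_pec(const uint8_t *data, size_t len)
{
    uint8_t crc = 0;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x80) ? (uint8_t)((crc << 1) ^ 0x07)
                               : (uint8_t)(crc << 1);
    }
    return crc;  /* receiver recomputes and compares; on mismatch, drop and rely on a higher-layer retry */
}
```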
Yeah, absolutely. Within the hardware management group, you know, the RAS team and all this, absolutely we want to work with them on anything that is necessary. We're coming from the logical domain in terms of most of this presentation, the electrical hardening. And what's really important to us is: do we have the plumbing in the HPMs so that you can pay as you go, build your own, or use standardized peripheral subsystems, and achieve what your marketing and your outcomes are going to want, whether it be storage, telemetry, whatever your use case is? That's why we're saying I2C is not going to solve everyone's problems, and PCIe VDM on the other extreme is not going to solve everyone's problems. So we're going to have to work with these methods, like the FlexIO methodology and the fact that PESTI and FlexIO have been adopted within PCI-SIG as well, another body that can be relied on and is in active use for a lot of these different interfaces. We're able to avoid the land grab of interfaces, the benign signals that don't toggle, where you're paying for the connector and the cable without really getting payoff. You're able to tunnel virtual wires. And, you know, we're deep in this. We live it and breathe it every day. And we're very interested in people's opinions, especially very strong opinions like "I'm 100% on I3C" or "I'm 0% on PCIe VDM," and let's work together on some of these fan-out control paths and such, because some of them, when you go deep into these specs, have a lot of interesting challenges. Yes, sir.
Hi. Joe Irvin from Oracle. So you mentioned that there was a USB to I3C spec that was completed. Do you know if there are any devices that have been announced?
There is not. We're working with several vendors. I know for a fact of two vendors that have expressed very high interest in it. There is a group at Intel whose job it is to write specs. They recently wrote the dual-ported RTC spec, which is prevalent. They also wrote an I3C hub spec, for exactly that reason of multi-sourcing with a common footprint and register set for those classes of devices. That gentleman's next assignment is exactly this: a USB to I3C bridge using the USB-IF spec, and I've collected, and welcome, other people's feedback on what that ideal device is. Is it a one-to-four fan-out? Does it support the advanced modes, the DDR, quad I/O? How many ports? Individual separate voltages per port? You know, there are some issues in some of the hubs, for example, where you can't have every single port totally independent; they've got to come in pairs. There are some issues there. So we are currently defining that, and the approach that we've seen to be successful so far has been an Intel-defined specification, so it's not open, you have to get it from their portal, but these device vendors are consuming it, so that we as an overall community can say, yes, I want dual sourcing or triple sourcing of I2C and USB to I3C hub devices. We've been working on it for a while and really trying to get device vendors to see exactly this use case, because native I3C fan-out to everything, say a 40-drive NVMe backplane or something, is quite hard from a BMC because of that 50-picofarad problem. Are you really going to get the benefits, besides the clear ones like in-band interrupt, and achieve the 12.5 megahertz or the 25 megahertz or the quad I/O 100 megahertz if you actually had the I/O? So definitely, everyone's feedback is welcome. You can send it to me if you want, or I can get you in contact with the Intel person as well, who is asking exactly that: do I want a USB to 4-port, do I need an 8-port, what is that device hierarchy, and what are the capabilities of that device? We already have a scaffold of what we think those requirements are, but we welcome more feedback.
Thank you.
Thank you. Yeah, we definitely see the USB to I3C bridge as an interesting way to pay as you go, build as you go, and get the benefits of I3C without sacrificing the bandwidth because of this fan-out. You see, something like a DC-SCI only has, I think, six dedicated I3C today, but it has provisions to get you up to 16. But you look at the number of PCIe lanes, the XIO connectors, all the good stuff, and that really creates a very large fan-out problem. So you're really saying, oh man, I did all this work and I'm going to have to have three tiers of hubs before I get to the destination, and the control and the mechanisms of that are pretty challenging. But there are also good things. One other thing I forgot to mention is this thing called an SMBus agent function, which is beautiful because you can throw an MCTP command at it and it will interrupt you when you get the answer, instead of blocking your mux from pointing to all of these others while waiting around for a response. So that kind of idea, applied in new areas, is fantastic, and they're being put into super-cheap, low-end devices.
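The appeal of that agent-style function is the non-blocking request pattern. Here is a minimal conceptual sketch of it, with entirely hypothetical hook names: submit the MCTP request, register a completion handler, and let the device interrupt you when the answer is ready instead of parking a mux on that segment and polling.

```c
/* Conceptual sketch only: fire-and-forget management request with a
 * completion interrupt. agent_submit() and agent_on_irq() are hypothetical. */
#include <stdint.h>
#include <stddef.h>

typedef void (*mctp_done_cb)(int status, const uint8_t *resp, size_t len);

extern int  agent_submit(int dev, const uint8_t *req, size_t len); /* returns a request token    */
extern void agent_on_irq(int token, mctp_done_cb cb);              /* callback on completion IRQ */

static int query_peripheral(int dev, const uint8_t *req, size_t len,
                            mctp_done_cb done)
{
    int token = agent_submit(dev, req, len);   /* non-blocking: the bus segment is free again */
    if (token < 0)
        return token;

    agent_on_irq(token, done);                 /* the device interrupts when the response is ready */
    return 0;                                  /* the BMC can service other segments in the meantime */
}
```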