# Micro LED data interconnect for scale up networks with record energy efficiency - Bardia Pezeshki

https://www.youtube.com/watch?v=hy12htvWFgg

[00:00] and uh welcome to technical paper session D.
[00:05] Now we're going to start the session on scale up interconnect.
[00:07] I am Letizia Giuliano from Alphawave and I'm here to present two cutting-edge approaches to scalable compute.
[00:15] The first presentation is going to be by Bardia from Avicena on microLED interconnect, and the second presentation we're going to have is on package memory with GCE presented by Dr. Dendra.
[00:27] So let me introduce Bardia.
[00:30] Bardia holds a PhD in electrical engineering from Stanford University and has led cutting-edge R&D efforts at IBM and SDL.
[00:38] He is no stranger to building from the ground up, as he scaled two successful companies, Santur Corporation and Kaiam.
[00:47] Now at Avicena, Bardia and his team are tackling one of the most pressing challenges in high performance computing, the interconnect bottleneck, with a microLED solution designed to meet the demand of AI clusters and other data-hungry
[01:00] applications. Welcome, Dr. Bardia.
[01:04] Thank you. Um let me share my slides here.
[01:12] Okay.
[01:24] Um, yeah, thank you for uh listening.
[01:29] I'm going to be talking about microLEDs for scale up.
[01:33] Uh, when we did this six years ago, everyone thought it was crazy.
[01:38] Uh, I think we're getting more and more people interested in the approach now.
[01:42] I think uh there are about a dozen folks that are doing something similar.
[01:47] So I won't do too much of an intro.
[01:50] I'll dive into some of the latest data that we've obtained.
[01:53] Some of that was in the paper that you might have seen.
[01:58] So just to uh give you a quick intro um I think we all know about scale up.
[02:00] Uh currently in the Nvidia
[02:03] racks, the GPUs are connected to the central switch with uh copper cables,
[02:09] like 5,000 copper cables within each rack.
[02:12] At 200 gig, um those cables only go about a meter.
[02:15] So the desire is to have more GPUs.
[02:19] Uh if you try to put more GPUs into the same rack, your power consumption goes crazy.
[02:24] And we've heard, you know, 600,000 watts within one rack.
[02:29] So ideally you want to have uh the scale up interconnect be able to span a few meters so you can distribute that heat load and add more GPUs within a cluster.
[02:42] Um so this area to the scale up is where things are getting very interesting.
[02:48] There's a third area, uh, the GPU to memory interface, where again this microLED technology can be very interesting and I'll allude to that uh briefly in the presentation.
[02:58] So that's the problem we're trying to address.
[03:00] Um
[03:04] the table on the left actually comes from Nvidia and it's their requirement for what they want to see in scale up.
[03:13] They want to see energy efficiency less than five picojoules per bit, a good shoreline density, 10 meter reach.
[03:21] I think the most important factor is really reliability.
[03:22] You cannot have failures of optics uh within each of the clusters and the target cost is something on the order of 10 cents per gigabit per second.
[03:34] Now, if you take that number and multiply it out by the number of GPUs, say 4 million GPUs, 16 terabits per second per GPU, and we meet their target of 10 cents per gigabit per second, you see the scale up is a $6 billion market.
[03:51] So, it's a huge opportunity.
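The market estimate above can be checked with a few lines of arithmetic. This is only a back-of-envelope sketch using the three figures quoted in the talk (4 million GPUs, 16 Tb/s per GPU, 10 cents per Gb/s); the variable names are illustrative.

```python
# Back-of-envelope scale-up market size from the figures quoted in the talk.
gpus = 4_000_000            # "say 4 million GPUs"
tbps_per_gpu = 16           # 16 terabits per second per GPU
dollars_per_gbps = 0.10     # target cost: 10 cents per gigabit per second

gbps_per_gpu = tbps_per_gpu * 1_000             # Tb/s -> Gb/s
cost_per_gpu = gbps_per_gpu * dollars_per_gbps  # optics cost attached to one GPU
market = gpus * cost_per_gpu                    # total across all GPUs

print(f"optics cost per GPU: ${cost_per_gpu:,.0f}")
print(f"total market: ${market / 1e9:.1f} billion")
```

The product comes out slightly above the rounded "$6 billion" quoted in the talk.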
[03:55] And the general impression is that normal DFB-based uh silicon photonics
[04:05] um or VCSELs or other approaches that we've done in the past can't solve the problem very simply and that is why Nvidia is staying with copper in the near future.
[04:16] So we do need a different kind of optics that can meet these kinds of uh specifications.
[04:24] Um, in my previous company we actually looked at co-packaged photonics.
[04:27] In fact maybe we were the first to really talk about CPO.
[04:33] Um my last company was called Kaiam and we did optical transceivers and we actually did a demo of CPO in 2017 at optical fiber conference together with Corning.
[04:45] Um but in doing this demo it became clear that this co-packaged optics is very challenging.
[04:50] You have all the fiber alignment issues.
[04:53] Single mode fibers are quite difficult.
[04:57] Uh lasers are sensitive to feedback.
[05:00] So you got to have isolators.
[05:01] You got to have everything AR coated.
[05:04] Nowadays people are using rings.
[05:07] With rings you got to temperature stabilize them.
[05:11] We saw in the last talk how uh Celestial's electro-absorption modulator is thermally stable.
[05:17] So solves one of these issues.
[05:19] Um and then you have poor laser reliability especially at high temperatures which means you got to use an external laser source.
[05:26] Anyway um seven years ago we saw lots of problems with co-packaged optics and I think many of these problems still remain.
[05:34] Um despite that there is a lot of interest in co-packaged optics moving forward.
[05:39] Uh, Jensen at the GTC conference talked about CPO as we all know, but even he acknowledged that currently it's not good enough for scale up, which is why they're sticking to copper.
[05:53] So, is there another technology that can get you very low cost, super reliable, and very energy efficient?
[06:03] Um, one that ideally doesn't use lasers.
[06:07] Um after my last company I was consulting for a
[06:09] microLED company making displays for glasses and watches and I was just really impressed with the whole display technology.
[06:17] So this is the concept we came up with um five or six years ago.
[06:22] At Avicena: what if you send data basically through a display?
[06:27] Think about a camera looking at a TV.
[06:31] You can turn those pixels on and off in a very simple way and then you can have an imager looking at which pixels are being turned on and off.
[06:37] If you can send data in a parallel format like that, uh you don't need lasers and you can have very high bandwidth.
[06:43] Um the reason fundamentally that people haven't looked at this parallelism of optics is that technology for displays generally has a low frame rate.
[06:56] You're talking about 60 frames, maybe 180 frames a second.
[07:01] That's far short of what's needed for the clock speeds of chips.
[07:04] Well, shortly after starting Avicena, we played around with these gallium nitride microLEDs that
[07:13] Apple was going to put in watches.
[07:15] You're starting to see them in um virtual reality glasses, you start seeing them in car headlights.
[07:23] And we found out that by tweaking the design of this microLED, you can get very high-speed operation.
[07:29] Um, so you can see the eye diagram uh on the top right.
[07:32] That's one microLED modulating at 14 gigabits per second.
[07:39] And the display people know how to put thousands, in fact millions, of microLEDs on silicon for display applications.
[07:48] So, the uh picture here, which hopefully will be a video, if I can get this video to work, um shows one of our chips, and we've got microLEDs put on part of that chip.
[07:59] And those microLEDs basically blink on and off to send data.
[08:05] If you zoom in on the microLED side, you see these little bricks.
[08:07] These bricks are the LEDs.
[08:10] They have N and P contacts on the
[08:13] same side, and they give off light from one side.
[08:18] The typical emitting aperture is somewhere around six microns by six microns.
[08:21] I also have a FIB cross-section of one of these LEDs.
[08:27] And you can see that they are basically soldered onto the silicon.
[08:31] And the technology exists for putting millions of these teeny little LEDs on silicon chips.
[08:38] Devices can get down to about a micron in size.
[08:40] And you can see that at speeds on the order of 10 to 14 gigabits per second,
[08:45] you can get a massive amount of bandwidth coming out of these structures, and they don't have all the problems that I've dealt with in the last 30 years; they don't have all the laser issues.
[08:56] No threshold current to get above.
[08:59] You don't need isolators.
[09:01] There's no polarization issues.
[09:03] There's really no modes to talk about.
[09:06] Um you don't get bit error rate floors.
[09:08] You can just turn them on and off.
[09:11] You don't need to do PAM4 or some sophisticated modulation.
[09:13] The LEDs can
[09:16] work at very high temperature, super reliable and and and low cost.
[09:22] Um, so the way you would use this is, let me get the right slide.
[09:27] The way you would use this is you would make a little chiplet and you can see this in the uh figure here.
[09:34] And the chiplet would have a display section and it would have a camera section.
[09:39] Um maybe you would have a translator chip that would take 200 gig coming in opening up to a wider bus to feed our chip.
[09:47] And these uh transmitters and receivers are connected with bundles of optical fiber.
[09:53] And you can put these modules on your circuit board, sort of OBO, onboard optics.
[09:59] Uh or you can get rid of that translator chip and come in straight with UCIe or HBM or some other wide bus format and go straight with CPO, and you get tremendous power advantages if you do that, if you stay wide and you get rid of the SerDes.
[10:16] Um so that's the basic intro.
[10:16] Um oh on
[10:19] the fibers people always ask about fibers.
[10:21] How do you move hundreds of channels simultaneously or thousands of channels simultaneously from one side to the other side?
[10:31] There are many different solutions.
[10:32] People have played around with imaging fibers, you know, plastic imaging fibers.
[10:36] That's how we started off with.
[10:40] We're actually using bundles of illumination fiber or borosilicate fiber.
[10:43] We buy this usually in bundles of a few thousand and we have a supplier that makes cables for us.
[10:52] So here you can see an MPO-type cable.
[10:56] It's got two uh hexagons.
[10:59] Each of these hexagons has about 330 fibers and the fibers are positioned quite accurately in this array.
[11:11] So once you align this within a few microns and put it on top of your emitter, each LED transmits into one fiber.
[11:16] So each fiber is where it needs to be within a couple of microns.
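The figure of "about 330 fibers" per hexagon is close to what a fully filled hexagonal lattice would give. The sketch below assumes, purely for illustration, that the fibers sit on a triangular lattice with one central fiber and complete rings around it; the talk does not state the actual layout, and `hex_count` is a hypothetical helper name.

```python
def hex_count(rings: int) -> int:
    """Number of sites in a hexagonally packed array with a central site
    plus `rings` complete rings around it (centered hexagonal number)."""
    return 1 + 3 * rings * (rings + 1)

# Ring k holds 6*k sites, so the total grows as 1, 7, 19, 37, ...
for rings in range(8, 12):
    print(f"{rings} rings -> {hex_count(rings)} fibers")
```

Under that assumption, ten complete rings give 331 positions, which is in line with the roughly 330 fibers quoted.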
[11:22] Um so this is a prototype chip that we made um about a year or two ago and it's described in the paper.
[11:29] Uh it has 330 LEDs on this side.
[11:33] 288 of them are data.
[11:37] And there's some clock and spares and other things.
[11:38] And then on this side we have a camera.
[11:41] Uh this is actually a silicon photo detector array that we made ourselves and we slapped it on top of the um 16 nanometer TSMC base die.
[11:54] Note that under each LED there's a driver and then under each um detector there's a TIA.
[11:59] And we basically showed this working.
[12:02] We've shown this at conferences and other places and it transmits about 800 gig to a terabit of data from one side to the other at on the order of a picojoule per bit.
[12:15] Here you can see the fiber bundle coming and sitting on top of the transmitter.
[12:16] So that's our basic technology.
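To put the prototype numbers in perspective, here is a rough sketch relating the quoted aggregate throughput (about 800 Gb/s to 1 Tb/s over 288 data lanes) to a per-lane rate and to link power at roughly 1 pJ/bit. The helper names are illustrative, and the assumption that traffic is spread evenly over all 288 data lanes is mine, not the speaker's.

```python
DATA_LANES = 288  # data LEDs on the prototype chip, per the talk

def implied_lane_rate_gbps(aggregate_gbps: float, lanes: int = DATA_LANES) -> float:
    """Per-lane rate if the aggregate is split evenly across all data lanes."""
    return aggregate_gbps / lanes

def link_power_watts(aggregate_gbps: float, pj_per_bit: float) -> float:
    """Power = bits per second * energy per bit."""
    return aggregate_gbps * 1e9 * pj_per_bit * 1e-12

for aggregate in (800.0, 1000.0):
    lane = implied_lane_rate_gbps(aggregate)
    power = link_power_watts(aggregate, pj_per_bit=1.0)
    print(f"{aggregate:.0f} Gb/s -> {lane:.2f} Gb/s per lane, {power:.2f} W at 1 pJ/bit")
```

At these energy levels a terabit-class link dissipates on the order of a watt, which is the point of staying wide and slow per lane.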
[12:19] Now going into how the performance of the chip has
[12:24] been so far.
[12:26] Uh it's a picture of the TX array that we actually fabricated.
[12:29] You can see the individual LEDs.
[12:32] Um, in the middle I have bit error rate versus the average received photocurrent and you can see all the different channels sort of lying on top of each other.
[12:40] People ask about cross talk.
[12:43] So to measure cross talk we took a victim channel and aggressor channels.
[12:48] The victim channel had one bit error rate uh sorry one PRBS pattern while the aggressors had a different PRBS pattern.
[12:56] And when we turned on all the aggressors, you can see you pay a penalty here in uh the bit error rate.
[13:04] But the cross talk is quite negligible.
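The victim/aggressor measurement described above can be mimicked in a toy simulation. This is a minimal sketch and not the authors' test setup: it assumes PRBS7 on the victim and PRBS9 on a single aggressor (the talk only says the patterns differ), a simple additive crosstalk model, Gaussian noise, and a fixed decision threshold.

```python
import random
from itertools import islice

def prbs(order, tap, seed=1):
    """Yield bits from a Fibonacci LFSR for the polynomial x^order + x^tap + 1."""
    mask = (1 << order) - 1
    state = seed & mask
    while True:
        bit = ((state >> (order - 1)) ^ (state >> (tap - 1))) & 1
        state = ((state << 1) | bit) & mask
        yield bit

def measure_ber(n_bits, crosstalk, noise_sigma, rng):
    """Count victim bit errors when a scaled aggressor signal leaks into it."""
    victim = prbs(7, 6)      # PRBS7: x^7 + x^6 + 1 (assumed pattern)
    aggressor = prbs(9, 5)   # PRBS9: x^9 + x^5 + 1 (assumed pattern)
    errors = 0
    for _ in range(n_bits):
        v, a = next(victim), next(aggressor)
        level = v + crosstalk * a + rng.gauss(0.0, noise_sigma)
        decided = 1 if level > 0.5 else 0   # fixed mid-level threshold
        errors += decided != v
    return errors / n_bits

if __name__ == "__main__":
    rng = random.Random(0)
    for xt in (0.0, 0.1, 0.2):
        print(f"crosstalk={xt:.1f}  BER={measure_ber(100_000, xt, 0.12, rng):.2e}")
```

Using different patterns on victim and aggressor matters because identical patterns would make the leakage look like extra signal rather than interference.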
[13:06] Um the yield on these LED arrays is very high.
[13:10] Uh here we have a video of all 288 channels um all giving out a beautiful eye.
[13:17] Um so it's relatively easy to get a large number of LEDs to uh to work at high yield.
[13:23] And of
[13:27] course remember in these parallel bus structures you can always have spares and you can have some ECC.
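The value of spares in a wide parallel bus can be illustrated with a simple binomial model. This sketch assumes independent LED failures and a hypothetical per-LED yield of 99% (no yield figure is given in the talk), and it ignores the clock lanes by asking only for 288 working lanes out of either 288 or 330 positions.

```python
from math import comb

def prob_at_least(n: int, k: int, p: float) -> float:
    """Probability that at least k of n independent devices work,
    each working with probability p (binomial tail)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

per_led_yield = 0.99  # hypothetical value, for illustration only

no_spares = prob_at_least(288, 288, per_led_yield)    # every single LED must work
with_spares = prob_at_least(330, 288, per_led_yield)  # up to 42 may fail

print(f"array yield without spares: {no_spares:.3f}")
print(f"array yield with spares:    {with_spares:.6f}")
```

Even with a modest per-device yield, a few dozen spare positions push the array-level yield very close to one under this model.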
[13:35] Um to get to the latest data we can take just a single one of these links.
[13:39] Now this is with an external driver and an external TIA and we can drive them at higher speed.
[13:45] So you can see an SEM of one of our high-speed uh LEDs and there's a ground-signal-ground structure coming in.
[13:54] So we can drive it with a high-speed source.
[13:59] Um and in this case if we drive it at a relatively high current, 3 milliamps, uh you can get a bit error rate of 4 × 10^-3 at 11 gig.
[14:09] Um here it's 1.5 milliamps at 1e-9.
[14:13] This is what the waterfall charts look like.
[14:15] There's a little bit of a curvature here because this LED wasn't actually designed to run at these high currents.
[14:21] So, you've got some issues kind of overdriving the LED, but a little bit of a hero experiment for this particular
[14:29] link. Um, question comes up, how fast are these LEDs ultimately?
[14:34] So, I want to show you some of the latest data.
[14:36] This is the S21 of one of these LEDs.
[14:40] And you can see the 3dB bandwidth is on the order of 5 gigahertz right here.
[14:46] And what's really interesting about it is that the rolloff is very slow.
[14:51] As a comparison, I put a single pole response as the dashed line.
[14:54] So this is a dashed line, single pole response.
[14:56] That's 6dB per octave at 2 gig.
[14:59] This is the same thing at 3 gig.
[15:02] But you can see that our rolloff is slower than a single pole response.
[15:07] And that allows you to equalize and make it go even faster.
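For reference, the single-pole curves used as the comparison can be computed directly. This sketch evaluates the magnitude of a first-order low-pass response, |H(f)|² = 1 / (1 + (f/f3dB)²), for the two corner frequencies mentioned (2 GHz and 3 GHz); it only reproduces the dashed reference curves, not the measured LED data, which is not available here.

```python
import math

def single_pole_db(f_ghz: float, f3db_ghz: float) -> float:
    """Magnitude in dB of a first-order (single-pole) low-pass response."""
    return -10.0 * math.log10(1.0 + (f_ghz / f3db_ghz) ** 2)

for f3db in (2.0, 3.0):
    print(f"single pole, f3dB = {f3db} GHz")
    for f in (1.0, 2.0, 4.0, 8.0, 16.0):
        print(f"  {f:5.1f} GHz: {single_pole_db(f, f3db):7.2f} dB")

# Well above the corner, each doubling of frequency costs about 6 dB.
slope = single_pole_db(128.0, 2.0) - single_pole_db(64.0, 2.0)
print(f"asymptotic slope: {slope:.2f} dB per octave")
```

A response that falls more slowly than this leaves more signal at high frequency, so an equalizer needs less boost (and amplifies less noise) to extend the usable data rate.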
[15:08] Um, here we have the latest device.
[15:11] This is about a week old or so.
[15:14] A little bit of ripples.
[15:16] I've got reflections in the signals, but you can see that at very high currents, even at six gig, we're only about one dB down.
[15:22] So, we can get tremendously high speed out of these LEDs.
[15:25] Uh, I think 25 gig per lane is
[15:31] quite possible with these LEDs in the future.
[15:35] Um, by the way, we only have three minutes left.
[15:37] Just a heads up.
[15:38] Sure.
[15:38] Um, I showed how in the previous work we used our own homemade detector.
[15:45] uh to do the receivers.
[15:47] Recently, you might have seen the press release.
[15:49] We've been working with TSMC and they have modified their camera process to uh be compatible with our blue microLEDs.
[15:54] So, these are results just from a single TSMC uh detector on one of our links and you can see error rate going down to 10^-3 at 4 Gbits per second.
[16:05] And here we're doing this with about 2.3 picojoules per bit.
[16:13] This is really very very impressive.
[16:15] I think these are record numbers for the lowest picojoules per bit.
[16:19] This is just for the TX.
[16:19] Um and then people always ask about reliability.
[16:22] You know LEDs for lighting are not driven at such high current density.
[16:27] So how can we do that?
[16:29] Um well
[16:31] We've optimized the processing so we can drive these things at high current.
[16:35] And here you have some life test results.
[16:37] These are a bunch of devices that have been on life test for about a year, running them at 8 kiloamps per centimeter squared.
[16:43] So this is about four times the regular uh current density we would run them at.
[16:48] And you can see there's a little bit of a change at the beginning.
[16:50] Actually power goes up and then starts to go down but it stabilizes fairly quickly.
[16:53] So if you extrapolate these to mean lifetimes, it shows that these LEDs should be very reliable for our applications.
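The life-test current density can be translated into a drive current per device. This is a rough sketch that assumes the current flows through an area equal to the 6 µm by 6 µm emitting aperture mentioned earlier; the actual junction area is not stated in the talk, so the resulting currents are indicative only.

```python
def current_ma(j_ka_per_cm2: float, side_um: float) -> float:
    """Drive current in mA for a square device of side `side_um` (microns)
    at current density `j_ka_per_cm2` (kA/cm^2)."""
    area_cm2 = (side_um * 1e-4) ** 2           # 1 micron = 1e-4 cm
    amps = j_ka_per_cm2 * 1e3 * area_cm2       # kA/cm^2 -> A/cm^2, times area
    return amps * 1e3                          # A -> mA

stress_density = 8.0                   # kA/cm^2, life-test condition from the talk
normal_density = stress_density / 4.0  # "about four times the regular" level

print(f"life test: {current_ma(stress_density, 6.0):.2f} mA per LED")
print(f"normal:    {current_ma(normal_density, 6.0):.2f} mA per LED")
```

Under this assumption the stress condition corresponds to a few milliamps per LED, the same order as the 3 mA "relatively high current" quoted for the high-speed single-link experiment.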
[17:03] Um, so just to summarize, large arrays of microLEDs can provide very high bandwidth.
[17:09] We've done a demo at 4 Gbits per second, but much higher per lane speeds are possible, perhaps even going to 25 gig per lane.
[17:19] We can achieve very low picojoules per bit.
[17:19] And this has lots of applications.
[17:21] We talked about scale up, but also interesting for um HBM and memory.
[17:26] Here you have a case where we have a little optical chiplet
[17:33] that allows you to move HBM further away from your interposer or alternatively, as Celestial AI just showed, you can put that functionality in the interposer itself and be able to access memory further away.
[17:47] So we think this is a this is a very exciting approach for short distance high bandwidth communication.
[17:55] So thank you so much.
[17:58] Thank you.
[18:00] And we have some questions on Slack.
[18:00] So Dr. Bardia will join Slack after the talk and answer those offline.
