# Variational Quantum Algorithms | Dr. Bharadwaj Mummameni

https://www.youtube.com/watch?v=iTNTapAqhKQ

[00:12] Okay. Hi. Hello. Uh I am Bharadwaj.
[00:16] Okay. Hi. Hello. Uh I am Bharadwaj.
[00:17] Today I will be discussing about variational quantum algorithms from quantum basics to hybrid optimization.
[00:23] quantum basics to hybrid optimization.
[00:26] We'll go deep in detail regarding everything about the variational quantum algorithms.
[00:30] Okay. Before that, I would like to introduce myself.
[00:33] I did my PhD in quantum chemistry and condensed matter physics 4 years ago.
[00:38] Since '22, I've been working as a quantum algorithm developer here at Fraunhofer in uh Stuttgart, Germany.
[00:43] And without probably much further ado, I think we'll dive deep into the presentation.
[00:48] So, the road map for today's lecture is going to be that first, I'll be talking about the building blocks and the answers, which which would be a quick recap for you guys, something like qubit, superposition, entanglement, quantum
[01:04] superposition, entanglement, quantum gates, and stuff, right?
[01:06] And from there on, we'll start about gates to circuits,
[01:09] which is the answers, and then we'll also try talk about why do you need to choose a specific type of answers, and what are the different types of answers, and when to choose what.
[01:20] And in lecture two, we'll see about the variational principle, where uh we ask about what is a Hamiltonian, what are the expectation values, what are cost functions, the variational principle in itself, and the total hybrid quantum classical loop that the big structure that's holding this entire algorithm together.
[01:41] And also, the one of the most important part of the variational quantum algorithms will be the optimizers, and we'll talk about what are the different types of optimizers, gradient free and gradient based.
[01:52] And if any of these terminology already spooked you, you would you would see that in no time that it's very simple and quite uh intuitive.
[02:00] And along with it, as any good scientific discussion has to go, we would also discuss about challenges and the big
[02:06] discuss about challenges and the big picture of uh.
[02:07] picture of uh where uh variational quantum algorithms would be useful and what are its roadblocks right now.
[02:09] where uh variational quantum algorithms would be useful and what are its.
[02:11] would be useful and what are its roadblocks right now.
[02:13] roadblocks right now. So, let's do a quick quantum computing recap.
[02:16] quick quantum computing recap. I would take probably 5 to 10 minutes for this now, and uh we'll refresh your fundamentals regarding this, right?
[02:17] take probably 5 to 10 minutes for this now, and uh we'll refresh your.
[02:19] now, and uh we'll refresh your fundamentals regarding this, right?
[02:21] fundamentals regarding this, right? We'll visit qubit, quantum gate, superposition, entanglement, and also a Bloch sphere, which is a very important tool for visualizing the quantum operations on the qubit and also quantum gates.
[02:23] We'll visit qubit, quantum gate, superposition, entanglement, and also a.
[02:25] superposition, entanglement, and also a Bloch sphere, which is a very important.
[02:28] Bloch sphere, which is a very important tool for visualizing the quantum.
[02:31] tool for visualizing the quantum operations on the qubit and also quantum.
[02:34] operations on the qubit and also quantum gates.
[02:37] gates. The qubit, the classical bit is either zero or one, which we all know, and a qubit can be both at once.
[02:39] either zero or one, which we all know, and a qubit can be both at once.
[02:42] and a qubit can be both at once. Mathematically, the qubit state is psi, which is defined in a way that where the states state psi is basically a superposition of zero state and the one state, right?
[02:44] Mathematically, the qubit state is psi, which is defined in a way that where the.
[02:47] which is defined in a way that where the states state psi is basically a.
[02:50] states state psi is basically a superposition of zero state and the one.
[02:52] superposition of zero state and the one state, right? And what what type of superposition? And these zero and one has certain coefficients called alpha and beta, and where uh this alpha and beta actually decides that how much of this state has more of zero.
[02:54] state, right? And what what type of superposition? And these zero and one.
[02:56] superposition? And these zero and one has certain coefficients called alpha.
[02:59] has certain coefficients called alpha and beta, and where uh this alpha and.
[03:01] and beta, and where uh this alpha and beta actually decides that how much of.
[03:04] beta actually decides that how much of this state has more of zero.
[03:07] this state has more of zero and less of one and more of one and less
[03:09] and less of one and more of one and less of zero.
[03:12] So, in the orthogonality principle of the quantum state usually
[03:15] principle of the quantum state usually dictates us to have
[03:17] dictates us to have that the sums of the squares of these coefficients has to be equals to one,
[03:22] coefficients has to be equals to one, right?
[03:24] Which means the probability has to be equals to one.
[03:26] And alpha squared is basically the probability of
[03:28] is basically the probability of measuring the state psi in the zero state,
[03:30] and beta squared is measuring the probability of state psi in the one state.
[03:33] and beta squared is measuring the probability of state psi in the one state.
[03:36] Unlike the classical bit, right?
[03:38] The qubit genuinely exists in both states simultaneously and until and
[03:40] states simultaneously and until and unless it's measured.
[03:42] And measurement obviously forces a qubit to either zero or one by destroying this superposition
[03:44] obviously forces a qubit to either zero or one by destroying this superposition type of behavior that we have right now
[03:47] type of behavior that we have right now here.
[03:50] The key idea is always that the qubit carries an exponential more
[03:52] here. The key idea is always that the qubit carries an exponential more information than a classical bit, where
[03:54] information than a classical bit, where usually the n qubits would carry a two to the n number of amplitudes at
[03:56] usually the n qubits would carry a two to the n number of amplitudes at the same time.
[03:58] And when it comes to Bloch sphere, which is a very important
[04:00] to the n number of amplitudes at the same time. And when it comes to Bloch sphere, which is a very important
[04:03] same time. And when it comes to Bloch sphere, which is a very important
[04:06] And when it comes to Bloch sphere, which is a very important
[04:09] sphere, which is a very important intuitive visualization tool to visualize the qubit, let's just say if this is our uh qubit state, right?
[04:17] In this qubit state, uh we could say that the north of this uh sphere usually points to zero and the south points to one.
[04:26] A single qubit state can be visualized as any point on this particular sphere.
[04:33] So, in this in this spherical coordinates, there are also theta and phi to this uh si- to the qubit degree of freedom, which comes into directly defining the state psi as state psi basically cos theta over two times zero plus where e to the i phi times sin theta over two is the beta.
[04:54] Okay? So, the the parameters of this theta and phi usually defines where my state psi is.
[05:03] And the equator we have here usually represents here, it's almost at the halfway, so it could have it would have like exactly 50% of zero
[05:10] it would have like exactly 50% of zero and exactly 50% of the one.
[05:13] And the key idea is that the quantum gates are basically the rotation uh operators onto the state which is there on this Bloch sphere,
[05:25] where it just tries to rotate it across the sphere.
[05:29] Quantum gates are basically some sort of a rotation matrices that takes a vector in this Bloch sphere and rotate it around along the along the Bloch sphere, right?
[05:40] We have seen classical uh gates, right?
[05:43] Which is usually the X gate, NAND gate, and stuff like that.
[05:46] Much like that, we also have quantum gates, but the biggest difference between the quantum gates and the classical gates are is that the quantum gates are reversible operations.
[05:56] So, you could always go from zero to one and one to zero, which is what you can also do from the on the classical computers, but not every but the other operations are totally different.
[06:05] Let's say Hadamard gate.
[06:06] You apply a Hadamard gate on state zero, you would get a
[06:10] gate on state zero, you would get a equal superpositions of state zero and
[06:13] equal superpositions of state zero and one, which is also called as a plus
[06:16] one, which is also called as a plus state,
[06:17] state, right?
[06:19] Here, we could also see some sort of an animation on what happens when a
[06:21] of an animation on what happens when a quantum gate applies on on a state.
[06:24] quantum gate applies on on a state.
[06:26] Here, the gates we would see are X, and the Z gates.
[06:28] the Z gates.
[06:30] So, by simply following this uh animation, we would see that let's take
[06:33] an initial state zero. Applying an X
[06:36] gate on the zero would lead to state one.
[06:37] would lead to state one.
[06:40] While
[06:42] applying
[06:43] a Hadamard gate on state zero would lead
[06:47] us to an equatorial plane right here, the state zero plus one over root two,
[06:49] the state zero plus one over root two, which means it's an equal superpositions
[06:51] of zero and one. Where state X and H
[06:53] does not flip the sign of these uh
[06:56] coefficients states, while Z gate
[06:58] actually flips the sign, right?
[07:01] It's it introduces some sort of a phase, where
[07:03] uh Z gate usually does nothing to the
[07:06] zero state, but it changes the sign of
[07:11] zero state, but it changes the sign of the state one to minus one, right?
[07:13] the state one to minus one, right?
[07:15] So, much like that, when a Z gate is applied on plus state would result in a would result in minus state, right?
[07:20] Which is zero plus one to zero minus one.
[07:26] Along with these uh unique rigid gates, which are X and H, where the how much you have to rotate your uh qubit state is already predefined, there are also other gates called rotation gates, where there is no predefined angle by which you need to rotate your state.
[07:40] And here, we could see RX, RY, and RZ.
[07:44] What do they mean is rotation rotation by a certain angle theta about an axis X is called RX, the same for RY, and same for RZ.
[07:52] Rotation about their respective axis with a certain angle theta, they are called rotation gates.
[07:58] Along with And these are all the single qubit gates, right?
[08:01] So, we also have a two qubit gate, which is also very fundamentally important for us to reach any entangled state, which is the ultimate state that
[08:11] state, which is the ultimate state that we would want to reach while using the quantum mechanical principles of quantum superpositions and entanglement.
[08:19] So, using a CNOT gate, which is a two qubit gate, one can achieve that, right?
[08:23] And how does a CNOT gate work?
[08:26] This CNOT gate basically has a control line and a target line.
[08:29] The control qubit is the one that dictates the state of the target qubit, and the target qubit state only gets flipped when the control qubit is one.
[08:40] Right?
[08:43] So, in this case here, the CNOT gate along with the Hadamard gate would actually lead us to the Bell state, as we would see in a couple of slides.
[08:50] And rotation gates take rotation gates take continuous parameters, right?
[08:54] Which is a theta.
[08:57] They are tunable for us, like tunable knobs on the radio that we have.
[09:00] RX, RY, and RZ type of flavors that we have in terms of rotation gates.
[09:05] Unlike the fixed gates, uh they would let us continuously tune, that's the whole point of it, right?
[09:09] And these param-
[09:12] point of it, right?
[09:15] And these param- these thetas are usually the trainable parameters that we would talk about in the variational quantum algorithms as as they are as they form a variational circuit.
[09:25] circuit. Right?
[09:27] One more example animation of how you would see a rotation gate uh of RX, RY, and RZ, right?
[09:33] When theta is pi by four, but theta is pi by two, and theta is pi.
[09:38] Here, you see that rotation gates would lead to a different different types of states.
[09:42] So, it's up to us based on what type of theta we would like to have, right?
[09:46] And as you see here in this animation, the rotation is about that particular axis.
[09:49] If it is a RZ, the rotation is about the Z axis.
[09:54] And we have seen single qubit gates and two qubit gates.
[09:57] Now, what we should do, right?
[09:59] The two most fundamental important principles of quantum computing is quantum superposition and quantum entanglement.
[10:05] So, how do we create a superposition?
[10:07] A superposition state is basically where alpha and beta coefficients of zero and one are not zero, right?
[10:12] So, if there are there's a
[10:14] zero, right?
[10:14] So, if there are there's a certain type of value to them.
[10:16] And if certain type of value to them.
[10:16] And if the superposition could be these alphas and betas could be anything.
[10:18] It doesn't need to be 1 over root two.
[10:21] It doesn't need to be 1 over root two.
[10:22] It doesn't need to be uh uh this.
[10:24] Like it the alpha squared and the beta squared basically should be equals to one.
[10:25] That's the only condition they need to satisfy.
[10:30] This could be 90% of zero and 10% of one and 50 50 20 20 80 60 40 and stuff like that.
[10:31] And the qubit is in the both states simultaneously when it is in a superposition state.
[10:34] It is just that we don't know in which state, but only after measurement the qubit state falls to either alpha either zero state or on the one state, right?
[10:36] The measurement destroys the superposition.
[10:40] So, the important thing we need to keep in mind is that the entire quantum algorithms that we would be using or we would have to develop is that they are designed to extract as much as information in the superposition state before the final measurement because that's where the
[11:15] measurement because that's where the quantum mechanics is, right?
[11:17] Once you measure it it becomes classical.
[11:19] And that needs to be kept in mind as one one develops a quantum algorithm.
[11:22] And entanglement is the next step above the superposition where two qubits are entangled basically means that you know something about one qubit the other qubit state is already known.
[11:34] So, in this particular state which is a maximally entangled Bell state one of the famous two qubit entangled states, if we measure the first qubit and this is the first qubit, right?
[11:46] If we measure the first qubit and we get zero, we are guaranteed the other state is going to be in zero.
[11:50] If we get one, we are guaranteed the other state is going to be one, which means that entanglement is already giving information about the other half of the system without us actively searching for the other half information, right?
[12:04] The correlation usually holds no matter far how far apart they are.
[12:09] One needs to make sure that they are still entangled and this particular property still holds good.
[12:12] And quantum circuits use it to create the correlation that the classical
[12:16] the correlation that the classical circuits simply cannot.
[12:19] Entanglement is the final uh goal in which or the most important critical resource that a classical computer cannot easily replicate.
[12:28] We would need so many amount of GPUs or CPUs to to describe a particular entangled state which is very which is very non-trivial.
[12:38] And one more animation here shows how one can go from superposition to an entangled state.
[12:42] How would one do that?
[12:45] Given a state here we have zero and zero.
[12:48] And if I apply a Hadamard gate on the first qubit, I would go from 00 to 00 + 10 state.
[12:54] Like this goes into 0 + 1.
[12:55] This still stays zero.
[12:59] But if I have a CNOT, then what would happen is that 00 + 10 would change to 00 + 11.
[13:03] Here the two qubits are entangled.
[13:06] Once if you measure state the first qubit as one, the other qubit would definitely be in the state one.
[13:14] So, this is the biggest property of an entanglement.
[13:16] Now, let's move on, right?
[13:16] Sequence of
[13:18] Now, let's move on, right?
[13:18] Sequence of gates usually form a quantum circuit.
[13:21] gates usually form a quantum circuit, right?
[13:24] Which means all these gates that we have seen until now, X gate, H gate,
[13:27] and Z gate, and the CNOT gate, and stuff, rotation gates.
[13:31] All of these gates if you put together, they usually form a quantum circuit.
[13:33] And a circuit with tunable rotation parameters is called an ansatz.
[13:38] Sometimes without rotational parameters is So, it's called an A circuit with tunable rotation parameters is called an ansatz.
[13:44] A fixed circuit, right?
[13:47] A fixed circuit is where every gate is predetermined like X, H, and Z gates where the output state is always same.
[13:54] But a parameterized circuit, which is basically what we call an ansatz, the rotation angle theta's are the very free variables.
[13:59] So, the output state psi of theta changes as the theta changes.
[14:04] So, theta could be taking any value between the zero to pi and the output accordingly the output state accordingly changes.
[14:15] Unlike the fixed circuit state where we the output is
[14:19] circuit state where we the output is always the same.
[14:22] The ansatz The key idea here will will be that the ansatz defines the family of quantum states
[14:27] that are parameterized by the theta, which means that any of psi of theta
[14:31] could only be reached by varying the theta bar.
[14:33] Now, we will come to this later, but keep this in mind a classical optimizer then searches over this theta
[14:38] bar with the free variables to find the best state in that particular family
[14:43] that we are looking for,
[14:45] right?
[14:46] If this If this doesn't make sense right now, stay with me like you will see it in the later part of the lecture.
[14:52] So, ansatz usually came from German word.
[14:55] It's called ansätze where uh it's called an approach or an educated guess.
[14:58] And because we have parameter parameters, we could call a quantum circuit with parameters as a parameterized quantum circuit.
[15:02] Okay?
[15:05] Let's just say we have three different qubits here, right?
[15:07] 0 and 0 and 0.
[15:09] Now, if we apply a fixed Hadamard gate, Hadamard gate as we know would take us from state zero to superposition
[15:21] take us from state zero to superposition 0 + 1 0 + 1 and 0 + 1.
[15:25] Now, this is 0 + 1 0 + 1 and 0 + 1.
[15:25] Now, this is already expected.
[15:28] We know what comes up.
[15:28] already expected.
[15:28] We know what comes up.
[15:33] But what if I have a rotation gates RY?
[15:33] Let's just say RY here rotation around
[15:34] the about the Y axis
[15:36] with free variables of theta one, theta
[15:39] two, and theta three, right?
[15:42] Now, I don't know the output of the state until
[15:43] then I know what are what are the theta values.
[15:45] So, the as this theta value changes, the output that I get would be
[15:47] totally different.
[15:48] And here we could also add an entangled states two-qubit
[15:51] two-qubit gates like CNOT gates.
[15:53] And even after that we could also have this
[15:56] tunable gates, right?
[15:57] So, there is no particular pattern which how you need to
[15:59] arrange your gates.
[16:02] So, it's up to you how you want to build your quantum
[16:03] circuit based on your particular needs.
[16:05] And once we measure it, until here
[16:07] my entire system is quantum, but once we
[16:10] measure it, the information gets collapsed and it becomes a classical
[16:12] information.
[16:14] And here this ansätze has basically six tunable parameters and we
[16:16] have six degrees of freedom, right?
[16:19] They
[16:24] have six degrees of freedom, right?
[16:26] They are like six I don't know like so many combinations.
[16:29] The output state usually is that comes after this tunable state is psi of theta is basically what you put into this theta and times the input state of 000, right?
[16:39] Apart from the fixed gates that we know that what we already do.
[16:43] And why do we need an ansatz?
[16:46] Why are we even talking about this?
[16:47] The full quantum state space, right?
[16:49] Which is also called Hilbert space named after the great scientist David Hilbert, which has two to the n dimensions for n qubits, right?
[16:58] If I have two qubits, I would have two to the two four dimensions of it.
[17:03] If I have two to the three qubits, I would have eight, which is as this n increases, two to the 20 would become almost a million, right?
[17:12] So, there are 20 There are a million possible quantum states for a 20 qubit system, right?
[17:17] And if I want to search for it for my given problem, there is one state out of this million states
[17:26] one state out of this million states that would give my that would give me that would give my that would give me the answer to my problem, that becomes the answer to my problem, that becomes too vast to search, right?
[17:33] Directly.
[17:33] too vast to search, right?
[17:34] Directly.
[17:34] It's like again searching a needle in the haystack.
[17:36] But what happens with the ansatz is
[17:38] ansatz is instead of searching in a very huge space, we restrict our search to a very manageable subspace, which means instead of searching in a million, maybe we could search in a space of 100.
[17:40] instead of searching in a very huge space, we restrict our search to a very manageable subspace, which means instead of searching in a million, maybe we could search in a space of 100.
[17:42] space, we restrict our search to a very manageable subspace, which means instead of searching in a million, maybe we could search in a space of 100.
[17:44] manageable subspace, which means instead of searching in a million, maybe we could search in a space of 100.
[17:46] of searching in a million, maybe we could search in a space of 100.
[17:48] That time like suddenly the problem the solution finding becomes much more clearer.
[17:50] time like suddenly the problem the solution finding becomes much more clearer.
[17:52] clearer.
[17:53] So, that let's say there are few parameters, which are like small number of parameters.
[17:54] So, that let's say there are few parameters, which are like small number of parameters.
[17:56] parameters, which are like small number of parameters.
[17:58] Small number of parameters means that we can only we can only have a smaller subspace, right?
[18:00] parameters means that we can only we can only have a smaller subspace, right?
[18:02] only have a smaller subspace, right?
[18:05] Means that it is very fast to search for the solution if your solution is there in that, but there's a huge chance that the answer may not lie in that 100 out of 1 million space, right?
[18:07] the solution if your solution is there in that, but there's a huge chance that the answer may not lie in that 100 out of 1 million space, right?
[18:10] in that, but there's a huge chance that the answer may not lie in that 100 out of 1 million space, right?
[18:12] the answer may not lie in that 100 out of 1 million space, right?
[18:15] of 1 million space, right?
[18:18] But what would you do?
[18:18] Then the next obvious intuitive thing to do is that increase this number of parameters.
[18:20] intuitive thing to do is that increase this number of parameters.
[18:22] Instead of having 100 parameters, let's say have 1,000 parameters, right?
[18:25] this number of parameters.
[18:25] Instead of having 100 parameters, let's say have 1,000 parameters, right?
[18:26] 1,000 parameters, right?
[18:26] Increase by 10x.
[18:29] Maybe then out of by luck that we 10x.
[18:29] Maybe then out of by luck that we might be able to find the true answer,
[18:31] might be able to find the true answer, which is there in a larger subspace than
[18:34] which is there in a larger subspace than is than the smaller one where by adding
[18:37] is than the smaller one where by adding more parameters we could reach the
[18:39] more parameters we could reach the subspace, right?
[18:41] subspace, right?
[18:41] So, more parameters means larger subspace.
[18:44] Definitely means larger subspace.
[18:44] Definitely definitely it would cover more number of
[18:46] definitely it would cover more number of solutions, which is has a better
[18:47] solutions, which is has a better coverage.
[18:50] But what happens is as you increase the number of parameters, it
[18:52] increase the number of parameters, it becomes so much more harder to optimize,
[18:55] becomes so much more harder to optimize, right?
[18:58] right?
[18:58] There is basically no free lunch.
[19:00] Either you have a smaller one, do it fast, or if you have a bigger one,
[19:01] fast, or if you have a bigger one, you'll have to do it slow because it's
[19:03] you'll have to do it slow because it's hard to optimize.
[19:06] hard to optimize. And in principle one can actually generate the full Hilbert
[19:08] can actually generate the full Hilbert space by increasing the number of
[19:09] parameters to the dimension of the full
[19:12] parameters to the dimension of the full Hilbert space.
[19:14] Hilbert space. Let's just say here I just for the pictorially representing I
[19:16] just for the pictorially representing I just said infinite, but as you keep
[19:18] just said infinite, but as you keep increasing the number of parameters, you
[19:20] increasing the number of parameters, you would be able to span the entire Hilbert
[19:22] would be able to span the entire Hilbert space.
[19:24] space. Once you span your entire Hilbert space as slash solution
[19:27] solution Hilbert space as slash solution space, then you will be able to space, then you will be able to completely cover your solution whatever is there in it, but the problem is that it's intractable to do this, right?
[19:35] That's the whole reason why we started with ansatz. So, we need an ansatz because we'll not be we cannot track this uh two to the n dimensional Hilbert space, but we'll also have to make sure that there is a trade-off in which that uh if my solution doesn't lie in the smaller initial parameters, we have to keep increasing the number of parameters until we reach a particular true solution, right?
[20:01] And yes, the key [snorts] idea basically delivers the same point that I've been saying.
[20:04] Choosing the right answers means the covering enough of the Hilbert space while keeping the parameter count very manageable, right?
[20:12] So, now we know why an answers, right?
[20:15] The next thing is, okay, now that we know why do we need it?
[20:17] Like what are the different types of answers?
[20:19] Because not all answers are same, right?
[20:22] The key question is The key question next comes is that how much do you know about your problem?
[20:29] you know about your problem? Right? More often than not, you would
[20:31] Right? More often than not, you would know something about your problem, but
[20:33] know something about your problem, but let's just say in a case where you don't
[20:35] let's just say in a case where you don't know about the problem or even in spite
[20:37] know about the problem or even in spite of knowing something about the problem,
[20:40] of knowing something about the problem, you cannot make a clear decision, there
[20:42] you cannot make a clear decision, there is one type of answers called generic
[20:44] is one type of answers called generic answers, where usually you have you will
[20:46] answers, where usually you have you will be choosing the best quantum You will be
[20:49] be choosing the best quantum You will be choosing the gates, whatever the
[20:50] choosing the gates, whatever the hardware provider would provide. Let's
[20:52] hardware provider would provide. Let's just say IBM provides you square root of
[20:55] just say IBM provides you square root of Z gate, X gate, and CNOT gate, right?
[20:58] Z gate, X gate, and CNOT gate, right? Which means that in our quantum circuit
[21:00] Which means that in our quantum circuit answers, those Hadamard, those Z gates,
[21:03] answers, those Hadamard, those Z gates, and all of it that we have seen will be
[21:05] and all of it that we have seen will be only compiled into the native basis set
[21:08] only compiled into the native basis set gates, which are hardware efficient,
[21:10] gates, which are hardware efficient, right? So, no matter what type of
[21:12] right? So, no matter what type of quantum gates you will have, the minute
[21:14] quantum gates you will have, the minute you go to the hardware, you would be
[21:16] you go to the hardware, you would be compiling them into the basic hardware
[21:18] compiling them into the basic hardware gates that the hardware provides. These
[21:21] gates that the hardware provides. These hardware provides These basis gates,
[21:23] hardware provides These basis gates, which are native to the hardware, would
[21:25] which are native to the hardware, would change from a different hardware to
[21:27] change from a different hardware to hardware. IBM would have something else,
[21:29] hardware. IBM would have something else, trapped ions would have something else,
[21:31] trapped ions would have something else, and photon photonic base would have
[21:34] and photon photonic base would have something else. If you would need to
[21:35] something else. If you would need to choose. And in in a case where your
[21:38] choose. And in in a case where your problem usually needs supports actually
[21:41] problem usually needs supports actually certain type of gates, then that's a
[21:43] certain type of gates, then that's a best fit for you that you could happily
[21:45] best fit for you that you could happily go with the hardware efficient answers.
[21:47] go with the hardware efficient answers. But what if you know something, right?
[21:49] But what if you know something, right? When we know something, then we can
[21:51] When we know something, then we can choose an answers which is actually
[21:53] choose an answers which is actually problem inspired. What does it mean?
[21:55] problem inspired. What does it mean? Like we'll try to think more about the
[21:57] Like we'll try to think more about the problem where we try to use the
[22:00] problem where we try to use the structure of the problem and then design
[22:02] structure of the problem and then design the circuit and make variants of it
[22:05] the circuit and make variants of it exist for different different domains,
[22:07] exist for different different domains, right? Like there are like physics,
[22:08] right? Like there are like physics, chemistry, optimizations, and stuff like
[22:10] chemistry, optimizations, and stuff like that. So, there are different types of
[22:12] that. So, there are different types of problems, and one can always choose an
[22:14] problems, and one can always choose an answers that is inspired by those type
[22:16] answers that is inspired by those type of problems.
[22:18] of problems. And uh go ahead about it, right? The
[22:21] And uh go ahead about it, right? The thing is the generic answers is very
[22:23] thing is the generic answers is very easy to implement, but uh we may waste
[22:27] easy to implement, but uh we may waste parameters exploring the irrelevant
[22:29] parameters exploring the irrelevant states.
[22:30] states. Where the problem-inspired answers is
[22:32] Where the problem-inspired answers is search where it actually matters, right?
[22:34] search where it actually matters, right? Problem-inspired already knows where
[22:36] Problem-inspired already knows where would the solution be in this 2 to the
[22:38] would the solution be in this 2 to the end dimensional Hilbert space. So, use
[22:41] end dimensional Hilbert space. So, use whatever gates your quantum computer
[22:43] whatever gates your quantum computer does the best. That's the hardware
[22:45] does the best. That's the hardware efficient answers in a repeating
[22:47] efficient answers in a repeating pattern, mostly, right? So, let's say I
[22:50] pattern, mostly, right? So, let's say I have RY and RZ as the native gates for
[22:53] have RY and RZ as the native gates for my quantum hardware, then I would have
[22:56] my quantum hardware, then I would have one two-qubit CNOT gate, and then I
[22:59] one two-qubit CNOT gate, and then I would try to keep repeating this
[23:01] would try to keep repeating this particular pattern set of gates, so that
[23:03] particular pattern set of gates, so that I would have more I would have more It
[23:06] I would have more I would have more It would be easier to optimize because they
[23:08] would be easier to optimize because they are the same native gates. And by
[23:11] are the same native gates. And by repeating the same gates, there is a
[23:12] repeating the same gates, there is a higher chance that we might get signal
[23:15] higher chance that we might get signal amongst all the noise that might
[23:17] amongst all the noise that might actually come. So, rotation layers,
[23:20] actually come. So, rotation layers, entangle layer, and repeat. This is
[23:22] entangle layer, and repeat. This is usually the pattern when you do with the
[23:24] usually the pattern when you do with the when you're trying to do anything with
[23:26] when you're trying to do anything with the hardware efficient answers. So, what
[23:28] the hardware efficient answers. So, what are the pros, right? The advantages are
[23:30] are the pros, right? The advantages are definitely that it needs shallow
[23:32] definitely that it needs shallow circuits because you don't need to
[23:33] circuits because you don't need to compile a different gate set into your
[23:36] compile a different gate set into your hardware native gates. You're already
[23:38] hardware native gates. You're already writing
[23:40] writing the main answers into the hardware
[23:43] the main answers into the hardware native gates. So, there is You would
[23:45] native gates. So, there is You would have shallow circuits, which means the
[23:46] have shallow circuits, which means the depth of the circuit would be less. And
[23:48] depth of the circuit would be less. And because the current day quantum hardware
[23:51] because the current day quantum hardware is noisy, and it can run very well on
[23:53] is noisy, and it can run very well on that, and it's [clears throat] also easy
[23:55] that, and it's [clears throat] also easy to implement. But the disadvantages also
[23:58] to implement. But the disadvantages also comes with that, right? There's no
[24:00] comes with that, right? There's no domain knowledge as such here in this.
[24:02] domain knowledge as such here in this. Why do we choose Why do you only choose
[24:04] Why do we choose Why do you only choose RY and RZ, why not RX, square root of X,
[24:06] RY and RZ, why not RX, square root of X, or Hadamard? We don't know because
[24:08] or Hadamard? We don't know because there's no domain knowledge for us. And
[24:10] there's no domain knowledge for us. And we can also hit barren plateaus, right?
[24:12] we can also hit barren plateaus, right? Which we'll be discussing in second
[24:14] Which we'll be discussing in second lecture. And maybe here we see, right?
[24:17] lecture. And maybe here we see, right? What is the advantage of adding more
[24:18] What is the advantage of adding more number of parameters to the system? So,
[24:21] number of parameters to the system? So, adding more number of parameters would
[24:23] adding more number of parameters would always make your
[24:24] always make your search be
[24:26] search be search easier. Having two qubits, having
[24:28] search easier. Having two qubits, having two parameters, would would give you a
[24:30] two parameters, would would give you a landscape something like this. Having
[24:32] landscape something like this. Having four parameters, would you have this
[24:33] four parameters, would you have this type of control? And having eight
[24:35] type of control? And having eight parameters, your optimization becomes so
[24:37] parameters, your optimization becomes so much more easier. I mean to say that the
[24:39] much more easier. I mean to say that the finding the solution becomes so much
[24:40] finding the solution becomes so much more easier. Because if you start out
[24:42] more easier. Because if you start out with two parameters, you you would never
[24:44] with two parameters, you you would never reach the minimum because like there is
[24:46] reach the minimum because like there is not enough much of degree of freedom for
[24:49] not enough much of degree of freedom for you to change the parameters and try out
[24:51] you to change the parameters and try out different parts of the landscape.
[24:54] different parts of the landscape. Whereas by having increasing the number
[24:56] Whereas by having increasing the number of parameters, you would be doing a
[24:58] of parameters, you would be doing a favor for yourself on trying to find the
[25:01] favor for yourself on trying to find the optimized solution. So, more
[25:03] optimized solution. So, more [clears throat] parameters, smoother
[25:04] [clears throat] parameters, smoother landscape, and easier to optimize. And
[25:07] landscape, and easier to optimize. And there are problem-inspired answers,
[25:08] there are problem-inspired answers, right? As we discussed, if you know the
[25:09] right? As we discussed, if you know the structure of the problem, try build it
[25:11] structure of the problem, try build it into the circuit. That's the main
[25:13] into the circuit. That's the main important thing. Like physics problems,
[25:15] important thing. Like physics problems, which basically use the gates that can
[25:18] which basically use the gates that can respect the symmetries, conserve
[25:19] respect the symmetries, conserve particle spin, particle number, spin,
[25:23] particle spin, particle number, spin, and many more quantum mechanical thing.
[25:25] and many more quantum mechanical thing. And optimization problems also you could
[25:27] And optimization problems also you could directly encode the cost function into
[25:29] directly encode the cost function into the circuit structure. Whereas chemistry
[25:31] the circuit structure. Whereas chemistry problems, you could also deal with
[25:33] problems, you could also deal with excitation operators and mimic how these
[25:35] excitation operators and mimic how these electrons get excited among the energy
[25:37] electrons get excited among the energy levels. That's one, and the advantages,
[25:40] levels. That's one, and the advantages, as we see, there are only less number of
[25:43] as we see, there are only less number of parameters needed, which means that it's
[25:45] parameters needed, which means that it's tractable, but it it also avoids the
[25:48] tractable, but it it also avoids the irrelevant states. That's also good, but
[25:50] irrelevant states. That's also good, but often converges faster because the
[25:52] often converges faster because the solution space that it has is quite
[25:54] solution space that it has is quite smaller compared to the hardware
[25:56] smaller compared to the hardware efficient answers. But the con is that
[25:58] efficient answers. But the con is that it has deeper circuits because the
[26:00] it has deeper circuits because the current day NISQ devices because it's
[26:03] current day NISQ devices because it's the hardware efficient gates are not
[26:06] the hardware efficient gates are not usually the ones that the
[26:07] usually the ones that the problem-inspired answers has. So, when
[26:09] problem-inspired answers has. So, when you compile the problem-inspired answers
[26:11] you compile the problem-inspired answers into the hardware efficient native
[26:13] into the hardware efficient native gates, then we would get a deeper
[26:15] gates, then we would get a deeper circuit, right? Which means a deeper
[26:17] circuit, right? Which means a deeper circuit always means more noise. And
[26:19] circuit always means more noise. And requires domain expertise and also very
[26:22] requires domain expertise and also very pretty problem-specific. And
[26:26] pretty problem-specific. And as we see here already, there are few
[26:27] as we see here already, there are few contradicting points that we discussed.
[26:30] contradicting points that we discussed. So,
[26:31] So, we said go for more number of
[26:33] we said go for more number of parameters,
[26:34] parameters, but also going for more number of
[26:36] but also going for more number of parameters making your optimization
[26:38] parameters making your optimization hard, but having less number of
[26:39] hard, but having less number of parameters
[26:41] parameters would never lead you to the optimal
[26:43] would never lead you to the optimal solution. So, there is a particular
[26:46] solution. So, there is a particular trade-off here, and the trade-off comes
[26:48] trade-off here, and the trade-off comes with that
[26:50] with that as the problem knowledge along the
[26:53] as the problem knowledge along the x-axis grows, which means that the
[26:55] x-axis grows, which means that the problem knowledge is very negligible
[26:57] problem knowledge is very negligible around here. At this point of time, we
[26:59] around here. At this point of time, we can think of that hardware efficient
[27:01] can think of that hardware efficient answers circuits usually be here, right?
[27:04] answers circuits usually be here, right? Where the circuit depth goes along the
[27:05] Where the circuit depth goes along the y-axis. So, hardware efficient answers
[27:08] y-axis. So, hardware efficient answers usually has low problem knowledge and
[27:11] usually has low problem knowledge and low circuit depth, right? So, it's noisy
[27:14] low circuit depth, right? So, it's noisy hardware friendly, and it's also
[27:16] hardware friendly, and it's also generic, and it has shallow circuit. But
[27:18] generic, and it has shallow circuit. But as we increase the problem knowledge,
[27:21] as we increase the problem knowledge, you would go to problem inspired. It
[27:22] you would go to problem inspired. It will be more of a tailored instead of
[27:24] will be more of a tailored instead of generic, and it will have deeper
[27:26] generic, and it will have deeper circuits, and we need the domain
[27:28] circuits, and we need the domain expertise. So, one always needs to have
[27:30] expertise. So, one always needs to have a certain type of trade-off. Hybrid is
[27:33] a certain type of trade-off. Hybrid is that there are there are a particular
[27:35] that there are there are a particular methods around this. One is start with
[27:37] methods around this. One is start with the hardware efficient answers, then you
[27:39] the hardware efficient answers, then you can refine the You can also model modify
[27:42] can refine the You can also model modify your answers as you go by refining with
[27:44] your answers as you go by refining with the structure. That's also one more way
[27:46] the structure. That's also one more way of doing it. And
[27:48] of doing it. And this trade-off would also be represented
[27:50] this trade-off would also be represented in a totally different way, right? There
[27:52] in a totally different way, right? There are two competing properties for any
[27:55] are two competing properties for any given answers. One is expressibility,
[27:59] given answers. One is expressibility, and one is trainability,
[28:01] and one is trainability, right? Expressibility basically talks
[28:03] right? Expressibility basically talks about how many states can the answers
[28:06] about how many states can the answers reach? How many states can this answers
[28:08] reach? How many states can this answers reach? Like how many number of
[28:09] reach? Like how many number of parameters I can have? And trainability
[28:12] parameters I can have? And trainability means like how easy it is to optimize.
[28:14] means like how easy it is to optimize. More parameters usually are is more
[28:17] More parameters usually are is more expressible, but harder to train. As the
[28:19] expressible, but harder to train. As the number of parameters increase, the
[28:21] number of parameters increase, the expressibility goes up.
[28:24] expressibility goes up. Okay? But too few means that answer is
[28:26] Okay? But too few means that answer is probably unreachable.
[28:29] probably unreachable. And too many would give you a flat
[28:32] And too many would give you a flat landscape, which is barren plateaus.
[28:34] landscape, which is barren plateaus. We'll talk about it later. As you
[28:35] We'll talk about it later. As you increase the number of parameters, the
[28:37] increase the number of parameters, the trainability of the quantum circuit
[28:40] trainability of the quantum circuit would go away. But it kind of feels
[28:42] would go away. But it kind of feels that, okay, as expressibility increases
[28:44] that, okay, as expressibility increases with the number of parameters and the
[28:45] with the number of parameters and the trainability goes down, it seems to me
[28:47] trainability goes down, it seems to me that there is some sort of a middle way
[28:49] that there is some sort of a middle way and some sort of sweet spot, right?
[28:52] and some sort of sweet spot, right? Where we can target it. There would be a
[28:54] Where we can target it. There would be a sweet spot where we can have certain
[28:56] sweet spot where we can have certain level of expressibility without
[28:58] level of expressibility without compromising too much on the
[28:59] compromising too much on the trainability, right? So, a good answers
[29:02] trainability, right? So, a good answers usually sits around this area. And you
[29:05] usually sits around this area. And you would never know what is a good answers
[29:07] would never know what is a good answers until unless you try out few, and then
[29:09] until unless you try out few, and then you get a knowledge of how much it is
[29:11] you get a knowledge of how much it is trainable and how much it is
[29:12] trainable and how much it is expressible. So, mostly one has to focus
[29:15] expressible. So, mostly one has to focus for these type of areas when you start
[29:18] for these type of areas when you start with an answers. So, the summary would
[29:21] with an answers. So, the summary would be that the qubits live in superposition
[29:24] be that the qubits live in superposition and can be entangled, and we visualize
[29:27] and can be entangled, and we visualize them on the Bloch sphere, right? And
[29:29] them on the Bloch sphere, right? And quantum gates are usually reversible,
[29:31] quantum gates are usually reversible, and they rotate the qubit state on the
[29:33] and they rotate the qubit state on the Bloch sphere. Apart from the hard rigid
[29:36] Bloch sphere. Apart from the hard rigid gates, we also have fine-tunable
[29:38] gates, we also have fine-tunable rotation gates, RX, RY, and RZ, which
[29:40] rotation gates, RX, RY, and RZ, which would be the workhorse of the
[29:43] would be the workhorse of the variational quantum algorithms. An
[29:44] variational quantum algorithms. An answers is a quantum circuit that we
[29:46] answers is a quantum circuit that we stitch together these gates in the
[29:48] stitch together these gates in the hardware and there are two different
[29:50] hardware and there are two different types of answers. One is hardware
[29:52] types of answers. One is hardware efficient, one is problem inspired and
[29:55] efficient, one is problem inspired and we
[29:55] we we also seen the expressibility and the
[29:57] we also seen the expressibility and the trainability trade-off. What determines
[30:00] trainability trade-off. What determines the right number of parameters? And the
[30:02] the right number of parameters? And the key idea is what are we optimizing,
[30:04] key idea is what are we optimizing, right? Like what are we optimizing for?
[30:06] right? Like what are we optimizing for? Hamiltonian's expectation values, cost
[30:08] Hamiltonian's expectation values, cost functions in the hybrid quantum room.
[30:10] functions in the hybrid quantum room. This is what we would be discussing in
[30:12] This is what we would be discussing in the next session.
[30:15] the next session. >> [music]
