Protein folding is an important problem at the foundation of biological phenomena. If we could predict how a sequence of amino acids folds into a three-dimensional structure, it would greatly contribute to drug design and understanding diseases. However, the folding problem is extremely difficult in computational science.
Structure space grows explosively
Each amino-acid residue has dihedral angles ψ and φ, and their combinations generate countless possible spatial structures.
For example, if φ and ψ are each discretized into several dozen levels, the number of possible structures grows exponentially with respect to the number of residues N.
Limitations of classical computation
Classically, the search is performed using molecular dynamics (MD) simulations or Monte Carlo (MC) methods.
In the MC method in particular, the procedure “compute energy difference ΔE → determine whether to accept the new structure using the classical Metropolis rule” is repeated.
However, because many energy minima exist, classical searches tend to fall into local optima, and large-scale exploration is computationally difficult.
Potential of quantum computation
Quantum computers can make use of superposition (parallelism) and amplitude amplification.
In particular, the Szegedy-type quantum walk is a quantized version of a classical Markov chain, and the paper shows the possibility of accelerating the Metropolis method. In other words, by representing local moves with a quantum walk, it is expected that the structure space can be explored more efficiently.
Method: QFold, A protein-folding method based on quantum walks
In the proposed method, there is a classical processing part (Initialization module) that uses deep learning, and a quantum/classical Metropolis method (Simulation module) that executes quantum computation based on the obtained classical processing.The combination of this classical processing part and the quantum computation part is called QFold.The quantum computation part alone is referred to as the quantum Metropolis method.The quantum Metropolis method aims to perform an approximate exploration of the energy landscape by embedding an update rule similar to the Metropolis–Hastings method into a quantum circuit based on the Szegedy-type quantum walk.
In QFold’s classical preprocessing, prior to executing the quantum computation, a finite set of φ–ψ angle configurations is sampled.For each configuration, using quantum chemistry software (e.g., Psi4), the energy difference ΔE in the transition from the current structure to the proposed structure is calculated and saved as a dataset.Through this process, the “stability” of each local structural change is provided as numerical values. However, in this notebook, classical processing is not performed; instead, the dataset already computed in [2] is used, and the focus is placed only on the quantum computation.
For each move, the acceptance probability is defined asA=min(1,e−βΔE)(β is the inverse-temperature parameter).This probability is converted into the rotation angleθ=2arcsin(A)and Ry(θ) is applied to the coin qubit. As a result, low-energy structures are assigned higher amplitudes, and the transitions of the quantum walk are biased toward energetically favorable directions. Next, when the coin qubit is 1, the dihedral angle (φ or ψ) indicated by the move-id is updated. Afterwards, the auxiliary computation is uncomputed, and the reversibility of the entire operation is preserved. Furthermore, a Grover-type reflection operator is applied, and according to Szegedy’s quantization of the Markov chain, the amplitudes of the low-energy states are amplified through interference effects.
After repeating the walk operator for multiple steps, the φ and ψ registers are measured.Combinations that appear with high frequency in the measurement results correspond to low free-energy, stable folding structures.
The dataset used in this study is a table of energy differences ΔE that were precomputed by classical computation (e.g., Psi4) after restricting the protein folding problem to a finite number of discrete states. In this notebook, the energy set by Minifold is not computed; instead, the dataset described in reference [2] is used.Each state is uniquely represented by a binary string of five types of bits, and each bit is assigned the following meaning.The dataset stores “the ΔE obtained when a state (φ, ψ) is changed according to the move-id.”
Quantum register
Physical meaning
Notes
ϕ register
Current value of the dihedral angle ϕ (0 = 0°, 1 = π)
Rotational degree of freedom of the protein backbone
ψ register
Current value of the dihedral angle ψ (0 = 0°, 1 = π)
The other main dihedral angle
M register
move-id (0 → change ϕ / 1 → change ψ)
Specifies which angle to move
move-val
(0 = −π, 1 = +π)
In the 1-bit discretization this is always 1 (fixed at +π)
coin (auxiliary)
coin placeholder
The actual coin qubit is generated inside the circuit; the input is always 0
Example: key = “00100”
b0=0→ϕ=0°
b1=0→ψ=0°
b2=1→moveψ
b3=0→−π (the dataset always uses 1, so fixed to 1)
b4=0→ coin placeholder
The ΔE value corresponding to this key means “the energy difference when the current structure (ϕ=0,ψ=0) has ψ=0 changed by +π.”
Following Algorithm 1, for each state (ϕ,ψ,M), ΔE is read out and the corresponding θ is computed.For example, in the case M=0 (updating ψ), ΔE is obtained for the current structure with ϕ=0 and ψ=1, and by taking its average, Aϕ is determined.This is converted into θϕ, and a controlled rotation is applied to the coin register.
Similarly, when M=1, θψ for updating ψ is computed and applied to the coin register.
If the coin register is 1, the ϕ or ψ register is flipped according to the corresponding move-id.This corresponds to accepting the new structure.After that, the coin rotation is uncomputed to erase the auxiliary information, and by adding the oracle operator, the amplitude of states corresponding to low-energy structures is amplified.In QFold (particularly in the quantum circuit of Fig. 6), the reason the oracle is applied to ∣CM⟩=∣00⟩ is that at every step of the walk, the auxiliary registers for coin and move are always reset to ∣00⟩.
In other words, since all update processes (the accept/reject decision) proceed starting from that state, the oracle operator for selecting “good structures” is also defined with respect to ∣CM⟩=∣00⟩.
Below, the actions of the coin operator and the oracle operator are explained.First, write the state as ∣ϕψMC⟩:∣ψ0⟩=21ϕ,ψ∈0,1∑∣ϕψ⟩⊗∣M=0⟩⊗∣C=0⟩Next, a rotation is given to the coin based on ΔE.For example, when M=0 (updating ϕ):∣ϕψ,M=0,C=0⟩⟶1−Aϕ∣ϕψ,0,0⟩+Aϕ∣ϕψ,0,1⟩Here, Aϕ=min(1,e−βΔE).
Next, the coin operator is applied (Compute).
If C=1, ϕ is flipped:∣ϕψ,M=0,C=1⟩→∣(ϕ⊕1)ψ,0,1⟩If C=0, nothing is done.Next, the Uncompute of the coin operator is performed.
Then, by reversing the coin rotation, C is reset to 0:1−Aϕ∣ϕψ,0,0⟩+Aϕ∣(ϕ⊕1)ψ,0,0⟩As a result, the information of the rotation angle (which rotation angle was applied) remains only in ϕψ, and M,C return again to ∣00⟩.Finally, by applying the oracle operator to the subspace ∣MC⟩=∣00⟩, the information of the acceptance probability alone is marked:∣ϕψ,00⟩↦−∣ϕψ,00⟩Therefore, by alternately applying the coin operator, the oracle operator, and the shift operator at each walk step, the information including the acceptance probability Aϕ is amplified, and by repeatedly implementing the quantum walk, the information reflecting the optimal structure is obtained.More detailed information is described in [1].In Algorithm 1 of the paper, in constructing the coin operator, energy-difference values that are close to each other are used.For R0, “00001” and “01001” are used, and for R1, “00101” and “10101” are used.One might think that it is not necessary to include other information, but by using only the information of ϕ and ψ, other rotation information (for example, ∣110⟩) can be included through quantum parallelism arising from superposition.This can be expressed by the following code.
import jsonimport mathdef build_coin_prep_from_dataset(data: dict, beta: float = 1.0): delta_tbl = data["deltas"] # ΔE → θ def theta(dE: float) -> float: A = min(1.0, math.exp(-beta * dE)) return 2 * math.asin(math.sqrt(A)) # φ keys_phi = ["00001", "01001"] # ψ keys_psi = ["00101", "10101"] # average A A_phi = 0.5 * sum(min(1.0, math.exp(-beta * delta_tbl[k])) for k in keys_phi) A_psi = 0.5 * sum(min(1.0, math.exp(-beta * delta_tbl[k])) for k in keys_psi) # θ theta_phi = 2 * math.asin(math.sqrt(A_phi)) theta_psi = 2 * math.asin(math.sqrt(A_psi)) return theta_phi, theta_psi