When Scoring Functions Smile and Cry

"Score is -8.5. Good."

Lina checked the docking results.

"But the measured value?" Eiji asked.

"IC50 is 100 micromolar. Disappointing."

Akira sighed. "Deceived by the scoring function again."
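
As a rough side calculation of how far apart the two numbers are: if the score is read as an estimated binding free energy in kcal/mol, and the IC50 is treated as if it were a dissociation constant, the measured value can be converted with ΔG ≈ RT ln(IC50). Both readings are assumptions; the conversation states neither. A minimal Python sketch under those assumptions:

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
T_KELVIN = 298.15   # assumed temperature (25 degrees C)

def ic50_to_dg(ic50_molar):
    """Approximate binding free energy, assuming IC50 ~ Kd (a simplification)."""
    return R_KCAL * T_KELVIN * math.log(ic50_molar)

measured_dg = ic50_to_dg(100e-6)   # 100 micromolar, from the dialogue
predicted_dg = -8.5                # docking score, ASSUMED to be in kcal/mol
gap = measured_dg - predicted_dg   # positive gap = weaker binding than predicted
```

Under these assumptions the measurement corresponds to roughly -5.5 kcal/mol, about 3 kcal/mol weaker than the score suggested.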

"Deceived?"

"Even with a good score, actual activity is low. It happens often," Lina explained.

"Why are scoring functions so inaccurate?"

"Because it's too complex," Akira answered. "We're trying to represent the free energy of protein-ligand binding with a simple equation."

Lina wrote an equation on the whiteboard.

"ΔG = ΔG_vdW + ΔG_elec + ΔG_hbond + ΔG_desolv + ΔG_entropy"

"Van der Waals, electrostatic, hydrogen bonding, desolvation, entropy..."

"We add these together to predict binding affinity. But the weight of each term is difficult."

Eiji added. "Scoring functions are trained on known complex structures. But their generalization performance is limited."

"Generalization performance?"

"The property of being able to accurately predict targets and ligands not in the training data."

Lina gave a specific example. "For instance, if you apply a scoring function trained on hydrophobic pockets to highly polar pockets, it tends to fail."

"Scoring functions have strengths and weaknesses too."

"Yes. So we use multiple scoring functions. Consensus scoring."

Akira explained. "We evaluate each compound with, say, five scoring functions and decide by majority vote."
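
A minimal sketch of the majority vote Akira describes, assuming that lower scores are better and that each function has its own pass cutoff. The function names `sf1` to `sf5`, the scores, and the cutoffs are all hypothetical.

```python
def consensus_vote(scores, cutoffs):
    """Count functions whose score passes its cutoff; pass on a strict majority."""
    votes = sum(1 for name, score in scores.items() if score <= cutoffs[name])
    return votes, votes > len(scores) / 2

# Hypothetical scores for one compound from five scoring functions.
scores = {"sf1": -8.5, "sf2": -7.0, "sf3": -9.1, "sf4": -5.2, "sf5": -6.0}
# Hypothetical cutoffs; real functions use different scales and need their own.
cutoffs = {name: -7.0 for name in scores}

votes, accepted = consensus_vote(scores, cutoffs)   # 3 votes out of 5, accepted
```

Voting on pass/fail rather than averaging raw scores sidesteps the fact that different functions report on different scales.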

"But," Eiji said, "even then it can be wrong."

"What's the problem?"

Lina looked serious. "Fundamentally, it's the handling of the entropy term."

"Entropy?"

"The degrees of freedom lost upon binding. Both ligand and protein become less mobile. That's entropically unfavorable."

"But how do you calculate it?"

"You can't calculate it accurately. So we add empirical penalties."

Akira added. "Like a penalty proportional to the number of rotatable bonds. But it's a rough approximation."
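
The rotatable-bond penalty can be sketched in a few lines. The coefficient of 0.3 kcal/mol per bond is a made-up illustrative value, not the parameter of any particular scoring function, and the bond count is supplied by hand rather than computed from a structure.

```python
PENALTY_PER_ROTATABLE_BOND = 0.3   # hypothetical coefficient, kcal/mol per bond

def entropy_penalty(n_rotatable_bonds, k=PENALTY_PER_ROTATABLE_BOND):
    """Empirical entropy penalty that grows linearly with ligand flexibility."""
    return k * n_rotatable_bonds

def penalized_score(raw_score, n_rotatable_bonds):
    """Add the (unfavorable, positive) penalty to a raw interaction score."""
    return raw_score + entropy_penalty(n_rotatable_bonds)

rigid = penalized_score(-8.5, 1)      # about -8.2
flexible = penalized_score(-8.5, 6)   # about -6.7
```

Two ligands with the same raw score end up about 1.5 kcal/mol apart, purely because one is more flexible. The linear form ignores which bonds actually freeze on binding, which is the roughness Akira mentions.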

"So what should we do?"

Eiji answered. "Free energy perturbation, FEP, which is built on molecular dynamics simulations. Or quantum mechanical calculations. Both are more accurate, but computationally expensive."

"Not practical?"

"Can't use it for screening. But effective for final candidate selection."

Lina presented another perspective. "Scoring functions don't have to be perfect."

"Huh?"

"It's enough if they can rank relatively. If we can judge which is better between compounds A and B, that's sufficient."

"I see."

Akira continued. "So we look at ranking, not absolute score values."
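
One simple way to check ranking rather than absolute values is to count how many compound pairs the score puts in the same order as the experiment. The sketch below uses four hypothetical compounds. It assumes lower scores are better and higher pIC50 is better, and it does not treat ties specially.

```python
from itertools import combinations

def pairwise_agreement(scores, pic50s):
    """Fraction of compound pairs ordered the same way by score and by assay."""
    pairs = list(combinations(range(len(scores)), 2))
    agree = sum(
        1 for i, j in pairs
        if (scores[i] < scores[j]) == (pic50s[i] > pic50s[j])
    )
    return agree / len(pairs)

# Hypothetical data; the second compound mirrors the dialogue's case:
# score -8.5 but IC50 of 100 micromolar, i.e. pIC50 = 4.0.
scores = [-9.0, -8.5, -7.0, -6.0]
pic50s = [7.0, 4.0, 6.0, 5.0]

agreement = pairwise_agreement(scores, pic50s)   # 4 of 6 pairs agree
```

Here the single misleading compound spoils two of the six pairwise comparisons, while the rest of the ranking still holds.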

"But rankings can be wrong too."

"Yes. That's why we ultimately confirm with experiments."

Eiji said, "Scoring functions are hypothesis generation tools. They don't tell you the truth."

Lina smiled. "When they smile and when they cry."

"When they smile?"

"When the score is good and the measurement is also good. When the scoring function was correct."

"When they cry?"

"When the score is good but the measurement is bad. When betrayed."

Akira laughed. "But we cry more often."

"So we don't blindly trust scores. We just use them as reference."

Sena entered. "What are you talking about?"

"About the limitations of scoring functions," Lina answered.

"Sounds difficult again..."

"But important," Eiji said. "If you don't know the limitations of your tools, you'll draw wrong conclusions."

Lina concluded. "Scoring functions are imperfect. But if mastered, they become powerful tools."

"Mastering them, huh," Sena murmured.

Dreaming of the day the scoring functions smile, the research continues.