Abstract A novel molecular shape similarity comparison
method, namely SHeMS, derived from spherical harmonic
(SH) expansion, is presented in this study. For each
molecule, weight optimization with genetic algorithms
against a customized reference set yields the optimal
combination of weights for the translationally and
rotationally invariant (TRI) SH shape descriptor, which
distinguishes overall from detailed shape features of the
molecular surface specifically and effectively. This
method has two key features. Firstly, the SH expansion
coefficients from different bands are weighted when
calculating similarity, so that overall and detailed
features contribute distinctly to the final score, which
can thus be better tailored to each specific system.
Secondly, the reference set for optimization can be fully
configured by the user, which provides great flexibility
and allows system-specific, customized comparisons. The
Directory of Useful Decoys (DUD) database was adopted to
validate and test our method. Principal component
analysis (PCA) reveals that SH descriptors for shape
comparison preserve sufficient information to separate
actives from decoys. The results of virtual screening
indicate that the proposed method, based on optimal SH
descriptor weight combinations, performs markedly better
than the original SH (OSH) and ultra-fast shape
recognition (USR) methods, and is comparable to
many other popular methods. By combining efficient
SH-based shape similarity comparison with other aspects
such as chemical and pharmacophore features, SHeMS can
play a significant role in this field and can be applied
in practice to virtual screening through similarity
comparison with the 3D shapes of known active compounds
or the binding pockets of target proteins.
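
The band-weighting idea can be illustrated with a minimal Python sketch. It is not the SHeMS implementation. It assumes that the rotation-invariant descriptor is the per-band norm of the SH coefficients, that translation invariance comes from centering the molecule beforehand, and that similarity is 1/(1 + d) for a weighted Euclidean distance d. The names band_descriptor, weighted_similarity, and random_unitary are hypothetical. A random unitary matrix per band stands in for a rotation, since a rotation mixes coefficients only within a band and does so unitarily.

```python
import numpy as np

def band_descriptor(coeffs):
    """Per-band norms of SH coefficients (assumed form of the invariant descriptor).

    coeffs maps band index l to an array holding the 2l+1 complex coefficients c_lm.
    """
    return np.array([np.linalg.norm(coeffs[l]) for l in sorted(coeffs)])

def weighted_similarity(desc_a, desc_b, weights):
    """Similarity in (0, 1] from a band-weighted Euclidean distance (assumed form)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so only the relative band weights matter
    dist = np.sqrt(np.sum(w * (desc_a - desc_b) ** 2))
    return 1.0 / (1.0 + dist)

def random_unitary(n, rng):
    """Random n x n unitary matrix, used here as a stand-in for a rotation in one band."""
    q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
    return q

rng = np.random.default_rng(0)
L_MAX = 4  # hypothetical expansion order, chosen small for illustration

# Synthetic coefficients for one "molecule"; real ones would come from an SH
# expansion of the molecular surface.
coeffs = {
    l: rng.normal(size=2 * l + 1) + 1j * rng.normal(size=2 * l + 1)
    for l in range(L_MAX + 1)
}

# Mix coefficients within each band by a unitary matrix, as a rotation would.
rotated = {l: random_unitary(2 * l + 1, rng) @ c for l, c in coeffs.items()}

d_orig = band_descriptor(coeffs)
d_rot = band_descriptor(rotated)

# Two weightings: uniform, and one that emphasizes low bands (overall shape).
uniform = np.ones(L_MAX + 1)
low_band_heavy = 1.0 / (1.0 + np.arange(L_MAX + 1))

sim_uniform = weighted_similarity(d_orig, d_rot, uniform)
sim_low = weighted_similarity(d_orig, d_rot, low_band_heavy)
```

In a weight-optimization setting, the weights vector is the quantity a genetic algorithm would tune against the reference set. Emphasizing low bands favors overall shape, while emphasizing high bands favors surface detail.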