Abstract Skin sensitization is an important toxic endpoint
in the risk assessment of chemicals. In this paper, a
structure–activity relationship analysis was performed on
the skin sensitization potential of 357 compounds with
local lymph node assay data. Structural fragments were
extracted by GASTON (GrAph/Sequence/Tree extractiON)
from the training set. Eight fragments with accuracy significantly
higher than 0.73 (p < 0.1) were retained to make
up an indicator descriptor fragment. The fragment
descriptor and eight other physicochemical descriptors
closely related to the endpoint were calculated to construct
the recursive partitioning tree (RP tree) for classification.
The balanced accuracies of the training set, test set I, and test
set II in the leave-one-out model were 0.846, 0.800, and
0.809, respectively. The results highlight that the fragment-based
RP tree is a preferable method for identifying skin
sensitizers. Moreover, the selected fragments provide
useful structural information for exploring sensitization
mechanisms, and the RP tree yields a graphical tree that identifies
the most important properties associated with skin sensitization.
Together, these can provide some guidance for the design of
drugs with lower sensitization potential.
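
The balanced accuracy reported above is commonly defined as the mean of sensitivity and specificity, which keeps the score from being inflated when sensitizers and non-sensitizers are unevenly represented. The following is a minimal sketch under that assumption; the function name `balanced_accuracy` and the toy labels are hypothetical and are not taken from the study's data.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels.

    Assumed encoding (illustrative only): 1 = sensitizer, 0 = non-sensitizer.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    sensitivity = tp / (tp + fn)  # fraction of sensitizers correctly flagged
    specificity = tn / (tn + fp)  # fraction of non-sensitizers correctly cleared
    return (sensitivity + specificity) / 2


# Toy labels, invented purely for illustration.
toy_true = [1, 1, 1, 0, 0]
toy_pred = [1, 1, 0, 0, 1]
toy_score = balanced_accuracy(toy_true, toy_pred)
```

With the toy labels, sensitivity is 2/3 and specificity is 1/2, so the score is their average, 7/12.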