Fragment reweighting is one of the most useful query modification techniques in IR systems [20–22]. In our previous work, the retrieval performance of the Bayesian inference network (BIN) was observed to improve significantly when relevance feedback and turbo search screening were used [23]. In this paper, we enhance the screening effectiveness of the BIN using a weighting factor. In this approach, a weighting factor is calculated for each fragment of the multireference input query based on the frequency of its occurrence in the set of input references. This weighting factor is later used to calculate a new weight for each fragment of the reference structure.

2. Material and Methods

This study compared the retrieval results obtained using three different similarity-based screening models.

The first screening system was based on the Tanimoto (TAN) coefficient, which has been used in ligand-based virtual screening for many years and is now considered a reference standard. The second model was based on a basic BIN [24] using the Okapi (OKA) weight, which was found to perform best in the experiments reported there and which we shall refer to as the conventional BIN model. The third model, our proposed model, is a BIN based on reweighted fragments, which we shall refer to as the BINRF model. In what follows, we give a brief description of each of these three models.

2.1. Tanimoto-Based Similarity Model

This model used the continuous form of the Tanimoto coefficient, which is applicable to nonbinary fingerprint data.
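As a rough illustration of the continuous Tanimoto coefficient of Equation (1), the following Python sketch computes the similarity of two nonbinary fragment-weight vectors as the dot product divided by the sum of the two squared norms minus the dot product. The function name `continuous_tanimoto`, the example vectors, and the handling of two all-zero vectors are assumptions made for this sketch and are not taken from the study.

```python
from typing import Sequence


def continuous_tanimoto(w_k: Sequence[float], w_l: Sequence[float]) -> float:
    """Continuous Tanimoto similarity between two fragment-weight vectors."""
    if len(w_k) != len(w_l):
        raise ValueError("both molecules must be described by the same M fragments")
    dot = sum(a * b for a, b in zip(w_k, w_l))   # sum_j w_jK * w_jL
    norm_k = sum(a * a for a in w_k)             # sum_j (w_jK)^2
    norm_l = sum(b * b for b in w_l)             # sum_j (w_jL)^2
    denom = norm_k + norm_l - dot
    # Two all-zero vectors give 0/0; returning 0.0 is an assumption of this sketch.
    return dot / denom if denom else 0.0


# Hypothetical fragment weights for two molecules over M = 4 fragments.
mol_k = [2.0, 0.0, 1.0, 3.0]
mol_l = [1.0, 1.0, 0.0, 3.0]
similarity = continuous_tanimoto(mol_k, mol_l)
```

For these hypothetical vectors the dot product is 11 and the denominator is 14 + 11 - 11 = 14, so `similarity` is 11/14 (about 0.786). A molecule compared with itself gives 1.0.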

$S_{K,L}$ is the similarity between objects or molecules $K$ and $L$, which, using the Tanimoto coefficient, is given by (1):

\[
S_{K,L} = \frac{\sum_{j=1}^{M} w_{jK}\, w_{jL}}{\sum_{j=1}^{M} (w_{jK})^{2} + \sum_{j=1}^{M} (w_{jL})^{2} - \sum_{j=1}^{M} w_{jK}\, w_{jL}} \tag{1}
\]

For molecules described by continuous variables, the molecular space is defined by an $M \times N$ matrix, where entry $w_{ji}$ is the value of the $j$th fragment ($1 \le j \le M$) in the $i$th molecule ($1 \le i \le N$). The origins of this coefficient can be found in a review paper by Ellis et al. [25].

2.2. Conventional BIN Model

The conventional BIN model, as shown in Figure 1, is used in molecular similarity searching. It consists of three types of nodes: compound nodes as roots, fragment nodes, and a reference structure node as leaf. The roots of the network are the nodes without parent nodes, and the leaves are the nodes without child nodes. Each compound node represents an actual compound in the collection and has one or more fragment nodes as children. Each fragment node has one or more compound nodes as parents and one reference structure node as a child (or more where multiple references are used). Each network node is a binary variable, taking one of the two values from the set {true, false}.
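The three-layer topology described for the conventional BIN model can be sketched as a small directed graph in Python. This is only a structural illustration under stated assumptions: the names `Node`, `link`, and `build_network`, the single reference node, and the toy compounds `c1`/`c2` with fragments `f1` to `f3` are all hypothetical, and no belief computation or weighting is modelled here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass(eq=False)  # compare nodes by identity; the graph is cyclic via parent links
class Node:
    name: str
    kind: str            # "compound", "fragment", or "reference"
    value: bool = False  # binary node state: True or False
    parents: List["Node"] = field(default_factory=list, repr=False)
    children: List["Node"] = field(default_factory=list, repr=False)


def link(parent: Node, child: Node) -> None:
    """Add a directed edge parent -> child."""
    parent.children.append(child)
    child.parents.append(parent)


def build_network(compounds: Dict[str, Set[str]]):
    """Build compound -> fragment -> reference layers (single reference assumed)."""
    reference = Node("reference", "reference")
    compound_nodes: Dict[str, Node] = {}
    fragment_nodes: Dict[str, Node] = {}
    for compound_name, fragments in compounds.items():
        if not fragments:
            raise ValueError(f"compound {compound_name!r} has no fragments")
        compound = Node(compound_name, "compound")
        compound_nodes[compound_name] = compound
        for fragment_name in sorted(fragments):
            if fragment_name not in fragment_nodes:
                fragment = Node(fragment_name, "fragment")
                fragment_nodes[fragment_name] = fragment
                link(fragment, reference)  # every fragment node feeds the reference node
            link(compound, fragment_nodes[fragment_name])
    return compound_nodes, fragment_nodes, reference


# Hypothetical toy collection: two compounds sharing fragment "f2".
compounds = {"c1": {"f1", "f2"}, "c2": {"f2", "f3"}}
compound_nodes, fragment_nodes, reference = build_network(compounds)
all_nodes = [*compound_nodes.values(), *fragment_nodes.values(), reference]
roots = [n.name for n in all_nodes if not n.parents]     # nodes without parents
leaves = [n.name for n in all_nodes if not n.children]   # nodes without children
```

In this toy network the roots are the two compound nodes and the only leaf is the reference node, matching the description above. Supporting multiple references would mean linking each fragment node to more than one reference node.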
