OpenBioLink
OpenBioLink is a resource and evaluation framework for link prediction models on heterogeneous biomedical graph data. It contains benchmark datasets as well as tools for creating custom benchmarks and for evaluating models.
A peer-reviewed version is published in the journal Bioinformatics (please use it for citations).
The OpenBioLink benchmark aims to meet the following criteria:
Openly available
Large-scale
Wide coverage of current biomedical knowledge and entity types
Standardized, balanced train-test split
Open-source code for benchmark dataset generation
Open-source code for evaluation (independent of model)
Integrating and differentiating multiple types of biological entities and relations (i.e., formalized as a heterogeneous graph)
Minimized information leakage between train and test sets (e.g., avoid inclusion of trivially inferable relations in the test set)
Coverage of true negative relations, where available
Differentiating high-quality data from noisy, low-quality data
Differentiating benchmarks for directed and undirected graphs in order to be applicable to a wide variety of link prediction methods
Clearly defined release cycle with versions of the benchmark and public leaderboard
Installation
Requirements
Python 3.6 or later
A suitable installation of PyTorch
Install OpenBioLink
The latest stable version of OpenBioLink can be downloaded and installed from PyPI with:
$ pip install openbiolink
The latest version of OpenBioLink can be installed directly from the source on GitHub with:
$ git clone https://github.com/OpenBioLink/OpenBioLink.git
$ cd OpenBioLink
$ sudo python setup.py install
Graph generation
Commands
To generate the default graph (with all edges of all qualities) in the current directory, use:
openbiolink generate
For a list of arguments, use:
openbiolink generate --help
File description
TSV
| Default File Name | Description | Columns |
|---|---|---|
| ALL_nodes.csv | All nodes present in the graph | Node ID, Node type |
| edges.csv | All true positive edges | Node 1 ID, Edge type, Node 2 ID, Quality score, Source |
| edges_list.csv | List of edge types present in edges.csv | Edge type |
| nodes.csv | All nodes present in edges.csv | Node ID, Node type |
| nodes_list.csv | List of node types present in nodes.csv | Node type |
| TN_edges.csv | All true negative edges | Node 1 ID, Edge type, Node 2 ID, Quality score, Source |
| TN_edges_list.csv | List of edge types present in TN_edges.csv | Edge type |
| TN_nodes.csv | All nodes present in TN_edges.csv | Node ID, Node type |
| TN_nodes_list.csv | List of node types present in TN_nodes.csv | Node type |
| ids_no_mapping.tsv | IDs of nodes that could not be mapped to other ontology systems | Node ID, Node type |
| tn_ids_no_mapping.tsv | IDs of nodes that could not be mapped to other ontology systems | Node ID, Node type |
| stats.txt | Statistics about edges.csv and nodes.csv | (See column headers of file) |
| tn_stats.txt | Statistics about TN_edges.csv and TN_nodes.csv | (See column headers of file) |
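The generated files are plain tabular text, so they can be inspected with standard tooling. A minimal sketch using pandas, assuming the default `graph_files/` output directory, tab-separated values (as the TSV heading above suggests), and the column order from the table; whether a header row is present may vary between releases, so adjust `header`/`names` as needed:

```python
import pandas as pd

# Load the true positive edges and the nodes of the generated graph.
edges = pd.read_csv("graph_files/edges.csv", sep="\t", header=None,
                    names=["node1_id", "edge_type", "node2_id", "quality_score", "source"])
nodes = pd.read_csv("graph_files/nodes.csv", sep="\t", header=None,
                    names=["node_id", "node_type"])

# Quick overview of edge and node type distributions.
print(edges["edge_type"].value_counts())
print(nodes["node_type"].value_counts())
```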
Biological Expression Language (BEL)
The Biological Expression Language (BEL) is a domain specific language that enables the expression of biological relationships in a machine-readable format. It is supported by the PyBEL software ecosystem.
BEL can be exported with:
openbiolink generate --output-format BEL
| Default File Name | Description |
|---|---|
| positive.bel.gz | All true positive edges in BEL Script format (gzipped), for usage in PyBEL or other BEL-aware applications |
| positive.bel.nodelink.json.gz | All true positive edges in Nodelink JSON format (gzipped), for direct usage with PyBEL |
| negative.bel.gz | All true negative edges in BEL Script format (gzipped) |
| negative.bel.nodelink.json.gz | All true negative edges in Nodelink JSON format (gzipped) |
Example opening a BEL Script using pybel.from_bel_script():
import gzip
from pybel import from_bel_script

with gzip.open('positive.bel.gz') as file:
    graph = from_bel_script(file)
Example opening Nodelink JSON using pybel.from_nodelink_gz():
from pybel import from_nodelink_gz
graph = from_nodelink_gz('positive.bel.nodelink.json.gz')
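The object returned by either loader is a PyBEL BELGraph, which extends networkx.MultiDiGraph, so standard graph operations can be applied directly. A small sketch; the 'relation' edge-data key follows PyBEL's data model, but treat the exact keys as an assumption for your installed PyBEL version:

```python
from pybel import from_nodelink_gz

# Load the positive graph; BELGraph extends networkx.MultiDiGraph.
graph = from_nodelink_gz('positive.bel.nodelink.json.gz')

print(graph.number_of_nodes(), graph.number_of_edges())

# Inspect a few edges; the relation type is stored in the edge data.
for u, v, data in list(graph.edges(data=True))[:5]:
    print(u, data.get('relation'), v)
```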
There’s an externally hosted copy of OpenBioLink here that contains the exports as BEL.
CURIEs
All node IDs in the graph are encoded as CURIEs, meaning entities can easily be looked up online by concatenating https://identifiers.org/ with the ID, e.g.:

| CURIE | Identifiers.org |
|---|---|
| REACTOME:R-HSA-201451 | https://identifiers.org/REACTOME:R-HSA-201451 |

Detailed information on how the identifiers are resolved can be found at https://registry.identifiers.org/.
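Since the resolution is plain string concatenation, links can also be built programmatically; a minimal sketch:

```python
def curie_to_url(curie: str) -> str:
    """Build an identifiers.org link for a CURIE such as 'REACTOME:R-HSA-201451'."""
    return f"https://identifiers.org/{curie}"

print(curie_to_url("REACTOME:R-HSA-201451"))
# https://identifiers.org/REACTOME:R-HSA-201451
```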
Dataset split
Commands
To split the default graph using the random scheme, use:
openbiolink split rand --edges graph_files/edges.csv --tn-edges graph_files/TN_edges.csv --nodes graph_files/nodes.csv
For a list of arguments, use:
openbiolink split rand --help
Splitting can also be done by time slices with:
openbiolink split time
File description
| Default file name | Description | Column descriptions |
|---|---|---|
| train_sample.csv | All positive samples from the training set | Node 1 ID, Edge type, Node 2 ID, Quality score, TP/TN, Source |
| test_sample.csv | All positive samples from the test set | Node 1 ID, Edge type, Node 2 ID, Quality score, TP/TN, Source |
| val_sample.csv | All positive samples from the validation set | Node 1 ID, Edge type, Node 2 ID, Quality score, TP/TN, Source |
| negative_train_sample.csv | All negative samples from the training set | Node 1 ID, Edge type, Node 2 ID, Quality score, TP/TN, Source |
| negative_test_sample.csv | All negative samples from the test set | Node 1 ID, Edge type, Node 2 ID, Quality score, TP/TN, Source |
| negative_val_sample.csv | All negative samples from the validation set | Node 1 ID, Edge type, Node 2 ID, Quality score, TP/TN, Source |
| train_val_nodes.csv | All nodes present in the training and validation set combined | Node ID, Node type |
| test_nodes.csv | All nodes present in the test set | Node ID, Node type |
| removed_test_nodes.csv | All nodes removed from the test set because they are not present in the training set | Node ID |
| removed_val_nodes.csv | All nodes removed from the validation set because they are not present in the training set | Node ID |
Random split
In the random split setting, first, negative sampling is performed. Afterwards, the whole dataset (containing positive and negative examples) is split randomly according to the defined ratio. Finally, post-processing steps are performed to facilitate training and to avoid information leakage.
Time-slice split
In the time-slice split setting, negative sampling is first performed for both of the provided time slices. Afterwards, the first time slice (the t-1 graph) is used as the training set, while the difference between the first and the second time slice serves as the test set. Finally, post-processing steps are performed to facilitate training and to avoid information leakage.
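Conceptually, the split boils down to a set difference between the two graph versions. A simplified sketch, treating edges as (head, relation, tail) tuples with made-up IDs and omitting negative sampling and post-processing:

```python
# The t-1 graph becomes the training set; edges new in the t graph become the test set.
edges_t_minus_1 = {("g1", "GENE_GENE", "g2"), ("g1", "GENE_DIS", "d1")}
edges_t = edges_t_minus_1 | {("g2", "GENE_DIS", "d1")}

train_edges = edges_t_minus_1
test_edges = edges_t - edges_t_minus_1  # edges added between the two slices

print(test_edges)  # {('g2', 'GENE_DIS', 'd1')}
```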
Generally, the time-slice setting is trickier to implement than the random split strategy, as it requires more manual evaluation and knowledge of the data. One of the most difficult factors is the change of the source databases over time. For example, a database might change its quality score or even its ID format. Also, the number of relationships stored might increase sharply because new mapping files are used. This can also result in ‘vanishing edges’, where edges that were present in the t-1 graph are no longer present in the current graph. Although the OpenBioLink toolbox tries to assist the user with different kinds of warnings to identify such difficulties in the data, it is unfortunately not possible to automatically detect or solve all of these problems, making some manual pre- and post-processing of the data inevitable.
Post-processing
To facilitate model application
Edges that contain nodes that are not present in the training set are dropped from the test set. This facilitates the use of embedding-based models, which usually cannot make predictions for nodes that were not embedded during training.
Avoiding train-test information leakage and trivial predictions in the test set
**Removal of reverse edges** If the graph is directed, reverse edges are removed from the training set. The reason is that if an original edge a-b was undirected, both directions a→b and a←b are materialized in the directed graph. If one of these directed edges were present in the training set and the other in the test set, the prediction would be trivial. Therefore, in these cases, the reverse edges are removed from the training set. (Note that edges are removed from the training set instead of the test set because this is advantageous for maintaining the train-test set ratio.)
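To make the reverse-edge rule concrete, the following toy sketch (not the toolbox's actual implementation) drops a training edge whenever its reverse occurs in the test set:

```python
# If (t, r, h) is in the test set, the training edge (h, r, t) would make its
# prediction trivial, so it is dropped from the training set.
train = {("a", "r", "b"), ("c", "r", "d")}
test = {("b", "r", "a")}

reverse_of_test = {(t, r, h) for (h, r, t) in test}
train = train - reverse_of_test

print(train)  # {('c', 'r', 'd')}
```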
**Removal of super-properties** Some types of edges have sub-property characteristics, meaning that relationship x indicates a generic interaction between two entities (e.g., _protein_interaction_protein_), while relationship y describes this relationship in more detail (e.g., _protein_activation_protein_). This means that the presence of x between two nodes does not imply the existence of a relation y between those same entities, but the presence of y necessarily implies the existence of x. These kinds of relationships could cause information leakage in the datasets; therefore, super-relations of relations present in the training set are removed from the test set.
Evaluation
To evaluate a trained model on an OpenBioLink dataset, a subclass of Evaluator has to be created that implements the score_batch() function and initializes the Evaluator.
Initializing
The Evaluator class uses an instance of DataLoader, which is used to download the specified version of the OpenBioLink dataset. Furthermore, it maps the string labels of entities and relations to integer identifiers, either by passing the paths to dictionary files or, if no such files are provided, by creating a mapping from scratch.
Implementing score_batch()
The function score_batch() is used by the Evaluator to retrieve the scores of a set of test triples. It is called with a tensor containing the batch of test triples and should return a tuple (pos_score_head, neg_score_head, pos_score_tail, neg_score_tail).

pos_score_head and pos_score_tail should contain the positive scores for the head and tail of all triples in the batch, respectively. Both tensors should therefore be of size (batch_size,), where pos_score_head[i] is the score for predicting the head of triple batch[i,:] and pos_score_tail[i] is the score for predicting the tail of triple batch[i,:] (note: in many embedding models pos_score_head == pos_score_tail).

neg_score_head and neg_score_tail should contain the scores of the corrupted triples for each triple in the batch. OpenBioLink is evaluated by corrupting the head/tail of a triple with all possible entities in the dataset. Consider \(\mathcal{K} = (\mathcal{E},\mathcal{R},\mathcal{T})\), where \(\mathcal{E}\) is the set of unique entities, \(\mathcal{R}\) is the set of unique relations, and \(\mathcal{T}\) is the set of triples in the dataset. Then the corrupted tail triples for a positive triple \((h, r, t)\) are \((h, r, e)\) for \(e \in \mathcal{E}\), and the corrupted head triples are \((e, r, t)\) for \(e \in \mathcal{E}\). A 1-d tensor containing all possible entities can be retrieved from self.entities on the Evaluator. The length of this tensor is also stored in self.num_neg. The tensors neg_score_head and neg_score_tail should therefore have shape (batch_size, num_neg), where neg_score_head[i,j] is the score of the triple batch[i] whose head was corrupted with entity j.
The following two examples show how to evaluate a model trained with DGL-KE and a rule-based approach called SAFRAN.
Calling evaluate()
After creating a custom class that implements Evaluator, your approach can be evaluated by calling evaluate() with the desired batch size and the number of processors to use for evaluation.
The attribute filtering should only be set to False if you are evaluating a filtered top-k file (see the SAFRAN example below).
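Before looking at the full examples, a minimal skeleton may help to illustrate the expected interface and tensor shapes. This sketch scores everything randomly and assumes the evaluate(batch_size, num_processes) call pattern used in the examples below:

```python
import torch
from openbiolink.evaluation.dataLoader import DataLoader
from openbiolink.evaluation.evaluation import Evaluator


class RandomScoreEvaluator(Evaluator):
    """Illustrative only: scores every triple and every corruption with random numbers."""

    def __init__(self, dataset_name):
        super().__init__(DataLoader(dataset_name))

    def score_batch(self, batch):
        batch_size = batch.shape[0]
        pos_score_head = torch.rand(batch_size)                # (batch_size,)
        pos_score_tail = torch.rand(batch_size)                # (batch_size,)
        neg_score_head = torch.rand(batch_size, self.num_neg)  # (batch_size, num_neg)
        neg_score_tail = torch.rand(batch_size, self.num_neg)  # (batch_size, num_neg)
        return pos_score_head, neg_score_head, pos_score_tail, neg_score_tail


if __name__ == "__main__":
    evaluator = RandomScoreEvaluator("HQ_DIR")
    print(evaluator.evaluate(100, 1))
```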
Example DGL-KE
"""
Import dependencies, such as the DGL-KE ScoreInfer class and the OpenBioLink DataLoader and Evaluator."""
import torch
import os
import numpy as np
from openbiolink.evaluation.dataLoader import DataLoader
from openbiolink.evaluation.evaluation import Evaluator
from dglke.models.infer import ScoreInfer
from dglke.utils import load_model_config
"""As we do not create a DGLGraph Object, DGL-KE needs an auxilary class that stores the embeddings of the positive edges"""
class FakeEdge(object):
def __init__(self, head_emb, rel_emb, tail_emb):
self._hobj = {}
self._robj = {}
self._tobj = {}
self._hobj['emb'] = head_emb
self._robj['emb'] = rel_emb
self._tobj['emb'] = tail_emb
@property
def src(self):
return self._hobj
@property
def dst(self):
return self._tobj
@property
def data(self):
return self._robj
class DglkeEvaluator(Evaluator):
def __init__(self, dataset_name, model_path, entity_to_id_path, relation_to_id_path):
dl = DataLoader(dataset_name, entity_to_id_path=entity_to_id_path, relation_to_id_path=relation_to_id_path)
super().__init__(dl)
config = load_model_config(os.path.join(model_path, 'config.json'))
model = ScoreInfer(-1, config, model_path)
model.load_model()
self.model = model.model
self.entity_emb = self.model.entity_emb(self.entities.long())
self.entity_emb.share_memory_()
self.relation_emb = self.model.relation_emb(self.relations.long())
self.relation_emb.share_memory_()
def score_batch(self, batch):
head_neg_score = self.model.score_func.create_neg(True)
tail_neg_score = self.model.score_func.create_neg(False)
head_neg_prepare = self.model.score_func.create_neg_prepare(True)
tail_neg_prepare = self.model.score_func.create_neg_prepare(False)
pos_head_emb = self.entity_emb[batch[:, 0], :]
pos_tail_emb = self.entity_emb[batch[:, 2], :]
pos_rel = batch[:, 1].long()
pos_rel_emb = self.model.relation_emb(pos_rel)
edata = FakeEdge(pos_head_emb, pos_rel_emb, pos_tail_emb)
pos_score = self.model.score_func.edge_func(edata)['score']
neg_head, tail = head_neg_prepare(pos_rel, 1, self.entity_emb, pos_tail_emb, -1, False)
neg_scores_head = head_neg_score(neg_head, pos_rel_emb, tail,
1, len(batch), self.num_neg)
head, neg_tail = tail_neg_prepare(pos_rel, 1, pos_head_emb, self.entity_emb, -1, False)
neg_scores_tail = tail_neg_score(head, pos_rel_emb, neg_tail,
1, len(batch), self.num_neg)
return pos_score, neg_scores_head.squeeze(0), pos_score, neg_scores_tail.squeeze(0)
if __name__ == "__main__":
torch.manual_seed(145)
np.random.seed(145)
model_path = r"G:\ckpts\TransE_l2_FB15k_0"
entity_to_id_path = r"G:\triples\entities.tsv"
relation_to_id_path = r"G:\triples\relations.tsv"
evaluator = DglkeEvaluator("HQ_DIR", model_path, entity_to_id_path, relation_to_id_path)
result = evaluator.evaluate(100, -1)
print(result)
Example SAFRAN
SAFRAN is a rule-based approach that creates a filtered top-k text file of the form:
DOID:14320 DIS_DRUG PUBCHEM.COMPOUND:122282
Heads: DOID:14320 0.9824 DOID:4964 0.9713 DOID:594 0.7095 DOID:10763 0.6424 DOID:8986 0.6423 DOID:1596 0.5923 DOID:10825 0.4874 DOID:1825 0.3771 DOID:750 0.3608 DOID:1470 0.3416
Tails: PUBCHEM.COMPOUND:10240 0.6357 PUBCHEM.COMPOUND:122282 0.5567 PUBCHEM.COMPOUND:4585 0.4798 PUBCHEM.COMPOUND:2160 0.4310 PUBCHEM.COMPOUND:3696 0.3965 PUBCHEM.COMPOUND:2726 0.2493 PUBCHEM.COMPOUND:3559 0.2251 PUBCHEM.COMPOUND:2995 0.2008 PUBCHEM.COMPOUND:2520 0.0172 PUBCHEM.COMPOUND:3386 0.0142
DOID:14320 DIS_DRUG PUBCHEM.COMPOUND:2771
Heads: DOID:10933 0.9822 DOID:594 0.9551 DOID:14320 0.8485 DOID:11257 0.7170 DOID:2055 0.4382 DOID:2030 0.4334 DOID:0060891 0.3585 DOID:0060895 0.2242 DOID:9970 0.1990 DOID:0060896 0.0977
Tails: PUBCHEM.COMPOUND:10240 0.8967 PUBCHEM.COMPOUND:4585 0.7613 PUBCHEM.COMPOUND:2160 0.7521 PUBCHEM.COMPOUND:3696 0.7000 PUBCHEM.COMPOUND:2726 0.6095 PUBCHEM.COMPOUND:3559 0.5914 PUBCHEM.COMPOUND:2995 0.4491 PUBCHEM.COMPOUND:2520 0.4050 PUBCHEM.COMPOUND:3386 0.3957 PUBCHEM.COMPOUND:5002 0.1693
DOID:14320 DIS_DRUG PUBCHEM.COMPOUND:2712
Heads: DOID:240 0.8082 DOID:13603 0.7477 DOID:12030 0.7133 DOID:4353 0.7011 DOID:2089 0.6067 DOID:13141 0.5286 DOID:9741 0.3540 DOID:10808 0.3119 DOID:14320 0.2678 DOID:4964 0.0284
Tails: PUBCHEM.COMPOUND:2712 0.9847 PUBCHEM.COMPOUND:10240 0.7101 PUBCHEM.COMPOUND:4585 0.6751 PUBCHEM.COMPOUND:2160 0.6031 PUBCHEM.COMPOUND:3696 0.5430
To evaluate such a filtered top-k file, a custom class is needed that reads the file on initialization and implements the score_batch() function of the Evaluator. As the file already contains filtered top-k predictions, scoring all negative entities and the filtering step can be omitted.
import torch
from openbiolink.evaluation.dataLoader import DataLoader
from openbiolink.evaluation.evaluation import Evaluator


class SafranEvaluator(Evaluator):
    def __init__(self, dataset_name, evaluation_file_path):
        dl = DataLoader(dataset_name)
        super().__init__(dl)
        with open(evaluation_file_path) as infile:
            content = infile.readlines()
        content = [x.strip() for x in content]
        self.predictions = dict()
        # The file consists of blocks of three lines: the test triple, its head
        # predictions, and its tail predictions.
        for i in range(0, len(content), 3):
            head, rel, tail = content[i].split(" ")
            head = self.dl.entity_to_id[head]
            rel = self.dl.relation_to_id[rel]
            tail = self.dl.entity_to_id[tail]
            pos_head = 0.0
            neg_head = []
            head_predictions = content[i + 1]
            if head_predictions == "Heads:":
                continue
            head_predictions = head_predictions[len("Heads: "):].split("\t")
            for j in range(0, len(head_predictions), 2):
                head_prediction = self.dl.entity_to_id[head_predictions[j]]
                confidence = float(head_predictions[j + 1])
                if head == head_prediction:
                    # Correct prediction
                    pos_head = confidence
                else:
                    # False prediction
                    neg_head.append((head_prediction, confidence))
            pos_tail = 0.0
            neg_tail = []
            tail_predictions = content[i + 2]
            if tail_predictions == "Tails:":
                continue
            tail_predictions = tail_predictions[len("Tails: "):].split("\t")
            for j in range(0, len(tail_predictions), 2):
                tail_prediction = self.dl.entity_to_id[tail_predictions[j]]
                confidence = float(tail_predictions[j + 1])
                if tail == tail_prediction:
                    # Correct prediction
                    pos_tail = confidence
                else:
                    # False prediction
                    neg_tail.append((tail_prediction, confidence))
            self.predictions[f"{head};{rel};{tail}"] = (pos_head, neg_head, pos_tail, neg_tail)

    def score_batch(self, batch):
        pos_score_head = torch.zeros((len(batch),), dtype=torch.float)
        neg_score_head = torch.zeros((len(batch), self.num_neg), dtype=torch.float)
        pos_score_tail = torch.zeros((len(batch),), dtype=torch.float)
        neg_score_tail = torch.zeros((len(batch), self.num_neg), dtype=torch.float)
        for i in range(batch.shape[0]):
            head, rel, tail = batch[i, :]
            key = f"{head.item()};{rel.item()};{tail.item()}"
            if key in self.predictions:
                pos_head, neg_heads, pos_tail, neg_tails = self.predictions[key]
                pos_score_head[i] = pos_head
                for neg_head, confidence in neg_heads:
                    neg_score_head[i, neg_head] = confidence
                pos_score_tail[i] = pos_tail
                for neg_tail, confidence in neg_tails:
                    neg_score_tail[i, neg_tail] = confidence
        return pos_score_head, neg_score_head, pos_score_tail, neg_score_tail


if __name__ == "__main__":
    evaluation_file_path = r"G:\prediction.txt"
    evaluator = SafranEvaluator("HQ_DIR", evaluation_file_path)
    result = evaluator.evaluate(100, 1, filtering=False)
    print(result)
Sources/Licenses
| Source type | Source name | License | True neg. | Score |
|---|---|---|---|---|
| edge (gene-gene) | | CC BY | No | Yes |
| edge (gene-go) | | CC BY | No | Yes |
| edge (gene-disease) | | CC BY-NC-CA | No | Yes |
| edge (gene-phenotype) | | Custom: HPO | No | No |
| edge (gene-anatomy) | | CC 0 | Yes | Yes |
| edge (gene-drug) | | CC BY | No | Yes |
| edge (gene-pathway) | | Custom: CTD | No | No |
| edge (disease-phenotype) | | Custom: HPO | Yes | No |
| edge (disease-drug) | | CC BY-SA | Yes | No |
| edge (drug-phenotype) | | CC BY-NC-CA | No | No |
| ontology (genes) | | CC BY | | |
| ontology (diseases) | | CC 0 | | |
| ontology (phenotype) | | Custom: HPO | | |
| ontology (anatomy) | | CC BY | | |
| mapping (UMLS-DO) | | CC BY-NC-CA | | |
| mapping (STRING-NCBI) | | CC BY | | |
| mapping (ENSEMBL/UNIPROT-NCBI) | | CC BY | | |
| id (genes) | | Public Domain | | |
| id (go) | | CC BY | | |
| id (anatomy) | | CC BY | | |
| id (disease) | | CC 0 | | |
| id (drug) | | Public Domain | | |
| id (phenotype) | | Custom: HPO | | |
| id (pathway) | | CC BY | | |
| id (pathway) | | Custom: KEGG | | |
(True neg.: whether the data contains true negative relations; Score: whether the data contains evidence quality scores for filtering relations)
The OpenBioLink benchmark files integrate data or identifiers from these sources. The provenance of data items is captured in the benchmark files, and licensing terms of source databases apply to these data items. Please mind these licensing terms when utilizing or redistributing the benchmark files or derivatives thereof.
All original data in the benchmark files created by the OpenBioLink project (not covered by the licenses of external data sources) are released as CC 0.
We offer the benchmark files as-is and make no representations or warranties of any kind concerning the benchmark files, express, implied, statutory or otherwise, including without limitation warranties of title, merchantability, fitness for a particular purpose, non-infringement, or the absence of latent or other defects, accuracy, or the presence or absence of errors, whether or not discoverable, all to the greatest extent permissible under applicable law.
OpenBioLink2020
The OpenBioLink2020 dataset is a highly challenging benchmark dataset containing over 5 million positive and negative edges. The test set does not contain trivially predictable inverse edges from the training set and contains all edge types, to provide a more realistic link prediction scenario.
OpenBioLink2020: directed, high quality is the default dataset that should be used for benchmarking purposes. To allow analyzing the effect of data quality as well as of the directionality of the evaluation graph, four variants of OpenBioLink2020 are provided: in directed and undirected settings, with and without quality cutoff.
Additionally, each graph is available in RDF N3 format (without train-validation-test splits).
Download
All datasets are hosted on Zenodo.
OpenBioLink2020: directed, high quality // RDF (default dataset for benchmarking)
Leaderboard
| | Model | MRR | h@1 | h@10 |
|---|---|---|---|---|
| Latent | RESCAL | .320 | .212 | .544 |
| | TransE | .280 | .175 | .500 |
| | DistMult | .300 | .193 | .521 |
| | ComplEx | .319 | .211 | .547 |
| | ConvE | .288 | .186 | .510 |
| | RotatE | .286 | .180 | .511 |
| Interpretable | AnyBURL (Maximum) | .277 | .192 | .457 |
| | AnyBURL (Noisy-OR) | .159 | .098 | .295 |
| | SAFRAN* | .306 | .214 | .501 |
If you want to see your results added to the leaderboard, please create a new issue.
Summary
| Dataset | Train | Test | Valid | Entities | Relations |
|---|---|---|---|---|---|
| directed, high quality | 8,503,580 | 401,901 | 397,066 | 184,732 | 28 |
| undirected, high quality | 7,559,921 | 372,877 | 357,297 | 184,722 | 28 |
| directed, no quality cutoff | 51,636,927 | 2,079,139 | 2,474,921 | 486,998 | 32 |
| undirected, no quality cutoff | 41,383,093 | 2,010,662 | 1,932,436 | 486,998 | 32 |
DataLoader
- class openbiolink.evaluation.dataLoader.DataLoader(root='dataset', name='HQ_DIR', entity_to_id_path=None, relation_to_id_path=None)[source]
- Parameters
root (
str
) – Pathlike string to directory in which dataset files should be stored
- filter_scores(batch, scores, filter_col, filter_val=nan)[source]
Filters true positive .
- Parameters
batch – Batch of triples. Shape (batch_size,3)
scores – Batch of triples. Shape (batch_size,num_entities)
filter_col – Batch of triples. Shape (batch_size,num_entities)
filter_val – Batch of triples. Shape (batch_size,num_entities), default NaN
- Returns
filtered_scores: torch.tensor where the value at [i,j] is the score of the triple (j, batch[i][1], batch[i][2]). Shape (batch_size, num_entities)
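For orientation, a short usage sketch; the dataset name and the entity_to_id/relation_to_id attributes follow the examples above, but treat them as assumptions for your installed version:

```python
from openbiolink.evaluation.dataLoader import DataLoader

# Downloads the "HQ_DIR" dataset into ./dataset (the default root) and builds
# the entity/relation-to-ID mappings from scratch, since no dictionary files are passed.
dl = DataLoader(name="HQ_DIR")
print(len(dl.entity_to_id), len(dl.relation_to_id))
```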
Evaluator
- class openbiolink.evaluation.evaluation.Evaluator(dl, higher_is_better=True)[source]
- Parameters
dl (
DataLoader
) – Dataloader containing the OpenBioLink datasethigher_is_better (
bool
) – Boolean which should be set to True if higher scores are considered better, False otherwise.
- evaluate(batch_size=100)[source]
Evaluates a model by retrieving scores from the (implemented) score_batch function.
- abstract score_batch(batch)[source]
Abstract function, has to be implemented. Should return two arrays containing the head and tail scores of a batch of test data from a model.
- Parameters
batch (
Tensor
) – Batch of test data. Shape (batch_size,3)- Return type
- Returns
head_scores: torch.tensor where the value at [i,j] is the score of the triple (j, batch[i][1], batch[i][2]). Shape (batch_size, num_entities)
tail_scores: torch.tensor where the value at [i,j] is the score of the triple (batch[i][0], batch[i][1], j). Shape (batch_size, num_entities)