Queries documents entry
Sometimes your models predict relevance for query/document combinations that do not appear in your observed sample. The opposite also happens: some methods cannot produce predictions for combinations that are present in the observed data. It is therefore worth checking how ranx behaves in these cases.
from ranx import Qrels, Run, evaluate
Lack of documents
If documents are missing on either side, ranx still computes the metric without raising an error. The following cells cover both cases so that the behaviour is explicit.
Qrels
The following cell considers the case where the Run contains documents that were not defined in the Qrels.
qrels = Qrels({
"q_1": {"d_1": 1, "d_2": 2},
"q_2": {"d_2": 1}
})
run = Run({
"q_1": {"d_1": 1, "d_2": 2, "d_3": 3},
"q_2": {"d_1": 3, "d_2": 1, "d_3": 2}
})
evaluate(
qrels=qrels, run=run, metrics="ndcg@4",
)
0.5848359082471151
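To see where this number can come from, here is a minimal pure-Python sketch of NDCG@4 on the same data. It is not ranx's implementation: it assumes linear gain, a log2 rank discount, and that documents absent from the Qrels count as non-relevant (gain 0). The helper name ndcg_at_k and the plain dictionaries are hypothetical and used only for illustration. Under these assumptions the sketch reproduces the value above.

```python
import math


def ndcg_at_k(judged, scored, k):
    """NDCG@k for one query (assumed: linear gain, log2 discount)."""
    # Rank the retrieved documents by descending run score and cut at k.
    ranking = sorted(scored, key=scored.get, reverse=True)[:k]
    # Assumption: a document missing from the judgments has gain 0.
    dcg = sum(
        judged.get(doc, 0) / math.log2(rank + 2)
        for rank, doc in enumerate(ranking)
    )
    # The ideal ranking orders the judged documents by relevance.
    ideal = sorted(judged.values(), reverse=True)[:k]
    idcg = sum(rel / math.log2(rank + 2) for rank, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


# Same data as in the cell above, kept as plain dictionaries.
qrels_dict = {
    "q_1": {"d_1": 1, "d_2": 2},
    "q_2": {"d_2": 1},
}
run_dict = {
    "q_1": {"d_1": 1, "d_2": 2, "d_3": 3},
    "q_2": {"d_1": 3, "d_2": 1, "d_3": 2},
}

scores = [ndcg_at_k(qrels_dict[q], run_dict[q], k=4) for q in qrels_dict]
mean_ndcg = sum(scores) / len(scores)
print(mean_ndcg)  # approximately 0.5848
```

In q_2 the only judged document, d_2, is ranked third behind two unjudged documents, which is why that query scores 0.5 in this sketch.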
Runs
And here is the case where some documents appear only in the Qrels.
qrels = Qrels({
"q_1": {"d_1": 1, "d_2": 2, "d_3": 3},
"q_2": {"d_1": 3, "d_2": 1, "d_3": 2}
})
run = Run({
"q_1": {"d_1": 1, "d_2": 2},
"q_2": {"d_1": 3}
})
evaluate(
qrels=qrels, run=run, metrics=["ndcg@4"]
)
0.5912532431001917
Lack of queries
However, if there are queries that occur only in the Run or only in the Qrels, evaluate raises an error. The following cell shows both cases.
qrels = Qrels({
"q_1": {"d_1": 1, "d_2": 2},
"q_2": {"d_1": 2, "d_2": 1},
"q_3": {"d_1": 1, "d_2": 1}
})
run = Run({
"q_1": {"d_1": 1, "d_2": 2},
"q_2": {"d_1": 3, "d_2": 1}
})
# q_3 is judged in the qrels but has no results in the run
try:
    evaluate(
        qrels=qrels, run=run, metrics="ndcg@4",
    )
except Exception as e:
    print("got exception -", e)
qrels = Qrels({
"q_1": {"d_1": 1, "d_2": 2},
"q_2": {"d_1": 2, "d_2": 1}
})
run = Run({
"q_1": {"d_1": 1, "d_2": 2},
"q_2": {"d_1": 3, "d_2": 1},
"q_3": {"d_1": 1, "d_2": 3}
})
# q_3 has results in the run but no judgments in the qrels
try:
    evaluate(
        qrels=qrels, run=run, metrics="ndcg@4",
    )
except Exception as e:
    print("got exception -", e)
got exception - Qrels and Run query ids do not match. Pass `make_comparable=True` to add empty results for queries missing from the run and remove those not appearing in qrels.
got exception - Qrels and Run query ids do not match. Pass `make_comparable=True` to add empty results for queries missing from the run and remove those not appearing in qrels.
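A mismatch like this can also be detected before calling evaluate. The sketch below works on plain dictionaries with the same shape as the data passed to Qrels and Run. The helper query_id_mismatch is hypothetical and not part of ranx.

```python
def query_id_mismatch(qrels_dict, run_dict):
    """Return (queries missing from the run, queries missing from the qrels)."""
    missing_from_run = sorted(set(qrels_dict) - set(run_dict))
    missing_from_qrels = sorted(set(run_dict) - set(qrels_dict))
    return missing_from_run, missing_from_qrels


# First case from the cell above: q_3 is judged but was never retrieved.
qrels_dict = {
    "q_1": {"d_1": 1, "d_2": 2},
    "q_2": {"d_1": 2, "d_2": 1},
    "q_3": {"d_1": 1, "d_2": 1},
}
run_dict = {
    "q_1": {"d_1": 1, "d_2": 2},
    "q_2": {"d_1": 3, "d_2": 1},
}

missing_from_run, missing_from_qrels = query_id_mismatch(qrels_dict, run_dict)
print(missing_from_run)    # ['q_3']
print(missing_from_qrels)  # []
```

Both lists being empty means the query ids match.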
You can fix this error by passing the argument make_comparable=True.
qrels = Qrels({
"q_1": {"d_1": 1, "d_2": 2},
"q_2": {"d_1": 2, "d_2": 1},
"q_3": {"d_1": 1, "d_2": 1}
})
run = Run({
"q_1": {"d_1": 1, "d_2": 2, "d_3": 3},
"q_2": {"d_1": 3, "d_2": 1, "d_3": 2}
})
evaluate(
qrels=qrels, run=run, metrics="ndcg@4", make_comparable=True
)
0.5399687444280219
Alternatively, you can achieve the same result by adding the missing query to the Run with an empty dictionary as its value.
qrels = Qrels({
"q_1": {"d_1": 1, "d_2": 2},
"q_2": {"d_1": 2, "d_2": 1},
"q_3": {"d_1": 1, "d_2": 1}
})
run = Run({
"q_1": {"d_1": 1, "d_2": 2, "d_3": 3},
"q_2": {"d_1": 3, "d_2": 1, "d_3": 2},
"q_3": {}
})
evaluate(
qrels=qrels, run=run, metrics="ndcg@4",
)
0.5399687444280219
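The error message describes make_comparable=True as adding empty results for queries missing from the run and removing those that do not appear in the qrels. A minimal sketch of that alignment on plain dictionaries could look as follows. The function align_run_to_qrels and the extra query q_4 are hypothetical, and this is not ranx's internal code.

```python
def align_run_to_qrels(qrels_dict, run_dict):
    """Keep only queries present in the qrels; fill missing ones with {}."""
    return {q: dict(run_dict.get(q, {})) for q in qrels_dict}


qrels_dict = {
    "q_1": {"d_1": 1, "d_2": 2},
    "q_2": {"d_1": 2, "d_2": 1},
    "q_3": {"d_1": 1, "d_2": 1},
}
run_dict = {
    "q_1": {"d_1": 1, "d_2": 2, "d_3": 3},
    "q_2": {"d_1": 3, "d_2": 1, "d_3": 2},
    "q_4": {"d_1": 1},  # hypothetical query with no judgments
}

aligned = align_run_to_qrels(qrels_dict, run_dict)
print(sorted(aligned))  # ['q_1', 'q_2', 'q_3']
print(aligned["q_3"])   # {}
```

A query with empty results retrieves no relevant documents, so it contributes a score of 0 to the mean, which is consistent with both approaches above giving the same value.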