Same larger goal as #1168 but for querying instead of indexing.
As discussed with @oryx1729 and @tholor, we'd like to design the node and Pipeline APIs so that batches of queries can be processed together. This will provide a better user experience and also allow for future batch optimizations to speed up querying. Note that here we're focusing first on designing the interfaces; the optimizations can be handled separately for each node in dedicated issues.
One design choice we decided upon is to have an explicit separation of the single and batch functions. This makes for a clear user experience and avoids ambiguity for developers reading the code. More specifically, we want to avoid any uncertainty about the input and output formats of nodes.
If we call Pipeline.run_batch(), every node should be executed via its Node.run_batch() method, not Node.run().
In particular, @bogdankostic and @julian-risch agreed in the refinement of this issue that Pipeline.run_batch() should be implemented with a signature that accepts batched inputs and should return a list of the elements usually returned by run().
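To make the intended shape concrete, here is a hypothetical sketch of what such a signature could look like. This is an illustration based on the discussion in this issue, not the final API; the class name and the exact parameter set are assumptions.

```python
from typing import Any, Dict, List, Optional, Union


class BatchQueryPipeline:
    """Illustrative stand-in for Pipeline; not the actual Haystack class."""

    def run_batch(
        self,
        queries: Optional[Union[str, List[str]]] = None,  # single query or a batch
        file_paths: Optional[List[str]] = None,           # indexing pipelines only
        labels: Optional[List[Any]] = None,
        documents: Optional[List[Any]] = None,
        meta: Optional[Union[dict, List[dict]]] = None,   # one dict, or one per query
        params: Optional[dict] = None,                    # shared by all queries
        debug: Optional[bool] = None,
    ) -> List[Dict[str, Any]]:
        # Each element of the returned list has the same shape as a
        # single Pipeline.run() result.
        raise NotImplementedError
```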
There might be use cases where the same query is executed separately on different sets of documents. Therefore, queries is still allowed to be a single query.
Note that every query in the batch needs to use the same params for now; otherwise, optimization would be hardly possible. We could lift that limitation in the future if necessary.
file_paths and meta are so far only used in indexing pipelines. If they are set, we should call run() instead of run_batch(), or raise an error if that is not possible.
Note that we should allow for flexibility in the format of the metadata passed into Pipeline.run_batch(). If it is a single meta dictionary, it should be applied to all queries. If it is a list, it should be one meta dict for each query.
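One way to handle this flexibility is to normalize the meta argument up front. This is a sketch; the helper name normalize_meta is hypothetical and not part of any existing API.

```python
from typing import List, Optional, Union


def normalize_meta(
    meta: Optional[Union[dict, List[dict]]], num_queries: int
) -> List[Optional[dict]]:
    """Broadcast a single meta dict to all queries, or validate a per-query list."""
    if meta is None:
        return [None] * num_queries
    if isinstance(meta, dict):
        # A single dict applies to every query in the batch.
        return [meta] * num_queries
    if len(meta) != num_queries:
        raise ValueError(f"Expected {num_queries} meta dicts, got {len(meta)}")
    return list(meta)
```

After normalization, downstream code can always assume one meta dict (or None) per query.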
Further, every node's predict_batch() should have a batch_size param that can be passed via params.
For now, a Node.run_batch() can be implemented in a naive, non-optimized way by simply calling Node.run() multiple times in a loop, splitting queries, labels, documents, and meta if necessary so that run() is called with a single query and list of documents. The individual results then need to be collected into a list of results.
As an alternative, BaseComponent could implement run_batch() by calling the node's run() method.
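A minimal sketch of such a naive default on a base class could look as follows. The class and method bodies here are illustrative, not the actual Haystack BaseComponent API.

```python
from typing import Any, Dict, List


class ComponentBase:
    """Illustrative base class; not the actual Haystack BaseComponent."""

    def run(self, query: str, **kwargs) -> Dict[str, Any]:
        raise NotImplementedError

    def run_batch(self, queries: List[str], **kwargs) -> List[Dict[str, Any]]:
        # Naive fallback: process each query separately and collect the
        # per-query results into a list. Nodes can override this with an
        # optimized batched implementation later.
        return [self.run(query=q, **kwargs) for q in queries]


class EchoNode(ComponentBase):
    """Toy node used only to demonstrate the fallback behavior."""

    def run(self, query: str, **kwargs) -> Dict[str, Any]:
        return {"answers": [query.upper()]}
```

With this default, `EchoNode().run_batch(["a", "b"])` yields one result dict per query without EchoNode implementing any batching itself.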
The FARMReader already has a predict_batch() method that can be used: haystack/haystack/nodes/reader/farm.py (line 528 at f33c2b9).
For transformer-based models, we could make use of the transformers library's pipeline batching mechanism: https://huggingface.co/docs/transformers/main_classes/pipelines#pipeline-batching
However, the docs caution that batching is not automatically a performance win and can even be slower in some cases, depending on hardware and data, so we should benchmark it for our use case.