Since the latter already provides the correct description.
A slight problem in the paper is that it describes MAX over labels rather than over bags.
So for sample gradients within a bag we adopt the avg function; the main assumption is that this takes other synonymous attitudes into account.
We used this feature in earlier work (https://github.com/nicolay-r/sentiment-pcnn/tree/clls-2018).
Anyway, since our latest research adopts BagSize = 1, we do not exploit this feature there.
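A minimal sketch of the averaging idea, assuming each bag is given as the list of probabilities that its instances assign to the bag label. The function name `avg_bag_loss` and the input layout are hypothetical and are not taken from AREkit; the loss is assumed to be negative log-likelihood.

```python
import math

def avg_bag_loss(bag_probs):
    """Mean negative log-likelihood over the instances of one bag.

    bag_probs: hypothetical input, the probability p(y_i|m_i,j) that
    each instance j of the bag assigns to the bag label y_i.
    """
    return sum(-math.log(p) for p in bag_probs) / len(bag_probs)

# Every instance contributes equally, so every instance receives gradient.
print(avg_bag_loss([0.9, 0.6, 0.3]))

# With BagSize = 1 the average reduces to the single instance loss.
print(avg_bag_loss([0.9]) == -math.log(0.9))
```

Because the mean is taken over the whole bag, the gradient of this loss is the mean of the per-instance gradients, which is why the choice of aggregation stops mattering once a bag holds a single instance.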
In the original approach (https://www.aclweb.org/anthology/D15-1203.pdf),
the authors select the best instance j within a bag, where "best" denotes the maximum value of p(y_i|m_i,j) across all instances in the bag. This way we obtain a loss function at the bag level and then use the resulting value to update Theta with SGD (using AdaDelta).
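The selection step can be sketched as follows, assuming the per-instance probabilities of the bag label are already computed. The names `max_instance_bag_loss` and `batch_loss` are hypothetical, the loss is assumed to be negative log-likelihood, and the AdaDelta update itself is left out.

```python
import math

def max_instance_bag_loss(bag_probs):
    """Loss of one bag computed from its best instance only.

    bag_probs: hypothetical input, p(y_i|m_i,j) for each instance j.
    Returns (index of the best instance, negative log of its probability).
    """
    best_j = max(range(len(bag_probs)), key=lambda j: bag_probs[j])
    return best_j, -math.log(bag_probs[best_j])

def batch_loss(bags):
    """Sum of bag-level losses; the value to minimize with respect to Theta."""
    return sum(max_instance_bag_loss(b)[1] for b in bags)

best_j, loss = max_instance_bag_loss([0.2, 0.7, 0.4])
print(best_j, loss)  # the instance with probability 0.7 is selected
```

Only the selected instance contributes to the bag loss here, so the remaining instances of the bag receive no gradient in that step.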
AREkit/contrib/networks/context/architectures/base/base.py
Line 151 in ac07e88
NOTE:
This should be moved to, and clarified in, another repository related to the benchmark results for RuSentRel-1.2.