Fetching plain text with WtXmlEndTag fails with VisitNotFoundException #197

Closed
rzo1 opened this issue Aug 13, 2018 · 2 comments

@rzo1
Contributor

rzo1 commented Aug 13, 2018

Fetching the plain text of a page via `Page.getPlainText()` fails with:

2018-08-13 14:31:24,678 WARN  [ForkJoinPool.commonPool-worker-4] WikiGraphStoragePipelineService (80): de.fau.cs.osr.utils.visitor.VisitNotFoundException: Unable to find visit() method for node of type `org.sweble.wikitext.parser.nodes.WtXmlEndTag' in visitor `de.tudarmstadt.ukp.wikipedia.api.sweble.PlainTextConverter'
de.fau.cs.osr.utils.visitor.VisitingException: de.fau.cs.osr.utils.visitor.VisitNotFoundException: Unable to find visit() method for node of type `org.sweble.wikitext.parser.nodes.WtXmlEndTag' in visitor `de.tudarmstadt.ukp.wikipedia.api.sweble.PlainTextConverter'
	at de.fau.cs.osr.utils.visitor.VisitorBase.handleVisitingException(VisitorBase.java:92)
	at de.fau.cs.osr.utils.visitor.VisitorLogic.resolveAndVisit(VisitorLogic.java:118)
	at de.fau.cs.osr.utils.visitor.VisitorLogic.resolveAndVisit(VisitorLogic.java:90)
	at de.fau.cs.osr.utils.visitor.VisitorBase.resolveAndVisit(VisitorBase.java:119)
	at de.fau.cs.osr.ptk.common.AstVisitor.dispatch(AstVisitor.java:56)
	at de.fau.cs.osr.ptk.common.AstVisitor.iterate(AstVisitor.java:66)
	at de.tudarmstadt.ukp.wikipedia.api.sweble.PlainTextConverter.visit(PlainTextConverter.java:211)
	at sun.reflect.GeneratedMethodAccessor130.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at de.fau.cs.osr.utils.visitor.VisitorLogic$Target.invoke(VisitorLogic.java:361)
	at de.fau.cs.osr.utils.visitor.VisitorLogic.resolveAndVisit(VisitorLogic.java:110)
	at de.fau.cs.osr.utils.visitor.VisitorLogic.resolveAndVisit(VisitorLogic.java:90)
	at de.fau.cs.osr.utils.visitor.VisitorBase.resolveAndVisit(VisitorBase.java:119)
	at de.fau.cs.osr.ptk.common.AstVisitor.dispatch(AstVisitor.java:56)
	at de.fau.cs.osr.ptk.common.AstVisitor.dispatch(AstVisitor.java:28)
	at de.fau.cs.osr.utils.visitor.VisitorBase.go(VisitorBase.java:111)
	at de.tudarmstadt.ukp.wikipedia.api.Page.parsePage(Page.java:599)
	at de.tudarmstadt.ukp.wikipedia.api.Page.getPlainText(Page.java:580)

An example article is "Liste von Materia Medica der traditionellen uigurischen Medizin", with the following markup text:

Dies ist eine '''Liste von [[Materia Medica]] der traditionellen [[Uiguren|uigurischen]] [[Medizin]]'''. Die uigurische Medizin entwickelte sich aus der [[Arabische Medizin|arabischen Medizin]], der [[Medizin_des_Altertums#Medizin im Antiken Griechenland|antiken griechischen Medizin]] und der [[Traditionelle chinesische Medizin|traditionellen chinesischen Medizin]].<ref>[http://www.cintcm.com/e_cintcm/e_cmm/Uigur%20drugs.htm cintcm.com: The traditional Uigur drugs] – gefunden am 13. Juli 2010</ref>

Die Angaben erfolgen zusätzlich in [[Pinyin]]-Schreibung und in [[Chinesische Schrift|chinesischen Kurzzeichen]]:

== Übersicht ==

<small><center>Quellen: [http://www.cintcm.com/e_cintcm/e_cmm/Uigur%20drugs.htm cintcm.com], [http://www.tcm-resources.com/xiangguan/zhongdianziyuan/minzuyaozhonglei.doc tcm-resources.com]</center></small>

*[[Prunus amygdalus]] (Badanxing 巴旦杏)
*[[Vitis vinifera]] (Suosuo putao 索索葡萄) 
*[[Cuminum cyminum]] (Ziran 孜然)
*[[Vernonia solanifolia]] (Qucong banjiuju 驱虫斑鸠菊)
*[[Alhagi pseudalhagi]] (Citang 刺糖)
*[[Matricaria chamomilla]] (Yan ganju 洋甘菊)
*[[Anethum graveolens]] (Shiluo 莳萝) 
*[[Ziziphora clinopodioides]] (Cunxiangcao 唇香草)
*[[Cicer arietinum]] (Xinjiang yingzueidou 新疆鹰嘴豆)
*[[Dracocephalum heterophyllum]] (Yiye qinglan 异叶青兰)
*[[Saussurea laniceps]] oder [[Saussurea involucrate]] oder [[Saussurea medusa]] (Xuelianhua 雪莲花)
*[[Populus diversifolia]] (Huyang 胡杨) (Hutonglei 胡桐泪) 

*[[Moschus moschiferus]] (Shexiang 麝香)
*[[Physeter catodon]] (Longxianxiang 龙涎香)
*[[Viverra zibztha]] (Hailixiang 海狸香)
*(Daiyicao 黛衣草)
*[[Syzygium aromaticum]] (Dingxiang 丁香)
*[[Amomum cardamomum]] (Doukou 豆蔻)
*[[Piper longum]] (Bibo 荜茇)

*[[Strychnos nux-vomica]] (Maqianzi 马钱子) hochgiftig
*[[Datura metel]] oder [[Datura innoxia]] (Mantuoluo 曼陀罗) hochgiftig
*[[Hyoscyamus niger]] (Tianxianzi 天仙子) hochgiftig
*[[Peganum harmala]] (Luotuopeng 骆驼蓬) hochgiftig

*[[Polygonatum odoratum]] (Yuzhu 玉竹) ist [[Xinjiang Polygonatum]] (Xinjiang huangjin 新疆黄精)
*[[Dictamnus dasycarpus]] (Baixianpi 白鲜皮) ist Xianye baixian 狭叶白鲜
*[[Leonurus heterophyllus]] (Yimucao 益母草) ist [[Xinjiang Leonurus]] (Xinjiang yimucao 新疆益母草)
*[[Nelumbo nucifera]] (Hehua 荷花) ist [[Nymphaea tetragna]] (Shuilian 睡莲) 

*[[Saposhnikovia divaricata]] (Fangfeng 防风)
*[[Paeonia lactiflora]] oder [[Paeonia obovata]] oder [[Paeonia  veitchii]] (Chishao 赤芍)
*[[Notopterygium incisum]] oder [[Notopterygium forbesii]] oder [[Notopterygium franchetii]] (Qianghuo 羌活)
*[[Angelica pubescens]] oder [[Angelica dahurica]] oder [[Angelica porphyrocaulis]] oder [[Heracleum hemsleyanum]] oder [[Heracleum lanatum]] oder [[Aralia cordata]] (Duhuo 独活)
*[[Aucklandia lappa]] (Muxiang 木香)
*[[Rubia cordifolia]] (Qiancao 茜草)
*[[Codonopsis pilosula]] (Dangshen 党参)
*[[Rhizoma ligustici]] (Gaoben 藁本)
*[[Ephedra sinica]] oder [[Ephedra equisetina]] oder [[Ephedra intermedia]] (Mahuang 麻黄)
*[[Clematis chinensis]] (Weilingxian 威灵仙)

== Gesundheitshinweis ==
Einige der Materia medica sind hochgiftig.

== Siehe auch ==
* [[Liste von Heilpflanzen]]

== Literatur ==
* ''Xinjiang Weiwu'er yaozhi'' 新疆维吾尔药志

== Weblinks ==
* [http://www.cintcm.com/e_cintcm/e_cmm/Uigur%20drugs.htm cintcm.com: The traditional Uigur drugs] – Englisch 
* [http://www.tcm-resources.com/xiangguan/zhongdianziyuan/minzuyaozhonglei.doc tcm-resources.com: Minzuyao zhonglei] – Chinesisch (pdf-Datei)
* [http://www.cintcm.com/lanmu/shaoshu_yixue/shaoshu_weiyi/weiyi_jianshi.htm cintcm.com. Weiwu'erzu yiyao jianshi] – Chinesisch

== Einzelnachweise ==
<references/>

{{Gesundheitshinweis}}

[[Kategorie:Xinjiang|!]]
[[Kategorie:Alternativmedizin|!]]
[[Kategorie:Medizingeschichte|!]]
[[Kategorie:Liste (Medizin)|Materia Medica Der Traditionellen Uigurischen Medizin]]
[[Kategorie:Liste (Botanik)|Materia Medica Der Traditionellen Uigurischen Medizin]]
[[Kategorie:Wikipedia:Liste|Materia Medica Der Traditionellen Uigurischen Medizin]]

This is related to #193

rzo1 changed the title from "Fetching plain text with WtXmlEndTagfails with VisitNotFoundException" to "Fetching plain text with WtXmlEndTag fails with VisitNotFoundException" on Aug 13, 2018
mawiesne self-assigned this on Aug 21, 2018
mawiesne added this to the 1.2.0 milestone on Aug 21, 2018
@mawiesne
Contributor

@rzo1 I will tackle the reported bug ASAP. FYI: @reckart

mawiesne added a commit that referenced this issue Aug 21, 2018
…xception

- Introduces a reduced article (German: `Liste von Materia Medica der traditionellen uigurischen Medizin`) to the embedded demo database. It contains two `WtXmlEndTag`-like XML elements.
- Implements handling of `WtXmlEndTag` structures in `PlainTextConverter`.
- Adds a new JUnit test case in `PageTest` to demonstrate XML closing tags are now parsed correctly.
- Adjusts existing test cases.
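
For readers unfamiliar with reflection-based visitors, the failure and the kind of fix described in the commit message can be sketched roughly as follows. This is only an illustration under stated assumptions: `VisitorSketch`, `ReflectiveVisitor`, `Text`, `XmlEndTag`, `ConverterBeforeFix` and `ConverterAfterFix` are hypothetical stand-ins, not the real Sweble or JWPL classes, and the no-op handler is an assumed shape of the fix, not necessarily what the referenced commit does.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins only: none of these are the real Sweble or JWPL classes.
class VisitorSketch {

    interface Node {}

    static final class Text implements Node {
        final String content;
        Text(String content) { this.content = content; }
    }

    static final class XmlEndTag implements Node {
        final String name;
        XmlEndTag(String name) { this.name = name; }
    }

    // Looks up a public visit(<runtime node type>) method via reflection.
    static abstract class ReflectiveVisitor {
        final void dispatch(Node node) {
            try {
                Method visit = getClass().getMethod("visit", node.getClass());
                visit.setAccessible(true);
                visit.invoke(this, node);
            } catch (NoSuchMethodException e) {
                // No handler for this node type: the situation reported in the issue.
                throw new IllegalStateException(
                        "Unable to find visit() method for " + node.getClass().getSimpleName(), e);
            } catch (IllegalAccessException | InvocationTargetException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    // Converter that only knows about text nodes.
    static class ConverterBeforeFix extends ReflectiveVisitor {
        protected final StringBuilder out = new StringBuilder();

        public void visit(Text text) { out.append(text.content); }

        String convert(List<Node> nodes) {
            for (Node node : nodes) {
                dispatch(node);
            }
            return out.toString();
        }
    }

    // Assumed shape of the fix: a closing tag contributes no plain text.
    static class ConverterAfterFix extends ConverterBeforeFix {
        public void visit(XmlEndTag tag) { }
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(new Text("Quellen: cintcm.com"), new XmlEndTag("center"));
        System.out.println(new ConverterAfterFix().convert(nodes));
        try {
            new ConverterBeforeFix().convert(nodes);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the converter with the extra handler returns the text and skips the closing tag, while the converter without it throws as soon as dispatch reaches the unhandled node type.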
@mawiesne
Contributor

@rzo1 Please see PR #198 which will fix the reported issue.
