<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<!-- DW6 -->
<head>
<!-- Copyright 2005 Macromedia, Inc. All rights reserved. -->
<title>Home Page</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<link rel="stylesheet" href="mm_spa.css" type="text/css" />
<style type="text/css">
<!--
.style2 {color: #FFFFFF; font-size: 12px;}
.style3 {
font-family: Geneva, Arial, Helvetica, sans-serif;
font-weight: bold;
color: #99FF33;
font-size: 14px;
}
.style4 {
color: #99FF66;
font-weight: bold;
}
.style6 {
color: #CCCCCC;
font-size: 12px;
}
.style12 {
font-weight: bold;
font-size: 18px;
}
.style14 {color: #CCFF33}
.style17 {
font-size: 13px;
color: #00FF66;
}
.style18 {color: #CCCCCC; }
.style20 {font-size: 12px}
.style21 {color: #00FF66}
.style22 {color: #CCFF33; font: 11px Arial, Helvetica, sans-serif; }
.style23 {color: #999999; font-size: 12px; }
.style24 {font-size: 20px}
.style26 {font-size: 14px}
.style27 {
font-size: 14px;
font-weight: bold;
}
-->
</style>
</head>
<body bgcolor="#990000" background="mm_bg_red.gif">
<table border="0" cellspacing="0" cellpadding="0">
<tr bgcolor="#220103">
<td rowspan="2" colspan="2" nowrap="nowrap"> <img src="mm_spa_photo11.jpg" alt="Header image" width="215" height="109" border="0" /> </td>
<td colspan="2" height="55" nowrap="nowrap" id="logo" valign="bottom"><span class="pageName style24">Shared Task on Mixed Script Information Retrieval</span> </td>
<td width="316" rowspan="2"><!--<img src="Fire logo.png" />--></td>
</tr>
<tr bgcolor="#220103">
<td height="54" colspan="2" nowrap="nowrap" id="tagline" valign="top"><div align="center"><span class="style26">In conjunction with <span class="style27">FIRE 2016</span></span></div></td>
</tr>
<tr bgcolor="#FF9900">
<td colspan="6"><img src="mm_spacer.gif" alt="" width="1" height="1" border="0" /></td>
</tr>
<tr bgcolor="#FF080E">
<td colspan="6"><img src="mm_spacer.gif" alt="" width="1" height="2" border="0" /></td>
</tr>
<tr bgcolor="#FF9900">
<td colspan="6"><img src="mm_spacer.gif" alt="" width="1" height="1" border="0" /></td>
</tr>
<tr bgcolor="#FF080E">
<td colspan="6"><img src="mm_spacer.gif" alt="" width="1" height="18" border="0" /></td>
</tr>
<tr bgcolor="#FF9900">
<td colspan="6"><img src="mm_spacer.gif" alt="" width="1" height="1" border="0" /></td>
</tr>
<tr bgcolor="#FF080E">
<td colspan="6"><img src="mm_spacer.gif" alt="" width="1" height="2" border="0" /></td>
</tr>
<tr bgcolor="#FF9900">
<td colspan="6"><img src="mm_spacer.gif" alt="" width="1" height="1" border="0" /></td>
</tr>
<tr>
<td width="165" valign="top" id="navborder"><br />
<table border="0" cellspacing="0" cellpadding="0" width="160" id="navigation">
<tr>
<td width="160"><a href="javascript:;" class="navText">HOME</a></td>
</tr>
<tr>
<td width="160"><a href="#task" class="navText">THE TASKS </a></td>
</tr>
<tr>
<td width="160"><a href="#organizers" class="navText">ORGANIZERS</a></td>
</tr>
<tr>
<td width="160"><a href="#dates" class="navText">DATES</a></td>
</tr>
<tr>
<td width="160"><a href="#contact" class="navText">CONTACT</a></td>
</tr>
</table> </td>
<td width="50"><img src="mm_spacer.gif" alt="" width="50" height="1" border="0" /></td>
<td width="689" valign="top"><img src="mm_spacer.gif" alt="" width="305" height="1" border="0" /><br />
<br />
<br />
<table border="0" cellspacing="0" cellpadding="0" width="680">
<tr>
<td width="680" class="pageName"> </td>
</tr>
<tr>
<td class="bodyText"><p align="justify" class="style2">A large number of languages, including Arabic, Russian, and most of the South and South East Asian languages, are written using indigenous scripts. However, websites and user-generated content (such as tweets and blogs) in these languages are often written in Roman script for various socio-cultural and technological reasons. This process of phonetically representing the words of a language in a non-native script is called <em>transliteration</em>. Transliteration, especially into Roman script, is used abundantly on the Web, not only for documents but also for the user queries that search for these documents. This situation, where both documents and queries can be in more than one script and users may expect to retrieve documents across scripts, is referred to as <a href="http://research.microsoft.com/apps/pubs/default.aspx?id=226107" target="_new" onclick="stc(this, 26)">Mixed Script Information Retrieval</a>.</p>
<p align="justify" class="pageName"><br />
History</p>
<p align="justify" class="style2">Two pilot subtasks on transliterated search were introduced as part of FIRE 2013. Subtask 1 was on language identification of the query words, followed by transliteration of the Indian language words. The subtask was conducted for three Indian languages: Hindi, Bangla and Gujarati. Subtask 2 was on ad hoc retrieval of Bollywood song lyrics, one of the most common forms of transliterated search that commercial search engines have to tackle. Five teams participated in the shared task. </p>
<p align="justify" class="style2">In FIRE 2014, the scope of subtask 1 was extended to cover three more South Indian languages - Tamil, Kannada and Malayalam. In subtask 2, we introduced (a) queries in Devanagari script, and (b) more natural queries with splitting and joining of words. More than 15 teams participated in the tasks. </p>
<p align="justify" class="style2">In FIRE 2015, the shared task was renamed from "Transliterated Search" to "Mixed Script Information Retrieval" to align it with the framework proposed by Gupta et al. (2014). Three subtasks were conducted. Subtask 1 was extended further by including more Indic languages, and transliterated text from all the languages was mixed. Subtask 2 was on searching movie dialogues and reviews along with song lyrics. Mixed script question answering (MSQA) was introduced as Subtask 3.</p>
<p align="justify" class="style2"> </p>
<p align="justify" class="style2"><span class="pageName"> <a name='task'></a>Task Description</span></p>
<p align="justify" class="style2"><span class="style3">Subtask 1: Code-Mixed Cross-Script Question Classification</span><br />
<br />
Question answering (QA) is a classic application of natural language processing, with practical uses in domains such as education, health care and personal assistance. QA is a retrieval task that is
more challenging than the task of a common search engine, because the purpose of QA is to find an accurate and
concise answer to a question rather than just to retrieve relevant documents containing the answer (Li and Roth,
2002). Recently, Banerjee et al. (2016) formally introduced the code-mixed cross-script QA research problem. The first step in understanding a question is question analysis. Question classification, which detects the answer type of the question, is an
important part of question analysis. It not only helps filter out a wide range of candidate answers
but also helps determine answer selection strategies (Li and
Roth, 2002). Furthermore, the performance of question classification has a significant influence on the overall
performance of a QA system (Ittycheriah et al., 2001; Hovy et al., 2001; Moldovan et al., 2003).</p>
<p align="justify" class="style2"> Let Q = {q<sub>1</sub>, q<sub>2</sub>, ..., q<sub>n</sub>} be a set of factoid questions written in Romanized Bengali mixed with English (i.e., the questions
also contain English words and phrases). Let C = {c<sub>1</sub>, c<sub>2</sub>, ..., c<sub>m</sub>} be the set of question classes. The task is to
classify each given question q<sub>i</sub> into one of the predefined coarse-grained classes in C.</p>
<p align="justify" class="style2"><span class="style4">Language: </span>Code-mixed Bengali-English</p>
<p align="justify" class="style2"><strong class="style4">Example:</strong><br />
Question: last volvo bus kokhon chare ?<br />
Question Class: TEMPORAL</p>
<p align="justify" class="style2"> <span class="style4">Data and Resources:</span><br />
A dataset of questions tagged with question classes will be released as training data for this task. Participants
can use any other resources that they have access to.<br />
Each entry in the dataset has the format: <em>q_no q_string q_class</em><br />
Here q_no, q_string and q_class refer to the question number, the code-mixed cross-script question string and the
class of the question, respectively.<br />
Example: last volvo bus kokhon chare ? TEMPORAL<br />
</p>
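<p align="justify" class="style2">The entry format described above can be read with a few lines of Python. The sketch below is only an illustration under stated assumptions: fields are taken to be whitespace-separated, with q_no as the first token and q_class as the last, and the released files may use a different delimiter. The names parse_entry and QuestionEntry are hypothetical, and the question number 1 is invented for the example.</p>

```python
from typing import NamedTuple


class QuestionEntry(NamedTuple):
    """One dataset entry (hypothetical container, not part of the task release)."""
    q_no: str
    q_string: str
    q_class: str


def parse_entry(line):
    """Split one dataset line into (q_no, q_string, q_class).

    Assumption: fields are whitespace-separated, the first token is q_no
    and the last token is q_class; everything in between is the question.
    """
    tokens = line.split()
    # Star-unpacking raises ValueError when fewer than two tokens are present.
    q_no, *words, q_class = tokens
    if not words:
        raise ValueError("entry has no question text: " + repr(line))
    return QuestionEntry(q_no, " ".join(words), q_class)


# The question number "1" is made up for illustration.
entry = parse_entry("1 last volvo bus kokhon chare ? TEMPORAL")
print(entry.q_class)  # prints: TEMPORAL
```

<p align="justify" class="style2">Taking the class from the last token keeps the question string intact even when it contains spaces or a separate question mark, as in the example above.</p>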
<p align="justify" class="style2"> </p>
<p align="justify" class="style2"><span class="style3">Subtask 2: Mixed-Script Ad Hoc Retrieval</span></p>
<p align="justify" class="style23">Will be updated soon... </p>
<p align="justify" class="style23"> </p>
<p align="justify"><span class="pageName">How to Participate</span></p>
<ul>
<li class="style2">Who can participate: The shared task is open to all. Students, faculty members and researchers, as well as engineers from industry, are welcome to participate. Participation is in teams, where a team can consist of one or more members. There is no upper limit on the number of members in a team (though we believe 2 to 4 is the optimal team size for these tasks).</li>
<li class="style2">Registration: A team must register for this shared task in order to participate. The training and test data will be sent by email to registered teams only. Click <a href="http://bit.ly/1UmYuno">here</a> to register.</li>
<li class="style2">Which subtasks: A team can choose to participate in one or both of the subtasks. </li>
<li class="style2">How many runs: A team can submit up to three runs per subtask. A "run" is defined as the output of a particular system on the test set. If you want to try more than one system on our test data (for example, because you are not sure which system will perform best, or you are curious how slightly different systems compare), you can submit multiple runs (up to 3).</li>
</ul>
<p align="justify" class="style2"><span class="pageName"><br />
<a name="dates"></a>Important Dates</span></p>
<ul>
<li class="style2">Registration for the task begins: 20th July 2016</li>
<li class="style2"> Training/Dev data release: 11th Aug 2016</li>
<li class="style2"> Registration closes: 31st Aug 2016</li>
<li class="style2"> Test Set release: 28th Sep 2016</li>
<li class="style2"> Submit Run: 5th Oct 2016</li>
<li class="style2"> Results distributed: 19th Oct 2016</li>
<li class="style2"> Working Notes submission deadline: 26th Oct 2016</li>
<li class="style2"> Working Notes reviews: 6th Nov 2016</li>
<li class="style2"> Working Notes final versions due: 13th Nov 2016</li>
<li class="style2"> FIRE Workshop: 8-10th Dec 2016</li>
</ul>
<p align="justify" class="pageName"><br />
References</p>
<ul>
<li class="style6"> Li, X. and D. Roth. 2002. Learning question classifiers. In: 19th International Conference on Computational
Linguistics (COLING), pages 556–562.</li>
<li class="style6"> Babak Loni. 2011. A survey of state-of-the-art methods on question classification. Delft University of Technology,
Tech. Rep (2011): 1-40.</li>
<li class="style6"> A. Ittycheriah, M. Franz, W. J. Zhu, A. Ratnaparkhi, and R. J. Mammone. 2001. IBM’s statistical question answering
system. In: 9th Text Retrieval Conference, NIST, 2001.</li>
<li class="style6"> Eduard Hovy, Laurie Gerber, Ulf Hermjakob, Chin-Yew Lin, and Deepak Ravichandran. 2001. Toward semantics-based answer pinpointing. In: Human Language Technology Research (pp. 1-7). Association for Computational Linguistics.</li>
<li class="style6"> Dan Moldovan, Marius Pa&#351;ca, Sanda Harabagiu, and Mihai Surdeanu. 2003. Performance issues and error analysis
in an open-domain question answering system. In: ACM Trans. Inf. Syst., 21:133–154.</li>
<li class="style6"><span class="style6"> Somnath Banerjee, Sudip Kumar Naskar, Paolo Rosso, and Sivaji Bandyopadhyay. 2016. The First Cross-Script Code-Mixed Question Answering Corpus. In: Modeling, Learning and Mining for Cross/Multilinguality Workshop, 38th European Conference on Information Retrieval (ECIR), 2016.</span><br />
<br />
</li>
</ul></td>
</tr>
<tr>
<td class="bodyText"> </td>
</tr>
</table>
<br />
<br /> </td>
<td width="50"><img src="mm_spacer.gif" alt="" width="50" height="1" border="0" /></td>
<td width="316" align="left" valign="top"><p> </p>
<p> </p>
<p><span class="pageName">News</span></p>
<p align="left" class="style12"><span class="bodyText style14">19/6/2016: Registration for the shared task is now open. Please register your team through </span><span class="style22"><a href="http://bit.ly/1UmYuno" target="_self" onclick="stc(this, 41)">this link</a></span><span class="bodyText style14">.</span></p>
<p align="left" class="pageName"><a name="contact"></a>Contact</p>
<ul>
<li class="style17">
<p>For general queries: Monojit Choudhury &lt;<a href="mailto:[email protected]" class="style18" onclick="stc(this, 47)">[email protected]</a>&gt;</p>
</li>
<li class="style17">
<p>For Subtask 1: Somnath Banerjee<br />
&lt;<a href="mailto:[email protected]" class="style18" onclick="stc(this, 50)">[email protected]</a>&gt;</p>
</li>
<li class="style17">
<p>For Subtask 2: Amitava Das <br />
&lt;<a href="mailto:[email protected]" class="style18" onclick="stc(this, 49)">[email protected]</a>&gt;</p>
</li>
</ul>
<p class="pageName"><br />
<a name="organizers"></a>Task Coordinators </p>
<ul>
<li class="style17">
<p>Monojit Choudhury, Microsoft Research</p>
</li>
<li class="style17">
<p> Somnath Banerjee, Jadavpur University</p>
</li>
<li class="style17">
<p> Sudip Kumar Naskar, Jadavpur University</p>
</li>
<li class="style17">
<p> Paolo Rosso, Technical University of Valencia</p>
</li>
<li class="style17">
<p>Sivaji Bandyopadhyay, Jadavpur University</p>
</li>
<li class="style17">
<p>Amitava Das, IIIT Sriharikota</p>
</li>
<li class="style17">
<p>Kunal Chakma, NIT Agartala</p>
</li>
</ul>
<p class="pageName"><br />
Useful Links </p>
<ul class="style20">
<li class="style21">
<p><a href="http://www.isical.ac.in/~fire/" class="style18" onclick="stc(this, 42)">FIRE: Forum for IR Evaluation</a></p>
</li>
<li class="style21">
<p><a href="http://research.microsoft.com/en-us/events/fire13_st_on_transliteratedsearch/fire15st.aspx" title="" target="_self" class="style18" onclick="stc(this, 43)" alt="">FIRE 2015 Shared Task on Mixed Script IR</a></p>
</li>
<li class="style21">
<p><a href="http://ceur-ws.org/Vol-1587/" target="_self" class="style18" onclick="stc(this, 44)">Working Notes of FIRE 2015 Shared Task</a></p>
</li>
<li class="style21">
<p><a href="http://research.microsoft.com/en-US/events/fire13_st_on_transliteratedsearch/fire14st.aspx" title="" target="_self" class="style18" onclick="stc(this, 43)" alt="">FIRE 2014 Shared Task on Transliterated Search</a></p>
</li>
<li class="style21">
<p><a href="http://www.isical.ac.in/~fire/2014/working-notes.html" target="_self" class="style18" onclick="stc(this, 44)">Working Notes of FIRE 2014 Shared Task</a></p>
</li>
<li class="style21">
<p><a href="http://research.microsoft.com/en-US/events/fire13_st_on_transliteratedsearch/default.aspx" title="" target="_self" class="style18" onclick="stc(this, 45)" alt="">FIRE 2013 Shared Task on Transliterated Search</a></p>
</li>
<li class="style21">
<p><a href="http://www.isical.ac.in/~fire/2013/working-notes.html" class="style18" onclick="stc(this, 46)">Working Notes of FIRE 2013 Shared Task</a></p>
</li>
</ul>
<p class="pageName"> </p>
</td>
<td width="4"> </td>
</tr>
<tr>
<td width="165"> </td>
<td width="50"> </td>
<td width="689"> </td>
<td width="50"> </td>
<td width="316"> </td>
<td width="4"> </td>
</tr>
</table>
</body>
</html>