By
Habash, Nizar; Dorr, Bonnie; Monz, Christof
7 Citations
The last few years have witnessed an increasing interest in hybridizing surface-based statistical approaches and rule-based symbolic approaches to machine translation (MT). Much of that work is focused on extending statistical MT systems with symbolic knowledge and components. In the brand of hybridization discussed here, we go in the opposite direction: adding statistical bilingual components to a symbolic system. Our base system is Generation-heavy machine translation (GHMT), a primarily symbolic asymmetrical approach that addresses the issue of Interlingual MT resource poverty in source-poor/target-rich language pairs by exploiting symbolic and statistical target-language resources. GHMT’s statistical components are limited to target-language models, which arguably makes it a simple form of a hybrid system. We extend the hybrid nature of GHMT by adding statistical bilingual components. We also describe the details of retargeting it to Arabic–English MT. The morphological richness of Arabic brings several challenges to the hybridization task. We conduct an extensive evaluation of multiple system variants. Our evaluation shows that this new variant of GHMT, a primarily symbolic system extended with monolingual and bilingual statistical components, has a higher degree of grammaticality than a phrase-based statistical MT system, where grammaticality is measured in terms of correct verb-argument realization and long-distance dependency translation.
By
Cuong, Hoang; Sima’an, Khalil
Differences in domains of language use between training data and test data have often been reported to result in performance degradation for phrase-based machine translation models. Throughout the past decade or so, a large body of work has aimed at exploring domain-adaptation methods to improve system performance in the face of such domain differences. This paper provides a systematic survey of domain-adaptation methods for phrase-based machine translation systems. The survey starts by outlining the sources of errors in various components of phrase-based models due to domain change, including lexical selection, reordering and optimization. Subsequently, it outlines the different lines of research on domain adaptation in the literature, and surveys the existing work within these lines, discussing how the approaches differ and how they relate to each other.
By
Carter, Simon; Monz, Christof
2 Citations
This article describes a method that successfully exploits syntactic features for n-best translation candidate reranking using perceptrons. We motivate the utility of syntax by demonstrating the superior performance of parsers over n-gram language models in differentiating between Statistical Machine Translation output and human translations. Our approach uses discriminative language modelling to rerank the n-best translations generated by a statistical machine translation system. The performance is evaluated for Arabic-to-English translation using NIST’s MT-Eval benchmarks. While deep features extracted from parse trees do not consistently help, we show how features extracted from a shallow Part-of-Speech annotation layer outperform a competitive baseline and a state-of-the-art comparative reranking approach, leading to significant BLEU improvements on three different test sets.
By
Ambati, Bharat Ram; Deoskar, Tejaswini; Steedman, Mark
In this paper, we present an approach for automatically creating a combinatory categorial grammar (CCG) treebank from a dependency treebank for the subject–object–verb language Hindi. Rather than a direct conversion from dependency trees to CCG trees, we propose a two-stage approach: a language-independent generic algorithm first extracts a CCG lexicon from the dependency treebank. An exhaustive CCG parser then creates a treebank of CCG derivations. We also discuss special cases of this generic algorithm to handle linguistic phenomena specific to Hindi. In doing so we extract different constructions with long-range dependencies like coordinate constructions and non-projective dependencies resulting from constructions like relative clauses, noun elaboration and verbal modifiers.
By
Maillette de Buy Wenniger, Gideon; Sima’an, Khalil
1 Citation
Long-range word order differences are a well-known problem for machine translation. Unlike the standard phrase-based models, which work with sequential and local phrase reordering, the hierarchical phrase-based model (Hiero) embeds the reordering of phrases within pairs of lexicalized context-free rules. This allows the model to handle long-range reordering recursively. However, the Hiero grammar works with a single nonterminal label, which means that the rules are combined together into derivations independently and without reference to context outside the rules themselves. Follow-up work explored remedies involving nonterminal labels obtained from monolingual parsers and taggers. As of yet, no labeling mechanisms exist for the many languages for which there are no good-quality parsers or taggers. In this paper we contribute a novel approach for acquiring reordering labels for Hiero grammars directly from the word-aligned parallel training corpus, without use of any taggers or parsers. The new labels represent types of alignment patterns in which a phrase pair is embedded within larger phrase pairs. In order to obtain alignment patterns that generalize well, we propose to decompose word alignments into trees over phrase pairs. Besides this labeling approach, we contribute coarse and sparse features for learning soft, weighted label substitution as opposed to standard substitution. We report extensive experiments comparing our model to two baselines: Hiero and the known syntax-augmented machine translation (SAMT) variant, which labels Hiero rules with nonterminals extracted from monolingual syntactic parses. We also test a simplified labeling scheme based on inversion transduction grammar (ITG). For the Chinese–English task we obtain a performance improvement of up to 1 BLEU point, whereas for the German–English task, where morphology is an issue, a minor (but statistically significant) improvement of 0.2 BLEU points is reported over SAMT. While ITG labeling does give a performance improvement, it sometimes remains suboptimal relative to our proposed labeling scheme.
By
Stanojević, Miloš
Current chart-based parsers of Minimalist Grammars exhibit prohibitively high polynomial complexity that makes them unusable in practice. This paper presents a transition-based parser for Minimalist Grammars that approximately searches through the space of possible derivations by means of beam search, and does so very efficiently: the worst-case complexity of building one derivation is $$O(n^2)$$ and the best-case complexity is $$O(n)$$. This approximated inference can be guided by a trained probabilistic model that can condition on larger context than standard chart-based parsers. The transitions of the parser are very similar to the transitions of bottom-up shift-reduce parsers for Context-Free Grammars, with additional transitions for online reordering of words during parsing in order to make non-projective derivations projective.
By
Sima'an, Khalil
2 Citations
Recent models of natural language processing employ statistical reasoning for dealing with the ambiguity of formal grammars. In this approach, statistics concerning the various linguistic phenomena of interest are gathered from actual linguistic data and used to estimate the probabilities of the various entities that are generated by a given grammar, e.g., derivations, parse-trees and sentences. The extension of grammars with probabilities makes it possible to state ambiguity resolution as a constrained optimization formula, which aims at maximizing the probability of some entity that the grammar generates given the input (e.g., the maximum probability parse-tree given some input sentence). The implementation of these optimization formulae in efficient algorithms, however, does not always proceed smoothly. In this paper, we address the computational complexity of ambiguity resolution under various kinds of probabilistic models. We provide proofs that some frequently occurring problems of ambiguity resolution are NP-complete. These problems are encountered in various applications, e.g., language understanding for text- and speech-based applications. Assuming the common model of computation, this result implies that, for many existing probabilistic models, it is not possible to devise tractable algorithms for solving these optimization problems.
By
Kokke, Pepijn
We propose an improvement of Barker and Shan’s [4] NL$$_{\text{CL}}$$ for which derivability is decidable, which has a normal form for proof search, can analyse scope islands, and can distinguish between strong and weak quantifiers.
By
Hassan, Hany; Sima’an, Khalil; Way, Andy
1 Citation
A challenging aspect of Statistical Machine Translation from Arabic to English lies in bringing the Arabic source morphosyntax to bear on the lexical as well as word-order choices of the English target string. In this article, we extend the feature-rich discriminative Direct Translation Model 2 (DTM2) with a novel linear-time parsing algorithm based on an eager, incremental interpretation of Combinatory Categorial Grammar. This way we can reap the benefits of a target syntactic enhancement that leads to more grammatical output while also enabling dynamic decoding without the risk of blowing up decoding space and time requirements. Our model defines a mix of model parameters, some of which involve DTM2 source morphosyntactic features, and others are novel target-side syntactic features. Alongside translation features extracted from the derived parse tree, we explore syntactic features extracted from the incremental derivation process. Our empirical experiments show that our model significantly outperforms the state-of-the-art DTM2 system.
By
Bod, Rens
The Data-Oriented Parsing (DOP) model employs an annotated corpus or treebank directly as a stochastic grammar. New input is parsed by combining subtrees from the treebank. The most probable analysis is estimated on the basis of the occurrence frequencies of the treebank subtrees. The model as originally defined imposes no constraints on the size and complexity of the subtrees that may be invoked in parsing new input. Both from a theoretical and from a computational perspective we may therefore wonder whether it is possible to impose constraints on the subtrees that are used, in such a way that the performance of the model does not deteriorate or perhaps even improves. That is the main question addressed in the current paper. Moreover, by imposing different constraints on the subtree set, we can simulate several other stochastic grammars, ranging from stochastic context-free grammars to stochastic lexicalized grammars, thus allowing for a proper performance comparison. Experiments with the ATIS and Wall Street Journal treebanks indicate that very few constraints on the treebank subtrees are warranted. We conclude with a brief discussion of the consequences of our results.
By
Sima’an, Khalil
Spoken utterances do not always abide by linguistically motivated grammatical rules. These utterances exhibit various phenomena considered outside the realm of theoretically oriented linguistic research. For a language model that extends linguistically motivated grammars with probabilistic reasoning, the problem is how to feature the robustness that is necessary for speech understanding. This paper addresses the issue of the robustness of the Data-Oriented Parsing (DOP) model within a Dutch speech-based dialogue system. It presents an extension of the DOP model into a head-driven variant, which allows for Markovian generation of parse trees. It is shown empirically that the new variant improves over the original DOP model on two tasks: the formal understanding of speech utterances, and the extraction of semantic concepts from word lattices output by a speech recognizer.
By
Dijk, Teun A.
3 Citations
This paper is intended as a contribution to the recent developments in the logical analysis of natural language. In particular, it investigates some of the logical properties of so-called text grammars, i.e. grammars formally describing the structure of texts in natural language. Our main working hypothesis will be that the base of a grammar contains, or is identical with, a (natural) logic, and that such a ‘logic’ should most appropriately take the form of what might be called a text logic. Besides a more general discussion of the relations between logic and linguistics and of the present status of logical systems in grammatical research, we will specify some of the main tasks and requirements of a text logic. Special attention will be given to the abstract structures underlying description, quantification and identity in texts. These problems are also relevant in standard sentence grammars, but it will be argued that they can be better formulated in the framework of a text grammar and its corresponding text logic, thus leading to a number of interesting generalizations. It must be underlined that our attempt is provisional and often highly speculative, and that the treatment of the different problems will remain rather informal.
By
Arnold, Doug; Moffat, Dave; Sadler, Louisa; Way, Andrew
2 Citations
A Test Suite (TS) is typically a collection of Natural Language sentences against which the coverage of a Natural Language Processing system can be evaluated. We describe a method by which such suites can be produced automatically, involving a modification and extension of the Definite Clause Grammar formalism, and describe some of the advantages of the method over the traditional method of manual construction.
By
Zeevat, Henk
1 Citation
Bayesian interpretation is a technique in signal processing, and its application to natural language semantics and pragmatics (BNLSP from here on, and BNLI if there is no particular emphasis on semantics and pragmatics) is basically an engineering decision. It is a cognitive science hypothesis that humans emulate BNLSP. That hypothesis offers a new perspective on the logic of interpretation and the recognition of other people’s intentions in interhuman communication. The hypothesis also has the potential of changing linguistic theory, because the mapping from meaning to form becomes the central one to capture in accounts of phonology, morphology and syntax. Semantics is essentially read off from this mapping and pragmatics is essentially reduced to probability maximization within Grice’s intention recognition. Finally, the stochastic models used can be causal, thus incorporating new ideas on the analysis of causality using Bayesian nets. The paper explores and connects these different ways of being committed to BNLSP.
By
Benthem, Johan
Now that the notions of ‘point structure’ and ‘period structure’ have been developed to some extent, it becomes of interest to relate the two in a systematic fashion. For this purpose, once more, here are the relevant notions as they evolved in the previous discussions.
By
Benthem, Johan
The Priorean language of the preceding chapter is not inextricably tied up with the usual point ontology: it may also be interpreted in period structures. Indeed, the same truth definition II.2.1.2. works as well in the latter context.
By
Rijke, Maarten; Balog, Krisztian; Bogers, Toine; Bosch, Antal
1 Citation
Entity profiling is the task of identifying and ranking descriptions of a given entity. The task may be viewed as one where the descriptions being sought are terms that need to be selected from a knowledge source (such as an ontology or thesaurus). In this case, entity profiling systems can be assessed by means of precision and recall values of the descriptive terms produced. However, recent evidence suggests that more sophisticated metrics are needed that go beyond mere lexical matching of system-produced descriptors against a ground truth, allowing for graded relevance and rewarding diversity in the list of descriptors returned. In this note, we motivate and propose such a metric.
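As a point of reference for the baseline assessment mentioned in this abstract, the following is a minimal sketch of set-based precision and recall over descriptive terms. It is not the metric the note proposes; the function name `precision_recall` and the example terms are hypothetical and chosen only for illustration.

```python
def precision_recall(produced, ground_truth):
    """Set-based precision and recall of produced descriptors.

    Both arguments are iterables of term strings; matching is exact
    (purely lexical), which is the limitation the abstract points out.
    """
    produced_set = set(produced)
    truth_set = set(ground_truth)
    hits = produced_set & truth_set
    precision = len(hits) / len(produced_set) if produced_set else 0.0
    recall = len(hits) / len(truth_set) if truth_set else 0.0
    return precision, recall


# Hypothetical descriptors for one entity.
system_terms = ["information retrieval", "machine learning", "logic", "music"]
gold_terms = ["information retrieval", "machine learning", "evaluation"]
p, r = precision_recall(system_terms, gold_terms)
```

Because matching is exact and unranked, this baseline gives no credit for near-synonyms and ignores both ordering and diversity, which is the gap the abstract describes.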
By
Dijk, Teun A.
Dialogues are verbal interaction sequences performed among language users. In order to be able to adequately accomplish their respective actions which constitute a dialogue, these language users must ‘go through’ a number of highly complex cognitive processes. It is the aim of this paper to briefly discuss some of the properties of the processes and representations involved in the cognitive management of dialogues.
By
Venema, Yde
6 Citations
We give a characterization of the simple, and of the subdirectly irreducible boolean algebras with operators (including modal algebras), in terms of the dual descriptive frame, or, topological relational structure. These characterizations involve a special binary topo-reachability relation on the dual structure; we call a point u a topo-root of the dual structure if every ultrafilter is topo-reachable from u. We prove that a boolean algebra with operators is simple iff every point in the dual structure is a topo-root; and that it is subdirectly irreducible iff the collection of topo-roots is open and non-empty in the Stone topology on the dual structure iff this collection has non-empty interior in that topology.
By
Honing, Henkjan
1 Citation
This paper is about the importance of applying computational modeling and artificial intelligence techniques to music cognition and computer music research. The construction of microworlds as a methodology plays a key role in the different stages of this research. Several uses of microworlds are described. Microworlds have been criticized in the domains of artificial intelligence and the cognitive sciences, but this critique has to be seen in its proper context (i.e. in modeling of human intelligence, not as a methodology). It is shown that the microworld approach is still an important methodology in music cognition and computer music research, and a promising strategy in the design of a general representation formalism of musical knowledge.
By
Aarts, Erik
5 Citations
In the Lambek calculus of order 2 we allow only sequents in which the depth of nesting of implications is limited to 2. We prove that the decision problem of provability in the calculus can be solved in time polynomial in the length of the sequent. A normal form for proofs of second order sequents is defined. It is shown that for every proof there is a normal form proof with the same axioms. With this normal form we can give an algorithm that decides provability of sequents in polynomial time.
By
Venema, Yde
6 Citations
We prove that every abstractly defined game algebra can be represented as an algebra of consistent pairs of monotone outcome relations over a game board. As a corollary we obtain Goranko's result that van Benthem's conjectured axiomatization for equivalent game terms is indeed complete.
By
Madsen, Mathias Winther
The N400 and the P600 are two patterns of electrical brain potentials which can sometimes be found when people read or hear unexpected words. They have been widely claimed to be the neurological correlates of semantic and syntactic anomalies, respectively, but evidence accumulated over the last decade has raised some serious doubts about that interpretation. In this paper, I first review some of this evidence and then present an alternative way to think about the issue. My suggestion is built on Shannon’s concept of noisy-channel decoding by tables of typical sets, and it is thus fundamentally statistical in nature. I show that a proper application of Shannon’s concepts to the reading process provides an interesting reinterpretation of our notion of “syntax,” thus questioning some fundamental assumptions of linguistics.
By
Benthem, Johan
The concept of a ‘point in time’ is an extremely abstract one, just as that of ‘point in space’. Centuries of school geometry and physics have made us familiar with this abstraction, but it remains a fact that even the strongest phrases of our ordinary language (‘the very moment that’, ‘right then’) refer to some (small) period. Relative size does not take one any nearer to the point level: ‘small’ is a pragmatic qualification. The split second it takes you to close your eyes may well mean more than a lifetime to some elementary particles in your eyelids. Similarly, there is no ‘punctual’ present; ‘right now’ being a small period of immediate awareness. Indeed, the phenomenon of awareness itself implies duration.
By
Benthem, Johan
The Priorean research program has blossomed into a whole logical discipline, its slightly shaky motivation notwithstanding. The enterprise is described in various textbooks already; whence a new introduction is not intended here. This chapter will rather contain a presentation stressing the general layout and spirit of the subject, providing a perspective upon contemporary technical research. As these considerations are not restricted to this special field, here is a sketch of the underlying view of logic in general.
By
Jongh, Dick; Montagna, Franco
2 Citations
It is shown that for arithmetical interpretations that may include free variables it is not the Guaspari-Solovay system R that is arithmetically complete, but their system R^{−}. This result is then applied to obtain the nonvalidity of some rules under arithmetical interpretations including free variables, and to show that some principles concerning Rosser orderings with free variables cannot be decided, even if one restricts oneself to “usual” proof predicates.
By
Abnar, Samira; Dehghani, Mostafa; Shakery, Azadeh
2 Citations
Text alignment is one of the main steps of plagiarism detection in textual environments. Considering the pattern in the distribution of the common semantic elements of the two given documents, different strategies may be suitable for this task. In this paper we assume that the obfuscation level, i.e., the plagiarism type, is a function of the distribution of the common elements in the two documents. Based on this assumption, we propose Meta Text Aligner, which predicts the plagiarism relation of two given documents and employs the prediction results to select the best text alignment strategy. Thus, it will potentially perform better than the existing methods, which use the same strategy for all cases. As indicated by the experiments, we have been able to classify document pairs based on plagiarism type with a precision of $$89\%$$. Furthermore, by exploiting the predictions of the classifier to choose the proper method or the optimal configuration for each type, we have been able to improve the Plagdet score of the existing methods.
By
Spaan, Edith
4 Citations
In [4], Ladner investigated the complexity of the provability problems for modal logics. In particular, he showed that provability in all modal logics between K and S4 is PSPACE-hard, and he constructed polynomial space bounded algorithms for deciding provability in K, T, and S4, which implies that the provability problems for these logics are PSPACE-complete.
By
Rijke, Maarten
In [6] Albert Visser shows that ILP completely axiomatizes all schemata about provability and relative interpretability that are provable in finitely axiomatized theories. In this paper we introduce a system called ILP^{ω} that completely axiomatizes the arithmetically valid principles of provability in and interpretability over such theories. To prove the arithmetical completeness of ILP^{ω} we use a suitable kind of tail models; as a byproduct we obtain a somewhat modified proof of Visser's completeness result.
By
Janssen, Theo; Kok, Gerard; Meertens, Lambert
2 Citations
Various restrictions on transformational grammars have been investigated in order to reduce their generative power from recursively enumerable languages to recursive languages.
It will be shown that any restriction on transformational grammars defining a recursively enumerable subset of the set of all transformational grammars, is either too weak (in the sense that there does not exist a general decision procedure for all languages generated under such a restriction) or too strong (in the sense that there exists a recursive language that cannot be generated by any transformational grammar thus restricted). In addition, some related problems will be discussed.
By
Benthem, Johan
In this chapter the traditional temporal structures will be studied, consisting of points in time ordered by a relation of precedence (‘earlier’, ‘before’).
By
Benthem, Johan
Even the technical working languages of Part I were formalized fragments of natural language. The actual variety of temporal expressions is much greater, however, if only because of the interaction between temporal expressions and other linguistic constructions. A short reconnaissance will be made here.
By
Benthem, Johan
Points and periods have been studied in great technical detail in the previous chapters. This final chapter of Part I will be simpler and more speculative.
By
Prüst, Hub; Scha, Remko; Berg, Martin
13 Citations
We argue that an adequate treatment of verb phrase anaphora (VPA) must depart in two major respects from the standard approaches. First of all, VP anaphors cannot be resolved by simply identifying the anaphoric VP with an antecedent VP. The resolution process must establish a syntactic/semantic parallelism between larger units (clauses or discourse constituent units) that the VPs occur in. Secondly, discourse structure has a significant influence on the reference possibilities of VPA. This influence must be accounted for.
We propose a treatment which meets these requirements. It builds on a discourse grammar which characterizes discourse cohesion by means of a syntactic/semantic matching procedure which recognizes parallel structures in discourse. It turns out that this independently motivated procedure yields the resolution of VPA as a side effect.
By
Benthem, Johan
The two temporal ontologies of Part I turned out to be related, in Chapter 1.4. Now that the languages of tense logic have been interpreted at both levels, there arises the matter of their connections as well. Or, in more familiar methodological terms: are there any interesting reductions to be found between temporal discourse at the ‘macro-level’ of periods and at the ‘micro-level’ of points? A first speculative study is all that is undertaken in this chapter.
By
Jongh, Dick; Jumelet, Marc; Montagna, Franco
9 Citations
Solovay's 1976 completeness result for modal provability logic employs the recursion theorem in its proof. It is shown that the uses of the recursion theorem can in this proof be replaced by the diagonalization lemma for arithmetic and that, in effect, the proof neatly fits the framework of another, enriched, system of modal logic (the so-called Rosser logic of Guaspari-Solovay, 1979) so that any arithmetical system for which this logic is sound is strong enough to carry out the proof, in particular IΔ_{0}+EXP. The method is adapted to obtain a similar completeness result for the Rosser logic.
By
Benthem, Johan
In this chapter the main ingredients are surveyed that go into our temporal structures: individuals, relations and operations. Some of the more interesting ones will be selected for further logical investigation.
By
Schuth, Anne; Balog, Krisztian; Kelly, Liadh
3 Citations
In this paper we report on the first Living Labs for Information Retrieval Evaluation (LL4IR) CLEF Lab. Our main goal with the lab is to provide a benchmarking platform for researchers to evaluate their ranking systems in a live setting with real users in their natural task environments. For this first edition of the challenge we focused on two specific use cases: product search and web search. Ranking systems submitted by participants were experimentally compared using interleaved comparisons to the production system from the corresponding use case. In this paper we describe how these experiments were performed, what the resulting outcomes are, and conclude with some lessons learned.
By
Baltag, A.; Cinà, G.
We give a definition of bisimulation for conditional modalities interpreted on selection functions and prove the correspondence between bisimilarity and modal equivalence, generalizing the Hennessy–Milner Theorem to a wide class of conditional operators. We further investigate the operators and semantics to which these results apply. First, we show how to derive a solid notion of bisimulation for conditional belief, behaving as desired both on plausibility models and on evidence models. These novel definitions of bisimulations are exploited in a series of undefinability results. Second, we treat relativized common knowledge, underlining how the same results still hold for a different modality in a different semantics. Third, we show the flexibility of the approach by generalizing it to multi-agent systems, encompassing the case of multi-agent plausibility models.
By
Huurnink, Bouke; Hofmann, Katja; Rijke, Maarten; Bron, Marc
2 Citations
We design and validate simulators for generating queries and relevance judgments for retrieval system evaluation. We develop a simulation framework that incorporates existing and new simulation strategies. To validate a simulator, we assess whether evaluation using its output data ranks retrieval systems in the same way as evaluation using real-world data. The real-world data is obtained using logged commercial searches and associated purchase decisions. While no simulator reproduces an ideal ranking, there is a large variation in simulator performance that allows us to distinguish those that are better suited to creating artificial testbeds for retrieval experiments. Incorporating knowledge about document structure in the query generation process helps create more realistic simulators.
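The validation idea in this abstract, checking whether simulated data ranks retrieval systems the same way as real-world data, can be sketched with a rank correlation. The abstract does not say which agreement measure was used; Kendall's tau (the simple tau-a variant, with no tie correction) is an assumption made here, and the system names and scores are hypothetical.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two system rankings.

    Each argument maps a system name to its evaluation score under one
    data source (for example real-world logs vs. a simulator). Only
    systems present in both mappings are compared. Returns a value in
    [-1, 1]; 1 means both sources order every pair of systems identically.
    """
    systems = sorted(set(scores_a) & set(scores_b))
    concordant = 0
    discordant = 0
    for s, t in combinations(systems, 2):
        product = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # Pairs tied under either source count as neither.
    pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / pairs if pairs else 0.0


# Hypothetical scores for three retrieval systems.
real_world = {"system_a": 0.30, "system_b": 0.25, "system_c": 0.20}
simulated = {"system_a": 0.80, "system_b": 0.90, "system_c": 0.10}
agreement = kendall_tau(real_world, simulated)
```

Under this reading, a simulator whose output yields a tau close to 1 against the real-world ranking would be the better candidate for building an artificial testbed.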
By
Meij, Edgar; Rijke, Maarten
We describe our participation in the 2008 CLEF Domain-Specific track. We evaluate blind relevance feedback models and concept models on the CLEF domain-specific test collection. Applying relevance modeling techniques is found to have a positive effect on the 2008 topic set, in terms of mean average precision and precision@10. Applying concept models for blind relevance feedback results in even bigger improvements over a query-likelihood baseline, in terms of mean average precision and early precision.
By
Grädel, Erich; Väänänen, Jouko
53 Citations
We introduce an atomic formula $${\vec{y} \bot_{\vec{x}}\vec{z}}$$ intuitively saying that the variables $${\vec{y}}$$ are independent from the variables $${\vec{z}}$$ if the variables $${\vec{x}}$$ are kept constant. We contrast this with dependence logic $${\mathcal{D}}$$ based on the atomic formula $${=\!(\vec{x}, \vec{y})}$$, actually equivalent to $${\vec{y} \bot_{\vec{x}}\vec{y}}$$, saying that the variables $${\vec{y}}$$ are totally determined by the variables $${\vec{x}}$$. We show that $${\vec{y} \bot_{\vec{x}}\vec{z}}$$ gives rise to a natural logic capable of formalizing basic intuitions about independence and dependence. We show that $${\vec{y} \bot_{\vec{x}}\vec{z}}$$ can be used to give partially ordered quantifiers and IF-logic an alternative interpretation without some of the shortcomings related to so-called signaling that interpretations using $${=\!(\vec{x}, \vec{y})}$$ have.
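The abstract does not spell out the semantics of the independence atom, so the sketch below assumes the usual team-semantics reading: a team (a set of assignments) satisfies $${\vec{y} \bot_{\vec{x}}\vec{z}}$$ if, for any two assignments that agree on $${\vec{x}}$$, the team also contains an assignment with the same $${\vec{x}}$$ values, the $${\vec{y}}$$ values of the first, and the $${\vec{z}}$$ values of the second. The function names and the dictionary representation of assignments are illustrative choices, not taken from the paper.

```python
def project(assignment, variables):
    """Values that an assignment (a dict) gives to a tuple of variables."""
    return tuple(assignment[v] for v in variables)


def satisfies_independence(team, x, y, z):
    """Check the independence atom  y _|_x z  on a team.

    `team` is a list of dicts mapping variable names to values;
    `x`, `y`, `z` are sequences of variable names. Assumed semantics:
    for all s, s2 in the team agreeing on x there is some s3 in the
    team agreeing with s on x and y, and with s2 on z.
    """
    rows = {(project(s, x), project(s, y), project(s, z)) for s in team}
    for (x1, y1, _) in rows:
        for (x2, _, z2) in rows:
            if x1 == x2 and (x1, y1, z2) not in rows:
                return False
    return True


def satisfies_dependence(team, x, y):
    """Dependence atom =(x, y), expressed as  y _|_x y  as in the abstract."""
    return satisfies_independence(team, x, y, y)


# With x fixed, y and z range freely over {0, 1}: independent.
full_team = [
    {"x": 0, "y": 0, "z": 0},
    {"x": 0, "y": 0, "z": 1},
    {"x": 0, "y": 1, "z": 0},
    {"x": 0, "y": 1, "z": 1},
]
independent = satisfies_independence(full_team, ["x"], ["y"], ["z"])
```

Checking $${\vec{y} \bot_{\vec{x}}\vec{y}}$$ this way forces any two assignments that agree on $${\vec{x}}$$ to agree on $${\vec{y}}$$ as well, which is why the abstract can treat the dependence atom as a special case.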
By
van Benthem, Johan; Bezhanishvili, Guram; Gehrke, Mai
11 Citations
For a Euclidean space $$\mathbb{R}^n$$, let $$L_n$$ denote the modal logic of chequered subsets of $$\mathbb{R}^n$$. For every $$n \geq 1$$, we characterize $$L_n$$ using the more familiar Kripke semantics, thus implying that each $$L_n$$ is a tabular logic over the well-known modal system Grz of Grzegorczyk. We show that the logics $$L_n$$ form a decreasing chain converging to the logic $$L_\infty$$ of chequered subsets of $$\mathbb{R}^\infty$$. As a result, we obtain that $$L_\infty$$ is also a logic over Grz, and that $$L_\infty$$ has the finite model property. We conclude the paper by extending our results to the modal language enriched with the universal modality.
By
Bezhanishvili, Guram; Bezhanishvili, Nick; Ilin, Julia
4 Citations
We generalize the $(\wedge, \vee)$-canonical formulas to $(\wedge, \vee)$-canonical rules, and prove that each intuitionistic multi-conclusion consequence relation is axiomatizable by $(\wedge, \vee)$-canonical rules. This yields a convenient characterization of stable superintuitionistic logics. The $(\wedge, \vee)$-canonical formulas are analogues of the $(\wedge, \to)$-canonical formulas, which are the algebraic counterpart of Zakharyaschev’s canonical formulas for superintuitionistic logics (si-logics for short). Consequently, stable si-logics are analogues of subframe si-logics. We introduce cofinal stable intuitionistic multi-conclusion consequence relations and cofinal stable si-logics, thus answering the question of what the analogues of cofinal subframe logics should be. This is done by utilizing the $(\wedge, \vee, \neg)$-reduct of Heyting algebras. We prove that every cofinal stable si-logic has the finite model property, and that there are continuum many cofinal stable si-logics that are not stable. We conclude with several examples showing the similarities and differences between the classes of stable, cofinal stable, subframe, and cofinal subframe si-logics.
By
Koolen, Marijn; Bogers, Toine; Gäde, Maria; Hall, Mark; Huurdeman, Hugo; Kamps, Jaap; Skov, Mette; Toms, Elaine; Walsh, David
3 Citations
The Social Book Search (SBS) Lab investigates book search in scenarios where users search with more than just a query, and look for more than objective metadata. Real-world information needs are generally complex, yet almost all research focuses instead on either relatively simple search based on queries or recommendation based on profiles. The goal is to research and develop techniques to support users in complex book search tasks. The SBS Lab has two tracks. The aim of the Suggestion Track is to develop test collections for evaluating ranking effectiveness of book retrieval and recommender systems. The aim of the Interactive Track is to develop user interfaces that support users through each stage during complex search tasks and to investigate how users exploit professional metadata and user-generated content.
By
Ihwe, Jens
Four contexts are specified in which the notion of ‘text’ is to be treated: text-grammars, text-typologies, text-processing, and text-didactics.
These four contexts, in this particular ordering, also imply the thesis that the introduction of this notion will only be of operational value if it has been sufficiently anchored both empirically and applicationally.
The paradigm chosen is that of the study of literature. Here it can be shown with all clarity, how motivations and aims both inherent in as well as external to linguistics are to be related to assure the empirical relevance of the basic data, of concept definition, and of theory construction.
The guiding point of view is that linguistics must develop the general frame within which explication and description of the complex phenomenon ‘text-processing’ may be pursued. Founded on an empirically anchored (i.e. heuristical-experimental) basis of specification (‘text-grammar’), specific domains of research within all so-called language-centred disciplines may then be delimited (‘text-typologies’). As guiding points of view, at least for the paradigm chosen here, perspectives of application, in particular those of a didactic nature, are outlined (‘text-didactics’).
By
Benthem, Johan
In this survey and position paper, we discuss some issues in logical modeling of interactive behavior. We draw together a number of lines in current logics for social action, emphasizing uses of ‘small models’ rather than complex spaces.
By
Benthem, Johan
1 Citations
Possible worlds semantics for Modal Logic evolved in the fifties, through the work of Kanger, Hintikka and Kripke. Its main ideas are the use of possible worlds (which may stand for worlds in some grand sense, but also for points in time, situations, information stages or computer states), structured by a pattern of accessibility, with individual objects living in domains per world and having properties there, which may change in passing from one world to another. In propositional modal logic, where only worlds and accessibility matter (plus a ‘valuation’ for interpreting atomic propositions over the whole pattern), this picture has always seemed perfectly obvious. But in modal predicate logic, there has been recurrent debate concerning appropriate choices to be made in the semantics, starting from early doubts in Quine [25] about the very coherence of ascribing necessary properties to objects, and continuing into the sixties and seventies with various accounts of ‘trans-world identity’ for individuals across worlds. Noticeable are the ‘counterpart theory’ of Lewis [21], denying that objects can sensibly be identical across different worlds, or the ‘rigid designation’ theory in Kripke [20], affirming that only such objects make sense. More elaborate accounts of various possible approaches are given in Fine [9] and Garson [13]. Thus, the philosophical literature shows a variety of possible options in the semantics of modal predicate logic, concerning both the nature of individuals, and the interpretation of necessary propositions concerning them. Still, a widespread standard view exists (cf. Hughes and Cresswell [17]) with domains growing along accessibility patterns (i.e., whenever $xRy$, then $D_x \subseteq D_y$), and calling a statement $\Box\varphi(x_1, \ldots, x_n)$ true of objects $d_1, \ldots, d_n$ in a world iff $\varphi$ holds of those same objects in all accessible worlds.
By
Ditmarsch, Hans; Knight, Sophia; Özgün, Aybüke
1 Citations
In this work, we present a multi-agent logic of knowledge and change of knowledge interpreted on topological structures. Our dynamics are of the so-called semi-private character where a group G of agents is informed of some piece of information $\varphi$, while all the other agents observe that group G is informed, but are uncertain whether the information provided is $\varphi$ or $\lnot \varphi$. This article follows up on our prior work (van Ditmarsch et al. in Proceedings of the 15th TARK. pp 95–102, 2015) where the dynamics were public events. We provide a complete axiomatization of our logic, and give two detailed examples of situations with agents learning information through semi-private announcements.
By
Braschler, Martin; Choukri, Khalid; Ferro, Nicola; Hanbury, Allan; Karlgren, Jussi; Müller, Henning; Petras, Vivien; Pianta, Emanuele; Rijke, Maarten; Santucci, Giuseppe
1 Citations
Participative Research labOratory for Multimedia and Multilingual Information Systems Evaluation (PROMISE) is a Network of Excellence, starting in conjunction with this first independent CLEF 2010 conference, and designed to support and develop the evaluation of multilingual and multimedia information access systems, largely through the activities taking place in the Cross-Language Evaluation Forum (CLEF) today, and taking it forward in important new ways.
PROMISE is coordinated by the University of Padua, and comprises 10 partners: the Swedish Institute for Computer Science, the University of Amsterdam, Sapienza University of Rome, University of Applied Sciences of Western Switzerland, the Information Retrieval Facility, the Zurich University of Applied Sciences, the Humboldt University of Berlin, the Evaluation and Language Resources Distribution Agency, and the Centre for the Evaluation of Language Communication Technologies.
The single most important step forward for multilingual and multimedia information access which PROMISE will work towards is to provide an open evaluation infrastructure in order to support automation and collaboration in the evaluation process.
By
Groenendijk, Jeroen; Stokhof, Martin
11 Citations
The aim of this paper is a modest one. In what follows, we will argue that if one takes into consideration certain constructions involving interrogatives, a flexible approach to the relationship between syntactic categories and semantic types may be of great help. More in particular, we will try to show that if one uses something like an orthodox intensional type theory as one’s semantic tool, a more liberal association between syntactic categories and semantic types becomes imperative. However, we will also see that such flexibility is by no means easily introduced into the grammar, and that it needs to be properly checked in order to avoid undesirable consequences.
By
Jijkoun, Valentin; Rijke, Maarten
3 Citations
This paper describes the WebCLEF 2007 task. The task definition—which goes beyond traditional navigational queries and is concerned with undirected information search goals—combines insights gained at previous editions of WebCLEF and of the WiQA pilot that was run at CLEF 2006. We detail the task, the assessment procedure and the results achieved by the participants.
By
Amigó, Enrique; Carrillo-de-Albornoz, Jorge; Chugur, Irina; Corujo, Adolfo; Gonzalo, Julio; Meij, Edgar; Rijke, Maarten; Spina, Damiano
12 Citations
This paper describes the organisation and results of RepLab 2014, the third competitive evaluation campaign for Online Reputation Management systems. This year the focus lay on two new tasks: reputation dimensions classification and author profiling, which complement the aspects of reputation analysis studied in the previous campaigns. The participants were asked (1) to classify tweets applying a standard typology of reputation dimensions and (2) to categorise Twitter profiles by type of author as well as rank them according to their influence. New data collections were provided for the development and evaluation of systems that participated in this benchmarking activity.
By
Benthem, Johan; Gerbrandy, Jelle; Kooi, Barteld
22 Citations
Current dynamic-epistemic logics model different types of information change in multi-agent scenarios. We generalize these logics to a probabilistic setting, obtaining a calculus for multi-agent update with three natural slots: prior probability on states, occurrence probabilities in the relevant process taking place, and observation probabilities of events. To match this update mechanism, we present a complete dynamic logic of information change with a probabilistic character. The completeness proof follows a compositional methodology that applies to a much larger class of dynamic-probabilistic logics as well. Finally, we discuss how our basic update rule can be parameterized for different update policies, or learning methods.
By
Schuth, Anne; Marx, Maarten
2 Citations
We introduce two metrics aimed at evaluating systems that select facet-values for a faceted search interface. Facet-values are the values of metadata fields in semi-structured data and are commonly used to refine queries. It is often the case that there are more facet-values than can be displayed to a user and thus a selection has to be made. Our metrics evaluate these selections based on binary relevance assessments for the documents in a collection. Both our metrics are based on Normalized Discounted Cumulated Gain, an often-used Information Retrieval metric.
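For readers unfamiliar with the underlying measure, the following is a minimal sketch of Normalized Discounted Cumulated Gain over a ranked list of gain values. It is not the authors' facet-value metrics. It assumes the commonly used `log2(rank + 1)` discount, and the function names `dcg` and `ndcg` are illustrative only.

```python
import math


def dcg(gains):
    """Discounted cumulated gain of a ranked list of gain values.

    Assumes the common log2(rank + 1) discount, with ranks starting at 1.
    """
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains):
    """DCG divided by the DCG of the ideal (descending) ordering of the same gains."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


# Binary relevance: 1 = relevant, 0 = not relevant.
perfect = ndcg([1, 1, 0])   # relevant items ranked first
swapped = ndcg([0, 1])      # the only relevant item ranked second
```

With binary gains, a ranking that places all relevant items first scores 1, and any ranking that pushes a relevant item down scores strictly less.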
By
Benthem, J. van; Bezhanishvili, G.; Cate, B. ten; Sarenac, D.
14 Citations
We introduce the horizontal and vertical topologies on the product of topological spaces, and study their relationship with the standard product topology. We show that the modal logic of products of topological spaces with horizontal and vertical topologies is the fusion S4 ⊕ S4. We axiomatize the modal logic of products of spaces with horizontal, vertical, and standard product topologies. We prove that both of these logics are complete for the product of rational numbers ℚ × ℚ with the appropriate topologies.
By
Giampiccolo, Danilo; Forner, Pamela; Herrera, Jesús; Peñas, Anselmo; Ayache, Christelle; Forascu, Corina; Jijkoun, Valentin; Osenova, Petya; Rocha, Paulo; Sacaleanu, Bogdan; Sutcliffe, Richard
9 Citations
The fifth QA campaign at CLEF [1], having its first edition in 2003, offered not only a main task but an Answer Validation Exercise (AVE) [2], which continued last year’s pilot, and a new pilot: the Question Answering on Speech Transcripts (QAST) [3, 15]. The main task was characterized by the focus on cross-linguality, while covering as many European languages as possible. As a novelty, some QA pairs were grouped in clusters. Every cluster was characterized by a topic (not given to participants). The questions from a cluster may contain coreferences between one of them and the others. Finally, the need for searching answers in web formats was satisfied by introducing Wikipedia as document corpus. The results and the analyses reported by the participants suggest that the introduction of Wikipedia and the topic-related questions led to a drop in systems’ performance.
By
Benthem, Johan; Bezhanishvili, Nick; Enqvist, Sebastian
We propose a new perspective on logics of computation by combining instantial neighborhood logic $\mathsf{INL}$ with bisimulation safe operations adapted from $\mathsf{PDL}$. $\mathsf{INL}$ is a recent modal logic, based on an extended neighborhood semantics which permits quantification over individual neighborhoods plus their contents. This system has a natural interpretation as a logic of computation in open systems. Motivated by this interpretation, we show that a number of familiar program constructors can be adapted to instantial neighborhood semantics to preserve invariance for instantial neighborhood bisimulations, the appropriate bisimulation concept for $\mathsf{INL}$. We also prove that our extended logic $\mathsf{IPDL}$ is a conservative extension of dual-free game logic, and its semantics generalizes the monotone neighborhood semantics of game logic. Finally, we provide a sound and complete system of axioms for $\mathsf{IPDL}$, and establish its finite model property and decidability.
By
De Gooijer, Jan G.; Laan, Nancy M.
This paper studies the problem of detecting multiple changes at unknown times in the mean level of elision in the trimeter sequences of the Orestes, a play written by the Ancient Greek dramatist Euripides (485–406 B.C.). Change-detection statistics proposed by MacNeill (1978) and Jandhayala and MacNeill (1991) are adopted for this purpose. Analysis of the trimeter sequences yields several points of change. A general explanation for their occurrence appears to be that Euripides varies his use of elision according to the emotional content of his text, i.e., he seems to change the form to support the content and, thus, seems to use elision frequency as a dramatic instrument.
By
Giannakidou, Anastasia
75 Citations
Limited distribution phenomena related to negation and negative polarity are usually thought of in terms of affectivity where affective is understood as negative or downward entailing. In this paper I propose an analysis of affective contexts as nonveridical and treat negative polarity as a manifestation of the more general phenomenon of sensitivity to (non)veridicality (which is, I argue, what affective dependencies boil down to). Empirical support for this analysis will be provided by a detailed examination of affective dependencies in Greek, but the distribution of any will also be shown to follow from (non)veridicality.
By
Benthem, Johan; Doets, Kees
11 Citations
What is nowadays the central part of any introduction to logic, and indeed to some the logical theory par excellence, used to be a modest fragment of the more ambitious language employed in the logicist program of Frege and Russell. ‘Elementary’ or ‘first-order’, or ‘predicate logic’ only became a recognized stable base for logical theory by 1930, when its interesting and fruitful meta-properties had become clear, such as completeness, compactness and Löwenheim-Skolem. Richer higher-order and type theories receded into the background, to such an extent that the (re)discovery of useful and interesting extensions and variations upon first-order logic came as a surprise to many logicians in the sixties.
By
van Rooy, Robert
35 Citations
Why do we ask questions? Because we want to have some information. But why this particular kind of information? Because only information of this particular kind is helpful to resolve the decision problem that the agent faces. In this paper I argue that questions are asked because their answers help to resolve the questioner's decision problem, and that this assumption helps us to interpret interrogative sentences. Interrogative sentences are claimed to have a semantically underspecified meaning and this underspecification is resolved by means of the decision problem.
By
Bezhanishvili, Nick; Jongh, Dick
2 Citations
We give alternative characterizations of exact, extendible and projective formulas in intuitionistic propositional calculus IPC in terms of n-universal models. From these characterizations we derive a new syntactic description of all extendible formulas of IPC in two variables. For the formulas in two variables we also give an alternative proof of Ghilardi’s theorem that every extendible formula is projective.
By
Dekker, Paul
3 Citations
In this paper I revive two important formal approaches to the interpretation of natural language, that of Montague and that of Karttunen and Peters. Armed with insights from dynamic semantics (Heim, Krifka) the two turn out to stand up against age-old criticisms in an orthodox fashion. The plan is mainly methodological, as I only want to illustrate the technical feasibility of the revived proposals. Even so, there are illuminating and welcome empirical consequences on the subject of scope islands (as discussed by Abusch and Kratzer, among many others), as well as unintended theoretical implications in the contextualist debate (Grice, Recanati, Simons, Stanley, and many others again).
By
Van Rooy, Robert
6 Citations
In this paper I argue that anaphoric pronouns should always be interpreted exhaustively. I propose that pronouns are either used referentially and refer to the speaker's referents of their antecedent indefinites, or descriptively and go proxy for the description recoverable from its antecedent clause. I show how this view can be implemented within a dynamic semantics, and how it can account for various examples that seemed to be problematic for the view that for all unbound pronouns there always should be a notion of exhaustivity/uniqueness involved. The uniqueness assumption for the use of singular pronouns is also shown to be important to explain what the discourse referents used in dynamic semantics represent.
By
Jijkoun, Valentin; Rijke, Maarten
1 Citations
We describe our participation in the WebCLEF 2007 task, targeted at snippet retrieval from web data. Our system ranks snippets based on a simple similarity-based centrality, inspired by the web page ranking algorithms. We experimented with retrieval units (sentences and paragraphs) and with the similarity functions used for centrality computations (word overlap and cosine similarity). We found that using paragraphs with the cosine similarity function shows the best performance with precision around 20% and recall around 25% according to human assessments of the first 7,000 bytes of responses for individual topics.
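The general idea of similarity-based centrality can be sketched as follows. This is an illustrative simplification, not the system described in the abstract. It assumes whitespace tokenization and raw term-frequency vectors, and it scores each snippet by its summed cosine similarity to all other snippets (a degree-style centrality). The names `cosine` and `rank_by_centrality` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_by_centrality(snippets):
    """Order snippets by summed cosine similarity to every other snippet.

    Assumption: lowercase whitespace tokens and raw term counts; a real
    system would likely use better tokenization and term weighting.
    """
    vectors = [Counter(s.lower().split()) for s in snippets]
    scores = [
        sum(cosine(v, w) for j, w in enumerate(vectors) if j != i)
        for i, v in enumerate(vectors)
    ]
    order = sorted(range(len(snippets)), key=lambda i: scores[i], reverse=True)
    return [snippets[i] for i in order]


ranked = rank_by_centrality(["the cat sat", "the cat ran", "stock prices fell"])
```

Snippets that share vocabulary with many others rise to the top, while an unrelated snippet such as the third one above sinks to the bottom.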
By
Benthem, Johan; Alechina, Natasha
The semantics for quantifiers described in this paper can be viewed both as a new semantics for generalized quantifiers and as a new look at standard first-order quantification, bringing the latter closer to modal logic.
By
ArlóCosta, Horacio; Pacuit, Eric
10 Citations
The paper focuses on extending to the first order case the semantical program for modalities first introduced by Dana Scott and Richard Montague. We focus on the study of neighborhood frames with constant domains and we offer in the first part of the paper a series of new completeness results for salient classical systems of first order modal logic. Among other results we show that it is possible to prove strong completeness results for normal systems without the Barcan Formula (like FOL + K) in terms of neighborhood frames with constant domains. The first order models we present permit the study of many epistemic modalities recently proposed in computer science as well as the development of adequate models for monadic operators of high probability. Models of this type are either difficult or impossible to build in terms of relational Kripkean semantics [40].
We conclude by introducing general first order neighborhood frames with constant domains and we offer a general completeness result for the entire family of classical first order modal systems in terms of them, circumventing some well-known problems of propositional and first order neighborhood semantics (mainly the fact that many classical modal logics are incomplete with respect to an unmodified version of either neighborhood or relational frames). We argue that the semantical program that thus arises offers the first complete semantic unification of the family of classical first order modal logics.
By
Baltag, Alexandru; Christoff, Zoé; Rendsvig, Rasmus K.; Smets, Sonja
We take a logical approach to threshold models, used to study the diffusion of opinions, new technologies, infections, or behaviors in social networks. Threshold models consist of a network graph of agents connected by a social relationship and a threshold value which regulates the diffusion process. Agents adopt a new behavior/product/opinion when the proportion of their neighbors who have already adopted it meets the threshold. Under this diffusion policy, threshold models develop dynamically towards a guaranteed fixed point. We construct a minimal dynamic propositional logic to describe the threshold dynamics and show that the logic is sound and complete. We then extend this framework with an epistemic dimension and investigate how information about more distant neighbors’ behavior allows agents to anticipate changes in behavior of their closer neighbors. Overall, our logical formalism captures the interplay between the epistemic and social dimensions in social networks.
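The diffusion rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' formalism: the network is a dictionary mapping each agent to the set of its neighbors, adoption is irreversible, and "meets the threshold" is read as "proportion greater than or equal to the threshold". The names `threshold_step` and `run_to_fixed_point` are hypothetical.

```python
def threshold_step(neighbors, adopted, threshold):
    """One synchronous update: non-adopters adopt if enough neighbors have."""
    new = set(adopted)
    for agent, nbrs in neighbors.items():
        if agent not in adopted and nbrs:
            # Proportion of this agent's neighbors who have already adopted.
            if len(nbrs & adopted) / len(nbrs) >= threshold:
                new.add(agent)
    return new


def run_to_fixed_point(neighbors, adopted, threshold):
    """Iterate the update until the set of adopters stops changing.

    Adopters never un-adopt in this sketch, so on a finite network the
    adopter set can only grow and the loop must terminate.
    """
    current = set(adopted)
    while True:
        nxt = threshold_step(neighbors, current, threshold)
        if nxt == current:
            return current
        current = nxt


# A line network a - b - c - d, seeded with a single adopter.
line = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
spread = run_to_fixed_point(line, {"a"}, 0.5)
stuck = run_to_fixed_point(line, {"a"}, 0.6)
```

On this line network a threshold of 0.5 lets the behavior cascade to every agent, whereas 0.6 blocks it at the first step, since agent b only ever sees half of its neighbors adopting.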
By
Berendsen, Richard; Tsagkias, Manos; Rijke, Maarten; Meij, Edgar
1 Citations
Pseudo test collections are automatically generated to provide training material for learning to rank methods. We propose a method for generating pseudo test collections in the domain of digital libraries, where data is relatively sparse, but comes with rich annotations. Our intuition is that documents are annotated to make them better findable for certain information needs. We use these annotations and the associated documents as a source for pairs of queries and relevant documents. We investigate how learning to rank performance varies when we use different methods for sampling annotations, and show how our pseudo test collection ranks systems compared to editorial topics with editorial judgements. Our results demonstrate that it is possible to train a learning to rank algorithm on generated pseudo judgments. In some cases, performance is on par with learning on manually obtained ground truth.
By
Bezhanishvili, G.; Bezhanishvili, N.; LuceroBryan, J.; Mill, J.
It is a landmark theorem of McKinsey and Tarski that if we interpret modal diamond as closure (and hence modal box as interior), then $\mathsf{S4}$ is the logic of any dense-in-itself metrizable space. The McKinsey–Tarski Theorem relies heavily on a metric that gives rise to the topology. We give a new and more topological proof of the theorem, utilizing Bing’s Metrization Theorem.
By
Grotov, Artem; Chuklin, Aleksandr; Markov, Ilya; Stout, Luka; Xumara, Finde; Rijke, Maarten
7 Citations
Click models have become an essential tool for understanding user behavior on a search engine result page, running simulated experiments and predicting relevance. Dozens of click models have been proposed, all aiming to tackle problems stemming from the complexity of user behavior or of contemporary result pages. Many models have been evaluated using proprietary data, hence the results are hard to reproduce. The choice of baseline models is not always motivated and the fairness of such comparisons may be questioned. In this study, we perform a detailed analysis of all major click models for web search ranging from very simplistic to very complex. We employ a publicly available dataset, open-source software and a range of evaluation techniques, which makes our results both representative and reproducible. We also analyze the query space to show what type of queries each model can handle best.
By
Jijkoun, Valentin; Hofmann, Katja; Ahn, David; Khalid, Mahboob Alam; Rantwijk, Joris; Rijke, Maarten; Tjong Kim Sang, Erik
We describe a new version of our question answering system, which was applied to the questions of the 2007 CLEF Question Answering Dutch monolingual task. This year, we made three major modifications to the system: (1) we added the contents of Wikipedia to the document collection and the answer tables; (2) we completely rewrote the module interface code in Java; and (3) we included a new table stream which returned answer candidates based on information which was learned from question-answer pairs. Unfortunately, the changes did not lead to improved performance. Unsolved technical problems at the time of the deadline have led to missing justifications for a large number of answers in our submission. Our single run obtained an accuracy of only 8% with an additional 12% of unsupported answers (compared to 21% in the last year’s task).
By
Pacuit, Eric
4 Citations
Adam Brandenburger and H. Jerome Keisler have recently discovered a two-person Russell-style paradox. They show that the following configuration of beliefs is impossible: Ann believes that Bob assumes that Ann believes that Bob’s assumption is wrong. In [7] a modal logic interpretation of this paradox is proposed. The idea is to introduce two modal operators intended to represent the agents’ beliefs and assumptions. The goal of this paper is to take this analysis further and study this paradox from the point of view of a modal logician. In particular, we show that the paradox can be seen as a theorem of an appropriate hybrid logic.
By
Benthem, Johan
Dov Gabbay is not just a 50-year-old person, his name also denotes a phenomenon. I have felt his and its influence for many years: which are hereby gratefully acknowledged. Two of these influences are especially relevant for what follows. The first is Dov’s general view of modal logic as a theory of first-order definable operators over relational models (Gabbay [8]). The second is his work on labelled deduction as a general format for the proof theory of substructural logics with a resource-sensitive slant, be it categorial or dynamic (Gabbay [9]). This generalizes standard type theories, with their binary statements assigning types to terms, or proofs to propositions. The two themes are related. In my view, the following equation sums up much of Dov’s recent work.
By
Benthem, Johan Van
25 Citations
Taking Löb's Axiom in modal provability logic as a running thread, we discuss some general methods for extending modal frame correspondences, mainly by adding fixed-point operators to modal languages as well as their correspondence languages. Our suggestions are backed up by some new results – while we also refer to relevant work by earlier authors. But our main aim is advertizing the perspective, showing how modal languages with fixed-point operators are a natural medium to work with.
By
Amigó, Enrique; Carrillo de Albornoz, Jorge; Chugur, Irina; Corujo, Adolfo; Gonzalo, Julio; Martín, Tamara; Meij, Edgar; Rijke, Maarten; Spina, Damiano
20 Citations
This paper summarizes the goals, organization, and results of the second RepLab competitive evaluation campaign for Online Reputation Management Systems (RepLab 2013). RepLab focused on the process of monitoring the reputation of companies and individuals, and asked participant systems to annotate different types of information on tweets containing the names of several companies: first tweets had to be classified as related or unrelated to the entity; relevant tweets had to be classified according to their polarity for reputation (Does the content of the tweet have positive or negative implications for the reputation of the entity?), clustered in coherent topics, and clusters had to be ranked according to their priority (potential reputation problems had to come first). The gold standard consists of more than 140,000 tweets annotated by a group of trained annotators supervised and monitored by reputation experts.
By
Azarbonyad, Hosein; Saan, Ferron; Dehghani, Mostafa; Marx, Maarten; Kamps, Jaap
1 Citations
Text interestingness is a measure for assessing the quality of documents from the users’ perspective; it reflects their willingness to read a document. Different approaches have been proposed for measuring the interestingness of texts. Most of these approaches suppose that interesting texts are also topically diverse and estimate interestingness using topical diversity. In this paper, we investigate the relation between interestingness and topical diversity. We do this on the Dutch and Canadian parliamentary proceedings. We apply an existing measure of interestingness, which is based on structural properties of the proceedings (e.g., how much interaction there is between speakers in a debate). We then compute the correlation between this measure of interestingness and topical diversity.
Our main findings are that in general there is a relatively low correlation between interestingness and topical diversity; that there are two extreme categories of documents: highly interesting, but hardly diverse (focused interesting documents) and highly diverse but not interesting documents. When we remove these two extreme types of documents there is a positive correlation between interestingness and diversity.
By
Cuong, Hoang; Sima’an, Khalil
This paper focuses on the insensitivity of existing word alignment models to domain differences, which often yields suboptimal results on large heterogeneous data. A novel latent domain word alignment model is proposed, which induces domain-focused lexical and alignment statistics. We propose to train the model on a heterogeneous corpus under partial supervision, using a small number of seed samples from different domains. The seed samples allow estimating sharper, domain-focused word alignment statistics for sentence pairs. Our experiments show that the derived domain-focused statistics, once combined together, produce significant improvements both in word alignment accuracy and in translation accuracy of their resulting SMT systems. Going beyond the findings, we surmise that virtually any large corpus (e.g., Europarl, Hansards, Common Crawl) harbors an arbitrary diversity of hidden domains, unknown in advance. We address the novel challenge of unsupervised induction of hidden domains in parallel corpora, applied within a domain-focused word-alignment modeling framework. On the technical side, we contrast flat estimation for the unsupervised induction of domains to a simple form of hierarchical estimation, consisting of two steps aiming at avoiding bad local maxima. Extensive experiments, conducted over seven different language pairs with fully unsupervised induction of domains for word alignment, demonstrate significant improvements in alignment accuracy.
By
Lauridsen, Frederik M.
1 Citations
We characterise the intermediate logics which admit a cut-free hypersequent calculus of the form $\mathbf{HLJ} + \mathscr{R}$, where $\mathbf{HLJ}$ is the hypersequent counterpart of the sequent calculus $\mathbf{LJ}$ for propositional intuitionistic logic, and $\mathscr{R}$ is a set of so-called structural hypersequent rules, i.e., rules not involving any logical connectives. The characterisation of this class of intermediate logics is presented both in terms of the algebraic and the relational semantics for intermediate logics. We discuss various consequences, positive as well as negative, of this characterisation.
By
Krennmayr, Tina; Steen, Gerard
The VU Amsterdam Metaphor Corpus consists of manual annotations of metaphors in four different registers: news texts, fiction, academic texts, and conversations. The goal of building this corpus was to investigate which metaphors are used in which forms, in which discourse contexts, in which registers, and for which purposes. This chapter reports on the development of the annotation scheme and its physical representation, describes the annotation process, and reports on inter-annotator agreement and quality control as well as current usage of the corpus. It also includes some quantitative results on the interaction between metaphor, register, and word class.
By
van Rooy, Robert
51 Citations
In this paper I will discuss why (un)marked expressions typically get an (un)marked interpretation: Horn's division of pragmatic labor. It is argued that it is a conventional fact that we use language this way. This convention will be explained in terms of the equilibria of signalling games introduced by Lewis (1969), but now in an evolutionary setting. I will also relate this signalling game analysis with Parikh's (1991, 2000, 2001) game-theoretical analysis of successful communication, which in turn is compared with Blutner's (2000) bidirectional optimality theory.
