By
Stede, Manfred
7 Citations
Empirical studies of text coherence often use treelike structures in the spirit of Rhetorical Structure Theory (RST) as a representational device. This paper identifies several sources of ambiguity in RST-inspired trees and argues that such structures are therefore not as explanatory as a text representation should be. As an alternative, an approach toward multilevel annotation (MLA) of texts is proposed, which separates the information into distinct levels of representation, in particular: referential structure, thematic structure, conjunctive relations, and intentional structure. Levels are conceptually built upon each other, and human annotators can produce them using a dedicated software environment. We argue that the resulting multilevel corpora are descriptively more adequate, and as a resource are more useful than RST-style treebanks.
By
Boleda, Gemma; Schulte im Walde, Sabine; Badia, Toni
2 Citations
This article reports on a large-scale experiment for gathering human judgements with respect to a semantic classification of Catalan adjectives. The goal of our experiment was to classify 210 Catalan adjectives as basic, event-related, or object-related adjectives, allowing for multiple class assignments to account for polysemy. The experiment was directed at non-expert native speakers and administered via the Web, collecting data from 322 participants. We assess the degree of inter-annotator agreement through an innovative methodology based on observed agreement and kappa, and use weighted versions of these measures to account for partial agreement in polysemous assignments. Because the obtained scores (kappa 0.20–0.34) are too low to establish a reliably labelled dataset, we then perform a series of post-hoc analyses on the human judgements to investigate the sources of disagreement, by comparing the participants’ classifications with a classification obtained from experts. Our analysis shows that polysemous items and event-related adjectives are more problematic than other types of adjectives. Furthermore, the analysis helps to distinguish disagreement caused by the task as opposed to that caused by the experimental design, thus pointing to specific difficulties in both aspects of the research. The methodology developed for this analysis might therefore prove useful for the design of experiments for related tasks.
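As a rough illustration of the two agreement measures named in this abstract, the sketch below computes observed agreement and the standard unweighted Cohen's kappa for two annotators. It is not the article's methodology, which involves many participants and weighted variants; the function names and the toy labels are assumptions made for this example.

```python
from collections import Counter


def observed_agreement(labels_a, labels_b):
    """Proportion of items on which two annotators chose the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    matches = sum(1 for a, b in zip(labels_a, labels_b) if a == b)
    return matches / len(labels_a)


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(labels_a)
    p_o = observed_agreement(labels_a, labels_b)
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    # Chance agreement: probability that both annotators pick the same
    # label if each labels independently with their own label frequencies.
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        # Degenerate case (both annotators used one identical label
        # throughout); returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy data: four adjectives labelled by two annotators.
annotator_1 = ["basic", "basic", "event", "object"]
annotator_2 = ["basic", "event", "event", "object"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Kappa corrects the raw agreement rate for the agreement expected by chance, which is why it can be low even when annotators often coincide on frequent labels.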
By
Versley, Yannick
5 Citations
In this paper, we argue that difficulties in the definition of coreference itself contribute to lower inter-annotator agreement in certain cases. Data from a large referentially annotated corpus serves to corroborate this point, using a quantitative investigation to assess which effects or problems are likely to be the most prominent. Several examples where such problems occur are discussed in more detail, and we then propose a generalisation of Poesio, Reyle and Stevenson’s Justified Sloppiness Hypothesis to provide a unified model for these cases of disagreement and argue that a deeper understanding of the phenomena involved allows us to tackle problematic cases in a more principled fashion than would be possible using only pre-theoretic intuitions.
By
Willis, Alistair; Chantree, Francis; Roeck, Anne
7 Citations
We present the concept of nocuous ambiguity, which occurs when text is interpreted differently by different readers. In contrast, text exhibits innocuous ambiguity if different readers interpret it in the same way, even though structural or semantic analyses suggest that multiple interpretations may be possible. We collect multiple human judgements of a set of English phrases obtained from requirements documents. We focus on coordination ambiguity and show that across a group of judges there may be wide variation in what is perceived to be the correct interpretation. We develop the concept of an ambiguity threshold, which expresses the amount of variation between judgements that can be tolerated. We then develop and evaluate a heuristically based method of automatically predicting which sentences may be misunderstood for a given ambiguity threshold.
By
Knees, Mareile Hillevi
I present different types of ambiguity that occur in annotating and resolving the German anaphoric adverbial danach (“thereafter”). By means of two pilot studies it is shown that referential ambiguity (i.e. the anaphor has several plausible referents) and structural dissociation (i.e. different antecedents specify the same referent) cause poor inter-annotator agreement. Both phenomena can only be explored in detail as the annotation studies do not only concentrate on the textual but also on the referential level involved in anaphoric references. Thus, it can be shown that the competing referents in most referentially ambiguous cases are more or less temporally and conceptually related to each other and specify a similar reference time for danach (“thereafter”). Moreover, the competing antecedents often textually overlap so that some structurally dissociated cases can be handled by stricter annotation guidelines. Thus, considering the textual and the referential dimensions of anaphoric reference provides further insights into the cognitive processing of sentential anaphors like danach (“thereafter”).
By
Lavelli, Alberto; Califf, Mary Elaine; Ciravegna, Fabio; Freitag, Dayne; Giuliano, Claudio; Kushmerick, Nicholas; Romano, Lorenza; Ireson, Neil
9 Citations
We survey the evaluation methodology adopted in information extraction (IE), as defined in a few different efforts applying machine learning (ML) to IE. We identify a number of critical issues that hamper comparison of the results obtained by different researchers. Some of these issues are common to other NLP-related tasks: e.g., the difficulty of exactly identifying the effects on performance of the data (sample selection and sample size), of the domain theory (features selected), and of algorithm parameter settings. Some issues are specific to IE: how leniently to assess inexact identification of filler boundaries, the possibility of multiple fillers for a slot, and how the counting is performed. We argue that, when specifying an IE task, these issues should be explicitly addressed, and a number of methodological characteristics should be clearly defined. To empirically verify the practical impact of the issues mentioned above, we perform a survey of the results of different algorithms when applied to a few standard datasets. The survey shows a serious lack of consensus on these issues, which makes it difficult to draw firm conclusions on a comparative evaluation of the algorithms. Our aim is to elaborate a clear and detailed experimental methodology and propose it to the IE community. Widespread agreement on this proposal should lead to future IE comparative evaluations that are fair and reliable. To demonstrate the way the methodology is to be applied we have organized and run a comparative evaluation of ML-based IE systems (the Pascal Challenge on ML-based IE) where the principles described in this article are put into practice. In this article we describe the proposed methodology and its motivations. The Pascal evaluation is then described and its results presented.
By
Cobreros, Pablo
9 Citations
It is often assumed that the supervaluationist theory of vagueness is committed to a global notion of logical consequence, in contrast with the local notion characteristic of modal logics. There are, at least, two problems related to the global notion of consequence. First, it brings some counterexamples to classically valid patterns of inference. Second, it is subject to an objection related to higher-order vagueness. This paper explores a third notion of logical consequence, and discusses its adequacy for the supervaluationist theory. The paper proceeds in two steps. In the first step, the paper provides a deductive notion of consequence for global validity using the tableaux method. In the second step, the paper provides a notion of logical consequence which is an alternative to global validity, and discusses i) whether it is acceptable to the supervaluationist and ii) whether it plays a better role in a theory of vagueness in the face of the problems related to the global notion.
By
Shapiro, Stewart
1 Citations
It is a commonplace that the extensions of most, perhaps all, vague predicates vary with such features as comparison class and paradigm and contrasting cases. My view proposes another, more pervasive contextual parameter. Vague predicates exhibit what I call open texture: in some circumstances, competent speakers can go either way in the borderline region. The shifting extension and anti-extensions of vague predicates are tracked by what David Lewis calls the “conversational score”, and are regulated by what Kit Fine calls penumbral connections, including a principle of tolerance. As I see it, vague predicates are response-dependent, or, better, judgement-dependent, at least in their borderline regions. This raises questions concerning how one reasons with such predicates.
In this paper, I present a model theory for vague predicates, so construed. It is based on an overall supervaluationist-style framework, and it invokes analogues of Kripke structures for intuitionistic logic. I argue that the system captures, or at least nicely models, how one ought to reason with the shifting extensions (and anti-extensions) of vague predicates, as borderline cases are called and retracted in the course of a conversation. The model theory is illustrated with a forced march sorites series, and also with a thought experiment in which vague predicates interact with so-called future contingents. I show how to define various connectives and quantifiers in the language of the system, and how to express various penumbral connections and the principle of tolerance. The project fits into one of the topics of this special issue. In the course of reasoning, even with the external context held fixed, it is uncertain what the future extension of the vague predicates will be. Yet we still manage to reason with them. The system is based on that developed, more fully, in my Vagueness in Context, Oxford, Oxford University Press, 2006, but some criticisms and replies to critics are incorporated.
By
Zardini, Elia
30 Citations
According to the naive theory of vagueness, the vagueness of an expression consists in the existence of both positive and negative cases of application of the expression and in the nonexistence of a sharp cutoff point between them. The sorites paradox shows the naive theory to be inconsistent in most logics proposed for a vague language. The paper explores the prospects of saving the naive theory by revising the logic in a novel way, placing principled restrictions on the transitivity of the consequence relation. A lattice-theoretical framework for a whole family of (zeroth-order) “tolerant logics” is proposed and developed. Particular care is devoted to the relation between the salient features of the formal apparatus and the informal logical and semantic notions they are supposed to model. A suitable non-transitive counterpart to classical logic is defined. Some of its properties are studied, and it is eventually shown how an appropriate regimentation of the naive theory of vagueness is consistent in such a logic.
By
Cohen, Ariel
Most solutions to the sorites reject its major premise, i.e. the quantified conditional
$${\forall i\,(P(a_{i}) \rightarrow P(a_{i+1}))}$$
. This rejection appears to imply a discrimination between two elements that are supposed to be indiscriminable. Thus, the puzzle of the sorites involves in a fundamental way the notion of indiscriminability. This paper analyzes this relation and formalizes it, in a way that makes the rejection of the major premise more palatable.
The intuitive idea is that we consider two elements indiscriminable by default, i.e. unless we know some information that discriminates between them. Specifically, following Rough Set Theory, two elements are defined to be indiscernible if they agree on the vague property in question. Then, a is defined to be indiscriminable from b if a is indiscernible by default from b. That is to say, a is indiscriminable from b if it is consistent to assume that a and b agree on the relevant vague property.
Indiscernibility by default is formalized with the use of Default Logic, and is shown to have intuitively desirable properties: it is entailed by equality, is reflexive and symmetric. And while the relation is neither transitive nor substitutive, it is “almost” substitutive.
This definition of indiscriminability is incorporated into three major theories of vagueness, namely the supervaluationist, epistemic, and contextualist views. Each one of these theories is reduced to a different strategy dealing with multiple extensions in Default Logic, and the rejection of the major premise is shown to follow naturally. Thus, while the proposed notion of indiscriminability does not solve the sorites by itself, it does make the unintuitive conclusion of many of its proposed solutions—the rejection of the major premise—a bit easier to accept.
By
Verdée, Peter; Gulik, Stephan
3 Citations
In this paper, we present a generic format for adaptive vague logics. Logics based on this format are able to (1) identify sentences as vague or non-vague in light of a given set of premises, and to (2) dynamically adjust the possible set of inferences in accordance with these identifications, i.e. sentences that are identified as vague allow only for the application of vague inference rules and sentences that are identified as non-vague also allow for the application of some extra set of classical logic rules. The generic format consists of a set of minimal criteria that must be satisfied by the vague logic in casu in order to be usable as a basis for an adaptive vague logic. The criteria focus on the way in which the logic deals with a special ⊡-operator. Depending on the kind of logic for vagueness that is used as a basis for the adaptive vague logic, this operator can be interpreted as completely true, definitely true, clearly true, etc. It is proven that a wide range of famous logics for vagueness satisfies these criteria when extended with a specific ⊡-operator, e.g. fuzzy basic logic and its well-known extensions, cf. [7], super- and subvaluationist logics, cf. [6], [9], and clarity logic, cf. [13]. Also a fuzzy logic is presented that can be used for an adaptive vague logic that can deal with higher-order vagueness. To illustrate the theory, some toy examples of adaptive vague proofs are provided.
By
Vetterlein, Thomas
Fuzzy logics are in most cases based on an ad hoc decision about the interpretation of the conjunction. Whether they are useful or not can typically be found out only by testing them with example data. Why we should use a specific fuzzy logic can in general not be made plausible. Since the difficulties arise from the use of additional, unmotivated structure with which the set of truth values is endowed, the only way to base fuzzy logics on firm ground is the development of alternative semantics to all of whose components we can associate a meaning.
In this paper, we present one possible approach to justify ex post Łukasiewicz Logic as well as Basic Logic. The notion of ambiguity is central. Our framework consists of a Boolean or a Heyting algebra, respectively, endowed with an equivalence relation expressing ambiguity. The quotient set bears naturally the structure of an MV- or a BL-algebra, respectively, and thus can be used to interpret propositions of the mentioned logics.
By
Milne, Peter
1 Citations
Uncertainty and vagueness/imprecision are not the same: one can be certain about events described using vague predicates and about imprecisely specified events, just as one can be uncertain about precisely specified events. Exactly because of this, a question arises about how one ought to assign probabilities to imprecisely specified events in the case when no possible available evidence will eradicate the imprecision (because, say, of the limits of accuracy of a measuring device).
Modelling imprecision by rough sets over an approximation space presents an especially tractable case to help get one’s bearings. Two solutions present themselves: the first takes as upper and lower probabilities of the event X the (exact) probabilities assigned X’s upper and lower rough-set approximations; the second, motivated both by formal considerations and by a simple betting argument, is to treat X’s rough-set approximation as a conditional event and assign to it a point-valued (conditional) probability.
With rough sets over an approximation space we get a lot of good behaviour. For example, in the first construction mentioned the lower probabilities are n-monotone, for every
$${n \in \mathbb{N}^{+}}$$
. When we examine other models of approximation/imprecision/vagueness, and in particular, proximity spaces, we lose a lot of that good behaviour. In the literature there is not (even) agreement on the definition of upper and lower approximations for events (subsets) in the underlying domain. Betting considerations suggest one choice and, again, ways to assign upper and lower and point-valued probabilities, but nothing works well.
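The first construction mentioned in this abstract can be sketched in a few lines, assuming a finite domain whose approximation space is given as a partition into equivalence classes and a probability for each element. The function names and the toy data are assumptions for illustration, not taken from the article, and the second (conditional-event) construction is not shown.

```python
from fractions import Fraction


def approximations(partition, event):
    """Return the (lower, upper) rough-set approximations of `event`.

    `partition` is a list of disjoint, non-empty sets covering the domain,
    i.e. the equivalence classes of the approximation space.
    """
    lower, upper = set(), set()
    for block in partition:
        if block <= event:   # block lies entirely inside the event
            lower |= block
        if block & event:    # block overlaps the event
            upper |= block
    return lower, upper


def lower_upper_probability(partition, prob, event):
    """Exact probabilities of the lower and upper approximations."""
    lower, upper = approximations(partition, event)
    return (sum(prob[x] for x in lower), sum(prob[x] for x in upper))


# Hypothetical toy space: six equiprobable outcomes, measured in pairs.
partition = [{1, 2}, {3, 4}, {5, 6}]
prob = {x: Fraction(1, 6) for x in range(1, 7)}
bounds = lower_upper_probability(partition, prob, {1, 2, 3})
```

Here the event {1, 2, 3} contains the block {1, 2} and overlaps {3, 4}, so its lower approximation is {1, 2} and its upper approximation is {1, 2, 3, 4}.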
By
Mirman, Daniel
4 Citations
The speech signal is inherently ambiguous and all computational and behavioral research on speech perception has implicitly or explicitly investigated the mechanism of resolution of this ambiguity. It is clear that context and prior probability (i.e., frequency) play central roles in resolving ambiguities between possible speech sounds and spoken words (speech perception) as well as between meanings and senses of a word (semantic ambiguity resolution). However, the mechanisms of these effects are still under debate. Recent advances in understanding context and frequency effects in speech perception suggest promising approaches to investigating semantic ambiguity resolution. This review begins by motivating the use of insights from speech perception to understand the mechanisms of semantic ambiguity resolution. Key to this motivation is the description of the structural similarity between the two domains with a focus on two parallel sets of findings: context strength effects, and an attractor dynamics account for the contrasting patterns of inhibition and facilitation due to ambiguity. The main part of the review then discusses three recent, influential sets of findings in speech perception, which suggest that (1) top-down contextual and bottom-up perceptual information interact to mutually constrain processing of ambiguities, (2) word frequency influences online access, rather than response biases or resting levels, and (3) interactive integration of top-down and bottom-up information is optimal given the noisy, yet highly constrained nature of real-world communication, despite the possible consequence of illusory perceptions. These findings and the empirical methods behind them provide auspicious future directions for the study of semantic ambiguity resolution.
By
Welty, Chris; Murdock, J. William; Fan, James
In our research on using information extraction to help populate semantic web resources, we have encountered significant obstacles to interoperability between the technologies. We believe these obstacles to be endemic to the basic paradigms and not quirks of the specific implementations we have worked with. In particular, we identify five dimensions of interoperability that must be addressed to successfully employ information extraction systems to populate semantic web resources that are suitable for reasoning. We call the task of transforming IE data into knowledge-based resources knowledge integration and we report results of experiments in which the knowledge integration process uses the deeper semantics of OWL ontologies to improve by between 8% and 13% the precision of relation extraction from text.
By
Albrecht, Joshua S.; Hwa, Rebecca
6 Citations
Machine learning offers a systematic framework for developing metrics that use multiple criteria to assess the quality of machine translation (MT). However, learning introduces additional complexities that may impact on the resulting metric’s effectiveness. First, a learned metric is more reliable for translations that are similar to its training examples; this calls into question whether it is as effective in evaluating translations from systems that are not its contemporaries. Second, metrics trained from different sets of training examples may exhibit variations in their evaluations. Third, expensive developmental resources (such as translations that have been evaluated by humans) may be needed as training examples. This paper investigates these concerns in the context of using regression to develop metrics for evaluating machine-translated sentences. We track a learned metric’s reliability across a 5-year period to measure the extent to which the learned metric can evaluate sentences produced by other systems. We compare metrics trained under different conditions to measure their variations. Finally, we present an alternative formulation of metric training in which the features are based on comparisons against pseudo-references in order to reduce the demand on human-produced resources. Our results confirm that regression is a useful approach for developing new metrics for MT evaluation at the sentence level.
By
Busso, Carlos; Bulut, Murtaza; Lee, Chi-Chun; Kazemzadeh, Abe; Mower, Emily; Kim, Samuel; Chang, Jeannette N.; Lee, Sungbok; Narayanan, Shrikanth S.
222 Citations
Since emotions are expressed through a combination of verbal and nonverbal channels, a joint analysis of speech and gestures is required to understand expressive human communication. To facilitate such investigations, this paper describes a new corpus named the “interactive emotional dyadic motion capture database” (IEMOCAP), collected by the Speech Analysis and Interpretation Laboratory (SAIL) at the University of Southern California (USC). This database was recorded from ten actors in dyadic sessions with markers on the face, head, and hands, which provide detailed information about their facial expressions and hand movements during scripted and spontaneous spoken communication scenarios. The actors performed selected emotional scripts and also improvised hypothetical scenarios designed to elicit specific types of emotions (happiness, anger, sadness, frustration and neutral state). The corpus contains approximately 12 h of data. The detailed motion capture information, the interactive setting to elicit authentic emotions, and the size of the database make this corpus a valuable addition to the existing databases in the community for the study and modeling of multimodal and expressive human communication.
By
Chaves, Rui P.
11 Citations
This paper addresses a phenomenon in which certain word-parts can be omitted. The evidence shows that the full range of data cannot be captured by a sublexical analysis, since the phenomena can be observed both in phrasal and in lexical environments. It is argued that a form of deletion is involved, and that the phenomena, lexical or otherwise, are subject to the same phonological, semantic, and syntactic constraints. In the formalization that is proposed, all of the above constraints are cast in a parallel and declarative fashion, in the framework of Head-Driven Phrase Structure Grammar (Pollard and Sag, Head-driven phrase structure grammar, 1994), since the various levels of linguistic description are locally and simultaneously available. Building on recent accounts of ellipsis, this paper proposes a unified and general account of word-part ellipsis and phrasal ellipsis.
By
Shoham, Sharon; Francez, Nissim
In this paper, we propose a game semantics for the (associative) Lambek calculus. Compared to the implicational fragment of intuitionistic propositional calculus, the semantics deals with two features of the logic: absence of structural rules, as well as directionality of implication. We investigate the impact of these variations of the logic on its game semantics.
By
Denecke, Klaus; Phusanga, Dara
2 Citations
Defining a composition operation on sets of formulas one obtains a many-sorted algebra which satisfies the superassociative law and one more identity. This algebra is called the clone of formulas of the given type. The interpretations of formulas on an algebraic system of the same type form a many-sorted algebra with similar properties. The satisfaction of a formula by an algebraic system defines a Galois connection between classes of algebraic systems of the same type and collections of formulas. Hypersubstitutions are mappings sending pairs of operation symbols to pairs of terms of the corresponding arities and relation symbols to formulas of the same arities. Using hypersubstitutions we define hyperformulas. Satisfaction of a hyperformula by an algebraic system defines a second Galois connection between classes of algebraic systems of the same type and collections of formulas. A class of algebraic systems is said to be solid if every formula which is satisfied is also satisfied as a hyperformula. On the basis of these two Galois connections we construct a conjugate pair of additive closure operators and are able to characterize solid classes of algebraic systems.
By
Forster, Thomas
Sharvy’s puzzle concerns a situation in which common knowledge of two parties is obtained by repeated observation each of the other, no fixed point being reached in finite time. Can a fixed point be reached?
By
Bezhanishvili, Nick
4 Citations
In this paper we define the notion of frame based formulas. We show that the well-known examples of formulas arising from a finite frame, such as the Jankov-de Jongh formulas, subframe formulas and cofinal subframe formulas, are all particular cases of the frame based formulas. We give a criterion for an intermediate logic to be axiomatizable by frame based formulas and use this criterion to obtain a simple proof that every locally tabular intermediate logic is axiomatizable by Jankov-de Jongh formulas. We also show that not every intermediate logic is axiomatizable by frame based formulas.
By
Fang, Jie
In this paper, we introduce a variety bdO of Ockham algebras with balanced double pseudocomplementation, consisting of those algebras
$${(L; \wedge, \vee, f,\,^{*},^{+}, 0, 1)}$$
of type
$${\langle2,\,2,\,1,\,1,\,1,\,0,\,0\rangle}$$
where
$${(L; \wedge, \vee, f, 0, 1)}$$
is an Ockham algebra,
$${(L; \wedge, \vee, f,\,^{*},^{+}, 0, 1)}$$
is a double p-algebra, and the operations
$${x \mapsto f(x), x \mapsto x^{*}}$$
and
$${x \mapsto x^{+}}$$
are linked by the identities [f(x)]^{*} = [f(x)]^{+} = f^{2}(x), f(x^{*}) = x^{**} and f(x^{+}) = x^{++}. We give a description of the congruences on the algebras, and show that there are precisely nine non-isomorphic subdirectly irreducible members in the class of the algebras via the Priestley duality. We also describe all axioms in the variety bdO, and provide a characterization of all subvarieties of bdO determined by 12 non-equivalent axioms, identifying therein the biggest subvariety in which every principal congruence is complemented.
By
Queiroz, Ruy J. G. B.
3 Citations
The intention here is that of giving a formal underpinning to the idea of ‘meaning-is-use’ which, even if based on proofs, is rather different from proof-theoretic semantics as in the Dummett–Prawitz tradition. Instead, it is based on the idea that the meaning of logical constants is given by the explanation of immediate consequences, which in formalistic terms means the effect of elimination rules on the result of introduction rules, i.e. the so-called reduction rules. For that we suggest an extension to the Curry–Howard interpretation which draws on the idea of labelled deduction, and brings back Frege’s device of variable-abstraction to operate on the labels (i.e., proof-terms) alongside formulas of predicate logic.
By
Horwich, Paul
6 Citations
This paper offers a critique of mainstream formal semantics. It begins with a statement of widely assumed adequacy conditions: namely, that a good theory must (i) explain relations of entailment, (ii) show how the meanings of complex expressions derive from the meanings of their parts, and (iii) characterize facts of meaning in truth-theoretic terms. It then proceeds to criticize the orthodox conception of semantics that is articulated in these three desiderata. This critique is followed by a sketch of an alternative conception, one that is argued to be more in tune with the empirical objectives of linguistics and the clarificatory aims of philosophy. Finally, the paper proposes and defends a specific theoretical approach (use-based rather than truth-based) that is suggested by that alternative conception.
By
Sharvit, Yael
26 Citations
The purpose of this paper is to shed some light on the familiar puzzle of free indirect discourse (FID). FID shares some properties with standard indirect discourse and with direct discourse, but there is currently no known theory that can accommodate such a hybrid. Based on the observation that FID has ‘de se’ pronouns, I argue that it is a kind of an attitude report.
By
Mittwoch, Anita
14 Citations
A sentence in the Resultative perfect licenses two inferences: (a) the occurrence of an event (b) the state caused by this event obtains at evaluation time. In this paper I show that this use of the perfect is subject to a large number of distributional restrictions that all serve to highlight the result inference at the expense of the event inference. Nevertheless, only the event inference determines the truth conditions of this use of the perfect, the result inference being a unique type of conventional implicature. I argue furthermore that, since the result state is singular, the event that causes it must also be singular, whereas the Experiential perfect is purely quantificational. But in out-of-the-blue contexts the past tense is also normally interpreted as singular. This leads to a certain amount of competition between the Resultative perfect and the past tense, and it is this competition, I suggest, that maintains the conventional (non-truth-conditional) result state inference.
By
Schulte im Walde, Sabine; Melinger, Alissa; Roth, Michael; Weber, Andrea
4 Citations
This article presents a study to distinguish and quantify the various types of semantic associations provided by humans, to investigate their properties, and to discuss the impact that our analyses may have on NLP tasks. Specifically, we concentrate on two issues related to word properties and word relations: (1) We address the task of modelling word meaning by empirical features in data-intensive lexical semantics. Relying on large-scale corpus-based resources, we identify the contextual categories and functions that are activated by the associates and therefore contribute to the salient meaning components of individual words and across words. As a result, we discuss conceptual roles and present evidence for the usefulness of co-occurrence information in distributional descriptions. (2) We assume that semantic associates provide a means to investigate the range of semantic relations between words and contexts, and we provide insight into which types of semantic relations are treated as important or salient by the speakers of the language.
By
Boyd, Adriane; Dickinson, Markus; Meurers, W. Detmar
4 Citations
Dependency relations between words are increasingly recognized as an important level of linguistic representation that is close to the data and at the same time to the semantic functor-argument structure as a target of syntactic analysis and processing. Correspondingly, dependency structures play an important role in parser evaluation and for the training and evaluation of tools based on dependency treebanks. Gold standard dependency treebanks have been created for some languages, most notably Czech, and annotation efforts for other languages are under way. At the same time, general techniques for detecting errors in dependency annotation have not yet been developed. We address this gap by exploring how a technique proposed for detecting errors in constituency-based syntactic annotation can be adapted to systematically detect errors in dependency annotation. Building on an analysis of key properties and differences between constituency and dependency annotation, we discuss results for dependency treebanks for Swedish, Czech, and German. Complementing the focus on detecting errors in dependency treebanks to improve these gold standard resources, the discussion of dependency error detection for different languages and annotation schemes also raises questions of standardization for some aspects of dependency annotation, in particular regarding the locality of annotation, the assumption of a single head for each dependency relation, and phenomena such as coordination.
By
Nederhof, Mark-Jan; Satta, Giorgio
4 Citations
We investigate the problem of computing the partition function of a probabilistic context-free grammar, and consider a number of applicable methods. Particular attention is devoted to PCFGs that result from the intersection of another PCFG and a finite automaton. We report experiments involving the Wall Street Journal corpus.
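The abstract above does not say which methods are considered. As a minimal sketch, assuming plain fixed-point iteration (one standard way to approximate a PCFG's partition function), the total probability mass Z_A of the finite derivations from each nonterminal A can be approximated by iterating Z_A = Σ p · Π Z_B starting from zero. The toy grammar S → S S (probability p) | a (probability 1 − p) and all names below are hypothetical illustrations, not taken from the paper.

```python
from math import prod


def partition_function(rules, iterations=1000):
    """Approximate the partition function of a PCFG by fixed-point iteration.

    `rules` maps each nonterminal to a list of (probability, rhs) pairs,
    where rhs is a list of symbols. Symbols that are not keys of `rules`
    are treated as terminals and contribute a factor of 1.
    """
    # Starting from 0 makes the iteration approach the least fixed point.
    z = {nt: 0.0 for nt in rules}
    for _ in range(iterations):
        z = {
            nt: sum(p * prod(z.get(sym, 1.0) for sym in rhs) for p, rhs in prods)
            for nt, prods in rules.items()
        }
    return z


def toy_grammar(p):
    """Hypothetical grammar: S -> S S with probability p, S -> a otherwise."""
    return {"S": [(p, ["S", "S"]), (1.0 - p, ["a"])]}


if __name__ == "__main__":
    for p in (0.25, 0.6):
        print(p, partition_function(toy_grammar(p))["S"])
```

For this toy grammar the least solution of Z = pZ² + (1 − p) is min(1, (1 − p)/p), so the grammar with p = 0.6 loses probability mass to infinite derivations (Z = 2/3), while p = 0.25 gives Z = 1.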
By
Pendar, Nick
This paper proposes that linguistic constraint satisfaction can be viewed as an instance of general human soft constraint satisfaction. After a discussion of the relation between modularity in grammar and soft constraints and a review of the conceptions of gradient phenomena in language, the paper presents a generalized theory of soft constraint satisfaction from the AI literature (Bistarelli 2001). It then shows that a unifying underlying theory of constraint satisfaction allows us to bring different constraint-based linguistic theories (e.g., LOT and HPSG) closer together as well as account for certain gradient phenomena straightforwardly.
By
Ju, Shier; Wen, Xuefeng
First we show that the classical two-player semantic game actually corresponds to a three-valued logic. Then we generalize this result and give an n-player semantic game for an (n + 1)-valued logic with n binary connectives, each associated with a player. We prove that player i has a winning strategy in game $G(\varphi, M)$ if and only if the truth value of $\varphi$ is $t_{i}$ in the model M, for 1 ≤ i ≤ n; and none of the players has a winning strategy in $G(\varphi, M)$ if and only if the truth value of $\varphi$ is $t_{0}$ in M.
By
Fermüller, C. G.
8 Citations
An overview of different versions and applications of Lorenzen’s dialogue game approach to the foundations of logic, here largely restricted to the realm of many-valued logics, is presented. Among the reviewed concepts and results are Giles’s characterization of Łukasiewicz logic and some of its generalizations to other fuzzy logics, including interval-based logics, a parallel version of Lorenzen’s game for intuitionistic logic that is adequate for finite and infinite-valued Gödel logics, and a truth comparison game for infinite-valued Gödel logic.
By
Avron, A.; Konikowska, B.
5 Citations
In the paper we explore the idea of describing Pawlak’s rough sets using three-valued logic, whereby the value t corresponds to the positive region of a set, the value f to the negative region, and the undefined value u to the border of the set. Due to the properties of the above regions in rough set theory, the semantics of the logic is described using a nondeterministic matrix (Nmatrix). With the strong semantics, where only the value t is treated as designated, the above logic is a “common denominator” for Kleene and Łukasiewicz 3-valued logics, which represent its two different “determinizations”. In turn, the weak semantics, where both t and u are treated as designated, represents such a “common denominator” for two major 3-valued paraconsistent logics.
We give sound and complete, cut-free sequent calculi for both versions of the logic generated by the rough set Nmatrix. Then we derive from these calculi sequent calculi with the same properties for the various “determinizations” of those two versions of the logic (including Łukasiewicz 3-valued logic). Finally, we show how to embed the four above-mentioned determinizations in extensions of the basic rough set logics obtained by adding to those logics a special two-valued “definedness” or “crispness” operator.
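To make the idea of two “determinizations” concrete, the following minimal sketch compares the standard implication tables of Kleene's strong three-valued logic and Łukasiewicz's three-valued logic. It does not reproduce the paper's Nmatrix; the numeric encoding f = 0, u = 1/2, t = 1 and all function names are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import product

# Assumed numeric encoding of the three truth values.
F, U, T = Fraction(0), Fraction(1, 2), Fraction(1)
VALUES = (F, U, T)


def kleene_implies(a, b):
    """Strong Kleene implication: max(1 - a, b)."""
    return max(1 - a, b)


def lukasiewicz_implies(a, b):
    """Lukasiewicz implication: min(1, 1 - a + b)."""
    return min(Fraction(1), 1 - a + b)


def differing_entries():
    """Return the argument pairs on which the two implications disagree."""
    return [
        (a, b)
        for a, b in product(VALUES, repeat=2)
        if kleene_implies(a, b) != lukasiewicz_implies(a, b)
    ]


if __name__ == "__main__":
    for a, b in differing_entries():
        print(a, b, kleene_implies(a, b), lukasiewicz_implies(a, b))
```

Under this encoding the two tables disagree on a single entry, u → u, which Kleene's logic evaluates to u and Łukasiewicz's to t.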
By
Castiglioni, J. L.; Menni, M.; Sagastume, M.
8 Citations
Motivated by an old construction due to J. Kalman that relates distributive lattices and centered Kleene algebras, we define the functor $K^{\bullet}$ relating integral residuated lattices with 0 ($\mathrm{IRL}_{0}$) with certain involutive residuated lattices. Our work is also based on the results obtained by Cignoli about an adjunction between Heyting and Nelson algebras, which is an enrichment of the basic adjunction between lattices and Kleene algebras. The lifting of the functor to the category of residuated lattices leads us to study other adjunctions and equivalences. For example, we treat the functor C whose domain is cuRL, the category of involutive residuated lattices M whose unit is fixed by the involution and has a Boolean complement c (the underlying set of CM is the set of elements greater than or equal to c). If we restrict to the full subcategory NRL of cuRL of those objects that have a nilpotent c, then C is an equivalence. In fact, CM is isomorphic to $C_{e}M$, and $C_{e}$ is adjoint to $\widehat{(\_)}$, where $\widehat{(\_)}$ assigns to an object A of $\mathrm{IRL}_{0}$ the product $A \times A^{0}$, which is an object of NRL.
By
Foo, Norman; Low, Boon Toh
1 Citation
The work on prototypes in ontologies pioneered by Rosch [10] and elaborated by Lakoff [8] and Freund [3] is related to vagueness in the sense that the more remote an instance is from a prototype the fewer people agree that it is an example of that prototype. An intuitive example is the prototypical “mother”, and it is observed that more specific instances like “single mother”, “adoptive mother”, “surrogate mother”, etc., are less and less likely to be classified as “mothers” by experimental subjects. From a different direction Gärdenfors [4] provided a persuasive account of natural predicates to resolve paradoxes of induction like Goodman’s “Grue” predicate [5]. Gärdenfors proposed that “quality dimensions” arising from human cognition and perception impose topologies on concepts such that the ones that appear “natural” to us are convex in these topologies. We show that these two cognitive principles (prototypes and predicate convexity) are equivalent to unimodal (convex) fuzzy characteristic functions for sets. Then we examine the case when the fuzzy set characteristic function is not convex, in particular when it is multimodal. We argue that this is an indication that the fuzzy concept should really be regarded as a super concept in which the decomposed components are subconcepts in an ontological taxonomy.
By
Čulo, Oliver; Erk, Katrin; Padó, Sebastian; Schulte im Walde, Sabine
1 Citation
In this article, we address the task of comparing and combining different semantic verb classifications within one language. We present a methodology for the manual analysis of individual resources on the level of semantic features. The resulting representations can be aligned across resources, and allow a contrastive analysis of these resources. In a case study on the Manner of Motion domain across four German verb classifications, we find that some features are used in all resources, while others reflect individual emphases on specific meaning aspects. We also provide evidence that feature representations can ultimately provide the basis for linking verb classes themselves across resources, which allows us to combine their coverage and descriptive detail.
By
Lönneker-Rodman, Birte
7 Citations
This paper concerns metaphor resource creation. It provides an account of methods used, problems discovered, and insights gained at the Hamburg Metaphor Database project, intended to inform similar resource creation initiatives, as well as future metaphor processing algorithms. After introducing the project, the theoretical underpinnings that motivate the subdivision of represented information into a conceptual and a lexical level are laid out. The acquisition of metaphor attestations from electronic corpora is explained, and annotation practices as well as database contents are evaluated. The paper concludes with an overview of related projects and an outline of possible future work.
By
Frunza, Oana; Inkpen, Diana
Partial cognates are pairs of words in two languages that have the same meaning in some, but not all contexts. Detecting the actual meaning of a partial cognate in context can be useful for Machine Translation tools and for Computer-Assisted Language Learning tools. We propose a supervised and a semi-supervised method to disambiguate partial cognates between two languages: French and English. The methods use only automatically labeled data; therefore they can be applied to other pairs of languages as well. The aim of our work is to automatically detect the meaning of a French partial cognate word in a specific context.
By
Mihalcea, Rada; Leong, Chee Wee
17 Citations
This paper addresses and evaluates the hypothesis that pictorial representations can be used to effectively convey simple sentences across language barriers. The paper makes two main contributions. First, it proposes an approach to augmenting dictionaries with illustrative images using volunteer contributions over the Web. The paper describes the PicNet illustrated dictionary, and evaluates the quality and quantity of the contributions collected through several online activities. Second, starting with this illustrated dictionary, the paper describes a system for the automatic construction of pictorial representations for simple sentences. Comparative evaluations show that a considerable amount of understanding can be achieved using visual descriptions of information, with evaluation figures within a comparable range of those obtained with linguistic representations produced by an automatic machine translation system.
By
Mel’čuk, Igor; Wanner, Leo
5 Citations
This paper addresses one of the least studied, although very important, problems of machine translation: the problem of morphological mismatches between languages and their handling during transfer. The level at which we assume transfer to be carried out is the Deep-Syntactic Structure (DSyntS) as proposed in the Meaning-Text Theory. DSyntS is abstract enough to avoid all types of surface morphological divergences. For the remaining ‘genuine’ divergences between grammatical significations, we propose a morphological transfer model. To illustrate this model, we apply it to the transfer of grammemes of definiteness and aspect for the language pairs Russian–German and German–Russian, respectively.
By
Spinks, M.; Veroff, R.
10 Citations
The goal of this two-part series of papers is to show that constructive logic with strong negation N is definitionally equivalent to a certain axiomatic extension NFL_{ew} of the substructural logic FL_{ew}. The main result of Part I of this series [41] shows that the equivalent variety semantics of N (namely, the variety of Nelson algebras) and the equivalent variety semantics of NFL_{ew} (namely, a certain variety of FL_{ew}-algebras) are term equivalent. In this paper, the term equivalence result of Part I [41] is lifted to the setting of deductive systems to establish the definitional equivalence of the logics N and NFL_{ew}. It follows from the definitional equivalence of these systems that constructive logic with strong negation is a substructural logic.
By
Sayed Ahmed, Tarek
3 Citations
Following research initiated by Tarski, Craig and Németi, and further pursued by Sain and others, we show that for certain subsets G of ^{ω}ω, atomic countable G polyadic algebras are completely representable. G polyadic algebras are obtained by restricting the similarity type and axiomatization of ω-dimensional polyadic algebras to finite quantifiers and substitutions in G. This contrasts with the cases of cylindric and relation algebras.
By
Anglberger, Albert J. J.
2 Citations
In Meyer’s promising account [7] deontic logic is reduced to a dynamic logic. Meyer claims that with his account “we get rid of most (if not all) of the nasty paradoxes that have plagued traditional deontic logic.” But as was shown by van der Meyden in [4], Meyer’s logic also contains a paradoxical formula. In this paper we will show that another paradox can be proven, one which also affects Meyer’s “solution” to contrary-to-duty obligations and his logic in general.
By
Hill, Brian
5 Citations
In the companion paper (Towards a “sophisticated” model of belief dynamics. Part I), a general framework for realistic modelling of instantaneous states of belief and of the operations involving them was presented and motivated. In this paper, the framework is applied to the case of belief revision. A model of belief revision shall be obtained which, firstly, recovers the Gärdenfors postulates in a well-specified, natural yet simple class of particular circumstances; secondly, can accommodate iterated revisions, recovering several proposed revision operators for iterated revision as special cases; and finally, offers an analysis of Rott’s recent counterexample to several Gärdenfors postulates [32], elucidating in what sense it fails to be one of the special cases to which these postulates apply.
By
Studer, Thomas
7 Citations
We study the proof-theoretic relationship between two deductive systems for the modal mu-calculus. First we recall an infinitary system which contains an omega rule allowing one to derive the truth of a greatest fixed point from the truth of each of its (infinitely many) approximations. Then we recall a second infinitary calculus which is based on non-wellfounded trees. In this system proofs are finitely branching but may contain infinite branches as long as some greatest fixed point is unfolded infinitely often along every branch. The main contribution of our paper is a translation from proofs in the first system to proofs in the second system. Completeness of the second system then follows from completeness of the first, and a new proof of the finite model property also follows as a corollary.
By
Leszczyńska-Jasion, Dorota
5 Citations
The aim of this paper is to present the method of Socratic proofs for seven modal propositional logics: K5, S4.2, S4.3, S4M, S4F, S4R and G. This work is an extension of [10] where the method was presented for the most common modal propositional logics: K, D, T, KB, K4, S4 and S5.
By
Hoeksema, Jack
2 Citations
Guerzoni and Sharvit (Linguistics and Philosophy 30:361–391, 2007) provide an argument that plural, but not singular, wh-phrases may contain a negative polarity item in their restriction, and connect this with the semantic property of exhaustivity. I will show that this claim is factually incorrect, and that the theory of negative polarity licensing does not need to be complicated by taking number distinctions into account. In addition, I will argue that number distinctions do not appear to be relevant for polarity items in the restriction of definite noun phrases either.
By
Manurung, Ruli; Ritchie, Graeme; Pain, Helen; Waller, Annalu; Black, Rolf; O’Mara, Dave
1 Citation
As part of a project to construct an interactive program which would encourage children to play with language by building jokes, we developed a lexical database, starting from WordNet. To the existing information about part of speech, synonymy, hyponymy, etc., we have added phonetic representations and phonetic similarity ratings for pairs of words/phrases.
By
Bale, Alan Clinton
20 Citations
Comparative constructions form two classes, those that permit direct comparisons (comparisons of measurements as in Seymour is taller than he is wide) and those that only allow indirect comparisons (comparisons of relative positions on separate scales as in Esme is more beautiful than Einstein is intelligent). In contrast with other semantic theories, this paper proposes that the interpretation of the comparative morpheme remains the same whether it appears in sentences that compare individuals directly or indirectly. To develop a unified account, I suggest that all comparisons (whether in terms of height, intelligence or beauty) involve a scale of universal degrees that are isomorphic to the rational (fractional) numbers between 0 and 1. Crucial to a unified treatment, the connection between the individuals being compared and universal degrees involves two steps. First individuals are mapped to a value on a primary scale that ranks individuals with respect to the gradable property (whether it be height, beauty or intelligence). Second, the value on the primary scale is mapped to a universal degree that encodes the value’s relative position on the primary scale. Direct comparison results if measurements such as seven feet participate in the primary scale (as in Seven feet is tall). Otherwise the result is an indirect comparison.
By
Horsten, Leon; Douven, Igor
3 Citations
In this article, we reflect on the use of formal methods in the philosophy of science. These are taken to comprise not just methods from logic broadly conceived, but also from other formal disciplines such as probability theory, game theory, and graph theory. We explain how formal modelling in the philosophy of science can shed light on difficult problems in this domain.
By
Huttegger, Simon M.; Skyrms, Brian
6 Citations
We study a simple game-theoretic model of information transfer which we consider to be a baseline model for capturing strategic aspects of epistemological questions. In particular, we focus on the question of whether simple learning rules lead to an efficient transfer of information. We find that reinforcement learning, which is based exclusively on payoff experiences, is inadequate to generate efficient networks of information transfer. Fictitious play, the game-theoretic counterpart to Carnapian inductive logic and a more sophisticated kind of learning, suffices to produce efficiency in information transfer.
By
Andréka, H.; Madarász, J. X.; Németi, I.; Székely, G.
13 Citations
A part of relativistic dynamics is axiomatized by simple and purely geometrical axioms formulated within first-order logic. A geometrical proof of the formula connecting relativistic and rest masses of bodies is presented, leading up to a geometric explanation of Einstein’s famous E = mc^{2}. The connection of our geometrical axioms and the usual axioms on the conservation of mass, momentum and four-momentum is also investigated.
By
Baltag, A.; Smets, S.
7 Citations
In this paper we show how recent concepts from Dynamic Logic, and in particular from Dynamic Epistemic logic, can be used to model and interpret quantum behavior. Our main thesis is that all the nonclassical properties of quantum systems are explainable in terms of the nonclassical flow of quantum information. We give a logical analysis of quantum measurements (formalized using modal operators) as triggers for quantum information flow, and we compare them with other logical operators previously used to model various forms of classical information flow: the “test” operator from Dynamic Logic, the “announcement” operator from Dynamic Epistemic Logic and the “revision” operator from Belief Revision theory. The main points stressed in our investigation are the following: (1) The perspective and the techniques of “logical dynamics” are useful for understanding quantum information flow. (2) Quantum mechanics does not require any modification of the classical laws of “static” propositional logic, but only a nonclassical dynamics of information. (3) The main such nonclassical feature is that, in a quantum world, all information-gathering actions have some ontic side-effects. (4) This ontic impact can affect in its turn the flow of information, leading to nonclassical epistemic side-effects (e.g. a type of non-monotonicity) and to states of “objectively imperfect information”. (5) Moreover, the ontic impact is nonlocal: an information-gathering action on one part of a quantum system can have ontic side-effects on other, faraway parts of the system.
By
Bueno, Otávio
2 Citations
Scientific change has two important dimensions: conceptual change and structural change. In this paper, I argue that the existence of conceptual change brings serious difficulties for scientific realism, and the existence of structural change makes structural realism look quite implausible. I then sketch an alternative account of scientific change, in terms of partial structures, that accommodates both conceptual and structural changes. The proposal, however, is not realist, and supports a structuralist version of van Fraassen’s constructive empiricism (structural empiricism).
By
Schurz, G.; Leitgeb, H.
8 Citations
In this paper a theory of finitistic and frequentistic approximations (in short: f-approximations) of probability measures P over a countably infinite outcome space N is developed. The family of subsets of N for which f-approximations converge to a frequency limit forms a pre-Dynkin system $D \subseteq \wp(N)$. The limiting probability measure over D can always be extended to a probability measure over $\wp(N)$, but this measure is not always σ-additive. We conclude that probability measures can be regarded as idealizations of limiting frequencies if and only if σ-additivity is not assumed as a necessary axiom for probabilities. We prove that σ-additive probability measures can be characterized in terms of so-called canonical and in terms of so-called full f-approximations. We also show that every non-σ-additive probability measure is f-approximable, though neither canonically nor fully f-approximable. Finally, we transfer our results to probability measures on open or closed formulas of first-order languages.
By
Halaš, Radomír
5 Citations
It has been recently shown [4] that the lattice effect algebras can be treated as a subvariety of the variety of so-called basic algebras. The open problem whether all subdirectly irreducible distributive lattice effect algebras are just subdirectly irreducible MV-chains and the horizontal sum $\mathcal{H}$ of two 3-element chains is transformed in this paper into a more tractable one. We prove that, modulo distributive lattice effect algebras, the variety generated by MV-algebras and $\mathcal{H}$ is definable by three simple identities, and the problem now is to check whether these identities are satisfied by all distributive lattice effect algebras.
By
Buszkowski, Wojciech; Palka, Ewa
2 Citations
Action logic of Pratt [21] can be presented as Full Lambek Calculus FL [14, 17] enriched with Kleene star *; it is equivalent to the equational theory of residuated Kleene algebras (lattices). Some results on axiom systems, complexity and models of this logic were obtained in [4, 3, 18]. Here we prove a stronger form of *-elimination for the logic of *-continuous action lattices and the $\Pi_{1}^{0}$-completeness of the equational theories of action lattices of subsets of a finite monoid and action lattices of binary relations on a finite universe. We also discuss possible applications in linguistics.
By
Rosencrantz, Holger
3 Citations
The paper provides a formal representation of goal systems. The focus is on three properties: consistency, conflict, and coherence. An aim is to attain conceptual clarity of these properties. It is argued that consistency is adequately regarded as a property relative to the decision situation or, more specifically, the set of alternatives that the agent faces. Moreover, as a condition of rationality, consistency is stronger than some writers have claimed. Conflict is adequately regarded as a relation over subsets of a given goal system and should likewise be regarded as relative to the set of alternatives that the agent faces. Coherence is given a probabilistic interpretation, based on a support relation over subsets of goal systems.
By
Hughes, Jesse; Royakkers, Lambèr M. M.
This paper studies long-term norms concerning actions. In Meyer’s Propositional Deontic Logic (PD_{e}L), only immediate duties can be expressed; however, one often has duties of longer duration, such as “Never do that” or “Do this someday”. In this paper, we will investigate how to amend PD_{e}L so that such long-term duties can be expressed. This leads to the interesting and surprising consequence that the long-term prohibition and obligation are not interdefinable in our semantics, while there is a duality between these two notions. As a consequence, we have provided a new analysis of the long-term obligation by introducing a new atomic proposition I (indebtedness) to represent the condition that an agent has some unfulfilled obligation.
By
Hill, Brian
2 Citations
It is well-known that classical models of belief are not realistic representations of human doxastic capacity; equally, models of actions involving beliefs, such as decisions based on beliefs, or changes of beliefs, suffer from similar inaccuracies. In this paper, a general framework is presented which permits a more realistic modelling both of instantaneous states of belief, and of the operations involving them. This framework is motivated by some of the inadequacies of existing models, which it overcomes, whilst retaining technical rigour in so far as it relies on known, natural logical and mathematical notions. The companion paper (Towards a “sophisticated” model of belief dynamics. Part II) contains an application of this framework to the particular case of belief revision.
By
Nola, Antonio Di; Georgescu, George; Spada, Luca
2 Citations
In this paper we study the notion of forcing for Łukasiewicz predicate logic (Ł∀, for short), along the lines of Robinson’s forcing in classical model theory. We deal with both finite and infinite forcing. With regard to the former we prove a Generic Model Theorem for Ł∀, while for the latter, we study the generic and existentially complete standard models of Ł∀.
By
Mitkov, Ruslan; Pekar, Viktor; Blagoev, Dimitar; Mulloni, Andrea
4 Citations
The identification of cognates has attracted the attention of researchers working in the area of Natural Language Processing, but the identification of false friends is still an under-researched area. This paper proposes novel methods for the automatic identification of both cognates and false friends from comparable bilingual corpora. The methods are not dependent on the existence of parallel texts, and make use of only monolingual corpora and a bilingual dictionary necessary for the mapping of co-occurrence data across languages. In addition, the methods do not require that the newly discovered cognates or false friends are present in the dictionary and hence are capable of operating on out-of-vocabulary expressions. These methods are evaluated on English, French, German and Spanish corpora in order to identify English–French, English–German, English–Spanish and French–Spanish pairs of cognates or false friends. The experiments were performed in two settings: (i) assuming ‘ideal’ extraction of cognates and false friends from plain-text corpora, i.e. when the evaluation data contains only cognates and false friends, and (ii) a real-world extraction scenario where cognates and false friends have to first be identified among words found in two comparable corpora in different languages. The evaluation results show that the developed methods identify cognates and false friends with very satisfactory results for both recall and precision, with methods that incorporate background semantic knowledge, in addition to co-occurrence data obtained from the corpora, delivering the best results.
By
Imsombut, Aurawan; Kawtrakul, Asanee
4 Citations
This paper presents a methodology for automatic learning of ontologies from Thai text corpora, by extraction of terms and relations. A shallow parser is used to chunk texts on which we identify taxonomic relations with the help of cues: lexico-syntactic patterns and item lists. The main advantage of the approach is that it simplifies the task of concept and relation labeling, since cues help identify the ontological concepts and hint at their relations. However, these techniques pose certain problems, i.e. cue word ambiguity, item list identification, and numerous candidate terms. We also propose a methodology to solve these problems by using lexicon and co-occurrence features and weighting them with information gain. The precision, recall and F-measure of the system are 0.74, 0.78 and 0.76, respectively.
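As a consistency check on the three figures reported above, and assuming the F-measure is the balanced F1 score (the abstract does not say which variant was used), the harmonic mean of the reported precision P = 0.74 and recall R = 0.78 works out as follows:

$$F_{1} = \frac{2PR}{P + R} = \frac{2 \times 0.74 \times 0.78}{0.74 + 0.78} = \frac{1.1544}{1.52} \approx 0.76$$

Under that assumption the reported F-measure of 0.76 agrees with the reported precision and recall.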
By
Hwang, Soonhee; Yoon, Aesun; Kwon, Hyuk-Chul
3 Citations
The complexity of Korean numeral classifiers demands semantic as well as computational approaches that employ natural language processing (NLP) techniques. The classifier is a universal linguistic device, having the two functions of quantifying and classifying nouns in noun phrase constructions. Many linguistic studies have focused on the fact that numeral classifiers afford decisive clues to categorizing nouns. However, few studies have dealt with the semantic categorization of classifiers and their semantic relations to the nouns they quantify and categorize in building ontologies. In this article, we propose the semantic recategorization of the Korean numeral classifiers in the context of classifier ontology based on large corpora and KorLex Noun 1.5 (Korean wordnet; Korean Lexical Semantic Network), considering its high applicability in the NLP domain. In particular, the classifier can be effectively used to predict the semantic characteristics of nouns and to process them appropriately in NLP. The major challenge is to make such semantic classification and the attendant NLP techniques efficient. Accordingly, a Korean numeral classifier ontology (KorLexClas 1.0), including semantic hierarchies and relations to nouns, was constructed.
By
Bond, Francis; Fujita, Sanae; Tanaka, Takaaki
In this paper we describe the current state of a new Japanese lexical resource: the Hinoki treebank. The treebank is built from dictionary definitions, examples and news text, and uses an HPSG-based Japanese grammar to encode both syntactic and semantic information. It is combined with an ontology based on the definition sentences to give a detailed sense-level description of the most familiar 28,000 words of Japanese.
By
Ekbal, Asif; Bandyopadhyay, Sivaji
21 Citations
The rapid development of language resources and tools using machine learning techniques for less computerized languages requires an appropriately tagged corpus. A tagged Bengali news corpus has been developed from the web archive of a widely read Bengali newspaper. A web crawler retrieves the web pages in Hyper Text Markup Language (HTML) format from the news archive. At present, the corpus contains approximately 34 million wordforms. Named Entity Recognition (NER) systems based on pattern-based shallow parsing, with or without linguistic knowledge, have been developed using a part of this corpus. The NER system that uses linguistic knowledge performed better, yielding highest F-Score values of 75.40%, 72.30%, 71.37%, and 70.13% for person, location, organization, and miscellaneous names, respectively.
By
Hashimoto, Chikara; Bond, Francis; Tanaka, Takaaki; Siegel, Melanie
2 Citations
We have constructed a large-scale and detailed database of lexical types in Japanese from a treebank that includes detailed linguistic information. The database helps treebank annotators and grammar developers to share precise knowledge about the grammatical status of words that constitute the treebank, allowing for consistent large-scale treebanking and grammar development. In addition, it clarifies what lexical types are needed for precise Japanese NLP on the basis of the treebank. In this paper, we report on the motivation and methodology of the database construction.
By
Zhao, Jun; Liu, Feifan
4 Citations
There are many expressive and structural differences between product names and general named entities such as person names, location names and organization names. To date, there has been little research on product named entity recognition (NER), which is crucial and valuable for information extraction in the field of market intelligence. This paper focuses on product NER (PRO NER) in Chinese text. First, we describe our efforts on data annotation, including well-defined specifications, data analysis and development of a corpus with annotated product named entities. Second, a hierarchical hidden Markov model-based approach to PRO NER is proposed and evaluated. Extensive experiments show that the proposed method outperforms the cascaded maximum entropy model and obtains promising results on the data sets of two different electronic product domains (digital and cell phone).
By
Wong, Kam-Fai; Xia, Yunqing
5 Citations
Real-time communication platforms such as ICQ, MSN and online chat rooms are more popular than ever on the Internet. There are, however, real risks that criminals and terrorists use them to perpetrate illegal and criminal abuses. This highlights the security significance of accurate detection of chat language and its translation to its standard language counterpart. The language used on these platforms differs significantly from the standard language. This language, referred to as chat language, is comparatively informal, anomalous and dynamic. Such features render conventional language resources such as dictionaries, and processing tools such as parsers, ineffective. In this paper, we present the NIL corpus, a chat language text collection annotated to facilitate training and testing of chat language processing algorithms. We analyse the NIL corpus to study the linguistic characteristics and contextual behaviour of a chat language. First, we observe that the majority of chat terms, i.e. informal words in a chat text, are formed by phonetic mapping. We then propose the eXtended Source Channel Model (XSCM) for the normalization of chat language, which is the process of converting messages expressed in a chat language to their standard language counterparts. Experimental results indicate that the performance of XSCM in terms of chat term recognition and normalization accuracy is superior to that of its Source Channel Model (SCM) counterparts, and is also more consistent over time.
By
Roxas, Rachel Edita Oñate; Borra, Allan; Ko Cheng, Charibeth; Lim, Nathalie Rose; Ong, Ethel Chuajoy; Tan, Michelle Wendy
1 Citation
In this paper, we present the building of various language resources for a multi-engine bidirectional English-Filipino Machine Translation (MT) system. Linguistic information on Philippine languages is available, but so far the focus has been on theoretical linguistics, and little has been done on the computational aspects of these languages. We therefore report attempts at the manual construction of language resources such as the grammar, lexicon, morphological information, and the corpora, which were built from almost nonexistent digital forms. Due to the inherent difficulties of manual construction, we also discuss our experiments on various technologies for automatic extraction of these resources to handle the intricacies of the Filipino language, designed with the intention of using them for the MT system. To implement the different MT engines and to ensure the improvement of translation quality, other language tools (such as the morphological analyzer and generator, and the part-of-speech tagger) were developed.
By
Bond, Francis; Ogura, Kentaro
3 Citations
We present a method for combining two bilingual dictionaries to make a third, using one language as a pivot. In this case we combine a Japanese-English dictionary with a Malay-English dictionary, to produce a Japanese-Malay dictionary. Our method differs from previous methods in its improved matching through normalization of the pivot language. We have made a prototype dictionary of around 76,000 Japanese-Malay pairs for 50,000 Japanese head words.
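The general idea of pivot-based dictionary merging can be sketched as follows. This is a minimal illustration, not the authors' method: the `normalize` rule (lowercasing and dropping a leading "to" or article) and the toy romanized entries are assumptions made here for the example, and the abstract does not specify how the pivot language is actually normalized.

```python
from collections import defaultdict


def normalize(gloss: str) -> str:
    """Hypothetical pivot normalization: lowercase, drop a leading 'to' or article."""
    words = gloss.lower().split()
    if words and words[0] in {"to", "a", "an", "the"}:
        words = words[1:]
    return " ".join(words)


def pivot_merge(src_to_pivot: dict, tgt_to_pivot: dict) -> dict:
    """Link source and target head words that share a normalized pivot gloss."""
    # Invert the target dictionary: normalized pivot gloss -> target head words.
    pivot_to_tgt = defaultdict(set)
    for tgt, glosses in tgt_to_pivot.items():
        for gloss in glosses:
            pivot_to_tgt[normalize(gloss)].add(tgt)

    # For each source head word, collect targets reachable through any gloss.
    merged = defaultdict(set)
    for src, glosses in src_to_pivot.items():
        for gloss in glosses:
            merged[src] |= pivot_to_tgt.get(normalize(gloss), set())
    return {src: tgts for src, tgts in merged.items() if tgts}


# Toy data (romanized, invented for illustration only).
ja_en = {"inu": ["dog", "a hound"], "taberu": ["to eat"]}
ms_en = {"anjing": ["Dog"], "makan": ["eat", "to consume"]}
ja_ms = pivot_merge(ja_en, ms_en)
print(ja_ms)
```

Without the normalization step, "to eat" and "eat" would fail to match, which illustrates why matching quality depends on how the pivot glosses are normalized.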
By
Nicolas, David
13 Citations
A dilemma put forward by Schein (1993, Plurals and events. Cambridge: MIT Press) and Rayo (2002, Noûs, 36, 436–464) suggests that, in order to characterize the semantics of plurals, we should not use predicate logic, but plural logic, a formal language whose terms may refer to several things at once. We show that a similar dilemma applies to mass nouns. If we use predicate logic and sets when characterizing their semantics, we arrive at a Russellian paradox. And if we use predicate logic and mereological sums, the semantics turns out to be too weak. We then develop an account where mass nouns are treated as nonsingular terms. This semantics is faithful to the intuition that, if there are eight pieces of silverware on a table, the speaker refers to eight things at once when he says: The silverware that is on the table comes from Italy. We show that this account provides a satisfactory semantics for a wide range of sentences.
By
Singh, Raj
16 Citations
Hurford’s Constraint (Hurford, Foundations of Language, 11, 409–411, 1974) states that a disjunction is infelicitous if its disjuncts stand in an entailment relation: #John was born in Paris or in France. Gazdar (Pragmatics, Academic Press, NY, 1979) observed that scalar implicatures can obviate the constraint. For instance, sentences of the form (A or B) or (Both A and B) are felicitous due to the exclusivity implicature of the first disjunct: A or B implicates ‘not (A and B)’. Chierchia, Fox, and Spector (Handbook of semantics, 2008) use the obviation of Hurford’s Constraint in these cases to argue for a theory of local implicature. I present evidence indicating that the constraint needs to be modified in two ways. First, implicatures can obviate Hurford’s Constraint only in earlier disjuncts, not later ones: #(Both A and B) or (A or B). Second, the constraint rules out not only disjuncts that stand in an entailment relation, but also disjuncts that are even mutually consistent: #John is from Russia or Asia. I propose to make sense of these facts by providing an incremental evaluation procedure which checks that each new disjunct to the right is inconsistent with the information to its left, before the disjunct can be strengthened by local implicature.
By
Torrens, Antoni
3 Citations
In a classical paper [15] V. Glivenko showed that a proposition is classically demonstrable if and only if its double negation is intuitionistically demonstrable. This result has an algebraic formulation: the double negation is a homomorphism from each Heyting algebra onto the Boolean algebra of its regular elements. Versions of both the logical and algebraic formulations of Glivenko’s theorem, adapted to other systems of logics and to algebras not necessarily related to logic can be found in the literature (see [2, 9, 8, 14] and [13, 7, 14]). The aim of this paper is to offer a general frame for studying both logical and algebraic generalizations of Glivenko’s theorem. We give abstract formulations for quasivarieties of algebras and for equivalential and algebraizable deductive systems and both formulations are compared when the quasivariety and the deductive system are related. We also analyse Glivenko’s theorem for compatible expansions of both cases.
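For reference, the two classical formulations mentioned above can be written symbolically as follows. The notation is chosen here and is not taken from the paper: CL and IL stand for classical and intuitionistic propositional logic, H for an arbitrary Heyting algebra, and Reg(H) for its set of regular elements.

```latex
% Logical form: classical provability vs. intuitionistic provability of the double negation
\vdash_{\mathrm{CL}} \varphi
\quad\Longleftrightarrow\quad
\vdash_{\mathrm{IL}} \neg\neg\varphi

% Algebraic form: double negation maps H onto the Boolean algebra of its regular elements
\neg\neg \colon H \twoheadrightarrow \mathrm{Reg}(H),
\qquad
\mathrm{Reg}(H) = \{\, x \in H : \neg\neg x = x \,\}
```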
By
Kostrzycka, Zofia; Zaionc, Marek
4 Citations
This paper presents a systematic approach for obtaining results from the area of quantitative investigations in logic and type theory. We investigate the ratio of the number of tautologies (inhabited types) of a given length n to the number of all formulas (types) of length n, and study the asymptotic behavior of this fraction. Furthermore, we characterize the relation between the number of premises of an implicational formula (type) and the asymptotic probability of finding such a formula among all formulas. We also deal with the distribution of these asymptotic probabilities. Using the same approach, we also prove that the probability that a randomly chosen fourth-order type (or a type of order not greater than 4) admits a decidable lambda-definability problem is zero.
By
Wansing, Heinrich; Shramko, Yaroslav
9 Citations
According to Suszko’s Thesis, there are but two logical values, true and false. In this paper, R. Suszko’s, G. Malinowski’s, and M. Tsuji’s analyses of logical two-valuedness are critically discussed. Another analysis is presented, which favors a notion of a logical system as encompassing possibly more than one consequence relation.
[A] fundamental problem concerning many-valuedness is to know what it really is.
[13, p. 281]
By
Cresto, Eleonora
The paper suggests a way of modeling belief changes within the tradition of formal belief revision theories. The present model extends the scope of traditional proposals, such as AGM, so as to take care of “structural belief changes” – a type of radical shift that is best illustrated with, but not limited to, instances of scientific discovery; we obtain AGM expansions and contractions as limiting cases. The representation strategy relies on a nonstandard use of a semantic machinery. More precisely, the model seeks to correlate knowledge states with interpretations of a given formal language L, in such a way that the epistemic state of an agent at a given time gives rise to a picture of how things could be, if there weren’t anything else to know. Interpretations of L proceed along supervaluational ideas; hence, the model as a whole can be seen as a particular application of supervaluational semantics to epistemic matters.
By
Spinks, Matthew; Veroff, Robert
14 Citations
The goal of this two-part series of papers is to show that constructive logic with strong negation N is definitionally equivalent to a certain axiomatic extension NFL_{ew} of the substructural logic FL_{ew}. In this paper, it is shown that the equivalent variety semantics of N (namely, the variety of Nelson algebras) and the equivalent variety semantics of NFL_{ew} (namely, a certain variety of FL_{ew}-algebras) are term equivalent. This answers a longstanding question of Nelson [30]. Extensive use is made of the automated theorem-prover Prover9 in order to establish the result.
The main result of this paper is exploited in Part II of this series [40] to show that the deductive systems N and NFL_{ew} are definitionally equivalent, and hence that constructive logic with strong negation is a substructural logic over FL_{ew}.
By
Brasoveanu, Adrian
19 Citations
The paper argues that two distinct and independent notions of plurality are involved in natural language anaphora and quantification: plural reference (the usual non-atomic individuals) and plural discourse reference, i.e., reference to a quantificational dependency between sets of objects (e.g., atomic/non-atomic individuals) that is established and subsequently elaborated upon in discourse. Following van den Berg (PhD dissertation, University of Amsterdam, 1996), plural discourse reference is modeled as plural information states (i.e., as sets of variable assignments) in a new dynamic system couched in classical type logic that extends Compositional DRT (Muskens, Linguistics and Philosophy, 19, 143–186, 1996). Given the underlying type logic, compositionality at sub-clausal level follows automatically and standard techniques from Montague semantics become available. The idea that plural info states are semantically necessary (in addition to non-atomic individuals) is motivated by relative-clause donkey sentences with multiple instances of singular donkey anaphora that have mixed (weak and strong) readings. At the same time, allowing for non-atomic individuals in addition to plural info states enables us to capture the intuitive parallels between singular and plural (donkey) anaphora, while deriving the incompatibility between singular (donkey) anaphora and collective predicates. The system also accounts for empirically unrelated phenomena, e.g., the uniqueness effects associated with singular (donkey) anaphora discussed in Kadmon (Linguistics and Philosophy, 13, 273–324, 1990) and Heim (Linguistics and Philosophy, 13, 131–177, 1990) among others.
By
Schulte im Walde, Sabine
2 Citations
This article investigates whether human associations to verbs as collected in a web experiment can help us to identify salient features for semantic verb classes. Starting from the assumption that the associations, i.e., the words that are called to mind by the stimulus verbs, reflect highly salient linguistic and conceptual features of the verbs, we apply a cluster analysis to the verbs, based on the associations, and validate the resulting verb classes against standard approaches to semantic verb classes. Then, we perform various clusterings on the same verbs using standard corpus-based feature types, and evaluate them against the association-based clustering as well as GermaNet and FrameNet classes. Comparing the cluster analyses provides an insight into the usefulness of standard feature types in verb clustering, and assesses shallow vs. deep syntactic features, and the role of corpus frequency. We show that (a) there is no significant preference for using a specific syntactic relationship (such as direct objects) as nominal features in clustering; (b) simple window co-occurrence features are not significantly worse (and in some cases even better) than selected grammar-based functions; and (c) a restricted feature choice disregarding high- and low-frequency features is sufficient. Finally, by applying the feature choices to GermaNet and FrameNet verbs and classes, we address the question of whether the same types of features are salient for different types of semantic verb classes. The variation of the gold standard classifications demonstrates that the clustering results are significantly different, even when relying on the same features.
By
Shen, Libin; Champollion, Lucas; Joshi, Aravind K.
4 Citations
We introduce LTAG-spinal, a novel variant of traditional Lexicalized Tree Adjoining Grammar (LTAG) with desirable linguistic, computational and statistical properties. Unlike in traditional LTAG, subcategorization frames and the argument–adjunct distinction are left underspecified in LTAG-spinal. LTAG-spinal with adjunction constraints is weakly equivalent to LTAG. The LTAG-spinal formalism is used to extract an LTAG-spinal Treebank from the Penn Treebank with Propbank annotation. Based on Propbank annotation, predicate coordination and LTAG adjunction structures are successfully extracted. The LTAG-spinal Treebank makes explicit semantic relations that are implicit or absent from the original PTB. LTAG-spinal provides a very desirable resource for statistical LTAG parsing, incremental parsing, dependency parsing, and semantic parsing. This treebank has been successfully used to train an incremental LTAG-spinal parser and a bidirectional LTAG dependency parser.
By
Itai, Alon; Wintner, Shuly
26 Citations
We describe a suite of standards, resources and tools for computational encoding and processing of Modern Hebrew texts. These include an array of XML schemas for representing linguistic resources; a variety of text corpora, raw, automatically processed and manually annotated; lexical databases, including a broad-coverage monolingual lexicon, a bilingual dictionary and a WordNet; and morphological processors which can analyze, generate and disambiguate Hebrew word forms. The resources are developed under centralized supervision, so that they are compatible with each other. They are freely available and many of them have already been used for several applications, both academic and industrial.
