Hutchins, John
18 Citations
The first proposals for various component tools of what is now called the “translator's workstation” or “translator's workbench” are traced back to the 1970s and early 1980s in various, often independent, proposals at different stages in the development of computers and in their use by translators.
Lee, JaeWon; Seo, Jungyun; Kim, Gil Chang
1 Citations
In some cases, to make a proper translation of an utterance in a dialogue, different pieces of contextual information are needed. Interpreting such utterances often requires dialogue analysis including speech acts and discourse analysis. In this paper, a statistical dialogue analysis model for Korean–English dialogue machine translation based on speech acts is proposed. The model uses syntactic patterns and n-grams of speech acts. The syntactic patterns include surface syntactic features which are related to the language-dependent expressions of speech acts. Speech-act n-grams are used to approximate the context of utterances. The key feature is the use of speech-act n-grams based on hierarchical recency. Experimental results with trigrams show that the proposed model achieves an accuracy of 66.87% for the top candidate and 82.35% for the top three candidates. This indicates that the proposed model based on hierarchical recency outperforms the model based on linear recency.
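As a rough illustration of how speech-act trigrams can rank candidate speech acts, the following minimal sketch counts trigrams over tagged dialogues and returns the top candidates for the next act. It is only a plain linear-recency baseline, not the hierarchical-recency model of the paper, and it omits the syntactic patterns entirely. The tag names, the toy dialogues, and the function names `train_trigrams` and `top_candidates` are assumptions made for the example.

```python
from collections import Counter, defaultdict

def train_trigrams(dialogues):
    """Count how often each speech act follows a pair of preceding acts."""
    counts = defaultdict(Counter)
    for acts in dialogues:
        # Pad with two start symbols so the first acts also have a context.
        padded = ["<s>", "<s>"] + list(acts)
        for a, b, c in zip(padded, padded[1:], padded[2:]):
            counts[(a, b)][c] += 1
    return counts

def top_candidates(counts, prev2, prev1, k=3):
    """Return up to k speech acts most often seen after (prev2, prev1)."""
    context = counts.get((prev2, prev1), Counter())
    return [act for act, _ in context.most_common(k)]

# Hypothetical toy corpus: each dialogue is a sequence of speech-act tags.
dialogues = [
    ["greet", "ask", "answer", "thank"],
    ["greet", "ask", "answer", "ask", "answer", "thank"],
    ["greet", "ask", "confirm", "answer"],
]
counts = train_trigrams(dialogues)
print(top_candidates(counts, "greet", "ask", k=1))
print(top_candidates(counts, "greet", "ask", k=3))
```

Evaluating with `k=1` and `k=3` mirrors the "top candidate" and "top three candidates" accuracy figures reported in the abstract, although the numbers from such a toy model are of course not comparable.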
Sheremetyeva, Svetlana; Jin, Wanying; Nirenburg, Sergei
This paper presents a model of morphological analysis which, unlike practically all other approaches to computational morphology, does not rely on a large dictionary of stems. Consequently, the effort to acquire the static knowledge sources for systems based on this model is significantly smaller. The model is motivated by engineering concerns: the considerations of economy and efficiency led to the use of nontraditional definitions of morphemes. The model has been implemented in the RDM system in the framework of the Corelli project at CRL. It was initially done on Russian material and then successfully applied to Serbo-Croatian. Examples in this paper are mainly from RDMRussian.
Geurts, Bart
24 Citations
This paper consists of two main parts and a coda. In the first part I present the 'binding theory' of presupposition projection, which is the framework that I adopt in this paper (Section 1.1). I outline the main problems that arise in the interplay between presuppositions and anaphors on the one hand and attitude reports on the other (Section 1.2), and discuss Heim's theory of presuppositions in attitude contexts (Section 1.3).
In the second part of the paper I present my own proposal. To begin with, I define an extension of DRT in which attitude reports can be represented (Sections 2.1–2.2). I then argue that the verb believe triggers a certain presupposition and that, given the binding theory, this presupposition determines the projection behaviour of the verb (Section 2.3). This analysis yields predictions which are incomplete in the sense that they do not fully account for speakers' intuitions about presuppositions and anaphors in belief contexts. In Section 2.4 I suggest that this is as it should be because we may assume on independent grounds that there is a class of plausibility inferences which complement the predictions of the presupposition theory. Finally, the analysis is extended to the verb want (Section 2.5).
The paper concludes with a brief discussion of related phenomena in other domains: modals, quantifiers, and nondeclarative speech acts (Section 3).
Kerth, Rainer
13 Citations
We will present several results on two types of continuous models of λ-calculus, namely graph models and extensional models. By introducing a variant of Engeler's model construction, we are able to generalize the results of [7] and to give invariants that determine a large family of graph models up to applicative isomorphism. This covers all graph models considered in the literature so far. We indicate briefly how these invariants may be modified in order to determine extensional models as well.
Furthermore, we use our construction to exhibit $2^{\aleph_0}$ graph models that are not equationally equivalent. We indicate once again how the construction passes on to extensional models.
Elliott, Ward E. Y.; Valenza, Robert J.
In “Response to Elliott and Valenza, 'And Then There Were None'” (1996), Donald Foster has taken strenuous issue with our Shakespeare Clinic's final report, which concluded that none of the testable Shakespeare claimants, and none of the Shakespeare Apocrypha poems and plays (including Funeral Elegy by W.S.) match Shakespeare. Though he seems to accept most of our exclusions, notably excepting those of the Elegy and A Lover's Complaint, he believes that our methodology is nonetheless fatally flawed by “worthless figures ... wrong more often than right”, “rigorous cherry-picking”, “playing with a stacked deck”, and “conveniently exil[ing] ... inconvenient data.” He describes our tests as “foul vapor” and “methodological madness.”
We believe that this criticism is seriously overdrawn, and that our tests and conclusions have emerged essentially intact. By our count, he claims to have found 21 errors of consequence in our report. Only five of these claims, all trivial, have any validity at all. If fully proved, they might call for some cautions and slight refinements for five of our 54 tests, but in no case would they come close to invalidating the questioned test. The remaining 49 tests are wholly intact. Total erosion of our findings from the Foster critique could amount, at most, to half of one percent. None of his accusations of cherry-picking, deck-stacking, and evidence-ignoring are substantiated.
Dzierzgowski, Daniel
In this paper, we prove that Heyting's arithmetic can be interpreted in an intuitionistic version of Russell's Simple Theory of Types without extensionality.
Foster, Donald W.
Ward Elliott (from 1987) and Robert Valenza (from 1989) set out to find the “true” Shakespeare from among 37 anti-Stratfordian “Claimants.” As directors of the Claremont Shakespeare Authorship Clinic, Elliott and Valenza developed novel attributional tests, from which they concluded that most “Claimants” are “not-Shakespeare.” From 1990 to 1994, Elliott and Valenza developed tests purporting further to reject much of the Shakespeare canon as “not-Shakespeare” (1996a). Foster (1996b) details extensive and persistent flaws in the Clinic's work: data were collected haphazardly; canonical and comparative text-samples were chronologically mismatched; procedural controls for genre, stanzaic structure, and date were lacking. Elliott and Valenza counter by estimating maximum erosion of the Clinic's findings to include “five of our 54 tests”, which can “amount, at most, to half of one percent” (1998). This essay provides a brief history, showing why the Clinic foundered. Examining several of the Clinic's representative tests, I evaluate claims that Elliott and Valenza continue to make for their methodology. A final section addresses doubts about accuracy, validity and replicability that have dogged the Clinic's work from the outset.
Bezhanishvili, Guram
21 Citations
This paper deals with the varieties of monadic Heyting algebras, algebraic models of intuitionistic modal logic MIPC. We investigate semisimple, locally finite, finitely approximated and splitting varieties of monadic Heyting algebras as well as varieties with the disjunction and the existence properties. The investigation of monadic Heyting algebras clarifies the correspondence between intuitionistic modal logics over MIPC and superintuitionistic predicate logics and provides us with the solutions of several problems raised by Ono [35].
Madarász, Judit X.
14 Citations
Continuing work initiated by Jónsson, Daigneault, Pigozzi and others, Maksimova proved that a normal modal logic (with a single unary modality) has the Craig interpolation property iff the corresponding class of algebras has the superamalgamation property (cf. [Mak 91], [Mak 79]). The aim of this paper is to extend the latter result to a large class of logics. We will prove that the characterization can be extended to all algebraizable logics containing a Boolean fragment and having a certain kind of local deduction property. We also extend this characterization of the interpolation property to arbitrary logics under the condition that their algebraic counterparts are discriminator varieties. We also extend Maksimova's result to normal multimodal logics with arbitrarily many, not necessarily unary modalities, and to not necessarily normal multimodal logics with modalities of ranks smaller than 2, too.
The problem of extending the above characterization result to non-normal non-unary modal logics remains open.
Related issues of universal algebra and of algebraic logic are discussed, too. In particular we investigate the possibility of extending the characterization of interpolability to arbitrary algebraizable logics.
Plotkin, Tatjana L.; Kraus, Sarit; Plotkin, Boris I.
1 Citations
The paper is devoted to applications of algebraic logic to databases. In databases a query is represented by a formula of first-order logic. The same query can be associated with different formulas. Thus, a query is a class of equivalent formulas, equivalence here being similar to that in the transition to the Lindenbaum–Tarski algebra. An algebra of queries is identified with the corresponding algebra of logic. An algebra of replies to the queries is also associated with algebraic logic. These relations lie at the core of the applications.
In this paper it is shown how the theory of Halmos (polyadic) algebras (a notion introduced by Halmos as a tool in the algebraization of the first-order predicate calculus) is used to create the algebraic model of a relational database. The model allows us, in particular, to solve the problem of database equivalence as well as develop a formal algebraic definition of a database's state description. In this paper we use the term "state description" for the logical description of the model. This description is based on the notion of filters in Halmos algebras. When speaking of a state description, we mean the description of a function which realizes the symbols of relations as real relations in the given system of data.
Dayal, Veneeta
45 Citations
The primary theoretical focus of this paper is on Free Choice uses of any, in particular on two phenomena that have remained largely unstudied. One involves the ability of any phrases to occur in affirmative episodic statements when aided by suitable noun modifiers. The other involves the difference between modals of necessity and possibility with respect to licensing of any. The central thesis advanced here is that FC any is a universal determiner whose domain of quantification is not a set of particular individuals but the set of possible individuals of the relevant kind. In a theory of genericity utilizing situations, an any phrase can be seen as having a universal quantifier binding the situation variable of the common noun. This inherent genericity is argued to be at the heart of the intuition that any statements support counterfactual inferences and do not involve existential commitments. A conflict in presuppositions is shown to account for the incompatibility of unmodified any phrases in affirmative episodic statements and the crucial role played by modification in ameliorating this clash is explicated. In the case of modals of necessity, the interaction between the universal force of any and the particular modal base is shown to be crucial. In view of these facts it is argued that FC any is not directly licensed by modal or generic operators as generally assumed but that its felicitous use is sensitive to the pragmatics of epistemic modality. Turning to its polarity sensitive uses, language internal as well as crosslinguistic evidence is presented to distinguish it from FC any in having the existential quantificational force typical of indefinites. The paper concludes by suggesting that the common tie between them is that they both occur in statements that apply to a class of entities, rather than to particular members of the class.
Tweedie, Fiona J.; Baayen, R. Harald
136 Citations
A well-known problem in the domain of quantitative linguistics and stylistics concerns the evaluation of the lexical richness of texts. Since the most obvious measure of lexical richness, the vocabulary size (the number of different word types), depends heavily on the text length (measured in word tokens), a variety of alternative measures has been proposed which are claimed to be independent of the text length. This paper has a threefold aim. Firstly, we have investigated to what extent these alternative measures are truly textual constants. We have observed that in practice all measures vary substantially and systematically with the text length. We also show that in theory, only three of these measures are truly constant or nearly constant. Secondly, we have studied the extent to which these measures tap into different aspects of lexical structure. We have found that there are two main families of constants, one measuring lexical richness and one measuring lexical repetition. Thirdly, we have considered to what extent these measures can be used to investigate questions of textual similarity between and within authors. We propose to carry out such comparisons by means of the empirical trajectories of texts in the plane spanned by the dimensions of lexical richness and lexical repetition, and we provide a statistical technique for constructing confidence intervals around the empirical trajectories of texts. Our results suggest that the trajectories tap into a considerable amount of authorial structure without, however, guaranteeing that spatial separation implies a difference in authorship.
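The dependence of vocabulary size on text length can be seen with a very small experiment. The sketch below tracks the number of word types and the type-token ratio as a text grows; the type-token ratio is chosen here only as the simplest illustrative measure and is an assumption of this example, as are the toy text and the function name `richness_trajectory`. It does not reproduce any of the specific measures or the confidence-interval technique of the paper.

```python
def richness_trajectory(tokens, step):
    """Return (text length, vocabulary size, type-token ratio) at every
    multiple of `step` tokens, reading the text from the start."""
    seen = set()
    points = []
    for i, tok in enumerate(tokens, start=1):
        seen.add(tok)
        if i % step == 0:
            points.append((i, len(seen), len(seen) / i))
    return points

# Hypothetical toy text; a real study would use full-length texts.
text = ("the cat sat on the mat and the dog sat on the log "
        "while the cat and the dog looked at the mat").split()

for n_tokens, n_types, ttr in richness_trajectory(text, 6):
    print(n_tokens, n_types, round(ttr, 2))
```

Even on this toy input the vocabulary size keeps growing while the ratio falls as more tokens are read, which is the kind of systematic length dependence the abstract describes.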
Skvortsov, Dmitrij
The Kripke-completeness and incompleteness of some intermediate predicate logics is established. In particular, we obtain a Kripke-incomplete logic (H* + A + D + K), where H* is the intuitionistic predicate calculus, A is a disjunction-free propositional formula, D = ∀x(P(x) ∨ Q) ⊃ ∀xP(x) ∨ Q, and K = ¬¬∀x(P(x) ∨ ¬P(x)) (the negative answer to a question of T. Shimura).
Kisielewicz, Andrzej
1 Citations
Using two distinct membership symbols makes it possible to base set theory on one general axiom schema of comprehension. Is the resulting system consistent? Can set theory and mathematics be based on a single axiom schema of comprehension?
Bäuerle, Frank A.; Albrecht, David; Crossley, John N.; Jeavons, John S.
In this paper we (1) provide a natural deduction system for full first-order linear logic, (2) introduce Curry–Howard-style terms for this version of linear logic, (3) extend the notion of substitution of Curry–Howard terms for term variables, (4) define the reduction rules for the Curry–Howard terms and (5) outline a proof of the strong normalization for the full system of linear logic using a development of Girard's candidates for reducibility, thereby providing an alternative to Girard's proof using proof-nets.
Gabbay, Dov M.; Olivetti, Nicola
4 Citations
In this work we develop goal-directed deduction methods for the implicational fragment of several modal logics. We give sound and complete procedures for strict implication of K, T, K4, S4, K5, K45, KB, KTB, S5, G and for some intuitionistic variants. In order to achieve a uniform and concise presentation, we first develop our methods in the framework of Labelled Deductive Systems [Gabbay 96]. The proof systems we present are strongly analytical and satisfy a basic property of cut admissibility. We then show that for most of the systems under consideration the labelling mechanism can be avoided by choosing an appropriate way of structuring theories. One peculiar feature of our proof systems is the use of restart rules, which allow one to re-ask the original goal of a deduction. In the case of K, K4, S4 and G, we can eliminate such a rule without losing completeness. In all the other cases, by dropping such a rule, we get an intuitionistic variant of each system. The present results are part of a larger project of a goal-directed proof theory for nonclassical logics; the purpose of this project is to show that most implicational logics stem from slight variations of a unique deduction method, and from different ways of structuring theories. Moreover, the proof systems we present follow the logic programming style of deduction and seem promising for proof search [Gabbay and Reyle 84, Miller et al. 91].
van Halteren, Hans
This paper examines the feasibility of incremental annotation, i.e. using existing annotation on a text as the basis for further annotation rather than starting the new annotation from scratch. It contains a theoretical component, describing basic methodology and potential obstacles, as well as a practical component, describing an experiment which tests the efficiency of incremental annotation. Apart from guidelines for the execution of such pilot experiments, the experiment demonstrates that incremental annotation is most effective when supported by thorough preplanning and documentation. Unplanned, opportunistic use of existing annotation is much less effective in its reduction of annotation time and furthermore increases the development time of the annotation software, so that this type of incremental annotation appears only practical for large amounts of heritage data.
Goranko, Valentin
6 Citations
A certain type of inference rule in (multi)modal logics, generalizing Gabbay's Irreflexivity rule, is introduced and some general completeness results about modal logics axiomatized with such rules are proved.
Fujita, Kenetsu
4 Citations
There is an intimate connection between proofs of the natural deduction systems and typed lambda calculus. It is well-known that in simply typed lambda calculus, the notion of formulae-as-types makes it possible to find fine structure of the implicational fragment of intuitionistic logic, i.e., relevant logic, BCK-logic and linear logic. In this paper, we investigate three classical substructural logics (GL, GLc, GLw) of Gentzen's sequent calculus consisting of implication and negation, which contain some of the right structural rules. In terms of Parigot's λμ-calculus with proper restrictions, we introduce a proof term assignment to these classical substructural logics. According to these notions, we can classify the λμ-terms into four categories. It is proved that well-typed GLx-λμ-terms correspond to GLx proofs, and that a GLx-λμ-term has a principal type if stratified where x is nil, c, w or cw. Moreover, we investigate embeddings of classical substructural logics into the corresponding intuitionistic substructural logics. It is proved that the Gödel-style translations of GLx-λμ-terms are embeddings preserving substructural logics. As byproducts, it is obtained that an inhabitation problem is decidable and well-typed GLx-λμ-terms are strongly normalizable.
Hall, Steven
2 Citations
Chadwyck-Healey has a long tradition of electronic publishing. Beginning with production of CD-based literary corpora, it has recently moved many of its products to a web-accessible online environment. The article reflects on experiences with both CD and web-based publications.
Driscoll, Adrian; Scott, Brad
This article describes how an independent commercial academic publisher initiated its electronic publishing programme. It outlines the range of electronic activities under development and some of the issues addressed during the creation of electronic resources. Case studies of two early projects are included: a multimedia teaching tool, A Right to Die? The Dax Cowart Case; and an SGML textbase, the Arden Shakespeare CD-ROM. In addition, the Routledge Encyclopedia of Philosophy is discussed as an example of the second generation of electronic projects at Routledge, highlighting lessons learned from previous projects and some of the issues relating to the production of a simultaneous print and electronic resource.
Robinson, Peter; Taylor, Kevin
3 Citations
The article reports on one of the more sophisticated critical editions ever to be published in electronic format. The Wife of Bath is richly encoded, provides access to literally thousands of manuscript images, and enables users to assess the relationships between the numerous extant manuscript editions. The authors assess the methods used in the edition's development and the lessons learned through its production.
Stachniak, Zbigniew
A proof-theoretical analysis of finite-valuedness in the domain of cumulative inference systems is presented.
Di Nola, Antonio; Grigolia, Revaz; Panti, Giovanni
18 Citations
The MV-algebra $S_m^{\omega}$ is obtained from the (m+1)-valued Łukasiewicz chain by adding infinitesimals, in the same way as Chang's algebra is obtained from the two-valued chain. These algebras were introduced by Komori in his study of varieties of MV-algebras. In this paper we describe the finitely generated totally ordered algebras in the variety $MV_m^{\omega}$ generated by $S_m^{\omega}$. This yields an easy description of the free $MV_m^{\omega}$-algebras over one generator. We characterize the automorphism groups of the free MV-algebras over finitely many generators.
Gispert, Joan; Torrens, Antoni
4 Citations
In this paper we show that the quasivariety generated by an infinite simple MV-algebra only depends on the rationals which it contains. We extend this property to arbitrary families of simple MV-algebras.
Hähnle, Reiner
12 Citations
We provide tools for a concise axiomatization of a broad class of quantifiers in many-valued logic, so-called distribution quantifiers. Although sound and complete axiomatizations for such quantifiers exist, their size renders them virtually useless for practical purposes. We show that for quantifiers based on finite distributive lattices compact axiomatizations can be obtained schematically. This is achieved by providing a link between skolemized signed formulas and filters/ideals in Boolean set lattices. Then lattice theoretic tools such as Birkhoff's representation theorem for finite distributive lattices are used to derive tableau-style axiomatizations of distribution quantifiers.
Höhle, Ulrich
13 Citations
Q-valued sets are nonclassical models of the formalized theory of identity with existence predicate based on the axioms of a noncommutative and nonidempotent logic. The singleton monad on the category of Q-valued sets is constructed, and elementary properties of T-algebras of the singleton monad are investigated.
Baaz, Matthias; Fermüller, Christian G.; Salzer, Gernot; Zach, Richard
23 Citations
A general class of labeled sequent calculi is investigated, and necessary and sufficient conditions are given for when such a calculus is sound and complete for a finite-valued logic if the labels are interpreted as sets of truth values (sets-as-signs). Furthermore, it is shown that any finite-valued logic can be given an axiomatization by such a labeled calculus using arbitrary "systems of signs," i.e., of sets of truth values, as labels. The number of labels needed is logarithmic in the number of truth values, and it is shown that this bound is tight.
Fleming, Dan
This paper describes and analyses a web-based preprints project in the UK's Electronic Libraries Programme in order to raise issues about the forms of scholarship that are best suited to online working. Specifically, the paper describes some of the underlying processes at work in academic research and seeks to match these, where appropriate, to forms of online working. In doing so, the paper describes in detail a scholarship of integration which seems well suited to online tools such as preprints systems, but speculates that such forms of scholarship are too seldom explicitly identified when academics refer to research as a totality. As a consequence the potential match between working practices and emerging tools may not be obvious to academic researchers. To investigate these issues further, the paper examines the degrees of formality involved in different kinds of online communication and describes how academic working practices might be supported by adapting established ‘groupware’ tools such as Lotus Notes. The eLib ‘Formations’ project, which is using Notes to develop an integrated preprints and e-journal system for research in cultural studies and related fields, is described in detail, focusing on the underlying technology and the overall design.
Baaz, Matthias; Hájek, Petr; Švejda, David; Krajíček, Jan
13 Citations
We construct a faithful interpretation of Łukasiewicz's logic in product logic (both propositional and predicate). Using known facts it follows that the product predicate logic is not recursively axiomatizable.
We prove a completeness theorem for product logic extended by a unary connective δ of Baaz [1]. We show that Gödel's logic is a sublogic of this extended product logic.
We also prove NP-completeness of the set of propositional formulas satisfiable in product logic (resp. in Gödel's logic).
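For orientation, the three logics named in this abstract are commonly given truth values in [0, 1], each with its own conjunction (a t-norm) and the matching residual implication. The sketch below is not taken from the paper: it writes down the standard textbook definitions under hypothetical function names and checks the residuation property (x & z ≤ y exactly when z ≤ x → y) on a small grid of exact fractions.

```python
from fractions import Fraction

ONE, ZERO = Fraction(1), Fraction(0)

# Conjunctions (t-norms) of the three logics on [0, 1].
def lukasiewicz_and(x, y):
    return max(ZERO, x + y - 1)

def product_and(x, y):
    return x * y

def goedel_and(x, y):
    return min(x, y)

# Residual implications belonging to each conjunction.
def lukasiewicz_imp(x, y):
    return min(ONE, 1 - x + y)

def product_imp(x, y):
    return ONE if x <= y else y / x

def goedel_imp(x, y):
    return ONE if x <= y else y

def residuated(conj, imp, grid):
    """Check that conj(x, z) <= y holds exactly when z <= imp(x, y)."""
    return all((conj(x, z) <= y) == (z <= imp(x, y))
               for x in grid for y in grid for z in grid)

GRID = [Fraction(i, 10) for i in range(11)]
PAIRS = {
    "Lukasiewicz": (lukasiewicz_and, lukasiewicz_imp),
    "product": (product_and, product_imp),
    "Goedel": (goedel_and, goedel_imp),
}
for name, (conj, imp) in PAIRS.items():
    print(name, residuated(conj, imp, GRID))
```

Exact fractions are used instead of floats so that the boundary cases of the comparison are not disturbed by rounding.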
Raskin, Victor; Nirenburg, Sergei
11 Citations
This paper is devoted to determining and representing adjectival meaning. The results form a microtheory in the Mikrokosmos project on computational ontological semantics. Mikrokosmos microtheories cover the meaning of lexical categories in several languages, the ontological model used as metalanguage for language description and syntax–semantics mapping as well as the actual process of text analysis and generation. This paper presents a critical analysis of the body of knowledge on adjectives amassed to date in linguistics and presents a detailed, practically tested methodology and heuristics for the acquisition of adjectival lexical entries for computational applications. The work is based on the set of over 6,000 English and about 1,500 Spanish adjectives obtained from task-oriented corpora.
Asher, Nicholas; Lascarides, Alex
19 Citations
In this paper we explore how compositional semantics, discourse structure, and the cognitive states of participants all contribute to pragmatic constraints on answers to questions in dialogue. We synthesise formal semantic theories on questions and answers with techniques for discourse interpretation familiar from computational linguistics, and show how this provides richer constraints on responses in dialogue than either component can achieve alone.
Bond, Francis; Ogura, Kentaro
3 Citations
This paper shows the necessity of distinguishing different referential uses of NPs in Machine Translation. We propose a three-way distinction between the generic, referential and ascriptive uses of noun phrases (NPs), and argue that this is the minimum necessary to generate articles and number correctly when translating from Japanese to English. A detailed algorithm is proposed for determining the referentiality of Japanese NPs, based on a defeasible hierarchy of pragmatic rules that are applied top-down, from the clause to the NP. We also sketch the process of generating determiners and number using rules based on the different NP referentialities for a Japanese–English MT system. Using the proposed heuristics has raised the percentage of NPs generated with correct use of articles and number in the Japanese–English MT system ALT-J/E from 65% to 85%.
King, Paul John; Simov, Kiril Ivanov
1 Citations
Classifying linguistic objects is a widespread and important linguistic task, but hand deducing a classificatory system from a general linguistic theory can consume much effort and introduce pernicious errors. We present an abstract prototype device that effectively deduces an accurate classificatory system from a finite linguistic theory.
Mitrana, Victor
1 Citations
Two strategies of parallel adjoining of contexts are considered for contextual grammars with choice. After a short comparison between them, Chomsky–Schützenberger type characterizations of context-free and recursively enumerable languages are provided.
Kappes, Martin
2 Citations
Bracketed contextual grammars are contextual grammars with an induced Dyck-structure to control the derivation process and to provide derivation trees. In this paper, we study the generative capacity and closure properties of bracketed and fully bracketed contextual grammars. It will be shown that some subclasses of such grammars are strictly included in the context-free languages and that there are regular languages which cannot be generated by any bracketed contextual grammar.
Nelson, George C.
Many results concerning the equivalence between a syntactic form of formulas and model-theoretic conditions are proven directly without using any form of a continuum hypothesis. In particular, it is demonstrated that any reduced product sentence is equivalent to a Horn sentence. Moreover, in any first-order language without equality one now has that a reduced product sentence is equivalent to a Horn sentence and any sentence is equivalent to a Boolean combination of Horn sentences.
Kornai, András
From the perspective of the linguist, the theory of formal languages serves as an abstract model to address issues such as complexity, learnability, information content, etc. which are hard to investigate directly on natural languages. One question that has not been sufficiently addressed in the literature is to what extent can a result proved on an abstract model be presumed to hold for the concrete languages that are, after all, the real object of interest in linguistics. In this paper we attempt to remedy this defect by developing some figures of merit that measure how well a formal language approximates an actual language. We will review and refine some standard notions of mathematical density to arrive at a numerical figure that shows the degree to which one language approximates another, and show how such a figure can be computed between some formal languages and empirically measured between a real language and its formal model. In the concluding section of the paper we will argue that from the statistical perspective developed here even some classical results of mathematical linguistics, such as Chomsky's (1957) demonstration of the inadequacy of finite state models, are highly suspect.
Hollenberg, Marco
3 Citations
Negative definability ([18]) is an alternative way of defining classes of Kripke frames via a modal language, one that enables us, for instance, to define the class of irreflexive frames. Besides a list of closure conditions for negatively definable classes, the paper contains two main theorems. First, a characterization is given of negatively definable classes of (rooted) finite transitive Kripke frames and of such classes defined using both traditional (positive) and negative definitions. Second, we characterize the negatively definable classes of rooted general frames.
van Benthem, Johan; D'Agostino, Giovanna; Montanari, Angelo; Policriti, Alberto
6 Citations
In this paper, we generalize the set-theoretic translation method for polymodal logic introduced in [11] to extended modal logics. Instead of devising an ad hoc translation for each logic, we develop a general framework within which a number of extended modal logics can be dealt with. We first extend the basic set-theoretic translation method to weak monadic second-order logic through a suitable change in the underlying set theory that connects up in interesting ways with constructibility; then, we show how to tailor such a translation to work with specific cases of extended modal logics.
Dvurečenskij, Anatolij; Kim, Hee Sik
2 Citations
We discuss the interrelations between BCK-algebras and posets with difference. Applications are given to bounded commutative BCK-algebras, difference posets, MV-algebras, quantum MV-algebras and orthoalgebras.
Yamada, Kenji
1 Citations
Real-world natural language sentences are often long and complex, and contain unexpected grammatical constructions. They even include noise and ungrammaticality. This paper describes the Controlled Skip Parser, a program that parses such real-world sentences by skipping some of the words in the sentence. The new feature of this parser is that it controls its behavior by finding out which words to skip, without using domain-specific knowledge. The parser is a priority-based chart parser. By assigning appropriate priority levels to the constituents in the chart, the parser's behavior is controlled. Statistical information is used for assigning priority levels. The statistical information (n-grams) can be thought of as a generalized approximation of the grammar learned from past successful experiences. The control mechanism gives a great speedup and reduction in memory usage. Experiments on real newspaper articles are shown, and our experience with this parser in a machine translation system is described.
Simard, Michel; Plamondon, Pierre
12 Citations
Sentence alignment is the problem of making explicit the relations that exist between the sentences of two texts that are known to be mutual translations. Automatic sentence-alignment methods typically face two kinds of difficulties. First, there is the question of robustness. In real life, discrepancies between a source text and its translation are quite common: differences in layout, omissions, inversions, etc. Sentence-alignment programs must be ready to deal with such phenomena. Then, there is the question of accuracy. Even when translations are “clean”, alignment is still not a trivial matter: some decisions are hard to make, even for humans. We report here on the current state of our ongoing efforts to produce a sentence-alignment program that is both robust and accurate. The method that we propose relies on two new alignment engines: one that produces highly reliable and robust character-level alignments, and one that relies on statistical lexical knowledge to produce accurate mappings. Experimental results are presented which demonstrate the method's effectiveness, and highlight where problems remain to be solved.
Macklovitch, Elliott; Hannan, Marie-Louise
5 Citations
We present a quantitative evaluation of one well-known word-alignment algorithm, as well as an analysis of frequent errors in terms of this model's underlying assumptions. Despite error rates that range from 22% to 32%, we argue that this technology can be put to good use in certain automated aids for human translators. We support our contention by pointing to several successful applications and outline ways in which text alignments below the sentence level would allow us to improve the performance of other translation support tools.
Gonzalo, Julio; Verdejo, Felisa; Peters, Carol; Calzolari, Nicoletta
14 Citations
We discuss ways in which EuroWordNet (EWN) can be used in multilingual information retrieval activities, focusing on two approaches to Cross-Language Text Retrieval that use the EWN database as a large-scale multilingual semantic resource. The first approach indexes documents and queries in terms of the EuroWordNet Inter-Lingual-Index, thus turning term weighting and query/document matching into language-independent tasks. The second describes how the information in the EWN database could be integrated with a corpus-based technique, thus allowing retrieval of domain-specific terms that may not be present in our multilingual database. Our objective is to show the potential of EuroWordNet as a promising alternative to existing approaches to Cross-Language Text Retrieval.
Fellbaum, Christiane
33 Citations
We give a brief outline of the design and contents of the English lexical database WordNet, which serves as a model for similarly conceived wordnets in several European languages. WordNet is a semantic network, in which the meanings of nouns, verbs, adjectives, and adverbs are represented in terms of their links to other (groups of) words via conceptual-semantic and lexical relations. Each part of speech is treated differently, reflecting different semantic properties. We briefly discuss polysemy in WordNet, and focus on the case of meaning extensions in the verb lexicon. Finally, we outline the potential uses of WordNet not only for applications in natural language processing, but also for research in stylistic analyses in conjunction with a semantic concordance.
Veale, Tony; Conway, Alan; Collins, Bróna
20 Citations
The sign languages used by deaf communities around the world represent a linguistic challenge that natural-language researchers in AI have only recently begun to take up. This challenge is particularly relevant to research in Machine Translation (MT), as natural sign languages have evolved in deaf communities into efficient modes of gestural communication, which differ from English not only in modality but in grammatical structure, exploiting a higher dimensionality of spatial expression. In this paper we describe Zardoz, an ongoing AI research system that tackles the cross-modal MT problem, translating English text into fluid sign language. The paper presents an architectural overview of Zardoz, describing its central blackboard organization, the nature of its interlingual representation, and the major components which interact through this blackboard both to analyze the verbal input and generate the corresponding gestural output in one of a number of sign variants.
Peters, Wim; Vossen, Piek; Díez-Orzas, Pedro; Andriaens, Geert
7 Citations
This paper discusses the design of the EuroWordNet database, in which semantic databases like WordNet 1.5 for several languages are combined via a so-called inter-lingual-index. In this database, language-independent data is shared whilst language-specific properties are maintained. A special interface has been developed to compare the semantic configurations across languages and to track down differences.
Gomolińska, Anna
4 Citations
The logic of acceptance and rejection (AEL2) is a nonmonotonic formalism to represent states of knowledge of an introspective agent making decisions about available information. Though having much in common, AEL2 differs from Moore's autoepistemic logic (AEL) by the fact that the agent not only can accept or reject a given fact, but he/she also has the possibility not to make any decision in case he/she does not have enough knowledge.
Weaver, George; Lippel, David
Clark and Krauss [1977] presents a classification of complete, satisfiable and ℵ₀-categorical theories in first-order languages with finite nonlogical vocabularies. In 1988 the first author modified this classification and raised three questions about the distribution of finitely axiomatizable theories. This paper answers two of those questions.
Vossen, Piek
35 Citations
This paper gives a global introduction to the aims and objectives of the EuroWordNet project, and it provides a general framework for the other papers in this volume. EuroWordNet is an EC project that develops a multilingual database with wordnets in several European languages, structured along the same lines as the Princeton WordNet. Each wordnet represents an autonomous structure of language-specific lexicalizations, which are interconnected via an Inter-Lingual-Index. The wordnets are built at different sites from existing resources, starting from a shared level of basic concepts and extended top-down. The results will be publicly available and will be tested in cross-language information retrieval applications.
Alonge, Antonietta; Calzolari, Nicoletta; Vossen, Piek; Bloksma, Laura; Castellon, Irene; Marti, Maria Antonia; Peters, Wim
6 Citations
In this paper the linguistic design of the database under construction within the EuroWordNet project is described. This is mainly structured along the same lines as the Princeton WordNet, although some changes have been made to the overall WordNet design for both theoretical and practical reasons. The most important reasons for such changes are the multilinguality of the EuroWordNet database and the fact that it is intended to be used in Language Engineering applications. Thus, i) some relations have been added to those identified in WordNet; ii) some labels have been identified which can be added to the relations in order to make their implications more explicit and precise; iii) some relations, already present in the WordNet design, have been modified in order to specify their role more clearly.
Rodríguez, Horacio; Climent, Salvador; Vossen, Piek; Bloksma, Laura; Peters, Wim; Alonge, Antonietta; Bertagna, Francesca; Roventini, Adriana
13 Citations
This paper describes two fundamental aspects of the process of building the EuroWordNet database. In EuroWordNet we have opted for a flexible design in which local wordnets are built relatively independently as language-specific structures, which are linked to an Inter-Lingual-Index (ILI). To ensure compatibility between the wordnets, a core set of common concepts has been defined that has to be covered by every language. Furthermore, these concepts have been classified via the ILI in terms of a Top Ontology of 63 fundamental semantic distinctions used in various semantic theories and paradigms. This paper first discusses the process leading to the definition of the set of Base Concepts, and the structure and the rationale of the Top Ontology.
Kanovei, Vladimir; Reeken, Michael
1 Citations
In continuation of our study of HST, Hrbaček set theory (a nonstandard set theory which includes, in particular, the ZFC Replacement and Separation schemata in the st-∈-language, and Saturation for well-orderable families of internal sets), we consider the problem of existence of elementary extensions of inner "external" subclasses of the HST universe.
We show that, given a standard cardinal κ, any set R ⊆ *κ generates an "internal" class S(R) of all sets standard relatively to elements of R, and an "external" class L[S(R)] of all sets constructible (in a sense close to the Gödel constructibility) from sets in S(R). We prove that under some mild saturation-like requirements for R the class L[S(R)] models a certain κ-version of HST including the principle of κ⁺-saturation; moreover, in this case L[S(R′)] is an elementary extension of L[S(R)] in the st-∈-language whenever sets R ⊆ R′ satisfy the requirements.
Tsuji, Marcelo
10 Citations
Suszko's Thesis maintains that many-valued logics do not exist at all. In order to support it, R. Suszko offered a method for providing any structural abstract logic with a complete set of bivaluations. G. Malinowski challenged Suszko's Thesis by constructing a new class of logics (called q-logics by him) for which Suszko's method fails. He argued that the key for logical two-valuedness was the "bivalent" partition of the Lindenbaum bundle associated with all structural abstract logics, while his q-logics were generated by "trivalent" matrices. This paper will show that contrary to these intuitions, logical two-valuedness has more to do with the geometrical properties of the deduction relation of a logical structure than with the algebraic properties embedded on it.
Van Benthem, Johan
10 Citations
It has been known since the seventies that the formulas of modal logic are invariant for bisimulations between possible worlds models, while conversely, all bisimulation-invariant first-order formulas are modally definable. In this paper, we extend this semantic style of analysis from modal formulas to dynamic program operations. We show that the usual regular operations are safe for bisimulation, in the sense that the transition relations of their values respect any given bisimulation for their arguments. Our main result is a complete syntactic characterization of all first-order definable program operations that are safe for bisimulation. This is a semantic functional completeness result for programming, which may be contrasted with the more usual analysis in terms of computational power. The 'Safety Theorem' can be modulated in several ways. We conclude with a list of variants, extensions, and further developments.
Helmreich, Stephen; Farwell, David
5 Citations
This paper examines differences between two professional translations into English of the same Spanish newspaper article. Among other explanations for these differences, such as outright errors and free variation, we find a significant number of differences are due to differing beliefs on the part of the translators about the subject matter and about what the author wished to say. Furthermore, these differences are consistent with divergent global views of the translators about the likelihood of future events (earthquakes and tidal waves) and about (rational or irrational) reactions of people to such likelihood. We discuss the requirements for a pragmatics-based model of translation that would account for these differences.
Lombardo, Vincenzo
This paper introduces a computational model of recovery in sentence processing. The model consists of a general computational framework which defines a space of possible models and a set of heuristics which constrain the framework to a specific model. The basic idea is to diagnose the error source that has caused the failure and to repair the structure in order to solve the inconsistencies. The diagnosis of the error is accomplished through a heuristic search procedure (which makes selected accesses to the syntactic structure and returns the last safe position) operating together with a constraint on possible structural repairs. The repair component reprocesses input items only when it is not possible to reuse the structures built during the first-pass analysis. This chapter describes the architecture of the general framework and the component modules, and sketches some heuristics that explain well-known cases of reanalysis in the literature.
Carson-Berndsen, Julie
Descriptive linguistic theories have been greatly influenced by computational linguistics. In computational linguistics, a fundamental distinction is made between the declarative and procedural aspect of a computational model of linguistic description. The primary concern of computational linguistics is with the declarative aspect, with formal, computationally interpretable representations of linguistic descriptions. The procedural aspect, the computational processing of the model in terms of an algorithm, is considered to be a separate issue. The influence of computational linguistics has been most obvious in the area of syntax and semantics where the search for a computationally interpretable grammar formalism has led to the introduction of unification-based grammars which in turn have led to new syntactic and semantic theories. This approach is primarily declarative in that structures are characterised by partial information and mutually independent constraints on well-formedness and is procedurally neutral in that no reference is made as to how the constraints should be applied. The trend in computational linguistics is therefore to design a declarative, processor-independent, monotonic grammar formalism which makes no commitment to a particular procedural interpretation with respect to analysis or generation. A declarative linguistic description allows for many different processing models and is not committed from the outset to any particular one.
Quesada, José F.
Unification has become a major paradigm in Mathematical and Computational Linguistics. The research done in this area may be classified in four main streams: feature structures as an adequate model for the description of linguistic phenomena, typed unification, representation of feature structures, and unification algorithms. This work proposes a new approach to unification-based Mathematical and Computational Linguistics: the Lexical Object Theory. The main design criteria are based on linguistic motivation, computational efficiency and formal soundness. The first part of the work outlines the main characteristics of the Lexical Object Theory, its comprehensive orientation, and its layered structure based on the separation of the following levels: specification, transformation, typification, representation and unification. The second part concentrates on the specification level of the Lexical Object Theory. The linguistic motivation of this model is presented, as well as a detailed description of the specification formalism, the computational model it is based on, and finally, the inference rules on lexical objects at the specification level.
Culy, Christopher
1 Citations
Sampson (1987, 1992, and 1995) argues that there is no grammatical/ungrammatical distinction, based on a study (Sampson, 1987) of the distribution of noun phrases in the Lancaster-Oslo/Bergen (LOB) corpus of British English (Garside et al., 1987). As many phrases occur rarely, it is impossible to make a principled distinction between grammatical and ungrammatical phrases, Sampson claims. This paper examines Sampson's evidence against the grammatical/ungrammatical distinction. It will first be argued that another putative counterargument to Sampson's claim (Taylor et al., 1989) is incorrect. It will then be shown that Sampson's evidence does not at all bear on the issue of the grammatical/ungrammatical distinction.
Fodor, Janet Dean; Inoue, Atsu
33 Citations
The diagnostic model of garden path recovery that we have advocated in previous work holds that no repair processes are intrinsically costly. Repair costs depend entirely on the difficulty of establishing what revisions to make. The diagnosis process does not require a special-purpose inference system as long as the parser abides by the Attach Anyway principle: when it encounters an input word that doesn’t fit into the current structure, it attaches it in the least unacceptable way. The attachment creates a conflict internal to the phrase marker, which is then resolved in consultation with the grammar by a process we call Adjust. In this chapter we propose a principled constraint on the operations of Adjust: the Grammatical Dependency Principle. We show that this clarifies some previously noted phenomena such as the Thematic Overlay Effect and the differential difficulty of different types of steal operations. The examples we present show that neither raising repairs nor semantic revisions are difficult per se.
Sturt, Patrick; Crocker, Matthew W.
9 Citations
A common assumption in psycholinguistic theory is that reanalysis is constrained by a preference to preserve certain aspects of the representation built in response to previous input. In this chapter, we discuss this notion of representation-preservation in the wider context of models of reanalysis as a whole, and point out that in order to define a representation-preserving constraint on reanalysis, we must specify not only which aspects of representation should be preserved, but what is meant by the notion of preservation. We propose that the appropriate notion of preservation is that which is assumed in monotonic models of parsing, where structural relations between linguistic elements are updated totally nondestructively from state to state. Previous monotonic theories of parsing have limited themselves to consideration of phrase structure representations. In contrast, we propose a general framework within which one may formulate models which apply the same notion of preservation to other representation types. The framework is discussed with reference to a model which preserves thematic structure.
Johannessen, Janne Bondi
It is important to use a corpus to investigate grammatical phenomena empirically before writing grammatical rules or constraints for a disambiguating tagger. The paper shows how even case distinctions on pronouns are used more diversely than is usually assumed. In both English and Norwegian, nominative pronouns are used in more positions than the expected Subject one. Although the other uses are statistically less frequent, they may be important to the users of the resulting tagged corpus, who are often theoretical linguists. A tagger should therefore also correctly tag the less frequent constructions. The paper shows how this can be done in a Constraint Grammar type tagger.
Lewis, Richard L.
18 Citations
This chapter develops a theory of reanalysis called limited repair parsing. Repair parsers deal with the problem of local ambiguity in part by modifying previously built structure when the chosen structure later proves to be inconsistent. This modification of existing structure distinguishes repair parsing from parallel or multipath parsing, least-commitment parsing, backtracking, or reparsing strategies. Parsers with a limited capability for repair are psycholinguistically important because they can potentially explain the contrasts between difficult garden path structures (when repair fails) and unproblematic local ambiguities (when repair is successful or easy). Although the idea of repair has been implicit in some psycholinguistic work (and emerged explicitly in the diagnosis model of Fodor & Inoue, 1994, and the NL-Soar model of Lewis, 1993), there has been no clear formulation of the general class of repair parsers. This chapter makes a first step toward such a formulation, shows how repair parsing offers significant computational advantages over other alternatives for reanalysis, and proposes a particular repair mechanism, snip, that explains a wide range of cross-linguistic reanalysis phenomena. Snip is a proposal for a simple, automatic, online repair process. The chapter concludes by briefly describing how snip can be embedded in a more comprehensive sentence processing architecture that maintains the structural sensitivity of purely syntactic theories like Pritchett’s (1992), yet still accounts for the flexibility of parsing as revealed by interactive studies.
