Melissourgou, Maria N.; Frantzi, Katerina T.
English language proficiency exams are often associated with high-stakes decisions. Guidance concerning the writing tasks, however, is often implicit. The emphasis is usually placed upon grammatical and lexical features rather than the pragmatic aspects that differentiate the tasks. Aiming to boost genre awareness as part of L2 pragmatic competence in this context, the present paper provides a description of individual exam genres and their relations to each other. Such knowledge is expected to assist both teaching and material writing. Using a pedagogical genre-based corpus with model answers from teaching material, we contrast eight genres with each other based on a set of sixteen features. Each feature is associated with a specific text property. Findings reveal some unexpected relations between pairs of genres and offer insight as to the points of convergence and divergence. It is shown that assumptions made about the similarity of texts which belong to the same text type group can sometimes be mistaken. Therefore, it is argued that the tendency to use general labels for text categories in teaching material may mislead novice writers.
Ndzotom Mbakop, Antoine Willy
One of the words that all Christians agree upon is ‘Amen’. Although many do not know its exact meaning, as informal observations have shown, there is tacit consensus over one of its functions, namely Gospel truth marker. Over the years and following the various developments in the Christian faith, the word has acquired new functions depending on the denomination. This paper sets out to investigate the pragmatic functions of ‘Amen’ as used in two different religious trends, namely mainstream Protestant Churches and Pentecostal Churches. The data were collected in one prototypical parish of each trend through participant observation, tape-recording, and field notes. Basically, one church service from each trend was randomly selected and transcribed. The analysis of the data revealed that each trend seems to assign different functions to ‘Amen’. In fact, while mainstream Protestant Churches have kept its traditional conclusion and Gospel truth marker illocutionary force, New-Born Churches add to these the phatic communion and power marker among others.
The stakes are both communicative and political since the quality of the service, in terms of faithful involvement, may influence the future of both religious trends in the country. This communicative influence spills over into the political one as these new functions are slowly but steadily making their way into some traditional churches in the country.
Korat, Omer
This study investigates the difference between more than n and at least n + 1. It is observed that these two quantifiers can generate different implicatures in intensional contexts due to their exhaustivity properties. Building on theoretical notions of argumentation in discourse, it is proposed that more than n, but not at least n + 1, is associated with positive argumentative orientation (an attempt to convince the addressee of something). In a large-scale corpus investigation, more than n is shown to be associated with much larger numerical values than at least n (as well as other comparative quantifiers). Based on a qualitative investigation of the data, I propose that more than n is used more commonly to convey subjective positions that are being made more convincing by larger values of n, while at least n + 1 is used more commonly to convey objectively informative content. The quantifiers’ behavior in intensional contexts is explained as a combined effect of their respective argumentative orientation and the exhaustivity implicatures they generate.
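As a small illustration of why the contrast in the abstract above has to be pragmatic rather than truth-conditional, the following sketch checks that, over whole-number counts, "more than n" and "at least n + 1" hold in exactly the same situations. The function names are hypothetical and are not taken from the paper.

```python
def more_than(n: int, count: int) -> bool:
    """Literal truth condition of 'more than n' for a whole-number count."""
    return count > n


def at_least(n: int, count: int) -> bool:
    """Literal truth condition of 'at least n' for a whole-number count."""
    return count >= n


def equivalent_on_integers(limit: int = 50) -> bool:
    """Check 'more than n' against 'at least n + 1' on a finite grid of integers."""
    return all(
        more_than(n, count) == at_least(n + 1, count)
        for n in range(limit + 1)
        for count in range(limit + 5)
    )
```

Since the two quantifiers agree on every integer count, any difference in use, such as the argumentative orientation proposed in the abstract, cannot come from these literal conditions alone.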
Jucker, Andreas H.
This paper explores two different methods of tracing a specific speech act in a historical corpus. As an example, the development of apologies is investigated in the two hundred years covered by the Corpus of Historical American English (COHA, 1810–2009). One method retrieves apologies through their typical illocutionary force indicating devices (IFIDs), such as sorry, excuse, apologise and pardon, while the other retrieves passages in which apologies are explicitly mentioned (metapragmatic expression analysis). Both methods require a considerable amount of manual analysis of retrieved hits, which has to be verified through elaborate inter-rater reliability testing. The searches are restricted to fictional texts because they show a greater frequency of apologies than the alternative genres available in COHA, and they often allow the identification of behaviour as apologetic because it is discursively described as such by the fictional characters or the narratorial voice. The results show that the frequency of apologies increased considerably throughout the period covered by COHA. In the earliest period the IFID sorry was no more frequent than pardon and forgive. In the most recent period its frequency has multiplied almost sixfold and is more than three times larger than all the others taken together. The metapragmatic expression analysis allows an analysis of the development of strategies used to perform apologies. IFIDs have become more important while Taking on Responsibility and Explanation receded somewhat in their frequencies. On the basis of these results it is speculated that the force of apologies has decreased. What used to be sincere requests for exoneration has in many cases turned to token displays of regret.
Ma, Minghui; Lin, Yuanlei
Every Berman’s variety $\mathbb{K}_p^q$, which is the subvariety of Ockham algebras defined by the equation ${\sim^{2p+q}}a = {\sim^q}a$ ($p\ge 1$ and $q\ge 0$), determines a finitary substitution invariant consequence relation $\vdash_p^q$. A sequent system $\mathsf{S}_p^q$ is introduced as an axiomatization of the consequence relation $\vdash_p^q$. The system $\mathsf{S}_p^q$ is characterized by a single finite frame $\mathfrak{F}_p^q$ under the frame semantics given for the formal language. By the duality between frames and algebras, $\mathsf{S}_p^q$ can be viewed as a $4^{2p+q}$-valued logic as it is characterized by a distributive lattice of $4^{2p+q}$ elements with a unary operator. Moreover, a structural-rule-free, cut-free and terminating sequent system $\mathsf{G}_p^q$ is established for $\vdash_p^q$. The Craig interpolation property of $\vdash_p^q$ is shown proof-theoretically utilizing $\mathsf{G}_p^q$.
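The defining equation of the Berman varieties in the abstract above can be tested mechanically on a small finite algebra. The sketch below is only an illustration under stated assumptions: it looks at the unary operation alone (the lattice structure is not checked), and the two example tables, the four-element De Morgan algebra and the three-element Stone algebra, as well as all names, are choices made here rather than material from the paper.

```python
def iterate(neg: dict, a, k: int):
    """Apply the unary operation `neg` to element `a` exactly k times."""
    for _ in range(k):
        a = neg[a]
    return a


def satisfies_berman_equation(neg: dict, p: int, q: int) -> bool:
    """Check ~^(2p+q) a == ~^q a for every element a of the carrier."""
    if p < 1 or q < 0:
        raise ValueError("the equation is stated for p >= 1 and q >= 0")
    return all(iterate(neg, a, 2 * p + q) == iterate(neg, a, q) for a in neg)


# Assumed example 1: four-element De Morgan algebra (negation is an involution).
DE_MORGAN_4 = {"f": "t", "t": "f", "b": "b", "n": "n"}

# Assumed example 2: three-element chain 0 < e < 1 with pseudocomplementation.
STONE_3 = {"0": "1", "e": "0", "1": "0"}
```

On these tables, `satisfies_berman_equation(DE_MORGAN_4, 1, 0)` holds because double negation is the identity, while `STONE_3` fails the equation for `p = 1, q = 0` and satisfies it for `p = 1, q = 1`.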
Estarrona, Ainara; Aldezabal, Izaskun; Díaz de Ilarraza, Arantza
This article describes the method used to build the Basque Verb Index (BVI), a corpus-based lexicon. The BVI is the result of semi-automatic annotation of the EPEC corpus with verb predicate information, following the PropBank-VerbNet model. The method presented is the product of a deep study of the syntactic–semantic behaviour of verbs in EPEC-RolSem (the EPEC corpus tagged with verb predicate information). During the process of annotating EPEC-RolSem, we have identified and stored in the BVI lexicon the different role-patterns associated with all verbs appearing in the corpus. In addition, each entry in the BVI is linked to the corresponding verb entry in well-known resources such as PropBank, VerbNet, WordNet and FrameNet. We have also implemented a tool called e-ROLda to facilitate the process of looking up verb patterns in the BVI and examples in EPEC-RolSem as a basis for future studies.
French, Rohan; Ripley, David
This paper considers some issues to do with valuational presentations of consequence relations, and the Galois connections between spaces of valuations and spaces of consequence relations. Some of what we present is known, and some even well-known; but much is new. The aim is a systematic overview of a range of results applicable to non-reflexive and non-transitive logics, as well as more familiar logics. We conclude by considering some connectives suggested by this approach.
Khan, Wahab; Daud, Ali; Nasir, Jamal Abdul; Amjad, Tehmina; Arafat, Sachi; Aljohani, Naif; Alotaibi, Fahd S.
Part of speech (POS) tagging, the assignment of syntactic categories for words in running text, is significant to natural language processing as a preliminary task in applications such as speech processing, information extraction, and others. Urdu language processing presents a challenge due to the dual behaviour of various Urdu POS tags in differing situations (morphosyntactic ambiguity). This paper addresses this challenge by developing a novel tagging approach using linear-chain conditional random fields (CRF). Our work is the first instance of a CRF approach for Urdu POS tagging. The proposed model employs a strong, stable and balanced language-independent as well as language-dependent feature set. The language-dependent features considered include the part-of-speech tag of the previous word and the suffix of the current word, while the language-independent features include the ‘context words window’. Our approach was evaluated against support vector machine techniques for Urdu POS tagging (considered state of the art) on two benchmark datasets. The results show that our CRF approach improves upon the F-measure of prior attempts by 8.3–8.5%.
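To make the feature set named in the abstract above concrete, here is a minimal sketch of a per-token feature function of the kind typically fed to a linear-chain CRF toolkit. Everything specific is an assumption made for illustration: the function name, the feature keys, the window size, the suffix length, and the romanized placeholder tokens are not taken from the paper.

```python
def token_features(words, i, prev_tag, window=1, suffix_len=2):
    """Build a feature dictionary for the token at position i.

    prev_tag   : POS tag of the previous word (language-dependent feature)
    suffix     : last characters of the current word (language-dependent feature)
    word[+/-k] : neighbouring words inside the context window
                 (language-independent feature)
    """
    word = words[i]
    features = {
        "prev_tag": prev_tag,
        "suffix": word[-suffix_len:],
    }
    for offset in range(-window, window + 1):
        if offset == 0:
            continue  # only neighbours go into the context window here
        j = i + offset
        # Positions outside the sentence are filled with a padding symbol.
        features[f"word[{offset:+d}]"] = words[j] if 0 <= j < len(words) else "<PAD>"
    return features


# Hypothetical romanized sentence used only to exercise the function.
example = token_features(["yeh", "kitab", "hai"], 1, "PRON")
```

One dictionary per token, collected per sentence, is the usual input shape for CRF libraries; which library and which exact features the authors used is not stated in the abstract.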
Tsakalidis, Adam; Papadopoulos, Symeon; Voskaki, Rania; Ioannidou, Kyriaki; Boididou, Christina; Cristea, Alexandra I.; Liakata, Maria; Kompatsiaris, Yiannis
Sentiment lexicons and word embeddings constitute well-established sources of information for sentiment analysis in online social media. Although their effectiveness has been demonstrated in state-of-the-art sentiment analysis and related tasks in the English language, such publicly available resources are much less developed and evaluated for the Greek language. In this paper, we tackle the problems arising when analyzing text in such an under-resourced language. We present and make publicly available a rich set of such resources, ranging from a manually annotated lexicon, to semi-supervised word embedding vectors and annotated datasets for different tasks. Our experiments using different algorithms and parameters on our resources show promising results over standard baselines; on average, we achieve a 24.9% relative improvement in F-score on the cross-domain sentiment analysis task when training the same algorithms with our resources, compared to training them on more traditional feature sources, such as n-grams. Importantly, while our resources were built with the primary focus on the cross-domain sentiment analysis task, they also show promising results in related tasks, such as emotion analysis and sarcasm detection.
Ive, Julia; Max, Aurélien; Yvon, François
Traditionally, human–machine interaction to reach an improved machine translation (MT) output takes place ex-post and consists of correcting this output. In this work, we investigate other modes of intervention in the MT process. We propose a Pre-Edition protocol that involves: (a) the detection of MT translation difficulties; (b) the resolution of those difficulties by a human translator, who provides their translations (pre-translation); and (c) the integration of the obtained information prior to the automatic translation. This approach can meet individual interaction preferences of certain translators and can be particularly useful for production environments, where more control over output quality is needed. Early resolution of translation difficulties can prevent downstream errors, thus improving the final translation quality “for free”. We show that translation difficulty can be reliably predicted for English for various source units. We demonstrate that the pre-translation information can be successfully exploited by an MT system and that the indirect effects are genuine, accounting for around 16% of the total improvement. We also provide a study of the human effort involved in the resolution process.
Karimova, Sariya; Simianer, Patrick; Riezler, Stefan
The advantages of neural machine translation (NMT) have been extensively validated for offline translation of several language pairs for different domains of spoken and written language. However, research on interactive learning of NMT by adaptation to human post-edits has so far been confined to simulation experiments. We present the first user study on online adaptation of NMT to user post-edits in the domain of patent translation. Our study involves 29 human subjects (translation students) whose post-editing effort and translation quality were measured on about 4500 interactions of a human post-editor and an NMT system integrating an online adaptive learning algorithm. Our experimental results show a significant reduction in human post-editing effort due to online adaptation in NMT according to several evaluation metrics, including hTER, hBLEU, and KSMR. Furthermore, we found significant improvements in BLEU/TER between NMT outputs and professional translations in granted patents, providing further evidence for the advantages of online adaptive NMT in an interactive setup.
Mensa, Enrico; Radicioni, Daniele P.; Lieto, Antonio
Lexical resources are fundamental to tackling many tasks that are central to present and prospective research in Text Mining, Information Retrieval, and related areas of Natural Language Processing. In this article we introduce COVER, a novel lexical resource, along with COVERAGE, the algorithm devised to build it. In order to describe concepts, COVER proposes a compact vectorial representation that combines the lexicographic precision characterizing BabelNet and the rich common-sense knowledge featuring ConceptNet. We propose COVER as a reliable and mature resource that has been employed in tasks as diverse as conceptual categorization, keyword extraction, and conceptual similarity. The experimental assessment is performed on the last task: we report and discuss the obtained results, pointing out future improvements. We conclude that COVER can be directly exploited to build applications, and coupled with existing resources as well.
Le, Ngoc-Tien; Lecouteux, Benjamin; Besacier, Laurent
This paper addresses the automatic quality estimation of spoken language translation (SLT). This relatively new task is defined and formalized as a sequence-labeling problem where each word in the SLT hypothesis is tagged as good or bad according to a large feature set. We propose several word confidence estimators (WCE) based on our automatic evaluation of transcription (ASR) quality, translation (MT) quality, or both (combined ASR + MT). This research work is possible because we built a specific corpus, which contains 6.7k utterances comprising the quintuplet: ASR output, verbatim transcript, text translation, speech translation, and post-edition of the translation. The conclusion of our multiple experiments using joint ASR and MT features for WCE is that MT features remain the most influential while ASR features can bring interesting complementary information. In addition, the last part of the paper proposes to disentangle ASR errors and MT errors where each word in the SLT hypothesis is tagged as good, $asr\_error$ or $mt\_error$. Robust quality estimators for SLT can be used for rescoring speech translation graphs or for providing feedback to the user in interactive speech translation or computer-assisted speech-to-text scenarios.
Kano, Takatomo; Takamichi, Shinnosuke; Sakti, Sakriani; Neubig, Graham; Toda, Tomoki; Nakamura, Satoshi
Speech translation is a technology that helps people communicate across different languages. The most commonly used speech translation model is composed of automatic speech recognition, machine translation and text-to-speech synthesis components, which share information only at the text level. However, spoken communication is different from written communication in that it uses rich acoustic cues such as prosody in order to transmit more information through nonverbal channels. This paper is concerned with speech-to-speech translation that is sensitive to this paralinguistic information. Our long-term goal is to make a system that allows users to speak a foreign language with the same expressiveness as if they were speaking in their own language. Our method works by reconstructing input acoustic features in the target language. From the many different possible paralinguistic features to handle, in this paper we choose duration and power as a first step, proposing a method that can translate these features from input speech to the output speech in continuous space. This is done in a simple and language-independent fashion by training an end-to-end model that maps source-language duration and power information into the target language. Two approaches are investigated: linear regression and neural network models. We evaluate the proposed methods and show that paralinguistic information in the input speech of the source language can be reflected in the output speech of the target language.
Pritsos, Dimitrios; Stamatatos, Efstathios
Web genre detection is a task that can enhance information retrieval systems by providing rich descriptions of documents and enabling more specialized queries. Most previous studies in this field adopt the closed-set scenario where a given palette comprises all available genre labels. However, this is not a realistic setup since web genres are constantly enriched with new labels and existing web genres are evolving in time. Open-set classification, where some pages used in the evaluation phase do not belong to any of the known genres, is a more realistic setup for this task. In this case, all pages not belonging to known genres can be seen as noise. This paper focuses on systematic evaluation of open-set web genre identification when the noise is either structured or unstructured. Two open-set methods combined with alternative text representation schemes and similarity measures are tested based on two benchmark corpora. Moreover, we adopt the openness test for web genre identification that enables the observation of effectiveness for a varying number of known/unknown labels.
Hadifar, Amir; Momtazi, Saeedeh
Word embedding has been a great success story for natural language processing in recent years. The main purpose of this approach is providing a vector representation of words based on neural network language modeling. Using a large training corpus, the model, namely the Skip-gram model, mostly learns from co-occurrences of words and captures semantic features of words. Moreover, adding the recently introduced character embedding model to the objective function, the model can also focus on morphological features of words. In this paper, we study the impact of training corpus on the results of word embedding and show how the genre of training data affects the type of information captured by word embedding models. We perform our experiments on the Persian language. In line with our experiments, providing two well-known evaluation datasets for Persian, namely Google semantic/syntactic analogy and Wordsim353, is also part of the contribution of this paper. The experiments include computation of word embedding from various public Persian corpora with different genres and sizes while considering comprehensive lexical and semantic comparison between them. We find that words whose usage differs between these datasets receive very different vector representations, which has a significant impact across domains: the results vary by up to 9% on Google analogy and up to 6% on Wordsim353. The resulting word embeddings for each of the individual corpora, as well as their combinations, will be publicly available for any further research based on word embedding for Persian.
Balyan, Renu; Chatterjee, Niladri
Design and implementation of automatic evaluation methods is an integral part of any scientific research in accelerating the development cycle of the output. This is no less true for automatic machine translation (MT) systems. However, no such global and systematic scheme exists for evaluation of performance of an MT system. The existing evaluation metrics, such as BLEU, METEOR and TER, although used extensively in the literature, have faced a lot of criticism from users. Moreover, performance of these metrics often varies with the pair of languages under consideration. The above observation is no less pertinent with respect to translations involving languages of the Indian subcontinent. This study aims at developing an evaluation metric for English to Hindi MT outputs. As a part of this process, a set of probable errors has been identified manually as well as automatically. Linear regression has been used for computing weight/penalty for each error, while taking human evaluations into consideration. A sentence score is computed as the weighted sum of the errors. A set of 126 models has been built using different single classifiers and ensembles of classifiers in order to find the most suitable model for allocating appropriate weight/penalty for each error. The outputs of the models have been compared with the state-of-the-art evaluation metrics. The models developed for manually identified errors correlate well with manual evaluation scores, whereas the models for the automatically identified errors have low correlation with the manual scores. This indicates the need for further improvement and development of sophisticated linguistic tools for automatic identification and extraction of errors. Although many automatic machine translation tools are being developed for many different language pairs, there is no such generalized scheme that would lead to designing meaningful metrics for their evaluation.
The proposed scheme should help in developing such metrics for different language pairs in the coming days.
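The sentence score described in the abstract above, a weighted sum of errors, can be written down in a few lines. The sketch below is an assumption-laden illustration: the error type names and the weight values are invented placeholders, and in the paper the weights come from regression against human evaluations, which is not reproduced here.

```python
def sentence_score(error_counts: dict, weights: dict) -> float:
    """Weighted sum of error counts: sum over error types of weight * count."""
    return sum(weights[error] * count for error, count in error_counts.items())


# Hypothetical error types and penalties, chosen only for illustration.
example_weights = {"word_order": 0.5, "missing_word": 1.0}
example_errors = {"word_order": 2, "missing_word": 1}

example_score = sentence_score(example_errors, example_weights)
```

With these placeholder values the score is 0.5 * 2 + 1.0 * 1 = 2.0; whether a higher score means a better or a worse translation depends on the sign convention chosen for the weights.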
Di Nola, Antonio; Lenzi, Giacomo
We start from Marra–Spada duality between semisimple MV-algebras and Tychonoff spaces, and we consider the particular cases when the $\omega$-skeleta of the MV-algebras are restricted in some way. In particular we consider antiskeletal MV-algebras, that is, the ones whose $\omega$-skeleton is trivial.
Mann, Allen L.; Aarnio, Ville
Hintikka and Sandu’s independence-friendly (IF) logic is a conservative extension of first-order logic that allows one to consider semantic games with imperfect information. In the present article, we first show how several variants of the Monty Hall problem can be modeled as semantic games for IF sentences. In the process, we extend IF logic to include semantic games with chance moves and dub this extension stochastic IF logic. Finally, we use stochastic IF logic to analyze the Sleeping Beauty problem, leading to the conclusion that the thirders are correct while identifying the main error in the halfers’ argument.
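For readers who want the numbers behind the Monty Hall problem mentioned in the abstract above, the following sketch enumerates the classic three-door variant exactly. It is not an encoding of stochastic IF logic; it only treats the chance moves as uniform probabilities, under the assumptions that the car and the first pick are uniformly random and that the host always opens an unpicked goat door, choosing uniformly when two are available.

```python
from fractions import Fraction
from itertools import product


def win_probability(switch: bool) -> Fraction:
    """Exact probability of winning the car when always switching or always staying."""
    doors = range(3)
    total = Fraction(0)
    for car, pick in product(doors, doors):
        p_start = Fraction(1, 9)  # car position and first pick are both uniform
        # The host may open any door that is neither the pick nor the car.
        options = [d for d in doors if d != pick and d != car]
        for opened in options:
            p = p_start / len(options)
            if switch:
                # Move to the single door that is neither picked nor opened.
                final = next(d for d in doors if d not in (pick, opened))
            else:
                final = pick
            if final == car:
                total += p
    return total
```

Under these assumptions, switching wins with probability 2/3 and staying with probability 1/3; other variants of the problem change the host's options and hence the enumeration.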
Omori, Hitoshi; Alama, Jesse
We outline the rather complicated history of attempts at axiomatizing Jaśkowski’s discussive logic $\mathbf{D_2}$ and show that some clarity can be had by paying close attention to the language we work with. We then examine the problem of axiomatizing $\mathbf{D_2}$ in languages involving discussive conjunctions. Specifically, we show that recent attempts by Ciuciura are mistaken. Finally, we present an axiomatization of $\mathbf{D_2}$ in the language Jaśkowski suggested in his second paper on discussive logic, by following a remark of da Costa and Dubikajtis. We also deal with an interesting variant of $\mathbf{D_2}$, introduced by Ciuciura, in which negation is also taken to be discussive.
Kurahashi, Taishi
We prove that for each recursively axiomatized consistent extension T of Peano Arithmetic and $n \ge 2$, there exists a $\Sigma_2$ numeration $\tau(u)$ of T such that the provability logic of the provability predicate $\mathsf{Pr}_\tau(x)$ naturally constructed from $\tau(u)$ is exactly $\mathsf{K} + \Box(\Box^n p \rightarrow p) \rightarrow \Box p$. This settles Sacchetti’s problem affirmatively.
Gruszczyński, Rafał; Pietruszczak, Andrzej
This is the first of two papers devoted to Andrzej Grzegorczyk’s point-free system of topology from Grzegorczyk (Synthese 12(2–3):228–235, 1960. https://doi.org/10.1007/BF00485101). His system was one of the very first fully fledged axiomatizations of topology based on the notions of region, parthood and separation (the dual notion of connection). Its peculiar and interesting feature is the definition of point, whose intention is to grasp our geometrical intuitions of points as systems of shrinking regions of space. In this part we analyze (quasi-)separation structures and Grzegorczyk structures, and establish their properties which will be useful in the sequel. We prove that in the class of Urysohn spaces with countable chain condition, to every topologically interpreted representative of a point in the sense of Grzegorczyk there corresponds exactly one point of a space. We also demonstrate that Tychonoff first-countable spaces give rise to complete Grzegorczyk structures. The results established below will be used in the second part devoted to points and topological spaces.
Pietruszczak, Andrzej; Jarmużek, Tomasz
By a pure modal logic of names (PMLN) we mean a quantifier-free formulation of such a logic which includes not only traditional categorical, but also modal categorical sentences with modalities de re and which is an extension of Propositional Logic. For categorical sentences we use two interpretations: a “natural” one; and Johnson and Thomason’s interpretation, which is suitable for some reconstructions of Aristotelian modal syllogistic (Johnson in Notre Dame J Form Logic 30(2):271–284, 1989; Thomason in J Philos Logic 22(2):111–128, 1993 and J Philos Logic 26:129–141, 1997). In both cases we use Johnson-like models (1989). We also analyze different versions of PMLN, for both general and singular names. We present complete tableau systems for the different versions of PMLN. These systems enable us to present some decidability methods. This yields “strong decidability” in the following sense: for every inference starting with a finite set of premises (resp. every syllogism, every formula) we can specify a finite number of steps to check whether it is logically valid. This method gives the upper bound of the cardinality of models needed for the examination of the validity of a given inference (resp. syllogism, formula).
Bezhanishvili, G.; Bezhanishvili, N.; Lucero-Bryan, J.; Mill, J.
It is a landmark theorem of McKinsey and Tarski that if we interpret modal diamond as closure (and hence modal box as interior), then $\mathsf{S4}$ is the logic of any dense-in-itself metrizable space. The McKinsey–Tarski Theorem relies heavily on a metric that gives rise to the topology. We give a new and more topological proof of the theorem, utilizing Bing’s Metrization Theorem.
Lávička, Tomáš; Noguera, Carles
This paper continues the investigation, started in Lávička and Noguera (Stud Log 105(3): 521–551, 2017), of infinitary propositional logics from the perspective of their algebraic completeness and filter extension properties in abstract algebraic logic. It follows from the Lindenbaum Lemma used in standard proofs of algebraic completeness that, in every finitary logic, (completely) intersection-prime theories form a basis of the closure system of all theories. In this article we consider the open problem of whether these properties can be transferred to lattices of filters over arbitrary algebras of the logic. We show that in general the answer is negative, obtaining a richer hierarchy of pairwise different classes of infinitary logics that we separate with natural examples. As byproducts we obtain a characterization of subdirect representation for arbitrary logics, develop a fruitful new notion of natural expansion, and contribute to the understanding of semilinear logics.
Kremer, Philip
The simplest bimodal combination of unimodal logics $\text{L}_1$ and $\text{L}_2$ is their fusion, $\text{L}_1 \otimes \text{L}_2$, axiomatized by the theorems of $\text{L}_1$ for $\square_1$ and of $\text{L}_2$ for $\square_2$, and the rules of modus ponens, necessitation for $\square_1$ and for $\square_2$, and substitution. Shehtman introduced the frame product $\text{L}_1 \times \text{L}_2$, as the logic of the products of certain Kripke frames: these logics are two-dimensional as well as bimodal. Van Benthem, Bezhanishvili, ten Cate and Sarenac transposed Shehtman’s idea to the topological semantics and introduced the topological product $\text{L}_1 \times_t \text{L}_2$, as the logic of the products of certain topological spaces. For almost all well-studied logics, we have $\text{L}_1 \otimes \text{L}_2 \subsetneq \text{L}_1 \times \text{L}_2$, for example, $\text{S4} \otimes \text{S4} \subsetneq \text{S4} \times \text{S4}$. Van Benthem et al. show, by contrast, that $\text{S4} \times_t \text{S4} = \text{S4} \otimes \text{S4}$. It is straightforward to define the product of a topological space and a frame: the result is a topologized frame, i.e., a set together with a topology and a binary relation. In this paper, we introduce topological-frame products $\text{L}_1 \times_{tf} \text{L}_2$ of modal logics, providing a complete axiomatization of $\text{S4} \times_{tf} \text{L}$, whenever $\text{L}$ is a Kripke complete Horn axiomatizable extension of the modal logic D: these extensions include $\text{T}$, $\text{S4}$ and $\text{S5}$, but not $\text{K}$ or $\text{K4}$. We leave open the problem of axiomatizing $\text{S4} \times_{tf} \text{K}$, $\text{S4} \times_{tf} \text{K4}$, and other related logics. When $\text{L} = \text{S4}$, our result confirms a conjecture of van Benthem et al. concerning the logic of products of Alexandrov spaces with arbitrary topological spaces.
Rospocher, Marco; Corcoglioniti, Francesco; Palmero Aprosio, Alessio
PreMOn is a freely available linguistic resource for exposing predicate models (PropBank, NomBank, VerbNet, and FrameNet) and mappings between them (e.g., SemLink and the predicate matrix) as linguistic linked open data (LOD). It consists of two components: (1) the PreMOn Ontology, which builds on the OntoLex-Lemon model by the W3C Ontology-Lexica community group to enable a homogeneous representation of data from various predicate models and their linking to ontological resources; and (2) the PreMOn Dataset, a LOD dataset integrating various versions of the aforementioned predicate models and mappings, linked to other LOD ontologies and resources (e.g., FrameBase, ESO, WordNet RDF). PreMOn is accessible online in different ways (e.g., SPARQL endpoint), and extensively documented.
Millson, Jared
In recent years, the effort to formalize erotetic inferences (i.e., inferences to and from questions) has become a central concern for those working in erotetic logic. However, few have sought to formulate a proof theory for these inferences. To fill this lacuna, we construct a calculus for (classes of) sequents that is sound and complete for two species of erotetic inferences studied by Inferential Erotetic Logic (IEL): erotetic evocation and erotetic implication. While an effort has been made to axiomatize the former in a sequent system, there is currently no proof theory for the latter. Moreover, the extant axiomatization of erotetic evocation fails to capture its defeasible character and provides no rules for introducing or eliminating question-forming operators. In contrast, our calculus encodes defeasibility conditions on sequents and provides rules governing the introduction and elimination of erotetic formulas. We demonstrate that an elimination theorem holds for a version of the cut rule that applies to both declarative and erotetic formulas and that the rules for the axiomatic account of question evocation in IEL are admissible in our system.
Fang, Jie
An endomorphism on an algebra $${\mathcal {A}}$$ is said to be strong if it is compatible with every congruence on $${\mathcal {A}}$$; and $${\mathcal {A}}$$ is said to have the strong endomorphism kernel property if every congruence on $${\mathcal {A}}$$, other than the universal congruence, is the kernel of a strong endomorphism on $${\mathcal {A}}$$. Here we characterise, via Priestley duality, the structure of those Ockham algebras with balanced pseudocomplementation that have this property.
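The two definitions above can be written symbolically as follows. This is only a restatement under assumed notation: the symbols $\operatorname{End}$, $\operatorname{Con}$, $\theta$, $\iota$ (the universal congruence) and $A$ (the underlying set of $\mathcal{A}$) are our choices and do not come from the abstract.

```latex
% Assumed notation: End(A) = endomorphisms, Con(A) = congruences,
% \iota = universal congruence, A = underlying set of \mathcal{A}.
\[
f \in \operatorname{End}(\mathcal{A}) \text{ is strong}
\iff
\forall \theta \in \operatorname{Con}(\mathcal{A})\ \forall a, b \in A:\
(a, b) \in \theta \implies (f(a), f(b)) \in \theta .
\]
\[
\mathcal{A} \text{ has the strong endomorphism kernel property}
\iff
\forall \theta \in \operatorname{Con}(\mathcal{A}) \setminus \{\iota\}\
\exists f \text{ strong}:\
\theta = \ker f = \{(a, b) \in A^{2} : f(a) = f(b)\} .
\]
```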
Saeed, Ali; Nawab, Rao Muhammad Adeel; Stevenson, Mark; Rayson, Paul
The aim of word sense disambiguation (WSD) is to correctly identify the meaning of a word in context. All natural languages exhibit word sense ambiguities and these are often hard to resolve automatically. Consequently, WSD is considered an important problem in natural language processing (NLP). Standard evaluation resources are needed to develop, evaluate and compare WSD methods. A range of initiatives have led to the development of benchmark WSD corpora for a wide range of languages from various language families. However, there is a lack of benchmark WSD corpora for South Asian languages including Urdu, despite there being over 300 million Urdu speakers and a large amount of Urdu digital text available online. To address that gap, this study describes a novel benchmark corpus for the Urdu Lexical Sample WSD task. This corpus contains 50 target words (30 nouns, 11 adjectives, and 9 verbs). A standard, manually crafted dictionary called Urdu Lughat is used as a sense inventory. Four baseline WSD approaches were applied to the corpus. The results show that the best performance was obtained using a simple Bag of Words approach. To encourage NLP research on the Urdu language, the corpus is freely available to the research community.
Aglianó, Paolo
In this paper we investigate splitting algebras in varieties of logics, with special consideration for varieties of BL-algebras and similar structures. In the case of the variety of all BL-algebras a complete characterization of the splitting algebras is obtained.
Czelakowski, Janusz
This paper, being a companion to the book [2], elaborates the deontology of sequential and compound actions based on relational models and formal constructs borrowed from formal linguistics. The semantic constructions presented in this paper emulate to some extent the content of [3] but are more involved. Although the present work should be regarded as a sequel of [3], it is self-contained and may be read independently. The issue of permission and obligation of actions is presented in the form of a logical system. This system is semantically defined by providing its intended models in which the role of actions of various types (atomic, sequential and compound ones) is accentuated. Since the consequence relation of this system is not finitary, other semantically defined variants of the system are defined. The focus is on the finitary system in which only finite compound actions are admissible. An adequate axiom system for it is defined. The strong completeness theorem is the central result. The role of the canonical model in the proof of the completeness theorem is emphasized.
Wang, Yanshan; Afzal, Naveed; Fu, Sunyang; Wang, Liwei; Shen, Feichen; Rastegar-Mojarad, Majid; Liu, Hongfang
The adoption of electronic health records (EHRs) has enabled a wide range of applications leveraging EHR data. However, the meaningful use of EHR data largely depends on our ability to efficiently extract and consolidate information embedded in clinical text, where natural language processing (NLP) techniques are essential. Semantic textual similarity (STS), which measures the semantic similarity between text snippets, plays a significant role in many NLP applications. In the general NLP domain, STS shared tasks have made available a huge collection of text snippet pairs with manual annotations in various domains. In the clinical domain, STS can enable us to detect and eliminate redundant information, which may lead to a reduction in cognitive burden and an improvement in the clinical decision-making process. This paper elaborates our efforts to assemble a resource for STS in the medical domain, MedSTS. It consists of a total of 174,629 sentence pairs gathered from a clinical corpus at Mayo Clinic. A subset of MedSTS (MedSTS_ann) containing 1068 sentence pairs was annotated by two medical experts with semantic similarity scores of 0–5 (low to high similarity). We further analyzed the medical concepts in the MedSTS corpus, and tested four STS systems on the MedSTS_ann corpus. In the future, we will organize a shared task by releasing the MedSTS_ann corpus to motivate the community to tackle real-world clinical problems.
Grilletti, Gianluca
Classical first-order logic $$\texttt {FO}$$ is commonly used to study logical connections between statements, that is, sentences that in every context have an associated truth-value. Inquisitive first-order logic $$\texttt {InqBQ}$$ is a conservative extension of $$\texttt {FO}$$ which captures not only connections between statements, but also between questions. In this paper we prove the disjunction and existence properties for $$\texttt {InqBQ}$$ relative to inquisitive disjunction and inquisitive existential quantifier $$\overline{\exists }$$. Moreover, we extend these results to several families of theories, among which the one in the language of $$\texttt {FO}$$. To this end, we initiate a model-theoretic approach to the study of $$\texttt {InqBQ}$$. In particular, we develop a toolkit of basic constructions in order to transform and combine models of $$\texttt {InqBQ}$$.
Manzano, Maria; Martins, Manuel; Huertas, Antonia
Equational hybrid propositional type theory ($$\mathsf {EHPTT}$$) is a combination of propositional type theory, equational logic and hybrid modal logic. The structures used to interpret the language contain a hierarchy of propositional types, an algebra (a nonempty set with functions) and a Kripke frame. The main result in this paper is the proof of completeness of a calculus specifically defined for this logic. The completeness proof is based on the three proofs Henkin published last century: (i) Completeness in type theory, (ii) The completeness of the first-order functional calculus and (iii) Completeness in propositional type theory. More precisely, from (i) and (ii) we take the idea of building the model described by the maximal consistent set; in our case the maximal consistent set has to be named, $$\Diamond $$-saturated and extensionally algebraic-saturated due to the hybrid and equational nature of $$\mathsf {EHPTT}$$. From (iii), we use the result that any element in the hierarchy has a name. The challenge was to deal with all the heterogeneous components in an integrated system.
Kabala, Jakub
This paper applies computational methods of authorship attribution to shed light on a still open question concerning two Latin works of the twelfth century: are the anonymous authors of the Translatio s. Nicolai (ca. 1101–1108) and the Gesta principum polonorum (ca. 1113–1117) one and the same person? The Translatio was written by the so-called Monk of Lido and describes Venice’s role in the First Crusade. The Gesta were written by the so-called Gallus Anonymous and contain a panegyric of the contemporary Polish ruler, Bolesław III the Wry-Mouthed (r. 1102–1138). This study attributes authorship to these works within four corpora of Latin texts composed between the tenth and twelfth centuries, each with between 39 and 116 texts written by between 15 and 22 different authors. The goal of including four corpora is to see how robust the similarity between the target texts is to changes in text length, genre, and class balance in the corpora. In each corpus, nine different distance metrics and one machine-learning algorithm are used to classify the authors of the Translatio and Gesta. I conclude that it is highly likely that Gallus and Monk were indeed one and the same anonymous author, and highlight the effectiveness of the Bray–Curtis distance and logistic regression as methods of attribution.
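As a rough illustration of the Bray–Curtis distance named above, the sketch below compares two texts by the relative frequencies of a few words. The feature choice, the toy token lists, and the function names are assumptions made for illustration only; the abstract does not state which features or preprocessing the study used.

```python
from collections import Counter


def relative_frequencies(tokens, vocabulary):
    """Return the relative frequency of each vocabulary word in a token list."""
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[word] / total for word in vocabulary]


def bray_curtis(u, v):
    """Bray-Curtis distance: sum|u_i - v_i| / sum|u_i + v_i|."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(abs(a + b) for a, b in zip(u, v))
    # Two all-zero vectors are treated as identical (an assumption).
    return numerator / denominator if denominator else 0.0


# Toy token lists, invented for illustration (not taken from the corpora).
text_a = "et in et ad".split()
text_b = "et ad ad in".split()
vocabulary = ["et", "in", "ad"]  # hypothetical feature set

freq_a = relative_frequencies(text_a, vocabulary)
freq_b = relative_frequencies(text_b, vocabulary)
distance = bray_curtis(freq_a, freq_b)
print(distance)
```

In a nearest-neighbour attribution setup, a disputed text would be assigned to the candidate author whose texts lie at the smallest distance; whether the study proceeds exactly this way is not stated in the abstract.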
Streufert, Peter A.
It would be useful to have a category of extensive-form games whose isomorphisms specify equivalences between games. Since working with entire games is too large a project for a single paper, I begin here with preforms, where a “preform” is a rooted tree together with choices and information sets. In particular, this paper first defines the category $$\mathbf {Tree}$$, whose objects are “functioned trees”, which are specially designed to be incorporated into preforms. I show that $$\mathbf {Tree}$$ is isomorphic to the full subcategory of $$\mathbf {Grph}$$ whose objects are converging arborescences. Then the paper defines the category $$\mathbf {NCP}$$, whose objects are “node-and-choice preforms”, each of which consists of a node set, a choice set, and an operator mapping node-choice pairs to nodes. I characterize the $$\mathbf {NCP}$$ isomorphisms, define a forgetful functor from $$\mathbf {NCP}$$ to $$\mathbf {Tree}$$, and show that $$\mathbf {Tree}$$ is equivalent to the full subcategory of $$\mathbf {NCP}$$ whose objects are perfect-information preforms. The paper also shows that many game-theoretic entities can be derived from preforms, and that these entities are well-behaved with respect to $$\mathbf {NCP}$$ morphisms and isomorphisms.
Eckert, Daniel; Herzberg, Frederik S.
Arrow’s axiomatic foundation of social choice theory can be understood as an application of Tarski’s methodology of the deductive sciences, which is closely related to the latter’s foundational contribution to model theory. In this note we show in a model-theoretic framework how Arrow’s use of von Neumann and Morgenstern’s concept of winning coalitions allows one to exploit the algebraic structures involved in preference aggregation; this approach entails an alternative indirect ultrafilter proof for Arrow’s dictatorship result. This link also connects Arrow’s seminal result to key developments and concepts in the history of model theory, notably ultraproducts and preservation results.
Flaminio, T.; Hosni, H.; Lapenta, S.
This paper introduces a logical analysis of convex combinations within the framework of Łukasiewicz real-valued logic. This provides a natural link between the fields of many-valued logics and decision theory under uncertainty, where the notion of convexity plays a central role. We set out to explore such a link by defining convex operators on MV-algebras, which are the equivalent algebraic semantics of Łukasiewicz logic. This gives us a formal language to reason about the expected value of bounded random variables. As an illustration of the applicability of our framework we present a logical version of the Anscombe–Aumann representation result.
Herzberg, Frederik
Cerreia-Vioglio et al. (Econ Theory 48(2–3):341–375, 2011) have proposed a very general axiomatisation of preferences in the presence of ambiguity, viz. Monotonic Bernoullian Archimedean preference orderings. This paper investigates the problem of Arrovian aggregation of such preferences and proves dictatorial impossibility results for both finite and infinite populations. Applications for the special case of aggregating expected-utility preferences are given. A novel proof methodology for special aggregation problems, based on model theory (in the sense of mathematical logic), is employed.
Litak, Tadeusz
This paper criticizes nonconstructive uses of set theory in formal economics. The main focus is on results on preference aggregation and Arrow’s theorem for infinite electorates, but the present analysis would apply as well, e.g., to analogous results in intergenerational social choice. To separate justified and unjustified uses of infinite populations in social choice, I suggest a principle which may be called the Hildenbrand criterion and argue that results based on the unrestricted axiom of choice do not meet this criterion. The technically novel part of this paper is a proposal to use a set-theoretic principle known as the axiom of determinacy ($$\mathsf {AD}$$), not as a replacement for Choice, but simply to eliminate applications of set theory violating the Hildenbrand criterion. A particularly appealing aspect of $$\mathsf {AD}$$ from the point of view of the research area in question is its game-theoretic character.
Fišer, Darja; Ljubešić, Nikola; Erjavec, Tomaž
The paper presents the results of the Janes project, which aimed to develop language resources and tools for Slovene user-generated content. The paper first describes the 200-million-word Janes corpus, containing tweets, forum posts, news comments, user and talk pages from Wikipedia, and blogs and blog comments, where each text is accompanied by rich metadata. The developed processing tools for Slovene user-generated content are presented next, which include a tokeniser, word normaliser, part-of-speech tagger and lemmatiser, and a named entity recogniser. A set of manually annotated datasets was also produced, both for tool training and for linguistic research.
The developed resources and tools are made publicly available under Creative Commons licences in the repository of the CLARIN.SI research infrastructure and on GitHub, while the corpora are also available through the CLARIN.SI concordancers.
Cruz, Lilian J.; Poveda, Yuri A.
An explicit categorical equivalence is defined between a proper subvariety of the class of $${ PMV}$$-algebras, as defined by Di Nola and Dvurečenskij, to be called $${ PMV}_{f}$$-algebras, and the category of semi-low $$f_u$$-rings. This categorical representation is done using the prime spectrum of the $${ MV}$$-algebras, through the equivalence between $${ MV}$$-algebras and $$l_u$$-groups established by Mundici, from the perspective of the Dubuc–Poveda approach, which extends the construction defined by Chang on chains. As a particular case, semi-low $$f_u$$-rings associated to Boolean algebras are characterized.
Standefer, Shawn
Two common forms of natural deduction proof systems are found in the Gentzen–Prawitz and Jaśkowski–Fitch systems. In this paper, I provide translations between proofs in these systems, pointing out the ways in which the translations highlight the structural rules implicit in the systems. These translations work for classical, intuitionistic, and minimal logic. I then provide translations for classical S4 proofs.
Martin, Éric
Parametric logic is a framework that generalises classical first-order logic. A generalised notion of logical consequence (a form of preferential entailment based on a closed world assumption) is defined as a function of some parameters. A concept of possible knowledge base, the counterpart to the consistent theories of first-order logic, is introduced. The notion of compactness is weakened. The degree of weakening is quantified by a non-null ordinal: the larger the ordinal, the more significant the weakening. For every possible knowledge base T, a hierarchy of sentences that are generalised logical consequences of T is built. The first layer of the hierarchies corresponds to sentences that can be obtained by a deductive inference, characterised by the compactness property. The second layer of the hierarchies corresponds to sentences that can be obtained by an inductive inference, characterised by the property of weak compactness quantified by 1. Weaker forms of compactness, quantified by non-null ordinals, determine higher layers in the hierarchies, corresponding to more complex inferences. The naturalness of the hierarchies built over the possible knowledge bases is attested by fundamental connections with notions from Learning theory (classification in the limit, with or without a bounded number of mind changes) and from topology (in reference to the Borel and the difference hierarchies). In this paper, we introduce the key model-theoretic aspects of Parametric logic, justify the concept of the knowledge base, define the hierarchies of generalised logical consequences and illustrate their relevance to Nonmonotonic reasoning.
More specifically, we show that the degree of nonmonotonicity that is required to infer a sentence can be characterised by the least non-null ordinal that quantifies the weakening of compactness used to locate the inferred sentence in the hierarchies.
Schulte, Oliver
Occam’s razor directs us to adopt the simplest hypothesis consistent with the evidence. Learning theory provides a precise definition of the inductive simplicity of a hypothesis for a given learning problem. This definition specifies a learning method that implements an inductive version of Occam’s razor. As a case study, we apply Occam’s inductive razor to causal learning. We consider two causal learning problems: learning a causal graph structure that presents global causal connections among a set of domain variables, and learning context-sensitive causal relationships that hold not globally, but only relative to a context. For causal graph learning, Occam’s inductive razor directs us to adopt the model that explains the observed correlations with a minimum number of direct causal connections. For expanding a causal graph structure to include context-sensitive relationships, Occam’s inductive razor directs us to adopt the expansion that explains the observed correlations with a minimum number of free parameters. This is equivalent to explaining the correlations with a minimum number of probabilistic logical rules. The paper provides a gentle introduction to the learning-theoretic definition of inductive simplicity and the application of Occam’s razor for causal learning.
Şahin, Gözde Gül; Adalı, Eşref
In this work, we report large-scale semantic role annotation of arguments in the Turkish dependency treebank, and present the first comprehensive Turkish semantic role labeling (SRL) resource: Turkish Proposition Bank (PropBank). We present our annotation workflow that harnesses crowd intelligence, and discuss the procedures for ensuring annotation consistency and quality control. Our discussion focuses on syntactic variations in realization of predicate-argument structures, and the large lexicon problem caused by complex derivational morphology. We describe our approach that exploits framesets of root verbs to abstract away from syntax and increase self-consistency of the Turkish PropBank. The issues that arise in the annotation of verbs derived via valency-changing morphemes, verbal nominals, and nominal verbs are explored, and evaluation results for inter-annotator agreement are provided. Furthermore, the semantic layer described here is aligned with a universal dependency (UD) compliant treebank and released to enable more researchers to work on the problem. Finally, we use PropBank to establish a baseline score of 79.10 F1 for Turkish SRL using the mate-tool (an open-source SRL tool based on supervised machine learning) enhanced with basic morphological features. Turkish PropBank and the extended SRL system are made publicly available.
Cecchini, Flavio Massimiliano; Riedl, Martin; Fersini, Elisabetta; Biemann, Chris
This article presents a comparison of different Word Sense Induction (wsi) clustering algorithms on two novel pseudoword data sets of semantic-similarity and co-occurrence-based word graphs, with a special focus on the detection of homonymic polysemy. We follow the original definition of a pseudoword as the combination of two monosemous terms and their contexts to simulate a polysemous word. The evaluation is performed comparing the algorithm’s output on a pseudoword’s ego word graph (i.e., a graph that represents the pseudoword’s context in the corpus) with the known subdivision given by the components corresponding to the monosemous source words forming the pseudoword. The main contribution of this article is to present a self-sufficient pseudoword-based evaluation framework for wsi graph-based clustering algorithms, thereby defining a new evaluation measure (top2) and a secondary clustering process (hyperclustering). To our knowledge, we are the first to conduct and discuss a large-scale systematic pseudoword evaluation targeting the induction of coarse-grained homonymous word senses across a large number of graph clustering algorithms.
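The pseudoword idea described above (merging two monosemous terms and their contexts into one artificial ambiguous word) can be sketched as follows. This is a minimal illustration under assumptions: the function name, the underscore naming of the pseudoword, and the toy sentences are hypothetical, and the article's word-graph construction and clustering steps are not reproduced here.

```python
def build_pseudoword_corpus(sentences, word_a, word_b):
    """Replace occurrences of two source words with one pseudoword token.

    Returns the pseudoword and a list of (tokens, gold_sense) pairs, where
    gold_sense records which source word originally stood in that context.
    """
    pseudoword = f"{word_a}_{word_b}"  # naming scheme is an assumption
    merged = []
    for tokens in sentences:
        for position, token in enumerate(tokens):
            if token in (word_a, word_b):
                replaced = tokens[:position] + [pseudoword] + tokens[position + 1:]
                merged.append((replaced, token))
    return pseudoword, merged


# Toy tokenised sentences, invented for illustration.
sentences = [
    ["the", "banana", "is", "ripe"],
    ["open", "the", "door"],
    ["no", "target", "here"],
]

pseudoword, contexts = build_pseudoword_corpus(sentences, "banana", "door")
for tokens, gold_sense in contexts:
    print(gold_sense, tokens)
```

Because the gold sense of every context is known by construction, the output of a sense induction algorithm on the pseudoword's contexts can be scored without manual annotation.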
Astrakhantsev, Nikita
Automatically recognized terminology is widely used for various domain-specific text processing tasks, such as machine translation, information retrieval or ontology construction. However, there is still no agreement on which methods are best suited for particular settings and, moreover, there is no reliable comparison of already developed methods. We believe that one of the main reasons is the lack of state-of-the-art method implementations, which are usually non-trivial to recreate, mostly in terms of software engineering efforts. In order to address these issues, we present ATR4S, an open-source software written in Scala that comprises 13 state-of-the-art methods for automatic terminology recognition (ATR) and implements the whole pipeline from text document preprocessing, to term candidates collection, term candidate scoring, and finally, term candidate ranking. It is a highly scalable, modular and configurable tool with support for automatic caching. We also compare 13 state-of-the-art methods on 7 open datasets by average precision and processing time. Experimental comparison reveals that no single method demonstrates best average precision for all datasets and that other available tools for ATR do not contain the best methods.
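To make the average precision measure mentioned above concrete, here is a minimal sketch for a ranked list of term candidates. It is written in Python for brevity and is not ATR4S code; the function name and toy data are hypothetical, and normalising by the number of gold terms found in the ranking (rather than by all gold terms, or at a fixed cutoff) is one common variant chosen here as an assumption.

```python
def average_precision(ranked_candidates, gold_terms):
    """Mean of precision@k over the ranks k at which a gold term appears."""
    hits = 0
    precision_sum = 0.0
    for rank, candidate in enumerate(ranked_candidates, start=1):
        if candidate in gold_terms:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Assumption: normalise by the number of gold terms actually retrieved.
    return precision_sum / hits if hits else 0.0


# Toy ranking and gold standard, invented for illustration.
ranked = ["neural network", "the", "word sense", "of"]
gold = {"neural network", "word sense"}

ap = average_precision(ranked, gold)
print(round(ap, 4))
```

Here gold terms appear at ranks 1 and 3, so the score is the mean of 1/1 and 2/3.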
Bučar, Jože; Žnidaršič, Martin; Povh, Janez
In this study, we introduce Slovene web-crawled news corpora with sentiment annotation on three levels of granularity: sentence, paragraph and document levels. We describe the methodology and tools that were required for their construction. The corpora contain more than 250,000 documents with political, business, economic and financial content from five Slovene media resources on the web. More than 10,000 of them were manually annotated as negative, neutral or positive. All corpora are publicly available under a Creative Commons copyright license. We used the annotated documents to construct a Slovene sentiment lexicon, which is the first of its kind for Slovene, and to assess the sentiment classification approaches used. The constructed corpora were also utilised to monitor within-the-document sentiment dynamics, its changes over time and relations with news topics. We show that sentiment is, on average, more explicit at the beginning of documents, and it loses sharpness towards the end of documents.
Hee, Cynthia; Lefever, Els; Hoste, Véronique
To push the state of the art in text mining applications, research in natural language processing has increasingly been investigating automatic irony detection, but manually annotated irony corpora are scarce. We present the construction of a manually annotated irony corpus based on a fine-grained annotation scheme that allows for identification of different types of irony. We conduct a series of binary classification experiments for automatic irony recognition using a support vector machine (SVM) that exploits a varied feature set and compare this method to a deep learning approach that is based on an LSTM network and (pre-trained) word embeddings. Evaluation on a held-out corpus shows that the SVM model outperforms the neural network approach and benefits from combining lexical, semantic and syntactic information sources. A qualitative analysis of the classification output reveals that the classifier performance may be further enhanced by integrating implicit sentiment information and context- and user-based features.
Majewska, Olga; Vulić, Ivan; McCarthy, Diana; Huang, Yan; Murakami, Akira; Laippala, Veronika; Korhonen, Anna
VerbNet, the most extensive online verb lexicon currently available for English, has proved useful in supporting a variety of NLP tasks. However, its exploitation in multilingual NLP has been limited by the fact that such classifications are available for few languages only. Since manual development of VerbNet is a major undertaking, researchers have recently translated VerbNet classes from English to other languages. However, no systematic investigation has been conducted into the applicability and accuracy of such a translation approach across different, typologically diverse languages. Our study is aimed at filling this gap. We develop a systematic method for translation of VerbNet classes from English to other languages which we first apply to Polish and subsequently to Croatian, Mandarin, Japanese, Italian, and Finnish. Our results on Polish demonstrate high translatability with all the classes (96% of English member verbs successfully translated into Polish) and strong inter-annotator agreement, revealing a promising degree of overlap in the resultant classifications. The results on other languages are equally promising. This demonstrates that VerbNet classes have strong cross-lingual potential and the proposed method could be applied to obtain gold standards for automatic verb classification in different languages. We make our annotation guidelines and the six language-specific verb classifications available with this paper.
Shterionov, Dimitar; Superbo, Riccardo; Nagle, Pat; Casanellas, Laura; O’Dowd, Tony; Way, Andy
Neural machine translation (NMT) has recently gained substantial popularity not only in academia, but also in industry. For its acceptance in industry it is important to investigate how NMT performs in comparison to the phrase-based statistical MT (PBSMT) model, which until recently was the dominant MT paradigm. In the present work, we compare the quality of the PBSMT and NMT solutions of KantanMT (a commercial platform for custom MT) that are tailored to accommodate large-scale translation production, where there is a limited amount of time to train an end-to-end system (NMT or PBSMT). In order to satisfy the time requirements of our production line, we restrict the NMT training time to 4 days; training a PBSMT system typically requires no longer than one day with the current training pipeline of KantanMT. To train both NMT and PBSMT engines for each language pair, we strictly use the same parallel corpora and the same pre- and post-processing steps (when applicable). Our results show that, even with time-restricted training of 4 days, NMT quality substantially surpasses that of PBSMT. Furthermore, we challenge the reliability of automatic quality evaluation metrics based on n-gram comparison (in particular F-measure, BLEU and TER) for NMT quality evaluation. We support our hypothesis with both analytical and empirical evidence. We investigate how suitable these metrics are when comparing the two different paradigms.
Castilho, Sheila; Moorkens, Joss; Gaspari, Federico; Sennrich, Rico; Way, Andy; Georgakopoulou, Panayota
This article reports a multifaceted comparison between statistical and neural machine translation (MT) systems that were developed for translation of data from massive open online courses (MOOCs). The study uses four language pairs: English to German, Greek, Portuguese, and Russian. Translation quality is evaluated using automatic metrics and human evaluation, carried out by professional translators. Results show that neural MT is preferred in side-by-side ranking, and is found to contain fewer overall errors. Results are less clear-cut for some error categories, and for temporal and technical post-editing effort. In addition, results are reported based on sentence length, showing advantages and disadvantages depending on the particular language pair and MT paradigm.
Wu, Liang; Morstatter, Fred; Liu, Huan
Sentiment information about social media posts is increasingly considered an important resource for customer segmentation, market understanding, and tackling other socioeconomic issues.
However, sentiment in social media is difficult to measure since user-generated content is usually short and informal. Although many traditional sentiment analysis methods have been proposed, identifying slang sentiment words remains a challenging task for practitioners. Though some slang words are available in existing sentiment lexicons, with new slang being generated with emerging memes, a dedicated lexicon will be useful for researchers and practitioners. To this end, we propose to build a slang sentiment dictionary to aid sentiment analysis.
It is laborious and time-consuming to collect a comprehensive list of slang words and label the sentiment polarity. We present an approach to leverage web resources to construct a Slang Sentiment Dictionary (SlangSD) that is easy to expand. SlangSD is publicly available for research purposes. We empirically show the advantages of using SlangSD, the newly built slang sentiment word dictionary, for sentiment classification, and provide examples demonstrating its ease of use with a sentiment analysis system.
Klubička, Filip; Toral, Antonio; Sánchez-Cartagena, Víctor M.
This paper presents a quantitative fine-grained manual evaluation approach to comparing the performance of different machine translation (MT) systems. We build upon the well-established multidimensional quality metrics (MQM) error taxonomy and implement a novel method that assesses whether the differences in performance for MQM error types between different MT systems are statistically significant. We conduct a case study for English-to-Croatian, a language direction that involves translating into a morphologically rich language, for which we compare three MT systems belonging to different paradigms: pure phrase-based, factored phrase-based and neural. First, we design an MQM-compliant error taxonomy tailored to the relevant linguistic phenomena of Slavic languages, which made the annotation process feasible and accurate. Errors in MT outputs were then annotated by two annotators following this taxonomy. Subsequently, we carried out a statistical analysis which showed that the best-performing system (neural) reduces the errors produced by the worst system (pure phrase-based) by more than half (54%). Moreover, we conducted an additional analysis of agreement errors in which we distinguished between short (phrase-level) and long-distance (sentence-level) errors. We discovered that phrase-based MT approaches are of limited use for long-distance agreement phenomena, for which neural MT was found to be especially effective.
Rago, Alejandro; Marcos, Claudia; Diaz-Pace, J. Andres
Engineering activities often produce considerable documentation as a by-product of the development process. Due to the complexity of these documents, technical analysts can benefit from text processing techniques able to identify concepts of interest and analyze deficiencies of the documents in an automated fashion. In practice, text sentences from the documentation are usually transformed to a vector space model, which is suitable for traditional machine learning classifiers. However, such transformations suffer from problems of synonyms and ambiguity that cause classification mistakes. For alleviating these problems, there has been a growing interest in the semantic enrichment of text. Unfortunately, using general-purpose thesauri and encyclopedias to enrich technical documents belonging to a given domain (e.g. requirements engineering) often introduces noise and does not improve classification. In this work, we aim at boosting text classification by exploiting information about semantic roles. We have explored this approach when building a multi-label classifier for identifying special concepts, called domain actions, in textual software requirements. After evaluating various combinations of semantic roles and text classification algorithms, we found that this kind of semantically enriched data leads to improvements of up to 18% in both precision and recall, when compared to non-enriched data. Our enrichment strategy based on semantic roles also allowed classifiers to reach acceptable accuracy levels with small training sets. Moreover, semantic roles outperformed Wikipedia- and WordNet-based enrichments, which failed to boost requirements classification with several techniques. These results drove the development of two requirements tools, which we successfully applied in the processing of textual use cases.
Popović, Maja
1 Citations
This work presents an extensive comparison of language-related problems for neural machine translation (NMT) and phrase-based machine translation (PBMT) for German-to-English, English-to-German and English-to-Serbian. The explored issues are related both to the characteristics of the languages and to the (machine) translation process and, although related, go beyond typical translation error classes. It is shown that the main advantage of the NMT approach consists of better generating verb forms, avoiding verb omissions, as well as better handling of English noun collocations and negation. It is also shown that the main obstacles for the NMT system are prepositions, translation of English (source) ambiguous words and generating English (target) continuous and perfect tenses. In addition, preliminary experiments show that a number of issues are complementary, i.e., not occurring in the same segments and/or in the same form. This means that a combination or hybridisation of the NMT and PBMT approaches is a promising direction for improving both types of systems.
Lapponi, Emanuele; Søyland, Martin G.; Velldal, Erik; Oepen, Stephan
In this work we present the Talk of Norway (ToN) data set, a collection of Norwegian Parliament speeches from 1998 to 2016. Every speech is richly annotated with metadata harvested from different sources, and augmented with language type, sentence, token, lemma, part-of-speech, and morphological feature annotations. We also present a pilot study on party classification in the Norwegian Parliament, carried out in the context of a cross-faculty collaboration involving researchers from both Political Science and Computer Science. Our initial experiments demonstrate how the linguistic and institutional annotations in ToN can be used to gather insights on how different aspects of the political process affect classification.
Steffens, Marie
The study of the discourse functions of antonymy was developed mainly by Steven Jones (Antonymy: a corpusbased perspective. Routledge, London, 2002; Antonyms in english. Construals, constructions and canonicity. Cambridge University Press, Cambridge, 2012), who classifies antonymic cooccurrences in English into ten categories, based on the different discourse functions they can fulfil. On the basis of a similar study of antonymic discourse functions in French, this paper explores how two opposites used in the same sentence exploit our thought processes to influence the way we conceptualise the world. It focuses on sentences extracted from the newspaper Le Monde (1987–2006 and 2009–2011) in which two antonyms are used in copresence. Through the analysis of these utterances, this paper describes the discourse functions of antonymy in French and shows how the semantic and syntactic roles of copresent antonyms determine the semanticoreferential functions they perform. I then analyse how the two major (groups of) functions, the ancillary function and the coordination functions, identified in English journalistic texts by Steven Jones, produce meaning effects in French texts, and how the mechanisms underlying these functions allow opposites to manipulate us.
Milà-Garcia, Alba
Within the recentlycoined subfield of corpus pragmatics, one of the areas of interest is the study of speech acts and, specifically, how it can profit from the adoption of this methodological approach. However, the acknowledged lack of correspondence between speech acts and linguistic forms makes basic formbased corpus searches unreliable in retrieving speech acts from a corpus. In fact, functiontoform corpus research can prove much more fruitful in carrying out this kind of study, but it usually requires timeconsuming manual annotation, which in turn means that there have been few attempts to employ this methodology. As a contribution in this new direction, this study will showcase a functiontoform approach to investigating speech acts of agreement and disagreement in spoken Catalan. Through this example, this paper aims to show the benefits of designing, compiling, transcribing and, especially, annotating one’s own corpus for the study of speech acts. In order to annotate data for the study of speech acts, a complex and multilayered annotation system was designed and manually applied, so that all the different aspects that play a relevant role in the expression of agreement and disagreement could be covered. In addition to discussing the findings from this study, it is argued that the possibilities of exploitation provided by the resulting annotated corpus far outweigh the time cost and open the door to indepth analyses of speech acts and politeness in naturally occurring spoken data.
Badawy, Adam; Ferrara, Emilio
2 Citations
Using a dataset of over 1.9 million messages posted on Twitter by about 25,000 ISIS sympathizers, we explore how ISIS makes use of social media to spread its propaganda and recruit militants from the Arab world and across the globe. By distinguishing between violencedriven, theological, and sectarian content, we trace the connection between online rhetoric and key events on the ground. To the best of our knowledge, ours is one of the first studies to focus on Arabic content, while most literature focuses on English content. Our findings yield new important insights about how social media is used by radical militant groups to target the Arabspeaking world, and reveal important patterns in their propaganda efforts.
Fix, Blair
1 Citations
What explains the power-law distribution of top incomes? This paper tests the hypothesis that it is firm hierarchy that creates the power-law income distribution tail. Using the available case-study evidence on firm hierarchy, I create the first large-scale simulation of the hierarchical structure of the US private sector. Although not tuned to do so, this model reproduces the power-law scaling of top US incomes. I show that this is purely an effect of firm hierarchy. This raises the possibility that the ubiquity of power-law income distribution tails is due to the ubiquity of hierarchical organization in human societies.
Asatani, Kimitaka; Toriumi, Fujio; Mori, Junichiro; Ochi, Masanao; Sakata, Ichiro
With increases in the amount of human trajectory data, interest in explaining or predicting human mobility is growing. Owing to the difficulty of associating mobility data with interpersonal relationship data, previous studies on the link between interpersonal relationships and mobility are limited to the specific activities of particular users. In this paper, we propose a method for detecting interpersonal relationships from mobility data, while distinguishing these relationships from those of familiar strangers such as commuters. In the method, persons who take diverse variations within the same activities are recognized as a pair. From IC card data covering the daily mobility of six million people over three years, we detected millions of frequently colocated pairs. Under certain conditions, most of the detected pairs are confirmed as not being familiar strangers, but rather to have an interpersonal relationship. Next, we analyzed the detected pairs and found that the density of the relationships between groups was divided by gender and age and was found to be asymmetric by gender. For example, an elderly male person is not likely to take trips as a pair with a samegender elderly person, and this result is databased evidence for the isolation of retired men. In addition, group trips are confirmed to have an extraordinal character and sometimes converge spatiotemporally. These findings indicate that interpersonal relationship is a strong factor to determine their mobility and group observation is potentially useful for event detection.
Hodas, Nathan O.; Hunter, Jacob; Young, Stephen J.; Lerman, Kristina
In the modern knowledge economy, success demands sustained focus and high cognitive performance. Research suggests that human cognition is linked to a finite resource, and upon its depletion, cognitive functions such as selfcontrol and decisionmaking may decline. While fatigue, among other factors, affects human activity, how cognitive performance evolves during extended periods of focus remains poorly understood. By analyzing performance of a large cohort answering practice standardized test questions online, we show that accuracy and learning decline as the test session progresses and recover following prolonged breaks. To explain these findings, we hypothesize that answering questions consumes some finite cognitive resources on which performance depends, but these resources recover during breaks between test questions. We propose a dynamic mechanism of the consumption and recovery of these resources and show that it explains empirical findings and predicts performance better than alternative hypotheses. While further controlled experiments are needed to identify the physiological origin of these phenomena, our work highlights the potential of empirical analysis of largescale human behavior data to explore cognitive behavior.
Timmis, Ivor
1 Citations
This paper discusses the way early nineteenth century English paupers used language for the pragmatic purpose of securing charitable relief. The paper is based on two historical sources: (1) The Essex Pauper Letters (Sokoll in Essex pauper letters, 1731–1837, Oxford University Press, Oxford, 2001), which consists of letters written by paupers applying for charitable relief, and (2) the Mayhew Corpus, a corpus of interviews with the destitute of London carried out by Sir Henry Mayhew in the 1850s. The paper focuses on certain grammatical differences between the language of the pauper letters and the language in the Mayhew Corpus. From this analysis, it emerges that the pauper writers made markedly less use of certain vernacular features than speakers in the Mayhew Corpus. The features not used to any great extent in the pauper letters but present in the Mayhew Corpus are: vernacular relative pronouns (as and what); vernacular preterites and past participles; aprefixing; and nonstandard verbal ‘s’ ending. It is argued that the infrequency of these features in the pauper letters indicates that the pauper writers were orienting towards the emergent notion of Standard English. However, in contrast to this argument, we find that multiple negation, a low prestige vernacular feature, occurs with similar frequency in both The Essex Pauper Letters and the Mayhew Corpus. The main argument of the paper, in the light of this apparent contradiction, is that, in some cases, the pauper writers’ attempts to orient towards prestige forms faltered as they were dealing with the emotive issues of health, welfare and money.
Ní Mhurchú, Aoife
This paper examines progressive forms in an Irish English context. Through corpus based analysis, it identifies a number of nonstandard progressive structures which are then isolated for more qualitative discourse analysis, drawing upon past studies of aspect in Irish English, and applying a pragmatic framework, where appropriate, to discuss issues surrounding these structures. The primary data are accessed from the Limerick Corpus of Irish English, a 1millionword corpus of spoken Irish English, and then crossreferenced using three other corpora including from British and American English, in order to allow for crossvarietal comparisons. The study finds that the progressive acts as a softener in imperative structures or structures with a similar illocutionary force and as an intensifier in the habitual do be Ving. Of particular note is be going + Ving, a muchneglected structure in studies of Irish English to date, but which this study found to have a unique syntax and pragmatic function.
Haim, Mario; Weimann, Gabriel; Brosius, Hans-Bernd
1 Citations
Since the early introduction of the notion of agendasetting, researchers have attempted to determine the factors that shape media agendas. One of the key sources of media agenda has been identified as intermedia flow, which various studies revealed in the offlinetoonlinetoSNS media context. While most of them focused on the offlineonline flow, the present study examines agendasetting within the new online platforms in various countries, thus allowing for crosscountry and crossmedia comparisons. We applied timeseries analysis to new online media and to traditional online media samples in the context of Edward Snowden’s NSA revelations. Our findings of intermedia agendasetting effects show a moderate but consistent flow from new online media to traditional online media. This highlights the importance of studying these new directions of agenda flow. Apart from that, no profound agendasetting patterns can be found elsewhere. Possible reasons and implications are discussed.
Boeva, Veselka; Lundberg, Lars; Kota, Sai M. Harsha; Sköld, Lars
In this work, we apply cluster validation measures for analyzing email communications at an organizational level of a company. This analysis can be used to evaluate the company structure and to produce further recommendations for structural improvements. Our evaluations, based on data in the forms of email logs and organizational structure for a large European telecommunication company, show that cluster validation techniques can be useful tools for assessing the organizational structure using objective analysis of internal email communications, and for simulating and studying different reorganization scenarios.
Huang, Qianjia; Singh, Vivek K.; Atrey, Pradeep K.
Cyberbullying is an important social challenge that takes place over a technical substrate. Thus, it has attracted research interest across both computational and social science research communities. While the social science studies conducted via careful participant selection have shown the effect of personality, social relationships, and psychological factors on cyberbullying, they are often limited in scale due to manual survey or ethnographic study components. Computational studies, on the other hand, have defined multiple automated approaches for detecting cyberbullying at scale, but have largely focused only on the textual content of the messages exchanged. There are no existing efforts aimed at testing, validating, and potentially refining the findings from traditional bullying literature, as obtained via surveys and ethnographic studies, at scale over online environments. By analyzing the social relationship graph between users in an online social network and deriving features such as out-degree centrality and the number of common friends, we find that multiple social characteristics are statistically different between the cyberbullying and non-bullying groups, thus supporting many, but not all, of the results found in previous survey-based bullying studies. The results pave the way for a better understanding of the cyberbullying phenomenon at scale.
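The two graph features named in this abstract can be illustrated with a minimal sketch. It assumes a directed graph in which an edge (u, v) means that user u sent a message to v, and it counts common friends while ignoring edge direction; both conventions, the function names, and the toy data are assumptions made here for illustration and are not taken from the paper.

```python
from collections import defaultdict


def out_degree_centrality(edges, nodes):
    """Out-degree of each node divided by (n - 1), the largest possible out-degree."""
    out_neighbors = defaultdict(set)
    for src, dst in edges:
        out_neighbors[src].add(dst)
    n = len(nodes)
    return {v: len(out_neighbors[v]) / (n - 1) for v in nodes}


def common_friends(edges, a, b):
    """Number of users adjacent to both a and b, ignoring edge direction."""
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)
    return len((neighbors[a] & neighbors[b]) - {a, b})


# Hypothetical toy graph, not data from the study.
nodes = ["ann", "bob", "cat", "dan"]
edges = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"), ("dan", "cat")]

centrality = out_degree_centrality(edges, nodes)
shared = common_friends(edges, "ann", "bob")
```

Features of this kind would then be compared between the two user groups with a statistical test; which test the authors used is not stated in the abstract.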
Frey, Seth; Goldstone, Robert L.
1 Citations
Lowlevel “adaptive” and higherlevel “sophisticated” human reasoning processes have been proposed to play opposing roles in the emergence of unpredictable collective behaviors such as crowd panics, traffic jams, and market bubbles. While adaptive processes are widely recognized drivers of emergent social complexity, complementary theories of sophistication predict that incentives, education, and other inducements to rationality will suppress it. We show in a series of multiplayer laboratory experiments that, rather than suppressing complex social dynamics, sophisticated reasoning processes can drive them. Our experiments elicit an endogenous collective behavior and show that it is driven by the human ability to recursively anticipate the reasoning of others. We identify this behavior, “sophisticated flocking”, across three games, the Beauty Contest and the “Mod Game” and “Runway Game”. In supporting our argument, we also present evidence for mental models and social norms constraining how players express their higherlevel reasoning abilities. By implicating sophisticated recursive reasoning in the kind of complex dynamic that it has been predicted to suppress, we support interdisciplinary perspectives that emergent complexity is typical of even the most intelligent populations and carefully designed social systems.
Tominaga, Tomu; Hijikata, Yoshinori; Konstan, Joseph A.
Social media—particularly services such as Twitter where most content is public—present an interesting balance between social benefits and privacy risks. Twitter users have various usage objectives to gain social benefits. As to privacy risks, we introduce the concept of “anonymity consciousness” as users’ intention to avoid being identified and reached by strangers when engaging in public space. In this study, we present a crosscultural study to investigate selfdisclosure in Twitter profiles, usage objectives on Twitter, and anonymity consciousness and examine how selfdisclosure is influenced by usage objectives and anonymity consciousness. Specifically, this study targets Twitter users in the United States, India, and Japan. We find: (a) Indian users are more likely to disclose their personal information and have weaker anonymity consciousness than US and Japanese users, (b) users in every country are less likely to disclose their real name if they have stronger anonymity consciousness, and (c) US users tend to disclose their webpage link and Japanese users tend to disclose their affiliation when advertising themselves on Twitter.
Tambuscio, Marcella; Oliveira, Diego F. M.; Ciampaglia, Giovanni Luca; Ruffo, Giancarlo
2 Citations
Misinformation in the form of rumors, hoaxes, and conspiracy theories spreads on social media at alarming rates. One hypothesis is that, since social media are shaped by homophily, belief in misinformation may be more likely to thrive in those social circles that are segregated from the rest of the network. One possible antidote to misinformation is fact checking which, however, does not always stop rumors from spreading further, owing to selective exposure and our limited attention. What are the conditions under which factual verification is effective at containing the spread of misinformation? Here we take into account the combination of selective exposure due to network segregation, forgetting (i.e., finite memory), and fact-checking. We consider a compartmental model of two interacting epidemic processes over a network that is segregated between gullible and skeptic users. Extensive simulation and mean-field analysis show that a more segregated network facilitates the spread of a hoax only at low forgetting rates, but has no effect when agents forget at faster rates. This finding may inform the development of mitigation techniques and raise awareness of the risks of uncontrolled misinformation online.
Mantzaris, Alexander V.; Rein, Samuel R.; Hopkins, Alexander D.
1 Citations
The Eurovision Song Contest (ESC) has been a growing source of entertainment for millions of viewers. Countries are represented by a single song during a live performance and in an award ceremony scores are exchanged according to their preference. It has been speculated that socioeconomic ties influence the awards. The work presented here aims at investigating a different explanation for the voting patterns which deviate significantly from a uniform distribution. A perspective which is not covered is whether an audience member sees bias as a route towards increasing a country’s score rank. Given that much of the biased voting is apparent to the audience, the question whether these biased connections present themselves as a path to increasing score rank is explored. The results show that countries which attracted more biased preferential edges (preference in-degree) and produced bias towards other countries (preference out-degree) had a significant rank correlation with their total accumulated score. This adds to the theory explaining the biased voting patterns, in that they assist towards the simple goal of an audience member seeking to win by utilizing exchange partnerships with those countries where socioeconomic ties already exist.
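As an illustration of the rank correlation mentioned in this abstract, the sketch below computes Spearman's coefficient between a country's preference in-degree and its total score. The abstract does not say which rank correlation statistic was used, so Spearman's is an assumption here, and the numbers are hypothetical.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


# Hypothetical values for five countries, not data from the study.
in_degree = [5, 3, 8, 1, 4]
total_score = [210, 150, 320, 40, 180]
rho = spearman(in_degree, total_score)
```

In this toy example the two orderings coincide, so the coefficient is at its maximum; real contest data would need a significance test on top of the coefficient.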
Kupilik, Matthew; Witmer, Frank
1 Citations
Large volume, data-driven violent conflict research is now possible using publicly available data sets. This work analyzes the predictive ability of data-derived Gaussian process models compared to a generalized linear model. Societal violence is a highly nonlinear process and the available data sets have high dimensionality that yields observation totals in the hundreds of thousands to millions. These challenges make machine learning modeling difficult without significant dimensionality reduction. We develop a computationally intensive Gaussian process modeling approach that exploits the size and complexity of the violent conflict dataset to identify appropriate basis vectors for the model. We develop our models using gridded monthly violent event counts for sub-Saharan Africa from 1980 to 2012. Our resulting Gaussian process models modestly improve the accuracy and predictive ability of existing generalized linear models. Despite this improvement, the accurate prediction of violence in sub-Saharan Africa at a relatively fine resolution spatial grid of $$1^\circ$$ latitude/longitude remains a challenging problem.
Umemoto, Daigo; Ito, Nobuyasu
1 Citations
We found power-law behavior in the distribution of traffic on road segments in urban traffic simulations using a digitized map of Kobe city in Japan as an example of an actual road network. As a comparison, we performed simulations using an artificial random road network and a Manhattan-type road network. Similar power-law behavior was confirmed in the former, but not the latter. The behavior appeared robustly with or without traffic congestion, which suggests that its origin is not the interaction between vehicles. The power-law exponent was fitted using the least squares method and found to be $$1.1$$ for Kobe city and the random road network, with optimization to avoid traffic congestion. The result did not change with the use of a different origin and destination distribution. From these results, one of the reasons that caused the power-law behavior was considered to be the randomness of the road network connection and edge lengths, whose fluctuations are obvious both in Kobe city and the random road network, unlike the grid network.
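A least-squares fit of a power-law exponent is commonly done as a straight-line fit in log-log space. The sketch below shows that approach under the assumption that the distribution has the form y ∝ x^(−α); the function name and the synthetic data are illustrative, and the abstract does not state the exact fitting procedure the authors used.

```python
import math


def fit_power_law_exponent(xs, ys):
    """Ordinary least-squares slope of log(y) against log(x).

    Returns alpha under the assumed form y ~ C * x**(-alpha).
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(lx, ly))
             / sum((a - mx) ** 2 for a in lx))
    return -slope


# Synthetic, noise-free data generated with exponent 1.1 (not simulation output).
traffic = [1, 2, 4, 8, 16, 32]
frequency = [1000 * t ** -1.1 for t in traffic]
alpha = fit_power_law_exponent(traffic, frequency)
```

On noise-free input the fit recovers the generating exponent; on empirical histograms the result depends on binning and on the range chosen for the fit.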
Felice, Rachele; Garretson, Gregory
This article introduces the Clinton Email Corpus, comprising 33,000 recently released email messages sent to and from Hillary Clinton during her tenure as United States Secretary of State, and presents the results of a first investigation into the effect of status and gender on politenessrelated linguistic choices within the corpus, based on a sample of 500 emails. We describe the composition of the corpus and mention the technical challenges inherent in its creation, and then present the 500email subset, in which all messages are categorized according to sender and recipient gender, position in the workplace hierarchy, and personal closeness to Clinton. The analysis looks at the most frequent bigrams in each of these subsets as a starting point for the identification of linguistic differences. We find that the main differences relate to the content and function of the messages rather than their tone. Individuals lower in the hierarchy but not in Clinton’s inner circle are more often engaged in practical tasks, while members of the inner circle primarily discuss issues and use email to arrange inperson conversations. Clinton herself is generally found to engage neither in extensive politeness nor in overt displays of power. These findings present further evidence of how corpus linguistics can be used to advance our understanding of workplace pragmatics.
Tennant, Neil
It is shown how Tarski’s 1929 axiomatization of mereology secures the reflexivity of the ‘part of’ relation. This is done with a fusion-abstraction principle that is constructively weaker than that of Tarski; and by means of constructive and relevant reasoning throughout. We place a premium on complete formal rigor of proof. Every step of reasoning is an application of a primitive rule; and the natural deductions themselves can be checked effectively for formal correctness.
Gruszczyński, Rafał; Pietruszczak, Andrzej
In the second installment to Gruszczyński and Pietruszczak (Stud Log, 2018. https://doi.org/10.1007/s1122501897868) we carry out an analysis of spaces of points of Grzegorczyk structures. At the outset we introduce notions of a concentric and $$\omega $$-concentric topological space and we recollect some facts proven in the first part which are important for the sequel. Theorem 2.9 is a strengthening of Theorem 5.13, as we obtain a stronger conclusion while weakening the Tychonoff separation axiom to mere regularity. This leads to a stronger version of Theorem 6.10 (in the form of Corollary 2.10). Further, we show that Grzegorczyk points are maximal contracting filters in the sense of De Vries (Compact spaces and compactifications, Van Gorcum and Comp. N.V., 1962), but the converse inclusion is not necessarily true. We also compare the notions of a Grzegorczyk point and an ultrafilter, and establish several properties of topological spaces based on Grzegorczyk structures. The main results of the paper are representation and completion theorems for G-structures. We prove both set-theoretical and topological representation theorems for various classes of G-structures. We also present a topological object duality theorem for the class of complete G-structures and the class of concentric spaces, both restricted to structures which satisfy the countable chain condition. We conclude the paper by proving the equivalence of the original Grzegorczyk axiom with the one accepted by us as axiom (G).
Arndt, Michael
1 Citations
Utilizing an idea that has its first appearance in Gerhard Gentzen’s unpublished manuscripts, we generate an exhaustive repertoire of all the possible inference rules that are related to the left implication inference rule of the sequent calculus from a ground sequent, that is, a logical axiom. We discuss the similarities and differences of these derived rules as well as their interaction with the implication right rule under cut and the structural axiom. We further consider the question of analyticity of cuts in calculi using one of the new rules instead of the standard left implication rule.
Shtakser, Gennady
In the previous paper with a similar title (see Shtakser in Stud Log 106(2):311–344, 2018), we presented a family of propositional epistemic logics whose languages are extended by two ingredients: (a) by quantification over modal (epistemic) operators or over agents of knowledge and (b) by predicate symbols that take modal (epistemic) operators (or agents) as arguments. We denoted this family by $${\mathcal {P}\mathcal {E}\mathcal {L}}_{(QK)}$$. The family $${\mathcal {P}\mathcal {E}\mathcal {L}}_{(QK)}$$ is defined on the basis of a decidable higher-order generalization of the loosely guarded fragment (HOLGF) of first-order logic. And since HOLGF is decidable, we obtain the decidability of logics of $${\mathcal {P}\mathcal {E}\mathcal {L}}_{(QK)}$$. In this paper we construct an alternative family of decidable propositional epistemic logics whose languages include ingredients (a) and (b). Denote this family by $${\mathcal {P}\mathcal {E}\mathcal {L}}^{alt}_{(QK)}$$. Now we will use another decidable fragment of first-order logic: the two-variable fragment of first-order logic with two equivalence relations (FO$$^2$$+2E) [the decidability of FO$$^2$$+2E was proved in Kieroński and Otto (J Symb Log 77(3):729–765, 2012)]. The families $${\mathcal {P}\mathcal {E}\mathcal {L}}^{alt}_{(QK)}$$ and $${\mathcal {P}\mathcal {E}\mathcal {L}}_{(QK)}$$ differ in expressive power. In particular, we exhibit classes of epistemic sentences considered in works on first-order modal logic demonstrating this difference.
Benthem, Johan; Bezhanishvili, Nick; Enqvist, Sebastian
We propose a new perspective on logics of computation by combining instantial neighborhood logic $$\mathsf {INL}$$ with bisimulation safe operations adapted from $$\mathsf {PDL}$$. $$\mathsf {INL}$$ is a recent modal logic, based on an extended neighborhood semantics which permits quantification over individual neighborhoods plus their contents. This system has a natural interpretation as a logic of computation in open systems. Motivated by this interpretation, we show that a number of familiar program constructors can be adapted to instantial neighborhood semantics to preserve invariance for instantial neighborhood bisimulations, the appropriate bisimulation concept for $$\mathsf {INL}$$. We also prove that our extended logic $$\mathsf {IPDL}$$ is a conservative extension of dual-free game logic, and its semantics generalizes the monotone neighborhood semantics of game logic. Finally, we provide a sound and complete system of axioms for $$\mathsf {IPDL}$$, and establish its finite model property and decidability.
Losada, David E.; Gamallo, Pablo
While considerable attention has been given to the analysis of texts written by depressed individuals, few studies were interested in evaluating and improving lexical resources for supporting the detection of signs of depression in text. In this paper, we present a search-based methodology to evaluate existing depression lexica. To meet this aim, we exploit existing resources for depression and language use and we analyze which elements of the lexicon are the most effective at revealing depression symptoms. Furthermore, we propose innovative expansion strategies able to further enhance the quality of the lexica.
Genin, Konstantin; Kelly, Kevin T.
(I) Synchronic norms of theory choice, a traditional concern in scientific methodology, restrict the theories one can choose in light of given information. (II) Diachronic norms of theory change, as studied in belief revision, restrict how one should change one’s current beliefs in light of new information. (III) Learning norms concern how best to arrive at true beliefs. In this paper, we undertake to forge some rigorous logical relations between the three topics. Concerning (III), we explicate inductive truth conduciveness in terms of optimally direct convergence to the truth, where optimal directness is explicated in terms of reversals and cycles of opinion prior to convergence. Concerning (I), we explicate Ockham’s razor and related principles of choice in terms of the information topology of the empirical problem context and show that the principles are necessary for reversal or cycle optimal convergence to the truth. Concerning (II), we weaken the standard principles of agm belief revision theory in intuitive ways that are also necessary (and in some cases, sufficient) for reversal or cycle optimal convergence. Then we show that some of our weakened principles of change entail corresponding principles of choice, completing the triangle of relations between (I), (II), and (III).
Ardeshir, Mohammad; Ruitenburg, Wim
A latarre is a lattice with an arrow. Its axiomatization looks natural. Latarres have a nontrivial theory which permits many constructions of latarres. Latarres appear as an end result of a series of generalizations of better known structures. These include Boolean algebras and Heyting algebras. Latarres need not have a distributive lattice.
Castiglioni, José Luis; San Martín, Hernán Javier
An l-hemi-implicative semilattice is an algebra $\mathbf{A} = (A,\wedge,\rightarrow,1)$ such that $(A,\wedge,1)$ is a semilattice with a greatest element 1 and satisfies: (1) for every $a,b,c\in A$, $a\le b\rightarrow c$ implies $a\wedge b \le c$, and (2) $a\rightarrow a = 1$. An l-hemi-implicative semilattice is commutative if it satisfies $a\rightarrow b = b\rightarrow a$ for every $a,b\in A$. It is shown that the class of l-hemi-implicative semilattices is a variety. These algebras provide a general framework for the study of different algebras of interest in algebraic logic. In any l-hemi-implicative semilattice it is possible to define a derived operation by $a \sim b := (a \rightarrow b) \wedge (b \rightarrow a)$. Endowing $(A,\wedge,1)$ with the binary operation $\sim$, the algebra $(A,\wedge,\sim,1)$ is an l-hemi-implicative semilattice, which also satisfies the identity $a \sim b = b \sim a$. In this article, we characterize the (derived) commutative l-hemi-implicative semilattices. We also provide many new examples of l-hemi-implicative semilattices on any semilattice with greatest element (possibly with bottom). Finally, we characterize congruences on the classes of l-hemi-implicative semilattices introduced earlier and we characterize the principal congruences of l-hemi-implicative semilattices.
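The two defining conditions and the derived operation can be checked by brute force on a small example. The sketch below is only an illustration and is not taken from the paper: it assumes a three-element chain with meet given by `min`, and uses the Heyting (Gödel) implication as a sample arrow. The names `arrow`, `sim`, `is_hemi_implicative` and `is_commutative` are hypothetical.

```python
from itertools import product

# Hypothetical example: the chain 0 < 1 < 2 with meet = min and top element 2.
A = range(3)
TOP = 2

def meet(a, b):
    return min(a, b)

def leq(a, b):
    # Semilattice order: a <= b iff a ∧ b = a.
    return meet(a, b) == a

def arrow(a, b):
    # Heyting (Gödel) implication on a chain, chosen here as a sample arrow.
    return TOP if a <= b else b

def sim(a, b):
    # Derived operation from the abstract: a ~ b := (a -> b) ∧ (b -> a).
    return meet(arrow(a, b), arrow(b, a))

def is_hemi_implicative(imp):
    # (1) a <= b -> c implies a ∧ b <= c;  (2) a -> a = 1.
    cond1 = all(leq(meet(a, b), c)
                for a, b, c in product(A, repeat=3)
                if leq(a, imp(b, c)))
    cond2 = all(imp(a, a) == TOP for a in A)
    return cond1 and cond2

def is_commutative(imp):
    return all(imp(a, b) == imp(b, a) for a, b in product(A, repeat=2))
```

On this chain both `arrow` and `sim` pass `is_hemi_implicative`, while only `sim` passes `is_commutative`, in line with the statement that the derived algebra satisfies the commutativity identity.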
He, Peng; Wang, Xueping
A geometric lattice is the lattice of closed subsets of a closure operator on a set which is zero-closure, algebraic, atomistic and which has the so-called exchange property. There are many profound results about this type of lattice, the most recent one of which, due to Czédli and Schmidt (Adv Math 225:2455–2463, 2010), says that a lattice L of finite length is semimodular if and only if L has a cover-preserving embedding into a geometric lattice G of the same length. The goal of our paper is to offer the following result: a lattice of finite length is semimodular if and only if every cell in L is a 4-element Boolean lattice and the 7-element non-distributive atomistic lattice having 3 atoms is not a cover-preserving sublattice of L.
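As a small illustration of the property being characterized, semimodularity of a finite lattice can be tested directly from its order relation. The sketch below is not the paper's method. It assumes the usual covering form of the definition (whenever a ∧ b is covered by a, then b is covered by a ∨ b), and the helper names `make_lattice` and `is_semimodular`, as well as the two sample lattices, are hypothetical.

```python
from itertools import product

def make_lattice(elements, leq):
    """Build join, meet and covering tests from the order relation of a finite lattice."""
    def join(a, b):
        uppers = [x for x in elements if leq(a, x) and leq(b, x)]
        # The join is the upper bound lying below every other upper bound.
        return next(u for u in uppers if all(leq(u, v) for v in uppers))

    def meet(a, b):
        lowers = [x for x in elements if leq(x, a) and leq(x, b)]
        # The meet is the lower bound lying above every other lower bound.
        return next(l for l in lowers if all(leq(m, l) for m in lowers))

    def covers(a, b):
        # True when b covers a: a < b with nothing strictly in between.
        return (a != b and leq(a, b)
                and not any(x not in (a, b) and leq(a, x) and leq(x, b)
                            for x in elements))

    return join, meet, covers

def is_semimodular(elements, leq):
    # Covering condition: a ∧ b covered by a implies b covered by a ∨ b.
    join, meet, covers = make_lattice(elements, leq)
    return all(covers(b, join(a, b))
               for a, b in product(elements, repeat=2)
               if covers(meet(a, b), a))

# Sample 1: divisors of 12 ordered by divisibility (a distributive lattice).
divisors = [1, 2, 3, 4, 6, 12]
def divides(a, b):
    return b % a == 0

# Sample 2: the pentagon N5 with 0 < a < b < 1 and c incomparable to a and b.
n5 = ["0", "a", "b", "c", "1"]
n5_order = ({(x, x) for x in n5} | {("0", x) for x in n5}
            | {(x, "1") for x in n5} | {("a", "b")})
def n5_leq(x, y):
    return (x, y) in n5_order
```

Here `is_semimodular(divisors, divides)` holds, while `is_semimodular(n5, n5_leq)` fails: in N5 the bottom is covered by c, yet a is not covered by a ∨ c = 1.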
Zoghifard, R.; Pourmahdian, M.
We generalize two well-known model-theoretic characterization theorems from propositional modal logic to first-order modal logic (FML, for short). We first study FML-definable frames and give a version of the Goldblatt–Thomason theorem for this logic. The advantage of this result, compared with the original Goldblatt–Thomason theorem, is that it does not need the condition of ultrafilter reflection and uses only closure under bounded morphic images, generated subframes and disjoint unions. We then investigate Lindström-type theorems for first-order modal logic. We show that FML has the maximal expressive power among the logics extending FML which satisfy compactness, bisimulation invariance and the Tarski union property.
Aglianò, P.; Montagna, F.
In this paper we introduce a poset of subvarieties of BL-algebras, whose completion is the entire lattice of subvarieties; we also exhibit a description of this poset in terms of finite sequences of functions on the natural numbers.
Busaniche, Manuela; Gomez, Conrado
Different constructions of BL-chains are compared. We establish when the ordinal sum and the poset product of the same family of BL-chains coincide. We also compare the poset product of MV-chains and product chains with saturated BL-chains.
Gispert, Joan
In this paper we study finitary extensions of the nilpotent minimum logic (NML) or, equivalently, quasivarieties of NM-algebras. We first study structural completeness of NML: we prove that NML is hereditarily almost structurally complete and, moreover, that NM$^{-}$, the axiomatic extension of NML given by the axiom $\lnot(\lnot\varphi^{2})^{2}\leftrightarrow(\lnot(\lnot\varphi)^{2})^{2}$, is hereditarily structurally complete. We use those results to obtain the full description of the lattice of all quasivarieties of NM-algebras, which allows us to characterize and axiomatize all finitary extensions of NML.
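To get a feel for the axiom quoted above, it can be evaluated on the standard NM-algebra on [0, 1], where conjunction is the nilpotent minimum t-norm and negation is 1 − x. The sketch below is an illustration and not part of the paper. It assumes the reading ¬((¬(φ²))²) ↔ (¬((¬φ)²))² with φ² meaning φ conjoined with itself, and the function names are hypothetical. Exact rationals are used so that the boundary case x + y = 1 is handled without rounding.

```python
from fractions import Fraction

def neg(x):
    # Standard involutive negation on [0, 1].
    return 1 - x

def conj(x, y):
    # Nilpotent minimum t-norm: min(x, y) when x + y > 1, otherwise 0.
    return min(x, y) if x + y > 1 else Fraction(0)

def imp(x, y):
    # Residuum of the nilpotent minimum t-norm.
    return Fraction(1) if x <= y else max(neg(x), y)

def iff(x, y):
    return min(imp(x, y), imp(y, x))

def sq(x):
    return conj(x, x)

def axiom(v):
    # Assumed reading: ¬((¬(φ²))²) ↔ (¬((¬φ)²))².
    lhs = neg(sq(neg(sq(v))))
    rhs = sq(neg(sq(neg(v))))
    return iff(lhs, rhs)

# Evaluate the axiom on the sample points 0, 1/10, ..., 1.
samples = [Fraction(k, 10) for k in range(11)]
failures = [v for v in samples if axiom(v) != 1]
```

On these sample points the only failure is at 1/2, the fixpoint of the negation, which suggests that the axiom excludes exactly that point in the standard algebra.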
Tsai, Hsingchien
Mereology is the theory of the relation “being a part of”. The first exact formulation of mereology is due to the Polish logician Stanisław Leśniewski. But Leśniewski’s mereology is not first-order axiomatizable, for it requires every subset of the domain to have a fusion. In recent literature, a first-order theory named General Extensional Mereology (GEM) can be thought of as a first-order approximation of Leśniewski’s theory, in the sense that GEM guarantees that every definable subset of the domain has a fusion, and this has been achieved by positing an axiom schema which in effect defines infinitely many axioms. Intuitively, in order to range over every definable subset, such an axiom schema seems unavoidable. But this paper will show that GEM is finitely axiomatizable and that all we need are just finitely many instances of the said axiom schema.
Citkin, Alex
A propositional logic is understood as a set of theorems defined by a deductive system: a set of axioms and a set of rules. A superintuitionistic logic is a logic extending intuitionistic propositional logic $\mathsf{Int}$. A rule is admissible for a logic if any substitution that makes each premise a theorem also makes the conclusion a theorem. A deductive system $\mathsf{S}$ is structurally complete if any rule admissible for the logic defined by $\mathsf{S}$ is derivable in $\mathsf{S}$. It is known that any logic can be defined by a structurally complete deductive system, its structural completion. The main goal of the paper is to study the following problem: given a superintuitionistic logic L, is the structural completion of L hereditarily structurally complete? It is shown that, on the one hand, there are continuum many such logics, including $\mathsf{Int}$ and many of its standard extensions. On the other hand, there are continuum many superintuitionistic logics whose structural completion is not hereditarily structurally complete (the Medvedev and Kreisel–Putnam logics are notable examples). It is observed that the class of hereditarily structurally complete superintuitionistic consequence relations does not have a smallest element and contains continuum many members lacking the finite model property. The following statement is instrumental in obtaining negative results: if a Lindenbaum algebra of formulas on one variable is finite and has more than 15 elements, then a structural completion of such a logic is not hereditarily structurally complete.
