By
Coello Coello, Carlos A.
This chapter provides a short overview of multiobjective optimization using metaheuristics. The chapter includes a description of some of the main metaheuristics that have been used for multiobjective optimization. Although special emphasis is placed on evolutionary algorithms, other metaheuristics, such as particle swarm optimization, artificial immune systems, and ant colony optimization, are also briefly discussed. Other topics such as applications and recent algorithmic trends are also included. Finally, some of the main research trends that are worth exploring in this area are briefly discussed.
By
Bagnoli, Franco; El Yacoubi, Samira; Rechtman, Raúl
4 Citations
An important question to be addressed regarding system control on a time interval [0, T] is whether some particular target state in the configuration space is reachable from a given initial state. When the target of interest refers only to a portion of the spatial domain, we speak about regional analysis. Cellular automata approaches have recently been promoted for the study of control problems on spatially extended systems for which the classical approaches cannot be used. An interesting problem concerns the situation where the subregion of interest is not interior to the domain but a portion of its boundary. In this paper we address the problem of regional controllability of cellular automata via boundary actions, i.e., we investigate the characteristics of a cellular automaton so that it can be controlled inside a given region by acting only on the values of sites at its boundaries.
By
Galván-López, Edgar; Vázquez-Mendoza, Lucia; Schoenauer, Marc; Trujillo, Leonardo
In Genetic Programming (GP), the fitness of individuals is normally computed by using a set of fitness cases (FCs). Research on the use of FCs in GP has primarily focused on how to reduce the size of these sets. However, often, only a small set of FCs is available and there is no need to reduce it. In this work, we are interested in using the whole FCs set, but rather than adopting the commonly used GP approach of presenting the entire set of FCs to the system from the beginning of the search, referred to as static FCs, we allow the GP system to build it by aggregation over time, named dynamic FCs, in the hope of making the search more amenable. Moreover, there is no study on the use of FCs in Dynamic Optimisation Problems (DOPs). To this end, we also use the Kendall Tau Distance (KTD) approach, which quantifies pairwise dissimilarities between two lists of fitness values. KTD aims to capture the degree of a change in DOPs and we use this to promote structural diversity. Results on eight symbolic regression functions indicate that both approaches are highly beneficial in GP.
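As a rough illustration of the Kendall Tau Distance mentioned in this abstract, the following Python sketch counts the pairs of positions that two equal-length lists of fitness values order differently. The function names and the normalization by the number of pairs are assumptions made for illustration; the abstract does not say how the authors compute or scale the measure.

```python
from itertools import combinations


def kendall_tau_distance(a, b):
    """Count index pairs (i, j) that lists a and b rank in opposite order."""
    if len(a) != len(b):
        raise ValueError("both lists must have the same length")
    discordant = 0
    for i, j in combinations(range(len(a)), 2):
        # The product is negative only when a and b disagree on the order
        # of items i and j (ties contribute nothing).
        if (a[i] - a[j]) * (b[i] - b[j]) < 0:
            discordant += 1
    return discordant


def normalized_ktd(a, b):
    """Scale the distance to [0, 1] (hypothetical helper, not from the paper)."""
    n = len(a)
    pairs = n * (n - 1) // 2
    return kendall_tau_distance(a, b) / pairs if pairs else 0.0
```

A value of 0 means both lists rank every pair identically, while a normalized value of 1 means one list is the exact reverse ordering of the other.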
By
Adj, Gora; Canales-Martínez, Isaac; Rivera-Zamarripa, Luis; Rodríguez-Henríquez, Francisco
The problem of determining whether a polynomial defined over a finite field ring is smooth or not with respect to a given degree is the most intensive arithmetic operation of the so-called descent phase of index-calculus algorithms. In this paper, we present an analysis and efficient implementation of Coppersmith’s smoothness test for polynomials defined over finite fields with characteristic three. As a case study, we review the best strategies for obtaining a fast field and polynomial arithmetic for polynomials defined over the ring $$F_q[X]$$, with $$q=3^6$$, and report the timings achieved by our library when computing the smoothness test applied to polynomials of several degrees defined in that ring. This software library was recently used in Adj et al. (Cryptology 2016. http://eprint.iacr.org/2016/914), as a building block for achieving a record computation of discrete logarithms over the 4841-bit field $${{\mathbb {F}}}_{3^{6\cdot 509}}$$.
By
Feng, Xiang; Wan, Wanggen; Xu, Richard Yi Da; Chen, Haoyu; Li, Pengfei; Sánchez, J. Alfredo
In computer graphics, various processing operations are applied to 3D triangle meshes and these processes often involve distortions, which affect the visual quality of surface geometry. In this context, perceptual quality assessment of 3D triangle meshes has become a crucial issue. In this paper, we propose a new objective quality metric for assessing the visual difference between a reference mesh and a corresponding distorted mesh. Our analysis indicates that the overall quality of a distorted mesh is sensitive to the distortion distribution. The proposed metric is based on a spatial pooling strategy and statistical descriptors of the distortion distribution. We generate a perceptual distortion map for vertices in the reference mesh while taking into account the visual masking effect of the human visual system. The proposed metric extracts statistical descriptors from the distortion map as the feature vector to represent the overall mesh quality. With the feature vector as input, we adopt a support vector regression model to predict the mesh quality score. We validate the performance of our method with three publicly available databases, and the comparison with state-of-the-art metrics demonstrates the superiority of our method. Experimental results show that our proposed method achieves a high correlation between objective assessment and subjective scores.
By
Yáñez-Márquez, Cornelio; López-Yáñez, Itzamá; Aldape-Pérez, Mario; Camacho-Nieto, Oscar; Argüelles-Cruz, Amadeo José; Villuendas-Rey, Yenny
The current paper contains the theoretical foundation for the off-the-mainstream model known as Alpha-Beta associative memories ($$\alpha \beta $$ model). This is an unconventional computation model designed to operate as an associative memory, whose main application is the solution of pattern recognition tasks, particularly for pattern recall and pattern classification. Although this model was devised, proposed and created in 2002, it is worth noting that its theoretical support remains unpublished to this day. This is despite the fact that more than a hundred scientific articles have been published with applications, improvements, and new models derived from the $$\alpha \beta $$ model. The present paper includes all the required definitions, and the rigorous mathematical demonstrations of the lemmas and theorems, explaining the operation of the $$\alpha \beta $$ model, as well as the original models it has inspired or that have been derived from it. Also, brief descriptions of 60 selected articles related to the $$\alpha \beta $$ model are presented. These latter works illustrate the competitiveness (and sometimes superiority) of several extensions and models derived from the original $$\alpha \beta $$ model, when compared against some models and paradigms present in the mainstream current scientific literature.
By
Delporte-Gallet, Carole; Fauconnier, Hugues; Rajsbaum, Sergio; Yanagisawa, Nayuta
One of the central questions in distributed computability is characterizing the tasks that are solvable in a given system model. In the anonymous case, where processes have no identifiers and communicate through multi-writer/multi-reader registers, there is a recent topological characterization (Yanagisawa 2017) of the colorless tasks that are solvable when any number of asynchronous processes may crash. In this paper, we consider the case where at most t asynchronous processes may crash, where $$1\le t<n$$. We prove that a colorless task is t-resilient solvable anonymously if and only if it is t-resilient solvable non-anonymously. We obtain our results through various reductions and simulations that explore how to extend techniques for non-anonymous computation to anonymous computation.
By
Amaya, Ivan; Ortiz-Bayliss, José Carlos; Conant-Pablos, Santiago Enrique; Terashima-Marín, Hugo; Coello Coello, Carlos A.
Solvers for different combinatorial optimization problems have evolved throughout the years. These can range from simple strategies such as basic heuristics, to advanced models such as metaheuristics and hyper-heuristics. Even so, the set of benchmark instances has remained almost unaltered. Thus, any analysis of solvers has been limited to assessing their performance under those scenarios. Even if this has been fruitful, we deem it necessary to provide a tool that allows for a better study of each available solver. Because of that, in this paper we present a tool for assessing the strengths and weaknesses of different solvers, by tailoring a set of instances for each of them. We propose an evolutionary-based model and test our idea on four different basic heuristics for the 1D bin packing problem. This, however, does not limit the scope of our proposal, since it can be used in other domains and for other solvers with few changes. By pursuing an in-depth study of such tailored instances, more relevant knowledge about each solver can be derived.
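To make the idea of an instance that favours one solver concrete, here is a minimal Python sketch of two basic heuristics for 1D bin packing. The abstract does not name its four heuristics, so the choice of first fit and best fit, the function names, and the sample instance are all assumptions for illustration.

```python
def first_fit(items, capacity):
    """Place each item in the first open bin with enough free space."""
    free_space = []  # remaining capacity of each open bin
    for size in items:
        for k, free in enumerate(free_space):
            if size <= free:
                free_space[k] = free - size
                break
        else:
            # No open bin can hold the item, so open a new one.
            free_space.append(capacity - size)
    return len(free_space)


def best_fit(items, capacity):
    """Place each item in the open bin that leaves the least free space."""
    free_space = []
    for size in items:
        candidates = [k for k, free in enumerate(free_space) if size <= free]
        if candidates:
            tightest = min(candidates, key=lambda k: free_space[k])
            free_space[tightest] -= size
        else:
            free_space.append(capacity - size)
    return len(free_space)
```

On the hand-made instance `[5, 6, 4, 5]` with capacity 10, first fit opens three bins while best fit needs only two, which is the kind of solver-specific behaviour a tailored instance is meant to expose.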
By
Merelo Guervós, Juan J.; García-Valdez, J. Mario
Cloud-native applications add a layer of abstraction to the underlying distributed computing system, defining a high-level, self-scaling and self-managed architecture of different microservices linked by a messaging bus. Creating new algorithms that tap these architectural patterns and at the same time employ distributed resources efficiently is a challenge we will be taking up in this paper. We introduce KafkEO, a cloud-native evolutionary algorithms framework that is prepared to work with different implementations of evolutionary algorithms and other population-based metaheuristics by using micro-populations and stateless services as the main building blocks; KafkEO is an attempt to map the traditional evolutionary algorithm to this new cloud-native format. As far as we know, this is the first architecture of this kind that has been published and tested, and it is free software and vendor-independent, based on OpenWhisk and Kafka. This paper presents a proof of concept, examines its cost, and tests the impact on the algorithm of the design around a cloud-native and asynchronous system by comparing it on the well-known BBOB benchmarks with other pool-based architectures, with which it has a remarkable functional resemblance. KafkEO results are quite competitive with similar architectures.
By
Nebro, Antonio J.; Durillo, Juan J.; García-Nieto, José; Barba-González, Cristóbal; Ser, Javier; Coello Coello, Carlos A.; Benítez-Hidalgo, Antonio; Aldana-Montes, José F.
The Speed-constrained Multiobjective PSO (SMPSO) is an approach featuring an external bounded archive to store nondominated solutions found during the search and out of which leaders that guide the particles are chosen. Here, we introduce SMPSO/RP, an extension of SMPSO based on the idea of reference point archives. These are external archives with an associated reference point so that only solutions that are dominated by the reference point or that dominate it are considered for their possible addition. SMPSO/RP can manage several reference point archives, so it can effectively be used to focus the search on one or more regions of interest. Furthermore, the algorithm allows interactively changing the reference points during its execution. Additionally, the particles of the swarm can be evaluated in parallel. We compare SMPSO/RP with three other reference-point-based algorithms. Our results indicate that our proposed approach outperforms the techniques it was compared against when solving a variety of problems by selecting both achievable and unachievable reference points. A real-world application related to civil engineering is also included to show the practical applicability of SMPSO/RP.
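The admission rule for a reference point archive described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: objectives are minimized, solutions are tuples of objective values, the names `dominates`, `admissible` and `try_add` are hypothetical, and the bounded size and density-based pruning of the real archive are left out.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization assumed)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def admissible(solution, reference_point):
    """A solution is considered only if it dominates the reference point
    or is dominated by it."""
    return dominates(solution, reference_point) or dominates(reference_point, solution)


def try_add(archive, solution, reference_point):
    """Insert solution into the archive (a list) if admissible and nondominated."""
    if not admissible(solution, reference_point):
        return False
    if any(dominates(member, solution) or member == solution for member in archive):
        return False
    # Drop archive members that the new solution dominates, then store it.
    archive[:] = [member for member in archive if not dominates(solution, member)]
    archive.append(solution)
    return True
```

Solutions that are incomparable with the reference point are rejected outright, which is what concentrates the archive on the region of interest around that point.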
By
Fernández-Zepeda, José Alberto; Brubeck-Salcedo, Daniel; Fajardo-Delgado, Daniel; Zatarain-Aceves, Héctor
We address the bodyguard allocation problem (BAP), an optimization problem that illustrates the conflict of interest between two classes of processes with contradictory preferences within a distributed system. While a class of processes prefers to minimize its distance to a particular process called the root, the other class prefers to maximize it; at the same time, all the processes seek to build a communication spanning tree with the maximum social welfare. The two state-of-the-art algorithms for this problem always guarantee the generation of a spanning tree that satisfies a condition of Nash equilibrium in the system; however, such a tree does not necessarily produce the maximum social welfare. In this paper, we propose a two-player coalition cooperative scheme for BAP, which allows some processes to perturb or break a Nash equilibrium to find another one with a better social welfare. By using this cooperative scheme, we propose a new algorithm called FFCBAP_{S} for BAP. We present both theoretical and empirical analyses which show that this algorithm produces better quality approximate solutions than former algorithms for BAP.
By
Oliveira, Thomaz; López, Julio; Hışıl, Hüseyin; Faz-Hernández, Armando; Rodríguez-Henríquez, Francisco
In the RFC 7748 memorandum, the Internet Research Task Force specified a Montgomery-ladder scalar multiplication function based on two recently adopted elliptic curves, “curve25519” and “curve448”. The purpose of this function is to support the Diffie-Hellman key exchange algorithm that will be included in the forthcoming version of the Transport Layer Security cryptographic protocol. In this paper, we describe a ladder variant that accelerates the fixed-point multiplication function inherent to the Diffie-Hellman key pair generation phase. Our proposal combines a right-to-left version of the Montgomery ladder along with the precomputation of constant values directly derived from the base-point and its multiples. To our knowledge, this is the first proposal of a Montgomery ladder procedure for prime elliptic curves that admits the extensive use of precomputation. In exchange for very modest memory resources and a small extra programming effort, the proposed ladder obtains significant speedups for software implementations. Moreover, our proposal fully complies with the RFC 7748 specification. A software implementation of the X25519 and X448 functions using our precomputable ladder yields an acceleration factor of roughly 1.20 and 1.25 when implemented on the Haswell and the Skylake microarchitectures, respectively.
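For orientation, the following Python sketch follows the standard left-to-right Montgomery ladder for X25519 as laid out in RFC 7748. It is not the right-to-left, precomputation-based variant proposed in the paper, and it is not constant-time (the conditional swap uses a plain branch), so it should be read only as a reference for what the ladder computes. The function names are chosen here for illustration.

```python
P = 2**255 - 19   # field prime of curve25519
A24 = 121665      # (486662 - 2) / 4, the ladder constant from RFC 7748


def clamp(scalar_bytes):
    """Decode a 32-byte scalar with the bit clamping required by RFC 7748."""
    k = bytearray(scalar_bytes)
    k[0] &= 248
    k[31] &= 127
    k[31] |= 64
    return int.from_bytes(k, "little")


def x25519(scalar_bytes, u_bytes):
    """Scalar multiplication on the u-coordinate (left-to-right ladder)."""
    k = clamp(scalar_bytes)
    x1 = int.from_bytes(u_bytes, "little") & ((1 << 255) - 1)  # mask top bit
    x2, z2, x3, z3, swap = 1, 0, x1, 1, 0
    for t in reversed(range(255)):
        bit = (k >> t) & 1
        swap ^= bit
        if swap:  # NOT constant-time; real code uses a branch-free cswap
            x2, x3 = x3, x2
            z2, z3 = z3, z2
        swap = bit
        a = (x2 + z2) % P
        aa = a * a % P
        b = (x2 - z2) % P
        bb = b * b % P
        e = (aa - bb) % P
        c = (x3 + z3) % P
        d = (x3 - z3) % P
        da = d * a % P
        cb = c * b % P
        x3 = (da + cb) ** 2 % P
        z3 = x1 * (da - cb) ** 2 % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, x3 = x3, x2
        z2, z3 = z3, z2
    # Convert from projective (x2 : z2) to affine via a Fermat inversion.
    return (x2 * pow(z2, P - 2, P) % P).to_bytes(32, "little")
```

Key pair generation calls this function with the fixed base point u = 9, which is the fixed-point multiplication the paper's precomputation targets; the shared secret uses the peer's public value instead.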
By
Chakraborty, Debrup; López, Cuauhtemoc Mancillas; Sarkar, Palash
1 Citations
Over the last decade and a half there has been a lot of activity toward the development of cryptographic techniques for disk encryption. It has been almost canonized that an encryption scheme suitable for the application of disk encryption must be length preserving, i.e., it rules out the use of schemes such as authenticated encryption where an authentication tag is also produced as a part of the ciphertext, resulting in ciphertexts being longer than the corresponding plaintexts. The notion of a tweakable enciphering scheme (TES) has been formalized as the appropriate primitive for disk encryption, and it has been argued that they provide the maximum security possible for a tagless scheme. On the other hand, TESs are less efficient than some existing authenticated encryption schemes. Also, TESs cannot provide true authentication as they do not have authentication tags. In this paper, we analyze the possibility of the use of encryption schemes where length expansion is produced for the purpose of disk encryption. On the negative side, we argue that nonce-based authenticated encryption schemes are not appropriate for this application. On the positive side, we demonstrate that deterministic authenticated encryption (DAE) schemes may have more advantages than disadvantages compared to a TES when used for disk encryption. Finally, we propose a new deterministic authenticated encryption scheme called BCTR which is suitable for this purpose. We provide the full specification of BCTR, prove its security and also report an efficient implementation in reconfigurable hardware. Our experiments suggest that BCTR performs significantly better than existing TESs and existing DAE schemes.
By
Manoatl Lopez, Edgar; Coello Coello, Carlos A.
In recent years, decomposition-based multiobjective evolutionary algorithms (MOEAs) have gained increasing popularity. However, these MOEAs depend on the consistency between the Pareto front shape and the distribution of the reference weight vectors. In this paper, we propose a decomposition-based MOEA, which uses the modified Euclidean distance ($$d^+$$) as a scalar aggregation function. The proposed approach adopts a novel method for approximating the reference set, based on a hypercube-based method, in order to adapt the reference set for leading the evolutionary process. Our preliminary results indicate that our proposed approach is able to obtain solutions of a similar quality to those obtained by state-of-the-art MOEAs such as MOMBI-II, NSGA-III, RVEA and MOEA/DD in several MOPs, and is able to outperform them in problems with complicated Pareto fronts.
By
Falcón-Cardona, Jesús Guillermo; Coello Coello, Carlos A.
Recently, it has been shown that the current Many-Objective Evolutionary Algorithms (MaOEAs) are overspecialized in solving certain benchmark problems. This overspecialization is due to a high correlation between the Pareto fronts of the test problems and the convex weight vectors commonly used by MaOEAs. The main consequence of such overspecialization is the inability of these MaOEAs to solve the minus versions of well-known benchmarks (e.g., the DTLZ$$^{-1}$$ test suite). To avoid this issue, we propose a novel steady-state MaOEA that does not require weight vectors and uses a density estimator based on the IGD$$^+$$ indicator. Moreover, a fast method to calculate the IGD$$^+$$ contributions is integrated in order to reduce the computational cost of the proposed approach, which is called IGD$$^+$$-MaOEA. Our proposed approach is compared with NSGA-III, MOEA/D, IGD$$^+$$-EMOA (the previous ones employ convex weight vectors) and SMS-EMOA on the test suites DTLZ and DTLZ$$^{-1}$$, using the hypervolume indicator. Our experimental results show that IGD$$^+$$-MaOEA is a more general optimizer than MaOEAs that need a set of convex weight vectors and that it is competitive with, and less computationally expensive than, SMS-EMOA.
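For readers unfamiliar with the indicator, a minimal Python sketch of IGD$$^+$$ under its usual definition for minimization problems is given below: the modified distance $$d^+$$ only penalizes objectives in which a solution is worse than the reference point, and IGD$$^+$$ averages the smallest such distance over a reference set. The function names are assumptions, and the fast contribution computation mentioned in the abstract is not reproduced here.

```python
import math


def d_plus(z, a):
    """Modified Euclidean distance from reference point z to solution a.

    Only components where a is worse (larger) than z count, assuming
    all objectives are minimized.
    """
    return math.sqrt(sum(max(ai - zi, 0.0) ** 2 for ai, zi in zip(a, z)))


def igd_plus(approximation, reference_set):
    """Average, over the reference set, of the smallest d+ to the approximation."""
    total = sum(min(d_plus(z, a) for a in approximation) for z in reference_set)
    return total / len(reference_set)
```

Lower values are better; an approximation set that weakly dominates every reference point scores exactly 0.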
By
Bagnoli, Franco; Rechtman, Raúl
1 Citations
We study the regional master-slave synchronization of a one-dimensional probabilistic cellular automaton with two absorbing states. The master acts on the boundary of an interval, the region, of a fixed size. For some values of the parameters, this is enough to achieve synchronization in the region. For other values, we extend the regional synchronization to include a fraction of sites inside the region of interest. We present four different ways of doing this and show which is the most effective one, in terms of the fraction of sites inside the region and the time needed for synchronization.
By
Dashtipour, Kia; Hussain, Amir; Gelbukh, Alexander
In recent years, people all around the world have shared their opinions about different fields with each other over the Internet. Sentiment analysis techniques have been introduced to classify these rich data based on the polarity of the opinion. Sentiment analysis research has been growing rapidly; however, most of the research papers are focused on English. In this paper, we review English-based sentiment analysis approaches and discuss what adaptations these approaches require to become applicable to the Persian language. The results show that approaches initially suggested for the English language are competitive with those developed specifically for Persian sentiment analysis.
By
Jiménez, Samantha; Juárez-Ramírez, Reyes; Castillo, Víctor H.; Ramírez-Noriega, Alan
Affectivity influences face-to-face learning environments and improves some aspects in students, such as motivation. For that reason, it is important to integrate affectivity elements into virtual environments. We propose a conceptual model that suggests which elements of tutor, student and dialogue should be integrated and implemented into learning systems. We design an ontology guided by methontology, and apply a mathematical evaluation (OntoQA) to determine the richness of the proposed model. The mathematical evaluation states that the proposed model has relationship richness and horizontal nature. We developed a software application implementing the conceptual model in order to prove its effectiveness in generating students’ motivation. The findings suggest that the implemented affective learning ontology positively impacts the motivation of students with low academic performance, of female students and of engineering students.
By
Rodríguez-Diez, Vladímir; Martínez-Trinidad, José Fco.; Carrasco-Ochoa, Jesús Ariel; Lazo-Cortés, Manuel S.
Within Testor Theory, typical testors are irreducible subsets of attributes preserving the object discernibility ability of the original set of attributes. Computing all typical testors from a dataset has exponential complexity with respect to its number of attributes; however, there are other properties of a dataset that have some influence on the performance of different algorithms. Previous studies have determined that a significant runtime reduction can be obtained from selecting the appropriate algorithm for a given dataset. In this work, we present an experimental study evaluating the effect of basic matrix dimensionality on the performance of the algorithms for typical testor computation. Our experiments are carried out over synthetic and real-world datasets. Finally, some guidelines obtained from the experiments, for helping to select the best algorithm for a given dataset, are summarised.
By
Rodriguez-Tello, Eduardo; Narvaez-Teran, Valentina; Lardeux, Fréderic
The Cyclic Bandwidth Sum Problem (CBSP) is an NP-Hard Graph Embedding Problem which aims to embed a simple, finite graph (the guest) into a cycle graph of the same order (the host) while minimizing the sum of cyclic distances in the host between the guest’s adjacent nodes. This paper presents preliminary results of our research on the design of a Memetic Algorithm (MA) able to solve the CBSP. A total of 24 MA versions, induced by all possible combinations of four selection schemes, two operators for recombination and three for mutation, were tested over a set of 25 representative graphs. Results compared with the top state-of-the-art algorithm showed that all the tested MA versions were able to consistently improve on its results, and they give us some insights on the suitability of the tested operators.
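The objective described in this abstract is easy to state in code. The Python sketch below evaluates the cyclic bandwidth sum of a given embedding; the representation (an edge list plus a dictionary mapping each guest node to a position on the host cycle) and the function name are assumptions for illustration, and no search procedure is included.

```python
def cyclic_bandwidth_sum(edges, labeling):
    """Sum of cyclic distances on the host cycle between adjacent guest nodes.

    edges    -- iterable of (u, v) pairs of guest nodes
    labeling -- dict mapping each guest node to a distinct position 0..n-1
    """
    n = len(labeling)
    total = 0
    for u, v in edges:
        gap = abs(labeling[u] - labeling[v])
        # On a cycle of n positions the distance is the shorter way around.
        total += min(gap, n - gap)
    return total
```

For a 4-cycle guest, the identity embedding costs 4, whereas swapping the positions of two neighbouring nodes raises the cost to 6; a memetic algorithm searches over such labelings for the minimum.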
By
López, Marco A.; Marcial-Romero, J. Raymundo; Ita, Guillermo; Moyao, Yolanda
Although the satisfiability problem for two Conjunctive Normal Form formulas (2SAT) is polynomial time solvable, it is well known that #2SAT, the counting version of 2SAT, is #P-Complete. However, it has been shown that for certain classes of formulas, #2SAT can be computed in polynomial time. In this paper we show another class of formulas for which #2SAT can also be computed in linear time, the so-called outerplanar formulas, i.e., formulas whose signed primal graph is outerplanar. Our algorithm’s time complexity is given by $$O(n+m)$$ where n is the number of variables and m the number of clauses of the formula.
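To clarify what #2SAT asks for, the Python sketch below counts the satisfying assignments of a 2-CNF formula by exhaustive enumeration. It is a brute-force reference that runs in time exponential in n, not the linear-time algorithm for outerplanar formulas described in the abstract. The clause encoding (signed integers, DIMACS style) and the function name are assumptions for illustration.

```python
from itertools import product


def count_models(num_vars, clauses):
    """Count assignments satisfying every clause.

    Each clause is a tuple of non-zero integers: k means variable k is
    true, -k means variable k is false (variables are numbered from 1).
    """
    count = 0
    for assignment in product([False, True], repeat=num_vars):
        satisfied = all(
            any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in clauses
        )
        if satisfied:
            count += 1
    return count
```

For instance, the formula (x1 ∨ x2) ∧ (x2 ∨ x3) has 5 models out of the 8 possible assignments.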
By
López, Uriel; Trujillo, Leonardo; Legrand, Pierrick
Outliers are one of the most difficult issues when dealing with real-world modeling tasks. Even a small percentage of outliers can impede a learning algorithm’s ability to fit a dataset. While robust regression algorithms exist, they fail when a dataset is corrupted by more than 50% outliers (breakdown point). In the case of Genetic Programming, robust regression has not been properly studied. In this paper we present a method that works as a filter, removing outliers from the target variable (vertical outliers). The algorithm is simple: it uses a randomly generated population of GP trees to determine which target values should be labeled as outliers. The method is highly efficient. Results show that it can return a clean dataset when contamination reaches as high as 90%, and may be able to handle higher levels of contamination. In this study only synthetic univariate benchmarks are used to evaluate the approach, but it must be stressed that no other approaches can deal with such high levels of outlier contamination while requiring such small computational effort.
By
Figueroa, Karina; Paredes, Rodrigo; Reyes, Nora
Proximity searching consists in retrieving the most similar objects to a given query from a database. To do so, the usual approach consists in using an index in order to improve the response time of online queries. Recently, the permutation-based algorithms (PBA) were presented, and from then on, this technique has been very successful. At its core, the PBA uses a metric between permutations, typically Spearman Footrule or Spearman Rho. Until now, several proposals based on the PBA have been developed and all of them use one of those metrics. In this paper, we present a new family of dissimilarity measures between permutations. According to our experimental evaluation, we can reduce the cost of the original technique by up to 30%, while preserving its exceptional answer quality. Since our dissimilarity measures can be applied in any state-of-the-art PBA variant, the impact of our proposal is significant for the similarity search community.
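The two classical permutation metrics named in this abstract can be sketched as follows in Python. Each permutation is assumed to be a list of the same reference objects ordered by distance; the function names are illustrative, and Spearman Rho is written here as the plain sum of squared position differences (some formulations take its square root, which does not change the induced ordering). The new family of measures proposed in the paper is not reproduced.

```python
def spearman_footrule(p, q):
    """Sum of absolute differences between each element's positions in p and q."""
    position_in_q = {element: i for i, element in enumerate(q)}
    return sum(abs(i - position_in_q[element]) for i, element in enumerate(p))


def spearman_rho(p, q):
    """Sum of squared differences between each element's positions in p and q."""
    position_in_q = {element: i for i, element in enumerate(q)}
    return sum((i - position_in_q[element]) ** 2 for i, element in enumerate(p))
```

In a permutation-based index, objects whose permutations are close to the query's permutation under such a measure are examined first.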
By
Guerrero Huerta, Ana Georgina; Hernández Rubio, Erika; Meneses Viveros, Amilcar
Specialists in the area of mental health require tools that allow them to apply their tests to elderly patients more efficiently without losing effectiveness. One of these tests is the Yerkes test. On the one hand, implementing this test requires that it take the form of an augmented reality application. On the other hand, it is known that tablets are suitable devices for developing applications aimed at older adults. This paper presents the design of a prototype of the Yerkes test in augmented reality for tablets, aimed at older adults.
By
Sanchez, Andres Jesus; Romero, Luis Felipe; Tabik, Siham; Medina-Pérez, Miguel Angel; Herrera, Francisco
Fingerprint recognition is one of the most used biometric methods for authentication. The identification of a query fingerprint requires matching its minutiae against every minutia of all the fingerprints of the database. The state-of-the-art matching algorithms are costly, from a computational point of view, and inefficient on large datasets. In this work, we include faster methods to accelerate DMC (the most accurate fingerprint matching algorithm based only on minutiae). In particular, we translate into C++ the functions of the algorithm which represent the most costly tasks of the code; we create a library with the new code and we link the library to the original C# code using a CLR Class Library project by means of a C++/CLI Wrapper. Our solution re-implements critical functions, e.g., the bit population count including a fast C++ PopCount library and the use of the squared Euclidean distance for calculating the minutiae neighborhood. The experimental results show a significant reduction of the execution time in the optimized functions of the matching algorithm. Finally, a novel approach to improve the matching algorithm, considering cache memory blocking and parallel data processing, is presented as future work.
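The two low-level operations singled out in this abstract are simple to illustrate. The Python sketch below is not the authors' C++ library; it only shows what a bit population count computes and why comparing squared distances avoids a square root when testing whether a minutia lies within a neighborhood radius. The function names and the 2D point representation are assumptions.

```python
def popcount(value):
    """Number of set bits in a non-negative integer (bit population count)."""
    return bin(value).count("1")


def squared_distance(p, q):
    """Squared Euclidean distance between two 2D points given as (x, y)."""
    dx = p[0] - q[0]
    dy = p[1] - q[1]
    return dx * dx + dy * dy


def within_radius(p, q, radius):
    """Neighborhood test without a square root: d^2 <= r^2 iff d <= r."""
    return squared_distance(p, q) <= radius * radius
```

Because both sides of the comparison are non-negative, squaring preserves the ordering, so the neighborhood test gives the same answer as one based on the true distance.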
By
Trujillo, Leonardo; Z-Flores, Emigdio; Juárez-Smith, Perla S.; Legrand, Pierrick; Silva, Sara; Castelli, Mauro; Vanneschi, Leonardo; Schütze, Oliver; Muñoz, Luis
There are two important limitations of standard tree-based genetic programming (GP). First, GP tends to evolve unnecessarily large programs, which is referred to as bloat. Second, GP uses inefficient search operators that focus on modifying program syntax. The first problem has been studied extensively, with many works proposing bloat control methods. Regarding the second problem, one approach is to use alternative search operators, for instance geometric semantic operators, to improve convergence. In this work, our goal is to experimentally show that both problems can be effectively addressed by incorporating a local search optimizer as an additional search operator. Using real-world problems, we show that this rather simple strategy can improve the convergence and performance of tree-based GP, while also reducing program size. Given these results, a question arises: Why are local search strategies so uncommon in GP? A small survey of popular GP libraries suggests to us that local search is underused in GP systems. We conclude by outlining plausible answers for this question and highlighting future work.
By
Pozos-Parra, Pilar; Perrussel, Laurent; Thévenin, Jean Marc
We extend the classic propositional tableau method in order to compute the models given by the semantics of Priest’s paraconsistent logic of paradox. Without loss of generality, we assume that the knowledge base is represented through propositional statements in NNF, which leads to using only two rules from the classical propositional tableau calculus for computing the paraconsistent models. We consider multisets to represent branches of the tableau tree and we extend the classical closed branches in order to compute the paradoxical models of formulas of the knowledge base. A sound and complete algorithm is provided.
By
Loyola, Juan Martín; Errecalde, Marcelo Luis; Escalante, Hugo Jair; Montes y Gomez, Manuel
The problem of classification is a widely studied one in supervised learning. Nonetheless, there are scenarios that have received little attention despite their applicability. One such scenario is early text classification, where one needs to know the category of a document as soon as possible. The importance of this variant of the classification problem is evident in tasks like sexual predator detection, where one wants to identify an offender as early as possible. This paper presents a framework for early text classification which highlights the two main pieces involved in this problem: classification with partial information and deciding the moment of classification. In this context, a novel approach that learns the second component (when to classify) and an adaptation of a temporal measurement for multiclass problems are introduced. Results with a classical text classification corpus, in comparison against a model that reads entire documents, confirm the feasibility of our approach.
By
Trujillo, Alejandra Guadalupe Silva; Orozco, Ana Lucila Sandoval; Villalba, Luis Javier García; Kim, Tai-Hoon
The development of digital media, the increasing use of social networks, and easier access to modern technological devices are perturbing thousands of people in their public and private lives. People love posting their personal news without considering the risks involved. Privacy has never been more important. Research on privacy-enhancing technologies has attracted considerable international attention after the recent news regarding the protection of users’ personal data on social media websites like Facebook. It has been demonstrated that even when using an anonymous communication system, it is possible to reveal users’ identities through intersection attacks or traffic analysis attacks. By combining a traffic analysis attack with Social Network Analysis (SNA) techniques, an adversary may be able to obtain important data from the whole network, such as the topological network structure and subsets of social data, revealing communities and their interactions. The aim of this work is to demonstrate how intersection attacks can disclose structural properties and significant details from an anonymous social network composed of a university community.
By
López-Monroy, A. Pastor; Montes-y-Gómez, Manuel; Escalante, Hugo Jair; González, Fabio A.
The Bag-of-Visual-Words (BoVW) representation is a well-known strategy to approach many computer vision problems. The idea behind BoVW is similar to the Bag-of-Words (BoW) used in text mining tasks: to build word histograms to represent documents. Regarding computer vision, most of the research has been devoted to obtaining better visual words, rather than to improving the final representation. This is somewhat surprising, as there are many alternative ways of improving the BoW representation within the text mining community that can be applied in computer vision as well. This paper aims at evaluating the usefulness of Distributional Term Representations (DTRs) for image classification. DTRs represent instances by exploiting statistics of feature occurrences and co-occurrences across the dataset. We focus on the suitability and effectiveness of using well-known DTRs in different image collections. Furthermore, we devise two novel distributional strategies that learn appropriate groups of images to compute better-suited distributional features. We report experimental results on several image datasets showing the effectiveness of the proposed DTRs over BoVW and other methods in the literature, including deep-learning-based strategies. In particular, we show the effectiveness of the proposed representations on image collections from narrow domains, where target categories are subclasses of a more general class (e.g., subclasses of birds, aircraft, or dogs).
By
Rojas-Morales, Nicolás; Riff, María-Cristina; Coello Coello, Carlos A.; Montero, Elizabeth
In recent years, there has been an increasing interest in Opposite Learning strategies. In this work, we propose COISA, a Cooperative Opposite-Inspired Strategy for Ants. Inspired by the concept of anti-pheromone, in this approach, sub-colonies of ants perform different search processes to construct an initial pheromone matrix. We aim to produce a repelling effect to (temporarily) avoid components that were related to an undesirable characteristic. To assess the effectiveness of COISA, we selected Ant Knapsack, a well-known ant-based algorithm that efficiently solves the Multidimensional Knapsack Problem. Results on benchmark instances show that the performance of Ant Knapsack improves when the opposite information is considered, so that it can reach better solutions than before.
By
Kuri-Morales, Angel
Structured databases which include both numerical and categorical attributes (Mixed Databases or MD) ought to be adequately preprocessed so that machine learning algorithms may be applied to their analysis and further processing. Of primary importance is that the instances of all the categorical attributes be encoded so that the patterns embedded in the MD are preserved. We discuss CESAMO, an algorithm that achieves this by statistically sampling the space of possible codes. CESAMO’s implementation requires determining the moment when the codes are normally distributed. It also requires the approximation of an encoded attribute as a function of other attributes such that the best code assignment may be identified. The MD’s categorical attributes are thus mapped into purely numerical ones. The resulting numerical database (ND) is then accessible to supervised and non-supervised learning algorithms. We discuss CESAMO, normality assessment and functional approximation. A case study of the US census database is described. Data is made strictly numerical using CESAMO. Neural Networks and Self-Organized Maps are then applied. Our results are compared to classical analysis. We show that CESAMO’s application yields better results.
By
Hernández-Hernández, Saiveth; Orantes-Molina, Antonio; Cruz-Barbosa, Raúl
Breast cancer is a global health problem principally affecting the female population. Digital mammograms are an effective way to detect this disease. One of the main indicators of malignancy in a mammogram is the presence of masses. However, their detection and diagnosis remain a difficult task. In this study, the impact of the combination of image descriptors and clinical data on the performance of conventional and kernel methods is presented. These models are trained with a dataset extracted from the public database BCDR-D01. The experimental results have shown that adding clinical data to image descriptors improves the performance of classifiers compared with using the descriptors alone. Likewise, this combination, when used with a nonlinear kernel function, achieves performance similar to that reported in the literature for this dataset.
By
Ma, Xiaobin; Du, Zhihui; Sun, Yankui; Bai, Yuan; Wu, Suping; Tchernykh, Andrei; Xu, Yang; Wu, Chao; Wei, Jianyan
Recent astronomy projects observe celestial objects with astronomical cameras that generate images continuously. To identify transient objects, the positions of these objects in the images need to be compared against a reference table for the same portion of the sky, which is a complex search task called cross match. We designed Euclidean-Zone (EZone), a method for faster neighbor point queries which allows efficient cross match between spatial catalogs. In this paper, we implemented the EZone algorithm using the Euclidean distance between celestial objects in pixel coordinates to avoid the complex mathematical functions of the equatorial coordinate system. Meanwhile, we surveyed the parameters of our model and other system factors to find optimal configurations of this algorithm. In addition to the sequential algorithm, we modified the serial program and implemented an OpenMP parallelized version. For the serial version, our algorithm achieved a speedup of 2.07 times over using the equatorial coordinate system. Also, we achieved 19 ms for sequential queries and 5 ms for parallel queries for 200,000 objects on a single CPU processor over a 230,520-entry synthetic reference database.
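The general idea of a neighbor point query in pixel coordinates can be sketched with a uniform grid: reference objects are bucketed by cell, and a query only inspects its own cell and the eight surrounding ones. This is a generic illustration under stated assumptions, not the EZone algorithm itself; the function names, the square-cell layout, and the sample coordinates are all hypothetical.

```python
from collections import defaultdict
from math import floor, hypot


def build_grid(reference, cell):
    """Bucket reference points (x, y) by the square cell they fall into."""
    grid = defaultdict(list)
    for index, (x, y) in enumerate(reference):
        grid[(floor(x / cell), floor(y / cell))].append(index)
    return grid


def cross_match(query, reference, grid, cell, radius):
    """Return the index of the nearest reference point within radius, or None.

    Only the 3x3 block of cells around the query is scanned, which is
    sufficient as long as the search radius does not exceed the cell size.
    """
    if radius > cell:
        raise ValueError("radius must not exceed the cell size")
    qx, qy = query
    cx, cy = floor(qx / cell), floor(qy / cell)
    best, best_dist = None, radius
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for index in grid.get((cx + dx, cy + dy), ()):
                rx, ry = reference[index]
                dist = hypot(qx - rx, qy - ry)  # plain Euclidean distance in pixels
                if dist <= best_dist:
                    best, best_dist = index, dist
    return best


# Hypothetical reference catalog in pixel coordinates.
reference = [(10.0, 10.0), (10.4, 10.1), (11.05, 10.0), (50.0, 50.0)]
grid = build_grid(reference, cell=1.0)
match = cross_match((10.3, 10.1), reference, grid, cell=1.0, radius=0.5)
```

Because each query touches a bounded number of cells, independent queries could in principle be distributed over threads, which is the kind of workload the abstract parallelizes with OpenMP.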
By
Amato, Giuseppe; Chávez, Edgar; Connor, Richard; Falchi, Fabrizio; Gennaro, Claudio; Vadicamo, Lucia
In the realm of metric search, permutation-based approaches have shown very good performance in indexing and supporting approximate search on large databases. These methods embed the metric objects into a permutation space where candidate results to a given query can be efficiently identified. Typically, to achieve high effectiveness, the permutation-based result set is refined by directly comparing each candidate object to the query. Therefore, one drawback of these approaches is that the original dataset needs to be stored and then accessed during the refining step. We propose a refining approach based on a metric embedding, called n-Simplex projection, that can be used on metric spaces meeting the n-point property. The n-Simplex projection provides upper and lower bounds of the actual distance, derived using the distances between the data objects and a finite set of pivots. We propose to reuse the distances computed for building the data permutations to derive these bounds and we show how to use them to improve the permutation-based results. Our approach is particularly advantageous in all cases in which the traditional refining step is too costly, e.g. very large datasets or very expensive metric functions.
By
Tello-Rodríguez, Heberi; Torres-Treviño, Luis
Using the collective perception of a smart robot swarm with a nature-inspired behavior (Iain Couzin’s model) and with the help of landmarks, we obtained information about a specific area, such as an approximation of the position of each robot, an idea of the perimeter’s shape, and a representation of the influence of the landmark (a light source). We used only information from local sensors, and although we did not use centralized control, we had to develop a communication mechanism to centralize the information from each individual.
By
Sánchez-Adame, Luis Martín; Mendoza, Sonia; González-Beltrán, Beatriz A.; Meneses Viveros, Amilcar; Rodríguez, José
A virtual community is a social group of any size that shares common interests and communicates through the Internet. A user joins a virtual community not only because of its popularity or the quality of its contents, but also owing to the user experience that the platform offers. Anticipated User eXperience (AUX) makes it possible to know the idealisations, hopes, and desires of the users at a very early stage of any development. Participation is a crucial component in the growth and survival of any virtual community. An essential element for people to participate in a virtual community is that the platform should provide suitable user tools, which are widgets that allow users to interact with their peers. We propose an AUX evaluation framework for user tools, whose intention is to improve their design and, through it, the participation of users.
By
Merelo, Juan J.; García-Valdez, José-Mario
Concurrent languages such as Perl 6 fully leverage the power of current multi-core and hyper-threaded computer architectures, and they include easy ways of automatically parallelizing code. However, to achieve more computational capability by using all threads and cores, algorithms need to be redesigned to run in a concurrent environment; in particular, using reactive, fully functional patterns requires turning the algorithm into a series of stateless steps, with simple functions that receive all the context and map it to the next stage. In this paper, we analyze different versions of these stateless, reactive architectures applied to evolutionary algorithms, assessing how they interact with the characteristics of the evolutionary algorithm itself, and show how they improve the scaling behavior and performance. We use the Perl 6 language, which is a modern, concurrent language that was released recently and is still under very active development.
By
Sánchez-Gutiérrez, Máximo; Albornoz, Enrique Marcelo
In recent years, the effort devoted by the scientific community to developing better emotion recognition systems has increased, mainly driven by the potential applications. Restricted Boltzmann machines (RBM) and deep Boltzmann machines (DBM) are models that, in recent years, have received much attention due to their good performance on different problems. However, it is usually difficult to measure their predictive capacity and, specifically, the individual importance of hidden units. In this work, some measures are computed on the hidden units in order to rank their discriminative ability among multiple classes. Then, this information is used to prune those units that seem less relevant. The results show a significant decrease in the number of units used in the classification, while the error rate is improved.
By
Alexandrov, Vassil; Davila, Diego; Esquivel-Flores, Oscar; Karaivanova, Aneta; Gurov, Todor; Atanassov, Emanouil
This paper focuses on further minimizing the communications in Monte Carlo methods for Linear Algebra, thus improving the overall performance. The focus is on producing a small number of covering Markov chains which are much longer than those usually produced. This approach allows a very efficient communication pattern that makes it possible to transmit the sampled portion of the matrix in the parallel case. The approach is further applied to quasi-Monte Carlo. A comparison of the efficiency of the new approach in the case of Sparse Approximate Matrix Inversion and hybrid Monte Carlo and quasi-Monte Carlo methods for solving Systems of Linear Algebraic Equations is carried out. Experimental results showing the efficiency of our approach on a set of test matrices are presented. The numerical experiments have been executed on the MareNostrum III supercomputer at the Barcelona Supercomputing Center (BSC) and on the Avitohol supercomputer at the Institute of Information and Communication Technologies (IICT).
By
López, Marco A.; Marcial-Romero, J. Raymundo; Ita, Guillermo; Valdovinos, Rosa M.
In this paper we present an implementation (markSAT) for computing #2SAT via graph transformations. To do so, we transform the input formula into a graph and test whether it is what we call a cactus graph. If it is not, the formula is decomposed until cactus subformulas are obtained. We compare the efficiency of markSAT against sharpSAT, which is the leading sequential algorithm in the literature for computing #SAT, obtaining better results with our proposal.
By
Calvo, Hiram; Juárez Gambino, Omar
Many different attempts have been made to determine sentiment polarity in tweets, using emotion lexicons and different NLP techniques with machine learning. In this paper we focus on using emotion lexicons and machine learning only, avoiding the use of additional NLP techniques. We present a scheme that is able to outperform other systems that use both natural language processing and distributional semantics. Our proposal consists of using a cascading classifier on lexicon features to improve accuracy. We evaluate our results with the TASS 2015 corpus, reaching an accuracy only 0.07 below the top-ranked system for task 1, 3 levels, whole test corpus. The cascading method we implemented consisted of using the results of a first-stage classification with Multinomial Naïve Bayes as additional columns for a second-stage classification using a Naïve Bayes Tree classifier with feature selection. We tested at least 30 different classifiers and this combination yielded the best results.
By
Aguilar, José Alfonso; Zaldívar-Colado, Aníbal; Tripp-Barba, Carolina; Espinosa, Roberto; Misra, Sanjay; Zurita, Carlos Eduardo
Over time, the scientific literature has highlighted the relevance of requirements engineering to the software development process for desktop, web or mobile applications. Nevertheless, not much contemporary information with regard to current practices in small-sized software factories is available. This is especially true in the region of Sinaloa, México; for that reason, this work presents an exploratory study which provides insight into industrial practices in Sinaloa. A combination of both qualitative and quantitative data is collected, using semi-structured interviews and a detailed questionnaire from sixteen software factories. A Pearson (r) correlation analysis was performed independently between the variables Company location (EU), Scope of coverage (AC), Number of workers (NT), Time to live in the market (TV), Projects completed (PY), Time dedicated to activities related to the project (TA), and Outdated projects completed (PC) in order to determine the degree of relationship between each of these variables and all the others. A correlation analysis and an analysis of variance (ANOVA) were performed. The quantitative results offer opportunities for further interpretation and comparison.
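As a reminder of what the Pearson (r) coefficient measures, the sketch below computes it directly from its definition, the covariance of two variables divided by the product of their standard deviations. The sample values are invented for illustration and are not data from the study; the variable names merely echo two of the abbreviations above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented example: number of workers (NT) versus projects completed (PY).
workers = [3, 5, 8, 12, 20]
projects = [2, 4, 5, 9, 15]
r = pearson_r(workers, projects)
```

A value of `r` close to 1 indicates a strong positive linear relationship, a value close to -1 a strong negative one, and a value near 0 little linear relationship.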
By
Calvo, Hiram; Hernández-Castañeda, Ángel; García-Flores, Jorge
We tackle the task of author identification at PAN 2015 through a Latent Dirichlet Allocation (LDA) model. By using this method, we take into account the vocabulary and context of words at the same time and, after a statistical process, find to what extent the relations between words are given in each document; processing a set of documents by LDA returns a set of distributions of topics. Each distribution can be seen as a vector of features and a fingerprint of each document within the collection. We then used a Naïve Bayes classifier on the obtained patterns, with varying performance. We obtained state-of-the-art performance for English, overtaking the best FS score reported in PAN 2015, while obtaining mixed results for other languages.
By
Rojas-Simón, Jonathan; Ledeneva, Yulia; García-Hernández, René Arnulfo
Over the last years, Automatic Text Summarization (ATS) has been considered one of the main tasks in Natural Language Processing (NLP), generating summaries in several languages (e.g., English, Portuguese, Spanish, etc.). Some of the most significant advances in ATS have been developed for Portuguese, as reflected in the proposals of various state-of-the-art methods. It is essential to know the performance of different state-of-the-art methods with respect to the upper bounds (Topline), lower bounds (Baseline-random), and other heuristics (Baseline-first). In recent works, the significance and upper bounds for Single-Document Summarization (SDS) and Multi-Document Summarization (MDS) using corpora from Document Understanding Conferences (DUC) were calculated. In this paper, a calculation of upper bounds for SDS in Portuguese using Genetic Algorithms (GA) is performed. Moreover, we present a comparison of some state-of-the-art methods with respect to the upper bounds, lower bounds, and heuristics to determine their level of significance.
By
Acosta-Mendoza, Niusvel; Carrasco-Ochoa, Jesús Ariel; Martínez-Trinidad, José Fco.; Gago-Alonso, Andrés; Medina-Pagola, José E.
1 Citation
Frequent approximate subgraph (FAS) mining and graph clustering are important techniques in Data Mining with great practical relevance. In FAS mining, some approximations in data are allowed for identifying graph patterns, which could be used for solving other pattern recognition tasks like supervised classification and clustering. In this paper, we explore the use of the patterns identified by a FAS mining algorithm on a graph collection for image clustering. Some experiments are performed on image databases to show that by using the FASs mined from a graph collection under the bag-of-features image approach, it is possible to improve the clustering results reported by other state-of-the-art methods.
By
Figueroa, Karina; Reyes, Nora; Camarena-Ibarrola, Antonio; Valero-Elizondo, L.
Similarity search is a central problem in many applications, such as multimedia databases and repositories containing complex non-structured objects. The metric space model is very useful in these scenarios, because metric indexes support efficient similarity search, but most of them are designed for main memory. In this article we introduce an improved version of the List of Clustered Permutations (iLCP), a competitive index for approximate similarity search. Our proposal is specially adapted for secondary memory and performs well in several scenarios, especially on spaces of medium and high dimensionality. We assessed this new structure with several real-life metric spaces from SISAP; the results show that this new version keeps the rewarding characteristics of LCP, while obtaining very good performance in terms of the number of pages read per search.
By
González-López, Samuel; López-López, Aurelio
Often, academic programs require students to write a thesis or research proposal. The review of such texts is a heavy load, especially at initial stages. Natural Language Processing techniques are employed to mine existing corpora of research proposals and theses to further assess drafts of college students in information technologies and computer science. In this chapter, we focus on examining specific sections of student writings, first looking for the connection of ideas by identifying patterns of entities. Subsequently, we analyze the justification and conclusions sections, studying features such as the presence of importance in a justification and the level of speculative words in a conclusion section. Experiments and results for the different analyses are explained in detail. Each analysis is independent and could allow students to analyze their text with a set of tools with the aim of improving their writing.
By
Diaz-Escobar, Julia; Kober, Vitaly
The objective of text segmentation algorithms is a pixel-level separation of characters from the image background. This task is difficult due to several factors such as environmental aspects, image acquisition problems, and complex textual content. Up to now, the MSER technique has been widely used to solve the problem due to its invariance to geometric distortions and its robustness to noise and illumination variations. However, when pixel intensities are too low, the MSER method often fails. In this paper, a new text segmentation method based on local phase information is proposed. Phase-based stable regions are obtained while the phase congruency values are used to select candidate regions. The computer simulation results show the robustness of the proposed method to different image degradations. Moreover, the method outperforms the MSER technique in most cases.
By
Olague, Gustavo; Hernández, Daniel E.; Llamas, Paul; Clemente, Eddie; Briseño, José L.
This work describes the use of brain programming for automating the video tracking design process. The challenge is that of creating visual programs that learn to detect a toy dinosaur from a database while being tested in a visual-tracking scenario. When planning an object tracking system, two subtasks need to be approached: detection of moving objects in each frame and correct association of detections to the same object over time. Visual attention is a skill performed by the brain whose functionality is to perceive salient visual features. The automatic design of visual attention programs through an optimization paradigm is applied to the detection-based tracking of objects in a video from a moving camera. A system based on the acquisition and integration steps of the natural dorsal stream was engineered to emulate its selectivity and goal-driven behavior useful to the task of tracking objects. This is considered a challenging problem since many difficulties can arise due to abrupt object motion, changing appearance patterns of both the object and the scene, non-rigid structures, object-to-object and object-to-scene occlusions, as well as camera motion, models, and parameters. Tracking relies on the quality of the detection process, and automatically designing such a stage could significantly improve tracking methods. Experimental results confirm the validity of our approach using three different kinds of robotic systems. Moreover, a comparison with the method of regions with convolutional neural networks is provided to illustrate the benefit of the approach.
By
Arce, Fernando; Zamora, Erik; Hernández, Gerardo; Antelis, Javier M.; Sossa, Humberto
A brain-computer interface provides individuals with a way to control a computer. However, these interfaces remain mostly confined to research laboratories due to the lack of certainty and accuracy in the proposed systems. In this work, we acquired our own dataset from seven able-bodied subjects and used Deep Multi-Layer Perceptrons to classify motor imagery electroencephalography signals into binary (Rest vs Imagined and Left vs Right) and ternary classes (Rest vs Left vs Right). These Deep Multi-Layer Perceptrons were fed with power spectral features computed with Welch’s averaged modified periodogram method. The proposed architectures outperformed the accuracy achieved by the state of the art for classifying motor imagery bioelectrical brain signals, obtaining 88.03%, 85.92% and 79.82%, respectively, and an enhancement of 11.68% on average over the commonly used Support Vector Machines.
By
Gómez-Adorno, Helena; Martín-del-Campo-Rodríguez, Carolina; Sidorov, Grigori; Alemán, Yuridiana; Vilariño, Darnes; Pinto, David
The author clustering problem consists in grouping documents written by the same author so that each group corresponds to a different author. We describe our approach to the author clustering task at PAN 2017, which resulted in the best-performing system at the aforementioned task. Our method performs a hierarchical clustering analysis using document features such as typed and untyped character n-grams, word n-grams, and stylometric features. We experimented with two feature representation methods, the log-entropy model and TF-IDF, while tuning minimum frequency threshold values to reduce the feature dimensionality. We identified the optimal number of different clusters (authors) dynamically for each collection using the Caliński-Harabasz score. The implementation of our system is available open source (https://github.com/helenpy/clusterPAN2017).
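For reference, the Caliński-Harabasz score is the ratio of between-cluster dispersion to within-cluster dispersion, each normalized by its degrees of freedom; higher values indicate more compact, better separated clusters. The sketch below follows that standard definition in plain Python. It is not taken from the linked repository, and the function name and toy vectors are assumptions for illustration.

```python
def calinski_harabasz(points, labels):
    """Caliński-Harabasz score for points (sequences of floats) and cluster labels."""
    n = len(points)
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need between 2 and n - 1 clusters")

    def centroid(rows):
        return [sum(column) / len(rows) for column in zip(*rows)]

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = centroid(points)
    # Between-cluster dispersion: size-weighted distance of each centroid to the global one.
    between = sum(
        len(rows) * squared_distance(centroid(rows), overall)
        for rows in clusters.values()
    )
    # Within-cluster dispersion: distance of each point to its own centroid.
    within = sum(
        squared_distance(point, centroid(rows))
        for rows in clusters.values()
        for point in rows
    )
    if within == 0:
        return float("inf")  # perfectly tight clusters
    return (between / (k - 1)) / (within / (n - k))


# Toy feature vectors standing in for document representations.
documents = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = calinski_harabasz(documents, [0, 0, 1, 1])
bad = calinski_harabasz(documents, [0, 1, 0, 1])
```

To choose the number of clusters, one would compute the score for each candidate partition produced by the hierarchical clustering and keep the partition with the highest value.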
By
López-Ramírez, Cristina; Ita, Guillermo; Neri, Alfredo
A novel method to model the 3-coloring of polygonal tree graphs is presented. This proposal is based on the logical specification of the constraints generated for a valid 3-coloring of polygonal graphs. In order to maintain a polynomial time procedure, the logical constraints are formed in a dynamic way. At the same time, the graph is traversed in postorder, resulting in a polynomial time instance of the incremental satisfiability problem. This proposal can be extended to consider other polynomial time instances of the 3-coloring problem.
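The logical specification of a valid 3-coloring can be illustrated with the textbook encoding: every vertex receives at least one color, at most one color, and adjacent vertices never share a color. The sketch below builds these clauses and checks them by exhaustive search on tiny graphs. It does not reproduce the incremental, postorder construction described in the abstract, and the literal encoding and function names are hypothetical.

```python
from itertools import combinations, product

COLORS = (0, 1, 2)


def coloring_clauses(vertices, edges):
    """Build CNF clauses; a literal is (vertex, color, polarity)."""
    clauses = []
    for v in vertices:
        # Each vertex takes at least one color...
        clauses.append([(v, c, True) for c in COLORS])
        # ...and never two colors at once.
        for c1, c2 in combinations(COLORS, 2):
            clauses.append([(v, c1, False), (v, c2, False)])
    for u, v in edges:
        # Adjacent vertices must not share a color.
        for c in COLORS:
            clauses.append([(u, c, False), (v, c, False)])
    return clauses


def satisfiable(vertices, clauses):
    """Exhaustive satisfiability check, only practical for very small graphs."""
    variables = [(v, c) for v in vertices for c in COLORS]
    for bits in product((False, True), repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        if all(
            any(assignment[(v, c)] == polarity for v, c, polarity in clause)
            for clause in clauses
        ):
            return True
    return False


# A triangle (a single polygon) is 3-colorable; the complete graph K4 is not.
triangle = ([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
k4 = ([0, 1, 2, 3], [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)])
triangle_ok = satisfiable(triangle[0], coloring_clauses(*triangle))
k4_ok = satisfiable(k4[0], coloring_clauses(*k4))
```

The exhaustive check is exponential in the number of vertices, which is why a polynomial time construction for restricted graph classes is of interest.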
By
Berčič, Katja; Vidali, Janoš
There have been various efforts to collect certain mathematical results into searchable databases. In this paper, we present DiscreteZOO: a repository and a fingerprint database for discrete mathematical objects. At the moment, it hosts collections of vertex-transitive graphs and maniplexes, which are a common generalisation of maps and abstract polytopes. The project encompasses a tool for handling and maintaining collections of objects, as well as a website and SageMath package for interacting with the database. The project aims to become a general platform to make collections of mathematical objects easier to publish and access.
By
Fraga, Luis Gerardo; García-Morales, Nataly A.; Jaramillo-Olivares, Daybelis; Ramírez-Díaz, Adrián J.
To build an Augmented Reality (AR) application it is necessary to recognize a fiducial marker, then to calibrate the camera that is viewing the 3D scene on the marker, and finally to draw a virtual object over the image taken by the camera but in the virtual coordinate system also assumed to lie on the fiducial marker. The camera calibration step gives us the transformation matrix from the 3D world to 2D on the screen, and the pose of the marker with respect to the virtual coordinate system. An AR application must run interactively with the user, and also in real time. Performing all these calculations on an embedded device such as a Single Board Computer (SBC), a tablet, or a smartphone is a challenge because a normal numerical analysis library is huge and is not designed for such devices. In this article we present a lightweight numerical library developed with such computationally restricted devices in mind. We show results on two AR applications developed for the Raspberry Pi 3 SBC.
By
Valdez-Rodríguez, José E.; Calvo, Hiram; Felipe-Riverón, Edgardo M.
Depth reconstruction from single images has been a challenging task due to the complexity and the quantity of depth cues that images have. Convolutional Neural Networks (CNN) have been successfully used to reconstruct depth of general object scenes; however, these works have not been tailored for the particular problem of road perspective depth reconstruction. As we aim to build a computationally efficient model, we focus on single-stage CNNs. In this paper we propose two different models for solving this task. A particularity is that our models perform refinement in the same single-stage training; thus, we call them Reduce-Refine-Upsample (RRU) models because of the order of the CNN operations. We compare our models with the current state of the art in depth reconstruction, obtaining improvements in both global and local views for images of road perspectives.
By
Rodriguez-Coayahuitl, Lino; Morales-Reyes, Alicia; Escalante, Hugo Jair
We introduce a novel method for representation learning based on genetic programming (GP). Inspired by the way that deep neural networks learn descriptive/discriminative representations from raw data, we propose a structurally layered representation that allows GP to learn a feature space from large-scale and high-dimensional data sets. Previous efforts from the GP community for feature learning have focused on small data sets with a few input variables; also, most approaches rely on domain expert knowledge to produce useful representations. In this paper, we introduce the structurally layered GP formulation, together with an efficient scheme to explore the search space, and show that this framework can be used to learn representations from large data sets of high-dimensional raw data. As a case study, we describe the implementation and experimental evaluation of an autoencoder developed under the proposed framework. Results show the benefits of the proposed framework and pave the way for the development of deep genetic programming.
By
Behrisch, Mike; Vargas-García, Edith; Zhuk, Dmitriy
We consider finitary relations (also known as crosses) that are definable via finite disjunctions of unary relations, i.e. subsets, taken from a fixed finite parameter set Γ. We prove that whenever Γ contains at least one nonempty relation distinct from the full carrier set, there is a countably infinite number of polymorphism clones determined by relations that are disjunctively definable from Γ. Finally, we extend our result to finitely related polymorphism clones and countably infinite sets Γ. These results address an open problem raised in Creignou, N., et al. Theory Comput. Syst. 42(2), 239–255 (2008), which is connected to the complexity analysis of the satisfiability problem of certain multiple-valued logics studied in Hähnle, R. Proc. 31st ISMVL 2001, 137–146 (2001).
By
Majumder, Goutam; Pakray, Partha; Khiangte, Zoramdinthara; Gelbukh, Alexander
We examine the formation of multiword expressions (MWE) and reduplicated words in the Mizo language, based on a news corpus (reduplication is a repetition of a linguistic unit, such as a morpheme, affix, word, or clause). To study the structure of reduplication, we follow lexical and morphological approaches, which have been used for the study of other Indian languages, such as Manipuri, Bengali, Odia, Marathi, etc. We also show the effect of these phenomena on natural language processing tasks for the Mizo language. To develop an algorithm for identification of reduplicated words in the Mizo language, we manually identified MWEs and reduplicated words and then studied their structural and semantic properties. The results were verified by linguists who are experts in the Mizo language.
By
Hervert-Escobar, Laura; Hernandez-Gress, Neil; Matis, Timothy I.
Sports today produce considerable data, such as players' skills, game results, season matches, and league management. The big challenge in sports science is to analyze this data to gain a competitive advantage. The analysis can be done using several techniques and statistical methods in order to produce valuable information. The problem of modeling soccer data has become increasingly popular in the last few years, with the prediction of results being the most popular topic. In this paper, we propose a Bayesian model based on rank position and shared history that predicts the outcome of future soccer matches. The model was tested using a data set containing the results of over 200,000 soccer matches from different soccer leagues around the world.
By
Hervert-Escobar, Laura; Alexandrov, Vassil
A well-designed territory enhances customer coverage, increases sales, fosters fair performance and reward systems, and lowers travel costs. This paper considers a real-life case study to design a sales territory for a business sales plan. The business plan consists of assigning the optimal number of sellers to a territory, including the scheduling and routing plans for each seller. The problem is formulated as a combination of assignment, scheduling and routing optimization problems. The solution approach considers a metaheuristic using a stochastic iterative projection method for large systems. Several real-life instances of different sizes were tested with stochastic data to represent the rise/fall in customer demand as well as the appearance/loss of customers.
By
Sánchez-Junquera, Javier; Villaseñor-Pineda, Luis; Montes-y-Gómez, Manuel; Rosso, Paolo
Controversial topics are present in everyday life, and opinions about them can be either truthful or deceptive. Deceptive opinions are expressed to mislead other people in order to gain some advantage. In most cases, humans cannot detect whether an opinion is deceptive or truthful; however, computational approaches have been used successfully for this purpose. In this work, we evaluate a representation based on character n-gram features for detecting deceptive opinions. We consider opinions on the following: abortion, death penalty and personal feelings about the best friend; three domains studied in the state of the art. We found character n-grams effective for detecting deception in these controversial domains, even more so than psycholinguistic features. Our results indicate that this representation is able to capture relevant information about style and content useful for this task. This allows us to conclude that the proposed representation is a competitive text representation with a good trade-off between simplicity and performance.
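To make the character n-gram representation concrete, the following is a minimal sketch of how overlapping character n-grams can be counted for a single text. The function name `char_ngrams` and the default `n=3` are illustrative assumptions; the abstract does not state which n-gram lengths, weighting, or classifier the authors used.

```python
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams in `text`.

    Hypothetical helper for illustration only; the abstract does not
    specify the n-gram length or any preprocessing.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Slide a window of width n over the text, one character at a time.
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


if __name__ == "__main__":
    # The resulting counts can serve as a sparse feature vector for a classifier.
    print(char_ngrams("to be or not to be", n=3).most_common(3))
```

Because the window slides over raw characters, the features capture punctuation, affixes and spacing as well as word fragments, which is one plausible reason such a representation carries both style and content information.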
By
Retchkiman Konigsberg, Zvi
Consider the interaction of populations in which there are exactly two species, one of which, the predators, eats the other, the prey, so that each affects the other. Lotka-Volterra models have been used in the study of this interaction. Other non-classical methodologies, such as Petri nets and first-order logic, have been employed too. This paper proposes a formal modeling and verification analysis methodology, which consists of representing the interaction behavior by means of a modal logic formula. Then, using the concept of logical implication and transforming this implication relation into a set of clauses, a qualitative modal resolution method is applied to verify (check the satisfiability of) some queries and to address performance issues.
By
Gąsior, Jakub; Seredyński, Franciszek; Tchernykh, Andrei
The paper presents a general framework to study issues of multiobjective online scheduling in the Infrastructure as a Service model of Cloud Computing (CC) systems, taking into account the total workflow execution cost while meeting the deadline and risk rate constraints. Our goal is to provide fairness between concurrent job submissions by minimizing the tardiness of individual applications and dynamically rescheduling them to the best-suited resources. The system, via the scheduling algorithms, is responsible for guaranteeing the corresponding Quality of Service (QoS) and Service Level Agreement (SLA) for all accepted jobs.
By
Lara-Nino, Carlos Andres; Diaz-Perez, Arturo; Morales-Sandoval, Miguel
Lightweight block ciphers are today of paramount importance for providing security services in constrained environments. Recent studies have questioned the security properties of Present, which makes evident the need to study alternative ciphers. In this work we provide hardware architectures for Midori and Gift, and compare them against implementations of Present and Gimli under fair conditions. The hardware description for our designs is made publicly available.
By
Navarro, Ingrid; Herrera, Alberto; Hernández, Itzel; Garrido, Leonardo
Deep learning-based frameworks have been widely used in object recognition, perception and autonomous navigation tasks, showing outstanding feature extraction capabilities. Nevertheless, the effectiveness of such detectors usually depends on large amounts of training data. For specific object-recognition tasks, it is often difficult and time-consuming to gather enough valuable data [10]. Data augmentation has been broadly adopted to overcome these difficulties, as it makes it possible to increase the training data and introduce variation in qualitative elements like color, illumination, distortion and orientation. In this paper, we leverage the object detection framework YOLOv2 [12] to evaluate the behavior of an obstacle detection system for an autonomous boat designed for the International RoboBoat Competition. We focus on how the overall performance of a model changes with different augmentation techniques. Thus, we analyze the features that the network learns by using geometric and pixel-wise transformations to augment our data. Our instances of interest are buoys and sea markers; to generate training data comprising these classes, we simulated the aquatic surface of the boat and collected data from the COCO dataset [8]. Finally, we discuss how significant generalization is achieved in the learning process of our experiments using different augmentation techniques.
By
Toledo, Leonel; Rivalcoba, Ivan; Rudomin, Isaac
1 Citation
In this work we present a system able to simulate crowds in complex urban environments. The system is built in two stages: urban environment generation and pedestrian simulation. For the first stage, we integrate the WRLD3D plugin with real data collected from GPS traces. We then use a hybrid approach that incorporates steering pedestrian behaviors, with the goal of simulating the subtle variations present in real scenarios without needing large amounts of data for low-level behaviors such as pedestrian motion affected by other agents and nearby static obstacles. Nevertheless, realistic human behavior cannot be modeled using deterministic approaches; therefore, our simulations are data-driven and, in some cases, handled by a combination of finite state machines (FSM) and fuzzy logic in order to account for the uncertainty of people's motion.
By
Dash, Sandeep Kumar; Pakray, Partha; Porzel, Robert; Smeddinck, Jan; Malaka, Rainer; Gelbukh, Alexander
Instructions for physical exercises leave many details underspecified that are taken for granted and inferred by the intended reader. For certain applications, such as generating virtual action visualizations from such textual instructions, advanced text processing is needed, requiring interpretation of both implicit and explicit information. This work presents an ontology that can support the semantic analysis of such instructions in order to support the identification of matching action constructs. The proposed ontology lays down a hierarchical structure following the structure of the human body, along with various types of movement restrictions. This facilitates flexible yet adequate representations.
By
Pinilla-Buitrago, Laura Alejandra; Carrasco-Ochoa, Jesús A.; Martinez-Trinidad, José Fco.
In the literature, all methods that represent Maya hieroglyphs compute local descriptors from the hieroglyph foreground. However, the background of a hieroglyph also contains information about its shape. Therefore, in this paper, we propose a new Maya hieroglyph representation that includes information from both the foreground and the background. Our experimental results show that our proposal for representing Maya hieroglyphs yields better retrieval results than those previously reported in the state of the art.
By
Patino-Saucedo, Alberto; Rostro-Gonzalez, Horacio; Conradt, Jorg
AlexNet is a Convolutional Neural Network (CNN) and a reference in the field of Machine Learning for Deep Learning. It has been successfully applied to image classification, especially in large sets such as ImageNet. Here, we have successfully applied a smaller version of the AlexNet CNN to classify tropical fruits from the Supermarket Produce dataset. This database contains 2633 images of fruits divided into 15 categories with high variability and complexity, i.e. shadows, pose, occlusion, reflection (fruits inside a bag), etc. Since few training samples are required for fruit classification, and to prevent overfitting, the modified AlexNet CNN has fewer feature maps and fully connected neurons than the original one, and data augmentation of the training set is used. Numerical results show a top-1 classification accuracy of 99.56% and a top-2 accuracy of 100% for the 15 classes, which outperforms previous works on the same dataset.
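For readers unfamiliar with the top-1 and top-2 figures quoted above, the sketch below shows one common way to compute top-k accuracy from per-class scores. The function name `top_k_accuracy` and the toy scores are assumptions made for illustration; they are not taken from the paper, and the authors' evaluation code may differ.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int],
                   k: int = 1) -> float:
    """Fraction of samples whose true label is among the k highest-scored classes.

    Illustrative helper: `scores[i][c]` is the model score of class c for sample i.
    """
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equally long")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score, truncated to k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


if __name__ == "__main__":
    # Made-up scores for three samples and three classes.
    toy_scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
    toy_labels = [1, 1, 0]
    print(top_k_accuracy(toy_scores, toy_labels, k=1))
    print(top_k_accuracy(toy_scores, toy_labels, k=2))
```

Top-k accuracy can never decrease as k grows, which is consistent with the top-2 figure being at least as high as the top-1 figure.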
By
Loyola-González, Octavio; Monroy, Raúl; Medina-Pérez, Miguel Angel; Cervantes, Bárbara; Grimaldo-Tijerina, José Ernesto
Nowadays, companies invest resources in detecting non-human accesses in their web traffic. Usually, non-human accesses are few compared with human accesses, which makes this a class imbalance problem; as a consequence, classifiers bias their results toward the human accesses and thereby overlook the non-human ones. In some classification problems, such as non-human traffic detection, high accuracy is not the only desired quality; the model provided by the classifier should also be understandable by experts. For that reason, in this paper we study the use of contrast pattern-based classifiers for building an understandable and accurate model for detecting non-human traffic in web log files. Our experiments over five databases show that the contrast pattern-based approach obtains significantly better AUC results than other state-of-the-art classifiers.
By
Zúñiga, Angel; Sierra, Gerardo; Bel-Enguix, Gemma; Galicia-Haro, Sofía N.
Creating a natural language compiler has been one of the most sought-after goals since the very beginning of artificial intelligence. However, it has proved an elusive and difficult task, to the extent of being considered almost impossible. In this article, we present a promising path based on a grammar formalism that attempts to model natural language: minimalist grammars, one of the most recently proposed formalisms of this type. The main idea is to create a parser based on this type of grammar that can recognize and analyze text (the input program) written in natural language, and to use this parser as the front end of a compiler. The rest of the compilation process then uses the usual phases of a classic programming language compiler. Moreover, we present a prototype of a natural language compiler whose specific language is that of arithmetic expressions, showing that the proposed compiler design can indeed be put into practice.
By
Ullah, Muhammad Rizwan; Aslam, Muhammad; Ullah, Muhammad Imran; Maria, Martinez-Enriquez Ana
Driver drowsiness and sleepiness are an important cause of road accidents on expressways, highways, and motorways. These accidents result not only in economic loss but also in physical injuries, which can lead to permanent disability or even death. The aim of this research is to minimize this cause of road accidents. Safe driving is an unavoidable requirement, and to attain it, driver drowsiness detection systems need to be incorporated in vehicles. Drowsiness can be detected using vehicle-based, physiological, and behavioral change measurement systems, each with its own pros and cons. Advances in image processing and the development of faster and cheaper processors have directed researchers to focus on behavioral change measurement systems for drowsiness detection. Computer vision based drowsiness detection is possible by closely monitoring drowsiness symptoms such as eye blinking intervals, yawning, eye closing duration, and head position. This paper deals with the merits and demerits of drowsiness symptom measurement mechanisms and computer vision based drowsiness detection systems. The conclusion of the research is that dependability can be achieved by designing a hybrid computer vision based drowsy driver detection system. The proposed system is non-intrusive in nature and helps attain safer roads by limiting the potential accident threat due to driver drowsiness.
By
Reyes-Nava, A.; Sánchez, J. S.; Alejo, R.; Flores-Fuentes, A. A.; Rendón-Lara, E.
In recent years, researchers have increased their interest in deep learning for data mining and pattern recognition applications. This is mainly due to its high processing capability and good performance in feature selection, prediction and classification tasks. In general, deep learning algorithms have demonstrated their great potential in handling large-scale data sets in image recognition and natural language processing applications, which are characterized by a very large number of samples coupled with a high dimensionality. In this work, we aim at analyzing the performance of deep neural networks for classification of gene-expression microarrays, in which the number of genes is of the order of thousands while the number of samples is typically less than a hundred. The experimental results show that in some of these challenging situations, the use of deep neural networks and traditional machine learning algorithms does not always lead to high performance results. This finding suggests that deep learning needs a very large number of both samples and features to achieve high performance.
By
Duhart, Bronson; Camarena, Fernando; Ortiz-Bayliss, José Carlos; Amaya, Ivan; Terashima-Marín, Hugo
The knapsack problem is a fundamental problem that has been extensively studied in combinatorial optimization because it has many practical applications. Several solution techniques have been proposed in the past, but their performance is usually limited by the complexity of the problem. Hence, this paper studies a novel hyper-heuristic approach based on the ant colony optimization algorithm to solve the knapsack problem. The hyper-heuristic is used to produce rules that decide which heuristic to apply given the current problem state of the instance being solved. We test the hyper-heuristic model on sets with a variety of knapsack problem instances. The results seem promising.
By
Limón, Xavier; Guerra-Hernández, Alejandro; Ricci, Alessandro
This paper deals with distribution aspects of endogenous environments; in this case, distribution refers to deployment on several machines across a network. A recognized challenge is achieving distributed transparency, a mechanism that allows an agent working in a distributed environment to maintain the same level of abstraction as in local contexts. In this way, agents do not have to deal with details about network connections, which would lower their abstraction level, change the way they work in comparison with locally focused environments, and reduce flexibility. This work proposes a model based on hierarchical workspaces, creating a distinctive layer for environment distribution, which the agents do not manage directly but can exploit as part of infrastructure services. The proposal is in the context of JaCaMo, the Multi-Agent Programming framework that combines the Jason, CArtAgO, and MOISE technologies, focusing especially on CArtAgO, which provides the means to program and organize the environment in terms of workspaces.
By
Oves García, Reinier; Valentin, Luis; Martínez-Carranza, José; Sucar, L. Enrique
This paper presents a fast algorithm for camera selection in a robotic multi-camera localization system. The scenario we study is one where a robot navigates an indoor environment using a four-camera vision system to localize itself. In this context, when something occludes the camera currently used for localization, the system has to switch to one of the other three available cameras to remain localized. The question that arises is which camera should be selected. We address this by proposing an approach that aims at selecting the next best view to carry on the localization. To that end, the number of static features in each direction is estimated using optical flow. In order to validate our approach, experiments in a real scenario with a mobile robot system are presented.
By
Fabila-Monroy, Ruy; Hidalgo-Toscano, Carlos; Huemer, Clemens; Lara, Dolores; Mitsche, Dieter
How should the vertices of a complete multipartite graph G be drawn on different points of a bounded d-dimensional integer grid such that the sum of squared distances between vertices of G is (i) minimized or (ii) maximized? For both problems we provide a characterization of the solutions. For the particular case $$d=1$$, our solution for (i) also settles the minimum-2-sum problem for complete bipartite graphs; the minimum-2-sum problem was defined by Juvan and Mohar in 1992. Weighted centroidal Voronoi tessellations are the solution for (ii). Such drawings are related to Laplacian eigenvalues of graphs. This motivates us to study which properties of the algebraic connectivity of graphs carry over to the restricted setting of drawings of graphs with integer coordinates.
By
Portillo-Portillo, Jose; Leyva, Roberto; Sanchez, Victor; Sanchez-Perez, Gabriel; Perez-Meana, Hector; Olivares-Mercado, Jesus; Toscano-Medina, Karina; Nakano-Miyatake, Mariko
This paper proposes a view-invariant gait recognition algorithm, which builds a unique view-invariant model taking advantage of the dimensionality reduction provided by Direct Linear Discriminant Analysis (DLDA). The proposed scheme is able to reduce the undersampling problem (USP) that usually appears when the number of training samples is much smaller than the dimension of the feature space. The proposed approach uses Gait Energy Images (GEIs) and DLDA to create a view-invariant model that is able to determine with high accuracy the identity of the person under analysis independently of incoming angles. Evaluation results show that the proposed scheme provides a recognition performance quite independent of the view angles and outperforms previously proposed gait recognition methods in terms of computational complexity and recognition accuracy.
By
Pérez, Julio; Tang, Yu; Grave, Ileana
The objective of this paper is, on the one hand, to design observers for a class of neuronal oscillators and, on the other hand, to give a comparative study of observer performance as the number of synchronized observers increases. More specifically, we apply the observer design methodology of [4] to a class of neural oscillators. The contraction tool [7] is applied to obtain an exponentially convergent reduced-order observer, which serves as a building block to construct a complete-order observer when the output is corrupted by a moderate level of noise. In the presence of strong measurement noise, several identical complete-order observers are coupled to synchronize.
By
Tovar, Mireya; Flores, Gerardo; Reyes-Ortiz, José A.; Contreras, Meliza
Synonymy is a relation of equivalence between the meanings of two or more words, which allows any of those words to be used in an equivalent way depending on the context. Given the difficulty of defining the concordance between meanings, Natural Language Processing research has focused on computational techniques that allow pairs of synonyms to be defined automatically. In this paper, a method based on lexico-syntactic patterns is proposed for the validation of semantic relations of synonymy between ontological concepts. In this paper, an acronym is considered a type of synonym. The results obtained by the proposed method were compared with the judgments of three experts, yielding an accuracy above 80% in the agreement between what the experts marked and the results of the method.
By
Díaz-Pacheco, Angel; Gonzalez-Bernal, Jesús A.; Reyes-García, Carlos Alberto; Escalante-Balderas, Hugo Jair
The increasingly large quantities of information generated in the world over the last few years have led to the emergence of the paradigm known as Big Data. The analysis of those vast quantities of data has become an important task in science and business in order to turn that information into a valuable asset. Many data analysis tasks involve the use of machine learning techniques during the model creation step, and the goal of these predictive models is to achieve the highest possible accuracy when predicting new samples; for this reason there is high interest in selecting the most suitable algorithm for a specific dataset. This task is known as model selection, and it has been widely studied in datasets of common size but poorly explored in the Big Data context. As an effort to explore this direction, this work proposes an algorithm for model selection in Big Data.
By
Acosta-Mendoza, Niusvel; Carrasco-Ochoa, Jesús Ariel; Gago-Alonso, Andrés; Martínez-Trinidad, José Francisco; Medina-Pagola, José Eladio
In data mining, frequent approximate subgraph (FAS) mining techniques have attracted attention in several applications where some approximations are allowed between graphs for identifying important patterns. In the last four years, the application of FAS mining algorithms over multi-graphs has reported relevant results in different pattern recognition tasks such as supervised classification and object identification. However, to the best of our knowledge, there is no reported work where the patterns identified by a FAS mining algorithm over multi-graph collections are used for image clustering. Thus, in this paper, we explore the use of multi-graph FASs for image clustering. Experiments over image collections show that, by using multi-graph FASs under the bag-of-features image approach, the image clustering results reported for simple-graph FASs can be improved.
By
Jafari, Raheleh; Razvarz, Sina; Gegov, Alexander
Predicting the solution of complex systems is a significant challenge. Complexity is caused mainly by uncertainty and nonlinearity. The nonlinear nature of many complex systems leaves uncertainty irreducible in many cases.
In this work, a novel iterative strategy based on a feedback neural network is proposed to obtain approximate solutions of the fully fuzzy nonlinear system (FFNS). To obtain the estimated solutions, a gradient descent algorithm is suggested for training the feedback neural network. An example is presented to demonstrate the high accuracy of the suggested technique.
By
Santamaría-Bonfil, Guillermo; Hernández, Yasmín; Pérez-Ramírez, Miguel; Arroyo-Figueroa, G.
An indispensable element of any Intelligent Tutoring System is the student model, since it enables the system to cope with each student's particular needs. Furthermore, data accumulated by educational systems in bug libraries can be exploited to build a student model by data mining methods. In this work, we use the bug libraries of a virtual reality system, employed by a Mexican utility to train electricians in operations with medium tension energized lines, to build a student model. First, errors are mapped to features using a Bag-of-Errors scheme. Additional information about the courses and the students is also incorporated. Then, a Decision Tree is employed to build the student model. Finally, several student models are built and compared in terms of Accuracy, Sensitivity, and Specificity. Results show that the proposed model is able to identify trained/untrained students with high accuracy. Moreover, these models shed light on critical task knowledge components which may be used to improve the learning experience of technical operators.
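As a rough illustration of two steps named in this abstract, the sketch below maps a log of error codes to a Bag-of-Errors count vector and computes Accuracy, Sensitivity and Specificity for binary predictions. The error codes, function names and the choice of label 1 for the positive class are hypothetical; the abstract does not describe the actual feature set or how the classes were encoded.

```python
from collections import Counter
from typing import Dict, List, Sequence


def bag_of_errors(error_log: Sequence[str], vocabulary: Sequence[str]) -> List[int]:
    """Count how often each known error code occurs in one student's log."""
    counts = Counter(error_log)
    return [counts[code] for code in vocabulary]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity, with label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num: int, den: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
    }


if __name__ == "__main__":
    # "E1".."E3" are made-up error codes used only for this example.
    print(bag_of_errors(["E1", "E3", "E1"], ["E1", "E2", "E3"]))
    print(binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1]))
```

Reporting sensitivity and specificity alongside accuracy shows whether a model errs mostly on one class, which accuracy alone can hide.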
By
Ghanem, Bilal; Arafeh, Labib; Rosso, Paolo; Sánchez-Vega, Fernando
Plagiarism is specifically defined as the literary theft of paragraphs or sentences from an unreferenced source. This unauthorized behavior is a real problem that affects scientific research. This paper proposes a Hybrid Arabic Plagiarism Detection System (HYPLAG). The HYPLAG approach combines corpus-based and knowledge-based approaches by utilizing an Arabic semantic resource (Arabic WordNet). A preliminary study on texts from undergraduate students was conducted to understand their behavior and the patterns used in plagiarism. The results of the study show that students apply different techniques to plagiarized sentences and that they change sentence components (verbs, nouns, and adjectives). HYPLAG was evaluated on the ExAraPlagDet-2015 dataset against several other approaches that participated in the AraPlagDet PAN@FIRE shared task on extrinsic Arabic plagiarism detection, obtaining a higher performance (F-score of 89% vs. 84% for the best performing system at AraPlagDet) with less computational time.
By
Kuri-Morales, Angel; Cartas-Ayala, Alejandro
One of the most interesting goals in engineering and the sciences is the mathematical representation of physical, social and other kinds of complex phenomena. This goal has been attempted and, lately, achieved with different machine learning (ML) tools. ML owes much of its present appeal to the fact that it allows complex phenomena to be modeled without an explicit definition of the form of the model. Neural networks and support vector machines exemplify such methods. However, in most cases these methods yield “black box” models, i.e. input and output correspond to the phenomena under scrutiny but it is very difficult (or outright impossible) to discern the interrelation of the input variables involved. In this paper we address this problem with the explicit aim of targeting models which are closed in nature, i.e. the aforementioned relation between variables is explicit. In order to do this, in general, the only assumption regarding the data is that they be approximately continuous. In such cases it is possible to represent the system with polynomial expressions. To be able to do so, one must define the number of monomials, the degree of every variable in every monomial and the associated coefficients. We model sparse data systems with an algorithm minimizing the min-max norm. From mathematical and experimental evidence we are able to set a bound on the number of terms and degrees of the approximating polynomials. Thereafter, a genetic algorithm (GA) identifies the coefficients which correspond to the terms and degrees defined as above.
By
Canales, Diana; Hernandez-Gress, Neil; Akella, Ram; Perez, Ivan
The prevalence of type 2 Diabetes Mellitus (T2DM) has reached critical proportions globally over the past few years. Diabetes can cause devastating personal suffering, and its treatment represents a major economic burden for every country around the world. To properly guide effective actions and measures, the present study aims to examine the profile of the diabetic population in Mexico. We used the Karhunen-Loève transform, which is a form of principal component analysis, to identify the factors that contribute to T2DM. The results revealed a unique profile of patients who cannot control this disease. Results also demonstrated that, compared to young patients, old patients tend to have better glycemic control. Statistical analysis reveals patient profiles and their health results and identifies the variables that measure overlapping health issues as reported in the database (i.e. collinearity).
By
Díaz-Pacheco, Angel; Reyes-García, Carlos Alberto
Full Model Selection is a technique for improving the accuracy of machine learning algorithms by searching, for each dataset, for the most adequate combination of feature selection, data preparation, a machine learning algorithm and the tuning of its hyperparameters. With the increasingly large quantities of information generated in the world, the emergence of the paradigm known as Big Data has made possible the analysis of gigantic datasets in order to obtain useful information for science and business. Though Full Model Selection is a powerful tool, it has been poorly explored in the Big Data context, due to the vast search space and the elevated number of fitness evaluations of candidate models. In order to overcome this obstacle, we propose the use of proxy models to reduce the number of expensive fitness function evaluations, and also the use of the Full Model Selection paradigm in the construction of such proxy models.
By
Markov, Ilia; Stamatatos, Efstathios; Sidorov, Grigori
The effectiveness of character n-gram features for representing the stylistic properties of a text has been demonstrated in various independent Authorship Attribution (AA) studies. Moreover, it has been shown that some categories of character n-grams perform better than others under both single- and cross-topic AA conditions. In this work, we present an improved algorithm for cross-topic AA. We demonstrate that the effectiveness of the character n-gram representation can be significantly enhanced by performing simple preprocessing steps and appropriately tuning the number of features, especially in cross-topic conditions.
By
Kuri-Morales, Angel
Structured databases may include both numerical and non-numerical attributes (categorical attributes, or CAs). Databases which include CAs are called “mixed” databases (MD). Metric clustering algorithms are ineffectual when presented with MDs because, in such algorithms, the similarity between the objects is determined by measuring the differences between them in accordance with some predefined metric. Nevertheless, the information contained in the CAs of MDs is fundamental to understand and identify the patterns therein. A practical alternative is to encode the instances of the CAs numerically. To do this we must consider the fact that there is a limited subset of codes which will preserve the patterns in the MD. To identify such pattern-preserving codes (PPC) we appeal to a statistical methodology. It is possible to statistically identify a set of PPCs by selectively sampling a bounded number of codes (corresponding to the different instances of the CAs) and requiring the method to set the size of the sample dynamically. Two issues have to be considered for this method to be defined in practice: (a) how to set the size of the sample and (b) how to define the adequacy of the codes. In this paper we discuss the method and present a case study wherein the appropriateness of the method is illustrated.
By
Lazo-Cortés, Manuel S.; Martínez-Trinidad, José Fco.; Carrasco-Ochoa, Jesús Ariel
In Rough Set Theory, reducts are minimal subsets of attributes that retain the ability of the whole set of attributes to discern objects belonging to different classes. Class-specific reducts, on the other hand, allow discerning objects belonging to a specific class from those of all other classes. This latter type of reduct has been little studied. Here we show, through a case study, some advantages of using class-specific reducts instead of classic ones in a rule-based classifier. Our results show that this issue is worth studying in greater depth.
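The two definitions above can be made concrete with a small brute-force sketch. It assumes a consistent decision table (the full attribute set discerns every pair of objects from different classes) and enumerates attribute subsets by increasing size; the function names and the `target` parameter are illustrative and do not come from the paper, which may compute reducts quite differently.

```python
from itertools import combinations

def discerns(rows, labels, attrs, target=None):
    # True if every relevant pair of objects differs on at least one attribute
    # in attrs. Classic reducts (target=None) consider all pairs from different
    # classes; class-specific reducts consider only pairs where exactly one
    # object belongs to the target class.
    for i, j in combinations(range(len(rows)), 2):
        if labels[i] == labels[j]:
            continue
        if target is not None and target not in (labels[i], labels[j]):
            continue
        if all(rows[i][a] == rows[j][a] for a in attrs):
            return False
    return True

def reducts(rows, labels, target=None):
    # Enumerate subsets from smallest to largest; skipping supersets of an
    # already found reduct guarantees that every result is minimal.
    n_attrs = len(rows[0])
    found = []
    for size in range(1, n_attrs + 1):
        for attrs in combinations(range(n_attrs), size):
            if any(set(r) <= set(attrs) for r in found):
                continue
            if discerns(rows, labels, attrs, target):
                found.append(attrs)
    return found

# Toy decision table: two attributes, three classes.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = ["a", "b", "c", "c"]
classic = reducts(rows, labels)                 # both attributes are needed
only_c = reducts(rows, labels, target="c")      # attribute 0 alone suffices
```

In this toy table the class-specific reduct for class "c" is smaller than the classic reduct, which is the kind of difference a rule-based classifier could exploit. The exhaustive search is exponential in the number of attributes and is meant only for illustration.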
By
Neira-Tovar, Leticia; Escobar Cavazos, Jesús Antonio; Lozano Carrasco, Javier Antonio; Barrera-Aldana, Salvador
Diabetes is a disease that affects a large part of the global population and is one of the leading causes of death in the world. In Mexico, recent years have seen the largest number of children with this disease; awareness is needed to prevent it, and suitable tools are required to fight it. Currently, some efforts have been made with information technologies to help people cope with their daily lives, including virtual reality (VR) as a support in some physical coordination treatments. The objective of this work in progress is to propose the development of a product based on VR techniques: a videogame supported by a data structure aimed at smart integration, specially designed to help control this disease, motivate physical activity, and improve body coordination through immersion techniques in an attractive and fun way. Tracking will be done with a proposed intelligent database structure that collects the data produced, which will be used to improve treatment tracking.
By
Hernández-Vega, José-Isidro; Varela, Elda Reyes; Romero, Natividad Hernández; Hernández-Santos, Carlos; Cuevas, Jonam Leonel Sánchez; Gorham, Dolores Gabriela Palomares
Using sensors to acquire data and then upload it to the Internet is one of the applications of the Internet of Things (IoT). This research approaches IoT from an environmental perspective, focusing on air pollution and its measurement, as air pollution is one of today's biggest problems. Air pollution affects not only the natural environment but also human health. There are several ways of measuring air pollution with sensors integrated in fixed monitoring units, which are installed in strategic areas of a city. These monitoring units have limited range as well as height limitations. The paper presents a data acquisition system, designed for and incorporated into a UAV, that monitors criteria pollutants in the air. Collected data is then sent by radio frequency to a ground station, which processes the information and sends it to the Internet. The results are shown on a web page that can be displayed on any computer or mobile device. The current proposal seeks to incorporate the technology into a smart city.
By
Ordaz-Rivas, Erick; Rodríguez-Liñán, Angel; Torres-Treviño, Luis
Swarm robotics is an approach to the coordination of large numbers of simple robots, inspired by the behavior of biological societies. Swarm robotics has benefits such as parallelism, redundancy, and solutions distributed in space and time, which are obtained through the use of homogeneous robots. However, certain complex tasks require the collaboration of heterogeneous robots. In nature, there is a relationship between heterogeneous species known as prey-predator, and this concept is also used in swarm robotics to solve tasks. This research studies the collaboration that emerges in a group of predators trying to catch a prey, where heterogeneity is implicit in the role, rules, and detection range of the prey compared to the predators. The proposed system requires a minimal exchange of information, based on the diffusion of short-range signals perceived by the robots. An algorithm is implemented in simulated robots, based on rules inspired by the behavior of social animals, including the parameters of repulsion, orientation, and attraction (ROA) and influence factors that make it possible to change the behavior of the swarm in a decentralized way.
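To give a feel for how repulsion, orientation, and attraction rules can drive a robot using only locally perceived information, here is a minimal sketch of a zone-based heading update. It assumes a common zonal formulation in which the nearest neighbors repel, mid-range neighbors align headings, and far neighbors attract; the radii, the priority given to repulsion, and the function name `roa_heading` are assumptions and are not taken from the paper.

```python
import math

def roa_heading(pos, heading, neighbors, r_rep=1.0, r_ori=2.0, r_att=4.0):
    """Return a new heading (radians) for one robot.

    neighbors is a list of ((x, y), heading) pairs for robots it can perceive.
    """
    rep = [0.0, 0.0]
    ori = [0.0, 0.0]
    att = [0.0, 0.0]
    for (nx, ny), n_heading in neighbors:
        dx, dy = nx - pos[0], ny - pos[1]
        dist = math.hypot(dx, dy)
        if dist == 0 or dist > r_att:
            continue  # coincident or out of perception range
        if dist < r_rep:
            # Repulsion: move directly away from a too-close neighbor.
            rep[0] -= dx / dist
            rep[1] -= dy / dist
        elif dist < r_ori:
            # Orientation: align with the neighbor's heading.
            ori[0] += math.cos(n_heading)
            ori[1] += math.sin(n_heading)
        else:
            # Attraction: move toward a distant neighbor.
            att[0] += dx / dist
            att[1] += dy / dist
    if rep != [0.0, 0.0]:
        vx, vy = rep  # assumed rule: repulsion overrides the other two
    else:
        vx, vy = ori[0] + att[0], ori[1] + att[1]
    if vx == 0 and vy == 0:
        return heading  # no influence: keep the current heading
    return math.atan2(vy, vx)
```

Each robot runs this update independently, so the swarm's behavior changes in a decentralized way as the radii are varied. A predator or prey role could use different radii, which is one hypothetical way to express the heterogeneity the abstract mentions.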
