By
Galván-López, Edgar; Vázquez-Mendoza, Lucia; Schoenauer, Marc; Trujillo, Leonardo
In Genetic Programming (GP), the fitness of individuals is normally computed using a set of fitness cases (FCs). Research on the use of FCs in GP has primarily focused on how to reduce the size of these sets. Often, however, only a small set of FCs is available and there is no need to reduce it. In this work, we are interested in using the whole set of FCs. Rather than adopting the common GP approach of presenting the entire set to the system from the beginning of the search, referred to as static FCs, we allow the GP system to build the set by aggregation over time, referred to as dynamic FCs, in the hope of making the search more amenable. Moreover, there is no study on the use of FCs in Dynamic Optimisation Problems (DOPs). To this end, we also use the Kendall Tau Distance (KTD) approach, which quantifies pairwise dissimilarities between two lists of fitness values. KTD aims to capture the degree of change in DOPs, and we use this to promote structural diversity. Results on eight symbolic regression functions indicate that both approaches are highly beneficial in GP.
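To make the Kendall Tau Distance concrete, the following is a minimal sketch of the standard pairwise-disagreement count between two lists of fitness values. The function names and the sample fitness lists are illustrative assumptions; the abstract does not say how the authors normalize the distance or how they handle ties.

```python
from itertools import combinations


def kendall_tau_distance(a, b):
    """Count the index pairs that the two lists order in opposite ways."""
    if len(a) != len(b):
        raise ValueError("both lists must have the same length")
    discordant = 0
    for i, j in combinations(range(len(a)), 2):
        # A pair is discordant when the lists disagree on which of the
        # two individuals is fitter; ties contribute nothing (assumption).
        if (a[i] - a[j]) * (b[i] - b[j]) < 0:
            discordant += 1
    return discordant


def normalized_kendall_tau_distance(a, b):
    """Scale the count to [0, 1] by the total number of pairs."""
    n = len(a)
    pairs = n * (n - 1) // 2
    return kendall_tau_distance(a, b) / pairs if pairs else 0.0


# Hypothetical fitness values of the same four individuals
# before and after a change in the problem.
before = [0.9, 0.7, 0.4, 0.1]
after = [0.7, 0.9, 0.1, 0.4]
print(kendall_tau_distance(before, after))
print(normalized_kendall_tau_distance(before, after))
```

A value near 0 means the ranking of individuals barely moved, while a value near 1 means the ranking was almost reversed, which is one way to read "the degree of change" in a dynamic problem.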
By
Olague, Gustavo; Romero, Eva; Trujillo, Leonardo; Bhanu, Bir
3 Citations
This paper presents a linear genetic programming approach that simultaneously solves the region selection and feature extraction tasks, which are applicable to common image recognition problems. The method searches for optimal regions of interest, using texture information as its feature space and classification accuracy as the fitness function. Texture is analyzed based on the gray level co-occurrence matrix, and classification is carried out with an SVM committee. Results show effective performance compared with previous results using a standard image database.
By
García-Valdez, Mario; Trujillo, Leonardo; Merelo-Guérvos, Juan Julián; Fernández-de-Vega, Francisco
4 Citations
Recently, several Pool-based Evolutionary Algorithms (PEAs) have been proposed that asynchronously distribute an evolutionary search among heterogeneous devices, using controlled nodes and nodes outside the local network, through web browsers or cloud services. In PEAs, the population is stored in a shared pool, while distributed processes called workers execute the actual evolutionary search. This approach allows researchers to use low-cost computational power that might not otherwise be available. On the other hand, it introduces the challenge of leveraging the computing power of heterogeneous and unreliable resources. The heterogeneity of the system suggests that a heterogeneous parametrization might be a better option, so the goal of this work is to test such a scheme. In particular, this paper evaluates the strategy proposed by Gong and Fukunaga for the Island Model, which assigns random control parameter values to each worker. Experiments were conducted to assess the viability of this strategy on pool-based EAs using benchmark problems and the EvoSpace framework. The results suggest that the approach can yield results that are competitive with other parametrization approaches, without requiring any form of experimental tuning.
By
Galvan, Edgar; Trujillo, Leonardo; McDermott, James; Kattan, Ahmed
1 Citation
It is commonly accepted that a mapping is local if it preserves neighbourhood. In Evolutionary Computation, locality is generally described as the property that neighbouring genotypes correspond to neighbouring phenotypes. Locality has been classified into one of two categories: high and low locality. A representation is said to have high locality if most genotypic neighbours correspond to phenotypic neighbours. The opposite is true for a representation that has low locality. It is argued that a representation with high locality performs better in evolutionary search than a representation with low locality. In this work, we present, for the first time, a study of Genetic Programming (GP) locality in continuous fitness-valued cases. For this, we extended the original definition of locality (first defined and used in Genetic Algorithms using bitstrings) from the genotype-phenotype mapping to the genotype-fitness mapping. Then, we defined three possible variants of locality in GP regarding neighbourhood. The experimental tests presented here use a set of symbolic regression problems, two different encodings and two different mutation operators. We show how locality can be studied in this type of scenario (continuous fitness-valued cases) and that locality can be successfully used as a performance prediction tool.
By
Muñoz, Luis; Silva, Sara; Trujillo, Leonardo
4 Citations
Data classification is one of the most ubiquitous machine learning tasks in science and engineering. However, Genetic Programming is still not a popular classification methodology, partially due to its poor performance in multiclass problems. The recently proposed M2GP (Multidimensional Multiclass Genetic Programming) algorithm achieved promising results in this area by evolving mappings of the $p$-dimensional data into a $d$-dimensional space and applying a minimum Mahalanobis distance classifier. Despite good performance, M2GP employs a greedy strategy to set the number of dimensions $d$ for the transformed data and fixes it at the start of the search, an approach that is prone to locally optimal solutions. This work presents the M3GP algorithm, which stands for M2GP with multidimensional populations. M3GP extends M2GP by allowing the search process to progressively search for the optimal number of new dimensions $d$ that maximizes the classification accuracy. Experimental results show that M3GP can automatically determine a good value for $d$ depending on the problem, and achieves excellent performance when compared to state-of-the-art methods like Random Forests, Random Subspaces and Multilayer Perceptron on several benchmark and real-world problems.
By
García-Valdez, Mario; Trujillo, Leonardo; Vega, Francisco Fernández; Merelo Guervós, Juan Julián; Olague, Gustavo
7 Citations
Currently, a large number of computing systems and user applications are focused on distributed and collaborative models for heterogeneous devices, exploiting cloud-based approaches and social networking. However, such systems have not been fully exploited by the evolutionary computation community. This work is an attempt to bridge this gap by combining interactive evolutionary computation with a distributed cloud-based approach that integrates with social networking for the collaborative design of artistic artifacts. Such an approach to evolutionary art could fully leverage the concept of memes as an idea that spreads from person to person within a computational system. In particular, this work presents EvoSpace-Interactive, an open source framework for the development of collaborative-interactive evolutionary algorithms, a computational tool that facilitates the development of interactive algorithms for artistic design. A proof-of-concept application called Shapes is developed on EvoSpace-Interactive; it incorporates the popular social network Facebook for the collaborative evolution of artistic images generated using the Processing programming language. Initial results are encouraging: Shapes illustrates that it is possible to use EvoSpace-Interactive to effectively develop and deploy a collaborative system.
By
López, Uriel; Trujillo, Leonardo; Legrand, Pierrick
Outliers are one of the most difficult issues when dealing with real-world modeling tasks. Even a small percentage of outliers can impede a learning algorithm’s ability to fit a dataset. While robust regression algorithms exist, they fail when more than 50% of a dataset consists of outliers (the breakdown point). In the case of Genetic Programming, robust regression has not been properly studied. In this paper we present a method that works as a filter, removing outliers from the target variable (vertical outliers). The algorithm is simple: it uses a randomly generated population of GP trees to determine which target values should be labeled as outliers. The method is highly efficient. Results show that it can return a clean dataset when contamination reaches as high as 90%, and may be able to handle higher levels of contamination. In this study only synthetic univariate benchmarks are used to evaluate the approach, but it must be stressed that no other approaches can deal with such high levels of outlier contamination while requiring such small computational effort.
By
Trujillo, Leonardo; Z-Flores, Emigdio; Juárez-Smith, Perla S.; Legrand, Pierrick; Silva, Sara; Castelli, Mauro; Vanneschi, Leonardo; Schütze, Oliver; Muñoz, Luis
There are two important limitations of standard tree-based genetic programming (GP). First, GP tends to evolve unnecessarily large programs, a phenomenon referred to as bloat. Second, GP uses inefficient search operators that focus on modifying program syntax. The first problem has been studied extensively, with many works proposing bloat control methods. Regarding the second problem, one approach is to use alternative search operators, for instance geometric semantic operators, to improve convergence. In this work, our goal is to experimentally show that both problems can be effectively addressed by incorporating a local search optimizer as an additional search operator. Using real-world problems, we show that this rather simple strategy can improve the convergence and performance of tree-based GP, while also reducing program size. Given these results, a question arises: Why are local search strategies so uncommon in GP? A small survey of popular GP libraries suggests to us that local search is underused in GP systems. We conclude by outlining plausible answers to this question and highlighting future work.
By
García-Valdez, Mario; Trujillo, Leonardo; Fernández de Vega, Francisco; Merelo Guervós, Juan J.; Olague, Gustavo
4 Citations
This paper presents EvoSpace, a Cloud service for the development of distributed evolutionary algorithms. EvoSpace is based on the tuple space model, an associatively addressed memory space shared by several processes. Remote clients, called EvoWorkers, connect to EvoSpace and periodically take a subset of individuals from the global population, perform evolutionary operations on them, and return a set of new individuals. Several EvoWorkers carry out the evolutionary search in parallel and asynchronously, interacting with each other through the central repository. EvoSpace is designed to be domain independent and flexible, in the sense that it can be used with different types of evolutionary algorithms and applications. In this paper, a genetic algorithm is tested on the EvoSpace platform using a well-known benchmark problem, achieving promising results compared to a standard evolutionary system.
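The take, evolve and return cycle described above can be sketched with an in-memory pool. This is not the EvoSpace API: `PopulationPool`, `take`, `put`, `worker_step`, the OneMax fitness function and all parameter values are hypothetical names and choices made for illustration, and the workers run sequentially here rather than in parallel.

```python
import random


class PopulationPool:
    """In-memory stand-in for a shared population store (hypothetical API)."""

    def __init__(self, individuals):
        self._pool = list(individuals)

    def take(self, n, rng):
        """Remove and return up to n randomly chosen individuals."""
        n = min(n, len(self._pool))
        rng.shuffle(self._pool)
        sample, self._pool = self._pool[:n], self._pool[n:]
        return sample

    def put(self, individuals):
        """Return individuals to the shared pool."""
        self._pool.extend(individuals)

    def best(self, fitness):
        return max(self._pool, key=fitness)

    def __len__(self):
        return len(self._pool)


def one_max(bits):
    """Assumed benchmark: count the ones in a bit string."""
    return sum(bits)


def worker_step(pool, rng, sample_size=4, mutation_rate=0.1):
    """One worker cycle: take a sample, mutate it, return the survivors."""
    sample = pool.take(sample_size, rng)
    offspring = [
        [1 - bit if rng.random() < mutation_rate else bit for bit in ind]
        for ind in sample
    ]
    # Keep the better of each parent/offspring pair, so the pool size
    # stays constant and the best fitness never decreases.
    survivors = [max(p, c, key=one_max) for p, c in zip(sample, offspring)]
    pool.put(survivors)


rng = random.Random(0)
pool = PopulationPool([[rng.randint(0, 1) for _ in range(16)] for _ in range(20)])
start_best = one_max(pool.best(one_max))
for _ in range(200):  # sequential stand-in for asynchronous workers
    worker_step(pool, rng)
print(start_best, one_max(pool.best(one_max)))
```

Because each worker only touches the individuals it has taken, several such loops could run against one shared store without coordinating with each other, which is the property the tuple space model provides.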
By
Trujillo, Leonardo; Muñoz, Luis; Naredo, Enrique; Martínez, Yuliana
2 Citations
The Operator Equalization (OE) family of bloat control methods has achieved promising results in many domains. In particular, the Flat-OE method, which promotes a flat distribution of program sizes, is one of the simplest OE methods and achieves some of the best results. However, Flat-OE, like all OE variants, can be computationally expensive. This work proposes a simplified strategy for bloat control based on Flat-OE. In particular, bloat is studied in the NeuroEvolution of Augmenting Topologies (NEAT) algorithm. NEAT includes a very simple diversity preservation technique based on speciation and fitness sharing, and it is hypothesized that, with some minor tuning, speciation in NEAT can promote a flat distribution of program size. Results indicate that this is the case in two benchmark problems, in accordance with results for Flat-OE. In conclusion, NEAT provides a worthwhile strategy for effective and simple bloat control that could be extrapolated to other GP systems.
By
Trujillo, Leonardo; Olague, Gustavo
3 Citations
This work presents scale invariant region detectors that apply evolved operators to extract an interest measure. We evaluate operators using their repeatability rate, and have experimentally identified a plateau of local optima within a space of possible interest operators Ω. The space Ω contains operators constructed with Gaussian derivatives and standard arithmetic operations. From this set of local extrema, we have chosen two operators, obtained by searching within Ω using Genetic Programming, that are optimized for high repeatability and global separability when imaging conditions are modified by a known transformation. Then, by embedding the operators into the linear scale space generated with a Gaussian kernel, we can characterize scale invariant features by detecting extrema within the scale space response of each operator. Our scale invariant region detectors exhibit high performance when compared with state-of-the-art techniques on standard tests.
By
Silva, Sara; Muñoz, Luis; Trujillo, Leonardo; Ingalalli, Vijay; Castelli, Mauro; Vanneschi, Leonardo
1 Citation
Classification is one of the most important machine learning tasks in science and engineering. However, it can be a difficult task, in particular when a high number of classes is involved. Genetic Programming, despite its recognized success in many different domains, is one of the machine learning methods that typically struggle, and often fail, to provide accurate solutions for multiclass classification problems. We present a novel algorithm for tree-based GP that incorporates some ideas on the representation of the solution space in higher dimensions, and can be generalized to other types of GP. We test three variants of this new approach on a large set of benchmark problems from several different sources, and observe their competitiveness against the most successful state-of-the-art classifiers like Random Forests, Random Subspaces and Multilayer Perceptron.
By
Trujillo, Leonardo; Muñoz, Luis; López, Uriel; Hernández, Daniel E.
In the era of Deep Learning and Big Data, the place of Genetic Programming (GP) within the Machine Learning area seems difficult to define. Whether it is due to technical constraints or conceptual barriers, GP is currently not a paradigm of choice for the development of state-of-the-art machine learning systems. Nonetheless, there are important features of the GP approach that make it unique and should continue to be actively explored and studied. In this work we focus on two aspects of GP that have previously received little or no attention, particularly in tree-based GP for symbolic regression. The first is the potential of GP to perform transfer learning, where solutions evolved for one problem are transferred to another. The second is the potential of GP individuals to detect the true underlying structure of an input dataset and detect anomalies in the input data, known as outliers. This work presents initial results on both issues, with the goal of fostering discussion and showing that there is still untapped potential in the GP paradigm.
By
Trujillo, Leonardo; Olague, Gustavo; Fernández de Vega, Francisco; Lutton, Evelyne
The basic problem for a mobile vision system is determining where it is located within the world. In this paper, a recognition system is presented that is capable of identifying known places such as rooms and corridors. The system relies on a bag of features approach using locally prominent image regions. Real-world locations are modeled using a mixture of Gaussians representation, thus allowing for a multimodal scene characterization. Local regions are represented by a set of 108 statistical descriptors computed from different modes of information. From this set the system needs to determine which subset of descriptors captures regularities between image regions of the same location, and also discriminates between regions of different places. A genetic algorithm is used to solve this selection task, using a fitness measure that promotes: 1) a high classification accuracy; 2) the selection of a minimal subset of descriptors; and 3) a high separation among place models. The approach is tested on two real-world examples: a) a sequence of still images with 4 different locations; and b) a sequence that contains 8 different locations. Results confirm the ability of the system to identify previously seen places in a real-world setting.
By
Naredo, Enrique; Dunn, Enrique; Trujillo, Leonardo
Stereo vision is one of the most active research areas in modern computer vision. The objective is to recover 3D depth information from a pair of 2D images that capture the same scene. This paper addresses the problem of dense stereo correspondence, where the goal is to determine which image pixels in both images are projections of the same 3D point from the observed scene. The proposal in this work is to build a nonlinear operator that combines three well-known methods to derive a correspondence measure that allows us to retrieve a better approximation of the ground truth disparity of a stereo image pair. To achieve this, the problem is posed as a search and optimization task and solved with genetic programming (GP), an evolutionary paradigm for automatic program induction. Experimental results on well-known benchmark problems show that the combined correspondence measure produced by GP outperforms each standard method, based on the mean error and the percentage of bad pixels. In conclusion, this paper shows that GP can be used to build composite correspondence algorithms that exhibit strong performance on standard tests.
By
Urbano, Paulo; Naredo, Enrique; Trujillo, Leonardo
2 Citations
Recent research on evolutionary algorithms has begun to focus on the issue of generalization. While most works emphasize the evolution of high-quality solutions for particular problem instances, others address the issue of evolving solutions that can generalize to different scenarios, which is also the focus of the present paper. In particular, this paper compares fitness-based search, Novelty Search (NS), and random search in a set of generalization-oriented experiments in a maze navigation problem using Grammatical Evolution (GE), a variant of Genetic Programming. Experimental results suggest that NS outperforms the other search methods in terms of evolving general navigation behaviors that are able to cope with different initial conditions within a static deceptive maze.
By
Trujillo, Leonardo; Martínez, Yuliana; Melin, Patricia
1 Citation
A fundamental task that must be addressed before classifying a set of data is that of choosing the proper classification method. In other words, a researcher must infer which classifier will achieve the best performance on the classification problem in order to make a reasoned choice. This task is not trivial, and it is mostly resolved based on personal experience and individual preferences. This paper presents a methodological approach to produce estimators of classifier performance, based on descriptive measures of the problem data. The proposal is to use Genetic Programming (GP) to evolve mathematical operators that take as input descriptors of the problem data, and output the expected error that a particular classifier might achieve if it is used to classify the data. Experimental tests on a large set of 500 two-class problems of multimodal data, using a neural network for classification, show that GP can produce accurate estimators of classifier performance. The results suggest that the GP approach could provide a tool that helps researchers make a reasoned decision regarding the applicability of a classifier to a particular problem.
By
Chávez, Francisco; Fernández, Francisco; Benavides, César; Lanza, Daniel; Villegas, Juan; Trujillo, Leonardo; Olague, Gustavo; Román, Graciela
2 Citations
This paper describes initial steps towards allowing Evolutionary Algorithms (EAs) researchers to easily deploy computing-intensive runs of EAs on Big Data infrastructures. Although many proposals have already been described in the literature, and a number of new software tools have been implemented embodying parallel versions of EAs, we present here a different approach. Given the traditional resistance to change when adopting new software, we instead try to endow the well-known ECJ tool with the MapReduce model. By using the Hadoop framework, we introduce changes in ECJ that allow researchers to launch any EA problem on a big data infrastructure in the same way as when a single computer is used to run the algorithm. By means of a new parameter, researchers can choose where the run will be launched, whether on a Hadoop-based infrastructure or on a desktop computer. This paper shows the tests performed and how the whole system has been tuned to optimize the running time for ECJ experiments. Finally, a real-world problem is used to show how the MapReduce model can automatically deploy the tasks generated by ECJ without additional intervention.
By
López, Uriel; Trujillo, Leonardo; Martinez, Yuliana; Legrand, Pierrick; Naredo, Enrique; Silva, Sara
2 Citations
Genetic programming (GP) has been shown to be a powerful tool for automatic modeling and program induction. It is often used to solve difficult symbolic regression tasks, with many examples in real-world domains. However, the robustness of GP-based approaches has not been substantially studied. In particular, the present work deals with the issue of outliers, data in the training set that represent severe errors in the measuring process. In general, a datum is considered an outlier when it sharply deviates from the true behavior of the system of interest. GP practitioners know that such data points usually bias the search and produce inaccurate models. Therefore, this work presents a hybrid methodology based on the RAndom SAmpling Consensus (RANSAC) algorithm and GP, which we call RANSAC-GP. RANSAC is an approach to deal with outliers in parameter estimation problems, widely used in computer vision and related fields. This work presents the first application of RANSAC to symbolic regression with GP, with impressive results. The proposed algorithm is able to deal with extreme amounts of contamination in the training set, evolving highly accurate models even when the proportion of outliers reaches 90%.
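The RANSAC loop itself is easy to illustrate. The sketch below swaps the GP model builder for a plain two-point line fit, so it shows only the sample-and-consensus idea and not RANSAC-GP; the function names, the tolerance, the iteration count and the synthetic data are all assumptions made for the example.

```python
import random


def fit_line(p, q):
    """Return slope and intercept of the line through two points."""
    (x1, y1), (x2, y2) = p, q
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def ransac_line(points, iterations=200, tolerance=0.5, seed=0):
    """Keep the candidate line supported by the largest consensus set."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        p, q = rng.sample(points, 2)  # minimal sample for a line
        if p[0] == q[0]:
            continue  # a vertical pair cannot define y = slope * x + intercept
        slope, intercept = fit_line(p, q)
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= tolerance]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (slope, intercept), inliers
    return best_model, best_inliers


# Synthetic data: ten points on y = 2x + 1 plus five gross vertical outliers.
clean = [(float(x), 2.0 * x + 1.0) for x in range(10)]
outliers = [(1.0, 40.0), (3.0, -25.0), (5.0, 60.0), (7.0, -30.0), (8.0, 90.0)]
model, inliers = ransac_line(clean + outliers)
print(model, len(inliers))
```

In a hybrid such as the one the abstract describes, the model-fitting step would presumably be a GP run on the sampled subset, which makes each iteration far more expensive than this two-point fit.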
By
Naredo, Enrique; Trujillo, Leonardo; Martínez, Yuliana
4 Citations
Natural evolution is an open-ended search process without an a priori fitness function that needs to be optimized. On the other hand, evolutionary algorithms (EAs) rely on a clear and quantitative objective. The Novelty Search algorithm (NS) substitutes fitness-based selection with a novelty criterion; i.e., individuals are chosen based on their uniqueness. To do so, individuals are described by the behaviors they exhibit, instead of their phenotype or genetic content. NS has mostly been used in evolutionary robotics, where the concept of behavioral space can be clearly defined. Instead, this work applies NS to a more general problem domain, classification. To this end, two behavioral descriptors are proposed, each describing a classifier’s performance from a different perspective. Experimental results show that NS-based search can be used to derive effective classifiers. In particular, NS is best suited to solve difficult problems, where exploration needs to be encouraged and maintained.
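A common way to score uniqueness in Novelty Search is the mean distance from an individual's behavior descriptor to its k nearest neighbours. The sketch below assumes that sparseness measure with Euclidean distance; the two-dimensional descriptors are made-up values, not the classifier descriptors proposed in the paper, and the archive of past behaviors that NS usually maintains is omitted for brevity.

```python
import math


def novelty(behavior, others, k=3):
    """Mean Euclidean distance from `behavior` to its k nearest neighbours."""
    distances = sorted(math.dist(behavior, other) for other in others)
    nearest = distances[:k]
    return sum(nearest) / len(nearest) if nearest else 0.0


def novelty_scores(behaviors, k=3):
    """Score every individual against the rest of the population."""
    return [
        novelty(b, behaviors[:i] + behaviors[i + 1:], k)
        for i, b in enumerate(behaviors)
    ]


# Hypothetical behavior descriptors: four similar individuals and one
# that behaves very differently.
behaviors = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
scores = novelty_scores(behaviors, k=2)
print(scores)
```

Selection then favors the individuals with the highest scores, so the isolated individual at `(5.0, 5.0)` would be preferred here regardless of how well it solves the task.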
By
Sotelo, Arturo; Guijarro, Enrique; Trujillo, Leonardo; Coria, Luis; Martínez, Yuliana
Epilepsy is a widespread disorder that affects many individuals worldwide. For this reason, much work has been done to develop computational systems that can facilitate the analysis and interpretation of the signals generated by a patient’s brain during the onset of an epileptic seizure. Currently, this is done by human experts, since computational methods cannot achieve a similar level of performance. This paper presents a Genetic Programming (GP) based approach to analyze brain activity captured with Electrocorticogram (ECoG). The goal is to evolve classifiers that can detect the three main stages of an epileptic seizure. Experimental results show good performance by the GP classifiers, evaluated based on sensitivity, specificity, prevalence and likelihood ratio. The results are unique within this domain, and could become a useful tool in the development of future treatment methods.
By
Galván-López, Edgar; Vázquez-Mendoza, Lucia; Trujillo, Leonardo
1 Citation
Data sets with imbalanced class distribution pose serious challenges to well-established classifiers. In this work, we propose a stochastic multi-objective genetic programming approach based on semantics. We tested this approach on imbalanced binary classification data sets, where it is able to achieve, in some cases, higher recall, precision and F-measure values on the minority class compared to C4.5, Naive Bayes and Support Vector Machine, without significantly decreasing these values on the majority class.
By
Dibene, Juan Carlos; Picos, Kenia; Díaz-Ramírez, Victor H.; Trujillo, Leonardo
Object recognition is a widely studied problem in computer vision. Template matching with correlation filters is one of the most accurate strategies for target recognition. However, it is computationally expensive, particularly when there is no restriction on the pose of the object of interest and an exhaustive search is implemented. This work proposes the use of a Covariance Matrix Adaptation Evolution Strategy (CMA-ES) for postprocessing template matched filters. The proposed strategy searches for the best template match guided by the discrimination capability of a correlation-based filter, considering a vast set of filters. CMA-ES is used to find the best match and determine the correct pose or orientation parameters of a target object. The proposed method demonstrates that CMA-ES is effective for multidimensional problems in a huge search space, which makes it a suitable candidate for target recognition in unconstrained applications. Experimental results show high efficiency in terms of the number of function evaluations and locating the correct pose parameters based on the DC measure.
By
Trujillo, Leonardo; Silva, Sara; Legrand, Pierrick; Vanneschi, Leonardo
2 Citations
Recently, it has been stated that the complexity of a solution is a good indicator of the amount of overfitting it incurs. However, measuring the complexity of a program in Genetic Programming is not a trivial task. In this paper, we study functional complexity and how it relates to overfitting on symbolic regression problems. We consider two measures of complexity: Slope-based Functional Complexity, inspired by the concept of curvature, and Regularity-based Functional Complexity, based on the concept of Hölderian regularity. In general, both complexity measures appear to be poor indicators of program overfitting. However, results suggest that Regularity-based Functional Complexity could provide a good indication of overfitting in extreme cases.
By
Trujillo, Leonardo; Olague, Gustavo; Lutton, Evelyne; Fernández de Vega, Francisco
7 Citations
This contribution studies speciation from the standpoint of evolutionary robotics (ER). A common approach to ER is to design a robot’s control system using neuroevolution during training. An extension to this methodology is presented here, where speciation is incorporated into the evolution process in order to obtain a varied set of solutions for a robotics problem using a single algorithmic run. Although speciation is common in evolutionary computation, it has been less explored in behavior-based robotics. When employed, speciation usually relies on a distance measure that allows different individuals to be compared. The distance measure is normally computed in objective or phenotypic space. However, the speciation process presented here is intended to produce several distinct robot behaviors; hence, speciation is sought in behavioral space. Accordingly, individual neurocontrollers are described using behavior signatures, which represent the traversed path of the robot within the training environment and are encoded using a character string. With this representation, behavior signatures are compared using the normalized Levenshtein distance metric (NGLD). Results indicate that speciation in behavioral space does indeed allow the ER system to obtain several navigation strategies for a common experimental setup. This is illustrated by comparing the best individual from each species with those obtained using the NeuroEvolution of Augmenting Topologies (NEAT) method, which speciates neural networks in topological space.
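Comparing behavior signatures as strings can be sketched with the classic Levenshtein edit distance. The normalization below, dividing by the length of the longer string, is one simple choice and may differ from the NGLD normalization the authors use; the path alphabet (`F`, `L`, `R`) and the sample signatures are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance with unit costs for insert, delete and substitute."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def normalized_levenshtein(a, b):
    """Scale the edit distance to [0, 1] by the longer string length."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


# Hypothetical path signatures: F = forward, L = left turn, R = right turn.
path_a = "FFLFFRFF"
path_b = "FFLFFLFF"
path_c = "RRRRLLLL"
print(normalized_levenshtein(path_a, path_b))
print(normalized_levenshtein(path_a, path_c))
```

Two controllers whose paths differ by a single turn end up close together, while a controller that follows an unrelated path is far from both, which is the kind of signal a speciation scheme needs to group individuals by behavior.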
By
Fernández de Vega, Francisco; Olague, Gustavo; Trujillo, Leonardo; Lombraña González, Daniel
4 Citations
Evolutionary algorithms (EAs) consume large amounts of computational resources, particularly when they are used to solve real-world problems that require complex fitness evaluations. Besides the lack of resources, scientists face another problem: the absence of the expertise required to adapt applications for parallel and distributed computing models. Moreover, the computing power of PCs is frequently underused at institutions, as desktops are usually devoted to administrative tasks. Therefore, this work proposes a framework that allows researchers to massively deploy EA experiments by exploiting the computing power of their institutions’ PCs, setting up a Desktop Grid System based on the BOINC middleware. This paper presents a new model for running unmodified applications within BOINC with a web-based centralized management system for available resources. Thanks to this proposal, researchers can run scientific applications without modifying the application’s source code, and at the same time manage thousands of computers from a single web page. In summary, this model allows the creation of on-demand customized execution environments within BOINC that can be used to harness unused computational resources for complex computational experiments, such as EAs. To show the performance of this model, a real-world application of Genetic Programming was tested on a centrally managed desktop grid infrastructure. Results show the feasibility of the approach, which has allowed researchers to generate new solutions by means of a distributed system that is easy to use and manage.
By
Martínez, Yuliana; Trujillo, Leonardo; Legrand, Pierrick; Galván-López, Edgar
2 Citations
The estimation of problem difficulty is an open issue in genetic programming (GP). The goal of this work is to generate models that predict the expected performance of a GP-based classifier when it is applied to an unseen task. Classification problems are described using domain-specific features, some of which are proposed in this work, and these features are given as input to the predictive models. These models are referred to as predictors of expected performance. We extend this approach by using an ensemble of specialized predictors (SPEP), dividing classification problems into groups and choosing the corresponding SPEP. The proposed predictors are trained using 2D synthetic classification problems with balanced datasets. The models are then used to predict the performance of the GP classifier on unseen real-world datasets that are multidimensional and imbalanced. This work is the first to provide a performance prediction of a GP system on test data, while previous works focused on predicting training performance. Accurate predictive models are generated by posing a symbolic regression task and solving it with GP. These results are achieved by using highly descriptive features and including a dimensionality reduction stage that simplifies the learning and testing process. The proposed approach could be extended to other classification algorithms and used as the basis of an expert system for algorithm selection.
By
Beltrán, Mónica; Melin, Patricia; Trujillo, Leonardo
1 Citation
This chapter describes a modular neural network (MNN) for the problem of signature recognition. Biometric identification has recently gained a great deal of research interest within the pattern recognition community. For instance, many attempts have been made to automate the process of identifying a person’s handwritten signature; however, this problem has proven to be very difficult. In this work, we propose an MNN that has three separate modules, each using different image features as input: edges, wavelet coefficients, and the Hough transform matrix. The outputs of these modules are then combined using a Sugeno fuzzy integral. The experimental results, obtained using a database of 30 individuals, show that the modular architecture can achieve a very high recognition accuracy of 98% on a test set of 150 images. Therefore, we conclude that the proposed architecture provides a suitable platform to build a signature recognition system.
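As a rough illustration of the fusion step, the following sketch computes a discrete Sugeno integral over the scores that three modules assign to one candidate class. The abstract does not state which fuzzy measure or score values were used, so the cardinality-based measure, the example scores, and all function names here are hypothetical.

```python
from typing import Callable, FrozenSet, Sequence


def sugeno_integral(scores: Sequence[float],
                    measure: Callable[[FrozenSet[int]], float]) -> float:
    """Discrete Sugeno integral of `scores` with respect to `measure`.

    scores[i] is the support in [0, 1] that source i gives to one class.
    measure(A) must be a fuzzy measure: monotone, 0 on the empty set and
    1 on the full set of sources.
    """
    # Visit the sources from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    best = 0.0
    coalition = set()
    for i in order:
        coalition.add(i)
        # The coalition holds every source scoring at least scores[i].
        best = max(best, min(scores[i], measure(frozenset(coalition))))
    return best


def cardinality_measure(n: int) -> Callable[[FrozenSet[int]], float]:
    """Hypothetical fuzzy measure that depends only on the subset size."""
    return lambda subset: (len(subset) / n) ** 2


# Hypothetical outputs of the edge, wavelet and Hough modules for one class.
module_scores = [0.9, 0.6, 0.3]
fused = sugeno_integral(module_scores, cardinality_measure(3))
print(f"fused support: {fused:.3f}")
```

In a classifier built this way, the integral would be evaluated once per candidate class and the class with the largest fused support would be selected; that decision rule is also an assumption, not something the abstract specifies.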
By
Olague, Gustavo; Trujillo, Leonardo
5 Citations
Recently, the detection of local image features has become an indispensable process for many image analysis and computer vision systems. In this chapter, we discuss how Genetic Programming (GP), a form of evolutionary search, can be used to automatically synthesize image operators that detect such features in digital images. The experimental results we review confirm that artificial evolution can produce solutions that outperform many man-made designs. Moreover, we argue that GP is able to discover and reuse small code fragments, or building blocks, that facilitate the synthesis of image operators for point detection. Another noteworthy result is that GP did not produce operators that rely on the autocorrelation matrix, a mathematical concept that some have considered to be the most appropriate for solving the point detection task. Hence, GP generates operators that are conceptually simple and can still achieve high performance on standard tests.
By
Legrand, Pierrick; Vézard, Laurent; Chavent, Marie; Faïta-Aïnseba, Frédérique; Trujillo, Leonardo
This chapter presents a method to automatically determine the alertness state of humans. Such a task is relevant in diverse domains where a person is expected or required to be in a particular state of alertness. For instance, pilots, security personnel, and medical personnel are expected to be in a highly alert state, and this method could help to confirm this or detect possible problems. In this work, electroencephalographic (EEG) data from 58 subjects in two distinct vigilance states (high and low alertness) was collected via a cap with 58 electrodes. Thus, a binary classification problem is considered. To apply the proposed approach in a real-world scenario, it is necessary to build a prediction method that requires only a small number of sensors (electrodes), minimizing the total cost and maintenance of the system while also reducing the time required to properly set up the EEG cap. The approach presented in this chapter applies a preprocessing method for EEG signals based on discrete wavelet decomposition (DWT) to extract the energy of each frequency in the signal. Then, a linear regression is performed on the energies of some of these frequencies, and the slope of this regression is retained. A genetic algorithm (GA) is used to optimize the selection of frequencies on which the regression is performed and to select the best recording electrode. Results show that the proposed strategy derives accurate predictive models of alertness.
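The preprocessing chain described in this abstract (wavelet energies, then the slope of a linear regression over selected frequencies) can be sketched as follows. This is only a minimal illustration under stated assumptions: the abstract does not name the wavelet, so a Haar wavelet is assumed, the energies are regressed against the decomposition level index, and every function name and the toy signal are hypothetical.

```python
import math
from typing import List, Sequence


def haar_level_energies(signal: Sequence[float], levels: int) -> List[float]:
    """Energy of the Haar detail coefficients at each decomposition level."""
    approx = list(signal)
    energies = []
    for _ in range(levels):
        if len(approx) < 2:
            break
        pairs = range(0, len(approx) - 1, 2)
        detail = [(approx[k] - approx[k + 1]) / math.sqrt(2) for k in pairs]
        approx = [(approx[k] + approx[k + 1]) / math.sqrt(2) for k in pairs]
        energies.append(sum(d * d for d in detail))
    return energies


def regression_slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Slope of the ordinary least-squares line fitted to (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def slope_feature(signal: Sequence[float], levels: int,
                  selected: Sequence[int]) -> float:
    """Regress the energies of the selected levels on their level index."""
    energies = haar_level_energies(signal, levels)
    xs = [float(j) for j in selected]
    ys = [energies[j] for j in selected]
    return regression_slope(xs, ys)


# Toy signal from one electrode: all of its energy sits in the finest level.
toy_signal = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
feature = slope_feature(toy_signal, levels=3, selected=[0, 1, 2])
print(f"slope feature: {feature:.3f}")
```

In the setting the abstract describes, a genetic algorithm would search over the `selected` levels and over the electrode that supplies the signal; that search is not shown here.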
By
Juárez-Smith, Perla; Trujillo, Leonardo; García-Valdez, Mario; Fernández de Vega, Francisco; Chávez, Francisco
This work presents a unique genetic programming (GP) approach that integrates a numerical local search method and a bloat-control mechanism to address some of the main issues with traditional GP. The former provides a directed search operator that works in conjunction with standard syntax operators, which perform more exploration in design space, while the latter controls code growth by maintaining program diversity through speciation. The system can produce highly parsimonious solutions, thus reducing the cost of performing the local optimization process. The proposal is extensively evaluated using real-world problems from diverse domains, and the behavior of the search is analyzed from several perspectives, including how species evolve, the effect of the local search process, and the interpretability of the results. Results show that the proposed approach compares favorably with a standard approach, and that the hybrid algorithm is a viable alternative for solving real-world symbolic regression problems.
By
García-Valdez, Mario; Trujillo, Leonardo; Merelo, Juan J.; Fernández de Vega, Francisco; Olague, Gustavo
14 Citations
This work presents the EvoSpace model for the development of pool-based evolutionary algorithms (PoolEA). Conceptually, the EvoSpace model is built around a central repository or population store, incorporating some of the principles of the tuple-space model and adding features to tackle some of the issues associated with PoolEAs, such as work redundancy, starvation of the population pool, unreliability of connected clients or workers, and a large parameter space. The model is intended as a platform to develop search algorithms that take an opportunistic approach to computing, allowing the exploitation of freely available services over the Internet or volunteer computing resources within a local network. A comprehensive analysis of the model at both the conceptual and implementation levels is provided, evaluating performance based on efficiency, optima found, and speedup, while providing a comparison with a standard EA and an island-based model. The issues of lost connections and system parametrization are studied and validated experimentally, with encouraging results that suggest how EvoSpace can be used to develop and implement different PoolEAs for search and optimization.
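The central-repository idea can be sketched with a toy in-memory pool: workers remove a sample, evolve it elsewhere, and return the result, while samples held by a lost worker are put back so the pool does not starve. This is only an illustration of the concept under simplifying assumptions; the class and method names are hypothetical and are not the EvoSpace API, and a real deployment would use a shared networked store rather than a local list.

```python
import random
from typing import Dict, List, Tuple


class PopulationPool:
    """Toy central population store for a pool-based EA (illustrative only)."""

    def __init__(self, individuals: List[float], seed: int = 0) -> None:
        self._pool = list(individuals)
        self._checked_out: Dict[int, List[float]] = {}
        self._next_ticket = 0
        self._rng = random.Random(seed)

    def __len__(self) -> int:
        return len(self._pool)

    def take_sample(self, size: int) -> Tuple[int, List[float]]:
        """Remove up to `size` random individuals and hand them to a worker."""
        size = min(size, len(self._pool))
        self._rng.shuffle(self._pool)
        sample, self._pool = self._pool[:size], self._pool[size:]
        ticket = self._next_ticket
        self._next_ticket += 1
        # Keep a copy so the sample can be restored if the worker vanishes.
        self._checked_out[ticket] = list(sample)
        return ticket, sample

    def put_back(self, ticket: int, evolved: List[float]) -> None:
        """Replace a checked-out sample with the worker's evolved individuals."""
        del self._checked_out[ticket]
        self._pool.extend(evolved)

    def reinsert_lost(self, ticket: int) -> None:
        """Restore the original sample of a worker that never reported back."""
        self._pool.extend(self._checked_out.pop(ticket))


pool = PopulationPool([float(i) for i in range(10)])

# Worker A takes four individuals, "evolves" them, and returns them.
ticket_a, sample_a = pool.take_sample(4)
pool.put_back(ticket_a, [x + 0.5 for x in sample_a])

# Worker B takes three individuals and disconnects; its sample is restored.
ticket_b, _ = pool.take_sample(3)
pool.reinsert_lost(ticket_b)

print(len(pool))
```

Because samples are removed while a worker holds them, two workers never evolve the same individuals at the same time, which is one simple way to limit the work redundancy mentioned in the abstract.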
