Showing 1 to 33 of 33 matching Articles
Results per page:
Export (CSV)
By
Paredes, Rodrigo; Chávez, Edgar; Figueroa, Karina; Navarro, Gonzalo
Show all (4)
15 Citations
Let
$\mathbb{U}$
be a set of elements and d a distance function defined among them. Let NN_{k}(u) be the k elements in
$\mathbb{U}\{u\}$
having the smallest distance to u. The knearest neighbor graph (knng) is a weighted directed graph
$G(\mathbb{U},E)$
such that E = {(u,v), v ∈ NN_{k}(u)}. Several knng construction algorithms are known, but they are not suitable to general metric spaces. We present a general methodology to construct knngs that exploits several features of metric spaces. Experiments suggest that it yields costs of the form c_{1}n^{1.27} distance computations for low and medium dimensional spaces, and c_{2}n^{1.90} for high dimensional ones.
more …
By
Paredes, Rodrigo; Chávez, Edgar
11 Citations
Proximity searching consists in retrieving from a database, objects that are close to a query. For this type of searching problem, the most general model is the metric space, where proximity is defined in terms of a distance function. A solution for this problem consists in building an offline index to quickly satisfy online queries. The ultimate goal is to use as few distance computations as possible to satisfy queries, since the distance is considered expensive to compute. Proximity searching is central to several applications, ranging from multimedia indexing and querying to data compression and clustering.
In this paper we present a new approach to solve the proximity searching problem. Our solution is based on indexing the database with the knearest neighbor graph (knng), which is a directed graph connecting each element to its k closest neighbors.
We present two search algorithms for both range and nearest neighbor queries which use navigational and metrical features of the knng graph. We show that our approach is competitive against current ones. For instance, in the document metric space our nearest neighbor search algorithms perform 30% more distance evaluations than AESA using only a 0.25% of its space requirement. In the same space, the pivotbased technique is completely useless.
more …
By
Chávez, Edgar; Dobrev, Stefan; Kranakis, Evangelos; Opatrny, Jaroslav; Stacho, Ladislav; Urrutia, Jorge
Show all (6)
7 Citations
We give an algorithm for constructing a connected spanning subgraphs(panner) of a wireless network modelled as a unit disk graph with nodes of irregular transmission ranges, whereby for some parameter 0 < r ≤ 1 the transmission range of a node includes the entire disk around the node of radius at least r and it does not include any node at distance more than one. The construction of a spanner is distributed and local in the sense that nodes use only information at their vicinity, moreover for a given integer k ≥ 2 each node needs only consider all the nodes at distance at most k hops from it. The resulting spanner has maximum degree at most 3 +
$\frac{6}{\pi r}$
+
$\frac{r+1}{r^{2}}$
, when 0 < r <1 (and at most five, when r = 1). Furthermore it is shown that the spanner is planar provided that the distance between any two nodes is at least
$\sqrt{1r^{2}}$
. If the spanner is planar then for k ≥ 2 the sum of the Euclidean lengths of the edges of the spanner is at most
$\frac{kr+1}{kr1}$
times the sum of the Euclidean lengths of the edges of a minimum weight Euclidean spanning tree.
more …
By
Tejeda, Héctor; Chávez, Edgar; Sanchez, Juan A.; Ruiz, Pedro M.
Show all (4)
6 Citations
Geographic routing protocols are one of the most common routing schemes for sensor networks. These protocols consist of two different modes of operation: greedy routing to forward data to the destination using neighbors which are closer to the destination than current node and face routing to avoid voids in the network. Face routing requires the graph to be planar, which usually means that some crossing links of the original network cannot be considered when routing in face mode. In this paper we introduce a new localized scheme to build a virtual spanner which is planar by construction and is guaranteed to be connected if the underlying network is connected as well. Unlike previous works, by performing face routing over this spanner we can reduce energy consumption in face mode because the elimination of any of the original links in the network is not required. Thus, the most energyefficient paths can be selected when the protocol enters face mode. The virtual spanner is easytobuild and uses only local information, making it scalable to largescale networks. Routing is always performed in real nodes; virtual nodes are used only as routing anchors when the agent is in face mode. In addition, our simulation results show that the proposed scheme outperforms the best energyefficient geographic routing protocol for different network densities and energy models.
more …
By
Chávez, Edgar; Navarro, Gonzalo
20 Citations
Range searches in metric spaces can be very difficult if the space is “high dimensional”, i.e. when the histogram of distances has a large mean and/or a small variance. This socalled “curse of dimensionality”, well known in vector spaces, is also observed in metric spaces. There are at least two reasons behind the curse of dimensionality: a large search radius and/or a high intrinsic dimension of the metric space. We present a general probabilistic framework based on stretching the triangle inequality, whose direct effect is a reduction of the effective search radius. The technique gets more effective as the dimension grows, and the basic principle can be applied to any search algorithm. In this paper we apply it to a particular class of indexing algorithms. We present an analysis which helps understand the process, as well as empirical evidence showing dramatic improvements in the search time at the cost of a very small error probability.
more …
By
BeltránMárquez, Jessica; Chávez, Edgar; Favela, Jesús
2 Citations
Automatic identification of activities can be used to provide information to caregivers of persons with dementia for identifying assistance needs. Environmental audio provides significant and representative information of the context, making microphones a choice to identify activities automatically. However, in real situations, the audio captured by microphones comes from overlapping sound sources, making its identification a challenge for audio analysis and retrieval. In this paper we propose a succinct representation of the signal by measuring the multiband spectral entropy of the signal frame by frame, followed by a cosine transform and binary codification, we call this the Cosine MultiBand Spectral Entropy Signature (CMBSES). To test our proposal, we created a database of a mixup of triples from a collection of nine environmental sounds in four different signaltonoise ratios (SNR). We codified both the original sounds and the triples and then searched all the original sounds in the mixup collection. To establish a ground truth we also tested the same database with 48 people of assorted ages. Our feature extraction outperforms the stateoftheart Mel Frequency Cepstral Coefficients (MFCC) and it also surpass humans in the experiment.
more …
By
Figueroa, Karina; Chávez, Edgar; Navarro, Gonzalo; Paredes, Rodrigo
Show all (4)
10 Citations
Proximity searching consists in retrieving from a database those elements that are similar to a query. As the distance is usually expensive to compute, the goal is to use as few distance computations as possible to satisfy queries. Indexes use precomputed distances among database elements to speed up queries. As such, a baseline is AESA, which stores all the distances among database objects, but has been unbeaten in query performance for 20 years. In this paper we show that it is possible to improve upon AESA by using a radically different method to select promising database elements to compare against the query. Our experiments show improvements of up to 75% in document databases. We also explore the usage of our method as a probabilistic algorithm that may lose relevant answers. On a database of faces where any exact algorithm must examine virtually all elements, our probabilistic version obtains 85% of the correct answers by scanning only 10% of the database.
more …
By
Santoyo, Francisco; Chávez, Edgar; Téllez, Eric S.
Some instances of multimedia data can be represented as high dimensional binary vectors under the hamming distance. The standard index used to handle queries is Locality Sensitive Hashing (LSH), reducing approximate queries to a set of exact searches. When the queries are not selective and multiple families of hashing functions are employed, or when the collection is large, LSH indexes should be stored in secondary memory, slowing down the query time.
In this paper we present a compressed LSH index, queryable without decompression and with negligible impact in query speed. This compressed representation enables larger collections to be handled in main memory with the corresponding speedup with respect to fetching data from secondary memory.
We tested the index with a real world example, indexing songs to detect near duplicates. Songs are represented using an entropy based audiofingerprint (AFP), of independent interest.
The combination of compressed LSH and the AFP enables the retrieval of lossy compressed audio with near perfect recall at bitrates as low as 32 kbps, packing the representation of 30+ million music tracks of standard length (which is about the total number of unique tracks of music available worldwide) in half a gigabyte of space. A sequential search for matches would take about 15 minutes; while using our compressed index, of size roughly one gigabyte, searching for a song would take a fraction of a second.
more …
By
Chávez, Edgar; Fraser, Maia; Tejeda, Héctor
1 Citations
For an adhoc network with n nodes, we propose a proactive routing protocol without routing tables which uses O(logn) bits per node for the location service tables. The algorithm is based on 1dimensional virtual coordinates, which we call labels. The decision of where to forward a packet is oblivious and purely local, depending only on the labels of the immediate neighbours and the label of the destination: the packet is forwarded to the neighbour whose label is closest to that of the destination. The algorithm is based on mapping the network to an ordered list where each node has one or more integer labels. This labeling can be produced by any arbitrary traversal of the network visiting all the nodes, in particular by a depthfirst search of a flood tree which gives a 2n length traversal. We show experimentally that, in terms of hop number, our routing algorithm is far superior to geographic protocols in randomly generated networks and for sparse networks produces routes of length very close to those of the shortest path.
more …
By
Chávez, Edgar; Ludueña, Verónica; Reyes, Nora; Roggero, Patricia
Show all (4)
4 Citations
In this paper we present the Distal Spatial Approximation Tree (DiSAT), an algorithmic improvement of SAT. Our improvement increases the discarding power of the SAT by selecting distal nodes instead of the proximal nodes proposed in the original paper. Our approach is parameter free and it was the most competitive in an extensive benchmarking, from two to forty times faster than the SAT, and faster than the List of Clusters (LC) which is considered the state of the art for main memory, linear sized indexes in the model of distance computations.
In summary, we obtained an index more resistant to the curse of dimensionality, establishing a new benchmark in performance, faster to build than the LC and with a small memory footprint. Our strategies can be used in any version of the SAT, either in main or secondary memory.
more …
By
Chávez, Edgar; Figueroa, Karina
4 Citations
A number of problems in computer science can be solved efficiently with the so called memory based or kernel methods. Among this problems (relevant to the AI community) are multimedia indexing, clustering, non supervised learning and recommendation systems. The common ground to this problems is satisfying proximity queries with an abstract metric database.
In this paper we introduce a new technique for making practical indexes for metric range queries. This technique improves existing algorithm based on pivots and signatures, and introduce a new data structure, the Fixed Queries Trie to speedup metric range queries. The result is an O(n) construction time index, with query complexity O(n^{α}), α≤ 1. The indexing algorithm uses only a few bits of storage for each database element.
more …
By
Amato, Giuseppe; Chávez, Edgar; Connor, Richard; Falchi, Fabrizio; Gennaro, Claudio; Vadicamo, Lucia
Show all (6)
In the realm of metric search, the permutationbased approaches have shown very good performance in indexing and supporting approximate search on large databases. These methods embed the metric objects into a permutation space where candidate results to a given query can be efficiently identified. Typically, to achieve high effectiveness, the permutationbased result set is refined by directly comparing each candidate object to the query one. Therefore, one drawback of these approaches is that the original dataset needs to be stored and then accessed during the refining step. We propose a refining approach based on a metric embedding, called nSimplex projection, that can be used on metric spaces meeting the npoint property. The nSimplex projection provides upper and lowerbounds of the actual distance, derived using the distances between the data objects and a finite set of pivots. We propose to reuse the distances computed for building the data permutations to derive these bounds and we show how to use them to improve the permutationbased results. Our approach is particularly advantageous for all the cases in which the traditional refining step is too costly, e.g. very large dataset or very expensive metric function.
more …
By
Tejeda, Héctor; Chávez, Edgar; Sanchez, Juan A.; Ruiz, Pedro M.
Show all (4)
2 Citations
Geographic routing for ad hoc and sensor networks has gained a lot of momentum during the last few years. In this scheme routes are created locally by each individual node, just based on the position of the destination and its local neighbors. To do that, a node selects its best neighbor (according to some metric) out of those being closer than itself to the destination. This operation is called greedy mode. When a node has no such neighbors, it enters into face routing mode. However, for face routing to work properly, the underlying graph needs to be planarized by removing crossing edges, which may eventually be very good from the routing metric pointofview. In this paper, we introduce a new localized scheme to build a planar virtual spanner in a simple and efficient way, with low control overhead. The produced virtual spanner allows face routing to be executed, without the need to remove any of the original links in the network. Thus, the best links according to the routing metric can still be used, Our simulation results show that by performing face routing over the virtual spanner, we manage to enhance the routing performance both for greedyfacegreedy routing and face routing between a 40 to 60% compared to existing planarity tests.
more …
By
Chávez, Edgar; Figueroa, Karina
A number of problems in computer science can be solved efficiently with the so called memory based or kernel methods. Among this problems (relevant to the AI community) are multimedia indexing, clustering, non supervised learning and recommendation systems. The common ground to this problems is satisfying proximity queries with an abstract metric database.
In this paper we introduce a new technique for making practical indexes for metric range queries. This technique improves existing algorithm based on pivots and signatures, and introduce a new data structure, the Fixed Queries Trie to speedup metric range queries. The result is an O(n) construction time index, with query complexity O(n^{α}), α≤ 1. The indexing algorithm uses only a few bits of storage for each database element.
more …
By
Ruiz, Guillermo; Chávez, Edgar; Graff, Mario; Téllez, Eric S.
Show all (4)
1 Citations
Proximity searching can be formulated as an optimization problem, being the goal function to find the object minimizing the distance to a given query by traversing a graph with a greedy algorithm. This formulation can be traced back to early formulations defined for vector spaces, and other recent approaches defined for the more general setup of metric spaces.
In this paper we introduce three searching algorithms generalizing to local search other than greedy, and experimentally prove that our approach improves significantly the state of the art. In particular, our contributions have excellent tradeoffs among speed, recall and memory usage; making our algorithms suitable for real world applications. As a byproduct, we present an open source implementation of most of the near neighbor search algorithms in the literature.
more …
By
Chávez, Edgar; Sadit Tellez, Eric
2 Citations
Nearest neighbor queries can be satisfied, in principle, with a greedy algorithm under a proximity graph. Each object in the database is represented by a node, and proximal nodes in this graph will share an edge. To find the nearest neighbor the idea is quite simple, we start in a random node and get iteratively closer to the nearest neighbor following only adjacent edges in the proximity graph. Every reachable node from current vertex is reviewed, and only the closertothequery node is expanded in the next round. The algorithm stops when none of the neighbors of the current node is closer to the query. The number of revised objects will be proportional to the diameter of the graph times the average degree of the nodes. Unfortunately the degree of a proximity graph is unbounded for a general metric space [1], and hence the number of inspected objects can be linear on the size of the database, which is the same as no indexing at all.
In this paper we introduce a quasiproximity graph induced by the allknearest neighbor graph. The degree of the above graph is bounded but we will face local minima when running the above greedy algorithm, which boils down to have false positives in the queries.
We show experimental results for high dimensional spaces. We report a recall greater than 90% for most configurations, which is very good for many proximity searching applications, reviewing just a tiny portion of the database.
The space requirement for the index is linear on the database size, and the construction time is quadratic in worst case. Relaxations of our method are sketched to obtain practical subquadratic implementations.
more …
By
Chávez, Edgar; Mitton, Nathalie; Tejeda, Héctor
9 Citations
Sensor networks are wireless adhoc networks where all the nodes cooperate for routing messages in the absence of a fixed infrastructure. Nonflooding, guaranteed delivery routing protocols are preferred because sensor networks have limited battery life. Location aware routing protocols are good candidates for sensor network applications, nevertheless they need either an external location service like GPS or Galileo (which are bulky, energy consuming devices) or internal location services providing nonunique virtual coordinates leading to low delivery rates. In this paper we introduce Position Trees a collision free, distributed labeling algorithm based on hop counting, which embed a spanning tree of the underlying network . The Routing with Position Trees (RTP) is a guaranteed delivery, nonflooding, efficient implicit routing protocol based on Position Trees. We study experimentally the statistical properties of memory requirements and the routing efficiency of the RPT.
more …
By
SotoMendoza, Valeria; Beltrán, Jessica; Chávez, Edgar; Hernández, Jehú; GarcíaMacías, J. Antonio
Show all (5)
1 Citations
The automatic detection of behavioral changes in older adults living in geriatric centers is relevant for physicians and caregivers. These changes could indicate an incipient symptom of a disease or a steep decline in the health of the person. Abnormal pattern discovery has been studied in the context of an array of (wearable) sensors (i.e. accelerometers, infrared, cameras, etc.) dedicated to monitor the older adult. In this work we explore the use of manually annotated records, the type of records maintained by caregivers in a daily log. These annotations have low semantic value, and consist in a sequence of keywords about the activity being carried out by the older adult. This information is often overseen because it could be noisy, incomplete and redundant. We tested a datadriven approach to identify patterns from daily activity records, which were collected over six months from a group of older adults in a geriatric center. The results show that through simple data processing techniques it is posible to identify abnormal patterns in daily activities associated with behavioral changes over time.
more …
By
Navarro, Gonzalo; Paredes, Rodrigo; Chávez, Edgar
10 Citations
A tspanner, a subgraph that approximates graph distances within a precision factor t, is a well known concept in graph theory. In this paper we use it in a novel way, namely as a data structure for searching metric spaces. The key idea is to consider the tspanner as an approximation of the complete graph of distances among the objects, and use it as a compact device to simulate the large matrix of distances required by successful search algorithms like AESA [Vidal 1986]. The tspanner provides a timespace tradeoff where full AESA is just one extreme. We show that the resulting algorithm is competitive against current approaches, e.g., 1.5 times the time cost of AESA using only 3.21% of its space requirement, in a metric space of strings; and 1.09 times the time cost of AESA using only 3.83 % of its space requirement, in a metric space of documents. We also show that tspanners provide better spacetime tradeoffs than classical alternatives such as pivotbased indexes. Furthermore, we show that the concept of tspanners has potential for large improvements.
more …
By
Miranda, Natalia; Chávez, Edgar; Piccoli, María Fabiana; Reyes, Nora
Show all (4)
5 Citations
Proximity queries consists in retrieving objects near a given query. To avoid a brute force scan over a large database, an index can be used. However, for some problems, indexes are mostly useless (their running times are worst than sequential scan).
On the other hand, researchers have tried massively parallel hardware (as GPGPU) in the quest of faster query times. The results have been modest because current algorithms are cumbersome, while GPGPU architectures favor simple kernels, have a clear memory hierarchy and need close to zero crosstalk between processing units. We have engineered very fast algorithms for proximity queries taking into account this principles, all of them are presented in this paper.
In our approach no index is built, the crosstalk between threads is eliminated, and the higher (faster) levels of memory hierarchy are consistently used. The absence of data structures allows to use all the available memory for the database, and furthermore makes possible to do stream processing on very large data collections.
more …
By
Beltrán, Jessica; Navarro, René; Chávez, Edgar; Favela, Jesús; SotoMendoza, Valeria; Ibarra, Catalina
Show all (6)
3 Citations
People suffering from dementia exhibit abnormal behaviors that can put them at risk or burden their relatives and caregivers. Many of these behaviors have acoustic manifestations, such as shouting, mumbling, cursing or making repetitive tapping. In this paper we propose an approach for detecting disruptive behavior manifested through audio. The solution proposed is an specialized speech, verbal and ambient sound detector and classifier, using speech and CASA techniques. We illustrate how the detection of these symptoms can be used to enact nonpharmacological interventions aimed at stopping such behavior or mitigating its negative impact.
more …
By
Chávez, Edgar; Figueroa, Karina; Navarro, Gonzalo
11 Citations
Kernel based methods (such as knearest neighbors classifiers) for AI tasks translate the classification problem into a proximity search problem, in a space that is usually very high dimensional. Unfortunately, no proximity search algorithm does well in high dimensions. An alternative to overcome this problem is the use of approximate and probabilistic algorithms, which trade time for accuracy.
In this paper we present a new probabilistic proximity search algorithm. Its main idea is to order a set of samples based on their distance to each element. It turns out that the closeness between the order produced by an element and that produced by the query is an excellent predictor of the relevance of the element to answer the query.
The performance of our method is unparalleled. For example, for a full 128dimensional dataset, it is enough to review 10% of the database to obtain 90% of the answers, and to review less than 1% to get 80% of the correct answers. The result is more impressive if we realize that a full 128dimensional dataset may span thousands of dimensions of clustered data. Furthermore, the concept of proximity preserving order opens a totally new approach for both exact and approximated proximity searching.
more …
By
BaezaYates, Ricardo; Bustos, Benjamín; Chávez, Edgar; Herrera, Norma; Navarro, Gonzalo
Show all (5)
The concept of cluster is elusive. The direct implication of this is a large diversity of clustering techniques, each serving a particular definition. Most techniques focus on a global optimization function. The general procedure is to propose a clustering (using a suitable algorithm), then to measure the quality and amount of clusters, and repeat the procedure (proposing a new clustering structure, using for example new parameters) until satisfied.
more …
By
CamarenaIbarrola, Antonio; Chávez, Edgar
5 Citations
In this paper we address the problem of matching musical renditions of the same piece of music also known as performances. We use an entropy based AudioFingerprint delivering a framed, small footprint AFP which reduces the problem to a string matching problem. The Entropy AFP has very low resolution (750 ms per symbol), making it suitable for flexible string matching.
We show experimental results using dynamic time warping (DTW), Levenshtein or edit distance and the Longest Common Subsequence (LCS) distance. We are able to correctly (100%) identify different renditions of masterpieces as well as pop music in less than a second per comparison.
The three approaches are 100% effective, but LCS and Levenshtein can be computed online, making them suitable for monitoring applications (unlike DTW), and since they are distances a metric index could be use to speed up the recognition process.
more …
By
Sadit Tellez, Eric; Chávez, Edgar
2 Citations
One of the most efficient index for similarity search, to fix ideas think in speeding up knn searches in a very large database, is the so called list of clusters. This data structure is a counterintuitive construction which can be seen as extremely unbalanced, as opposed to balanced data structures for exact searching. In practical terms there is no better alternative for exact indexing, when every search return all the incumbent results; as opposed to approximate similarity search. The major drawback of the list of clusters is its quadratic time construction.
In this paper we revisit the list of clusters aiming at speeding up the construction time without sacrificing its efficiency. We obtain similar search times while gaining a significant amount of time in the construction phase.
more …
By
Téllez, Eric Sadit; Chávez, Edgar; CamarenaIbarrola, Antonio
7 Citations
Many pattern recognition tasks can be modeled as proximity searching. Here the common task is to quickly find all the elements close to a given query without sequentially scanning a very large database.
A recent shift in the searching paradigm has been established by using permutations instead of distances to predict proximity. Every object in the database record how the set of reference objects (the permutants) is seen, i.e. only the relative positions are used. When a query arrives the relative displacements in the permutants between the query and a particular object is measured. This approach turned out to be the most efficient and scalable, at the expense of loosing recall in the answers. The permutation of every object is represented with κ short integers in practice, producing bulky indexes of 16 κn bits.
In this paper we show how to represent the permutation as a binary vector, using just one bit for each permutant (instead of logκ in the plain representation). The Hamming distance in the binary signature is used then to predict proximity between objects in the database. We tested this approach with many real life metric databases obtaining faster queries with a recall close to the Spearman ρ using 16 times less space.
more …
By
Ruiz, Guillermo; Santoyo, Francisco; Chávez, Edgar; Figueroa, Karina; Tellez, Eric Sadit
Show all (5)
7 Citations
Pivot tables are popular for exact metric indexing. It is well known that a large pivot table produces faster indexes. The rule of thumb is to use as many pivots as the available memory allows for a given application. To further speedup searches, redundant pivots can be eliminated or the scope of the pivots (the number of database objects covered by a pivot) can be reduced.
In this paper, we apply a different technique to speedup searches. We assign objects to pivots while, at the same time, enforcing proper coverage of the database objects. This increases the discarding power of pivots and in turn leads to faster searches. The central idea is to select a set of essential pivots (without redundancy) covering the entire database. We call our technique extreme pivoting (EP).
A nice additional property of EP is that it balances performance and memory usage. For example; using the same amount of memory, EP is faster than the List of Clusters and the Spatial Approximation Tree. Moreover, EP is faster than LAESA even when it uses less memory.
The EP technique was formally modeled allowing performance prediction without an actual implementation. Performance and memory usage depend on two parameters of EP, which are characterized with a wide range of experiments. Also, we provide automatic selection of one parameter fixing the other. The formal model was empirically tested with real world and synthetic datasets finding high consistency between the predicted and the actual performance.
more …
By
Chávez, Edgar; Navarro, Gonzalo
13 Citations
We present a radically new indexing approach for approximate string matching. The scheme uses the metric properties of the edit distance and can be applied to any other metric between strings. We build a metric space where the sites are the nodes of the suffix tree of the text, and the approximate query is seen as a proximity query on that metric space. This permits us finding the R occurrences of a pattern of length m in a text of length n in average time O(mlog^{2}n+m^{2}+R), using O(n log n) space and O(n log^{2}n) index construction time. This complexity improves by far over all other previous methods. We also show a simpler scheme needing O(n) space.
more …
By
CamarenaIbarrola, Antonio; Chávez, Edgar; Tellez, Eric Sadit
4 Citations
Monitoring media broadcast content has deserved a lot of attention lately from both academy and industry due to the technical challenge involved and its economic importance (e.g. in advertising). The problem pose a unique challenge from the pattern recognition point of view because a very high recognition rate is needed under non ideal conditions. The problem consist in comparing a small audio sequence (the commercial ad) with a large audio stream (the broadcast) searching for matches.
In this paper we present a solution with the MultiBand Spectral Entropy Signature (MBSES) which is very robust to degradations commonly found on amplitude modulated (AM) radio. Using the MBSES we obtained perfect recall (all audio ads occurrences were accurately found with no false positives) in 95 hours of audio from five different am radio broadcasts. Our system is able to scan one hour of audio in 40 seconds if the audio is already fingerprinted (e.g. with a separated slave computer), and it totaled five minutes per hour including the fingerprint extraction using a single core off the shelf desktop computer with no parallelization.
more …
By
CamarenaIbarrola, Antonio; Chávez, Edgar
1 Citations
Real time tracking of musical performances allows for implementation of virtual teachers of musical instruments, automatic accompanying of musicians or singers, and automatic adding of special effects in live presentations.
State of the art approaches make a local alignment of the score (the target audio) and a musical performance, such procedure induce cumulative error since it assumes the rendition to be well tracked up to the current time. We propose searching for the knearest neighbors of the current audio segment among all audio segments of the score then use some heuristics to decide the current tracked position of the performance inside the score.
We tested the method with 62 songs, some pop music but mostly classical. For each song we have two performances, we use one of them as the score and the other one as the music to be tracked with excellent results.
more …
By
Chávez, Edgar; Marroquín, José L.; Navarro, Gonzalo
36 Citations
Pivotbased algorithms are effective tools for proximity searching in metric spaces. They allow trading space overhead for number of distance evaluations performed at query time. With additional search structures (that pose extra space overhead) they can also reduce the amount of side computations. We introduce a new data structure, the Fixed Queries Array (FQA), whose novelties are (1) it permits sublinear extra CPU time without any extra data structure; (2) it permits trading number of pivots for their precision so as to make better use of the available memory. We show experimentally that the FQA is an efficient tool to search in metric spaces and that it compares favorably against other state of the art approaches. Its simplicity converts it into a simple yet effective tool for practitioners seeking for a blackbox method to plug in their applications.
more …
By
Beltrán, Jessica; Navarro, René; Chávez, Edgar; Favela, Jesús; SotoMendoza, Valeria; Ibarra, Catalina
Show all (6)
1 Citations
Frequently, people with dementia exhibit abnormal behaviors that may cause selfinjury or burden their caregivers. Some audible manifestations of these problematic behaviors are of vocal nature (e.g., shouting, mumbling, or cursing), others are environmental sounds (e.g., tapping or slamming). The timely detection of these behaviors could enact nonpharmacological interventions which in turn can assist caregivers or prevent escalation of the disruption with other fellow residents in nursing homes. We conducted a field study in a geriatric residence to gather naturalistic data. With the participation of five residents for 203 h of observation and of the 242 incidents of problematic behaviors were registered, 85% of them had a distinctive auditory manifestation. We used a combination of standard speech detection techniques, along with a novel environmental sound recognition methodology based on the entropy of the signal. We conducted experiments using realistic data, i.e., audio immersed in natural background noise. Based on classification results with F1 score above 87%, we conclude that audible cues can be used to enact nonpharmacological interventions aimed at reducing problematic behaviors, or mitigating their negative impact.
more …
By
LuqueSuárez, Fernando; CamarenaIbarrola, Antonio; Chávez, Edgar
In voice recognition, the two main problems are speech recognition (what was said), and speaker recognition (who was speaking). The usual method for speaker recognition is to postulate a model where the speaker identity corresponds to the parameters of the model, which estimation could be timeconsuming when the number of candidate speakers is large. In this paper, we model the speaker as a high dimensional point cloud of entropybased features, extracted from the speech signal. The method allows indexing, and hence it can manage large databases. We experimentally assessed the quality of the identification with a publicly available database formed by extracting audio from a collection of YouTube videos of 1,000 different speakers. With 20 second audio excerpts, we were able to identify a speaker with 97% accuracy when the recording environment is not controlled, and with 99% accuracy for controlled recording environments.
more …
