Posts Tagged ‘IT consulting services’

Computer Network Routing with a Fuzzy Neural Network

As more individuals transmit data through a computer network, the quality of service received by the users begins to degrade. A major aspect of computer networks that is vital to quality of service is data routing. A more effective method for routing data through a computer network can assist with the new problems being encountered with today's growing networks. Effective routing algorithms use various techniques to determine the most appropriate route for transmitting data. Determining the best route through a wide area network (WAN) requires the routing algorithm to obtain information concerning all of the nodes, links, and devices present on the network. The most relevant routing information involves various measures that are often obtained in an imprecise or inaccurate manner, thus suggesting that fuzzy reasoning is a natural method to employ in an improved routing scheme. The neural network is deemed a suitable accompaniment because it maintains the ability to learn in dynamic situations.

Once the neural network is initially designed, any alterations in the computer routing environment can easily be learned by this adaptive artificial intelligence method. The capability to learn and adapt is essential in today's rapidly growing and changing computer networks. These two techniques, fuzzy reasoning and neural networks, when combined provide a very effective routing algorithm for computer networks. Computer simulation is employed to show that the new fuzzy routing algorithm outperforms the Shortest Path First (SPF) algorithm in most computer network situations. The benefits increase as the computer network migrates from a stable network to a more variable one. The advantages of applying this fuzzy routing algorithm are apparent when considering the dynamic nature of modern computer networks.

Applying artificial intelligence to specific areas of network management allows the network engineer to dedicate additional time and effort to the more specialized and intricate details of the system. Many forms of artificial intelligence have previously been introduced to network management; however, it appears that one of the more applicable areas, fuzzy reasoning, has been somewhat overlooked. Computer network managers are often challenged with decision-making based on vague or partial information. Similarly, computer networks frequently perform operational adjustments based on this same vague or partial information. The imprecise nature of this information can lead to difficulties and inaccuracies when automating network management using currently applied artificial intelligence techniques. Fuzzy reasoning allows this type of imprecise information to be dealt with in a precise and well-defined manner, providing a more robust method of automating the network management decision-making process.

The objective of this research is to explore the use of fuzzy reasoning in one area of network management, namely the routing aspect of configuration management. A more effective method for routing data through a computer network needs to be discovered to assist with the new problems being encountered on today's networks. Although traffic management is only one aspect of configuration management, at this time it is one of the most visible networking issues. This becomes apparent as consideration is given to the increasing number of network users and the tremendous growth driven by Internet-based multimedia applications. Because of the number of users and the distances between WAN users, efficient routing is more critical in wide area networks than in LANs (also, many LAN architectures such as token ring do not allow any flexibility in the nature of message passing). In order to determine the best route over the WAN, it is necessary to obtain information concerning all of the nodes, links, and LANs present in the wide area network. The most relevant routing information involves various measures regarding each link. These measures include the distance a message will travel, the bandwidth available for transmitting that message (maximum signal frequency), the packet size used to segment the message (size of the data group being sent), and the likelihood of a link failure. These are often measured in an imprecise or inaccurate manner, thus suggesting that fuzzy reasoning is a natural method to employ in an improved routing scheme.
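
To make this concrete, the sketch below (not the routing scheme developed in this research) shows one way such imprecise link measures could be fuzzified and combined into a single link score; the membership ranges, weights, and names are illustrative assumptions only.

def ramp_up(x, lo, hi):
    """Membership degree rising from 0 at lo to 1 at hi (clamped)."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

def link_badness(distance_km, bandwidth_mbps, failure_prob):
    """Crude fuzzy score in [0, 1]; higher means a less attractive link."""
    long_distance  = ramp_up(distance_km, 100.0, 2000.0)        # assumed scale
    low_bandwidth  = 1.0 - ramp_up(bandwidth_mbps, 1.0, 100.0)  # assumed scale
    likely_failure = ramp_up(failure_prob, 0.0, 0.2)            # assumed scale
    # One simple aggregation: a weighted mean of the three "bad link" degrees.
    return 0.3 * long_distance + 0.4 * low_bandwidth + 0.3 * likely_failure

# Compare two hypothetical links; the router would prefer the lower score.
print(link_badness(500, 10, 0.05), link_badness(1500, 2, 0.15))

A routing algorithm could feed such per-link scores into a shortest-path computation, and a neural network could, for example, adjust the membership ranges or weights as traffic conditions change, in the spirit of the adaptive component described here.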

Utilizing fuzzy reasoning should assist in expressing these imprecise network measures; however, there still remains the massive growth issue concerning traffic levels. Most routing algorithms currently being implemented as a means of transmitting data from a source node to a destination node cannot effectively handle this large traffic growth. Most network routing methods are designed to be efficient for a current network situation; therefore, when the network deviates from the original situation, the methods begin to lose efficiency. This suggests that an effective routing method should also be capable of learning how to successfully adapt to network growth. Neural networks are extremely capable of adapting to system changes, and thus will be applied as a second artificial intelligence technique to the proposed routing method in this research. The proposed routing approach incorporates fuzzy reasoning in order to prepare a more accurate assessment of the network’s traffic conditions, and hence provide a faster, more reliable, or more efficient route for data exchange. Neural networks will be incorporated into the routing method as a means for the routing method to adapt and learn how to successfully handle network traffic growth. The combination of these two tools is expected to produce a more effective routing method than is currently available.

In order to achieve the primary objective of more efficient routing, several minor objectives also need to be accomplished. A method of data collection is needed throughout the different phases of the study. Data collection will be accomplished through the use of simulation methods; therefore, a simulation model must be accurately designed before proceeding with experimentation or analysis. Additional requirements include building and training the neural network and defining the fuzzy system. The objective of this research is to demonstrate the effective applicability of fuzzy reasoning to only one area of network management, traffic routing.


Shortest Path First with Emergency Exits

Under heavy and dynamic traffic, the SPF routing algorithm often suffers from wild oscillation and severe congestion, and results in degradation of the network performance. Here we present a new routing algorithm (SPF-EE) which attempts to eliminate the problems associated with the SPF algorithm by providing alternate paths as emergency exits. With the SPF-EE algorithm, traffic is routed along the shortest paths under normal conditions. However, in the presence of congestion and resource failures, the traffic can be dispersed temporarily to alternate paths without route re-computation. Simulation experiments show that the SPF-EE algorithm achieves greater throughput, higher responsiveness, better congestion control and fault tolerance, and substantially improves the performance of routing in a dynamic environment.

A distributed routing algorithm can be decomposed into four procedures: distance measurement, information updating, route computation and packet forwarding. The distance measurement procedure monitors and collects certain network parameters according to the particular routing metric used. The collected information is distributed over the entire network by the information updating procedure. In each node, the route computation procedure then constructs the routing table based on the received information, and the packet forwarding procedure actually routes the traffic to the next hop. The new routing algorithm is an improvement over the SPF algorithm. It uses the same distance measurement procedure and information updating procedure as the SPF algorithm: each node measures the actual delay to its neighbors and periodically broadcasts the changes to all other nodes in the network. However, the new routing algorithm is equipped with more sophisticated route computation and packet forwarding procedures to deal with heavy and dynamic traffic.

The problem with the SPF algorithm is that there are no mechanisms to alter the routing other than updating the routing tables, yet route updating is too slow and costly for responding to traffic fluctuations. Under heavy traffic load, frequent route updating may also lead to instability. To solve this dilemma, the new routing algorithm maintains a stable routing table and meanwhile provides alternate paths to disperse traffic when it is accumulating at some points of the network. When congestion and network failures do occur, instead of initiating route updating, the node forwards the traffic along the alternate paths temporarily and passes it around the congested or failed areas. If the changes are persistent, the routing tables will be updated eventually when the next route updating time is due.

In the new routing algorithm, the alternate paths are only used as emergency exits when the shortest paths are experiencing problems. In normal conditions, the new routing algorithm performs exactly the same as the SPF algorithm does. Because of this feature, we call the new routing algorithm Shortest Path First with Emergency Exits (SPF-EE). It should be emphasized that the alternate path serving as an emergency exit does not have to be the shortest path to the particular destination, nor does it have to be disjoint from the current shortest path. When congestion or network failures occur, the primary goal is to find an alternative path for the traffic and avoid packets accumulating in the network.
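
A minimal sketch of the forwarding idea, assuming a per-destination routing table, a precomputed alternate next hop, and an arbitrary queue-length threshold as the congestion signal (none of these details come from the SPF-EE specification itself):

CONGESTION_THRESHOLD = 50   # packets queued on the outgoing link; assumed value

def next_hop(dest, routing_table, alternate_table, queue_len):
    """Forward along the shortest path unless it looks congested."""
    primary = routing_table[dest]            # shortest-path next hop
    if queue_len.get(primary, 0) < CONGESTION_THRESHOLD:
        return primary
    # Emergency exit: disperse traffic to the alternate path without
    # recomputing routes; fall back to the primary if no alternate exists.
    backup = alternate_table.get(dest)
    return backup if backup is not None else primary

routing_table   = {"D": "B"}
alternate_table = {"D": "C"}
queue_len       = {"B": 80, "C": 5}
print(next_hop("D", routing_table, alternate_table, queue_len))   # -> C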

 


Factors examined during query optimization

Query plans consist of retrieval tactics and an ordered set of execution steps to retrieve the data needed by the query. In developing query plans, the optimizer examines:

• The size of each table in the query, both in rows and data pages, and the number of OAM and allocation pages that need to be read.

• The indexes that exist on the tables and columns used in the query, the type of index, and the height, number of leaf pages, and cluster ratios for each index.

• Whether the index covers the query, that is, whether the query can be satisfied by retrieving data from the index leaf pages without having to access the data pages. Adaptive Server can use indexes that cover queries, even if no where clauses are included in the query.

• The density and distribution of keys in the indexes.

• The size of the available data cache or caches, the size of I/O supported by the caches, and the cache strategy to be used.

• The cost of physical and logical reads.

• Join clauses and the best join order and join type, considering the costs and number of scans required for each join and the usefulness of indexes in limiting the I/O.

• Whether building a worktable (an internal, temporary table) with an index on the join columns would be faster than repeated table scans if there are no useful indexes for the inner table in a join.

• Whether the query contains a max or min aggregate that can use an index to find the value without scanning the table.

• Whether the data or index pages will be needed repeatedly to satisfy a query such as a join or whether a fetch-and-discard strategy can be employed because the pages need to be scanned only once.

For each plan, the optimizer determines the total cost by computing the logical and physical I/Os. Adaptive Server then uses the cheapest plan. Stored procedures and triggers are optimized when the object is first executed, and the query plan is stored in the procedure cache. If other users execute the same procedure while an unused copy of the plan resides in cache, the compiled query plan is copied in cache, rather than being recompiled.
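
The flavor of that comparison can be shown with a toy calculation; the relative weights and the per-plan I/O estimates below are placeholders, not Adaptive Server's internal figures.

PHYSICAL_IO_COST = 18   # assumed relative weight of a disk read
LOGICAL_IO_COST  = 2    # assumed relative weight of a cache read

def plan_cost(physical_ios, logical_ios):
    """Total cost as a weighted sum of physical and logical I/Os."""
    return physical_ios * PHYSICAL_IO_COST + logical_ios * LOGICAL_IO_COST

candidates = {
    "table scan":          plan_cost(physical_ios=5000, logical_ios=5000),
    "nonclustered index":  plan_cost(physical_ios=300,  logical_ios=900),
    "covering index scan": plan_cost(physical_ios=120,  logical_ios=400),
}
print(min(candidates, key=candidates.get))   # the optimizer keeps the cheapest plan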


Parallel Dataflow Approach to SQL Software

Terabyte online databases, consisting of billions of records, are becoming common as the price of online storage decreases. These databases are often represented and manipulated using the SQL relational model. A relational database consists of relations (files in COBOL terminology) that in turn contain tuples (records in COBOL terminology). All the tuples in a relation have the same set of attributes (fields in COBOL terminology).

Relations are created, updated, and queried by writing SQL statements. These statements are syntactic sugar for a simple set of operators chosen from the relational algebra. Select-project, here called scan, is the simplest and most common operator – it produces a row and column subset of a relational table. A scan of relation R using predicate P and attribute list L produces a relational data stream as output. The scan reads each tuple, t, of R and applies the predicate P to it. If P(t) is true, the scan discards any attributes of t not in L and inserts the resulting tuple in the scan output stream. Expressed in SQL, a scan of a telephone book relation to find the phone numbers of all people named Smith would be written:

SELECT    telephone_number      /* the output attribute(s) */
FROM      telephone_book        /* the input relation */
WHERE     last_name = 'Smith';  /* the predicate */

A scan's output stream can be sent to another relational operator, returned to an application, displayed on a terminal, or printed in a report. Therein lies the beauty and utility of the relational model. The uniformity of the data and operators allows them to be arbitrarily composed into dataflow graphs. The output of a scan may be sent to a sort operator that will reorder the tuples based on an attribute sort criterion, optionally eliminating duplicates. SQL defines several aggregate operators to summarize attributes into a single value, for example, taking the sum, min, or max of an attribute, or counting the number of distinct values of the attribute. The insert operator adds tuples from a stream to an existing relation. The update and delete operators alter and delete tuples in a relation matching a scan stream.
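
That composability can be mimicked with ordinary iterator functions, each consuming one stream and producing another; the tuple layout and operator set below are illustrative only.

phone_book = [
    {"last_name": "Smith", "telephone_number": "555-0100"},
    {"last_name": "Jones", "telephone_number": "555-0199"},
    {"last_name": "Smith", "telephone_number": "555-0142"},
]

def scan(relation, predicate, attributes):
    """Select-project: emit the projection of every tuple satisfying P."""
    for t in relation:
        if predicate(t):
            yield {a: t[a] for a in attributes}

def sort_op(stream, key):
    """Sort cannot emit anything until it has consumed its whole input."""
    yield from sorted(stream, key=key)

pipeline = sort_op(
    scan(phone_book, lambda t: t["last_name"] == "Smith", ["telephone_number"]),
    key=lambda t: t["telephone_number"],
)
print(list(pipeline))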

The relational model defines several operators to combine and compare two or more relations. It provides the usual set operators union, intersection, and difference, and some more exotic ones like join and division. Discussion here will focus on the equi-join operator (here called join). The join operator composes two relations, A and B, on some attribute to produce a third relation. For each tuple, ta, in A, the join finds all tuples, tb, in B with attribute value equal to that of ta. For each matching pair of tuples, the join operator inserts into the output stream a tuple built by concatenating the pair. Codd, in a classic paper, showed that the relational data model can represent any form of data, and that these operators are complete [CODD70]. Today, SQL applications are typically a combination of conventional programs and SQL statements. The programs interact with clients, perform data display, and provide high-level direction of the SQL dataflow.

The SQL data model was originally proposed to improve programmer productivity by offering a non-procedural database language. Data independence was an additional benefit: since the programs do not specify how the query is to be executed, SQL programs continue to operate as the logical and physical database schema evolves. Parallelism is an unanticipated benefit of the relational model. Since relational queries are really just relational operators applied to very large collections of data, they offer many opportunities for parallelism. Since the queries are presented in a non-procedural language, they offer considerable latitude in executing the queries. Relational queries can be executed as a dataflow graph. As mentioned in the introduction, these graphs can use both pipelined parallelism and partitioned parallelism. If one operator sends its output to another, the two operators can execute in parallel, giving a potential speedup of two.

The benefits of pipeline parallelism are limited because of three factors: (1) Relational pipelines are rarely very long – a chain of length ten is unusual. (2) Some relational operators do not emit their first output until they have consumed all their inputs. Aggregate and sort operators have this property. One cannot pipeline these operators. (3) Often, the execution cost of one operator is much greater than the others (this is an example of skew). In such cases, the speedup obtained by pipelining will be very limited. Partitioned execution offers much better opportunities for speedup and scaleup. By taking the large relational operators and partitioning their inputs and outputs, it is possible to use divide-and-conquer to turn one big job into many independent little ones. This is an ideal situation for speedup and scaleup. Partitioned data is the key to partitioned execution.
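
A rough sketch of that divide-and-conquer step, with a hash-partitioning function and a few local threads standing in for independent processors or nodes (the relation, predicate, and partition count are made up):

from concurrent.futures import ThreadPoolExecutor

def partition(relation, n_parts, key):
    """Hash-partition the input so each part can be scanned independently."""
    parts = [[] for _ in range(n_parts)]
    for t in relation:
        parts[hash(t[key]) % n_parts].append(t)
    return parts

def scan_part(part):
    # Each partition could run on its own processor or node.
    return [t for t in part if t["last_name"] == "Smith"]

relation = [{"last_name": name, "telephone_number": str(i)}
            for i, name in enumerate(["Smith", "Jones", "Brown", "Smith"] * 3)]

parts = partition(relation, n_parts=4, key="last_name")
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_results = list(pool.map(scan_part, parts))
merged = [t for partial in partial_results for t in partial]   # merge step
print(len(merged))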


Semantic optimization of OQL queries

This work explores all the phases of developing a query processor for OQL, the Object Query Language proposed by the Object Data Management Group (ODMG 3.0). There has been a lot of research on the execution of relational queries and their optimization using syntactic or semantic transformations. However, there is no context that has integrated and tested all the phases of processing an object query language, including the use of semantic optimization heuristics. This research is motivated by the need for query execution tools that combine two valuable properties: i) the expressive power to encompass all the features of the object-oriented paradigm and ii) the flexibility to benefit from the experience gained with relational systems, such as the use of semantic knowledge to speed up query execution. The contribution of this work is twofold. First, it establishes a rigorous basis for OQL by defining a type inference model for OQL queries and proposing a complete framework for their translation into calculus and algebraic representations. Second, in order to enhance query execution it provides algorithms for applying two semantic optimization heuristics: constraint introduction and constraint elimination techniques. By taking into consideration a set of association rules with exceptions, it is possible to add or remove predicates from an OQL query, thus transforming it into a more efficient form.

We have implemented this framework, which enables us to measure the benefits and the cost of exploiting semantic knowledge during query execution. The experiments showed significant benefits, especially in the application of the constraint introduction technique. In contexts where queries are optimized once and are then executed repeatedly, we can ignore the cost of optimization, and it is always worth carrying out the proposed transformation. In the context of ad hoc queries the cost of the optimization becomes an important consideration. We have developed heuristics to estimate the cost as well as the benefits of optimization. The optimizer will carry out a semantic transformation only when the overhead is less than the expected benefit. Thus transformations are performed safely even with ad hoc queries. The framework can often speed up the execution of an OQL query to a considerable extent.

Semantic optimization is a third important aspect of our query processing framework. The underlying idea is to optimize a query based on semantic knowledge in the form of association rules. This first requires identifying the rules using standard data mining methods; these rules must then be combined into forms suitable for optimization purposes. Finally, assessing the performance benefits of semantic optimization requires estimating the cost and the scalability of the methods developed.
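
As a rough illustration of constraint introduction (the rule and query representations below are simplified assumptions, and the handling of rule exceptions is omitted), a predicate implied by an association rule can be appended to a query's predicate list so that, for example, an index on the added attribute becomes usable:

# Assumed rule: objects with salary > 100000 all belong to the Research department.
rules = [
    {"antecedent": ("salary", ">", 100000),
     "consequent": ("department", "==", "Research")},
]

query_predicates = [("salary", ">", 100000)]

def introduce_constraints(predicates, rules):
    """Add each rule's consequent when its antecedent is already in the query.
    A real optimizer would test logical implication, not literal equality,
    and would account for the rule's exceptions."""
    enriched = list(predicates)
    for rule in rules:
        if rule["antecedent"] in enriched and rule["consequent"] not in enriched:
            enriched.append(rule["consequent"])
    return enriched

print(introduce_constraints(query_predicates, rules))
# [('salary', '>', 100000), ('department', '==', 'Research')]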

The query language used in our framework is OQL, the Object Query Language proposed as part of the Object Data Standard (version 3.0) published by the ODMG [CBB+00]. This standard provides specifications for storing and retrieving objects in databases. Its objective is to encompass all the important features of the OO paradigm and to ensure portability across database systems that comply with it. It consists of specifications of the following components:

1.  the Object Model defines the type system of the object paradigm,

2. the Object Definition Language (ODL) is a specification language used to define the classes of a particular schema,

3. the Object Interchange Format (OIF) is a specification language used to dump and load the state of the database to and from a file,

4. the Object Query Language (OQL) is a functional language that allows the user to query objects,

5. C++, Java, Smalltalk bindings enable developers to write code in the respective language to manipulate persistent objects.

OQL is a functional language, in which queries can be composed to an arbitrary depth, as long as their operands and results respect the type system defined in the Object Model. Many of the constructs of OQL are syntactically similar to those of SQL-92; however, OQL has a number of object-oriented features. First, it deals with complex objects, i.e. objects containing or being related to other objects, or containing collections (or structures) of objects. Secondly, it enables users to navigate over the objects using path expressions of arbitrary length. Thirdly, OQL allows the comparison of objects based on their object identity instead of their content values. Fourthly, it enables method invocations within a query. Fifthly, it manipulates a series of collections (sets, bags, lists, arrays, dictionaries) with the aid of additional language constructs. Sixthly, it handles polymorphic collections, i.e. collections with objects belonging to different classes. The late binding mechanism enables users to perform generic operations on the elements of these collections. The casting mechanism allows them to declare explicitly the class of an object, going down the class hierarchy.

The expressive power of OQL makes it an interesting query language to develop. In addition to its independent use as a query language, it can potentially be embedded in any object-oriented programming language. The object-oriented nature of OQL makes the gap between the query and the programming language much narrower than in the relational context. In certain cases, it allows the programmer to perform exactly the same operation by either using the commands available in the binding of the programming language, or by embedding an OQL query in it.


Determining InnoDB Resource Requirements

InnoDB clearly requires far more memory for these reasons, but it is difficult to pin down exactly how much more. This is true for several reasons:

a. How did you load your database?

InnoDB table size is not a constant. If you took a straight SQL dump from a MyISAM table and inserted it into an InnoDB table, it is likely larger than it really needs to be. This is because the data was loaded out of primary key order, and the index is not tightly packed because of that. If you took the dump with the --order-by-primary argument to mysqldump, you likely have a much smaller table and will need less memory to buffer it.

b. What exactly is your table size?

This is an easy question to answer with MyISAM: that information is directly in the output of “SHOW TABLE STATUS”. However, the numbers from that same source for InnoDB are known to be estimates only. The sizes shown are the physical sizes reserved for the tables and have nothing to do with the actual data size at that point. Even the row count is a best guess.

c. How large is your primary key?

It was mentioned above that InnoDB clusters the data for a table around the primary key. This means that any secondary index leaves must contain the primary key of the data they “point to.” Thus, if you have tables with a large primary key, you will need more memory to buffer a secondary index and more disk space to hold it. This is one of the reasons some people argue for short “artificial” primary keys for InnoDB tables when there isn’t one “natural” primary key.

In summary, there is no set method that will work for everyone to predict the needed resources. Worse than that, your needed resources will change with time as more inserts to your table increase its size and fragment the packing of the B-tree. The best advice to be offered is to use the mysqlreport tool (http://hackmysql.com/mysqlreport) to monitor the InnoDB buffer pool, the most important InnoDB tunable, and adjust it accordingly. It is important not to run at 100% usage of the InnoDB buffer pool, as this likely means that you are not buffering as much as you could for reads, and that you are starving your write buffer, which also lives in the same global buffer pool.
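
As a small illustration of that monitoring step, the standard Innodb_buffer_pool_pages_total and Innodb_buffer_pool_pages_free status counters (as reported by SHOW GLOBAL STATUS) can be turned into a utilization figure; the numbers below are made-up sample values, and 16 KB is the default InnoDB page size.

PAGE_SIZE = 16 * 1024    # default InnoDB page size in bytes

status = {               # e.g. parsed from SHOW GLOBAL STATUS output
    "Innodb_buffer_pool_pages_total": 65536,   # a 1 GiB buffer pool
    "Innodb_buffer_pool_pages_free":  1200,
}

total = status["Innodb_buffer_pool_pages_total"]
free = status["Innodb_buffer_pool_pages_free"]
used_ratio = (total - free) / total

print(f"buffer pool: {total * PAGE_SIZE / 2**20:.0f} MiB, {used_ratio:.1%} in use")
if used_ratio > 0.98:
    print("consider raising innodb_buffer_pool_size")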


Constrained Shortest Path Algorithm for Multicast Routing in Multimedia Applications

A new heuristic algorithm is proposed for constructing a multicast tree for multimedia and real-time applications. The tree is used to concurrently transmit packets from a source to multiple destinations such that exactly one copy of any packet traverses the links of the multicast tree. Since multimedia applications require some Quality of Service (QoS), a multicast tree is needed to satisfy two main goals: the minimum path cost from the source to each destination (Shortest Path Tree) and a certain end-to-end delay constraint from the source to each destination. This problem is known to be NP-complete. The proposed heuristic algorithm solves this problem in polynomial time and gives a near-optimal tree. We first mention some related work in this area, then we formalize the problem and introduce the new algorithm with its pseudo code, the proof of its complexity, and a proof of its correctness by showing that it always finds a feasible tree if one exists. Other heuristic algorithms are examined and compared with the proposed algorithm via simulation.

Shortest-path trees, or source-based trees, tend to minimize the cost of each path from the source to any destination; this can be achieved in polynomial time by using one of the two famous algorithms of Bellman and Dijkstra and pruning the undesired links. Recently, with the rapid evolution of multimedia and real-time applications like audio/video conferencing, interactive distributed games and real-time remote control systems, certain QoS needs to be guaranteed in the resulting tree. One such QoS parameter, and the most important one, is the end-to-end delay between the source and each destination, where the information must be sent within a certain delay constraint D. By adding this constraint to the original problem of multicast routing, the problem is reformulated and the multicast tree should be either a delay-constrained Steiner tree or a delay-constrained shortest-path tree. The delay-constrained Steiner tree is an NP-complete problem; several heuristics have been introduced for it, each trying to get a near-optimal tree cost without regard to the cost of each individual path to each destination. The delay-constrained shortest-path tree is also an NP-complete problem. An optimal algorithm for this problem has been presented, but its execution time is exponential and it is used only for comparison with other algorithms. A heuristic for this problem has also been presented, which tries to get a near-optimal tree from the point of view of each destination without regard to the total cost of the tree. An exhaustive comparison between the previous heuristics for the two problems can be found in the literature. We investigate the problem of the delay-constrained shortest-path tree since it is appropriate in some applications like Video on Demand (VoD), where the multicast group changes frequently and every user wants to get his information at the lowest possible cost to him, without regard to the total cost of the routing tree. Also, a shortest-path tree always gives a lower average cost per destination than a Steiner tree.
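
To convey the flavor of the problem (this is not the proposed heuristic), the sketch below applies one very simple per-destination fallback rule: use the least-cost path if it meets the delay bound D, otherwise fall back to the least-delay path, which is feasible whenever any feasible path exists. The graph, costs, and delays are made-up values.

import heapq

def dijkstra(graph, src, weight):
    """graph: {u: [(v, cost, delay), ...]}; weight picks which metric to minimize."""
    best = {src: (0.0, [src])}                 # node -> (distance, path)
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best[u][0]:
            continue
        for v, cost, delay in graph[u]:
            w = d + weight(cost, delay)
            if v not in best or w < best[v][0]:
                best[v] = (w, best[u][1] + [v])
                heapq.heappush(heap, (w, v))
    return best

def path_metrics(graph, path):
    edges = {(u, v): (c, dl) for u in graph for v, c, dl in graph[u]}
    cost = sum(edges[(a, b)][0] for a, b in zip(path, path[1:]))
    delay = sum(edges[(a, b)][1] for a, b in zip(path, path[1:]))
    return cost, delay

def constrained_path(graph, src, dst, D):
    """Least-cost path if it meets the delay bound D, else the least-delay path."""
    cheap = dijkstra(graph, src, weight=lambda c, d: c)[dst][1]
    if path_metrics(graph, cheap)[1] <= D:
        return cheap
    fast = dijkstra(graph, src, weight=lambda c, d: d)[dst][1]
    return fast if path_metrics(graph, fast)[1] <= D else None   # None: no feasible path

graph = {
    "S": [("A", 1, 5), ("B", 4, 1)],
    "A": [("D", 1, 5)],
    "B": [("D", 4, 1)],
    "D": [],
}
print(constrained_path(graph, "S", "D", D=4))   # -> ['S', 'B', 'D']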

 


Distributed Query Optimization

Distributed query optimization is one of the hardest problems in the database area. The great commercial success of database systems is partly due to the development of sophisticated query optimization technology, where users pose queries in a declarative way using SQL or OQL and the optimizer of the database system finds a good way (i.e. plan) to execute these queries. The optimizer, for example, determines which indices should be used to execute a query and in which order the operations of a query (e.g. joins, selects, and projects) should be executed. To this end, the optimizer enumerates alternative plans, estimates the cost of every plan using a cost model, and chooses the plan with the lowest cost. There has been much research in this field. Here we study the problem of distributed query optimization; we focus on the basic components of the distributed query optimizer, i.e. search space, search strategy, and cost model. A survey of the available work in this field is given. Finally, some future work is highlighted based on some recent work that uses mobile agent technologies.

A database is physically distributed across different sites by fragmenting and replicating the data. Fragmentation subdivides each relation into horizontal fragments using the select operation or vertical fragments using the project operation. Fragmentation is desirable because it enables the placement of data in close proximity to its place of use. Each fragment may also be replicated to a number of sites. This is preferable when the same data is accessed from applications that run at a number of sites. Selecting the optimal execution strategy for a query is NP-hard in the number of relations. For complex queries with many relations, this incurs a prohibitive optimization cost. Therefore, the actual objective of the optimizer is to find a strategy close to optimal and to avoid bad strategies. The selection of the optimal strategy generally requires predicting the execution cost of the alternative candidate orderings prior to actually executing the query. The execution cost is expressed as a weighted combination of I/O, CPU, and communication costs.

In distributed query processing, two more steps are involved after query decomposition: data localization and global query optimization. The input to data localization is the initial algebraic query generated by the query decomposition step. The initial algebraic query is specified on global relations irrespective of their fragmentation or distribution. The main role of data localization is to localize the query using data distribution information. In this step, the fragments that are involved in the query are determined and the query is transformed into one that operates on fragments rather than global relations. Thus, during the data localization step, each global relation is first replaced by its localization program, which reconstructs it from its horizontal or vertical fragments, and then the resulting fragment query is simplified and restructured to produce another good query. Simplification and restructuring may be done according to the same rules used in the decomposition step. The final fragment query is generally far from optimal; this process only eliminates bad queries.

The input to the third step is a fragment query, that is, an algebraic query on fragments. By permuting the ordering of operations within one fragment query, many equivalent query execution plans may be found. The goal of query optimization is to find an execution strategy for the query that is close to optimal. An execution strategy for a distributed query can be described with relational algebra operations and communication primitives (send/receive operations) for transferring data between sites. The query optimizer that follows this approach is seen as three components: a search space, a search strategy and a cost model. The search space is the set of alternative execution plans that represent the input query. These strategies are equivalent, in the sense that they yield the same result, but they differ in the execution order of operations and the way these operations are implemented. The search strategy explores the search space and selects the best plan. It defines which plans are examined and in which order. The cost model predicts the cost of a given execution plan, which may consist of the following components (a toy example combining them is sketched after the list).

· Secondary storage cost: This is the cost of searching for, reading, and writing data blocks on secondary storage.

· Memory storage cost: This is the cost pertaining to the number of memory buffers needed during query execution.

· Computation cost: This is the cost of performing in-memory operations on the data buffers during query execution.

· Communication cost: This is the cost of shipping the query and its results from the database site to the site or terminal where the query originated.
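
Here is the toy example referred to above: a weighted combination of the four components, with purely illustrative weights and per-plan estimates.

WEIGHTS = {"io": 1.0, "memory": 0.1, "cpu": 0.05, "comm": 2.0}   # assumed weights

def plan_cost(estimates):
    """Weighted combination of the cost components for one candidate plan."""
    return sum(WEIGHTS[k] * estimates[k] for k in WEIGHTS)

plans = {
    "ship relation R2 to site 1 and join there":
        {"io": 4000, "memory": 300, "cpu": 900, "comm": 2500},
    "semijoin: ship the join attributes of R1 to site 2 first":
        {"io": 4800, "memory": 300, "cpu": 1100, "comm": 900},
}
best = min(plans, key=lambda p: plan_cost(plans[p]))
print(best, plan_cost(plans[best]))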


AM-FM Signals and Demodulation

The definition of SM is closely connected with the definition of an AM-FM signal. Actually, the mathematical definitions of both models are mainly the same, although there are differences between them. For instance, the components of an AM-FM signal may cross over, and usually the carrier frequency is orders of magnitude greater than the modulation frequency, which is not typical for SM. Moreover, the number of components in AM-FM signals is usually smaller than the number of components in SM. In voiced speech, for instance, SM may be applied for modeling the harmonics, while the AM-FM representation models the formants of speech (usually one formant per kHz). Since we develop an algorithm which is able to decompose both time-varying sinusoids and AM-FM signals, we will review most of the AM-FM demodulation algorithms presented in the literature.

The demodulation of an AM-FM signal depends on the number of components it contains. For the mono-component case, the analytic signal obtained through the Hilbert transform provides an estimate of the instantaneous amplitude and instantaneous phase. The instantaneous frequency is then computed by differentiating the unwrapped instantaneous phase. Another well-known algorithm for mono-component AM-FM demodulation is the Discrete Energy Separation Algorithm (DESA) developed by Maragos, Quatieri and Kaiser. DESA utilizes the nonlinear Teager-Kaiser operator, which has fine time resolution.
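
For the mono-component case this route can be written in a few lines; the sampling rate and modulation parameters of the synthetic test signal are arbitrary.

import numpy as np
from scipy.signal import hilbert

fs = 8000.0                                  # sampling rate in Hz
t = np.arange(0, 1.0, 1.0 / fs)
# Synthetic AM-FM signal: 500 Hz carrier with 5 Hz amplitude and frequency modulation.
amplitude = 1.0 + 0.5 * np.cos(2 * np.pi * 5 * t)
phase = 2 * np.pi * 500 * t + 10 * np.sin(2 * np.pi * 5 * t)
x = amplitude * np.cos(phase)

analytic = hilbert(x)                        # analytic signal via the Hilbert transform
inst_amplitude = np.abs(analytic)
inst_phase = np.unwrap(np.angle(analytic))
inst_freq_hz = np.diff(inst_phase) * fs / (2 * np.pi)   # differentiate unwrapped phase

print(inst_amplitude[:3], inst_freq_hz[4000:4003])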

However, the generalization to multi-component AM-FM signals is not a trivial task. Even the well-posedness of the definition of a multi-component AM-FM signal has received great attention. The most common solution for demodulating a multi-component AM-FM signal is to pass the signal through a filter bank and then apply the preferred mono-component AM-FM demodulation algorithm to the output of each filter. This approach is similar to the phase vocoder algorithm used in speech processing. However, the interference between adjacent filters, as well as the crossing of a component between different filters, adds limitations to this approach. Another AM-FM component-separation approach has been proposed relatively recently by Santhanam and Maragos, which separates the AM-FM components algebraically based on the periodicity characteristics of the components. This algorithm is very attractive since the separation is accurate even when the AM-FM components cross each other. The weak point of this demodulation method is that the period of each AM-FM component must be correctly computed. Finally, a novel multi-component AM-FM decomposition algorithm has been proposed which is highly accurate for signal representation; however, the extracted AM-FM components usually lack physical meaning, especially when the components have approximately equal strength.


Automated Sound Analysis Using a Mobile Telemedicine Platform

Being able to ensure that patients are properly diagnosed is a difficult task. This is because training of healthcare workers and the supply of standard diagnostic medical devices (such as stethoscopes, ultrasound, or electrocardiographic machines) for health clinics are costly and logistically difficult. As mentioned before, resource-poor regions lack a supply chain infrastructure to support reliable healthcare equipment delivery. Even if a medical device is successfully delivered to a region, local adoption is still dependent on whether advanced training and expertise are required to operate the device, regional availability of replacement parts, and availability of technicians to service and calibrate the device.

The rapid evolution of the wireless telecommunications industry can be leveraged in order to improve the capacity of CHWs to care for their local communities. In fact, mobile phones are experiencing rapid adoption across the globe with estimates that more than 80% of the world’s population live within transmission and reception range of a cellular (mobile phone) communications tower (Fortner 2009). Furthermore, 64% of mobile phone subscribers are located in developing nations. Mobile phones are an obvious platform choice for several reasons other than their ubiquity. They can be used for telemedicine by sending patient data from remote CHWs over the cellular network to an expert for diagnosis. Mobile phones are also easy to keep charged (even if it is through the car battery), easy to conceal and protect from theft (because of their size), intuitive and familiar to use, and come replete with sensors such as cameras and microphones. The computing power of the mobile phone can also be leveraged so that the CHW can use the signal and image processing capabilities of the phone locally as a diagnostic tool (without any data transmission cost). In addition, mobile phones facilitate easy data back-up, security, authentication, tracking, and integration with medical records for auditing, longitudinal follow-ups, and appropriate allocation of resources.

The aim of this thesis is to develop a low-cost automated heart and lung sound diagnostic framework which uses a hands-free kit attached to a mobile phone, and a back-end centralized review system to allow expert annotation and rapid diagnostics. In particular, calibrated methods (for automated or semi-automated signal processing) need to be developed to determine if such sounds captured on mobile phones are of sufficient quality to accurately indicate infection or disease. The resulting low-cost system will hopefully address a growing and critical need in resource-poor settings for detection of heart and respiratory problems, particularly among fetuses (in the case of heart rate) and young children (in the case of respiratory sounds).

Ad hoc in-situ health diagnostics are likely to be appealing to those living in resource-poor locations because the modality of sourcing health care is already user driven and ad hoc. In order to develop a system that is specific to a given medical problem and geographical region, a large database of expert-annotated data samples is required. Therefore, a centralized remote data transfer, storage and annotation system is also necessary. To do this, an open source telemedicine client/server system known as Sana was augmented to provide the appropriate infrastructure. Sana provides an Android-based Java application which allows authentication of users via barcode scanning, the entry or capture of categorical medical data together with any binary object (such as text, images, audio recordings, videos or files from peripheral systems connected by Bluetooth or USB), and an asynchronous upload method to a central database.

An upload instigates a prompt to a medical professional to review the data and apply a medical code drawn from a relevant standard medical ontology, to categorize both the data and the follow-up course of action. Since multiple experts can review the data, accurate comparisons of expert diagnoses can be performed if a degenerate subset of alphanumeric codes is used. The Android-based system for uploading and downloading data in the field can run on many different types of hardware, from laptops to mobile phones, and the data transfer can take place through any connection mobility supported by the hardware (from 802.11x wireless to GPRS to Bluetooth). The backend electronic medical record (EMR) system is based upon OpenMRS, an open source EMR developed specifically for medical data collection in resource-poor areas, which allows for full transparency of the data, and provides an open application programming interface (API) so that others can add additional functionality, such as data quality auditing or database integration.

The audio signal processing capability of the chipsets on mobile phones is designed to compress, filter, and transmit complex vocal signals. Audio recorded from the hands-free kit can therefore be automatically processed on the mobile phone to extract health metrics such as heart and respiratory rate. Automated or semi-automated algorithms may be able to provide the healthcare worker with decision support and training to more accurately detect the presence of infection or disease, and allow rapid referral to an appropriate treatment center. The medium-term goal of this project is to deliver and evaluate a prototype system with community partners in Central America and the Philippines.
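
As a rough sketch of the kind of on-device processing meant here (not the project's actual algorithm), a heart-rate estimate can be pulled from a short heart-sound recording by autocorrelating its envelope; the beat-rate range, sampling rate, and synthetic test signal are assumptions.

import numpy as np
from scipy.signal import hilbert, correlate

def estimate_heart_rate(audio, fs):
    """Estimate beats per minute from the strongest envelope periodicity."""
    envelope = np.abs(hilbert(audio))                 # amplitude envelope
    envelope -= envelope.mean()
    ac = correlate(envelope, envelope, mode="full", method="fft")[len(envelope) - 1:]
    lo, hi = int(fs * 60 / 200), int(fs * 60 / 40)    # lags for 200 down to 40 bpm
    lag = lo + int(np.argmax(ac[lo:hi]))
    return 60.0 * fs / lag

# Synthetic test: impulse-like "beats" at 72 bpm buried in noise.
fs = 2000
t = np.arange(0, 10, 1 / fs)
beats = (np.sin(2 * np.pi * 1.2 * t) > 0.999).astype(float)   # 1.2 Hz = 72 bpm
audio = beats + 0.05 * np.random.randn(len(t))
print(round(estimate_heart_rate(audio, fs)))                   # approximately 72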
