Computer Network Routing with a Fuzzy Neural Network

As more individuals transmit data through a computer network, the quality of service received by the users begins to degrade. A major aspect of computer networks that is vital to quality of service is data routing. A more effective method for routing data through a computer network can assist with the new problems being encountered in today's growing networks. Effective routing algorithms use various techniques to determine the most appropriate route for transmitting data. Determining the best route through a wide area network (WAN) requires the routing algorithm to obtain information concerning all of the nodes, links, and devices present on the network. The most relevant routing information involves various measures that are often obtained in an imprecise or inaccurate manner, thus suggesting that fuzzy reasoning is a natural method to employ in an improved routing scheme. The neural network is deemed a suitable complement because it retains the ability to learn in dynamic situations.

Once the neural network is initially designed, any alterations in the computer routing environment can easily be learned by this adaptive artificial intelligence method. The capability to learn and adapt is essential in today's rapidly growing and changing computer networks. These two techniques, fuzzy reasoning and neural networks, when combined provide a very effective routing algorithm for computer networks. Computer simulation is employed to show that the new fuzzy routing algorithm outperforms the Shortest Path First (SPF) algorithm in most computer network situations. The benefits increase as the computer network migrates from a stable network to a more variable one. The advantages of applying this fuzzy routing algorithm are apparent when considering the dynamic nature of modern computer networks.

Applying artificial intelligence to specific areas of network management allows the network engineer to dedicate additional time and effort to the more specialized and intricate details of the system. Many forms of artificial intelligence have previously been introduced to network management; however, it appears that one of the more applicable areas, fuzzy reasoning, has been somewhat overlooked. Computer network managers are often challenged with decision-making based on vague or partial information. Similarly, computer networks frequently perform operational adjustments based on this same vague or partial information. The imprecise nature of this information can lead to difficulties and inaccuracies when automating network management using currently applied artificial intelligence techniques. Fuzzy reasoning allows this type of imprecise information to be dealt with in a precise and well-defined manner, providing a more robust method of automating the network management decision-making process.

The objective of this research is to explore the use of fuzzy reasoning in one area of network management, namely the routing aspect of configuration management. A more effective method for routing data through a computer network is needed to address the new problems being encountered on today's networks. Although traffic management is only one aspect of configuration management, at this time it is one of the most visible networking issues. This becomes apparent as consideration is given to the increasing number of network users and the tremendous growth driven by Internet-based multimedia applications. Because of the number of users and the distances between WAN users, efficient routing is more critical in wide area networks than in LANs (also, many LAN architectures such as token ring do not allow any flexibility in the nature of message passing). In order to determine the best route over the WAN, it is necessary to obtain information concerning all of the nodes, links, and LANs present in the wide area network. The most relevant routing information involves various measures regarding each link. These measures include the distance a message will travel, the bandwidth available for transmitting that message (maximum signal frequency), the packet size used to segment the message (size of the data group being sent), and the likelihood of a link failure. These measures are often obtained in an imprecise or inaccurate manner, thus suggesting that fuzzy reasoning is a natural method to employ in an improved routing scheme.
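As a rough illustration of how such imprecise link measures could be fuzzified, the sketch below maps delay, bandwidth, and failure likelihood onto simple triangular membership functions and blends them into a single link score. The membership shapes, linguistic terms, weights, and the linkDesirability function are illustrative assumptions, not the rule base developed in this research.

```cpp
// Minimal sketch of fuzzifying imprecise link measures; all membership shapes
// and weights are assumptions made for illustration only.
#include <algorithm>
#include <cmath>
#include <iostream>

// Triangular membership function: degree to which x belongs to the fuzzy set
// centered at c with half-width w.
double triangular(double x, double c, double w) {
    return std::max(0.0, 1.0 - std::abs(x - c) / w);
}

// Blend the fuzzified measures into a crude "link desirability" in [0, 1].
double linkDesirability(double delayMs, double bandwidthMbps, double failureProb) {
    double lowDelay      = triangular(delayMs,        0.0,  50.0);  // "low delay"
    double highBandwidth = triangular(bandwidthMbps, 100.0, 100.0); // "high bandwidth"
    double reliable      = triangular(failureProb,    0.0,   0.1);  // "reliable link"
    // A weighted average stands in for a full fuzzy rule base and defuzzification.
    return 0.4 * lowDelay + 0.4 * highBandwidth + 0.2 * reliable;
}

int main() {
    // A 12 ms, 80 Mbit/s link with a 1% failure probability scores about 0.8.
    std::cout << linkDesirability(12.0, 80.0, 0.01) << "\n";
}
```

A routing algorithm could then prefer paths whose links maximize such a score, with a neural network adjusting the weights as traffic conditions change.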

Utilizing fuzzy reasoning should assist in expressing these imprecise network measures; however, there still remains the massive growth issue concerning traffic levels. Most routing algorithms currently being implemented as a means of transmitting data from a source node to a destination node cannot effectively handle this large traffic growth. Most network routing methods are designed to be efficient for a current network situation; therefore, when the network deviates from the original situation, the methods begin to lose efficiency. This suggests that an effective routing method should also be capable of learning how to successfully adapt to network growth. Neural networks are extremely capable of adapting to system changes, and thus will be applied as a second artificial intelligence technique to the proposed routing method in this research. The proposed routing approach incorporates fuzzy reasoning in order to prepare a more accurate assessment of the network’s traffic conditions, and hence provide a faster, more reliable, or more efficient route for data exchange. Neural networks will be incorporated into the routing method as a means for the routing method to adapt and learn how to successfully handle network traffic growth. The combination of these two tools is expected to produce a more effective routing method than is currently available.

In order to achieve the primary objective of more efficient routing, several minor objectives also need to be accomplished. A method of data collection is needed throughout the different phases of the study. Data collection will be accomplished through the use of simulation methods; therefore, a simulation model must be accurately designed before proceeding with experimenting or analysis. Additional requirements include building and training the neural network and defining the fuzzy system. The objective of this research is to demonstrate the effective applicability of fuzzy reasoning to only one area of network management, traffic routing.


Professional certification in computer security

Certification programs in computer security have been provided by government agencies, professional organizations, and private corporations. By examining the certification requirements set by these certification bodies, I hope to identify common themes, which will provide useful insights into the design of a computer security curriculum. The identified certification programs include the Certified Information Systems Auditor (CISA) program, the Certified Information Systems Security Professional (CISSP) program, the SNAP program, and the SAGE program.

The Certified Information Systems Auditor (CISA®) program was established in 1978 by the Information Systems Audit and Control Association (ISACA). The CISA certification focuses on five domain areas: Information Systems Audit Standards and Practices and Information Systems Security and Control Practices (8%); Information Systems Organization and Management (15%); Information Systems Process (22%); Information Systems Integrity, Confidentiality, and Availability (29%); and Information Systems Development, Acquisition, and Maintenance (26%).

The Certified Information Systems Security Professional (CISSP) program was created by the International Information Systems Security Certification Consortium (ISC)², which is supported by the Computer Security Institute (CSI), the Information Systems Security Association (ISSA), the Canadian Information Processing Society (CIPS), and other industry organizations (Power 1997). CISSP certification requires participants to pass the CISSP exam, which consists of questions covering 10 test domains: Access Control Systems & Methodology; Computer Operations Security; Cryptography; Application & Systems Development; Business Continuity & Disaster Recovery Planning; Telecommunications & Network Security; Security Architecture & Models; Physical Security; Security Management Practices; and Law, Investigations & Ethics.

The SNAP program, administered by GIAC of the SANS Institute, is designed to serve people who are or will be responsible for managing and protecting important information systems and networks. The GIAC program consists of a Level One Module covering the basics of information security, followed by advanced and targeted Level Two Subject Area Modules. The Level One Module consists of 18 elements, including Information Assurance Foundations; IP Concepts; IP Behavior; Internet Threat; and Computer Security Policies.


5 Reasons Why People Spam Your Blog

No aspect of the World Wide Web is immune to spam – not even the blogosphere. No matter how strong your anti-spam server is, you may get hit every once in a while. Of course, the type of spam seen on personal blogs is different from the normal spam that you might be used to, in that instead of receiving these messages in your private inbox, they are displayed on your blog for the entire world to see. Furthermore, the professional spammers who distribute unsolicited commercial e-mail for a living have different reasons for spamming a personal online blog versus sending unwanted junk mail into somebody's inbox. So bloggers need a good anti-spam solution in order to protect their blogs.

1:  To advertise a website, product, or service. Perhaps the most generic reason for spamming a blog is for advertisement purposes. Through a blog it is easy to reach thousands of people every single day; this holds true for the owner of the blog as much as the ones who are spamming it.

2:  To get backlinks to their site. Many spammers simply leave a comment with nothing more than their website address, hoping to get as many clicks as possible.

3:  It is cheap when compared to other methods of spam. Even in the world of spam marketing, it takes money to make money – unless you’re spamming blogs, of course.

4:  The process can easily be automated to save time. Unlike some of the other spamming techniques, the entire process of spamming a blog can be automated.

5:  To collect e-mail addresses. Many times a user’s e-mail-address will be listed in their online profile, or even right alongside their post. Spammers collect these addresses in order to send them unsolicited commercial e-mail at a later time.


Parallel Query Support for Multidimensional Data

Intra-query parallelism is a well-established mechanism for achieving high performance in (object-)relational database systems. However, these methods have not yet been applied to the upcoming field of multidimensional array databases. Specific properties of multidimensional array data require new parallel algorithms. This work presents a number of new techniques for parallelizing queries in multidimensional array database management systems and discusses their implementation in the RasDaMan DBMS, the first DBMS for generic multidimensional array data. The efficiency of the techniques presented is demonstrated using typical queries on large multidimensional data volumes.

Recently, the integration of an application-domain-independent, generic type constructor for such Multidimensional Discrete Data (MDD) into Database Management Systems (DBMSs) has received growing attention. Current scientific contributions in this area mainly focus on MDD algebra and specialized storage architectures. Because MDD objects may have a magnitude of several megabytes and much more and, compared to scalar values, operations on these values can be very complex, their efficient evaluation becomes a critical factor for the overall query response time. Beyond query optimization, parallel query processing is the most promising technique to speed up complex operations on large data volumes.

One of the outcomes of the predecessor project of ESTEDI (European Spatio-Temporal Data Infrastructure), called RasDaMan, in which the array DBMS RasDaMan was developed, was the awareness that most queries on multidimensional array data are in fact CPU-bound. Therefore, one major research issue of the succeeding project ESTEDI is parallel query processing. Furthermore, ESTEDI, an initiative of European software vendors and supercomputing centers, will establish a European standard for the storage and retrieval of multidimensional high-performance computing (HPC) data. It addresses a main technical obstacle, the delivery bottleneck of large HPC results to users, by augmenting high-volume data generators with a flexible data management and extraction tool for multidimensional array data. Special properties of array data, e.g. the size of a single data object combined with expensive cell operations, require adapted algorithms for parallel processing. Suitable concepts found in relational DBMSs were implemented and evaluated in the RasDaMan array DBMS.
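To make the general idea concrete, the sketch below partitions a large array object into tiles and applies an expensive cell operation to each tile on its own thread. This is a generic intra-query parallelism pattern with assumed names and an assumed cell operation, not RasDaMan's actual internal interface.

```cpp
// Illustrative only: tile a large array ("MDD object") and evaluate a costly
// cell operation on the tiles in parallel.
#include <algorithm>
#include <cmath>
#include <thread>
#include <vector>

void processTile(std::vector<double>& data, std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i)
        data[i] = std::sqrt(data[i]) * 2.0;   // stand-in for an expensive cell operation
}

int main() {
    std::vector<double> mdd(1 << 24, 3.0);    // one large array object
    const std::size_t tiles =
        std::max<std::size_t>(1, std::thread::hardware_concurrency());
    const std::size_t tileSize = mdd.size() / tiles;

    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < tiles; ++t) {
        std::size_t begin = t * tileSize;
        std::size_t end = (t + 1 == tiles) ? mdd.size() : begin + tileSize;
        workers.emplace_back(processTile, std::ref(mdd), begin, end);
    }
    for (auto& w : workers) w.join();          // all tiles evaluated
}
```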

 


Shortest Path Algorithm for Multicast Routing in Multimedia Applications

A new heuristic algorithm is proposed for constructing a multicast tree for multimedia and real-time applications. The tree is used to concurrently transmit packets from a source to multiple destinations such that exactly one copy of any packet traverses the links of the multicast tree. Since multimedia applications require some Quality of Service (QoS), a multicast tree is needed that satisfies two main goals: the minimum path cost from the source to each destination (the shortest path tree) and a certain end-to-end delay constraint from the source to each destination. This problem is known to be NP-Complete. The proposed heuristic algorithm solves this problem in polynomial time and gives a near-optimal tree. We first mention some related work in this area, then we formalize the problem and introduce the new algorithm with its pseudo code and the proof of its complexity and its correctness, by showing that it always finds a feasible tree if one exists. Other heuristic algorithms are examined and compared with the proposed algorithm via simulation.

Handling group communication is a key requirement for numerous applications in which one source sends the same information concurrently to multiple destinations. Multicast routing refers to the construction of a tree rooted at the source and spanning all destinations. Generally, there are two types of such a tree: the Steiner tree and the shortest path tree. The Steiner tree, or group-shared tree, tends to minimize the total cost of the resulting tree; this is an NP-Complete problem. The shortest path tree, or source-based tree, tends to minimize the cost of each path from the source to any destination; this can be achieved in polynomial time by using one of the two famous algorithms of Bellman or Dijkstra and pruning the undesired links. Recently, with the rapid evolution of multimedia and real-time applications such as audio/video conferencing, interactive distributed games, and real-time remote control systems, certain QoS guarantees need to be provided in the resulting tree. One such QoS measure, and the most important one, is the end-to-end delay between the source and each destination, where the information must be sent within a certain delay constraint D. By adding this constraint to the original problem of multicast routing, the problem is reformulated, and the multicast tree should be either a delay-constrained Steiner tree or a delay-constrained shortest path tree. The delay-constrained Steiner tree is an NP-Complete problem; several heuristics have been introduced for it, each trying to achieve a near-optimal tree cost without regard to the cost of each individual path to each destination. The delay-constrained shortest path tree is also an NP-Complete problem. An optimal algorithm for this problem has been presented, but its execution time is exponential and it is used only for comparison with other algorithms. A heuristic for this problem has been presented that tries to obtain a near-optimal tree from the point of view of each destination without regard to the total cost of the tree. An exhaustive comparison between the previous heuristics for the two problems can be found in the literature. We investigate the problem of the delay-constrained shortest path tree since it is appropriate in applications like Video on Demand (VoD), where the multicast group changes frequently and every user wants to receive his information at the lowest possible cost to him, without regard to the total cost of the routing tree. Also, the shortest path tree always gives a lower average cost per destination than the Steiner tree. We present a new heuristic algorithm that finds the required tree in polynomial time.
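For readers unfamiliar with the problem setting, the sketch below shows one very simple flavour of delay-constrained path selection for a single destination: compute the least-cost path with Dijkstra, and if it violates the delay bound D, fall back to the least-delay path. This is not the heuristic proposed in this work (which constructs a complete multicast tree); the graph, link metrics, and function names are invented for illustration.

```cpp
// Simple delay-constrained path selection sketch for one destination.
#include <cstdio>
#include <functional>
#include <limits>
#include <queue>
#include <vector>

struct Edge { int to; double cost, delay; };
using Graph = std::vector<std::vector<Edge>>;

// Per-node result: minimized metric, the accumulated other metric along that
// path, and the parent node on the path.
struct PathInfo { double metric, other; int parent; };

// Dijkstra minimizing either cost or delay, while tracking the other metric.
std::vector<PathInfo> dijkstra(const Graph& g, int src, bool useDelay) {
    const double INF = std::numeric_limits<double>::infinity();
    std::vector<PathInfo> info(g.size(), PathInfo{INF, INF, -1});
    using Item = std::pair<double, int>;
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    info[src] = {0.0, 0.0, src};
    pq.push({0.0, src});
    while (!pq.empty()) {
        auto [d, u] = pq.top(); pq.pop();
        if (d > info[u].metric) continue;
        for (const Edge& e : g[u]) {
            double m = d + (useDelay ? e.delay : e.cost);
            if (m < info[e.to].metric) {
                info[e.to] = {m, info[u].other + (useDelay ? e.cost : e.delay), u};
                pq.push({m, e.to});
            }
        }
    }
    return info;
}

int main() {
    Graph g(4);
    auto addLink = [&](int a, int b, double c, double d) {
        g[a].push_back({b, c, d}); g[b].push_back({a, c, d});
    };
    addLink(0, 1, 1.0, 9.0); addLink(1, 3, 1.0, 9.0);  // cheap but slow path
    addLink(0, 2, 3.0, 2.0); addLink(2, 3, 3.0, 2.0);  // costly but fast path
    const double D = 10.0;                             // end-to-end delay bound
    int dest = 3;

    auto byCost  = dijkstra(g, 0, /*useDelay=*/false);
    auto byDelay = dijkstra(g, 0, /*useDelay=*/true);
    if (byCost[dest].other <= D)
        std::printf("use least-cost path (cost %.1f, delay %.1f)\n",
                    byCost[dest].metric, byCost[dest].other);
    else if (byDelay[dest].metric <= D)
        std::printf("fall back to least-delay path (delay %.1f)\n",
                    byDelay[dest].metric);
    else
        std::printf("no feasible path within the delay bound\n");
}
```

A multicast tree could be assembled by merging such per-destination paths, which is where the heuristic discussed above goes further than this sketch.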


Specialized Parallel Relational Operators

Some algorithms for relational operators are especially appropriate for parallel execution, either because they minimize data flow, or because they better tolerate data and execution skew. Improved algorithms have been found for most of the relational operators. The evolution of join operator algorithms is sketched here as an example of these improved algorithms.

Recall that the join operator combines two relations, A and B, to produce a third relation containing all tuple pairs from A and B with matching attribute values. The conventional way of computing the join is to sort both A and B into new relations ordered by the join attribute. These two intermediate relations are then compared in sorted order, and matching tuples are inserted in the output stream. This algorithm is called sort-merge join.

Many optimizations of sort-merge join are possible, but since sort has execution cost nlog(n), sort-merge join has an nlog(n) execution cost. Sort-merge join works well in a parallel dataflow environment unless there is data skew. In case of data skew, some sort partitions may be much larger than others. This in turn creates execution skew and limits speedup and scaleup. These skew problems do not appear in centralized sort-merge joins.

Hash-join is an alternative to sort-merge join. It has linear execution cost rather than nlog(n) execution cost, and it is more resistant to data skew. It is superior to sort-merge join unless the input streams are already in sorted order. Hash join works as follows. Each of the relations A and B is first hash-partitioned on the join attribute. A hash partition of relation A is then hashed into memory. The corresponding partition of relation B is scanned, and each tuple is compared against the in-memory hash table for the A partition. If there is a match, the pair of tuples is sent to the output stream. Each pair of hash partitions is compared in this way.
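A compact, single-partition illustration of that build-and-probe idea is sketched below; the relation contents and tuple layout are invented, and a parallel implementation would first hash-partition both relations so that each partition pair can be joined independently like this.

```cpp
// Single-partition hash join sketch: build a hash table on A's join attribute,
// then probe it with B and emit matching tuple pairs. The data is invented.
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

struct TupleA { int key; std::string payload; };
struct TupleB { int key; std::string payload; };

int main() {
    std::vector<TupleA> A = {{1, "a1"}, {2, "a2"}, {3, "a3"}};
    std::vector<TupleB> B = {{2, "b2"}, {3, "b3"}, {3, "b3x"}, {4, "b4"}};

    // Build phase: hash relation A (the in-memory partition) on the join attribute.
    std::unordered_multimap<int, const TupleA*> hashTable;
    for (const TupleA& a : A) hashTable.emplace(a.key, &a);

    // Probe phase: scan relation B and send matching pairs to the output stream.
    for (const TupleB& b : B) {
        auto [lo, hi] = hashTable.equal_range(b.key);
        for (auto it = lo; it != hi; ++it)
            std::cout << it->second->payload << " joins " << b.payload << "\n";
    }
}
```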

The hash join algorithm breaks a big join into many little joins. If the hash function is good and the data skew is not too bad, then there will be little variance in the hash bucket sizes. In these cases hash-join is a linear-time join algorithm with linear speedup and scaleup. Many optimizations of the parallel hash-join algorithm have been discovered over the last decade. In pathological skew cases, when many or all tuples have the same attribute value, one bucket may contain all the tuples, and in these cases no known algorithm provides speedup or scaleup.


Public Cloud Outsourcing

Although cloud computing is a new computing paradigm, outsourcing information technology services is not. The steps that organizations take remain basically the same for public clouds as with other, more traditional, information technology services, and existing guidelines for outsourcing generally apply as well. What does change with public cloud computing, however, is the potential for increased complexity and difficulty in providing adequate oversight to maintain accountability and control over deployed applications and systems throughout their life cycle. This can be especially daunting when non-negotiable SLAs are involved, since responsibilities normally held by the organization are given over to the cloud provider, with little recourse for the organization to address problems that may arise and resolve issues to its satisfaction.

Reaching agreement on the terms of service of a negotiated SLA for public cloud services can be a complicated process fraught with technical and legal issues. Migrating organizational data and functions into the cloud is accompanied by a host of security and privacy issues to be addressed, many of which concern the adequacy of the cloud provider’s technical controls for an organization’s needs. Service arrangements defined in the terms of service must also meet existing privacy policies for information protection, dissemination and disclosure. Each cloud provider and service arrangement has distinct costs and risks associated with it. A decision based on any one issue can have major implications for the organization in other areas.

Considering the growing number of cloud providers and range of services offered, organizations must exercise due diligence when moving functions to the cloud. Decision making about new services and service arrangements entails striking a balance between benefits in cost and productivity versus drawbacks in risk and liability.


Reasons for not using assembly code

There are so many disadvantages and problems involved in assembly programming that it is advisable to consider the alternatives before deciding to use assembly code for a particular task. The most important reasons for not using assembly programming are:

1. Development time : Writing code in assembly language takes much longer than writing it in a high-level language.

2. Reliability and security : It is easy to make errors in assembly code. The assembler does not check whether the calling conventions and register save conventions are obeyed. Nobody checks for you whether the number of PUSH and POP instructions is the same in all possible branches and paths. There are so many possibilities for hidden errors in assembly code that it affects the reliability and security of the project unless you have a very systematic approach to testing and verifying.

3. Debugging and verifying : Assembly code is more difficult to debug and verify because there are more possibilities for errors than in high level code.

4. Maintainability : Assembly code is more difficult to modify and maintain because the language allows unstructured spaghetti code and all kinds of dirty tricks that are difficult for others to understand. Thorough documentation and a consistent programming style are needed.

5. System code can use intrinsic functions instead of assembly : The best modern C++ compilers have intrinsic functions for accessing system control registers and other system instructions. Assembly code is no longer needed for device drivers and other system code when intrinsic functions are available.

6. Application code can use intrinsic functions or vector classes instead of assembly : The best modern C++ compilers have intrinsic functions for vector operations and other special instructions that previously required assembly programming. It is no longer necessary to use old-fashioned assembly code to take advantage of the Single-Instruction-Multiple-Data (SIMD) instructions; see the sketch after this list.

7. Portability : Assembly code is very platform-specific. Porting to a different platform is difficult. Code that uses intrinsic functions instead of assembly is portable to all x86 and x86-64 platforms.

8. Compilers have been improved a lot in recent years : The best compilers are now better than the average assembly programmer in many situations.

9. Compiled code may be faster than assembly code because compilers can make inter-procedural optimization and whole-program optimization : The assembly programmer usually has to make well-defined functions with a well-defined call interface that obeys all calling conventions in order to make the code testable and verifiable. This prevents many of the optimization methods that compilers use, such as function inlining, register allocation, constant propagation, common subexpression elimination across functions, scheduling across functions, etc. These advantages can be obtained by using C++ code with intrinsic functions instead of assembly code.
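As a hedged illustration of point 6 above, the fragment below adds two float arrays with SSE intrinsics from <immintrin.h> instead of hand-written assembly. The function name and loop structure are arbitrary; a production version would also consider alignment and wider vector widths.

```cpp
// Illustration for point 6: SIMD addition via compiler intrinsics, no assembly.
#include <immintrin.h>

void addFloats(const float* a, const float* b, float* out, int n) {
    int i = 0;
    for (; i + 4 <= n; i += 4) {                  // 4 floats per 128-bit SSE register
        __m128 va = _mm_loadu_ps(a + i);          // unaligned loads
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
    for (; i < n; ++i) out[i] = a[i] + b[i];      // scalar tail for leftover elements
}
```

Because this stays in C++, the compiler can still inline the function and allocate registers freely, which is exactly the kind of optimization point 9 notes is hard to match in hand-written assembly.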

 


The Multiplexing Transport Protocol Suite

The two transport protocols most commonly used in the Internet are TCP, which offers a reliable stream, and UDP, which offers a connectionless datagram service. We do not offer a connectionless protocol, because the mechanisms of a rate-based protocol need a longer-lived connection to work, as they use feedback from the receiver. The interarrival time of packets is measured at the receiver and is crucial for estimating the available bandwidth and for discriminating between congestion losses and transmission losses. On the other hand, a multiplexing unreliable protocol that offers congestion control can be used as a basis for other protocols. The regularity of a rate-based protocol lends itself naturally to multimedia applications. Sound and video need bounds on arrival time so that playback can be done smoothly. A multimedia protocol is the natural offshoot. Most multimedia applications need timely data. Data received after the playback time is useless. Moreover, for a system with bandwidth constraints, late data is adverse to the quality of playback, as it robs bandwidth from the flow. There are many strategies to deal with losses, from forgiving applications to forward error correction (FEC) schemes. Retransmissions are rarely used, because they take the place of new data, and the time to send a request and receive the retransmission may exceed the timing constraints.
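The sketch below shows the general idea of a receiver-side bandwidth estimate built from packet interarrival times, smoothed with an exponentially weighted moving average. The class name, smoothing factor, and update rule are assumptions for illustration, not this protocol suite's actual estimator.

```cpp
// Receiver-side bandwidth estimation sketch: each packet contributes a sample
// of size / interarrival gap, smoothed with an EWMA. All parameters are assumed.
#include <cstdio>

class BandwidthEstimator {
    double estimateBps_ = 0.0;
    double lastArrival_ = -1.0;           // seconds; -1 means "no packet seen yet"
    static constexpr double kAlpha = 0.1; // EWMA weight given to the newest sample
public:
    void onPacket(double arrivalTime, int packetBytes) {
        if (lastArrival_ >= 0.0) {
            double gap = arrivalTime - lastArrival_;
            if (gap > 0.0) {
                double sample = packetBytes * 8.0 / gap;   // bits per second
                estimateBps_ = (estimateBps_ == 0.0)
                                   ? sample
                                   : (1.0 - kAlpha) * estimateBps_ + kAlpha * sample;
            }
        }
        lastArrival_ = arrivalTime;
    }
    double bitsPerSecond() const { return estimateBps_; }
};

int main() {
    BandwidthEstimator est;
    est.onPacket(0.000, 1500);
    est.onPacket(0.012, 1500);  // 1500 bytes in 12 ms is about 1 Mbit/s
    std::printf("estimated bandwidth: %.0f bit/s\n", est.bitsPerSecond());
}
```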

When multiple channels are available and the aggregated bandwidth is greater than the bandwidth necessary to transmit the multimedia stream, retransmissions can be done successfully without harming the quality of playback. The simultaneous use of multiple link layers provides extra bandwidth. The best-case scenario is the coupling of a low bandwidth, low delay interface with a high bandwidth, high delay interface. The high bandwidth interface allows for a good quality stream, while the low delay interface makes retransmissions possible by creating a good feedback channel to request (and transmit) lost frames.

When the aggregated bandwidth is not enough to transmit packets at the rate required by the application, packets have to be dropped or the application has to change the characteristics of its stream. Adapting applications can change the quality of the stream on the fly to deal with bandwidth variations, but for non-adapting applications, the best policy is to drop packets at the sender. Sending packets that will arrive late will cause further problems by making other packets late, which can have a snowball effect.
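A minimal sketch of that sender-side drop policy follows; the frame fields, the bandwidth estimate, and the propagation-delay term are hypothetical and stand in for whatever state the protocol actually keeps.

```cpp
// Drop-late-frames sketch: a frame is dropped at the sender if it is expected
// to miss its playback deadline, so it cannot delay the frames behind it.
struct Frame {
    double sizeBits;           // size of the frame in bits
    double playbackDeadline;   // absolute playback deadline (seconds) at the receiver
};

bool shouldDrop(const Frame& f, double now, double estimatedBps, double propagationDelay) {
    double transmitTime = f.sizeBits / estimatedBps;            // serialization time
    double estimatedArrival = now + transmitTime + propagationDelay;
    return estimatedArrival > f.playbackDeadline;               // too late: drop it
}
```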

In contrast to a multimedia protocol, a reliable protocol has to deliver intact every packet that the application sent. In this case, time is not the most important factor. Lost or damaged frames will have to be retransmitted until they are successfully received. If the application expects the data to be received in the same order it was sent, the protocol will have to buffer packets received after a loss until the retransmission of the lost packet is received. Using the channel abstraction to multiplex the data increases the occurrence of out-of-order delivery, increasing the burden on the receiving end.

 

 


Visualizing numerical simulations of the Earth’s environment

Numerical models of the Earth's atmosphere and oceans form one important class of scientific algorithms. The history files produced by these models are traces of their computations, and our VIS-5D (VISualization for 5-Dimensional data sets) system, freely available by anonymous ftp and running on graphics workstations, is widely used by scientists for interactively visualizing these history files. The system takes its name from the fact that model history files are 5-D rectangles of data, organized as 2-D arrays of 3-D spatial grids. The 2-D arrays are indexed by time and by model field (e.g., temperature, pressure, salinity, the three components of wind or current velocity, etc.). The data grids are transformed into graphical primitives that consist of 3-D vectors and polygons. On large workstations, we also use an efficient interactive volume rendering technique. The rendering of graphical primitives creates a virtual Earth environment behind the workstation screen. Users can reach into this virtual environment with a mouse to move slices through the data grids, to place seed points for wind trajectories, and to rotate and zoom their view. The array of icons on the left lets users select combinations of fields and rendering techniques, and control animation, iso levels, trajectories, color maps, etc.

Modern workstations can respond to these controls within the time of an animation step (usually between 1/30 and 1/5 of a second), giving users the sense of interacting with a small virtual atmosphere or ocean. In order for users to explore the 3-D geometry of their fields, and to explore cause and effect relations between different fields, they should be able to rotate images and change the combinations of fields displayed without interrupting the smooth animation of model dynamics. Thus we do not synchronize animation with the computation of graphical primitives, but rather store primitives in intermediate tables indexed by time and by field.

The size of a model history file is the product of five numbers and can be quite large. For example, a data set spanning 100 latitudes by 100 longitudes by 20 vertical levels by 100 time steps by 10 model fields contains 200 million grid points. In order to maximize data set size, we compress grid data and derived graphics by scaling them linearly to one- or two-byte integers. In order to preserve fidelity, different scaling factors are used for each horizontal slice of each 3-D grid. With compression we can store one grid point, plus derived graphics, in 2.5 bytes of virtual memory. For history files that are too large for workstations, the system splits into a graphics client on a workstation connected via network to a data server on a supercomputer.
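The sketch below illustrates the per-slice linear scaling idea in its one-byte form; the struct layout and rounding are assumptions made for illustration and not VIS-5D's actual storage format.

```cpp
// Per-slice linear compression sketch: each horizontal slice gets its own
// offset and scale, so a value is reconstructed as offset + scale * byte.
#include <algorithm>
#include <cstdint>
#include <vector>

struct CompressedSlice {
    float offset, scale;
    std::vector<std::uint8_t> bytes;
};

CompressedSlice compressSlice(const std::vector<float>& slice) {  // assumes a non-empty slice
    auto [mn, mx] = std::minmax_element(slice.begin(), slice.end());
    CompressedSlice out;
    out.offset = *mn;
    out.scale = (*mx > *mn) ? (*mx - *mn) / 255.0f : 1.0f;
    out.bytes.reserve(slice.size());
    for (float v : slice)
        out.bytes.push_back(static_cast<std::uint8_t>((v - out.offset) / out.scale + 0.5f));
    return out;
}

float decompress(const CompressedSlice& s, std::size_t i) {
    return s.offset + s.scale * s.bytes[i];
}
```

Storing a separate offset and scale per slice is what preserves fidelity when one slice spans a very different value range than another.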

Sometimes users need to see derived quantities, such as the vorticity or divergence of airflow, in order to understand the physics of a simulation. Users can write C and FORTRAN functions for deriving new diagnostic fields, and invoke them during a visualization session (they are dynamically linked with VIS-5D via sockets). In order to maximize data fidelity, floating point grid values in disk files, rather than compressed values, are used in these calculations.
