Application Development

Existing Graph Based Stream Authentication Methods Using Crypto Signatures

For graph-based authentication, the main challenge is how to design a Directed Acyclic Graph (DAG) with lowest overhead, highest verification probability and lowest sender and receiver delay. However, there are trade-offs between these performance criteria, which are summarized below.

Tags : , , , , ,

Local Administrator Password Selection

In most enterprises there are two types of passwords: local and domain. Domain passwords are centralized passwords that are authenticated at an authentication server (e.g., a Lightweight Directory Access Protocol server, an Active Directory server). Local passwords are passwords that are stored and authenticated on the local system (e.g., a workstation or server). Although most local passwords can be managed using centralized password management mechanisms, some can only be managed through third-party tools, scripts, or manual means. A common example is built-in administrator and root accounts. Having a common password shared among all local administrator or root accounts on all machines within a network simplifies system maintenance, but it is a widespread weakness. If a single machine is compromised, an attacker may be able to recover the password and use it to gain access to all other machines that use the shared password. Organizations should avoid using the same local administrator or root account password across many systems. Also, built-in accounts are often not affected by password policies and filters, so it may be easier to just disable the built-in accounts and use other administrator-level accounts instead.

A solution to this local password management problem is the use of randomly generated passwords, unique to each machine, and a central password database that is used to keep track of local passwords on client machines. Such a database should be strongly secured and access to it limited to only the minimum needed. Specific security controls to implement include only permitting authorized administrators from authorized hosts to access the data, requiring strong authentication to access the database (for example, multi-factor authentication), storing the passwords in the database in an encrypted form (e.g., cryptographic hash), and requiring administrators to verify the identity of the database server before providing authentication credentials to it.

Another solution to management of local account passwords is to generate passwords based on system characteristics such as machine name or media access control (MAC) address. For example, the local password could be based on a cryptographic hash of the MAC address and a standard password. A machine’s MAC address, “00:16:59:7F:2C:4D”, could be combined with the password “N1stSPsRul308” to form the string “00:16:59:7F:2C:4D N1stSPsRul308”. This string could be hashed using SHA and the first 20 characters of the hash used as the password for the machine. This would create a pseudo-salt that would prevent many attackers from discovering that there is a shared password. However, if an attacker recovers one local password, the attacker would be able to determine other local passwords relatively easily.

Tags : , , , , , , , , , , ,

SSL/TLS Cipher Suites

A cipher suite combines four kinds of security features, and is given a name in the SSL protocol specification. Before data flows over a SSL connection, both ends attempt to negotiate a cipher suite. This lets them establish an appropriate quality of protection for their communications, within the constraints of the particular mechanism combinations which are available. The features associated with a cipher suite are:

A given implementation of SSL will support a particular set of cipher suites, and some subset of those will be enabled by default. Applications have a limited degree of control over the cipher suites that are used on their connections; they can enable or disable any of the supported cipher suites, but cannot change the cipher suites that are available.

Tags : , , , , , , , , , ,

Transport Layer Security Protocol

TLS (Transport Layer Security) was released in response to the Internet community’s demands for a standardized protocol. The IETF provided a venue for the new protocol to be openly discussed, and encouraged developers to provide their input to the protocol.

The TLS protocol was released in January 1999 to create a standard for private communications. The protocol “allows client/server applications to communicate in a way that is designed to prevent eavesdropping,tampering or message forgery.” According to the protocol’s creators, the goals of the TLS protocol are cryptographic security, interoperability, extensibility, and relative efficiency.These goals are achieved through implementation of the TLS protocol on two levels: the TLS Record protocol and the TLS Handshake protocol.

TLS Record Protocol

The TLS Record protocol negotiates a private, reliable connection between the client and the server. Though the Record protocol can be used without encryption, it uses symmetric cryptography keys, to ensure a private connection. This connection is secured through the use of hash functions generated by using a Message Authentication Code.

TLS Handshake Protocol

The TLS Handshake protocol allows authenticated communication to commence between the server and client. This protocol allows the client and server to speak the same language, allowing them to agree upon an encryption algorithm and encryption keys before the selected application protocol begins to send data. Using the same handshake protocol procedure as SSL, TLS provides for authentication of the server, and optionally, the client.

Tags : , , , , , , , , , , , ,

User login protocol

Initialization: Once the user has successfully logged into an account, the server places in the user’s computer a cookie that contains an authenticated record of the username, and possibly an expiration date. (“Authenticated”means that no party except for the server is able to change the cookie data without being detected by the server. This can be ensured, for example, by adding a MAC that is computed using a key known only to the server. Cookies of this type can be stored in several computers, as long as each of them was used by the user.


1. The user enters a username and a password. If his computer contains a cookie stored by the login server then the cookie is retrieved by the server.

2. The server checks whether the username is valid and whether the password is correct for this username.

3. If the username/password pair is correct, then

4. If the username/password pair is incorrect, then

Tags : , , , , , , ,

Locking Files in PHP

PHP supports a portable way of locking complete files in an advisory way (which means all accessing programs have to use the same way of  locking or it will not work). If there is a possibility that more than one process could write to a file at the same time then the file should be locked.

flock() operates on fp which must be an open file pointer. operation is one of the following:

  1. To acquire a shared lock (reader), set operation to LOCK_SH
  2. To acquire an exclusive lock (writer), set operation to LOCK_EX
  3. To release a lock (shared or exclusive), set operation to LOCK_UN
  4. If you don’t want flock() to block while locking, add LOCK_NB to LOCK_SH or LOCK_EX

When obtaining a lock, the process may block. That is, if the file is already locked, it will wait until it gets the lock to continue execution. flock() allows you to perform a simple reader/writer model which can be used on virtually every platform (including most Unix derivatives and even Windows). flock() returns TRUE on success and FALSE on error (e.g. when a lock could not be acquired).

Here is a script that writes to a log file with the fputs function and then displays the log file’s contents:


$fp = fopen(“/tmp/log.txt”, “a”);

flock($fp, LOCK_EX); // get lock

fputs($fp, date(“h:i A l F dS, Y\n”)); // add a single line to the log file

flock($fp, LOCK_UN); // release lock


echo “<pre>”; // dump log


echo “</pre>\n”;



Tags : , , , , , , , ,

Analysis of the index data model from a databases perspective

The logical representation of indexes is an abstraction for their actual physical implementation(e.g. inverted indexes, suffix trees, suffix arrays or signature files). This abstraction resembles the data independence principle exploited by databases and, by further investigation, it appears clear how databases and search engine indexes have some similarities in the nature of their data structures: in the relational model we refer to a table as a collection of rows having a uniform structure and intended meaning; a table is composed by a set of columns, called attributes, having values taken from a set of domains (like integers, string or boolean values). Likewise, in the index data model, we refer to an index as a collection of documents of a given (possibly generic) type having uniform structure and intended meaning where a document is composed of a (possibly unitary) set of fields having values also belonging to different domains (string, date, integer etc).

Differently from the databases, though, search engine indexes do not have functional dependencies nor inclusion dependencies defined for their fields, except for an implied key dependency used to uniquely identify documents into an index. Moreover, it is not possible to define join dependencies between fields belonging to different indexes. Another difference enlarging the gap between the database data model and the index data model is the lack of standard data definition and data manipulation languages. For example both in literature and in industry there is no standard query language convention (such as SQL for databases) for search engines; this heterogeneity is mainly due to a high dependency of the adopted query convention to the structure and to the nature of the items in the indexed collection.

In its simplest form, for a collection of items with textual representation, a query is composed of keywords and the items retrieved contain these keywords. An extension of this simple querying mechanism is the case of a collection of structured text documents, where the use of index fields allows users to search not only in the whole document but also in its specific attributes. From a database model perspective, though, just selection and projection operators are available: users can specify keyword-based queries over fields belonging to the document structure and, possibly, only a subset of all the fields in the document can be shown as result.

Tags : , , , , , , , , , , , , ,

Overcoming Barriers to Early Detection with Pervasive Computing

Embedded assessment leverages the capabilities of pervasive computing to advance early detection of health conditions. In this approach, technologies embedded in the home setting are used to establish personalized baselines against which later indices of health status can be compared. Our ethnographic and concept feedback studies suggest that adoption of such health technologies among end users will be increased if monitoring is woven into preventive and compensatory health applications, such that the integrated system provides value beyond assessment. We review health technology advances in the three areas of monitoring, compensation, and prevention. We then define embedded assessment in terms of these three components. The validation of pervasive computing systems for early detection involves unique challenges due to conflicts between the exploratory nature of these systems and the validation criteria of medical research audiences. We discuss an approach for demonstrating value that incorporates ethnographic observation and new ubiquitous computing tools for behavioral observation in naturalistic settings such as the home.

Leveraging synergies in these three areas holds promise for advancing detection of disease states. We believe this highly integrated approach will greatly increase adoption of home health technologies among end users and ease the transition of embedded health assessment prototypes from computing laboratories into medical research and practice. We derive our observations from a series of exploratory and qualitative studies on ubiquitous computing for health and well being. These studies, highlighted barriers to early detection in the clinical setting, concerns about home assessment technologies among end users, and values of target user groups related to prevention and detection. Observations from the studies are used to identify challenges that must be overcome  by  pervasive computing developers if ubiquitous computing systems are to gain wide acceptance for early detection of health  conditions.

The motivation driving research on pervasive home monitoring is that clinical diagnostic practices frequently fail to detect health problems in their early stages. Often, clinical testing is first conducted after the onset of a health problem when there is no data about an individual’s previous level of functioning. Subsequent clinical assessments are conducted periodically, often with no data other than self-report about functioning in between clinical visits. Self-report data on mundane or repetitive health-related behaviors has been repeatedly demonstrated as unreliable. Clinical diagnostics are also limited in ecological validity, not accounting for  functioning in the home and other daily environments. Another barrier to early detection is that age based norms used to detect  impairment may fail to capture significant decline among people whose premorbid functioning was far above average. Cultural differences have also been repeatedly shown to influence performance on standardized tests. Although early detection can cut costs in the long term, most practitioners are more accustomed to dealing with severe, late stage health issues than subclinical patterns that may or may not be markers for more serious problems. In our participatory design interviews, clinicians voiced concerns about false positives causing unwarranted patient concerns and additional demands on their time. Compounding the clinical barriers to early detection listed above are psychological and behavioral patterns among individuals contending with the possibility of illness. Our interviews highlighted denial, perceptual biases regarding variability of health states, over-confidence in recall and insight, preference for preventive and compensatory directives over pure assessment results, and a disinclination towards time consuming self-monitoring as barriers to early detection. Our ethnographic studies of households coping with cognitive decline revealed a  tension between a desire for forecasting of what illness might lie ahead and a counter current of denial. Almost all caregivers and patients wished that they had received an earlier diagnosis to guide treatment and lifestyle choices, but they also acknowledged that they had overlooked blatant warning signs until the occurrence of a catastrophic incident (e.g. a car accident). This lag between  awareness and actual decline caused them to miss out on the critical window for initiation of treatments and planning that could have had a major impact on independence and quality of life. Ethnography and concept feedback participants attributed this denial in part to a fear of being diagnosed with a disease for which there is no cure. They also worried about the effect of this data on  insurers and other outside parties. Participants in the three cohorts included in our studies (boomers, healthy older adults, and older adults coping with illness themselves or in a spouse) were much more interested in, and less conflicted about, preventive and compensatory directives than pure assessment.

Perceptual biases also appear to impede traditional assessment and self monitoring. Ethnography participants reported consistently overestimating functioning before a catastrophic event and appeared, during the interview, to consistently underestimate functioning following detection of cognitive impairment Additionally, we observed probable over-confidence among healthy adults in their ability to recall behaviors and analyze their relationship to both environmental factors and well being. This confidence in recall and insight seemed exaggerated given findings that recall of frequent events is generally poor. As a result of these health perceptions, many of those interviewed felt that the time and discipline required for journaling (e.g. of eating, sleeping, mood, etc.) outweighed the benefits. Additionally, they expressed wariness of confronting or being reprimanded about what is already obvious
to them. They would prefer to lead investigations and develop strategies for improving their lives. Pervasive computing systems may enable this type of integrated, contextualized inquiry if they can also overcome the clinical and individual barriers that might otherwise impede adoption of the new technologies.

Tags : , , , ,

Why use MyISAM?


So far, this has read somewhat like a paid advertisement for InnoDB. However, MyISAM has some very real advantages. One of these is the simplicity of the engine, it is very well understood and it’s easy to write third party tools to interact with it. There are very high quality free tools, such as mysql hot copy available for MyISAM. It is much more difficult to write tools for an engine as complicated as InnoDB and this can be easily seen in the number of them available. Also, this simplicity allows for an ease of administration that is not there with InnoDB.


MyISAM’s other main advantage is how long it has been around. There are many systems, Drupal for example, that are very much optimized for that particular engine. This is not to say that they perform poorly on InnoDB, but they are not optimized for it. For example,while many of Drupal’s core queries are well indexed and use the primary key (thus benefiting from InnoDB’s primary key clustering), some could be improved. The node table has a primary key on (nid, vid). Having this index is a good idea, but it is a two integer index and there are eleven secondary indexes based on it. This doesn’t mean much when you use MyISAM, but under InnoDB it means each of those secondary indexes has two integer sized leaves identifying the primary key.

Another fact, is that there are some workloads MyISAM is better suited for. For example,Drupal’s built in search functionality performs horribly on InnoDB for very large datasets, for example 100k+ rows. These tables are best left MyISAM. Fortunately, MySQL allows for mixing engines like this.

Resource Usage

It is readily accepted in computer science that there is often a trade off  between speed and memory footprint. We have seen through the above benchmarks that InnoDB has some fast algorithms, however, this comes at a price. Not only does InnoDB use more memory than MyISAM, but the actual data files are often quite a bit larger. Add to this the fact that InnoDB has at least one quite large log file and you have a significant increase in resource use. This makes MyISAM a good fit for a resource limited server. However, if you’re concerned at all with high levels of concurrency, it is likely you have the funds to buy a server that can handle these increased resource demands.

Tags : , , , , , , , , ,

Write Optimized Tree Indexing

Due to the poor random write performance of flash SSDs (Solid State Drives), write optimized tree indexes have been proposed to improve the update performance. BFTL was proposed to balance the inferior random write performance and fast random read performance for flash memory based sensor nodes and embedded systems. It allows the index entries in one logical B-tree node to span over multiple physical pages, and maintains an in-memory table to map each B-tree node to multiple physical pages. Newly inserted entries are packed and then written together to some new blocks. The table entries of corresponding B-tree nodes are updated, thus reducing the number of random writes. However, BFTL entails a high search cost since it accesses multiple disk pages to search a single tree node. Furthermore, even though the in-memory mapping table is compact, the memory consumption is still high. FlashDB was proposed to implement a self-tuning scheme between standard B+-tree and BFTL, depending on the workloads and the types of flash devices. Since our proposed index mostly outperforms both B+-tree and BFTL under various workloads on different flash SSDs, we do not compare our index with this self-tuning index in this paper. More recently, LA-tree was proposed for flash memory devices by adding adaptive buffers between tree nodes. LA-tree focuses on raw, small-capacity and byte addressable flash memory devices, such as sensor nodes, whereas our work is targeted for off theshelf large flash SSDs, which provide only a block-based access interface. Different target devices of these two indexes result in their differences in design.

On the hard disk, many disk-based indexes optimized for write operations have also been proposed. Graefe proposed a write-optimized B-tree by applying the idea of the log file system to the B-tree index. Y-tree supports high volume insertions for data warehouses following the idea of buffer tree. The logarithmic structures have been widely applied to optimize the write performance. O’Neil et al. proposed LSM-tree and its variant LHAM for multi-version databases. Jagadish et al. used a similar idea to design a stepped tree index and the hash index for data warehouses. Our FD-tree follows the idea of logarithmic method. The major difference is that we propose a novel method based on the fractional cascading technique to improve the search performance on the logarithmic structure.

Tags : , , , , , , ,