Chapter #5 – Recursion

An activation record usually contains:

  • Values of all parameters passed to the function: the address of the first cell if an array is passed, the address of a variable passed by reference, and copies of all other data items.
  • Local (automatic) variables, which can be stored elsewhere; in that case the activation record contains only their descriptors and pointers to the locations where they are stored.
  • The return address needed to resume control in the caller: the address of the caller’s instruction immediately following the call.
  • The returned value, for a function not declared as void.
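
To make this bookkeeping concrete, here is a minimal sketch (my own example, not the chapter’s code) of a recursive factorial function; the comments point out what each activation record would hold.

#include <iostream>

// Each call to factorial(n) gets its own activation record holding:
// the parameter n, the return address back into the caller, and the
// value to be returned.
unsigned long factorial(unsigned int n)
{
    if (n <= 1)
        return 1;                      // deepest record returns first
    return n * factorial(n - 1);       // a new activation record is pushed here
}

int main()
{
    std::cout << factorial(5) << std::endl;   // prints 120
    return 0;
}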

Advantages of using recursion:

  1. Similarity to the original definition.
  2. Increases program readability.
  3. Improves self-documentation.
  4. Simplifies coding.
  5. Often yields shorter code than an iterative implementation.

Tail recursion is characterized by a single recursive call that appears as the very last statement executed by the function; the recursive call is not only the last statement, but there are no earlier recursive calls, direct or indirect.

Tail recursion is particularly useful in languages such as Prolog that have no explicit loop construct.
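
As a minimal illustration (not from the chapter), the following prints the numbers n down to 1 using tail recursion; because the recursive call is the last action, it can be mechanically replaced by a loop:

#include <iostream>

// Tail-recursive: the recursive call is the very last thing the function does.
void countDown(int n)
{
    if (n <= 0)
        return;
    std::cout << n << ' ';
    countDown(n - 1);      // tail call: nothing remains to be done afterward
}

// The same behavior expressed with a loop, which can be derived directly
// from the tail-recursive version.
void countDownIterative(int n)
{
    for (; n > 0; --n)
        std::cout << n << ' ';
}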

Consider the following functions:

void function1()
{
    char stack[80];
    register int top = 0;

    cin.get(stack[top]);
    while (stack[top] != '\n')
        cin.get(stack[++top]);
    // stack[top] now holds the newline; echo the preceding characters in reverse.
    for (top -= 1; top >= 0; cout.put(stack[top--]));
}

void function2()
{
    stack<char> st;
    char ch;

    cin.get(ch);
    while (ch != '\n')
    {
        st.push(ch);
        cin.get(ch);
    }
    while (!st.empty())
    {
        cout.put(st.top());   // std::stack::pop() returns void, so read top() first
        st.pop();
    }
}

Comparing function1 with function2, the first version is better because it is faster: no function calls are made and the function is self-sufficient, whereas function2 calls at least one function during each loop iteration, which slows down execution.

Indirect recursion occurs when a function calls itself indirectly. For example (sketched in code below):

  • Function f() calls function g().
  • Function g() in turn calls function f() again.
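
A minimal sketch of such mutual recursion, using hypothetical isEven()/isOdd() helpers rather than the f()/g() names above:

#include <iostream>

bool isOdd(unsigned int n);    // forward declaration so the two can call each other

// isEven() calls isOdd(), which calls isEven() again: indirect recursion.
bool isEven(unsigned int n)
{
    if (n == 0)
        return true;
    return isOdd(n - 1);
}

bool isOdd(unsigned int n)
{
    if (n == 0)
        return false;
    return isEven(n - 1);
}

int main()
{
    std::cout << std::boolalpha << isEven(10) << std::endl;   // prints true
    return 0;
}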

A more complicated case is nested recursion, in which a function is not only defined in terms of itself but is also used as one of its own arguments.
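
The chapter’s own illustration may differ, but the Ackermann function is one well-known example of nested recursion: the recursive call appears inside the argument of another recursive call.

// Nested recursion: ackermann() appears as an argument to itself.
// (Values grow explosively; only sensible for very small m and n.)
unsigned long ackermann(unsigned long m, unsigned long n)
{
    if (m == 0)
        return n + 1;
    if (n == 0)
        return ackermann(m - 1, 1);
    return ackermann(m - 1, ackermann(m, n - 1));   // nested call
}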

If you trace the usual recursive function that computes Fibonacci numbers (sketched below), you will observe that it computes some of the numbers over and over again. Using the recursive Fib(n), the number of calls made equals 2 · Fib(n + 1) − 1.
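
For reference, this is the naive recursive version being described (a sketch, assuming the convention Fib(0) = 0, Fib(1) = 1):

// Naive recursion: fib(n - 2) is recomputed inside fib(n - 1) as well,
// so the total number of calls grows like the Fibonacci numbers themselves.
unsigned long fib(unsigned int n)
{
    if (n < 2)
        return n;
    return fib(n - 1) + fib(n - 2);
}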

A better solution uses the de Moivre (Binet) formula: Fib(n) is the nearest integer to φ^n / √5, where φ = (1 + √5)/2 ≈ 1.6180339887. This is coded as:

// needs <cmath> for exp(), log(), and ceil()
unsigned int deMoivreFib(unsigned int n)
{
    // phi = 1.6180339887..., sqrt(5) = 2.2360679775...; ceil(x - .5) rounds to
    // the nearest integer, so this returns round(phi^n / sqrt(5)).
    return ceil(exp(n * log(1.6180339887) - log(2.2360679775)) - .5);
}

Download solutions for this chapter from this link

Chapter #4 – Stacks and Queues

Stacks are used to solve the following problems:

  1. Matching delimiters.
  2. Reversing a linked list.
  3. Adding huge numbers.

Queues are used to solve the following problems:

  1. Detection of acrostic.

Code snippets:

stack<int, vector<int> >: the stack will use a vector as its underlying container.

stack<int, list<int> >: the stack will use a list as its underlying container.

priority_queue<MyClass, vector<MyClass>, greater<MyClass> >: uses operator > rather than operator < to prioritize elements.

priority_queue<MyClass, vector<MyClass>, MyOperator>: uses the comparison operator() overloaded in the MyOperator class to prioritize elements.
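
A small compilable sketch of these declarations (MyClass and MyOperator are stand-ins of my own, not names from the chapter):

#include <functional>
#include <list>
#include <queue>
#include <stack>
#include <vector>
using namespace std;

struct MyClass {
    int key;
    bool operator<(const MyClass& other) const { return key < other.key; }
    bool operator>(const MyClass& other) const { return key > other.key; }
};

// Comparator class: priority_queue calls its operator().
struct MyOperator {
    bool operator()(const MyClass& a, const MyClass& b) const {
        return a.key > b.key;   // smallest key gets the highest priority
    }
};

int main() {
    stack<int, vector<int> > s1;                                     // vector-backed stack
    stack<int, list<int> >   s2;                                     // list-backed stack
    priority_queue<MyClass, vector<MyClass>, greater<MyClass> > q1;  // uses operator >
    priority_queue<MyClass, vector<MyClass>, MyOperator>         q2; // uses MyOperator()

    s1.push(1);
    s2.push(2);
    MyClass a; a.key = 3;
    MyClass b; b.key = 4;
    q1.push(a);
    q2.push(b);
    return 0;
}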

Download Chapter Case Study from this link.

Download solved chapter problems from this link.

Chapter #3 – Linked Lists

Limitations of arrays:

  1. Their size has to be known at compile time.
  2. Insertion requires shifting elements, which is computationally heavy.

Objects that include a reference to another object of the same type are called self-referential objects.
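
For instance, a singly linked list node is self-referential (a minimal sketch; the chapter’s own node class may differ):

// A self-referential object: IntNode contains a pointer to another IntNode.
struct IntNode {
    int info;
    IntNode* next;
    IntNode(int i, IntNode* n = 0) : info(i), next(n) {}
};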

Skip Lists

Linked lists have one drawback: they require sequential scanning to locate a searched-for element.

Skip lists are ordered linked lists in which some nodes contain additional pointers that skip over intermediate nodes, avoiding purely sequential processing.

In a skip list of n nodes, for each k and i such that 1 ≤ k ≤ ⌊lg n⌋ and 1 ≤ i ≤ ⌊n/2^(k−1)⌋ − 1, the node in position i · 2^(k−1) points to the node in position (i + 1) · 2^(k−1). This means that every second node points to the node two positions ahead, every fourth node points to the node four positions ahead, and so on.

The number of pointers indicates the level of each node, and the number of levels in such a list is ⌊lg n⌋.

Description of the searching algorithm:

Searching for an element e1 consists of following the pointers on the highest level until an element equal to e1 is found, which finishes the search successfully.

If the end of the list is reached, or an element greater than e1 is encountered, the search restarts from the node preceding the one containing the greater element, this time following a pointer on the next lower level.

The search continues until e1 is found, or until the first-level pointers are followed to the end of the list or to an element greater than e1.

Below is pseudocode for the algorithm

find(element e1)
    p = the nonnull sublist that begins on the highest level i;
    while e1 not found and i >= 0
        if p->key > e1
            p = the sublist that begins in the predecessor of p on level --i;
        else if p->key < e1
            if p is the last node on level i
                p = the nonnull sublist that begins in p on the highest level < i;
                i = the number of that level;
            else p = p->next;
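
For readers who prefer concrete code, here is a minimal sketch of skip-list search using the common array-of-forward-pointers node layout (the representation and the names SkipNode/contains are my own, not the chapter’s class):

#include <iostream>
#include <vector>

// Each node keeps one forward pointer per level it participates in; pointers on
// level k link only nodes that have at least k + 1 levels, so the accesses below
// stay in range for a correctly built list.
struct SkipNode {
    int key;
    std::vector<SkipNode*> next;          // next[k]: successor on level k
    SkipNode(int k, int levels) : key(k), next(levels) {}
};

bool contains(const SkipNode* header, int target)
{
    const SkipNode* p = header;
    for (int level = (int)header->next.size() - 1; level >= 0; --level) {
        while (p->next[level] != 0 && p->next[level]->key < target)
            p = p->next[level];           // advance on this level
        // p->next[level] is null or >= target: drop to the next lower level
    }
    const SkipNode* c = p->next[0];
    return c != 0 && c->key == target;
}

int main()
{
    // Hand-built list 3 -> 5 -> 7, with 5 also present on the second level.
    SkipNode header(0, 2), n3(3, 1), n5(5, 2), n7(7, 1);
    header.next[0] = &n3;  header.next[1] = &n5;
    n3.next[0] = &n5;
    n5.next[0] = &n7;
    std::cout << std::boolalpha << contains(&header, 7) << ' '
              << contains(&header, 4) << std::endl;   // prints: true false
    return 0;
}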

Unfortunately, this rigid design of skip lists leads to inefficient insertion and deletion procedures.

To retain the search efficiency, the requirement on the number of nodes of the different levels is kept, but their positions are no longer rigidly prescribed.

Inserting a node then does not require restructuring the list; nodes are generated so that the distribution of nodes across the levels remains adequate. This is accomplished using random numbers that determine the level at which each new node is inserted.

Searching a randomized skip list takes O(lg n) expected time. It turns out that skip lists fare extremely well in comparison with more sophisticated data structures, such as self-adjusting trees and AVL trees, and therefore they are a viable alternative to these data structures.

Self-Organizing Lists

There are many different ways to organize lists; here are four of them:

  1. Move-to-front method. After the desired element is located, put it at the beginning of the list.
  2. Transpose method. After the desired element is located, swap it with its predecessor unless it is at the head of the list.
  3. Count method. Order the list by the number of times elements have been accessed.
  4. Ordering method. Order the list using criteria natural to the information under scrutiny.

With the first three methods, we try to locate the elements most likely to be looked for near the beginning of the list: most aggressively with the move-to-front method and most cautiously with the transpose method.

Optimal static ordering orders the list by the known frequency of access; it serves as a benchmark against which the number of comparisons made by the other methods is measured.
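
As an illustration of the first method (a sketch under my own naming, not the book’s code), here is move-to-front search on a std::list:

#include <algorithm>
#include <iostream>
#include <list>

// Move-to-front: if the value is found, splice it to the head of the list so
// that frequently requested elements migrate toward the beginning.
bool findMoveToFront(std::list<int>& lst, int value)
{
    std::list<int>::iterator it = std::find(lst.begin(), lst.end(), value);
    if (it == lst.end())
        return false;
    lst.splice(lst.begin(), lst, it);   // relink the found node at the front
    return true;
}

int main()
{
    std::list<int> lst;
    for (int i = 1; i <= 5; ++i)
        lst.push_back(i);               // 1 2 3 4 5

    findMoveToFront(lst, 4);            // list becomes 4 1 2 3 5

    for (std::list<int>::iterator it = lst.begin(); it != lst.end(); ++it)
        std::cout << *it << ' ';
    std::cout << std::endl;
    return 0;
}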

Download solved chapter programming assignment problems from this link.

Introduction to Data Mining and Machine Learning

Data mining is the extraction of implicit, previously unknown, and potentially useful information from data.

Machine learning provides the technical basis of data mining. It is used to extract information from the raw data in databases.

Aim of the book:

    “The objective of this book is to introduce the tools and techniques for machine learning that are used in data mining. After reading it, you will understand what these techniques are and appreciate their strengths and applicability. If you wish to experiment with your own data, you will be able to do this easily with the Weka software.”

 

Data Mining and Machine Learning

Data mining is about solving problems by analyzing data already present in databases.

Data mining is defined as the process of discovering patterns in data.

How are the patterns expressed? Useful patterns allow us to make nontrivial predictions on new data. There are two extremes for the expression of a pattern:

  1. As a black box whose innards are effectively incomprehensible and,
  2. As a transparent box whose construction reveals the structure of the pattern. Both, we are assuming, make good predictions.

Patterns that capture the decision structure in an explicit way are called structural patterns.

The rules do not really generalize from the data; they merely summarize it.

Problems with datasets:

  1. The set of examples given as input is far from complete.
  2. Values are not specified for all the features in all the examples.
  3. Rules may appear to classify the examples correctly, whereas often, because of errors or noise in the data, misclassifications occur even on the data used to train the classifier.

One proposed definition of learning:

Things learn when they change their behavior in a way that makes them perform better in the future

A set of rules that are intended to be interpreted in sequence is called a decision list (Classification Rules).

A numeric-attribute problem is one in which the attributes are numeric; when some attributes are numeric and others are nominal, it is called a mixed-attribute problem.

Rules that strongly associate different attribute values are called association rules.

People frequently use machine learning techniques to gain insight into the structure of their data rather than to make predictions for new cases.

One of the challenges in machine learning is the question of what is the most natural and easily understood format for the output of a machine learning scheme.

The process of determining regression equation weights is called regression.

The fact that the decision structure is comprehensible is a key feature in the successful adoption of the application.

 

Fielded Applications

Decisions involving judgment:

When you apply for a loan, you have to fill out a questionnaire that asks for relevant financial and personal information. This information is used by the loan company as the basis for its decision as to whether to lend you money. Such decisions are typically made in two stages. First, statistical methods are used to determine clear “accept” and “reject” cases. The remaining borderline cases are more difficult and call for human judgment. For example, one loan company uses a statistical decision procedure to calculate a numeric parameter based on the information supplied in the questionnaire. Applicants are accepted if this parameter exceeds a preset threshold and rejected if it falls below a second threshold. This accounts for 90% of cases, and the remaining 10% are referred to loan officers for a decision. On examining historical data on whether applicants did indeed repay their loans, however, it turned out that half of the borderline applicants who were granted loans actually defaulted. Although it would be tempting simply to deny credit to borderline customers, credit industry professionals pointed out that if only their repayment future could be reliably determined it is precisely these customers whose business should be wooed; they tend to be active customers of a credit institution because their finances remain in a chronically volatile condition. A suitable compromise must be reached between the viewpoint of a company accountant, who dislikes bad debt, and that of a sales executive, who dislikes turning business away.

    Enter machine learning. The input was 1000 training examples of borderline cases for which a loan had been made that specified whether the borrower had finally paid off or defaulted. For each training example, about 20 attributes were extracted from the questionnaire, such as age, years with current employer, years at current address, years with the bank, and other credit cards possessed. A machine learning procedure was used to produce a small set of classification rules that made correct predictions on two-thirds of the borderline cases in an independently chosen test set. Not only did these rules improve the success rate of the loan decisions, but the company also found them attractive because they could be used to explain to applicants the reasons behind the decision. Although the project was an exploratory one that took only a small development effort, the loan company was apparently so pleased with the result that the rules were put into use immediately.

 

Screening Images:

Since the early days of satellite technology, environmental scientists have been trying to detect oil slicks from satellite images to give early warning of ecological disasters and deter illegal dumping. Radar satellites provide an opportunity for monitoring coastal waters day and night, regardless of weather conditions. Oil slicks appear as dark regions in the image whose size and shape evolve depending on weather and sea conditions. However, other look-alike dark regions can be caused by local weather conditions such as high wind. Detecting oil slicks is an expensive manual process requiring highly trained personnel who assess each region in the image.

    A hazard detection system has been developed to screen images for subsequent manual processing. Intended to be marketed worldwide to a wide variety of users—government agencies and companies—with different objectives, applications, and geographic areas, it needs to be highly customizable to individual circumstances. Machine learning allows the system to be trained on examples of spills and nonspills supplied by the user and lets the user control the tradeoff between undetected spills and false alarms. Unlike other machine learning applications, which generate a classifier that is then deployed in the field, here it is the learning method itself that will be deployed.

The input is a set of raw pixel images from a radar satellite, and the output is a much smaller set of images with putative oil slicks marked by a colored border. First, standard image processing operations are applied to normalize the image. Then, suspicious dark regions are identified. Several dozen attributes are extracted from each region, characterizing its size, shape, area, intensity, sharpness and jaggedness of the boundaries, proximity to other regions, and information about the background in the vicinity of the region. Finally, standard learning techniques are applied to the resulting attribute vectors.

Several interesting problems were encountered. One is the scarcity of training data. Oil slicks are (fortunately) very rare, and manual classification is extremely costly. Another is the unbalanced nature of the problem: of the many dark regions in the training data, only a very small fraction are actual oil slicks. A third is that the examples group naturally into batches, with regions drawn from each image forming a single batch, and background characteristics vary from one batch to another. Finally, the performance task is to serve as a filter, and the user must be provided with a convenient means of varying the false alarm rate.

 

Market basket analysis is the use of association techniques to find groups of items that tend to occur together in transactions, typically supermarket checkout data.

 

Other Applications:

Sophisticated manufacturing processes often involve tweaking control parameters. Separating crude oil from natural gas is an essential prerequisite to oil refinement, and controlling the separation process is a tricky job. British Petroleum used machine learning to create rules for setting the parameters. This now takes just 10 minutes, whereas previously human experts took more than a day. Westinghouse faced problems in their process for manufacturing nuclear fuel pellets and used machine learning to create rules to control the process. This was reported to save them more than $10 million per year (in 1984). The Tennessee printing company R.R. Donnelly applied the same idea to control rotogravure printing presses to reduce artifacts caused by inappropriate parameter settings, reducing the number of artifacts from more than 500 each year to less than 30.

 

Machine Learning and Statistics

If forced to point to a single difference of emphasis, it might be that statistics has been more concerned with testing hypotheses, whereas machine learning has been more concerned with formulating the process of generalization as a search through possible hypotheses. But this is a gross oversimplification: statistics is far more than hypothesis testing, and many machine learning techniques do not involve any searching at all.

 

Some of the most important decisions in machine learning system are:

  1. The concept description language.
  2. The order in which the space is searched.
  3. The way that overfitting to the particular training data is avoided.

 

Keywords:

  • Data Mining.
  • Machine Learning.
  • Pattern.
  • Structural Pattern.
  • Decision List.
  • Classification Rules.
  • Numeric Attribute Problem.
  • Mixed Attribute Problem.
  • Association Rules.
  • Regression.
  • Market Basket Analysis.

Using WinAPIs to apply Message Queue Inter-Process Communication Method

Today, I’m writing an interesting article. It’s all about WinAPIs and message queues.

The first question is: What’s Inter-Process Communication?

Answer: Inter-process communication (IPC) is a set of techniques for the exchange of data among multiple threads in one or more processes.

There are a lot of methods to provide this service:

  1. Socket.
  2. Pipe.
  3. Named Pipe.
  4. Message Queue.
  5. Shared Memory.
  6. Message Passing.
  7. Others.

There are a number of APIs that may be used for IPC. A number of platform-independent APIs can be found in this link.

I’ve read a little about message queues and found them very interesting. You can explore more about them from the links below:

I’ve simulated the bully algorithm using this technique. You can find the source code in this link.
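
As a minimal sketch of the mechanism (not the bully-algorithm code itself; MSG_WORK and the thread layout are my own choices), one thread posts messages to another thread’s queue with PostThreadMessage(), and the receiver pulls them with GetMessage():

#include <windows.h>
#include <iostream>

const UINT MSG_WORK = WM_APP + 1;   // hypothetical application-defined message id

DWORD WINAPI Worker(LPVOID)
{
    MSG msg;
    // Force creation of this thread's message queue before the sender posts.
    PeekMessage(&msg, NULL, WM_USER, WM_USER, PM_NOREMOVE);

    // GetMessage blocks until a message arrives; it returns 0 on WM_QUIT.
    while (GetMessage(&msg, NULL, 0, 0))
    {
        if (msg.message == MSG_WORK)
            std::cout << "worker received value " << (int)msg.wParam << std::endl;
    }
    return 0;
}

int main()
{
    DWORD workerId = 0;
    HANDLE hWorker = CreateThread(NULL, 0, Worker, NULL, 0, &workerId);

    Sleep(100);   // crude way, in this sketch, to let the worker create its queue

    for (int i = 1; i <= 3; ++i)
        PostThreadMessage(workerId, MSG_WORK, (WPARAM)i, 0);

    PostThreadMessage(workerId, WM_QUIT, 0, 0);   // ask the worker to exit
    WaitForSingleObject(hWorker, INFINITE);
    CloseHandle(hWorker);
    return 0;
}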

Introduction to the Introduction

Download this Article

What Is Machine Learning?

We do not know exactly which people are likely to buy a particular product, or which author to suggest to people who enjoy reading Hemingway. If we knew, we would not need any analysis of the data; we would just go ahead and write down the code. But because we do not, we can only collect data and hope to extract the answers to these and similar questions from data.

We do believe that there is a process that explains the data we observe. Though we do not know the details of the process underlying the generation of data (for example, consumer behavior), we know that it is not completely random. People do not go to supermarkets and buy things at random. When they buy beer, they buy chips; they buy ice cream in summer and spices for Glühwein in winter. There are certain patterns in the data.

We may not be able to identify the process completely, but we believe we can construct a good and useful approximation. That approximation may not explain everything, but may still be able to account for some part of the data. We believe that though identifying the complete process may not be possible, we can still detect certain patterns or regularities. This is the niche of machine learning.

Application of machine learning methods to large databases is called data mining.

But machine learning is not just a database problem; it is also a part of artificial intelligence. To be intelligent, a system that is in a changing environment should have the ability to learn. If the system can learn and adapt to such changes, the system designer need not foresee and provide solutions for all possible situations.

The model may be predictive to make predictions in the future, or descriptive to gain knowledge from data, or both.

Examples of Machine Learning Applications

Learning Associations:

In the case of retail-for example, a supermarket chain-one application of machine learning is basket analysis, which is finding associations between products bought by customers: If people who buy X typically also buy Y, and if there is a customer who buys X and does not buy Y, he or she is a potential Y customer. Once we find such customers, we can target them for cross-selling.

In finding an association rule, we are interested in learning a conditional probability of the form P(Y | X), where Y is the product we would like to condition on X, which is the product or the set of products the customer has already purchased.

Let us say, going over our data, we calculate that P(chips | beer) = 0.7. Then, we can define the rule:

70 percent of customers who buy beer also buy chips

We may want to make a distinction among customers and, toward this, estimate P(Y | X, D), where D is the set of customer attributes, for example, gender, age, marital status, and so on, assuming that we have access to this information.
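
As a toy illustration of estimating such a rule from transaction data (my own sketch with made-up baskets, not data from the book):

#include <iostream>
#include <set>
#include <string>
#include <vector>

int main()
{
    // Each transaction is the set of products in one basket (made-up data).
    std::vector<std::set<std::string> > baskets;
    const char* data[4][3] = {
        {"beer", "chips", "bread"},
        {"beer", "chips", "milk"},
        {"beer", "milk",  "eggs"},
        {"milk", "bread", "eggs"}
    };
    for (int i = 0; i < 4; ++i)
        baskets.push_back(std::set<std::string>(data[i], data[i] + 3));

    // Estimate P(chips | beer) = #(beer and chips) / #(beer).
    int beerCount = 0, beerAndChips = 0;
    for (size_t i = 0; i < baskets.size(); ++i) {
        if (baskets[i].count("beer")) {
            ++beerCount;
            if (baskets[i].count("chips"))
                ++beerAndChips;
        }
    }
    std::cout << "P(chips | beer) = "
              << double(beerAndChips) / beerCount << std::endl;   // 2/3 here
    return 0;
}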

Classification:

In credit scoring (Hand 1998), the bank calculates the risk given the amount of credit and the information about the customer. The information about the customer includes data we have access to that is relevant in calculating his or her financial capacity, namely, income, savings, collaterals, profession, age, past financial history, and so forth. The bank has a record of past loans containing such customer data and whether the loan was paid back or not. From this data of particular applications, the aim is to infer a general rule coding the association between a customer’s attributes and his risk. That is, the machine learning system fits a model to the past data to be able to calculate the risk for a new application and then decides to accept or refuse it accordingly.

This is an example of a classification problem where there are two classes: low-risk and high-risk customers. The information about a customer makes up the input to the classifier whose task is to assign the input to one of the two classes.

A discriminant function is a function that separates the examples of different classes. This type of application is used for prediction.

There are many applications of machine learning in pattern recognition. One is optical character recognition, which is recognizing character codes from their images.

In medical diagnosis, the inputs are the relevant information we have about the patient and the classes are the illnesses. The inputs contain the patient’s age, gender, past medical history, and current symptoms. Some tests may not have been applied to the patient, and thus these inputs would be missing. Tests take time, may be costly, and may inconvenience the patient, so we do not want to apply them unless we believe that they will give us valuable information. In the case of a medical diagnosis, a wrong decision may lead to a wrong or no treatment, and in cases of doubt it is preferable that the classifier reject and defer the decision to a human expert.

Learning also performs compression in that by fitting a rule to the data, we get an explanation that is simpler than the data, requiring less memory to store and less computation to process. Once you have the rules of addition, you do not need to remember the sum of every possible pair of numbers.

Another use of machine learning is outlier detection, which is finding the instances that do not obey the rule and are exceptions. In this case, after learning the rule, we are not interested in the rule itself but in the exceptions not covered by it, which may imply anomalies requiring attention, for example, fraud.

Regression:

Let us say we want to have a system that can predict the price of a used car. Inputs are the car attributes (brand, year, engine capacity, mileage, and other information) that we believe affect a car’s worth. The output is the price of the car. Such problems, where the output is a number, are regression problems.

Both regression and classification are supervised learning problems where there is an input, X, an output, Y, and the task is to learn the mapping from the input to the output. The approach in machine learning is that we assume a model defined up to a set of parameters:

y = g(x | θ)

where g(·) is the model and θ are its parameters. y is a number in regression and is a class code (e.g., 0/1) in the case of classification. g(·) is the regression function or, in classification, it is the discriminant function separating the instances of different classes. The machine learning program optimizes the parameters, θ, such that the approximation error is minimized, that is, our estimates are as close as possible to the correct values given in the training set.
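
To make the “optimize the parameters” step concrete, here is a minimal sketch (my own made-up data) that fits the linear model g(x | w1, w0) = w1·x + w0 by least squares using the closed-form solution:

#include <iostream>

int main()
{
    // Toy training set: x = car age in years, y = price (made-up numbers).
    double x[] = {1, 2, 3, 4, 5};
    double y[] = {9.0, 8.1, 6.9, 6.2, 5.1};
    const int n = 5;

    // Closed-form least squares for y = w1*x + w0:
    //   w1 = (sum(x*y) - n*meanX*meanY) / (sum(x*x) - n*meanX*meanX)
    //   w0 = meanY - w1*meanX
    double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
    for (int i = 0; i < n; ++i) {
        sumX += x[i];
        sumY += y[i];
        sumXY += x[i] * y[i];
        sumXX += x[i] * x[i];
    }
    double meanX = sumX / n, meanY = sumY / n;
    double w1 = (sumXY - n * meanX * meanY) / (sumXX - n * meanX * meanX);
    double w0 = meanY - w1 * meanX;

    std::cout << "g(x) = " << w1 << " * x + " << w0 << std::endl;
    std::cout << "predicted price of a 6-year-old car: " << w1 * 6 + w0 << std::endl;
    return 0;
}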

One can envisage other applications of regression where one is trying to optimize a function. Let us say we want to build a machine that roasts coffee. The machine has many inputs that affect the quality: various settings of temperatures, times, coffee bean type, and so forth. We make a number of experiments and for different settings of these inputs, we measure the quality of the coffee, for example, as consumer satisfaction. To find the optimal setting, we fit a regression model linking these inputs to coffee quality and choose new points to sample near the optimum of the current model to look for a better configuration. We sample these points, check quality, and add these to the data and fit a new model. This is generally called response surface design.

Unsupervised Learning:

In supervised learning, the aim is to learn a mapping from the input to an output whose correct values are provided by a supervisor. In unsupervised learning, there is no such supervisor and we only have input data. The aim is to find the regularities in the input. There is a structure to the input space such that certain patterns occur more often than others, and we want to see what generally happens and what does not. In statistics, this is called density estimation.

One method for density estimation is clustering where the aim is to find clusters or groupings of input. In the case of a company with a data of past customers, the customer data contains the demographic information as well as the past transactions with the company, and the company may want to see the distribution of the profile of its customers, to see what type of customers frequently occur. In such a case, a clustering model allocates customers similar in their attributes to the same group, providing the company with natural groupings of its customers. Once such groups are found, the company may decide strategies, for example, services and products, specific to different groups. Such a grouping also allows identifying those who are outliers, namely, those who are different from other customers, which may imply a niche in the market that can be further exploited by the company.

An interesting application of clustering is in image compression. In this case, the input instances are image pixels represented as RGB values. A clustering program groups pixels with similar colors in the same group, and such groups correspond to the colors occurring frequently in the image. If in an image, there are only shades of a small number of colors and if we code those belonging to the same group with one color, for example, their average, then the image is quantized. Let us say the pixels are 24 bits to represent 16 million colors, but if there are shades of only 64 main colors, for each pixel, we need 6 bits instead of 24. For example, if the scene has various shades of blue in different parts of the image, and if we use the same average blue for all of them, we lose the details in the image but gain space in storage and transmission. Ideally, one would like to identify higher-level regularities by analyzing repeated image patterns, for example, texture, objects, and so forth. This allows a higher-level, simpler, and more useful description of the scene, and for example, achieves better compression than compressing at the pixel level.

One application area of computer science in molecular biology is alignment, which is matching one sequence to another. This is a difficult string matching problem because strings may be quite long, there are many template strings to match against, and there may be deletions, insertions, and substitutions. Clustering is used in learning motifs, which are sequences of amino acids that occur repeatedly in proteins. Motifs are of interest because they may correspond to structural or functional elements within the sequences they characterize. The analogy is that if the amino acids are letters and proteins are sentences, motifs are like words, namely, a string of letters with a particular meaning occurring frequently in different sentences.

Reinforcement Learning:

In some applications, the output of the system is a sequence of actions. In such a case, a single action is not important; what is important is the policy, that is, the sequence of correct actions to reach the goal. An action is good if it is part of a good policy. In such a case, the machine learning program should be able to assess the goodness of policies and learn from past good action sequences to be able to generate a policy. Such learning methods are called reinforcement learning algorithms.

A good example is game playing where a single move by itself is not that important; it is the sequence of right moves that is good. A move is good if it is part of a good game playing policy.

A robot navigating in an environment in search of a goal location is another application area of reinforcement learning. At any time, the robot can move in one of a number of directions. After a number of trial runs, it should learn the correct sequence of actions to reach the goal state from an initial state, doing this as quickly as possible and without hitting any of the obstacles.

One factor that makes reinforcement learning harder is when the system has unreliable and partial sensory information. For example, a robot equipped with a video camera has incomplete information and thus at any time is in a partially observable state and should decide taking into account this uncertainty. A task may also require a concurrent operation of multiple agents that should interact and cooperate to accomplish a common goal. An example is a team of robots playing soccer.

Notes

Induction is the process of extracting general rules from a set of particular cases.

In statistics, going from particular observations to general descriptions is called inference and learning is called estimation. Classification is called discriminant analysis in statistics.

In engineering, classification is called pattern recognition and the approach is nonparametric and much more empirical.

Machine learning is related to artificial intelligence (Russell and Norvig 1995) because an intelligent system should be able to adapt to changes in its environment.

Data mining is the name coined in the business world for the application of machine learning algorithms to large amounts of data (Weiss and Indurkhya 1998). In computer science, it is also called knowledge discovery in databases (KDD).

Chapter’s Important Keywords:

  1. Machine Learning.
  2. Data Mining.
  3. Descriptive Model.
  4. Predictive Model.
  5. Association Rule.
  6. Classification.
  7. Discriminant Function.
  8. Knowledge Extraction.
  9. Compression.
  10. Outlier Detection.
  11. Supervised Learning.
  12. Response Surface Design.
  13. Density Estimation.
  14. Clustering.
  15. Reinforcement Learning.
  16. Partially Observable State.
  17. Induction.
  18. Inference.
  19. Estimation.
  20. Discriminant Analysis.
  21. Knowledge Discovery in Databases.

Preface

Machine learning is about programming computers to optimize a performance
criterion using example data or past experience.

Examples of Machine Learning

  1. Recognition of spoken speech, that is, converting the acoustic speech signal to ASCII text.
  2. Consider routing packets over a computer network. The path maximizing the quality of service from a source to destination changes continuously as the network traffic changes. A learning routing program is able to adapt to the best path by monitoring the network traffic.
  3. Intelligent user interface that can adapt to the biometrics of its user, namely, his or her accent, handwriting, working habits, and so forth.
  4. Retail companies analyze their past sales data to learn their customers’ behavior to improve customer relationship management.
  5. Financial institutions analyze past transactions to predict customers’ credit risks.
  6. Robots learn to optimize their behavior to complete a task using minimum resources.
  7. In bioinformatics, the huge amount of data can only be analyzed and knowledge be extracted using computers.
  8. Cars that can drive themselves under different road and weather conditions.
  9. Phones that can translate in real time to and from a foreign language.
  10. Autonomous robots that can navigate in a new environment, for example, on the surface of another planet.

 

The book I’m reading discusses many methods that have their bases in different fields: statistics, pattern recognition, neural networks, artificial intelligence, signal processing, control, and data mining.