Finding boundaries

I have a matrix and have found its minimum and maximum values. Please tell me how to do the following:
How do I initialize all possible interval boundaries B with the minimum and maximum?
[Merged from duplicate]
I have a dataset:
age attribute=[3;56;15;17;21;35;45;46;51;56;57;66;70;71]
Please tell me how to process the following (CACC, used below, refers to a discretization algorithm):
CACC finds the minimum (d0 = 3) and maximum (dn = 71) of the age attribute, and then sorts all values in ascending order. The Globalcacc is set to 0 as default. In the first loop, CACC gets the cutting point for which the maximum cacc (=0.5045) is age = 10.50. Since 0.5045 > Globalcacc (=0), CACC updates the Globalcacc = 0.5045 and runs the second loop. At this point, the attribute age is discretized into two intervals: [3.00, 10.50] and (10.50, 71]. Similarly, CACC generates the second cutting point at 61.50 and its corresponding cacc (=0.6473) > Globalcacc (=0.5045), so that Globalcacc is updated to 0.6473 and the third loop is processed. CACC continues to follow the same process for the third cutting point (age = 28.00) with the corresponding Globalcacc = 0.6612, and for the fourth cutting point (age = 48.50) with the corresponding

10 Comments

What are "interval boundaries" in this context?
Have you considered
max(B(:))
min(B(:))
FIR
FIR on 4 Jan 2012
%1 Input: Dataset with i continuous attributes, M examples and S target classes;
%2 Begin
%3 For each continuous attribute Ai
%4 Find the maximum dn and the minimum d0 values of Ai;
%5 Form a set of all distinct values of A in ascending order;
%6 Initialize all possible interval boundaries B with the minimum and maximum
%7 Calculate the midpoints of all the adjacent pairs in the set;
%8 Set the initial discretization scheme as D: {[d0,dn]} and Globalcacc = 0;
These are the steps. I have completed them up to step 4; please tell me how to process the remaining ones.
For %5, a set is by definition unordered, so finding the _set_ of distinct values of A in ascending order is contradictory. Finding the _list_ of distinct values of A in ascending order can be done with unique()
For %6, I do not know what is meant by the sentence. It is also not stated which minimum and maximum are being discussed.
I have a suspicion about what is being asked for, but if I am right, then the algorithm is poorly worded and would not be done in that order, so if the algorithm is written correctly then my interpretation would likely be incorrect.
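For reference, one possible reading of pseudocode lines %4 to %7 can be sketched in MATLAB as follows. This is a sketch under assumptions, not the algorithm's definitive form: the vector A is a made-up example, and treating B as "minimum, midpoints, maximum" is only one interpretation of %6, taken from the wording of the steps. Note that under this reading the midpoints (%7) have to be computed before B (%6) can be assembled.

```matlab
% Sketch of pseudocode lines %4-%7 for a single attribute.
% A is a hypothetical example vector, not data from this thread.
A = [7; 3; 15; 7; 21];

d0 = min(A);                                 % %4: minimum of the attribute
dn = max(A);                                 % %4: maximum of the attribute
vals = unique(A);                            % %5: distinct values, sorted ascending
mids = (vals(1:end-1) + vals(2:end)) / 2;    % %7: midpoints of adjacent distinct values
B = [d0; mids; dn];                          % %6 (assumed reading): candidate boundaries

disp(B.')
```

Here unique() both removes the repeated 7 and sorts the values, so no separate sort() call is needed.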
FIR
FIR on 4 Jan 2012
Given a dataset with i continuous attributes, M
examples, and S target classes, for each attribute Ai, CACC first finds the maximum dn and minimum d0 of
Ai in Line 4 and then forms a set of all distinct values of Ai in the ascending order in Line 5. As a result,
all possible interval boundaries B with the minimum and the maximum, and all the midpoints of all the adjacent
boundaries in the set are obtained in Lines 6 and 7. Then
The set of all distinct values of Ai is an infinite set, as you defined the attributes to be continuous.
Are you still working on finding those "cutoff points" ?
FIR
FIR on 4 Jan 2012
Yes Walter, these are the pseudocode steps I have to follow. Please guide me.
FIR
FIR on 4 Jan 2012
age attribute=[3;56;15;17;21;35;45;46;51;56;57;66;70;71]
Please tell me how to process the following (CACC, used below, refers to a discretization algorithm):
CACC finds the minimum (d0 = 3) and maximum (dn = 71) of the age attribute, and then sorts all values in ascending order. The Globalcacc is set to 0 as default. In the first loop, CACC gets the cutting point for which the maximum cacc (=0.5045) is age = 10.50. Since 0.5045 > Globalcacc (=0), CACC updates the Globalcacc = 0.5045 and runs the second loop. At this point, the attribute age is discretized into two intervals: [3.00, 10.50] and (10.50, 71]. Similarly, CACC generates the second cutting point at 61.50 and its corresponding cacc (=0.6473) > Globalcacc (=0.5045), so that Globalcacc is updated to 0.6473 and the third loop is processed. CACC continues to follow the same process for the third cutting point (age = 28.00) with the corresponding Globalcacc = 0.6612, and for the fourth cutting point (age = 48.50) with the corresponding
That description has far too many "magic numbers" that have no visible justification in the information you have presented. Where did 0.5045 come from? Where did 10.50 come from?
FIR
FIR on 5 Jan 2012
Walter, please look at this file,
http://www.sendspace.com/file/v63szh
and give suggestions.
We assume on this forum that you have already studied the theory behind what you want to do, and that you can explain the key points in an understandable manner. This forum is for assistance with the MATLAB language, not for assistance in understanding theory papers.
And it is time for me to estivate.


Answers (1)

Walter Roberson
Walter Roberson on 6 Jan 2012

0 votes

Your duplicate question has been deleted and the content moved to here.
Your duplicate still had far far too many "magic numbers" for anyone to make any sense of.

4 Comments

FIR
FIR on 7 Jan 2012
Walter, the value of cacc (= 0.5045) is calculated using a formula.
First, Globalcacc is set to 0.
If cacc > Globalcacc, Globalcacc is updated to that maximum value in the first loop.
The Globalcacc is set to 0 as default. In the first loop, CACC gets the cutting point
for which the maximum cacc (=0.5045) is age = 10.50. Since 0.5045 > Globalcacc (=0), CACC updates the
Globalcacc = 0.5045 and runs the second loop. At this point, the attribute age is discretized into two intervals:
[3.00, 10.50] and (10.50, 71]. Similarly, CACC generates the second cutting point at 61.50 and its corresponding
cacc (=0.6473) > Globalcacc (=0.5045), so that Globalcacc is updated to 0.6473 and the third loop is processed.
CACC continues to follow the same process for the third cutting point (age = 28.00) with the corresponding
Globalcacc = 0.6612, and for the fourth cutting point (age = 48.50) with the corresponding Globalcacc = 0.7263. However, in the fifth loop, the maximum cacc generated is less than Globalcacc = 0.7263
and thus, CACC terminates.
What you show us has just
age attribute=[3;56;15;17;21;35;45;46;51;56;57;66;70;71]
Where in that does 0.5045 or 10.50 occur, or what function can be applied to that vector in order to calculate those values? For example, the mean() is clearly going to be much larger than either 0.5045 or 10.50. The standard deviation is 21.93960240857053, which is more than twice 10.50 and has no obvious relationship to either 0.5045 or 10.50.
You need to give us the formula to run on the age attribute in order to calculate the 0.5045 or 10.50 so that we have a chance of figuring out why the cutting point should be there instead of somewhere else.
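Setting the missing formula aside, the control flow of the loop described above (try every unused cut point, keep the best one only if it beats the running maximum, otherwise stop) might be sketched in MATLAB as below. This is only a sketch: scoreFcn is a hypothetical placeholder and is NOT the cacc formula, which is not given in this thread, and the candidate list is made up so that the script runs.

```matlab
% Greedy cut-point selection loop (control flow only).
% candidates and scoreFcn are hypothetical stand-ins, not data or
% formulas from this thread.
candidates = [5.5 10.5 16 19 28];            % made-up candidate cut points (row vector)
% Toy criterion: rewards two arbitrary cuts, penalizes every cut added.
scoreFcn = @(cuts) sum(ismember(cuts, [10.5 28])) - 0.25 * numel(cuts);

cuts = [];                                   % accepted cut points so far
globalScore = 0;                             % plays the role of Globalcacc

while true
    remaining = candidates(~ismember(candidates, cuts));
    bestScore = -Inf;
    bestCut = NaN;
    for c = remaining                        % try each unused candidate in turn
        s = scoreFcn(sort([cuts c]));
        if s > bestScore
            bestScore = s;
            bestCut = c;
        end
    end
    if bestScore > globalScore               % improvement: accept the cut, loop again
        cuts = sort([cuts bestCut]);
        globalScore = bestScore;
    else
        break                                % no improvement: terminate
    end
end

disp(cuts)
```

To use the real criterion, scoreFcn would have to be replaced by a function that computes cacc from the data, the class labels, and the current cut points.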
FIR
FIR on 10 Jan 2012
Walter, I calculate the value 10.5 as follows:
age=[3;5;6;15;17;21;35;45;46;51;56;57;66;70;71]
Ignoring the min and max values, we get
age=[5;6;15;17;21;35;45;46;51;56;57;66;70]
Taking the midpoints of every two adjacent values,
we get
10.5
16
;
;
;
;
;
68
The value of cacc is calculated by a formula.
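The midpoint step described in this comment can be reproduced with a short MATLAB sketch. It assumes the 15-element age vector given in this comment and follows the comment in dropping the minimum and maximum before taking midpoints; the variable names inner and mids are illustrative.

```matlab
% Midpoints of adjacent values, after dropping the minimum and maximum.
age = [3;5;6;15;17;21;35;45;46;51;56;57;66;70;71];   % vector from the comment above

s = sort(age);                                 % ascending order (values here are distinct)
inner = s(2:end-1);                            % drop the minimum (3) and maximum (71)
mids = (inner(1:end-1) + inner(2:end)) / 2;    % midpoints of adjacent pairs

disp(mids.')
```

The result contains 10.5, 28, 48.5 and 61.5, the four cutting points quoted earlier in the thread. It also starts with 5.5, which the list in the comment does not show; whether that value is meant to be excluded is not stated.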
FIR
FIR on 10 Jan 2012
http://www.sendspace.com/file/scymbl


Asked:

FIR
on 4 Jan 2012
