./run:3: warning: parenthesize argument(s) for future version
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
MulticlassClassification
===== MAIN: learn based on training data =====
=== START program1: ./run learn ../dataset2/train
./run:3: warning: parenthesize argument(s) for future version
MulticlassClassification
=== END program1: ./run learn ../dataset2/train --- OK [1s]
===== MAIN: predict/evaluate on train data =====
=== START program3: ./run stripLabels ../dataset2/train ../program0/evalTrain.in
=== END program3: ./run stripLabels ../dataset2/train ../program0/evalTrain.in --- OK [0s]
=== START program1: ./run predict ../program0/evalTrain.in ../program0/evalTrain.out
./run:3: warning: parenthesize argument(s) for future version
MulticlassClassification
=== END program1: ./run predict ../program0/evalTrain.in ../program0/evalTrain.out --- OK [0s]
=== START program4: ./run evaluate ../dataset2/train ../program0/evalTrain.out
=== END program4: ./run evaluate ../dataset2/train ../program0/evalTrain.out --- OK [0s]
===== MAIN: predict/evaluate on test data =====
=== START program3: ./run stripLabels ../dataset2/test ../program0/evalTest.in
=== END program3: ./run stripLabels ../dataset2/test ../program0/evalTest.in --- OK [0s]
=== START program1: ./run predict ../program0/evalTest.in ../program0/evalTest.out
./run:3: warning: parenthesize argument(s) for future version
MulticlassClassification
=== END program1: ./run predict ../program0/evalTest.in ../program0/evalTest.out --- OK [0s]
=== START program4: ./run evaluate ../dataset2/test ../program0/evalTest.out
=== END program4: ./run evaluate ../dataset2/test ../program0/evalTest.out --- OK [0s]
real 0m2.167s
user 0m1.524s
sys 0m0.220s
Run specification
supervised-learning: Main entry for supervised learning for training and testing a program on a dataset.
(learner:Program) JRip_weka_nominal: This program is part of the WEKA classifier library. The code used to generate this program comes from the Java class 'weka/classifiers/rules/JRip.java' in WEKA's libraries.
The following description was taken from this class's JavaDoc information:
---------------------
This class implements a propositional rule learner, Repeated Incremental Pruning to Produce Error Reduction (RIPPER), which was proposed by William W. Cohen as an optimized version of IREP.
The algorithm is briefly described as follows:
Initialize RS = {}, and for each class from the less prevalent one to the more frequent one, DO:
1. Building stage:
Repeat 1.1 and 1.2 until the description length (DL) of the ruleset and examples is 64 bits greater than the smallest DL met so far, or there are no positive examples, or the error rate >= 50%.
1.1. Grow phase:
Grow one rule by greedily adding antecedents (or conditions) to the rule until the rule is perfect (i.e. 100% accurate). The procedure tries every possible value of each attribute and selects the condition with highest information gain: p(log(p/t)-log(P/T)).
1.2. Prune phase:
Incrementally prune each rule and allow the pruning of any final sequences of the antecedents. The pruning metric is (p-n)/(p+n), but since this equals 2p/(p+n) - 1, this implementation simply uses p/(p+n) (actually (p+1)/(p+n+2), so that if p+n is 0, the value is 0.5).
2. Optimization stage:
After generating the initial ruleset {Ri}, generate and prune two variants of each rule Ri from randomized data using procedures 1.1 and 1.2. One variant is generated from an empty rule, while the other is generated by greedily adding antecedents to the original rule. Moreover, the pruning metric used here is (TP+TN)/(P+N). Then the smallest possible DL for each variant and the original rule is computed. The variant with the minimal DL is selected as the final representative of Ri in the ruleset. After all the rules in {Ri} have been examined, and if there are still residual positives, more rules are generated based on the residual positives using the Building stage again.
3. Delete the rules from the ruleset that would increase the DL of the whole ruleset if they were in it, and add the resultant ruleset to RS.
ENDDO
Note that there seem to be 2 bugs in the original ripper program that would affect the ruleset size and accuracy slightly. This implementation avoids these bugs and thus is a little bit different from Cohen's original implementation. Even after fixing the bugs, since the order of classes with the same frequency is not defined in ripper, there still seems to be some trivial difference between this implementation and the original ripper, especially for audiology data in UCI repository, where there are lots of classes of few instances.
For details, please see:
William W. Cohen: Fast Effective Rule Induction. In: Twelfth International Conference on Machine Learning, 115-123, 1995.
PS. We have compared this implementation with the original ripper implementation in aspects of accuracy, ruleset size and running time on both artificial data "ab+bcd+defg" and UCI datasets. In all these aspects it seems to be quite comparable to the original ripper implementation. However, we didn't consider memory consumption optimization in this implementation.
---------------------
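To make the two rule-quality measures in the description above concrete, here is a minimal Python sketch. It is not WEKA's code. It assumes that P and T are the positive and total examples covered by the rule before a condition is added, that p and t are the same counts after it is added, and that the logarithm is base 2 (the description defines none of these). For the pruning metric it assumes p and n are the positive and negative examples the rule covers on the pruning data. The function names are illustrative only.

```python
import math


def grow_gain(p, t, P, T):
    """Information gain p * (log(p/t) - log(P/T)) for a candidate condition.

    Assumed symbols (not defined in the description):
      P, T: positive / total examples covered before adding the condition
      p, t: positive / total examples covered after adding the condition
    The base-2 logarithm is also an assumption.
    """
    if p == 0:
        return 0.0  # no positives left: treat the gain as zero
    return p * (math.log2(p / t) - math.log2(P / T))


def prune_value(p, n):
    """Smoothed pruning metric (p + 1) / (p + n + 2).

    With no covered examples (p + n == 0) this evaluates to 0.5.
    """
    return (p + 1) / (p + n + 2)


def raw_prune_value(p, n):
    """Unsmoothed metric (p - n) / (p + n), which equals 2p/(p+n) - 1."""
    return (p - n) / (p + n)
```

Because (p-n)/(p+n) is an increasing linear function of p/(p+n), ranking rules by either value gives the same order, which is why the simpler ratio can stand in for it.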
NOTE: This algorithm has no parameter tuning; it uses the default WEKA parameters.
NOTE: WEKA's classifiers read data in the .arff format. For multiclass datasets, the SVMlight format is converted to the .arff multiclass format so that it can be read by WEKA programs.
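As a rough illustration of the kind of conversion that note refers to, the sketch below turns SVMlight-style lines ("label index:value ...") into a dense ARFF text with numeric attributes and a nominal class. It is not the converter used for this run. The function name, the attribute names f1, f2, ..., the default relation name, and the choice of dense numeric attributes are all assumptions, and lines with a qid field are not handled.

```python
def svmlight_to_arff(lines, relation="dataset"):
    """Convert SVMlight-style lines to dense ARFF text (illustrative only)."""
    rows, labels, max_index = [], [], 0
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        label, *pairs = line.split()
        features = {}
        for pair in pairs:
            index, value = pair.split(":", 1)
            features[int(index)] = float(value)
        if features:
            max_index = max(max_index, max(features))
        rows.append(features)
        labels.append(label)

    classes = sorted(set(labels))
    out = [f"@relation {relation}", ""]
    # Hypothetical attribute names: one numeric attribute per feature index.
    out += [f"@attribute f{i} numeric" for i in range(1, max_index + 1)]
    out.append("@attribute class {" + ",".join(classes) + "}")
    out += ["", "@data"]
    for features, label in zip(rows, labels):
        # Missing indices are written as 0.0, as in SVMlight's sparse convention.
        values = [repr(features.get(i, 0.0)) for i in range(1, max_index + 1)]
        out.append(",".join(values + [label]))
    return "\n".join(out) + "\n"
```

For example, `svmlight_to_arff(["2 1:0.5 3:1", "1 2:2"])` declares three numeric attributes and a class attribute with the values 1 and 2, followed by one data row per input line.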
When you generate a run, you can set a time limit for the run (no more than 24 hours). After that point, we will terminate the program.
Your program can use 1.5GB of memory.
Go to the page for the run and look at the log file for signs of the responsible error.
You can also download the run and run it locally on your machine (a README file should
be included in the download which provides more information).
We said that a run was simply a program/dataset pair, but that's not the full story.
A run actually includes other helper programs such as the evaluation program and
various programs for reductions (e.g., one-versus-all, hyperparameter tuning).
More formally, a run is given by a run specification,
which can be found on the page for any run.
A run specification is a tree where each internal node represents a program
and its children represent the arguments to be passed into its constructor.
For example, the one-versus-all program takes your binary classification program
as a constructor argument and behaves like a multiclass classification program.
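
The one-versus-all example can be sketched in Python as follows. This is only an illustration of the idea of a program that takes another program as a constructor argument; it is not the site's actual reduction. The class names, the learn/score/predict methods, and the toy centroid-based binary learner are all hypothetical. The wrapper takes a factory rather than a single instance because it needs one binary model per class.

```python
class CentroidBinary:
    """Hypothetical toy binary learner on 1-D inputs with labels +1 / -1."""

    def learn(self, xs, ys):
        pos = [x for x, y in zip(xs, ys) if y > 0]
        neg = [x for x, y in zip(xs, ys) if y < 0]
        self.pos_mean = sum(pos) / len(pos)
        self.neg_mean = sum(neg) / len(neg)

    def score(self, x):
        # Higher when x is closer to the positive mean than to the negative mean.
        return abs(x - self.neg_mean) - abs(x - self.pos_mean)


class OneVersusAll:
    """Wraps a binary learner so that it behaves like a multiclass learner."""

    def __init__(self, make_binary):
        self.make_binary = make_binary  # constructor argument: the child program
        self.models = {}

    def learn(self, xs, labels):
        for label in sorted(set(labels)):
            model = self.make_binary()
            # Relabel: the current class is +1, every other class is -1.
            model.learn(xs, [1 if y == label else -1 for y in labels])
            self.models[label] = model

    def predict(self, x):
        # Pick the class whose binary model gives the highest score.
        return max(self.models, key=lambda label: self.models[label].score(x))


ova = OneVersusAll(CentroidBinary)
ova.learn([0.0, 0.1, 5.0, 5.2, 10.0, 10.1], ["a", "a", "b", "b", "c", "c"])
```

In run-specification terms, `OneVersusAll` would be an internal node of the tree and `CentroidBinary` its child.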