Commit b503aa14 authored by Andre Maroneze

Merge branch 'add-genann' into 'master'

[genann] add new case study

See merge request pub/open-source-case-studies!12
parents ebd7b71f b119465b
Showing
with 3054 additions and 1 deletion
......@@ -56,6 +56,7 @@ TARGETS=\
cerberus \
chrony \
debie1 \
genann \
gzip124 \
hiredis \
icpc \
......
......@@ -127,6 +127,7 @@ when available. We also summarize the license of each directory below.
- `debie1`: distribution and use authorized by Patria Aviation Oy,
Space Systems Finland Ltd. and Tidorum Ltd, see `README.txt`
and `terms_of_use-2014-05.pdf`
- `genann`: Zlib, see `LICENSE`
- `gzip124`: GPL
- `hiredis`: Redis license (BSD-style), see `COPYING`
- `icpc`: Unlicense
......
Subproject commit 086dbf6774409ce5457bbc1ddb232e1f97714b23
Subproject commit 4adb46e09b8f6bd9809fa98cf1bbc7731c8eea87
# Makefile template for Frama-C/Eva case studies.
# For details and usage information, see the Frama-C User Manual.
### Prologue. Do not modify this block. #######################################
-include path.mk
FRAMAC ?= frama-c
include $(shell $(FRAMAC)-config -print-share-path)/analysis-scripts/prologue.mk
###############################################################################
# Edit below as needed. Suggested flags are optional.
MACHDEP = gcc_x86_64
# The source code uses the __FILE__ macro; to obtain stable oracles, we use
# a specific GCC option based on the directory containing this Makefile.
mkfile_path := $(abspath $(firstword $(MAKEFILE_LIST))/../..)
## Preprocessing flags (for -cpp-extra-args)
CPPFLAGS += \
-ffile-prefix-map=$(mkfile_path)=.
## General flags
FCFLAGS += \
-add-symbolic-path=..:. \
-kernel-warn-key annot:missing-spec=abort \
-kernel-warn-key typing:implicit-function-declaration=abort \
## Eva-specific flags
EVAFLAGS += \
-eva-warn-key builtins:missing-spec=abort \
-eva-slevel 9 \
## GUI-only flags
FCGUIFLAGS += \
## Analysis targets (suffixed with .eva)
TARGETS = genann.eva
### Each target <t>.eva needs a rule <t>.parse with source files as prerequisites
genann.parse: \
../genann.c \
../test.c \
### Epilogue. Do not modify this block. #######################################
include $(shell $(FRAMAC)-config -print-share-path)/analysis-scripts/epilogue.mk
###############################################################################
# optional, for OSCS
-include ../../Makefile.common
directory file line function property kind status property
. genann.c 115 genann_init signed_overflow Unknown (int)((int)(inputs + 1) * hidden) + (int)((int)((int)(hidden_layers - 1) * (int)(hidden + 1)) * hidden) ≤ 2147483647
. genann.c 115 genann_init signed_overflow Unknown (int)(inputs + 1) * hidden ≤ 2147483647
. genann.c 115 genann_init signed_overflow Unknown inputs + 1 ≤ 2147483647
. genann.c 115 genann_init signed_overflow Unknown (int)((int)(hidden_layers - 1) * (int)(hidden + 1)) * hidden ≤ 2147483647
. genann.c 115 genann_init signed_overflow Unknown (int)(hidden_layers - 1) * (int)(hidden + 1) ≤ 2147483647
. genann.c 115 genann_init signed_overflow Unknown hidden + 1 ≤ 2147483647
. genann.c 116 genann_init signed_overflow Unknown tmp_0 * outputs ≤ 2147483647
. genann.c 116 genann_init signed_overflow Unknown inputs + 1 ≤ 2147483647
. genann.c 117 genann_init signed_overflow Unknown hidden_weights + output_weights ≤ 2147483647
. genann.c 119 genann_init signed_overflow Unknown (int)(inputs + (int)(hidden * hidden_layers)) + outputs ≤ 2147483647
. genann.c 119 genann_init signed_overflow Unknown inputs + (int)(hidden * hidden_layers) ≤ 2147483647
. genann.c 119 genann_init signed_overflow Unknown hidden * hidden_layers ≤ 2147483647
. genann.c 122 genann_init signed_overflow Unknown (int)(total_weights + total_neurons) + (int)(total_neurons - inputs) ≤ 2147483647
. genann.c 122 genann_init signed_overflow Unknown total_weights + total_neurons ≤ 2147483647
. genann.c 126 genann_init mem_access Unknown \valid(&ret->inputs)
. genann.c 127 genann_init mem_access Unknown \valid(&ret->hidden_layers)
. genann.c 128 genann_init mem_access Unknown \valid(&ret->hidden)
. genann.c 129 genann_init mem_access Unknown \valid(&ret->outputs)
. genann.c 131 genann_init mem_access Unknown \valid(&ret->total_weights)
. genann.c 132 genann_init mem_access Unknown \valid(&ret->total_neurons)
. genann.c 135 genann_init mem_access Unknown \valid(&ret->weight)
. genann.c 136 genann_init mem_access Unknown \valid(&ret->output)
. genann.c 136 genann_init mem_access Unknown \valid_read(&ret->total_weights)
. genann.c 136 genann_init mem_access Unknown \valid_read(&ret->weight)
. genann.c 137 genann_init mem_access Unknown \valid(&ret->delta)
. genann.c 137 genann_init mem_access Unknown \valid_read(&ret->output)
. genann.c 137 genann_init mem_access Unknown \valid_read(&ret->total_neurons)
. genann.c 141 genann_init mem_access Unknown \valid(&ret->activation_hidden)
. genann.c 142 genann_init mem_access Unknown \valid(&ret->activation_output)
. genann.c 164 genann_read mem_access Unknown \valid_read(&ann->total_weights)
. genann.c 166 fscanf_va_2 precondition Unknown \valid(param0)
. genann.c 166 genann_read mem_access Unknown \valid_read(&ann->weight)
. genann.c 166 genann_read precondition of fscanf_va_2 Unknown \valid(param0)
. genann.c 180 genann_copy mem_access Unknown \valid_read(&ann->inputs)
. genann.c 180 genann_copy mem_access Unknown \valid_read(&ann->total_neurons)
. genann.c 180 genann_copy mem_access Unknown \valid_read(&ann->total_weights)
. genann.c 197 genann_randomize mem_access Unknown \valid_read(&ann->total_weights)
. genann.c 200 genann_randomize mem_access Unknown \valid(ann->weight + i)
. genann.c 200 genann_randomize mem_access Unknown \valid_read(&ann->weight)
. genann.c 212 genann_run mem_access Unknown \valid_read(&ann->weight)
. genann.c 397 genann_write mem_access Unknown \valid_read(&ann->outputs)
. genann.c 401 genann_write initialization Unknown \initialized(ann->weight + i)
. test.c 37 basic mem_access Unknown \valid_read(&ann->total_weights)
. test.c 71 xor mem_access Unknown \valid(&ann->activation_hidden)
. test.c 201 persist precondition of fclose Unknown valid_stream: \valid(stream)
. test.c 206 persist precondition of fclose Unknown valid_stream: \valid(stream)
. test.c 208 persist mem_access Unknown \valid_read(&second->inputs)
. test.c 209 persist mem_access Unknown \valid_read(&second->hidden_layers)
. test.c 210 persist mem_access Unknown \valid_read(&second->hidden)
. test.c 211 persist mem_access Unknown \valid_read(&second->outputs)
. test.c 212 persist mem_access Unknown \valid_read(&second->total_weights)
. test.c 216 persist signed_overflow Unknown ltests + 1 ≤ 2147483647
. test.c 216 persist initialization Unknown \initialized(first->weight + i)
. test.c 216 persist initialization Unknown \initialized(second->weight + i)
. test.c 216 persist is_nan_or_infinite Unknown \is_finite(*(second->weight + i))
. test.c 216 persist mem_access Unknown \valid_read(&second->weight)
. test.c 216 persist mem_access Unknown \valid_read(second->weight + i)
. test.c 216 persist signed_overflow Unknown lfails + 1 ≤ 2147483647
. test.c 229 copy signed_overflow Unknown ltests + 1 ≤ 2147483647
. test.c 229 copy mem_access Unknown \valid_read(&second->inputs)
. test.c 230 copy signed_overflow Unknown ltests + 1 ≤ 2147483647
. test.c 231 copy signed_overflow Unknown ltests + 1 ≤ 2147483647
. test.c 232 copy signed_overflow Unknown ltests + 1 ≤ 2147483647
. test.c 233 copy signed_overflow Unknown ltests + 1 ≤ 2147483647
. test.c 237 copy signed_overflow Unknown ltests + 1 ≤ 2147483647
. test.c 237 copy initialization Unknown \initialized(first->weight + i)
. test.c 237 copy initialization Unknown \initialized(second->weight + i)
. test.c 237 copy signed_overflow Unknown lfails + 1 ≤ 2147483647
. test.c 251 sigmoid signed_overflow Unknown ltests + 1 ≤ 2147483647
. test.c 251 sigmoid signed_overflow Unknown lfails + 1 ≤ 2147483647
. test.c 270 main signed_overflow Unknown (int)(ltests - ts_6) - (int)(lfails - fs_6) ≤ 2147483647
. test.c 270 main signed_overflow Unknown -2147483648 ≤ (int)(ltests - ts_6) - (int)(lfails - fs_6)
. test.c 271 main signed_overflow Unknown (int)(ltests - ts_7) - (int)(lfails - fs_7) ≤ 2147483647
. test.c 271 main signed_overflow Unknown -2147483648 ≤ (int)(ltests - ts_7) - (int)(lfails - fs_7)
FRAMAC_SHARE/libc stdio.h 120 fclose precondition Unknown valid_stream: \valid(stream)
[metrics] Eva coverage statistics
=======================
Syntactically reachable functions = 25 (out of 25)
Semantically reached functions = 24
Coverage estimation = 96.0%
Unreached functions (1) =
<genann.c>: genann_act_linear;
[metrics] References to non-analyzed functions
------------------------------------
Function genann_train references genann_act_linear (at genann.c:292)
Function genann_train references genann_act_linear (at genann.c:293)
[metrics] Statements analyzed by Eva
--------------------------
1042 stmts in analyzed functions, 947 stmts analyzed (90.9%)
backprop: 18 stmts out of 18 (100.0%)
genann_act_hidden_indirect: 2 stmts out of 2 (100.0%)
genann_act_output_indirect: 2 stmts out of 2 (100.0%)
genann_act_threshold: 2 stmts out of 2 (100.0%)
genann_copy: 12 stmts out of 12 (100.0%)
genann_free: 2 stmts out of 2 (100.0%)
genann_init: 48 stmts out of 48 (100.0%)
genann_init_sigmoid_lookup: 10 stmts out of 10 (100.0%)
genann_randomize: 10 stmts out of 10 (100.0%)
genann_read: 30 stmts out of 30 (100.0%)
genann_write: 9 stmts out of 9 (100.0%)
main: 116 stmts out of 116 (100.0%)
persist: 45 stmts out of 45 (100.0%)
sigmoid: 21 stmts out of 21 (100.0%)
train_and: 71 stmts out of 71 (100.0%)
train_or: 72 stmts out of 72 (100.0%)
train_xor: 71 stmts out of 71 (100.0%)
basic: 78 stmts out of 80 (97.5%)
genann_train: 132 stmts out of 152 (86.8%)
genann_act_sigmoid_cached: 22 stmts out of 26 (84.6%)
genann_run: 91 stmts out of 118 (77.1%)
copy: 31 stmts out of 41 (75.6%)
xor: 47 stmts out of 73 (64.4%)
genann_act_sigmoid: 5 stmts out of 11 (45.5%)
[metrics] Defined functions (25)
======================
backprop (1 call); basic (1 call); copy (1 call);
genann_act_hidden_indirect (2 calls);
genann_act_linear (address taken) (0 call);
genann_act_output_indirect (address taken) (2 calls);
genann_act_sigmoid (3 calls);
genann_act_sigmoid_cached (address taken) (2 calls);
genann_act_threshold (address taken) (0 call); genann_copy (1 call);
genann_free (11 calls); genann_init (9 calls);
genann_init_sigmoid_lookup (1 call); genann_randomize (2 calls);
genann_read (1 call); genann_run (47 calls); genann_train (4 calls);
genann_write (1 call); main (0 call); persist (1 call); sigmoid (1 call);
train_and (1 call); train_or (1 call); train_xor (1 call); xor (1 call);
Specified-only functions (0)
============================
Undefined and unspecified functions (0)
=======================================
'Extern' global variables (0)
=============================
Potential entry points (1)
==========================
main;
Global metrics
==============
Sloc = 1043
Decision point = 101
Global variables = 6
If = 101
Loop = 32
Goto = 17
Assignment = 477
Exit point = 25
Function = 25
Function call = 224
Pointer dereferencing = 309
Cyclomatic complexity = 126
test.c:44:[kernel:parser:decimal-float] warning: Floating-point constant 0.001 is not represented exactly. Will use 0x1.0624dd2f1a9fcp-10.
(warn-once: no further messages from category 'parser:decimal-float' will be emitted)
../../path.mk
\ No newline at end of file
zlib License
Copyright (C) 2015-2018 Lewis Van Winkle
This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any damages
arising from the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgement in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
CFLAGS = -Wall -Wshadow -O3 -g -march=native
LDLIBS = -lm
all: check example1 example2 example3 example4
sigmoid: CFLAGS += -Dgenann_act=genann_act_sigmoid_cached
sigmoid: all
threshold: CFLAGS += -Dgenann_act=genann_act_threshold
threshold: all
linear: CFLAGS += -Dgenann_act=genann_act_linear
linear: all
test: test.o genann.o
check: test
./$^
example1: example1.o genann.o
example2: example2.o genann.o
example3: example3.o genann.o
example4: example4.o genann.o
clean:
$(RM) *.o
$(RM) test example1 example2 example3 example4 *.exe
$(RM) persist.txt
.PHONY: sigmoid threshold linear clean
[![Build Status](https://travis-ci.org/codeplea/genann.svg?branch=master)](https://travis-ci.org/codeplea/genann)
<img alt="Genann logo" src="https://codeplea.com/public/content/genann_logo.png" align="right" />
# Genann
Genann is a minimal, well-tested library for training and using feedforward
artificial neural networks (ANN) in C. Its primary focus is on being simple,
fast, reliable, and hackable. It achieves this by providing only the necessary
functions and little extra.
## Features
- **C99 with no dependencies**.
- Contained in a single source code and header file.
- Simple.
- Fast and thread-safe.
- Easily extendible.
- Implements backpropagation training.
- *Compatible with alternative training methods* (classic optimization, genetic algorithms, etc)
- Includes examples and test suite.
- Released under the zlib license - free for nearly any use.
## Building
Genann is self-contained in two files: `genann.c` and `genann.h`. To use Genann, simply add those two files to your project.
## Example Code
Four example programs are included with the source code.
- [`example1.c`](./example1.c) - Trains an ANN on the XOR function using backpropagation.
- [`example2.c`](./example2.c) - Trains an ANN on the XOR function using random search.
- [`example3.c`](./example3.c) - Loads and runs an ANN from a file.
- [`example4.c`](./example4.c) - Trains an ANN on the [IRIS data-set](https://archive.ics.uci.edu/ml/datasets/Iris) using backpropagation.
## Quick Example
We create an ANN taking 2 inputs, having 1 layer of 3 hidden neurons, and
providing 2 outputs. It has the following structure:
![NN Example Structure](./doc/e1.png)
We then train it on a set of labeled data using backpropagation and ask it to
predict on a test data point:
```C
#include <stdio.h>
#include "genann.h"
/* Not shown, loading your training and test data. */
double **training_data_input, **training_data_output, **test_data_input;
/* New network with 2 inputs,
* 1 hidden layer of 3 neurons each,
* and 2 outputs. */
genann *ann = genann_init(2, 1, 3, 2);
/* Learn on the training set. */
int i, j;
for (i = 0; i < 300; ++i) {
for (j = 0; j < 100; ++j)
genann_train(ann, training_data_input[j], training_data_output[j], 0.1);
}
/* Run the network and see what it predicts. */
double const *prediction = genann_run(ann, test_data_input[0]);
printf("Output for the first test data point is: %f, %f\n", prediction[0], prediction[1]);
genann_free(ann);
```
This example shows API usage only; it does not demonstrate good machine learning
techniques. In a real application you would likely want to learn on the training
data in a random order. You would also want to monitor the learning to prevent
over-fitting.
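The random ordering mentioned above could be sketched as follows. This is a minimal illustration, not part of Genann: the helper name `shuffled_order` is hypothetical, and it uses `rand()` with a Fisher-Yates shuffle to produce a random visiting order over `n` training rows.

```C
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the Genann API.
 * Fills order[0..n-1] with the indices 0..n-1 and shuffles them in place
 * using the Fisher-Yates algorithm. */
void shuffled_order(int *order, int n) {
    assert(order != NULL && n >= 0);
    for (int i = 0; i < n; ++i) {
        order[i] = i;
    }
    for (int i = n - 1; i > 0; --i) {
        /* rand() % (i + 1) has a slight modulo bias; acceptable for a sketch. */
        int j = rand() % (i + 1);
        int tmp = order[i];
        order[i] = order[j];
        order[j] = tmp;
    }
}
```

Inside each pass of the training loop, one would call `shuffled_order(order, 100)` and then train on `training_data_input[order[j]]` and `training_data_output[order[j]]` instead of indexing with `j` directly.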
## Usage
### Creating and Freeing ANNs
```C
genann *genann_init(int inputs, int hidden_layers, int hidden, int outputs);
genann *genann_copy(genann const *ann);
void genann_free(genann *ann);
```
Creating a new ANN is done with the `genann_init()` function. Its arguments
are the number of inputs, the number of hidden layers, the number of neurons in
each hidden layer, and the number of outputs. It returns a `genann` struct pointer.
Calling `genann_copy()` will create a deep-copy of an existing `genann` struct.
Call `genann_free()` when you're finished with an ANN returned by `genann_init()`, `genann_copy()`, or `genann_read()`.
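To get a feel for how the four arguments determine the size of a network, the sketch below counts weights and neurons for a given shape. The helper names `count_weights` and `count_neurons` are hypothetical, and the formulas are an assumption based on a fully connected layout with one bias weight per neuron; the sketch does not guard against `int` overflow for very large arguments.

```C
#include <assert.h>

/* Hypothetical helper, not part of the Genann API.
 * Number of weights for a fully connected network, assuming one extra
 * bias weight per neuron in every hidden and output layer. */
int count_weights(int inputs, int hidden_layers, int hidden, int outputs) {
    assert(inputs >= 1 && hidden_layers >= 0 && outputs >= 1);
    if (hidden_layers == 0) {
        /* No hidden layer: the inputs feed the output layer directly. */
        return (inputs + 1) * outputs;
    }
    assert(hidden >= 1);
    /* First hidden layer, then the remaining hidden-to-hidden layers. */
    int hidden_weights = (inputs + 1) * hidden
                       + (hidden_layers - 1) * (hidden + 1) * hidden;
    int output_weights = (hidden + 1) * outputs;
    return hidden_weights + output_weights;
}

/* Hypothetical helper: number of neurons, counting the inputs as well. */
int count_neurons(int inputs, int hidden_layers, int hidden, int outputs) {
    return inputs + hidden * hidden_layers + outputs;
}
```

Under these assumptions, the shape `genann_init(2, 1, 3, 2)` corresponds to 17 weights and 7 neurons.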
### Training ANNs
```C
void genann_train(genann const *ann, double const *inputs,
double const *desired_outputs, double learning_rate);
```
`genann_train()` will perform one update using standard backpropagation. It
should be called by passing in an array of inputs, an array of expected outputs,
and a learning rate. See *example1.c* for an example of learning with
backpropagation.
A primary design goal of Genann was to store all the network weights in one
contiguous block of memory. This makes it easy and efficient to train the
network weights using direct-search numeric optimization algorithms,
such as [Hill Climbing](https://en.wikipedia.org/wiki/Hill_climbing),
[the Genetic Algorithm](https://en.wikipedia.org/wiki/Genetic_algorithm), [Simulated
Annealing](https://en.wikipedia.org/wiki/Simulated_annealing), etc.
These methods can be used by searching on the ANN's weights directly.
Every `genann` struct contains the members `int total_weights;` and
`double *weight;`. `*weight` points to an array of `total_weights`
size which contains all weights used by the ANN. See *example2.c* for
an example of training using random hill climbing search.
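Because the weights are just a flat array of `double`, a direct search can be written without knowing anything about the network structure. The sketch below is one possible random hill climber over such an array; it is not the code from *example2.c*. The names `error_fn`, `hill_climb`, and `toy_error` are hypothetical, and the error callback stands in for whatever measure of network error you choose.

```C
#include <assert.h>
#include <stdlib.h>

/* Hypothetical callback type: returns the error of a weight vector
 * (lower is better). */
typedef double (*error_fn)(const double *weight, int total_weights);

/* Random hill climbing: perturb one randomly chosen weight per step and
 * keep the change only if the error decreases. Returns the final error. */
double hill_climb(double *weight, int total_weights, error_fn err,
                  int steps, double step_size) {
    assert(weight != NULL && total_weights > 0 && err != NULL);
    double best = err(weight, total_weights);
    for (int s = 0; s < steps; ++s) {
        int k = rand() % total_weights;
        double old = weight[k];
        /* Uniform perturbation in [-step_size, +step_size]. */
        weight[k] += step_size * (2.0 * rand() / (double)RAND_MAX - 1.0);
        double e = err(weight, total_weights);
        if (e < best) {
            best = e;          /* improvement: keep the new weight */
        } else {
            weight[k] = old;   /* no improvement: undo the change */
        }
    }
    return best;
}

/* Toy error used only to exercise the sketch: squared distance of the
 * weight vector from the all-ones vector. */
double toy_error(const double *weight, int total_weights) {
    double sum = 0.0;
    for (int i = 0; i < total_weights; ++i) {
        double d = weight[i] - 1.0;
        sum += d * d;
    }
    return sum;
}
```

With a real network, one would pass `ann->weight` and `ann->total_weights`, and the callback would run the network on the training set and accumulate the prediction error.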
### Saving and Loading ANNs
```C
genann *genann_read(FILE *in);
void genann_write(genann const *ann, FILE *out);
```
Genann provides the `genann_read()` and `genann_write()` functions for loading or saving an ANN in a text-based format.
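As an illustration of what a text-based format can look like, the sketch below reads four leading integers from a saved network. It assumes, based on the sample file in this commit (`2 1 2 1 -1.777 ...`), that a saved network starts with the inputs, hidden layers, hidden neurons per layer, and outputs, followed by the weights; that field order is an assumption, and the helper name `parse_shape` is hypothetical.

```C
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of the Genann API.
 * Reads the four shape fields from the start of a saved network held in
 * a string. Returns 1 on success and 0 if the header is malformed. */
int parse_shape(const char *text, int *inputs, int *hidden_layers,
                int *hidden, int *outputs) {
    assert(text != NULL && inputs != NULL && hidden_layers != NULL);
    assert(hidden != NULL && outputs != NULL);
    return sscanf(text, "%d %d %d %d",
                  inputs, hidden_layers, hidden, outputs) == 4;
}
```

In real code, prefer `genann_read()`, which loads the shape and the weights together.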
### Evaluating
```C
double const *genann_run(genann const *ann, double const *inputs);
```
Call `genann_run()` on a trained ANN to run a feed-forward pass on a given set of inputs. `genann_run()`
will provide a pointer to the array of predicted outputs (of `ann->outputs` length).
## Hints
- All functions start with `genann_`.
- The code is simple. Dig in and change things.
## Extra Resources
The [comp.ai.neural-nets
FAQ](http://www.faqs.org/faqs/ai-faq/neural-nets/part1/) is an excellent
resource for an introduction to artificial neural networks.
If you need an even smaller neural network library, check out the excellent single-hidden-layer library [tinn](https://github.com/glouw/tinn).
If you're looking for a heavier, more opinionated neural network library in C,
I recommend the [FANN library](http://leenissen.dk/fann/wp/). Another
good library is Peter van Rossum's [Lightweight Neural
Network](http://lwneuralnet.sourceforge.net/), which despite its name, is
heavier and has more features than Genann.
digraph G {
rankdir=LR;
{i1 i2} -> {h1 h2 h3} -> {o1 o2};
i1, i2, h1, h2, h3, o1, o2 [shape=circle; label="";];
input -> hidden -> output [style=invis;];
input, hidden, output [shape=plaintext;];
}
genann/doc/e1.png

21.5 KiB

5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5.0,3.4,1.5,0.2,Iris-setosa
4.4,2.9,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
5.4,3.7,1.5,0.2,Iris-setosa
4.8,3.4,1.6,0.2,Iris-setosa
4.8,3.0,1.4,0.1,Iris-setosa
4.3,3.0,1.1,0.1,Iris-setosa
5.8,4.0,1.2,0.2,Iris-setosa
5.7,4.4,1.5,0.4,Iris-setosa
5.4,3.9,1.3,0.4,Iris-setosa
5.1,3.5,1.4,0.3,Iris-setosa
5.7,3.8,1.7,0.3,Iris-setosa
5.1,3.8,1.5,0.3,Iris-setosa
5.4,3.4,1.7,0.2,Iris-setosa
5.1,3.7,1.5,0.4,Iris-setosa
4.6,3.6,1.0,0.2,Iris-setosa
5.1,3.3,1.7,0.5,Iris-setosa
4.8,3.4,1.9,0.2,Iris-setosa
5.0,3.0,1.6,0.2,Iris-setosa
5.0,3.4,1.6,0.4,Iris-setosa
5.2,3.5,1.5,0.2,Iris-setosa
5.2,3.4,1.4,0.2,Iris-setosa
4.7,3.2,1.6,0.2,Iris-setosa
4.8,3.1,1.6,0.2,Iris-setosa
5.4,3.4,1.5,0.4,Iris-setosa
5.2,4.1,1.5,0.1,Iris-setosa
5.5,4.2,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
5.0,3.2,1.2,0.2,Iris-setosa
5.5,3.5,1.3,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
4.4,3.0,1.3,0.2,Iris-setosa
5.1,3.4,1.5,0.2,Iris-setosa
5.0,3.5,1.3,0.3,Iris-setosa
4.5,2.3,1.3,0.3,Iris-setosa
4.4,3.2,1.3,0.2,Iris-setosa
5.0,3.5,1.6,0.6,Iris-setosa
5.1,3.8,1.9,0.4,Iris-setosa
4.8,3.0,1.4,0.3,Iris-setosa
5.1,3.8,1.6,0.2,Iris-setosa
4.6,3.2,1.4,0.2,Iris-setosa
5.3,3.7,1.5,0.2,Iris-setosa
5.0,3.3,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.9,3.1,4.9,1.5,Iris-versicolor
5.5,2.3,4.0,1.3,Iris-versicolor
6.5,2.8,4.6,1.5,Iris-versicolor
5.7,2.8,4.5,1.3,Iris-versicolor
6.3,3.3,4.7,1.6,Iris-versicolor
4.9,2.4,3.3,1.0,Iris-versicolor
6.6,2.9,4.6,1.3,Iris-versicolor
5.2,2.7,3.9,1.4,Iris-versicolor
5.0,2.0,3.5,1.0,Iris-versicolor
5.9,3.0,4.2,1.5,Iris-versicolor
6.0,2.2,4.0,1.0,Iris-versicolor
6.1,2.9,4.7,1.4,Iris-versicolor
5.6,2.9,3.6,1.3,Iris-versicolor
6.7,3.1,4.4,1.4,Iris-versicolor
5.6,3.0,4.5,1.5,Iris-versicolor
5.8,2.7,4.1,1.0,Iris-versicolor
6.2,2.2,4.5,1.5,Iris-versicolor
5.6,2.5,3.9,1.1,Iris-versicolor
5.9,3.2,4.8,1.8,Iris-versicolor
6.1,2.8,4.0,1.3,Iris-versicolor
6.3,2.5,4.9,1.5,Iris-versicolor
6.1,2.8,4.7,1.2,Iris-versicolor
6.4,2.9,4.3,1.3,Iris-versicolor
6.6,3.0,4.4,1.4,Iris-versicolor
6.8,2.8,4.8,1.4,Iris-versicolor
6.7,3.0,5.0,1.7,Iris-versicolor
6.0,2.9,4.5,1.5,Iris-versicolor
5.7,2.6,3.5,1.0,Iris-versicolor
5.5,2.4,3.8,1.1,Iris-versicolor
5.5,2.4,3.7,1.0,Iris-versicolor
5.8,2.7,3.9,1.2,Iris-versicolor
6.0,2.7,5.1,1.6,Iris-versicolor
5.4,3.0,4.5,1.5,Iris-versicolor
6.0,3.4,4.5,1.6,Iris-versicolor
6.7,3.1,4.7,1.5,Iris-versicolor
6.3,2.3,4.4,1.3,Iris-versicolor
5.6,3.0,4.1,1.3,Iris-versicolor
5.5,2.5,4.0,1.3,Iris-versicolor
5.5,2.6,4.4,1.2,Iris-versicolor
6.1,3.0,4.6,1.4,Iris-versicolor
5.8,2.6,4.0,1.2,Iris-versicolor
5.0,2.3,3.3,1.0,Iris-versicolor
5.6,2.7,4.2,1.3,Iris-versicolor
5.7,3.0,4.2,1.2,Iris-versicolor
5.7,2.9,4.2,1.3,Iris-versicolor
6.2,2.9,4.3,1.3,Iris-versicolor
5.1,2.5,3.0,1.1,Iris-versicolor
5.7,2.8,4.1,1.3,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
7.1,3.0,5.9,2.1,Iris-virginica
6.3,2.9,5.6,1.8,Iris-virginica
6.5,3.0,5.8,2.2,Iris-virginica
7.6,3.0,6.6,2.1,Iris-virginica
4.9,2.5,4.5,1.7,Iris-virginica
7.3,2.9,6.3,1.8,Iris-virginica
6.7,2.5,5.8,1.8,Iris-virginica
7.2,3.6,6.1,2.5,Iris-virginica
6.5,3.2,5.1,2.0,Iris-virginica
6.4,2.7,5.3,1.9,Iris-virginica
6.8,3.0,5.5,2.1,Iris-virginica
5.7,2.5,5.0,2.0,Iris-virginica
5.8,2.8,5.1,2.4,Iris-virginica
6.4,3.2,5.3,2.3,Iris-virginica
6.5,3.0,5.5,1.8,Iris-virginica
7.7,3.8,6.7,2.2,Iris-virginica
7.7,2.6,6.9,2.3,Iris-virginica
6.0,2.2,5.0,1.5,Iris-virginica
6.9,3.2,5.7,2.3,Iris-virginica
5.6,2.8,4.9,2.0,Iris-virginica
7.7,2.8,6.7,2.0,Iris-virginica
6.3,2.7,4.9,1.8,Iris-virginica
6.7,3.3,5.7,2.1,Iris-virginica
7.2,3.2,6.0,1.8,Iris-virginica
6.2,2.8,4.8,1.8,Iris-virginica
6.1,3.0,4.9,1.8,Iris-virginica
6.4,2.8,5.6,2.1,Iris-virginica
7.2,3.0,5.8,1.6,Iris-virginica
7.4,2.8,6.1,1.9,Iris-virginica
7.9,3.8,6.4,2.0,Iris-virginica
6.4,2.8,5.6,2.2,Iris-virginica
6.3,2.8,5.1,1.5,Iris-virginica
6.1,2.6,5.6,1.4,Iris-virginica
7.7,3.0,6.1,2.3,Iris-virginica
6.3,3.4,5.6,2.4,Iris-virginica
6.4,3.1,5.5,1.8,Iris-virginica
6.0,3.0,4.8,1.8,Iris-virginica
6.9,3.1,5.4,2.1,Iris-virginica
6.7,3.1,5.6,2.4,Iris-virginica
6.9,3.1,5.1,2.3,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
6.8,3.2,5.9,2.3,Iris-virginica
6.7,3.3,5.7,2.5,Iris-virginica
6.7,3.0,5.2,2.3,Iris-virginica
6.3,2.5,5.0,1.9,Iris-virginica
6.5,3.0,5.2,2.0,Iris-virginica
6.2,3.4,5.4,2.3,Iris-virginica
5.9,3.0,5.1,1.8,Iris-virginica
1. Title: Iris Plants Database
Updated Sept 21 by C.Blake - Added discrepancy information
2. Sources:
(a) Creator: R.A. Fisher
(b) Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov)
(c) Date: July, 1988
3. Past Usage:
- Publications: too many to mention!!! Here are a few.
1. Fisher,R.A. "The use of multiple measurements in taxonomic problems"
Annual Eugenics, 7, Part II, 179-188 (1936); also in "Contributions
to Mathematical Statistics" (John Wiley, NY, 1950).
2. Duda,R.O., & Hart,P.E. (1973) Pattern Classification and Scene Analysis.
(Q327.D83) John Wiley & Sons. ISBN 0-471-22361-1. See page 218.
3. Dasarathy, B.V. (1980) "Nosing Around the Neighborhood: A New System
Structure and Classification Rule for Recognition in Partially Exposed
Environments". IEEE Transactions on Pattern Analysis and Machine
Intelligence, Vol. PAMI-2, No. 1, 67-71.
-- Results:
-- very low misclassification rates (0% for the setosa class)
4. Gates, G.W. (1972) "The Reduced Nearest Neighbor Rule". IEEE
Transactions on Information Theory, May 1972, 431-433.
-- Results:
-- very low misclassification rates again
5. See also: 1988 MLC Proceedings, 54-64. Cheeseman et al's AUTOCLASS II
conceptual clustering system finds 3 classes in the data.
4. Relevant Information:
--- This is perhaps the best known database to be found in the pattern
recognition literature. Fisher's paper is a classic in the field
and is referenced frequently to this day. (See Duda & Hart, for
example.) The data set contains 3 classes of 50 instances each,
where each class refers to a type of iris plant. One class is
linearly separable from the other 2; the latter are NOT linearly
separable from each other.
--- Predicted attribute: class of iris plant.
--- This is an exceedingly simple domain.
--- This data differs from the data presented in Fisher's article
(identified by Steve Chadwick, spchadwick@espeedaz.net )
The 35th sample should be: 4.9,3.1,1.5,0.2,"Iris-setosa"
where the error is in the fourth feature.
The 38th sample: 4.9,3.6,1.4,0.1,"Iris-setosa"
where the errors are in the second and third features.
5. Number of Instances: 150 (50 in each of three classes)
6. Number of Attributes: 4 numeric, predictive attributes and the class
7. Attribute Information:
1. sepal length in cm
2. sepal width in cm
3. petal length in cm
4. petal width in cm
5. class:
-- Iris Setosa
-- Iris Versicolour
-- Iris Virginica
8. Missing Attribute Values: None
Summary Statistics:
                 Min  Max   Mean    SD   Class Correlation
   sepal length: 4.3  7.9   5.84  0.83    0.7826
    sepal width: 2.0  4.4   3.05  0.43   -0.4194
   petal length: 1.0  6.9   3.76  1.76    0.9490  (high!)
    petal width: 0.1  2.5   1.20  0.76    0.9565  (high!)
9. Class Distribution: 33.3% for each of 3 classes.
2 1 2 1 -1.777 -5.734 -6.029 -4.460 -3.261 -3.172 2.444 -6.581 5.826