Summary

PCP (Pattern Classification Program) is an open-source program for supervised classification of patterns.

PCP has been developed and tested on the Linux/i386 platform (Fedora Core 4). A Linux binary is included in the PCP distribution and should run out of the box on most Linux distributions.

The PCP distribution also includes a Windows binary, which requires the Cygwin library cygwin1.dll to run. Cygwin is a free UNIX-like environment for Windows that can be downloaded from http://www.cygwin.com. Note that PCP needs only the library file cygwin1.dll, not the complete Cygwin environment.

PCP is released under the MIT license (also known as the X11 license). This license permits free use and distribution for any purpose, including commercial use, in binary and source form. See the file LICENSING for details.

Unpacking the Distribution

GNU/Linux

% gunzip -c pcp-2.2.tar.gz | tar xvf -

This will create a subdirectory `pcp-2.2' in your current directory.

Windows/Cygwin

Double-click the pcp-2.2.zip file to unpack the software. Before you can run it, you need a copy of the library file cygwin1.dll. The Cygwin environment, which contains the library, can be downloaded from http://www.cygwin.com (perhaps you can download just the library file; I haven't tried that). After installation, the library is typically found in the directory c:\cygwin\bin. You then have two options:

  • add the directory containing the library (typically c:\cygwin\bin) to the PATH environment variable
  • copy the library file into the directory that contains the pcp.exe executable

Once the library is in place, you can run the PCP executable from a DOS window or a Cygwin terminal window.

Quick Start

For a quick test of PCP, try this:

% cd pcp-2.2
% Linux/pcp -b srbct_test.bat

This command runs pcp in batch mode using the command file srbct_test.bat. It builds a Support Vector Machine (SVM) classifier for the well-known SRBCT (small round blue cell tumors) childhood cancer data set [3]. The problem is to predict the tumor type of a patient, and hence help choose an appropriate treatment, from a vector of microarray (gene expression) measurements. When processing completes, the program returns to the command-line prompt. The resulting SVM is stored in the file pcp.svm.
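When the batch run is part of a larger script, it can help to confirm that the model file was actually produced. The following is a minimal sketch and not part of PCP: the function name run_batch is hypothetical, and because this README does not document the exit status of pcp, the check relies only on the presence of a non-empty model file. A stale model file left over from an earlier run would also pass this check.

```shell
#!/bin/sh
# run_batch EXE BATCH_FILE MODEL_FILE
# Hypothetical helper (not part of PCP): runs EXE in batch mode with
# "-b BATCH_FILE", then checks that MODEL_FILE exists and is not empty.
run_batch() {
    exe=$1
    batch=$2
    model=$3

    if [ ! -x "$exe" ]; then
        echo "executable not found: $exe" >&2
        return 1
    fi
    if [ ! -r "$batch" ]; then
        echo "batch file not readable: $batch" >&2
        return 1
    fi

    # Assumption: the exit status of pcp is undocumented here, so it is
    # ignored; success is judged by the model file alone.
    "$exe" -b "$batch" || true

    if [ ! -s "$model" ]; then
        echo "model file missing or empty: $model" >&2
        return 1
    fi
    echo "model written to $model"
}

# Example (run from the pcp-2.2 directory):
#   run_batch Linux/pcp srbct_test.bat pcp.svm
```

The file names in the example call are the ones used in this Quick Start; for other batch files the name of the output model may differ.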

To run prediction with the resulting SVM model on an independent test data set, type:

% Linux/pcp

You should see the `Main Menu'. Press `b' to enter `Pattern Classification', then `f' to enter the `Support Vector Machines' menu. In that menu, press `c' for `Prediction', then press `Enter' twice to accept the default answers. The results should look something like this:

Enter SVM model file name [pcp.svm]:

Short (0) or long (1) output [0]:

+----------------------------------------------------------------------------+
|        Class         |  Actual/predicted card.  |        Error rate        |
+----------------------------------------------------------------------------+
|                      |           25/25          |   12.00% (     3/    25) |
|   1/ews_test         |            7/10          |    0.00% (     0/     7) |
|   2/rms_test         |            8/6           |   25.00% (     2/     8) |
|   3/nb_test          |            7/6           |   14.29% (     1/     7) |
|   4/bl_test          |            3/3           |    0.00% (     0/     3) |
+----------------------------------------------------------------------------+
| Vector |          Actual  class          |         SVM  prediction         |
+----------------------------------------------------------------------------+
|     13 | rms_test                        | ews_test                        |
|     14 | rms_test                        | ews_test                        |
|     22 | nb_test                         | ews_test                        |
+----------------------------------------------------------------------------+

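The table above shows the classification results for the test data set. Class names are the data file names without the extension. The first row gives the totals: 3 of the 25 test vectors were misclassified, for a cumulative error rate of 12%. The lower part of the table lists each misclassified vector with its actual and predicted class.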
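The percentages in the table follow directly from the counts. As a worked check, the sketch below recomputes them with awk. The function names error_rate and predicted_card are illustrative, and the reading of the `Actual/predicted card.' column (number of vectors that belong to a class versus number assigned to it) is inferred from the table rather than stated by PCP.

```shell
#!/bin/sh
# error_rate ERRORS ACTUAL
# Percentage of a class's vectors that were assigned to another class.
error_rate() {
    awk -v e="$1" -v n="$2" 'BEGIN { printf "%.2f%%\n", 100 * e / n }'
}

# predicted_card ACTUAL LOST GAINED
# Vectors assigned to a class = actual members, minus members sent to
# other classes, plus vectors wrongly received from other classes.
predicted_card() {
    echo $(( $1 - $2 + $3 ))
}

# Counts taken from the table above.
echo "rms_test error rate: $(error_rate 2 8)"        # 2 of 8 predicted as ews_test
echo "nb_test error rate:  $(error_rate 1 7)"        # 1 of 7 predicted as ews_test
echo "cumulative:          $(error_rate 3 25)"       # 3 of 25 in total
echo "ews_test predicted:  $(predicted_card 7 0 3)"  # 7 own + 3 received
echo "rms_test predicted:  $(predicted_card 8 2 0)"  # 8 own - 2 lost
```

Run as is, the script prints 25.00%, 14.29% and 12.00% for the three error rates and 10 and 6 for the two predicted cardinalities, matching the corresponding table entries.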

See the User's Guide for more information, including instructions on how to prepare and run your own data sets.

Documentation

For usage, see the accompanying User's Guide, pcp.pdf. To compile PCP for other platforms, see the file COMPILING. For licensing information, see the file LICENSING.

Distribution Contents

Linux/pcp                  PCP executable for Linux
Cygwin/pcp.exe             PCP executable for Windows/Cygwin

README                     this file
LICENSING                  plain English description of licensing terms
PCP_LICENSE                PCP license (MIT license)
HASH_LICENSE               license for Kaz Kylheku's hash code
LIBSVM_LICENSE             license for LIBSVM library by Chih-Chung Chang and Chih-Jen Lin
LAPACK_LICENSE             LAPACK license
COMPILING                  instructions for porting PCP to other platforms
ChangeLog                  history of differences between releases
pcp.pdf                    User's Guide

iris_setosa.dat            IRIS dataset
iris_versicolor.dat
iris_virginica.dat
landsat*.dat               Landsat dataset [1]
landtst*.dat
all_train.dat              Leukemia dataset [2]
aml_train.dat
all_test.dat
aml_test.dat
ews.dat                    SRBCT (childhood cancer) dataset [3]
rms.dat
nb.dat
bl.dat
ews_test.dat
rms_test.dat
nb_test.dat
bl_test.dat

iris.bat                   batch file for loading the IRIS dataset
al.bat                     batch file for loading the Leukemia dataset
landsat.bat                batch file for loading the Landsat dataset
srbct.bat                  batch file for loading the SRBCT dataset
srbct_test.bat             batch files for SVM learning on the SRBCT dataset
srbct_svm.bat

src                        source code directory
lapack                     LAPACK library source code directory

configure.ac               GNU Autoconf build files
configure
Makefile.am
Makefile.in
install-sh
aclocal.m4
missing
depcomp

Author

PCP was designed and written by Ljubomir J. Buturovic of San Francisco State University. Sasha Jaksic of San Francisco State University contributed code for the feature selection functionality.

Please send feedback and comments to ljubomir@sfsu.edu.

Bibliography

[1] C. L. Blake and C. J. Merz, UCI Repository of Machine Learning Databases [http://www.ics.uci.edu/~mlearn/MLRepository.html], Irvine, CA: University of California, Department of Information and Computer Science, 1998.

[2] T. Golub, D. Slonim, P. Tamayo, C. Huard, M. Gaasenbeek, J. Mesirov, H. Coller, M. Loh, J. Downing, M. Caligiuri, C. Bloomfield, and E. S. Lander, ``Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring,'' Science, 286(5439):531-537, October 1999.

[3] J. Khan, J. S. Wei, M. Ringner, L. H. Saal, M. Ladanyi, F. Westermann, F. Berthold, M. Schwab, C. R. Antonescu, C. Peterson, and P. S. Meltzer, ``Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks,'' Nat. Med., 7(6):673-679, June 2001.

