AI::Categorizer::Collection(3pm)  User Contributed Perl Documentation  AI::Categorizer::Collection(3pm)



NAME
       AI::Categorizer::Collection - Access stored documents

SYNOPSIS
         use AI::Categorizer::Collection::Files;

         my $c = AI::Categorizer::Collection::Files->new
           (path => '/tmp/docs/training',
            category_file => '/tmp/docs/cats.txt');
         print "Total number of docs: ", $c->count_documents, "\n";
         while (my $document = $c->next) {
           ...
         }
         $c->rewind; # For further operations

DESCRIPTION
       This abstract class implements an iterator for accessing documents in
       their natively stored format.  You cannot directly create an instance
       of the Collection class, because it is abstract - see the documentation
       for the "Files", "SingleFile", or "InMemory" subclasses for a concrete
       interface.

METHODS
       new()
           Creates a new Collection object and returns it.  Accepts the
           following parameters:

           category_hash
               A reference to a hash which maps document names to category
               names.  The keys of the hash are the document names; each
               value should be a reference to an array containing the names
               of the categories to which that document belongs.

           category_file
               Indicates a file which should be read in order to create the
               "category_hash".  Each line of the file should list a
               document's name, followed by a list of category names, all
               separated by whitespace.

           stopword_file
               Specifies a file containing a list of "stopwords", which are
               words that should automatically be disregarded when
               scanning/reading documents.  The file should contain one word
               per line.  The file will be parsed and then fed as the
               "stopwords" parameter to the Document "new()" method.

           verbose
               If true, some status/debugging information will be printed to
               "STDOUT" during operation.

           document_class
               The class indicating what type of Document object should be
               created.  This generally specifies the format that the
               documents are stored in.  The default is
               "AI::Categorizer::Document::Text".
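
           The file layouts described for "category_file" and
           "stopword_file" can be illustrated with a minimal sketch.  The
           helpers "parse_category_lines" and "parse_stopword_lines" are
           hypothetical names invented for this example and are not part
           of AI::Categorizer.  The lookup-hash form used for the
           stopwords is likewise an assumption; the module's own parsing
           may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AI::Categorizer): turn lines in the
# category_file layout into the structure described for category_hash.
sub parse_category_lines {
    my (@lines) = @_;
    my %category_hash;
    for my $line (@lines) {
        # split ' ' skips leading whitespace and splits on whitespace runs
        my ($doc, @cats) = split ' ', $line;
        next unless defined $doc;          # skip blank lines
        $category_hash{$doc} = [@cats];    # document name => [category names]
    }
    return \%category_hash;
}

# Hypothetical helper: one stopword per line, returned as a lookup hash
# (an assumed representation, chosen here for fast membership tests).
sub parse_stopword_lines {
    my (@lines) = @_;
    my %stopwords;
    for my $line (@lines) {
        $line =~ s/^\s+|\s+$//g;           # trim surrounding whitespace
        next unless length $line;          # skip blank lines
        $stopwords{$line} = 1;
    }
    return \%stopwords;
}

my $cats = parse_category_lines(
    "doc1.txt sports football\n",
    "doc2.txt politics\n",
);
print join(', ', @{ $cats->{'doc1.txt'} }), "\n";   # prints "sports, football"
```

           A hash built this way has the same shape as the one expected
           by the "category_hash" parameter: document names as keys and
           array references of category names as values.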

       next()
           Returns the next Document object in the Collection.

       rewind()
           Resets the iterator for further calls to "next()".

       count_documents()
           Returns the total number of documents in the Collection.  Note that
           this usually resets the iterator.  This is because it may not be
           possible to resume iterating where we left off.
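
           The interaction between "count_documents()" and the iterator
           can be sketched with a toy class.  "ToyCollection" is a
           hypothetical stand-in written for this example, not an
           AI::Categorizer class, and its counting strategy (walking the
           whole collection) is an assumption about why a reset might be
           needed rather than a description of the real subclasses.

```perl
use strict;
use warnings;

# ToyCollection is a hypothetical stand-in, not an AI::Categorizer class.
package ToyCollection;

sub new {
    my ($class, @docs) = @_;
    return bless { docs => [@docs], pos => 0 }, $class;
}

# Return the next item, or undef once the collection is exhausted.
sub next {
    my ($self) = @_;
    return undef if $self->{pos} >= @{ $self->{docs} };
    return $self->{docs}[ $self->{pos}++ ];
}

# Reset the iterator to the first item.
sub rewind { $_[0]{pos} = 0; return }

# Count by walking the whole collection, then rewind.  Whatever position
# the iterator held before the call is lost.
sub count_documents {
    my ($self) = @_;
    $self->rewind;
    my $n = 0;
    $n++ while defined $self->next;
    $self->rewind;
    return $n;
}

package main;

my $c = ToyCollection->new(qw(a b c));
my $total = $c->count_documents;    # count before iterating, not midway
my @seen;
while (defined(my $doc = $c->next)) {
    push @seen, $doc;
}
$c->rewind;                         # ready for further operations
print "$total documents, saw: @seen\n";   # prints "3 documents, saw: a b c"
```

           Under these assumptions, calling "count_documents()" partway
           through a loop would restart iteration from the first
           document, so it is safest to call it before the loop begins.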

AUTHOR
       Ken Williams, ken AT mathforum.org

COPYRIGHT
       Copyright 2002-2003 Ken Williams.  All rights reserved.

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

SEE ALSO
       AI::Categorizer(3), Storable(3)



perl v5.8.7                       2002-11-24  AI::Categorizer::Collection(3pm)
 
