java.lang.Object
  com.sun.speech.freetts.en.TokenizerImpl
Implements the tokenizer interface. Breaks an input sequence of characters into a set of tokens.
| Field Summary | |
| static java.lang.String | DEFAULT_POSTPUNCTUATION_SYMBOLS - A string containing the default post-punctuation characters. |
| static java.lang.String | DEFAULT_PREPUNCTUATION_SYMBOLS - A string containing the default pre-punctuation characters. |
| static java.lang.String | DEFAULT_SINGLE_CHAR_SYMBOLS - A string containing the default single characters. |
| static java.lang.String | DEFAULT_WHITESPACE_SYMBOLS - A string containing the default whitespace characters. |
| static int | EOF - A constant indicating that the end of the stream has been read. |
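The following is a minimal sketch of how whitespace, pre-punctuation, and post-punctuation symbol sets can divide text into tokens. It is not the FreeTTS implementation: the class `SymbolClassSketch`, its `Parts` type, and the three symbol strings are assumptions made for illustration (the actual default values are not listed on this page), and single-character symbols are ignored.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the FreeTTS implementation. */
final class SymbolClassSketch {
    // Assumed symbol sets. The real defaults are the DEFAULT_* constants of
    // TokenizerImpl, whose values are not shown in this documentation.
    static final String WHITESPACE = " \t\n\r";
    static final String PREPUNCTUATION = "\"'`({[";
    static final String POSTPUNCTUATION = "\"'`.,:;!?(){}[]";

    /** One token split into prepunctuation, word, and postpunctuation. */
    static final class Parts {
        final String pre;
        final String word;
        final String post;

        Parts(String pre, String word, String post) {
            this.pre = pre;
            this.word = word;
            this.post = post;
        }

        @Override
        public String toString() {
            return "[" + pre + "|" + word + "|" + post + "]";
        }
    }

    /** Splits a whitespace-free chunk into its three parts. */
    static Parts split(String chunk) {
        int start = 0;
        // Leading characters from the prepunctuation set.
        while (start < chunk.length()
                && PREPUNCTUATION.indexOf(chunk.charAt(start)) >= 0) {
            start++;
        }
        int end = chunk.length();
        // Trailing characters from the postpunctuation set.
        while (end > start
                && POSTPUNCTUATION.indexOf(chunk.charAt(end - 1)) >= 0) {
            end--;
        }
        return new Parts(chunk.substring(0, start),
                chunk.substring(start, end),
                chunk.substring(end));
    }

    /** Separates text on whitespace, then splits each chunk. */
    static List<Parts> tokenize(String text) {
        List<Parts> out = new ArrayList<>();
        StringBuilder chunk = new StringBuilder();
        for (int i = 0; i <= text.length(); i++) {
            boolean atEnd = i == text.length();
            if (atEnd || WHITESPACE.indexOf(text.charAt(i)) >= 0) {
                if (chunk.length() > 0) {
                    out.add(split(chunk.toString()));
                    chunk.setLength(0);
                }
            } else {
                chunk.append(text.charAt(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints [[(|Hello|,], [|world|!)]]
        System.out.println(tokenize("(Hello, world!)"));
    }
}
```

In this sketch, changing one of the three strings changes where a token's word ends and its punctuation begins, which is the role the `set...Symbols` methods appear to play for this class.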
| Constructor Summary | |
| TokenizerImpl() - Constructs a Tokenizer. |
| TokenizerImpl(java.io.Reader file) - Creates a tokenizer that will return tokens from the given file. |
| TokenizerImpl(java.lang.String string) - Creates a tokenizer that will return tokens from the given string. |
| Method Summary | |
| java.lang.String | getErrorDescription() - If hasErrors returns true, returns a description of the error encountered; otherwise returns null. |
| Token | getNextToken() - Returns the next token. |
| boolean | hasErrors() - Returns true if there were errors while reading tokens. |
| boolean | hasMoreTokens() - Returns true if there are more tokens, false otherwise. |
| boolean | isBreak() - Determines if the current token should start a new sentence. |
| void | setInputReader(java.io.Reader reader) - Sets the input reader. |
| void | setInputText(java.lang.String inputString) - Sets the text to tokenize. |
| void | setPostpunctuationSymbols(java.lang.String symbols) - Sets the postpunctuation symbols of this Tokenizer to the given symbols. |
| void | setPrepunctuationSymbols(java.lang.String symbols) - Sets the prepunctuation symbols of this Tokenizer to the given symbols. |
| void | setSingleCharSymbols(java.lang.String symbols) - Sets the single character symbols of this Tokenizer to the given symbols. |
| void | setWhitespaceSymbols(java.lang.String symbols) - Sets the whitespace symbols of this Tokenizer to the given symbols. |
| Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
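The methods above suggest a read loop: call `getNextToken()` while `hasMoreTokens()` is true, then check `hasErrors()` and `getErrorDescription()`. The sketch below shows that pattern with a hypothetical stand-in, `MiniTokenizer`, which borrows the method names but returns plain strings instead of `Token` objects and splits on whitespace only. Its error handling is an assumption and is not taken from the FreeTTS source.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the FreeTTS implementation. */
final class ReadLoopSketch {

    /** Stand-in that mirrors a few method names from the summary above. */
    static final class MiniTokenizer {
        private final Reader reader;
        private String next;             // look-ahead token, null when exhausted
        private String errorDescription; // null until a read fails

        MiniTokenizer(Reader reader) {
            this.reader = reader;
            advance();
        }

        boolean hasMoreTokens() {
            return next != null;
        }

        String getNextToken() {
            String token = next;
            if (token != null) {
                advance();
            }
            return token;
        }

        boolean hasErrors() {
            return errorDescription != null;
        }

        String getErrorDescription() {
            return errorDescription;
        }

        /** Reads the next whitespace-delimited word into the look-ahead. */
        private void advance() {
            StringBuilder word = new StringBuilder();
            try {
                int c;
                while ((c = reader.read()) != -1) {
                    if (Character.isWhitespace(c)) {
                        if (word.length() > 0) {
                            break; // end of the current word
                        }
                    } else {
                        word.append((char) c);
                    }
                }
            } catch (IOException e) {
                // Assumed behavior: record the failure and stop producing tokens.
                errorDescription = e.getMessage();
            }
            next = word.length() > 0 ? word.toString() : null;
        }
    }

    /** Drains a tokenizer using the hasMoreTokens/getNextToken loop. */
    static List<String> readAll(MiniTokenizer tokenizer) {
        List<String> words = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            words.add(tokenizer.getNextToken());
        }
        return words;
    }

    /** A reader that always fails, used to exercise the error path. */
    static Reader failingReader() {
        return new Reader() {
            @Override
            public int read(char[] buffer, int offset, int length) throws IOException {
                throw new IOException("simulated read failure");
            }

            @Override
            public void close() {
            }
        };
    }

    public static void main(String[] args) {
        MiniTokenizer ok = new MiniTokenizer(new StringReader("one two  three"));
        System.out.println(readAll(ok) + " errors=" + ok.hasErrors());

        MiniTokenizer bad = new MiniTokenizer(failingReader());
        System.out.println(readAll(bad) + " error=" + bad.getErrorDescription());
    }
}
```

Checking `hasErrors()` after the loop matters because an empty result alone does not distinguish empty input from a failed read.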
| Field Detail |
public static final int EOF
public static final java.lang.String DEFAULT_WHITESPACE_SYMBOLS
public static final java.lang.String DEFAULT_SINGLE_CHAR_SYMBOLS
public static final java.lang.String DEFAULT_PREPUNCTUATION_SYMBOLS
public static final java.lang.String DEFAULT_POSTPUNCTUATION_SYMBOLS
| Constructor Detail |
public TokenizerImpl()

public TokenizerImpl(java.lang.String string)
Parameters: string - the string to tokenize

public TokenizerImpl(java.io.Reader file)
Parameters: file - where to read the input from

| Method Detail |
public void setWhitespaceSymbols(java.lang.String symbols)
Specified by: setWhitespaceSymbols in interface Tokenizer
Parameters: symbols - the whitespace symbols

public void setSingleCharSymbols(java.lang.String symbols)
Specified by: setSingleCharSymbols in interface Tokenizer
Parameters: symbols - the single character symbols

public void setPrepunctuationSymbols(java.lang.String symbols)
Specified by: setPrepunctuationSymbols in interface Tokenizer
Parameters: symbols - the prepunctuation symbols

public void setPostpunctuationSymbols(java.lang.String symbols)
Specified by: setPostpunctuationSymbols in interface Tokenizer
Parameters: symbols - the postpunctuation symbols

public void setInputText(java.lang.String inputString)
Specified by: setInputText in interface Tokenizer
Parameters: inputString - the string to tokenize

public void setInputReader(java.io.Reader reader)
Specified by: setInputReader in interface Tokenizer
Parameters: reader - the input source

public Token getNextToken()
Specified by: getNextToken in interface Tokenizer
Returns: the next token, or null if there are no more tokens

public boolean hasMoreTokens()
Returns true if there are more tokens, false otherwise.
Specified by: hasMoreTokens in interface Tokenizer
Returns: true if there are more tokens; false otherwise

public boolean hasErrors()
Returns true if there were errors while reading tokens.
Specified by: hasErrors in interface Tokenizer
Returns: true if there were errors; false otherwise

public java.lang.String getErrorDescription()
If hasErrors returns true, this will return a description of the error encountered; otherwise it will return null.
Specified by: getErrorDescription in interface Tokenizer

public boolean isBreak()
Specified by: isBreak in interface Tokenizer
Returns: true if a new sentence should be started
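
To illustrate what a sentence-break test like `isBreak()` can be used for, the sketch below groups whitespace-separated tokens into sentences. The rule it applies (a token starts a new sentence when the previous token ended in `.`, `!`, or `?`) is an assumption chosen for simplicity; this page does not state the rule that `TokenizerImpl` uses, and `SentenceBreakSketch` is a hypothetical class.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the FreeTTS implementation. */
final class SentenceBreakSketch {

    // Assumed rule: the current token starts a new sentence when the
    // previous token ended with sentence-final punctuation.
    static boolean isBreak(String previousToken) {
        if (previousToken == null || previousToken.isEmpty()) {
            return false;
        }
        char last = previousToken.charAt(previousToken.length() - 1);
        return last == '.' || last == '!' || last == '?';
    }

    /** Groups whitespace-separated tokens into sentences using isBreak. */
    static List<String> sentences(String text) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        String previous = null;
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // split yields one empty string for blank input
            }
            if (isBreak(previous) && current.length() > 0) {
                result.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(token);
            previous = token;
        }
        if (current.length() > 0) {
            result.add(current.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [It works., Does it?, Yes!]
        System.out.println(sentences("It works. Does it? Yes!"));
    }
}
```

A rule this simple would also split after abbreviations such as "Dr.", so a production tokenizer would likely need more context than the previous token's last character.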