(see termFreq {tm})

removePunctuation: A logical value indicating whether punctuation characters should be removed from doc, a custom function which performs punctuation removal, or a list of arguments for removePunctuation. Defaults to FALSE.

removeNumbers: A logical value indicating whether numbers should be removed from doc, or a custom function for number removal. Defaults to FALSE.

stopwords: Either a logical value indicating stopword removal using the default language-specific stopword lists shipped with this package, a character vector holding custom stopwords, or a custom function for stopword removal. Defaults to FALSE.

bounds: A list with a tag local whose value must be an integer vector of length 2. Terms that appear in doc less often than the lower bound bounds$local[1] or more often than the upper bound bounds$local[2] are discarded. Defaults to list(local = c(1, Inf)), i.e., every token is used.

wordLengths: An integer vector of length 2. Words shorter than the minimum word length wordLengths[1] or longer than the maximum word length wordLengths[2] are discarded. Defaults to c(3, Inf), i.e., a minimum word length of 3 characters.
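
As a rough illustration of how these options combine, the sketch below passes them to termFreq through its control list. The sample sentence, the custom stopword vector, and the variable names are assumptions made for this example only; exact tokenization may vary between tm versions.

```r
library(tm)

# Hypothetical sample text, chosen only to exercise each option.
doc <- PlainTextDocument("The cat sat on the mat. The cat has 2 toys!")

# Options described above, passed via the control list.
ctrl <- list(
  removePunctuation = TRUE,           # strip "." and "!"
  removeNumbers     = TRUE,           # strip the digit "2"
  stopwords         = c("the"),       # custom stopword vector (assumed for this example)
  bounds            = list(local = c(1, Inf)),  # the default: keep every frequency
  wordLengths       = c(3, Inf)       # the default: drop words under 3 characters
)

tf <- termFreq(doc, control = ctrl)
print(tf)

# Raising the lower local bound keeps only terms occurring at least twice in doc.
ctrl_frequent <- modifyList(ctrl, list(bounds = list(local = c(2, Inf))))
tf_frequent <- termFreq(doc, control = ctrl_frequent)
print(tf_frequent)
```

Under the rules described above, "the" should be dropped as a stopword and "on" by the minimum word length, while the second call should retain only terms that occur at least twice in the document.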