uvizjournal

Installed R on hot (linux) and cream (windows) to experiment with the indexing tools provided by John. Have a question out to him on how to use the dist() and hclust() methods to output two text files: one containing a similarity matrix and one containing, in some yet-to-be-determined format, a representation of the clusters. Once that format is figured out, we can build the tools to import that info and view the clusters in the database interface.

Issues:
- how do we get the right slice out of the cluster hierarchy?
- inter/intra similarity
- how do we keep track of the documents when we make the input?

This last issue is fairly important. The database represents messages, but we may be creating documents made up of multiple messages, for example by dividing up the dataset by subject, author, or date. When we input the clusters, we need to know what set of messages each individual document represents. That implies some kind of flag upon input. Piles and piles and piles of metadata are building up.
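
One possible shape for the document-tracking flag is a manifest written alongside the clustering input, recording which messages went into each document. The sketch below is only an illustration under stated assumptions: the message fields, the `key:value` document IDs, the tab-separated layout, the sample records, and the cluster labels are all made up here, and none of it reflects the tools provided by John or the real database schema.

```python
import csv
import io
from collections import defaultdict

# Made-up message records; in practice these would come from the database.
messages = [
    {"id": 101, "author": "alice", "subject": "indexing"},
    {"id": 102, "author": "bob", "subject": "indexing"},
    {"id": 103, "author": "alice", "subject": "clusters"},
    {"id": 104, "author": "bob", "subject": "clusters"},
    {"id": 105, "author": "alice", "subject": "indexing"},
]


def build_manifest(messages, key):
    """Group messages into documents by `key`; return {doc_id: [message ids]}."""
    groups = defaultdict(list)
    for msg in messages:
        groups[msg[key]].append(msg["id"])
    # The doc_id records the grouping rule (hypothetical "key:value" format),
    # so a later import can tell how each document was assembled.
    return {f"{key}:{value}": sorted(ids) for value, ids in sorted(groups.items())}


def write_manifest(manifest, handle):
    """Write the manifest as tab-separated text, one document per row."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["doc_id", "message_ids"])
    for doc_id, ids in manifest.items():
        writer.writerow([doc_id, ",".join(str(i) for i in ids)])


def messages_per_cluster(manifest, doc_clusters):
    """Expand {doc_id: cluster label} into {cluster label: [message ids]}."""
    result = defaultdict(list)
    for doc_id, cluster in doc_clusters.items():
        result[cluster].extend(manifest[doc_id])
    return {cluster: sorted(ids) for cluster, ids in sorted(result.items())}


manifest = build_manifest(messages, "author")

buf = io.StringIO()  # stands in for a real file opened for writing
write_manifest(manifest, buf)
manifest_text = buf.getvalue()

# Hypothetical cluster labels, as they might come back from the clustering step.
doc_clusters = {"author:alice": 1, "author:bob": 2}
by_cluster = messages_per_cluster(manifest, doc_clusters)
```

With a manifest like this, whatever format the cluster output ends up in only needs to carry the document IDs; the import step can then resolve each cluster back to messages in the database.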