Weka’s SparseInstance format
Non-zero attributes explicitly stated, 0 values not stated:
{1: "the", 3: "small", 6: "boy", 9: "ate", 13: "the", 17: "small", 21: "pie"}
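The idea of storing only the stated entries can be sketched in a few lines. This is a minimal illustration in plain Python, not Weka's SparseInstance class; the helper names to_dense and to_sparse and the total attribute count of 22 are assumptions made for the example.

```python
# Sparse view: only the explicitly stated (index -> value) pairs are stored.
sparse = {1: "the", 3: "small", 6: "boy", 9: "ate",
          13: "the", 17: "small", 21: "pie"}

def to_dense(sparse, num_attributes, default=0):
    """Expand a sparse mapping to a full list; unstated slots get the default (0)."""
    return [sparse.get(i, default) for i in range(num_attributes)]

def to_sparse(dense, default=0):
    """Keep only the entries that differ from the default value."""
    return {i: v for i, v in enumerate(dense) if v != default}

# Assumption: the instance has 22 attributes (indices 0..21).
dense = to_dense(sparse, 22)
```

Only 7 of the 22 slots are stored in the sparse form, and converting back and forth loses nothing.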
Strings mapped to integer indices using a hashtable:
the 0
small 1
boy 2
ate 3
the 4
small 5
pie 6
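A hashtable-based string-to-index mapping might look like the sketch below. It is plain Python rather than Weka code, and build_index is a hypothetical helper. It assumes the table is keyed by the string itself, so a repeated string such as "the" reuses its first index, whereas the listing above numbers each occurrence separately.

```python
def build_index(tokens):
    """Assign each distinct string the next free integer index (dict = hashtable)."""
    index = {}
    for tok in tokens:
        if tok not in index:          # assumption: repeated strings reuse their index
            index[tok] = len(index)
    return index

tokens = ["the", "small", "boy", "ate", "the", "small", "pie"]
index = build_index(tokens)

# Replace every string by its integer index.
encoded = [index[t] for t in tokens]
```

Lookup by string is constant time on average, which is the reason for using a hashtable here.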
Use StringToWordVectorFilter to convert a text SparseInstance into a word vector (in Weka 3-2-2)
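What a text-to-word-vector conversion produces can be sketched as follows. This is a rough bag-of-words illustration in plain Python, not the Weka filter or its API. The function to_word_vector, the whitespace tokenization, the fixed vocabulary, and the use of counts instead of 0/1 presence flags are all assumptions.

```python
from collections import Counter

def to_word_vector(text, vocabulary):
    """Return a sparse {attribute index: count} vector; absent words are simply not stored."""
    counts = Counter(text.lower().split())   # assumption: split on whitespace
    return {vocabulary[w]: c for w, c in counts.items() if w in vocabulary}

# Hypothetical vocabulary: one attribute index per distinct word.
vocabulary = {"the": 0, "small": 1, "boy": 2, "ate": 3, "pie": 4}

vector = to_word_vector("the small boy ate the small pie", vocabulary)
```

Each word becomes one attribute, and because most documents use only a small part of the vocabulary, the result is naturally stored in sparse form.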