datato1ofm.m (Scripts) Publisher's description
from Amos Storkey
Takes a categorical data matrix and transforms the whole matrix into a binary sparse 1-of-M matrix, keeping track of which original attribute and value each new column came from. Suited to any form of count-based probabilistic analysis.
Typically used in a chain, after loadcell.m and celltonumeric.m.
datato1ofm - recast data in 1 of M format, maintaining multinomial info.
function [newdata, attrmap] = datato1ofm( data );
DATA is the complete dataset. It is presumed that all the possible states are represented in the dataset. If not, the data should be augmented with dummy data so that this is the case. Each column of DATA corresponds to a different attribute, and each row to a different data item. DATA must be numeric.
NEWDATA is a sparse real-binary 1 of M dataset. All attributes are one of M encoded, including previous binary attributes. The split of these previously binary attributes can be removed trivially: see below.
ATTRMAP gives the attribute mapping information. ATTRMAP(1,k) gives the original attribute number for the kth new attribute. ATTRMAP(2,k) gives the value of the original attribute indicated by the kth new attribute. ATTRMAP(3,k) gives M for the kth new attribute, that is, the number of states of the original attribute it was split from.
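To make the layout of NEWDATA and ATTRMAP concrete, here is a minimal sketch that builds both by hand for a tiny dataset. It is not the source of datato1ofm.m: the loop, the variable names other than newdata and attrmap, and the assumption that new columns are ordered by attribute and then by sorted value are illustrative only.

```matlab
% Illustrative sketch only; reproduces the documented output layout by hand.
% Column ordering (by attribute, then by sorted value) is an assumption.
data = [1 1;
        2 2;
        3 1;
        2 2];   % 4 items; attribute 1 has 3 states, attribute 2 has 2

[nitems, nattr] = size(data);
rows = []; cols = []; attrmap = zeros(3, 0);
k = 0;                                    % running index of new columns
for a = 1:nattr
    vals = unique(data(:, a));            % sorted distinct states of attribute a
    M = numel(vals);
    for v = 1:M
        k = k + 1;
        hit = find(data(:, a) == vals(v)); % items that take this state
        rows = [rows; hit];
        cols = [cols; k * ones(size(hit))];
        attrmap(:, k) = [a; vals(v); M];   % origin, value, number of states
    end
end
newdata = sparse(rows, cols, 1, nitems, k);

disp(full(newdata));
disp(attrmap);
```

Each row of newdata has exactly one nonzero entry per original attribute, and column k of attrmap records where new column k came from.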
To remove 1 of M encoding for previously binary attributes use:
ii = find(~(attrmap(2,:)==1 & attrmap(3,:)==2));
newdata = newdata(:,ii); attrmap = attrmap(:,ii);
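The effect of this filter can be checked on hand-written stand-ins for the two outputs. The matrices below are hypothetical values chosen for illustration, not output captured from datato1ofm.

```matlab
% Hand-written stand-ins for the outputs of datato1ofm (hypothetical values).
newdata = sparse([1 0 0 1 0;
                  0 1 0 0 1;
                  0 0 1 1 0;
                  0 1 0 0 1]);
attrmap = [1 1 1 2 2;    % original attribute number
           1 2 3 1 2;    % original value
           3 3 3 2 2];   % number of states M

% Keep every column except the value-1 column of two-state attributes.
ii = find(~(attrmap(2,:)==1 & attrmap(3,:)==2));
newdata = newdata(:,ii); attrmap = attrmap(:,ii);

disp(ii);
disp(full(newdata));
disp(attrmap);
```

Here column 4 (attribute 2, value 1) is dropped, so attribute 2 is again represented by a single indicator column. Note that the test matches on the value 1, so a two-state attribute whose states are coded with neither value equal to 1 would be left split.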
To compute multinomial probabilities (simply but inefficiently) use:
normmatrix = sparse([1:size(attrmap,2)],attrmap(1,:),1);
normmatrix = normmatrix*normmatrix';
probs = mean(newdata)./(mean(newdata)*normmatrix);
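A small worked example may help show what the normalisation does. The data below are hypothetical, and item 4 is assumed to have no recorded state for attribute 1; the description does not say whether datato1ofm ever produces such rows. The second computation, using accumarray, is an alternative suggested here that avoids the K-by-K matrix; it is not part of the original help text.

```matlab
% Hypothetical 1-of-M data; row 4 has no active column for attribute 1.
newdata = sparse([1 0 0 1 0;
                  0 1 0 0 1;
                  0 0 1 1 0;
                  0 0 0 0 1]);
attrmap = [1 1 1 2 2;
           1 2 3 1 2;
           3 3 3 2 2];

% Recipe from the help text: K-by-K matrix linking columns of the same attribute.
normmatrix = sparse(1:size(attrmap,2), attrmap(1,:), 1);
normmatrix = normmatrix*normmatrix';
probs = full(mean(newdata)./(mean(newdata)*normmatrix));

% Alternative (not from the source): per-attribute totals, then index back.
m = full(mean(newdata));
groupsum = accumarray(attrmap(1,:)', m');
probs2 = m ./ reshape(groupsum(attrmap(1,:)), 1, []);

disp(probs);
disp(max(abs(probs - probs2)));
```

The column means for attribute 1 are 0.25 each and sum to 0.75, so dividing by the group total rescales them to 1/3 each. For attribute 2 the means already sum to 1 and are left unchanged at 0.5.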
See loadcell, celltonumeric
System Requirements: MATLAB 7.6 (R2008a)
Program Release Status: New Release
Program Install Support: Install and Uninstall