Created on Jun 6, 2012
@author: thenghia.pham
Bases: composes.semantic_space.operation.Operation
This class implements the application and the projection of dimensionality reduction transformations.
Applies a dim. reduction operation.
The transformation matrix obtained in the reduction (specific to each reduction method) is stored in the operation object. This matrix is then used to project the dimensionality reduction onto a space peripheral to the one on which it was originally applied.
Projects a dim. reduction operation.
Uses the transformation matrix stored in the operation object to project the dimensionality reduction method on a new space, peripheral to the original one.
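The apply/project pattern described above can be sketched in plain Python. This is an illustration only, not the library's implementation: the names `apply_reduction`, `project_reduction` and `top_direction` are hypothetical, and a rank-1 reduction found by power iteration stands in for whichever reduction method is actually used. It assumes a non-zero core matrix.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def top_direction(matrix, iterations=50):
    """Power iteration on A^T A: returns a unit vector with one entry per column."""
    v = [1.0] * len(matrix[0])
    for _ in range(iterations):
        av = [dot(row, v) for row in matrix]          # A v
        w = [dot(col, av) for col in zip(*matrix)]    # A^T (A v)
        norm = math.sqrt(dot(w, w))
        v = [x / norm for x in w]
    return v

def project_reduction(matrix, transformation):
    """Reuse a stored transformation on any matrix with the same columns."""
    return [[dot(row, transformation)] for row in matrix]

def apply_reduction(core_matrix):
    """Reduce the core matrix and return the transformation to be stored."""
    transformation = top_direction(core_matrix)
    return project_reduction(core_matrix, transformation), transformation

core = [[1.0, 2.0], [2.0, 4.0]]
peripheral = [[3.0, 6.0]]

# Apply on the core, keep the transformation, then project the peripheral rows.
reduced_core, stored = apply_reduction(core)
reduced_peripheral = project_reduction(peripheral, stored)
```

The point of the design is that the peripheral rows never influence the transformation: they are only mapped through the one computed on the core.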
Bases: composes.semantic_space.operation.Operation
This class implements the application and the projection of feature selection transformations.
Applies a feature selection operation.
The columns selected are stored in the operation object. These are further used for projecting the feature selection method on a space peripheral to the original space on which it has been applied.
List of strings, the id2column of the space before applying the feature selection.
Projects a feature selection operation.
Uses the information on selected columns stored in the operation object to project the feature selection method on a new space, peripheral to the original one.
List of integers, indices of the columns selected.
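A minimal sketch of how the stored column indices might be reused, under stated assumptions: `apply_selection` and `project_selection` are hypothetical names, and "keep the k columns with the largest sums" is only a stand-in selection criterion.

```python
def column_sums(matrix):
    return [sum(col) for col in zip(*matrix)]

def project_selection(matrix, selected_columns):
    """Keep only the previously selected column indices."""
    return [[row[j] for j in selected_columns] for row in matrix]

def apply_selection(matrix, id2column, k):
    """Select the k columns with the largest sums (hypothetical criterion).

    Returns the reduced matrix, the selected indices, and a copy of the
    column list as it was before the selection.
    """
    sums = column_sums(matrix)
    ranked = sorted(range(len(sums)), key=lambda j: sums[j], reverse=True)
    selected_columns = sorted(ranked[:k])
    original_columns = list(id2column)
    return project_selection(matrix, selected_columns), selected_columns, original_columns

id2column = ["red", "blue", "fast"]
core = [[1, 0, 5], [2, 1, 7]]
peripheral = [[4, 9, 6]]

reduced_core, selected, original = apply_selection(core, id2column, k=2)
reduced_peripheral = project_selection(peripheral, selected)
new_id2column = [original[j] for j in selected]
```

Note that the peripheral value 9 in the dropped column is ignored: the selection is decided on the core alone.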
Bases: object
This class implements both the application and the projection of a transformation on a semantic space.
An operation object can be used to apply or to project a specific transformation on a semantic space. After a transformation is applied, for example on a core space, the operation class stores the information required to further project this same operation onto a space peripheral to the core space.
Bases: composes.semantic_space.operation.Operation
This class implements the application and the projection of scaling transformations.
Applies a scaling operation.
The column statistics computed by the scaling transformation, if any, are stored in the current operation object. For example, PPMI scaling needs column sums in order to be projected on peripheral spaces, while PLOG scaling does not require this.
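To illustrate why PPMI needs the stored column sums, here is a small sketch using the textbook PPMI formula, max(0, log(x_ij * N / (row_i * col_j))). The function name `ppmi` and the exact way the marginals are combined are assumptions, not a copy of the library code.

```python
import math

def ppmi(matrix, column_sums=None):
    """PPMI weighting; column_sums defaults to the matrix's own column sums.

    Returns the weighted matrix and the column sums that were used, so the
    caller can store them for later projection.
    """
    if column_sums is None:
        column_sums = [sum(col) for col in zip(*matrix)]
    total = sum(column_sums)
    weighted = []
    for row in matrix:
        row_sum = sum(row)
        new_row = []
        for value, col_sum in zip(row, column_sums):
            if value > 0 and row_sum > 0 and col_sum > 0:
                pmi = math.log(value * total / (row_sum * col_sum))
                new_row.append(max(pmi, 0.0))   # negative PMI is clipped to 0
            else:
                new_row.append(0.0)
        weighted.append(new_row)
    return weighted, column_sums

core = [[4.0, 0.0], [1.0, 3.0]]
peripheral = [[2.0, 2.0]]

core_ppmi, stored_sums = ppmi(core)                   # apply on the core
projected, _ = ppmi(peripheral, stored_sums)          # project with core sums
standalone, _ = ppmi(peripheral)                      # what NOT reusing them gives
```

With the core's column sums the second peripheral value gets a positive weight, whereas weighting the peripheral rows on their own statistics flattens both values to zero.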
Created on Sep 26, 2012
@author: georgianadinu
Bases: composes.semantic_space.space.Space
Adds rows to a peripheral space.
Modifies the current space by appending the new rows. All operations of the core space are projected to the new rows.
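The replay of core operations on newly added rows could look roughly like the toy below. All class names (`ColumnMaxScaling`, `ToyCoreSpace`, `ToyPeripheralSpace`) are hypothetical, and the scaling operation is an invented example chosen only because it needs a statistic from the core.

```python
class ColumnMaxScaling:
    """Hypothetical operation: divide each column by its maximum in the core."""

    def __init__(self):
        self.column_max = None

    def apply(self, matrix):
        # Compute and store the statistic, then transform the core itself.
        self.column_max = [max(col) for col in zip(*matrix)]
        return self.project(matrix)

    def project(self, matrix):
        # Reuse the stored statistic on any matrix with the same columns.
        return [[v / m if m else 0.0 for v, m in zip(row, self.column_max)]
                for row in matrix]


class ToyCoreSpace:
    def __init__(self, matrix, id2row):
        self.matrix, self.id2row, self.operations = matrix, list(id2row), []

    def apply(self, operation):
        self.matrix = operation.apply(self.matrix)
        self.operations.append(operation)


class ToyPeripheralSpace:
    def __init__(self, core, matrix, id2row):
        self.operations = core.operations
        self.matrix, self.id2row = [], []
        self.add_rows(matrix, id2row)

    def add_rows(self, matrix, id2row):
        # Project every core operation, in order, onto the new rows only.
        for operation in self.operations:
            matrix = operation.project(matrix)
        self.matrix.extend(matrix)
        self.id2row.extend(id2row)


core = ToyCoreSpace([[2.0, 4.0], [1.0, 8.0]], ["car", "bus"])
core.apply(ColumnMaxScaling())
peripheral = ToyPeripheralSpace(core, [[4.0, 4.0]], ["big car"])
peripheral.add_rows([[1.0, 2.0]], ["tiny bus"])
```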
Reads in data files and extracts the data to construct a semantic space.
If the data is read in dense format and no columns are provided, the column indexing structures are set to empty.
data: file containing the counts.
format: format of the input data file, one of sm/dm.
rows: file containing the row elements. Optional; if not provided, the rows are extracted from the data file.
cols: file containing the column elements.
Created on Sep 21, 2012
@author: georgianadinu
Bases: object
This class implements semantic spaces.
A semantic space describes a list of targets (words, phrases, etc.) in terms of co-occurrence with contextual features.
It contains a matrix storing (some type of) co-occurrence strength values between targets and contextual features: by convention, targets are rows and features are columns. The space also stores structures that encode the mappings between the matrix row/column indices and the associated target/context-feature strings.
Transformations which rescale the matrix elements can be applied to a semantic space. A semantic space also allows for similarity computations between row elements of the space.
Applies a transformation on the current space.
All transformations affect the data matrix. If the transformation reduces the dimensionality of the space, the column indexing structures are also updated. The operation applied is appended to the list of operations that the space holds.
Asserts that the elements of the space are one dimensional.
Reads in data files and extracts the data to construct a semantic space.
If the data is read in dense format and no columns are provided, the column indexing structures are set to empty.
data: file containing the counts.
format: format of the input data file, one of sm/dm.
rows: file containing the row elements. Optional; if not provided, the rows are extracted from the data file.
cols: file containing the column elements.
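As a rough picture of what building from sparse data involves, the sketch below turns in-memory lines into a matrix plus the row/column index structures. It assumes each sparse line is a whitespace-separated `row column value` triple, which is an assumption about the sm format, and `build_from_triples` is a hypothetical helper rather than the library's reader.

```python
def build_from_triples(lines, rows=None, cols=None):
    """Build a dense nested-list matrix and index structures from triples.

    rows/cols are optional lists of strings; when omitted they are collected
    from the data in order of first appearance.
    """
    triples = [line.split() for line in lines if line.strip()]
    id2row = list(rows) if rows is not None else list(dict.fromkeys(t[0] for t in triples))
    id2column = list(cols) if cols is not None else list(dict.fromkeys(t[1] for t in triples))
    row2id = {word: i for i, word in enumerate(id2row)}
    column2id = {word: j for j, word in enumerate(id2column)}

    matrix = [[0.0] * len(id2column) for _ in id2row]
    for row, col, value in triples:
        # Entries whose row or column is not listed are skipped.
        if row in row2id and col in column2id:
            matrix[row2id[row]][column2id[col]] += float(value)
    return matrix, id2row, id2column, row2id, column2id

lines = ["car red 5", "car fast 2", "book red 1"]
matrix, id2row, id2column, row2id, column2id = build_from_triples(lines)
```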
Dictionary, maps column strings to integer ids.
Co-occurrence matrix associated to the semantic space, of type Matrix.
Shape of row elements, of type tuple. By default, in standard spaces, element_shape=(no_cols,).
Used in composition models which build word representations which are matrices or higher order tensors, instead of simple vectors. If the representation of a word is a matrix of shape (2,2) for example, then element_shape=(2,2). The actual space matrix stores each element as a linearized vector, just as in standard spaces.
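The relation between a linearized row and its element_shape can be shown with a small helper. `reshape_row` is a hypothetical function, and row-major ordering of the linearized vector is an assumption made for the example.

```python
def reshape_row(row, element_shape):
    """Rebuild a nested-list tensor from a linearized row (row-major order)."""
    size = 1
    for dim in element_shape:
        size *= dim
    if size != len(row) or size == 0:
        raise ValueError("row length does not match element_shape")
    if len(element_shape) == 1:
        return list(row)
    step = len(row) // element_shape[0]
    return [reshape_row(row[i * step:(i + 1) * step], element_shape[1:])
            for i in range(element_shape[0])]

stored_row = [1, 2, 3, 4]                     # how the space matrix holds it
as_matrix = reshape_row(stored_row, (2, 2))   # the word's (2, 2) representation
as_vector = reshape_row(stored_row, (4,))     # a standard one-dimensional element
```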
Exports the current space to disk. If the space has no column information, it cannot be exported in sparse format (sm).
Computes the neighbours of a word in the semantic space.
word: string, target word.
no_neighbours: int, the number of neighbours desired.
similarity: of type Similarity, the similarity measure to be used.
space2: of type Space, optional. If provided, the neighbours are retrieved from this space rather than the current space. By default, neighbours are retrieved from the current space.
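A neighbour query of this kind amounts to scoring every row against the target row and keeping the best ones. The sketch below uses cosine as the similarity measure; `nearest_neighbours` is a hypothetical name, and keeping the query word itself in the result is a choice made for this example.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest_neighbours(word, no_neighbours, matrix, row2id, id2row):
    """Return the no_neighbours most similar (word, score) pairs, best first."""
    target = matrix[row2id[word]]
    scored = [(other, cosine(target, matrix[i])) for i, other in enumerate(id2row)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:no_neighbours]

id2row = ["car", "truck", "book"]
row2id = {w: i for i, w in enumerate(id2row)}
matrix = [[4.0, 1.0], [3.0, 1.0], [0.0, 5.0]]

neighbours = nearest_neighbours("car", 2, matrix, row2id, id2row)
```

To search a second space instead, the same function would be called with the target row taken from the current space and the candidate rows taken from the other one.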
Returns the row vector of a word.
Returns: Matrix type (of shape (1, no_cols)), the row of the word argument.
Returns the sub-matrix corresponding to a list of words.
Computes the similarity between two targets in the semantic space.
If either of the two targets is not found, the method returns 0.0.
word1: string.
word2: string.
similarity: of type Similarity, the similarity measure to be used.
space2: of type Space, optional. If provided, word2 is interpreted in this space rather than the current space. By default, both words are interpreted in the current space.
Computes the similarity between each pair of targets in a list of target pairs.
If either target of a pair is not found, the similarity for that pair is 0.0.
word_pair_list: list of (string, string) tuples, the words to be compared.
similarity: of type Similarity, the similarity measure to be used.
space2: of type Space, optional. If provided, the second word of each pair is interpreted in this space rather than the current space. By default, both words are interpreted in the current space.
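The pairwise behaviour, including the fallback for missing words and the optional second space, can be sketched as follows. Spaces are modelled as plain word-to-vector dictionaries and `pair_similarities` is a hypothetical name; cosine again stands in for the Similarity object.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def pair_similarities(word_pair_list, space, space2=None):
    """Similarity for each (word1, word2) pair; 0.0 when a word is missing.

    word1 is always looked up in `space`; word2 is looked up in `space2`
    when it is given, otherwise in `space`.
    """
    second = space if space2 is None else space2
    sims = []
    for word1, word2 in word_pair_list:
        if word1 in space and word2 in second:
            sims.append(cosine(space[word1], second[word2]))
        else:
            sims.append(0.0)
    return sims

space = {"car": [1.0, 0.0], "bus": [1.0, 0.0]}
other_space = {"bus": [0.0, 1.0]}
pairs = [("car", "bus"), ("car", "plane")]

same_space = pair_similarities(pairs, space)
cross_space = pair_similarities(pairs, space, other_space)
```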
List of strings, the column elements.
List of strings, the row elements.
List of operations which have been applied on the semantic space. List of Operation type objects.
The operations, together with their associated side information, are stored because they may need to be projected on peripheral data.
Dictionary, maps row strings to integer ids.
Converts the matrix of the current space to DenseMatrix.
Converts the matrix of the current space to SparseMatrix.
Classmethod. Stacks two semantic spaces.
The rows in the two spaces are concatenated.
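Row-wise stacking can be pictured with the toy below, where a space is a dictionary holding a nested-list matrix and its index lists. `vstack_sketch` is a hypothetical helper, and the requirement that both spaces share identical columns is an assumption of the sketch.

```python
def vstack_sketch(space1, space2):
    """Concatenate the rows of two toy spaces that share the same columns."""
    if space1["id2column"] != space2["id2column"]:
        raise ValueError("spaces must share the same columns")
    id2row = space1["id2row"] + space2["id2row"]
    return {
        "matrix": space1["matrix"] + space2["matrix"],
        "id2row": id2row,
        "row2id": {word: i for i, word in enumerate(id2row)},
        "id2column": list(space1["id2column"]),
    }

first = {"matrix": [[1, 2]], "id2row": ["car"], "id2column": ["red", "fast"]}
second = {"matrix": [[3, 4], [5, 6]], "id2row": ["bus", "book"],
          "id2column": ["red", "fast"]}

stacked = vstack_sketch(first, second)
```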