- class composes.semantic_space.space.Space(matrix_, id2row, id2column, row2id=None, column2id=None, **kwargs)
This class implements semantic spaces.
A semantic space describes a list of targets (words, phrases, etc.) in terms of co-occurrence with contextual features.
It contains a matrix storing (some type of) co-occurrence strength values between targets and contextual features: by convention, targets are rows and features are columns. The space also stores structures that encode the mappings between the matrix row/column indices and the associated target/context-feature strings.
Transformations which rescale the matrix elements can be applied to a semantic space. A semantic space also allows for similarity computations between row elements of the space.
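The layout described above can be pictured with plain Python containers. This is only a minimal sketch: the words, features and counts are made-up toy data, the helper cooccurrence is hypothetical, and the actual class may store its matrix and mappings differently.

```python
# Hypothetical toy data, not taken from the library.
id2row = ["car", "book"]              # row index -> target string
id2column = ["red", "blue", "read"]   # column index -> feature string
matrix = [
    [5.0, 1.0, 0.0],  # co-occurrence of "car" with each feature
    [0.0, 2.0, 7.0],  # co-occurrence of "book" with each feature
]

# Reverse mappings (string -> index), derived from the lists above.
row2id = {word: i for i, word in enumerate(id2row)}
column2id = {feat: j for j, feat in enumerate(id2column)}


def cooccurrence(target, feature):
    """Look up one matrix cell by its target and feature strings."""
    return matrix[row2id[target]][column2id[feature]]


value = cooccurrence("book", "read")
```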
- apply(transformation)
Applies a transformation on the current space.
All transformations affect the data matrix. If the transformation reduces the dimensionality of the space, the column indexing structures are also updated. The operation applied is appended to the list of operations that the space holds.
- Args:
- transformation: of type Scaling, DimensionalityReduction or FeatureSelection
- Returns:
- A new space on which the transformation has been applied.
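The behaviour described for apply (a new space is returned and the operation is recorded) might look roughly like the sketch below. ToySpace and row_normalize are hypothetical stand-ins written for illustration; they are not the library's Space or Scaling classes, and the real transformation objects are not plain functions.

```python
import math


class ToySpace:
    """Hypothetical stand-in for a semantic space; not the library class."""

    def __init__(self, matrix, id2row, id2column, operations=None):
        self.matrix = matrix
        self.id2row = id2row
        self.id2column = id2column
        self.operations = list(operations or [])

    def apply(self, transformation):
        # Build a new space; the original space is left untouched.
        new_matrix = transformation(self.matrix)
        new_ops = self.operations + [transformation.__name__]
        return ToySpace(new_matrix, self.id2row, self.id2column, new_ops)


def row_normalize(matrix):
    """Scale each row to unit Euclidean length (zero rows stay zero)."""
    result = []
    for row in matrix:
        norm = math.sqrt(sum(v * v for v in row))
        result.append([v / norm if norm else 0.0 for v in row])
    return result


space = ToySpace([[3.0, 4.0], [0.0, 0.0]], ["car", "book"], ["red", "blue"])
scaled = space.apply(row_normalize)
```

In this sketch, scaled.operations records the name of the applied function, while space keeps its original matrix and an empty operation list.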
- classmethod build(**kwargs)
Reads in data files and extracts the data to construct a semantic space.
If the data is read in dense format and no columns are provided, the column indexing structures are set to empty.
- Args:
- data: file containing the counts
- format: format of the input data file: one of sm/dm
- rows: file containing the row elements. Optional; if not provided, extracted from the data file.
- cols: file containing the column elements
- Returns:
- A semantic space built from the input data files.
- Raises:
- ValueError: if one of the data/format arguments is missing; if cols is missing and format is “sm”; if the input columns provided are not consistent with the shape of the matrix (for “dm” format)
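As an illustration of building from sparse input, the sketch below parses in-memory lines rather than files. It assumes that each sparse line holds one whitespace-separated "row column value" triple; that layout, the function name build_from_sparse and its arguments are assumptions of this example, not the documented file format or API. Rows are collected from the data, and a missing column list raises ValueError, mirroring the rules listed above.

```python
def build_from_sparse(lines, cols=None):
    """Toy builder; assumes one 'row column value' triple per line."""
    if cols is None:
        raise ValueError("column elements are required for sparse input")
    column2id = {c: j for j, c in enumerate(cols)}
    id2row, row2id, matrix = [], {}, []
    for line in lines:
        row, col, value = line.split()
        if row not in row2id:
            # First time this target is seen: add a new all-zero row.
            row2id[row] = len(id2row)
            id2row.append(row)
            matrix.append([0.0] * len(cols))
        matrix[row2id[row]][column2id[col]] += float(value)
    return matrix, id2row, list(cols)


matrix, id2row, id2column = build_from_sparse(
    ["car red 5", "book read 7", "car blue 1"],
    cols=["red", "blue", "read"],
)
```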
- export(file_prefix, **kwargs)
Exports the current space to disk. If the space has no column information, it cannot be exported in sparse format (sm).
- Args:
- file_prefix: string, prefix of the files to be exported
- format: string, one of dm/sm
- Prints:
- matrix in file_prefix.<format>
- row elements in file_prefix.<row>
- col elements in file_prefix.<col>
- Raises:
- ValueError: if the space has no column info and “sm” exporting is attempted
- NotImplementedError: if the space matrix is dense and “sm” exporting is attempted
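The column-information check can be sketched without touching the disk. The function export_lines below is hypothetical and returns the text lines instead of writing files; the line layouts ("row column value" for sm, "row v1 v2 ..." for dm) are assumptions of this example, and the dense-matrix NotImplementedError case is not modelled.

```python
def export_lines(matrix, id2row, id2column, fmt):
    """Toy exporter: return matrix lines in an assumed sm or dm layout."""
    if fmt == "sm":
        if not id2column:
            # Sparse lines need a column name for every non-zero cell.
            raise ValueError("cannot export in sm format without column info")
        return [
            "%s %s %s" % (id2row[i], id2column[j], value)
            for i, row in enumerate(matrix)
            for j, value in enumerate(row)
            if value != 0
        ]
    if fmt == "dm":
        return [
            " ".join([id2row[i]] + [str(v) for v in row])
            for i, row in enumerate(matrix)
        ]
    raise ValueError("format must be one of dm/sm")


sparse = export_lines([[5.0, 0.0]], ["car"], ["red", "blue"], "sm")
dense = export_lines([[5.0, 0.0]], ["car"], [], "dm")
```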
- get_sim(word1, word2, similarity, space2=None)
Computes the similarity between two targets in the semantic space.
If either of the two targets to be compared is not found, it returns 0.
- Args:
- word1: string
- word2: string
- similarity: of type Similarity, the similarity measure to be used
- space2: of type Space, optional. If provided, word2 is interpreted in this space rather than the current space. By default, both words are interpreted in the current space.
- Returns:
- scalar, similarity score
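A rough picture of the lookup logic, using plain dictionaries of vectors instead of Space objects. The functions cosine and get_sim below are illustrative stand-ins, not the library's Similarity classes or method; cosine is just one possible measure chosen for the example.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def get_sim(vectors, word1, word2, similarity, vectors2=None):
    """Toy version: word2 is looked up in vectors2 when it is given."""
    other = vectors if vectors2 is None else vectors2
    if word1 not in vectors or word2 not in other:
        return 0.0  # a missing target yields 0 rather than an error
    return similarity(vectors[word1], other[word2])


vectors = {"car": [1.0, 0.0], "book": [0.0, 1.0]}
same = get_sim(vectors, "car", "car", cosine)
missing = get_sim(vectors, "car", "train", cosine)
```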
- get_neighbours(word, no_neighbours, similarity, space2=None)
Computes the neighbours of a word in the semantic space.
- Args:
- word: string, target word
- no_neighbours: int, the number of neighbours desired
- similarity: of type Similarity, the similarity measure to be used
- space2: of type Space, optional. If provided, the neighbours are retrieved from this space rather than the current space. By default, neighbours are retrieved from the current space.
- Returns:
- list of (neighbour_string, similarity_value) tuples.
- Raises:
- KeyError: if the word is not found in the semantic space.
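Neighbour retrieval amounts to scoring every candidate row against the query word and keeping the best few. The sketch below uses dictionaries of vectors and a plain dot product as the similarity; both function names are hypothetical, and the sketch does not filter the query word out of its own neighbour list, which is a choice of this example only.

```python
def dot(u, v):
    """Dot product, used here as a simple stand-in similarity measure."""
    return sum(a * b for a, b in zip(u, v))


def get_neighbours(vectors, word, no_neighbours, similarity, vectors2=None):
    """Toy version: return the top-scoring (string, score) pairs."""
    if word not in vectors:
        raise KeyError(word)
    candidates = vectors if vectors2 is None else vectors2
    scored = [
        (other, similarity(vectors[word], vec))
        for other, vec in candidates.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:no_neighbours]


vectors = {"car": [1.0, 0.0], "bus": [0.9, 0.1], "book": [0.0, 1.0]}
neighbours = get_neighbours(vectors, "car", 2, dot)
```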
- classmethod vstack(space1, space2)
Classmethod. Stacks two semantic spaces.
The rows in the two spaces are concatenated.
- Args:
- space1, space2: spaces to be stacked, of type Space
- Returns:
- Stacked space, type Space.
- Raises:
- ValueError: if the spaces have different numbers of columns or their columns are not identical
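Stacking can be sketched with each space represented as a hypothetical (matrix, id2row, id2column) tuple; the tuple representation and the function below are assumptions of this example, not the library's implementation.

```python
def vstack(space1, space2):
    """Toy version: concatenate the rows of two spaces with equal columns."""
    matrix1, rows1, cols1 = space1
    matrix2, rows2, cols2 = space2
    if cols1 != cols2:
        # Covers both a different column count and differing column names.
        raise ValueError("spaces must have identical columns")
    return matrix1 + matrix2, rows1 + rows2, list(cols1)


top = ([[5.0, 1.0]], ["car"], ["red", "blue"])
bottom = ([[0.0, 2.0]], ["book"], ["red", "blue"])
stacked_matrix, stacked_rows, stacked_cols = vstack(top, bottom)
```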