Compositionality allows the construction of complex meanings from combinations of simple elements. In distributional semantics, the composition of two constituents can be expressed in terms of a function acting on those constituents. Different models use different composition functions. See the list of available composition models, and the introduction for references to the papers that introduce the models we implement.
You can jump to the command-line usage, but we strongly recommend that you read through the Python examples first, to get a better sense of how DISSECT handles composition.
We start by illustrating composition with an additive model where the alpha and beta parameters are both set to 1 (what in the literature is known as the “simple additive model”) instead of being estimated in a training step:
```python
#ex10.py
#-------

from composes.utils import io_utils
from composes.composition.weighted_additive import WeightedAdditive

#load a space
my_space = io_utils.load("./data/out/ex10.pkl")

print my_space.id2row
print my_space.cooccurrence_matrix

# instantiate a weighted additive model
my_comp = WeightedAdditive(alpha = 1, beta = 1)

# use the model to compose words in my_space
composed_space = my_comp.compose([("good", "book", "good_book"),
                                  ("good", "car", "good_car")],
                                 my_space)
print composed_space.id2row
print composed_space.cooccurrence_matrix

#save the composed space
io_utils.save(composed_space, "data/out/PHRASE_SS.ex10.pkl")
```
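The weighted additive model computes the composed vector as p = alpha*u + beta*v, so the run above (alpha = beta = 1) simply sums each pair of argument vectors. A minimal pure-Python sketch of the operation (illustrative only; DISSECT applies it to whole co-occurrence matrices):

```python
def weighted_additive(u, v, alpha=1.0, beta=1.0):
    """Compose two argument vectors as alpha*u + beta*v."""
    return [alpha * ui + beta * vi for ui, vi in zip(u, v)]

# toy 2-dimensional vectors (made-up numbers)
good = [2.0, 4.0]
book = [3.0, 1.0]
good_book = weighted_additive(good, book)  # [5.0, 5.0]
```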
Note that the space of composed phrases is just like any other space, and can be used, for example, for similarity computations.
Saving a model object and printing its parameters:
```python
#ex11.py
#-------

from composes.utils import io_utils
from composes.composition.weighted_additive import WeightedAdditive

# instantiate a weighted additive model
my_comp = WeightedAdditive(alpha = 1, beta = 1)

#save it to pickle
io_utils.save(my_comp, "./data/out/model01.pkl")

#print its parameters
my_comp.export("./data/out/model01.params")
```
A composition model can be used to combine vectors that live in different (but compatible) spaces:
```python
#ex12.py
#-------

from composes.utils import io_utils

#load a previously saved weighted additive model
my_comp = io_utils.load("./data/out/model01.pkl")

#print its parameters
print "alpha:", my_comp.alpha
print "beta:", my_comp.beta

#load two spaces
my_space = io_utils.load("./data/out/ex10.pkl")
my_per_space = io_utils.load("./data/out/PER_SS.ex05.pkl")

#apply the composition model to them
composed_space = my_comp.compose([("good", "history_book", "good_history_book")],
                                 (my_space, my_per_space))
print composed_space.id2row
print composed_space.cooccurrence_matrix
```
All composition models that have parameters (currently, all available composition models with the exception of Multiplicative) can be trained using examples of argument words and the corresponding output phrase vectors (see the introduction and the references there for an explanation of this idea). We train all models by minimizing the Euclidean norm of the difference between the composed phrase vectors as generated by the models and the corresponding phrase vectors passed as training data.
Training a Weighted Additive model:
```python
#ex13.py
#-------

from composes.utils import io_utils
from composes.composition.weighted_additive import WeightedAdditive

#training data
train_data = [("good", "car", "good_car"),
              ("good", "book", "good_book")
              ]

#load an argument space
arg_space = io_utils.load("./data/out/ex10.pkl")
print arg_space.id2row
print arg_space.cooccurrence_matrix

#load a phrase space
phrase_space = io_utils.load("data/out/PHRASE_SS.ex10.pkl")
print phrase_space.id2row
print phrase_space.cooccurrence_matrix

#train a weighted additive model on the data
my_comp = WeightedAdditive()
my_comp.train(train_data, arg_space, phrase_space)

#print its parameters
print "alpha:", my_comp.alpha
print "beta:", my_comp.beta
```
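Since the composed vector is alpha*u + beta*v, training the weighted additive model reduces to a two-parameter least-squares problem whose normal equations can be solved in closed form. A hedged pure-Python sketch of that estimation on toy data (not DISSECT's actual solver):

```python
def train_weighted_additive(triples):
    """Estimate alpha, beta minimizing sum ||p - alpha*u - beta*v||^2.

    triples: list of (u, v, p) vector triples (plain Python lists).
    Solves the 2x2 normal equations directly.
    """
    uu = uv = vv = up = vp = 0.0
    for u, v, p in triples:
        for ui, vi, pi in zip(u, v, p):
            uu += ui * ui
            uv += ui * vi
            vv += vi * vi
            up += ui * pi
            vp += vi * pi
    det = uu * vv - uv * uv
    alpha = (vv * up - uv * vp) / det
    beta = (uu * vp - uv * up) / det
    return alpha, beta

# toy phrases built as exactly 1.0*u + 1.0*v, so training recovers
# alpha = beta = 1 (made-up vectors)
triples = [([2.0, 4.0], [3.0, 1.0], [5.0, 5.0]),
           ([2.0, 4.0], [1.0, 2.0], [3.0, 6.0])]
alpha, beta = train_weighted_additive(triples)
```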
Training a Dilation model:
```python
#ex14.py
#-------

from composes.utils import io_utils
from composes.composition.dilation import Dilation

#training data
train_data = [("good", "car", "good_car"),
              ("good", "book", "good_book")
              ]

#load an argument space
arg_space = io_utils.load("./data/out/ex10.pkl")

#load a phrase space
phrase_space = io_utils.load("data/out/PHRASE_SS.ex10.pkl")
print "Training phrase space"
print phrase_space.id2row
print phrase_space.cooccurrence_matrix

#train a Dilation model on the data
my_comp = Dilation()
my_comp.train(train_data, arg_space, phrase_space)

#print its parameters
print "\nlambda:", my_comp._lambda

#use the model to compose a new phrase
composed_space = my_comp.compose([("good", "bike", "good_bike")], arg_space)
print "\nComposed space:"
print composed_space.id2row
print composed_space.cooccurrence_matrix
```
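Dilation, in the formulation of Mitchell and Lapata, stretches one argument vector along the direction of the other: p = (u·u)v + (λ−1)(u·v)u, with λ the single trained parameter. An illustrative pure-Python sketch, assuming this standard formulation:

```python
def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def dilate(u, v, lmbda):
    """Stretch v along the direction of u: (u.u)*v + (lambda-1)*(u.v)*u."""
    uu = dot(u, u)
    uv = dot(u, v)
    return [uu * vi + (lmbda - 1.0) * uv * ui for ui, vi in zip(u, v)]

# with lambda = 1 the result is just v scaled by u's squared length;
# larger lambda pulls the result toward u's direction (toy vectors)
p1 = dilate([1.0, 0.0], [2.0, 3.0], 1.0)  # [2.0, 3.0]
p2 = dilate([1.0, 0.0], [2.0, 3.0], 2.0)  # [4.0, 3.0]
```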
Training a Full Additive model:
```python
#ex15.py
#-------

from composes.utils import io_utils
from composes.composition.full_additive import FullAdditive

#training data
train_data = [("good", "car", "good_car"),
              ("good", "book", "good_book")
              ]

#load an argument space
arg_space = io_utils.load("./data/out/ex10.pkl")

#load a phrase space
phrase_space = io_utils.load("data/out/PHRASE_SS.ex10.pkl")
print "Training phrase space"
print phrase_space.id2row
print phrase_space.cooccurrence_matrix

#train a FullAdditive model on the data
my_comp = FullAdditive()
my_comp.train(train_data, arg_space, phrase_space)

#print its parameters
print "\nA:", my_comp._mat_a_t.transpose()
print "B:", my_comp._mat_b_t.transpose()

#use the model to compose a new phrase
composed_space = my_comp.compose([("good", "bike", "good_bike")], arg_space)
print "\nComposed space:"
print composed_space.id2row
print composed_space.cooccurrence_matrix
```
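The Full Additive model composes with one weight matrix per syntactic slot, p = A·u + B·v, which is what the A and B printed above represent. A small pure-Python sketch (illustrative only; real spaces are high-dimensional):

```python
def mat_vec(m, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]

def full_additive(A, B, u, v):
    """Compose as A*u + B*v, with one matrix per syntactic slot."""
    au = mat_vec(A, u)
    bv = mat_vec(B, v)
    return [a + b for a, b in zip(au, bv)]

# with A = B = identity, Full Additive reduces to simple addition
identity = [[1.0, 0.0],
            [0.0, 1.0]]
p = full_additive(identity, identity, [2.0, 4.0], [3.0, 1.0])  # [5.0, 5.0]
```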
The Lexical Function composition model differs from the other models in that its parameters are weight tensors, one for each functor word being trained (see this paper for a detailed explanation). These weight tensors are stored as vectors in a semantic space; thus, for DISSECT, the “parameter” of a Lexical Function model is itself a semantic space. For example, a Lexical Function model trained on adjective-noun data will learn an adjective semantic space containing the tensors (matrices, in this case) representing each adjective. This space can in turn be used like any regular semantic space, for example to measure the similarity of the learned tensors.
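Applying a Lexical Function model is thus a matrix-by-vector product: the functor's tensor (here, a matrix) maps the argument vector to the phrase vector. An illustrative sketch with a made-up 2x2 matrix standing in for a learned "good" function:

```python
def mat_vec(m, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]

# hypothetical 2x2 tensor for "good"; in practice this is
# estimated by regression from (noun, adjective-noun) pairs
good_matrix = [[1.0, 0.5],
               [0.0, 2.0]]
book = [2.0, 4.0]
good_book = mat_vec(good_matrix, book)  # [4.0, 8.0]
```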
Training a Lexical Function model (see the similarity page for the similarity function called at the end of the example):
```python
#ex16.py
#-------

from composes.utils import io_utils
from composes.composition.lexical_function import LexicalFunction
from composes.similarity.cos import CosSimilarity

#training data
#trying to learn a "good" function
train_data = [("good_function", "car", "good_car"),
              ("good_function", "book", "good_book")
              ]

#load argument and phrase space
arg_space = io_utils.load("./data/out/ex10.pkl")
phrase_space = io_utils.load("data/out/PHRASE_SS.ex10.pkl")

#train a lexical function model on the data
my_comp = LexicalFunction()
my_comp.train(train_data, arg_space, phrase_space)

#print its parameters
print "\nLexical function space:"
print my_comp.function_space.id2row
cooc_mat = my_comp.function_space.cooccurrence_matrix
cooc_mat.reshape(my_comp.function_space.element_shape)
print cooc_mat

#similarity within the learned functional space
print "\nSimilarity between good and good in the function space:"
print my_comp.function_space.get_sim("good_function", "good_function",
                                     CosSimilarity())
```
Training of Full Additive and Lexical Function models solves the problem of finding a matrix X that minimizes \(||AX - B||_2\) for matrices A and B built from the training data. The currently available training methods are Least Squares Regression and Ridge Regression. Both can be used with or without an intercept. Ridge Regression requires either cross-validation to learn the lambda parameter or an explicitly specified lambda value (see regression methods). If not otherwise specified, the default regression method is Ridge Regression with lambda = 1.
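In the one-column case, the ridge objective \(||ax - b||^2 + \lambda x^2\) has the familiar closed-form solution x = (aᵀa + λ)⁻¹aᵀb, and λ = 0 recovers ordinary least squares; this is the shrinkage the lambda parameter controls. A one-dimensional toy sketch (assumed textbook formula, not DISSECT's solver):

```python
def ridge_1d(a, b, lmbda):
    """Closed-form ridge solution x = (a.a + lambda)^-1 * (a.b)."""
    aa = sum(ai * ai for ai in a)
    ab = sum(ai * bi for ai, bi in zip(a, b))
    return ab / (aa + lmbda)

a = [1.0, 2.0]
b = [2.0, 4.0]  # b = 2*a exactly
x_ols = ridge_1d(a, b, 0.0)    # least squares recovers 2.0
x_ridge = ridge_1d(a, b, 1.0)  # regularization shrinks the estimate
```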
Training a Lexical Function model with Ridge Regression:
```python
#ex17.py
#-------

from composes.utils import io_utils
from composes.composition.lexical_function import LexicalFunction
from composes.utils.regression_learner import RidgeRegressionLearner

#training data
#trying to learn a "good" function
train_data = [("good_function", "car", "good_car"),
              ("good_function", "book", "good_book")
              ]

#load argument and phrase space
arg_space = io_utils.load("./data/out/ex10.pkl")
phrase_space = io_utils.load("data/out/PHRASE_SS.ex10.pkl")

print "\nDefault regression:"
my_comp = LexicalFunction()
print type(my_comp.regression_learner).__name__
my_comp.train(train_data, arg_space, phrase_space)

#print its parameters
print "Lexical function space:"
print my_comp.function_space.id2row
cooc_mat = my_comp.function_space.cooccurrence_matrix
cooc_mat.reshape(my_comp.function_space.element_shape)
print cooc_mat

print "\nRidge Regression with lambda = 2"
rr_learner = RidgeRegressionLearner(param = 2,
                                    intercept = False,
                                    crossvalidation = False)
my_comp = LexicalFunction(learner = rr_learner)
my_comp.train(train_data, arg_space, phrase_space)

#print its parameters
print "Lexical function space:"
print my_comp.function_space.id2row
cooc_mat = my_comp.function_space.cooccurrence_matrix
cooc_mat.reshape(my_comp.function_space.element_shape)
print cooc_mat
```
Applying the Lexical Function model recursively:
```python
#ex18.py
#-------

from composes.utils import io_utils
from composes.composition.lexical_function import LexicalFunction

#training data
#trying to learn a "good" function
train_data = [("good_function", "car", "good_car"),
              ("good_function", "book", "good_book")
              ]

#load argument and phrase space
arg_space = io_utils.load("./data/out/ex10.pkl")
phrase_space = io_utils.load("data/out/PHRASE_SS.ex10.pkl")

#train a lexical function model on the data
my_comp = LexicalFunction()
my_comp.train(train_data, arg_space, phrase_space)

#apply the trained model
comp_sp1 = my_comp.compose([("good_function", "car", "good_car")],
                           arg_space)

#apply the trained model a second time
comp_sp2 = my_comp.compose([("good_function", "good_car", "good_good_car")],
                           comp_sp1)

#print the composed spaces:
print "\nComposed space 1:"
print comp_sp1.id2row
print comp_sp1.cooccurrence_matrix

print "\nComposed space 2:"
print comp_sp2.id2row
print comp_sp2.cooccurrence_matrix
```
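Recursive application works because each composition step is just another matrix-by-vector product, so the output of one step can feed the next. A toy sketch with a hypothetical matrix for "good" (made-up numbers):

```python
def mat_vec(m, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]

# hypothetical learned matrix for "good"
good = [[1.0, 0.5],
        [0.0, 2.0]]
car = [2.0, 2.0]

good_car = mat_vec(good, car)            # first application
good_good_car = mat_vec(good, good_car)  # second application
```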
Training and using a 3D Lexical Function composition model (e.g., for transitive verbs, as done in Grefenstette et al.):
```python
#ex19.py
#-------

from composes.semantic_space.space import Space
from composes.composition.lexical_function import LexicalFunction
from composes.utils.regression_learner import LstsqRegressionLearner

#training data1: VO N -> SVO
train_vo_data = [("hate_boy", "man", "man_hate_boy"),
                 ("hate_man", "man", "man_hate_man"),
                 ("hate_boy", "boy", "boy_hate_boy"),
                 ("hate_man", "boy", "boy_hate_man")
                 ]

#training data2: V N -> VO
train_v_data = [("hate", "man", "hate_man"),
                ("hate", "boy", "hate_boy")
                ]

#load N and SVO spaces
n_space = Space.build(data = "./data/in/ex19-n.sm",
                      cols = "./data/in/ex19-n.cols",
                      format = "sm")
svo_space = Space.build(data = "./data/in/ex19-svo.sm",
                        cols = "./data/in/ex19-svo.cols",
                        format = "sm")

print "\nInput SVO training space:"
print svo_space.id2row
print svo_space.cooccurrence_matrix

#1. train a model to learn VO functions on train data: VO N -> SVO
print "\nStep 1 training"
vo_model = LexicalFunction(learner=LstsqRegressionLearner())
vo_model.train(train_vo_data, n_space, svo_space)

#2. train a model to learn V functions on train data: V N -> VO
# where VO space: function space learned in step 1
print "\nStep 2 training"
vo_space = vo_model.function_space
v_model = LexicalFunction(learner=LstsqRegressionLearner())
v_model.train(train_v_data, n_space, vo_space)

#print the learned model
print "\n3D Verb space"
print v_model.function_space.id2row
print v_model.function_space.cooccurrence_matrix

#3. use the trained models to compose new SVO sentences

#3.1 use the V model to create new VO combinations
vo_composed_space = v_model.compose([("hate", "woman", "hate_woman"),
                                     ("hate", "man", "hate_man")],
                                    n_space)

#3.2 the new VO combinations will be used as functions:
# load the new VO combinations obtained through composition into
# a new composition model
expanded_vo_model = LexicalFunction(function_space=vo_composed_space,
                                    intercept=v_model._has_intercept)

#3.3 use the new VO combinations by composing them with subject nouns
# in order to obtain new SVO sentences
svo_composed_space = expanded_vo_model.compose([("hate_woman", "woman", "woman_hates_woman"),
                                                ("hate_man", "man", "man_hates_man")],
                                               n_space)

#print the composed spaces:
print "\nSVO composed space:"
print svo_composed_space.id2row
print svo_composed_space.cooccurrence_matrix
```
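Conceptually, a transitive verb is a third-order tensor: contracting it with an object vector yields a matrix (the VO function), and applying that matrix to a subject vector yields the SVO vector. A toy pure-Python sketch of the two contractions (made-up numbers; DISSECT learns the tensor by regression as in the example above):

```python
def mat_vec(m, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]

def tensor_vec(t, x):
    """Contract a 3rd-order tensor with a vector along its last mode,
    yielding a matrix: M[i][j] = sum_k t[i][j][k] * x[k]."""
    return [[sum(tijk * xk for tijk, xk in zip(tij, x)) for tij in ti]
            for ti in t]

# hypothetical 2x2x2 tensor for "hate" and toy noun vectors
hate = [[[1.0, 0.0], [0.0, 1.0]],
        [[0.0, 1.0], [1.0, 0.0]]]
boy = [1.0, 2.0]
man = [3.0, 1.0]

hate_boy = tensor_vec(hate, boy)       # a 2x2 matrix: the VO function
man_hate_boy = mat_vec(hate_boy, man)  # a vector: the SVO phrase
```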
The apply_composition.py script can be used to generate phrases from their constituents, either by specifying parameter values directly on the command line or by using a previously trained model file.
Usage:
```
python2.7 apply_composition.py [options] [config_file]
```

Options:
- -i, --input input_file
  Input file containing one element1 element2 composed_phrase tuple per line. The words (or phrases) in column 1 are composed with the words (or phrases) in column 2. A semantic space for the composed items is created using the strings in column 3 as phrase labels (note that these strings are arbitrary; they have no mandatory relation to element1 and element2). If the Lexical Function model is applied, element1 is interpreted as the name of the functor to be used, and element2 as the argument.
- -o, --output directory
  Output directory for the resulting composed space. The output is a pickle dump of the composed space (and possibly a sparse or dense file with the same data, if requested with the --output_format option). The output files are named COMPOSED_SS.model_name.input_file.format, e.g., COMPOSED_SS.Dilation.myphrases.txt.pkl.
- -m, --model model_name
  Name of the composition model to be applied, whose parameters are specified directly on the command line (instead of being read from a model file). One of mult (Multiplicative), weighted_add (Weighted Additive) or dilation (Dilation). Exactly one of -m or --load_model must be provided.
- --load_model model_file
  File containing a previously saved composition model (pickle dump). See below for how model files can be created from the command line. Exactly one of -m or --load_model must be provided.
- --alpha alpha_parameter_for_WeightedAdditive
  Required when --model is weighted_add, ignored otherwise.
- --beta beta_parameter_for_WeightedAdditive
  Required when --model is weighted_add, ignored otherwise.
- --lambda lambda_parameter_for_Dilation
  Required when --model is dilation, ignored otherwise.
- -a, --arg_space space_file or space_file1,space_file2
  File(s) containing the space(s) of the arguments. If a second file is provided, the second element of each pair is interpreted in the additional space. Pickle format (and .pkl extension) required.
- --output_format additional_output_format
  Additional output format for the resulting composed space: one of sm (sparse matrix) or dm (dense matrix), in addition to the default pickle output. Optional.
- -l, --log file
  Logger output file. Optional; by default no logging output is produced.
- -h, --help
  Displays the help message.
Examples:
```
python2.7 apply_composition.py -i ../examples/data/in/data_to_comp.txt -m dilation --lambda 2 -a ../examples/data/out/ex01.pkl -o ../examples/data/out/ --output_format dm

python2.7 apply_composition.py -i ../examples/data/in/data_to_comp.txt -m mult -a ../examples/data/out/ex01.pkl -o ../examples/data/out/ --output_format dm

python2.7 apply_composition.py -i ../examples/data/in/data_to_comp.txt --load_model ../examples/data/out/model01.pkl -a ../examples/data/out/ex01.pkl -o ../examples/data/out/ --output_format dm

python2.7 apply_composition.py -i ../examples/data/in/data_to_comp2.txt --load_model ../examples/data/out/model01.pkl -a ../examples/data/out/ex01.pkl,../examples/data/out/PER_SS.ex05.pkl -o ../examples/data/out/ --output_format dm
```
The train_composition.py script is used to create model files that can then be loaded by the apply_composition.py script through the --load_model option.
Usage:
```
python2.7 train_composition.py [options] [config_file]
```

Options:
- -i, --input input_file
  Input file containing one element1 element2 phrase tuple per line. The words (or phrases) in columns 1 and 2 are extracted from the argument space, the phrase in column 3 from the phrase space. When training a Lexical Function model, the first column (element1) contains a functor name, and the element2 and phrase vectors are used as an input-output training pair when estimating the corresponding function (a separate function is trained for each distinct element1 in the file).
- -o, --output directory
  Output directory for the resulting composition model. The output is a pickle dump of the composition model object, named TRAINED_COMP_MODEL.model_name.input_file.pkl, e.g., TRAINED_COMP_MODEL.weighted_add.mytrainingfile.pkl.
- -m, --model model_name
  Name of the composition model to be trained. One of weighted_add (Weighted Additive), full_add (Full Additive), lexical_func (Lexical Function) or dilation (Dilation).
- -r, --regression regression_method
  Used when --model is one of full_add or lexical_func. One of lstsq (Least-Squares Regression) or ridge (Ridge Regression); see the section on training composition models in Python above for more details. Optional; lstsq by default.
- --crossvalidation True/False
  Used when -r is ridge. Optional; default is True.
- --intercept True/False
  Used when --model is one of full_add or lexical_func. Optional; default is True.
- --lambda lambda_parameter_for_Ridge_regression
  Required when -r is ridge and --crossvalidation is False, ignored otherwise.
- --lambda_range lambda_range_for_Ridge_regression
  Used when -r is ridge and --crossvalidation is True, as the search range for the lambda parameter. Format is comma-separated values, e.g., 0,1,2,3,4,5. Optional; default is linspace(0, 0.5, 10) (that is, 10 equally spaced values from 0 to 0.5).
- -a, --arg_space file
  File containing the space of the arguments. Pickle format (and .pkl extension) required.
- -p, --phrase_space file
  File containing the phrase space used for training. Pickle format (and .pkl extension) required.
- --export_params True/False
  If True, the parameters of the learned model are exported to an appropriate format. Optional; False by default.
- -l, --log file
  Logger output file. Optional; by default no logging output is produced.
- -h, --help
  Displays the help message.
Examples:
```
python2.7 train_composition.py -i ../examples/data/in/train_data.txt -m lexical_func -a ../examples/data/out/ex01.pkl -p ../examples/data/out/PHRASE_SS.ex10.pkl -o ../examples/data/out/ --export_params True

python2.7 train_composition.py -i ../examples/data/in/train_data.txt -m lexical_func -r ridge --lambda 0.0 -a ../examples/data/out/ex01.pkl -p ../examples/data/out/PHRASE_SS.ex10.pkl -o ../examples/data/out/ --export_params True
```
After running one of these examples, you can use the output model to generate phrases with apply_composition.py (in this case, making sure that the functor name in the input file is book_function).