A qualia role-based entity-dependency graph (EDG) is proposed to represent and extract quantity relations for solving algebra story problems stated in Chinese. Traditional neural solvers use end-to-end models to translate problem texts into math expressions, and they fall short in acquiring quantity relations in sophisticated scenarios. To address this problem, the proposed method leverages the EDG to represent quantity relations hidden in the qualia roles of math objects. Algorithms were designed for EDG generation and quantity relation extraction. Experimental results show that the proposed method achieved an average accuracy of 82.2% on quantity relation extraction, compared to 74.5% for the baseline method. A further prompt learning experiment shows a 5% increase in problem solving when the extracted quantity relations are injected into the baseline neural solvers.

Automatically solving math word problem (MWP) is a long-standing artificial intelligence (AI) challenge problem that dates back to the 1960s [

Algebra story problem [

Traditional MWP solvers, such as rule-based methods [

To represent the relations between quantities mentioned in the problem text, Roy et al. [

Inspired by their work, we propose an EDG method to represent the quantity relations for solving algebra story problems stated in Chinese. The proposed method leverages qualia roles [

The main contributions of this paper lie in:

A qualia role-based entity-dependency graph (EDG) is proposed to efficiently represent the quantity relations carried by the properties of interconnected math objects, especially math objects that are not literally correlated in algebra story problems stated in Chinese.

An algorithm is designed to extract quantity relations from the built EDG to improve algebra story problem solving.

The rest of the paper is organized as follows.

This paper aims to build a graph-based representation method to improve the efficiency of quantity relation extraction for solving algebra story problems stated in Chinese. As discussed above, the algebra story problem is a sub-category of the math word problem which is much more complex than other math word problems such as arithmetic word problems [

Other researchers adopt the deep learning approach and obtain an expression from the problem text through training on the big data set without manual intervention, so as to calculate and solve the target. A method to align the quantities in the text with the template by pre-defining the equation template and using the deep learning is proposed in [

The quantity relations and solution goals in algebra story problems are presented in association with math objects. Recognizing the math objects and identifying the relationships between them is therefore the fundamental step before quantity relation extraction. In the field of natural language processing (NLP), objects are usually named entities, and a fair amount of work has been done on entity relation representation. This section introduces the general approaches to entity relation representation in NLP: the semantic relation-based method, the semantic framework-based method, and the commonsense network-based method.

The semantic relation-based method classifies words according to the lexical meaning to obtain the relations between words, such as WordNet [

The method based on semantic framework takes the semantic context of natural language into constructing lexical associations, such as FrameNet database [

The commonsense network-based method represents general commonsense information, such as ConceptNet [

As described in [

Later, Yuan [

The qualia role explores subjective evaluation and objective attribute of the math objects within an individual framework, which coincides with the expectation of the

In this section, a qualia role-based entity-dependency graph (EDG) is introduced to represent quantity relations for solving algebra story problems. An EDG is composed of a node set

In an algebra story problem, quantities are mostly associated with the attributes that belong to math objects and the attributes are related to each other according to the interconnections of the math objects. Quantity relations are formed from the source formula of the attribute(s) of an individual math object or the attribute relation(s) between the interconnected math objects.

For example, in the case of “

In this paper, the detected objects are divided into two categories: math entity and math attribute, from the perspective of semantic roles.

Both

Note that not all math attributes are presented explicitly in the problem text. For example, in the statement of “

Another issue is that the boundary between math attributes and math entities is not static, i.e., there is a variable semantic role. For example, in the case of “

In the qualia role description system, entity relationships are modeled by the qualia roles of entities. Specifically, the materiality roles of entities are determined by the type and syntactic structure, and we use ten qualia roles [

Recalling the example of

Entity dependency presents the relationship that exists between different entities. The relationship can be described as a graph called entity-dependency graph (EDG). We define the EDG as a tuple

As shown in

As a result, entity relationships can be organized as a network, the entity-dependency graph (EDG), which is composed of the above three relationship networks. In the field of math word problem solving, we can build a static EDG that contains domain knowledge and commonsense knowledge to store and represent the attribute relations of entities. In addition, an EDG can be used to represent the dynamic relations of entities given by the problem text. In the end, all the static relations and dynamic relations are integrated into a single EDG. Next, we describe how the quantity relations are represented and stored in an EDG.
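The integration of static and dynamic relations into a single graph can be illustrated with a minimal sketch. It is not the paper's implementation: the `Node` and `EDG` classes, the node kinds, the edge labels `has_attribute` and `has_value`, and the car example are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    name: str
    kind: str  # hypothetical labels: "entity", "attribute" or "quantity"


@dataclass
class EDG:
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # (source, label, target) triples

    def add_edge(self, source, label, target):
        """Add a labelled edge and register both endpoints as nodes."""
        self.nodes.update((source, target))
        self.edges.add((source, label, target))

    def merge(self, other):
        """Return a new graph holding the union of both graphs."""
        return EDG(self.nodes | other.nodes, self.edges | other.edges)


# Static part (assumed domain knowledge): a car has speed, time and distance.
car = Node("car", "entity")
speed = Node("speed", "attribute")
time = Node("time", "attribute")
distance = Node("distance", "attribute")

static = EDG()
for attribute in (speed, time, distance):
    static.add_edge(car, "has_attribute", attribute)

# Dynamic part (made-up problem text): the text gives values for speed and time.
dynamic = EDG()
dynamic.add_edge(speed, "has_value", Node("60", "quantity"))
dynamic.add_edge(time, "has_value", Node("2", "quantity"))

combined = static.merge(dynamic)
```

Because nodes are hashable value objects, an attribute that appears in both the static and the dynamic graph collapses into one node after the merge, which is what lets quantities from the text attach to attributes known from domain knowledge.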

According to the definition of the categories of entity relationships, quantity relations in algebra story problems could be divided into the following three categories:

The above types of quantity relations exist everywhere in algebra story problems. The difference between each category is that the attribute relation is used to represent the quantities of the attributes within an attribute sub-graph

This section introduces the algorithms for extracting quantity relations from a given problem, covering EDG construction and sub-graph identification. First, a generation algorithm builds an instance graph of the entity relation network, the entity-dependency graph (EDG), from an input problem text. Then, an extraction algorithm obtains quantity relations from the generated EDG.

To build the EDG for each input problem, we first create a node set

Take the problem shown in

As discussed in

The core task of Algorithm 2 is to search out all the sub-graphs

For example in

Although some AWP solvers can use the generated expressions directly, most neural solvers cannot accept them. To address this incompatibility, the expressions need to be transformed into natural language statements before they are fed to the solvers.
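One simple way to picture this transformation is a template lookup per operator. The sketch below is an assumption, not the method used in the paper: the `TEMPLATES` table, the `verbalize` and `inject` helpers, and the English wording are hypothetical, and a real system for Math23K would emit Chinese statements.

```python
# Hypothetical operator templates; the paper does not specify its wording.
TEMPLATES = {
    "+": "{lhs} equals {a} plus {b}.",
    "-": "{lhs} equals {a} minus {b}.",
    "*": "{lhs} equals {a} times {b}.",
    "/": "{lhs} equals {a} divided by {b}.",
}


def verbalize(lhs, a, op, b):
    """Turn a binary relation lhs = a op b into a plain sentence."""
    if op not in TEMPLATES:
        raise ValueError(f"unsupported operator: {op!r}")
    return TEMPLATES[op].format(lhs=lhs, a=a, b=b)


def inject(problem_text, relations):
    """Append the verbalized relations to the problem text."""
    statements = " ".join(verbalize(*relation) for relation in relations)
    return f"{problem_text} {statements}".strip()


sentence = verbalize("distance", "speed", "*", "time")
prompt = inject(
    "A car travels at 60 km per hour for 2 hours.",
    [("distance", "speed", "*", "time")],
)
```

Keeping the relation as a structured tuple until the last step means the same extracted relation can be passed unchanged to solvers that accept expressions and verbalized only for those that need text.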

In this section, we present the experimental results of the proposed method compared to the state-of-the-art on Math23K, a public dataset of problems stated in Chinese.

Proportion: problems involve or require proportions.

Unitary: problems involve total quantities and require the quantity per day, minute, etc.

Interest rate: problems involve or require the interest and interest rate.

Summation: problems involve the partial values and require the total value.

Motion: problems involve or require the distance, time and speed.

Compared with Mayer’s work, we merged several simple categories into more comprehensive ones. For example, Mayer’s work treats the DRT (distance-rate-time) problems and Motion problems as two separate categories, whereas in our experiment they are merged into a single group named Motion because they have similar graph structures and source formulas.

To build the new data set, we first create a set of keywords for each category, and a classifier then assigns each problem to a category. In the end, we group the selected 6028 problems into five categories: 1377 proportion problems, 1115 unitary problems, 239 interest rate problems, 1464 summation problems, and 1833 motion problems.
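The keyword-driven grouping can be sketched as simple keyword voting. This is only an illustration under stated assumptions: the paper lists neither its keywords nor its classifier, so `CATEGORY_KEYWORDS` and `classify` are hypothetical, and the English keywords stand in for the Chinese ones a Math23K pipeline would need.

```python
# Hypothetical English stand-ins for the per-category keyword sets.
CATEGORY_KEYWORDS = {
    "proportion": ("ratio", "proportion"),
    "unitary": ("per day", "per minute", "each"),
    "interest rate": ("interest",),
    "summation": ("in total", "altogether"),
    "motion": ("speed", "distance", "km"),
}


def classify(problem_text):
    """Return the category with the most keyword hits, or None if no hit."""
    text = problem_text.lower()
    scores = {
        category: sum(keyword in text for keyword in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


motion_label = classify("A car moves at a speed of 60 km per hour. What distance does it cover?")
interest_label = classify("The bank pays 3% interest per year on a deposit.")
unknown_label = classify("What is 2 plus 3?")
```

Returning `None` for problems without any keyword hit mirrors the fact that only a subset of the data set is selected, so unmatched problems can simply be left out of the five groups.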

Type of question | Number of problems | Percentage |
---|---|---|
Proportion | 1377 | 5.9% |
Unitary | 1115 | 4.8% |
Interest rate | 239 | 1.0% |
Summation | 1464 | 6.3% |
Motion | 1833 | 7.9% |
Total | 6028 | 26.1% |

All the above baseline methods are built on the dataset of Math23K which is stated in Chinese. Another similar work employed UDG [

As the performance of quantity relation extraction is not evaluated by all the neural solvers, we only compare the result to the

Type of question | Number of questions | Baseline Acc | Baseline R | Baseline F1 | EDG Acc | EDG R | EDG F1 |
---|---|---|---|---|---|---|---|
Proportion | 143 | 0.711 | 0.621 | 0.633 | 0.721 | 0.750 | 0.735 |
Unitary | 201 | 0.725 | 0.770 | 0.747 | 0.882 | 0.893 | 0.887 |
Interest rate | 219 | 0.712 | 0.622 | 0.664 | 0.775 | 0.747 | 0.761 |
Summation | 621 | 0.805 | 0.679 | 0.737 | 0.866 | 0.843 | 0.855 |
Motion | 237 | 0.655 | 0.532 | 0.587 | 0.761 | 0.754 | 0.759 |
Total | 1421 | 0.745 | 0.653 | 0.695 | 0.822 | 0.811 | 0.817 |
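The per-category accuracy gains can be read off the table with a few lines of arithmetic. The sketch assumes that the first Acc column belongs to the baseline and the second to EDG, which is consistent with the 74.5% and 82.2% averages quoted in the abstract; the variable names are illustrative.

```python
# (baseline Acc, EDG Acc) per row, copied from the table above.
accuracy = {
    "Proportion": (0.711, 0.721),
    "Unitary": (0.725, 0.882),
    "Interest rate": (0.712, 0.775),
    "Summation": (0.805, 0.866),
    "Motion": (0.655, 0.761),
    "Total": (0.745, 0.822),
}

# Absolute gain of EDG over the baseline, rounded to three decimals.
gains = {
    name: round(edg - baseline, 3)
    for name, (baseline, edg) in accuracy.items()
}

# Category (excluding the Total row) with the largest gain.
largest = max((name for name in gains if name != "Total"), key=gains.get)
```

Under this reading, the overall gain is 0.077, the largest category gain is on Unitary problems (0.157), and the smallest is on Proportion problems (0.010).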

Inspired by [

Type of question | GTS | Graph2Tree | GTS + EDG | Graph2Tree + EDG |
---|---|---|---|---|
Summation | 82.7 | 80.9 | 85.0 | 82.9 |
Motion | 89.6 | 88.9 | 90.7 | 90.2 |
Interest rate | 86.2 | 82.8 | 87.9 | 84.9 |
Unitary | 27.7 | 34.7 | 32.7 | 52.0 |
Proportion | 43.9 | 44.3 | 47.2 | 49.7 |
Average | 65.8 | 66.5 | 68.4 | 71.9 |

From

For most MWP solvers, problem texts are treated as an unstructured word sequence. However, the relationships among the math objects involved in the text, such as math entities and attributes, are structured information with corresponding hidden domain knowledge. This structured information can be organized as a tree or graph [

Compared with the above three types of graphs, the proposed entity-dependency graph (EDG) can be used to represent the relationship among the quantities and math entities in sophisticated scenarios. In each constructed EDG (as shown in

This is a first attempt to integrate nodes of math entities, attributes and quantities into a single graph, which provides more indicative information for solution tree construction.

The qualia structure is used to model the relationships among math entities, which avoids the heavy manual work of semantic role labeling for text parsing.

The edges in the constructed EDG can easily be formed into path(s) by integrating a lightweight inference engine to generate an interpretable solution.

Quantity relation extraction is essential for building MWP solvers, especially solvers built for intelligent tutoring systems. This paper proposed a qualia role-based entity-dependency graph (EDG) to represent quantity relations for solving algebra story problems stated in Chinese. Algorithms were designed to generate the EDG and to extract mathematical expressions from it. The extracted mathematical expressions are then used as input to the solver to calculate the final answer. Experimental results showed that, compared to baseline methods, the proposed method improved the average accuracy of quantity relation extraction by up to 7% and problem solving by 5%.

Despite these encouraging results, there is room for improvement. First, the proposed method can currently be applied only to algebra story problems stated in Chinese. In future work, we will develop an extended qualia role knowledge base to support solving problems stated in other languages, such as English. Second, further research is needed to improve the capability of mathematical property representation. Mathematical properties are much more complex than numerical values, and new encoding mechanisms should be explored when constructing the EDG. Recent research results on natural language processing and deep neural symbol networks will be considered for extracting the math entities in a variety of MWPs and for modeling the relationships between the extracted math entities. Finally, we will also explore the possibility of using the entity-dependency graph to improve the interpretability of the generated solutions.