An Example of Sentiment Polarity Analysis and Recognizing Textual Entailment Utilizing Texts in Japanese Hotel Reviews Posted on Jalan Net

In order to create a better customer experience, we need to know what makes a service or product attractive to customers. Megagon Labs Tokyo has developed the Japanese Realistic Textual Entailment Corpus (JRTE Corpus), a corpus of text reviews of hotels and other lodging facilities published on the travel information website Jalan Net. The JRTE Corpus is enriched with labels provided by human annotators. This article introduces the JRTE corpus and gives a simple example of using it to train (fine-tune) a standard machine learning model (BERT).

Online word-of-mouth reviews are essential for users deciding whether to use a service or product. Prospective customers can make better choices if they learn about an unfamiliar area beforehand, or discover the amenities of a lodging option they have never tried but that others reviewed favorably. Unfortunately, it is still not easy to quickly extract the information you want from the large volume of user reviews online. To solve this problem, we decided to create a corpus that would help with the automatic extraction and organization of such knowledge.

Read the paper
Download JRTE Corpus

Contents of the JRTE Corpus

The JRTE corpus enriches the review text with three kinds of labels: presence/absence of a hotel feature, sentiment polarity, and textual entailment. The corpus is distinctive because its sentences are not created artificially but come from real reviews. Building such a corpus is normally time-consuming, but the JRTE corpus is immediately available for academic purposes, which makes it a valuable language resource for academic development.

The presence/absence label and the sentiment polarity label of the hotel feature

The corpus assigns a binary label (Yes=1, No=0) depending on whether or not a sentence mentions a feature of the hotel. It also assigns a three-level sentiment polarity label (positive=1, negative=-1, neutral=0). In this article, we call the classification tasks for these labels “RHR” and “PN,” respectively. Example data is shown below. By building a classifier with this data, we can extract mentions of popular lodging features.

An example of hotel features with Y/N label and sentiment polarity label
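These labeled sentences are straightforward to process programmatically. The sketch below is a minimal, hypothetical illustration: it assumes a simplified two-column TSV layout (polarity label, sentence) with English stand-in sentences — the real pn.tsv is in Japanese and may carry additional columns such as IDs or annotator information, so adjust the parsing accordingly.

```python
import csv
import io

# Hypothetical rows mimicking a (label, sentence) TSV layout.
sample_tsv = (
    "1\tThe breakfast was delicious.\n"
    "0\tWe stayed for two nights.\n"
    "-1\tThe room was cramped.\n"
)

def load_pn(fileobj):
    """Read (polarity, sentence) pairs from a tab-separated stream."""
    reader = csv.reader(fileobj, delimiter="\t")
    return [(int(label), sentence) for label, sentence in reader]

examples = load_pn(io.StringIO(sample_tsv))
for label, sentence in examples:
    polarity = {1: "positive", 0: "neutral", -1: "negative"}[label]
    print(f"{polarity}: {sentence}")
```

With the real corpus, `io.StringIO(sample_tsv)` would be replaced by an open file handle on `jrte-corpus/data/pn.tsv`.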

The following Japanese corpora with sentiment polarity labels are publicly available:

Textual Entailment Label

The corpus assigns a binary label (entail=1, not entail=0) depending on whether the hypothesis (H) is necessarily true whenever the premise (P) is true (i.e., whether P entails H). In this article, we call this classification task “RTE.” Example data is shown below. By building a classifier with this data, we can find sentences that express the same thing with different wording.

“Best view from the room” does not necessarily mean “you can see the ocean from the room,” so that pair is labeled 0. On the other hand, “The room had an ocean view” does mean “you can see the ocean from the room,” so that pair is labeled 1.
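The asymmetry of entailment can be made concrete with a toy model (this is purely illustrative and not how the corpus or the classifier works): treat each sentence as the set of facts it asserts, so that P entails H exactly when every fact asserted by H is also asserted by P.

```python
# Toy illustration of entailment: P entails H when every fact
# asserted by H is also asserted by P.
def entails(premise_facts: set, hypothesis_facts: set) -> int:
    return int(hypothesis_facts <= premise_facts)

ocean_view_room = {"room has a view", "view is of the ocean"}   # "The room had an ocean view"
best_view       = {"room has a view", "view is excellent"}      # "Best view from the room"
sea_visible     = {"room has a view", "view is of the ocean"}   # "You can see the ocean from the room"

print(entails(ocean_view_room, sea_visible))  # 1: entailed
print(entails(best_view, sea_visible))        # 0: not entailed
```

The real task is far harder, of course: the corpus pairs are free-form Japanese sentences, and the model must learn this relation from the labeled examples rather than from hand-built fact sets.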


An Example of Recognizing Textual Entailment

The following Japanese corpora with Recognizing Textual Entailment labels are publicly available:

Examples of using the JRTE corpus for model training

There are many ways to build a classifier from a corpus; here we fine-tune the publicly available Japanese BERT model from Tohoku University for each task. (For an explanation of BERT itself, the OGIS Research Institute’s article is a helpful reference.)

The first step is to get the JRTE corpus and the sample scripts, and to install the required libraries.

$ git clone

$ cd jrte-corpus_example

$ git clone

$ pip3 install poetry

$ poetry install --no-root

The next step is to build the models by fine-tuning with the training script. On a GPU (NVIDIA V100-SXM2), this took about 1.5 minutes each for PN and RHR and about 5 minutes for RTE.

$ poetry run python3 ./ -i ./jrte-corpus/data/pn.tsv -o ./model-pn --task pn

$ poetry run python3 ./ -i ./jrte-corpus/data/rhr.tsv -o ./model-rhr --task rhr

$ poetry run python3 ./ -i './jrte-corpus/data/rte.*.tsv' -o ./model-rte --task rte

Verifying that the model works

Now, try setting up a classification server with transformers-cli serve to test the models.

$ poetry run transformers-cli serve --task sentiment-analysis --model ./model-pn --port 8900

$ curl -X POST -H "Content-Type: application/json" "http://localhost:8900/forward" -d '{"inputs":["ご飯が美味しいです。", "3人で行きました。" , "部屋は狭かったです。"] }' 


Each of the three input sentences (“The food is delicious.”, “The three of us went.”, “The room was small.”) is given the correct label: positive, neutral, and negative, respectively.

$ poetry run transformers-cli serve --task sentiment-analysis --model ./model-rhr --port 8901

$ curl -X POST -H "Content-Type: application/json" "http://localhost:8901/forward" -d '{"inputs":["ご飯が美味しいです。", "3人で行きました。"] }' 


The two input sentences (“The food is delicious.”, “The three of us went.”) are correctly labeled as mentioning a hotel feature and not mentioning one, respectively.

$ poetry run transformers-cli serve --task sentiment-analysis --model ./model-rte --port 8902

$ curl -X POST -H "Content-Type: application/json" "http://localhost:8902/forward" -d '{"inputs":[["風呂がきれいです。", "食事が美味しいです" ] , [ "暑いです。", "とても暑かった"]] }' 


Each input sentence pair (“The bath is clean.” / “The food is delicious.”; “It is hot.” / “It was very hot.”) is given the correct label: “not entailed” and “entailed,” respectively. Of course, this does not mean the model labels every input correctly; it can still make mistakes.
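The same requests can be issued from Python instead of curl. The sketch below only builds the request; actually sending it assumes one of the servers started above is listening on port 8900, so that part is left commented out.

```python
import json
from urllib import request

# Build the same JSON payload as the curl example for the PN server.
inputs = ["ご飯が美味しいです。", "3人で行きました。", "部屋は狭かったです。"]
payload = json.dumps({"inputs": inputs}).encode("utf-8")

req = request.Request(
    "http://localhost:8900/forward",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment once the transformers-cli server is running:
# with request.urlopen(req) as resp:
#     print(json.load(resp))
```

For the RTE server on port 8902, `inputs` would instead be a list of sentence pairs, exactly as in the curl example above.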

Model Performance Evaluation

Now, we examine the accuracy using the evaluation data.

$ poetry run python3 ./tra --evaluate -i ./jrte-corpus/data/pn.tsv --base ./model-pn --task pn -o ./model-pn/evaluate_output.txt

$ awk '{if($1==$2){ok+=1} } END{ print(ok, NR, ok/NR) }' ./model-pn/evaluate_output.txt 

464 553 0.83906 

$ poetry run python3 ./ --evaluate -i ./jrte-corpus/data/rhr.tsv --base ./model-rhr --task rhr -o ./model-rhr/evaluate_output.txt 

$ awk '{if($1==$2){ok+=1} } END{ print(ok, NR, ok/NR) } ' ./model-rhr/evaluate_output.txt 

490 553 0.886076 

$ poetry run python3 ./ --evaluate -i './jrte-corpus/data/rte.*.tsv' --base ./model-rte --task rte -o ./model-rte/evaluate_output.txt 

$ awk '{if($1==$2){ok+=1} } END{ print(ok, NR, ok/NR) } ' ./model-rte/evaluate_output.txt 

4903 5529 0.886779 
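The awk one-liner above can be mirrored in Python. This sketch assumes, as the awk script does, that each line of the evaluation output starts with two whitespace-separated fields: the gold label and the predicted label.

```python
# Python equivalent of: awk '{if($1==$2){ok+=1}} END{print(ok, NR, ok/NR)}'
def accuracy(lines):
    ok = total = 0
    for line in lines:
        total += 1  # NR: every line counts toward the total
        fields = line.split()
        # A line is correct when gold ($1) and prediction ($2) agree.
        ok += len(fields) >= 2 and fields[0] == fields[1]
    return ok, total, ok / total

# Toy stand-in for the contents of model-pn/evaluate_output.txt
sample = ["1 1", "0 1", "-1 -1", "0 0"]
print(accuracy(sample))  # (3, 4, 0.75)
```

On the real output file this reproduces the numbers above, e.g. `accuracy(open("./model-pn/evaluate_output.txt"))` would yield (464, 553, 0.839...).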

This has been a brief example of building classification models with the JRTE corpus. Accuracy can be improved further by refining the model architecture and parameters, or by separately collecting and labeling data the model tends to get wrong and adding it to the training data. Because the JRTE corpus is written in natural Japanese, error analysis of models is easy for Japanese speakers, and the corpus can also serve as a tutorial for those unfamiliar with text classification.


Megagon Labs will continue to enhance the research capabilities of the broader academic community in Japanese natural language processing by continuously releasing research data that will benefit researchers and students at public research institutions and universities.

Written by: Yuta Hayashibe / Megagon Labs Tokyo

Tag: Sentiment Polarity Analysis/感情極性分析, Recognizing Textual Entailment/含意関係認識, BERT

Follow us on LinkedIn and Twitter to stay up to date with us.

(*Note: Megagon Labs Tokyo is a research team of Recruit Co.)

