My undergraduate honours dissertation was a Natural Language Processing (NLP) research project. It focused on multilingual text generation in under-represented languages. Because existing metrics performed very poorly at evaluating the outputs of models trained on the dataset I was using, I needed to train a learned regression metric.
Regression is useful for many text tasks, such as:
- Sentiment analysis: Predict the strength of positive or negative sentiment instead of simple binary classification.
- Writing quality estimation: Predict a numerical quality score for a piece of writing.
For my use case, I needed the model to score how good another model’s prediction was for a given task. My dataset’s rows consisted of the textual input and a label, 0 (bad prediction) or 1 (good prediction).
- Input: Text
- Label: 0 or 1
- The task: Predict a numerical probability between 0 and 1
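To make this setup concrete, here is a minimal sketch in plain Python. It is not the code from my dissertation: the example rows, the stand-in logits, and the choice of a sigmoid followed by mean squared error are all assumptions made for illustration. The idea is simply that the model emits one unbounded number per input, which is squashed into the range 0 to 1 and compared against the 0/1 label.

```python
import math

# Hypothetical rows in the shape described above: text in, 0/1 label out.
rows = [
    {"text": "example input paired with a good prediction", "label": 1.0},
    {"text": "example input paired with a bad prediction", "label": 0.0},
]


def sigmoid(logit: float) -> float:
    """Squash an unbounded model output into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-logit))


def mse_loss(scores, labels):
    """Mean squared error between predicted scores and the 0/1 labels."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(labels)


# Stand-in logits (assumed values); in practice these would come from
# a model with a single-output regression head.
logits = [2.0, -1.0]
scores = [sigmoid(z) for z in logits]
labels = [row["label"] for row in rows]
loss = mse_loss(scores, labels)
```

Because the labels are binary but the output is continuous, the trained model can express degrees of confidence, such as 0.8 for a mostly good prediction, rather than a hard class.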
But transformer-based models are usually used for generation tasks. Why would you use a pre-trained LM for…