In supervised classification tasks, a machine learning model is provided
with an input and, after the training phase, outputs one or more labels
from a fixed set of classes. Recent developments in large pre-trained
language models (LLMs), such as BERT, T5, and GPT-3, have given rise to a novel
approach to such tasks, namely prompting.
In prompting, no further training is
usually required (although fine-tuning remains an
option); instead, the input to the model is extended with additional
text specific to the task – a prompt. Prompts can contain questions about
the current sample, examples of input-output pairs, or task descriptions.
Using prompts as clues, an LLM can infer the
intended outputs from its implicit knowledge in a zero-shot fashion.
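As a rough illustration of this setup (not the exact prompts or models evaluated here), the following Python sketch appends a task-specific question to an input sample and lets an off-the-shelf causal language model complete the answer; the model name and the prompt wording are placeholder assumptions.

```python
# Illustrative zero-shot prompting sketch; "gpt2" and the prompt text are
# placeholder assumptions, not the models or prompts evaluated in this work.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in for a multilingual LLM such as mGPT or GPT-J-6B
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# The input sample is extended with task-specific text: the prompt.
sample = "The applicant complained that the length of the proceedings was excessive."
prompt = sample + "\nQuestion: Was there a violation of the Convention? Answer:"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5, do_sample=False)

# Decode only the newly generated tokens, i.e. the model's zero-shot answer.
answer = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(answer.strip())
```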
Legal prompt engineering is the process of creating, evaluating, and
recommending prompts for legal NLP tasks. It would enable legal experts to
perform legal NLP tasks, such as annotation or search, simply by querying
LLMs in natural language.
In this presentation, we investigate
prompt engineering for the task of legal judgement prediction (LJP). We use
data from the Swiss Federal Supreme Court and the European Court of Human
Rights, and we compare various prompts for LJP using multilingual LLMs
(mGPT, GPT-J-6B, etc.) in a zero-shot manner. We find that our approaches
achieve promising results, but the long documents common in the legal domain
remain a challenge compared to single-sentence inputs.
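To make the zero-shot comparison concrete, the sketch below scores a judgement prompt by comparing the likelihood a causal language model assigns to candidate outcome words. It is a hedged illustration: the verbalizers ("dismissed"/"approved"), the prompt, and the placeholder model are assumptions, not the exact configuration reported above.

```python
# Hedged sketch of zero-shot judgement prediction via label-likelihood scoring;
# the verbalizers, prompt, and model name are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in for a multilingual model such as mGPT or GPT-J-6B
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def label_logprob(context: str, label: str) -> float:
    """Sum of the log-probabilities the model assigns to the label tokens."""
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    label_ids = tokenizer(" " + label, return_tensors="pt").input_ids
    input_ids = torch.cat([ctx_ids, label_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Logits at position t predict token t+1, so take the positions
    # that predict the label tokens.
    label_logits = logits[0, ctx_ids.shape[1] - 1 : -1, :]
    log_probs = torch.log_softmax(label_logits, dim=-1)
    return log_probs.gather(1, label_ids[0].unsqueeze(1)).sum().item()

facts = "The appellant challenged the lower court's decision on procedural grounds."
prompt = facts + "\nVerdict: the appeal was"
prediction = max(["dismissed", "approved"], key=lambda lbl: label_logprob(prompt, lbl))
print(prediction)
```

In practice, court rulings are far longer than this toy input, which is exactly where the document-length limitation noted above becomes the bottleneck.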