In supervised classification tasks, a machine learning model is provided with an input and, after the training phase, outputs one or more labels from a fixed set of classes. Recent developments in large pre-trained language models (LLMs), such as BERT, T5, and GPT-3, have given rise to a novel approach to such tasks: prompting.
In prompting, usually no further training is required (although fine-tuning remains an option); instead, the input to the model is extended with additional text specific to the task – a prompt. Prompts can contain questions about the current sample, examples of input-output pairs, or task descriptions. Using prompts as clues, an LLM can infer the intended outputs from its implicit knowledge in a zero-shot fashion, as sketched below.
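The following minimal sketch illustrates this prompting setup with the Hugging Face `transformers` text-generation pipeline; the task description, example input, label set, and the choice of GPT-2 as the backbone are illustrative assumptions, not the configuration used in this work.

```python
# Minimal zero-shot prompting sketch: the task description and the sample are
# concatenated into a prompt, and the model's continuation is read off as the label.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # any causal LM can be substituted

prompt = (
    "Classify the sentiment of the following review as positive or negative.\n"
    "Review: The film was a complete waste of time.\n"
    "Sentiment:"
)

output = generator(prompt, max_new_tokens=3, do_sample=False)
print(output[0]["generated_text"])  # the generated continuation encodes the predicted class
```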
Legal prompt engineering is the process of creating, evaluating, and recommending prompts for legal NLP tasks. It would enable legal experts to perform legal NLP tasks, such as annotation or search, by simply querying LLMs in natural language.
In this presentation, we investigate prompt engineering for the task of legal judgement prediction (LJP). We use data from the Swiss Federal Supreme Court and the European Court of Human Rights, and we compare various prompts for LJP using multilingual LLMs (mGPT, GPT-J-6B, etc.) in a zero-shot manner. We find that our approaches achieve promising results, but the long documents common in the legal domain remain a challenge compared to single-sentence inputs.
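A rough sketch of how such a zero-shot LJP query could be posed to a multilingual causal LM is given below. The model ID, the input file, the prompt wording, the label verbalizers, and the simple truncation step are all assumptions for illustration, not the exact setup evaluated in this work; the truncation in particular reflects the long-document challenge noted above.

```python
# Sketch of zero-shot legal judgement prediction with a multilingual causal LM.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai-forever/mGPT"  # assumed Hub ID for mGPT; GPT-J-6B etc. plug in the same way
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Hypothetical file holding the facts of one case.
facts = open("case_facts.txt", encoding="utf-8").read()

# Legal documents often exceed the model's context window, so the facts are
# naively truncated here; handling full-length documents is the open challenge.
prompt = (
    "Facts of the case:\n"
    f"{facts[:4000]}\n\n"
    "Question: Will the court approve or dismiss the appeal?\n"
    "Answer:"
)

inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=2000)
outputs = model.generate(**inputs, max_new_tokens=5, do_sample=False)

# Decode only the continuation and map it onto the label set, e.g. {approve, dismiss}.
answer = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(answer.strip())
```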