A Novel Optimization Scheme for Named Entity Recognition with Pre-trained Language Models

Keywords

GPT-3
Named Entity Recognition
Sentence-BERT model
In-context example

DOI

10.26689/jera.v8i5.8402

Submitted: 2024-09-15
Accepted: 2024-09-30
Published: 2024-10-15

Abstract

Named Entity Recognition (NER) is crucial for extracting structured information from text. While traditional methods rely on rules, Conditional Random Fields (CRFs), or deep learning, the advent of large-scale Pre-trained Language Models (PLMs) offers new possibilities. PLMs excel at in-context learning, potentially simplifying many natural language processing tasks, yet their application to NER remains underexplored. This paper investigates leveraging the GPT-3 PLM for NER without fine-tuning. We propose a novel scheme that combines carefully crafted prompt templates with in-context examples selected by semantic similarity using a Sentence-BERT model. Our experimental results demonstrate the feasibility of this approach, suggesting a promising direction for harnessing PLMs in NER.
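
To make the abstract's pipeline concrete, the following is a minimal illustrative sketch of similarity-based in-context example selection and prompt construction. It assumes the sentence-transformers package (with "all-MiniLM-L6-v2" standing in for the Sentence-BERT encoder); the prompt template, the example format, and the call_gpt3() placeholder are assumptions for illustration, not the paper's exact setup.

    # Sketch: pick in-context examples by Sentence-BERT similarity and
    # build an NER prompt for a GPT-3-style completion model.
    from sentence_transformers import SentenceTransformer, util

    encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any Sentence-BERT encoder

    def select_examples(query, pool, k=4):
        """Return the k labeled examples whose sentences are most similar to the query."""
        query_emb = encoder.encode(query, convert_to_tensor=True)
        pool_embs = encoder.encode([ex["sentence"] for ex in pool], convert_to_tensor=True)
        scores = util.cos_sim(query_emb, pool_embs)[0]
        top_idx = scores.topk(min(k, len(pool))).indices.tolist()
        return [pool[i] for i in top_idx]

    def build_prompt(query, examples):
        """Fill a simple NER template with the selected in-context examples."""
        lines = ["Extract the named entities (PER, LOC, ORG) from each sentence."]
        for ex in examples:
            lines.append(f"Sentence: {ex['sentence']}\nEntities: {ex['entities']}")
        lines.append(f"Sentence: {query}\nEntities:")
        return "\n\n".join(lines)

    # Usage (call_gpt3 is a placeholder for whatever completion endpoint is used):
    # pool = [{"sentence": "Barack Obama visited Berlin.",
    #          "entities": "Barack Obama (PER); Berlin (LOC)"}, ...]
    # query = "Apple opened a store in Shanghai."
    # prompt = build_prompt(query, select_examples(query, pool))
    # prediction = call_gpt3(prompt)

The design choice illustrated here is that the demonstrations shown to the model are not fixed: they are re-selected per input sentence by cosine similarity in the Sentence-BERT embedding space, so the prompt always contains the most semantically relevant labeled examples.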
