NLP impacts: Policymaking and communication

NLP methods can deliver fascinating insights and help legal scholars accomplish their research goals with greater speed and efficiency. The results of such research can also be useful for policymakers, including members of parliament, advisory bodies, advocacy organisations, think tanks, and international organisations. More broadly, the emergence of AI techniques in research raises important ethical and governance questions that scholars will need to address during their work. 

For some researchers, therefore, building policy communication into the design of their research studies can be an attractive idea. For others, communication development may not fit the goals of the research, or it may already be the responsibility of another department in their institution.  

Whether you can rely on your communication department or must fit external relationships and societal impact alongside other research tasks, it helps to think about impact holistically. Policy impact can sometimes be viewed as an optional extra, but by thinking about it early in your project you can build the networks and skills that will ultimately save time while opening up unexpected opportunities for your research to make a difference to scholars, governments or ordinary citizens beyond your immediate research community. 

There are five areas you can consider and share with your team or department: 

  1. Community of practice
  2. Sustainable research
  3. Ethical AI
  4. Open science
  5. Communicating findings

 

1. Community of practice

If you are a scholar in legal or policy fields who is interested in using NLP tools, then you likely have what could be called a ‘community of practice’. Engaging your community of practice is important both for influencing it internally and for shaping its external impact. However, knowing who is in your community does not follow a precise formula, because calculating degrees of distance in network relationships is inherently fuzzy and depends on definitional choices. Nevertheless, network relationships in communities should be actively chosen and built. This first step is about reflecting on your existing community of practice, identifying its strengths and weaknesses, and looking for ways to improve it. 

 

Literature:  

Asakura, K., Occhiuto, K., Todd, S., Leithead, C., & Clapperton, R. (2020). A call to action on artificial intelligence and social work education: Lessons learned from a simulation project using natural language processing. Journal of Teaching in Social Work, 40(5), 501-518. 

Brundage, M. P., Sexton, T., Hodkiewicz, M., Dima, A., & Lukens, S. (2021). Technical language processing: Unlocking maintenance knowledge. Manufacturing Letters, 27, 42-46. 

Lhoest, Q., Del Moral, A. V., Jernite, Y., Thakur, A., Von Platen, P., Patil, S., … & Wolf, T. (2021). Datasets: A community library for natural language processing. arXiv preprint arXiv:2109.02846.

Parks, L., & Peters, W. (2023). Natural language processing in mixed-methods text analysis: A workflow approach. International Journal of Social Research Methodology, 26(4), 377-389.

 

2. Sustainable research

The computing power needed to run NLP models is often much higher than for modelling with traditional, small datasets, although for smaller projects the energy expenditure may not be high. How, then, can NLP researchers negotiate the difficult trade-offs between the social value of their research and its environmental impact? Researchers need to weigh these dilemmas for themselves, but the starting point is awareness. Researchers can (1) calculate the energy footprint of their research and (2) commit to public reporting practices that help raise awareness and build public understanding.  
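Calculating an energy footprint can start as simple arithmetic. The Python sketch below is a back-of-envelope estimate only: the wattage and grid-intensity figures are illustrative assumptions, and dedicated carbon-tracking tools or your provider's own reporting will give more accurate numbers.

```python
# Back-of-envelope estimate of the energy footprint of a model run.
# All numbers below are illustrative assumptions, not measurements:
# check your own hardware specs and grid data before reporting.

def energy_kwh(power_watts: float, hours: float) -> float:
    """Energy drawn by a device running at a given average power."""
    return power_watts * hours / 1000.0

def co2_kg(kwh: float, grid_intensity_kg_per_kwh: float) -> float:
    """CO2-equivalent emissions for that energy on a given grid."""
    return kwh * grid_intensity_kg_per_kwh

# Example: one GPU at ~300 W average draw for 48 hours of fine-tuning,
# on a grid emitting ~0.4 kg CO2e per kWh (both figures are assumptions).
kwh = energy_kwh(power_watts=300, hours=48)
emissions = co2_kg(kwh, grid_intensity_kg_per_kwh=0.4)
print(f"{kwh:.1f} kWh, {emissions:.2f} kg CO2e")
```

Even a rough figure like this, reported alongside results, supports the public reporting practices described above.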

 

Literature: 

Bannour, N., Ghannay, S., Névéol, A., & Ligozat, A. L. (2021, November). Evaluating the carbon footprint of NLP methods: a survey and analysis of existing tools. In Proceedings of the second workshop on simple and efficient natural language processing (pp. 11-21). 

 

Hershcovich, D., Webersinke, N., Kraus, M., Bingler, J. A., & Leippold, M. (2022). Towards climate awareness in NLP research. arXiv preprint arXiv:2205.05071.

 

Jin, Z., Chauhan, G., Tse, B., Sachan, M., & Mihalcea, R. (2021). How good is NLP? A sober look at NLP tasks through the lens of social impact. arXiv preprint arXiv:2106.02359.

 

Rillig, M. C., Ågerstrand, M., Bi, M., Gould, K. A., & Sauerland, U. (2023). Risks and benefits of large language models for the environment. Environmental Science & Technology, 57(9), 3464-3466. 

  

3. Ethical AI

One of the first decisions in research methodology is selecting the right tool for the job. There are so many NLP applications that this decision can be difficult. Further, all researchers must negotiate between three competing demands: being technically capable, meeting ethical research standards, and following disciplinary norms. Each of these can facilitate or inhibit the others.  

 

The use of NLP comes with well-known ethical concerns. Some of these concerns involve a clear intent to cause harm, such as enabling political censorship or training large language models on offensive textual content. Other concerns are more challenging because they can arise more easily and without any intent to cause harm. Examples of this sort of risk include paying for or using the services of technology companies with undisclosed unethical product-development practices, or making training data that contains personal information public. Whether the violation is deliberate or careless, a compounding problem is the highly overlapping and reusable nature of training data and programming code. Finding an ethical path through this terrain is as difficult for researchers as it is important. 
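One practical precaution against the careless release of personal information is to screen a corpus for obvious identifiers before sharing it. The Python sketch below is a minimal illustration using regular expressions; the patterns are assumptions that will miss many forms of personal data, so this complements rather than replaces a proper data-protection review.

```python
# A minimal sketch of redacting obvious personal identifiers (emails,
# phone-like numbers) from text before sharing a training corpus.
# The regex patterns are illustrative and will miss many PII forms;
# real data releases need dedicated review, not just pattern matching.
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact(text: str) -> str:
    """Replace matched identifiers with placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text

print(redact("Contact j.doe@example.org or +31 6 1234 5678."))
# Contact [EMAIL] or [PHONE].
```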

 

 

Literature: 

Jin, Z., Chauhan, G., Tse, B., Sachan, M., & Mihalcea, R. (2021). How good is NLP? A sober look at NLP tasks through the lens of social impact. arXiv preprint arXiv:2106.02359.

 

Ray, P. P. (2023). ChatGPT: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope. Internet of Things and Cyber-Physical Systems, 3, 121-154. 

 

Vydra, S., Poama, A., Giest, S., Ingrams, A., & Klievink, B. (2021). Big data ethics: A life cycle perspective. Erasmus Law Review, 14, 24. 

  

4. Open science

The Explainable AI (XAI) movement has produced several standards of AI transparency. XAI extends many existing principles of open science, such as public knowledge for publicly funded research, open data and reproducibility, to the domain of machine learning. However, there are new challenges that NLP researchers should consider in their efforts to become more open. Chief among these are algorithmic black boxes and third-party data. The former challenge arises from the complexity of NLP models, especially when neural network techniques are employed. Relatedly, the third-party problem arises because the companies that often run larger models are themselves not transparent about how their models work or what their data sources are.  
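Even when a model itself is a black box, researchers can be open about their own pipeline. Below is a minimal sketch of a reproducibility record to publish alongside results, using only the Python standard library; the field names and placeholder identifiers are illustrative assumptions, not an established standard.

```python
# Minimal reproducibility record to publish alongside NLP results.
# Field names and placeholder values are illustrative; adapt them
# to your project's conventions.
import json
import platform
import random
import sys

SEED = 42          # fix and report the random seed used in experiments
random.seed(SEED)

run_record = {
    "python_version": sys.version.split()[0],
    "platform": platform.platform(),
    "random_seed": SEED,
    # record model/data identifiers so others can re-run the pipeline
    "model": "example-model-name",      # placeholder
    "dataset": "example-corpus-v1",     # placeholder
}

print(json.dumps(run_record, indent=2))
```

Sharing a record like this, together with code and (where permitted) data, addresses the reproducibility principle even when parts of the toolchain remain opaque.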

 

 

Literature: 

Belz, A., Agarwal, S., Shimorina, A., & Reiter, E. (2021). A systematic review of reproducibility research in natural language processing. arXiv preprint arXiv:2103.07929.

 

Coro, G., Panichi, G., & Pagano, P. (2019, November). An Open Science System for Text Mining. In CLiC-it.

 

Lund, B. D., Wang, T., Mannuru, N. R., Nie, B., Shimray, S., & Wang, Z. (2023). ChatGPT and a new academic reality: Artificial Intelligence‐written research papers and the ethics of the large language models in scholarly publishing. Journal of the Association for Information Science and Technology, 74(5), 570-581. 

 

Van De Schoot, R., De Bruin, J., Schram, R., Zahedi, P., De Boer, J., Weijdema, F., … & Oberski, D. L. (2021). An open source machine learning framework for efficient and transparent systematic reviews. Nature machine intelligence, 3(2), 125-133. 

 

5. Communicating findings

Communicating research findings is an integral part of scholarly work. Researchers communicate with many different kinds of audiences: students, teachers, peers, practitioners, citizens in general, and so on. Two particular challenges require special attention here. Firstly, explaining the outputs of NLP analysis to non-experts requires very careful and skilful communication techniques. Secondly, the practitioner audiences for NLP performed on legal and policy texts form a distinctive set of professions, such as judges, barristers, advisors, public administrators and politicians. This means that communication approaches tailored to each practitioner type and their attention spans are important.  

 

 

Literature: 

Brewer, P. R., Bingaman, J., Paintsil, A., Wilson, D. C., & Dawson, W. (2022). Media use, interpersonal communication, and attitudes toward artificial intelligence. Science communication, 44(5), 559-592. 

Lempert, R. (1988). “Between Cup and Lip”: Social Science Influences On Law and Policy. Law & Policy, 10(2‐3), 167-200. 

Schäfer, M. S. (2023). The Notorious GPT: science communication in the age of artificial intelligence. JCOM: Journal of Science Communication, 22(02), 1-15. 

Scheufele, D. A. (2014). Science communication as political communication. Proceedings of the National Academy of Sciences, 111(supplement_4), 13585-13592. 

Last updated: 4-Dec-2024