1 00:00:00,000 --> 00:00:02,720 Hello, everyone, and welcome back 2 00:00:03,720 --> 00:00:05,880 to the second panel. 3 00:00:05,880 --> 00:00:08,800 And I'm now joined by 4 00:00:10,360 --> 00:00:14,080 Chaohai Ding from the University of Southampton, 5 00:00:14,760 --> 00:00:17,840 Lourdes Moreno from the Universidad 6 00:00:17,880 --> 00:00:20,880 Carlos III de Madrid in Spain, 7 00:00:21,120 --> 00:00:24,120 and Vikas Ashok from 8 00:00:24,320 --> 00:00:26,960 Old Dominion University in the US. 9 00:00:27,760 --> 00:00:31,120 So thank you all for your availability. 10 00:00:31,120 --> 00:00:33,840 It's great to have you here. 11 00:00:33,840 --> 00:00:37,800 And, as I said before, 12 00:00:37,800 --> 00:00:42,000 let's bring back the topic of natural language processing. We 13 00:00:43,160 --> 00:00:44,640 addressed it 14 00:00:44,640 --> 00:00:49,160 yesterday, but now from the perspective of 15 00:00:49,160 --> 00:00:54,240 how it can be used to enhance accessible communication on the Web. 16 00:00:54,560 --> 00:00:57,680 And here, I guess 17 00:00:58,160 --> 00:01:01,120 once again, similar to what I've done 18 00:01:01,800 --> 00:01:04,720 an hour ago in the first panel: 19 00:01:04,720 --> 00:01:07,920 you've been working on different aspects of this 20 00:01:08,160 --> 00:01:11,400 large domain of accessible communication, 21 00:01:11,840 --> 00:01:13,440 and you've 22 00:01:13,720 --> 00:01:16,520 pursued advances in machine 23 00:01:16,520 --> 00:01:20,000 translation, in sign language, in 24 00:01:20,400 --> 00:01:24,400 AAC. So, from your perspective 25 00:01:24,400 --> 00:01:27,120 and from the focus of your work, 26 00:01:27,640 --> 00:01:32,160 what are the current challenges that you've been facing 27 00:01:32,160 --> 00:01:34,200 and that are preventing 28 00:01:35,920 --> 00:01:38,080 the next breakthrough, I guess.
29 00:01:38,080 --> 00:01:43,560 And I would also like to ask you, for your first intervention, to do 30 00:01:44,080 --> 00:01:47,680 a brief introduction of yourself and of what you've been doing. 31 00:01:48,240 --> 00:01:54,080 Okay, so I can start with you, Chaohai. 32 00:01:54,120 --> 00:01:55,920 Thank you for having me today. 33 00:01:55,920 --> 00:02:00,720 My name is Chaohai Ding and I'm a senior research fellow at the University of Southampton. 34 00:02:01,160 --> 00:02:05,560 My research interest is AI and inclusion, 35 00:02:06,200 --> 00:02:09,440 which includes using data science and AI 36 00:02:10,400 --> 00:02:12,000 techniques to enhance 37 00:02:12,000 --> 00:02:15,440 accessible learning, traveling and 38 00:02:16,520 --> 00:02:17,920 communication. 39 00:02:17,920 --> 00:02:20,600 So, yes, 40 00:02:21,120 --> 00:02:22,400 NLP has been widely 41 00:02:22,400 --> 00:02:26,160 used in our research to support accessible communication. 42 00:02:27,200 --> 00:02:31,480 Currently, we are working on several projects focused on AAC. 43 00:02:31,480 --> 00:02:33,520 So, for example, 44 00:02:33,600 --> 00:02:36,280 we applied ConceptNet, 45 00:02:36,280 --> 00:02:39,960 a knowledge graph, 46 00:02:40,080 --> 00:02:42,600 to interlink AAC 47 00:02:43,920 --> 00:02:46,000 symbols from different symbol sets. 48 00:02:47,200 --> 00:02:49,120 This can be used 49 00:02:49,120 --> 00:02:51,800 for symbol-to-symbol translation. 50 00:02:52,640 --> 00:02:57,840 And we also developed an NLP model to translate 51 00:02:57,840 --> 00:03:02,760 an AAC symbol sequence into a spoken text sequence.
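The symbol-to-symbol translation described here can be sketched in miniature: two symbol sets are linked through shared concept labels, standing in for a knowledge graph such as ConceptNet. The symbol IDs and concept mappings below are invented for illustration; ARASAAC and Mulberry are real AAC symbol sets, but a real system would resolve concepts through the graph, not hand-written dictionaries.

```python
# Toy sketch of interlinking AAC symbols from different symbol sets
# through shared concept labels. All IDs and mappings are invented.

# Each symbol set maps its own symbol IDs to concept labels.
ARASAAC = {"ar_2462": "drink", "ar_7114": "house"}
MULBERRY = {"mb_0913": "drink", "mb_1180": "school"}

def build_concept_index(symbol_set):
    """Invert a symbol set: concept label -> symbol ID."""
    return {concept: sid for sid, concept in symbol_set.items()}

def translate_symbol(symbol_id, source_set, target_set):
    """Translate a symbol ID between sets via the shared concept,
    returning None when the target set has no symbol for it."""
    concept = source_set.get(symbol_id)
    if concept is None:
        return None
    return build_concept_index(target_set).get(concept)
```

For example, `translate_symbol("ar_2462", ARASAAC, MULBERRY)` maps the hypothetical "drink" symbol from one set to the other; the `None` case is exactly the interoperability gap discussed below, where one set simply has no symbol for a concept.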
52 00:03:03,400 --> 00:03:07,200 So those are the two projects we're working on currently, 53 00:03:07,920 --> 00:03:12,000 and we're also working on an accessible e-learning project 54 00:03:12,360 --> 00:03:14,880 where we applied machine translation 55 00:03:16,040 --> 00:03:18,480 to provide transcripts 56 00:03:18,480 --> 00:03:20,600 from English to 57 00:03:20,600 --> 00:03:23,400 other languages for our international users. 58 00:03:24,000 --> 00:03:27,680 So that's another scenario where we are working with machine 59 00:03:27,680 --> 00:03:30,800 translation for accessible communication. 60 00:03:30,800 --> 00:03:35,360 Now, there are a few challenges we have identified in our research. 61 00:03:36,240 --> 00:03:38,640 The first one is always the data: 62 00:03:38,640 --> 00:03:42,640 data availability and data accessibility. 63 00:03:43,160 --> 00:03:48,360 As we know, an NLP model is normally trained on a large amount of data. 64 00:03:49,360 --> 00:03:52,200 Especially for AAC, 65 00:03:53,720 --> 00:03:57,200 one of the biggest challenges 66 00:03:57,200 --> 00:04:00,480 is that we lack 67 00:04:00,480 --> 00:04:03,360 data, like 68 00:04:04,240 --> 00:04:07,720 user data, AAC data, and also 69 00:04:08,360 --> 00:04:12,120 data on how users interact with AAC. 70 00:04:12,520 --> 00:04:14,600 And 71 00:04:16,080 --> 00:04:17,600 also, 72 00:04:17,600 --> 00:04:20,400 we have several different 73 00:04:21,000 --> 00:04:25,800 AAC symbol sets used by different individuals, 74 00:04:26,040 --> 00:04:30,600 which makes it very difficult to develop NLP models as well, 75 00:04:30,840 --> 00:04:33,760 because the AAC symbols 76 00:04:33,760 --> 00:04:38,040 are separate for each symbol set. So 77 00:04:38,080 --> 00:04:40,720 another challenge is the lack of data 78 00:04:41,480 --> 00:04:45,080 interoperability across AAC symbol sets.
79 00:04:46,320 --> 00:04:51,000 The third challenge we identified is inclusion, 80 00:04:51,240 --> 00:04:54,960 because we are working on AAC symbol sets 81 00:04:55,720 --> 00:04:59,000 from Arabic, English and Chinese. 82 00:04:59,280 --> 00:05:05,040 There are cultural and social differences in AAC symbols, so it is 83 00:05:06,760 --> 00:05:08,240 important to 84 00:05:08,240 --> 00:05:13,880 consider the needs of different end user groups regarding cultural and social 85 00:05:14,160 --> 00:05:18,240 factors, and for them to be involved in the development 86 00:05:18,360 --> 00:05:20,960 of NLP models for AAC. 87 00:05:22,440 --> 00:05:24,160 The next one is 88 00:05:24,400 --> 00:05:28,120 data privacy and safety, and this has been 89 00:05:29,760 --> 00:05:33,360 identified in our web application for 90 00:05:33,360 --> 00:05:38,760 translating AAC symbols to spoken text. 91 00:05:38,760 --> 00:05:42,600 If we want a more accurate 92 00:05:42,600 --> 00:05:44,960 or more specific, 93 00:05:45,960 --> 00:05:47,400 personalized 94 00:05:47,960 --> 00:05:48,840 application, 95 00:05:48,840 --> 00:05:51,000 we need the user's information. 96 00:05:51,000 --> 00:05:54,000 So the challenge is 97 00:05:54,000 --> 00:05:57,160 how we can store this 98 00:05:57,520 --> 00:05:59,040 personal information, 99 00:05:59,040 --> 00:06:03,000 how to prevent data misuse and breaches, 100 00:06:03,400 --> 00:06:06,200 and how to make the tradeoff 101 00:06:06,200 --> 00:06:08,760 between the user's 102 00:06:09,240 --> 00:06:12,000 information and the model's performance. 103 00:06:13,720 --> 00:06:17,040 And the last one 104 00:06:17,040 --> 00:06:20,360 is always the accessible user interface.
105 00:06:20,360 --> 00:06:22,480 That is, 106 00:06:22,480 --> 00:06:26,040 how to make these AI-powered, NLP-powered 107 00:06:26,280 --> 00:06:30,840 tools accessible for end users. 108 00:06:32,680 --> 00:06:34,040 And also, 109 00:06:34,040 --> 00:06:36,760 there are more generic issues in 110 00:06:37,760 --> 00:06:42,400 AI, like accountability and explainability. So yes, I think 111 00:06:42,400 --> 00:06:46,200 those are the challenges we have identified in our research. 112 00:06:47,760 --> 00:06:48,480 Thank you. 113 00:06:48,720 --> 00:06:49,000 Thank you, Chaohai. 114 00:06:49,000 --> 00:06:54,920 That is a great summary of definitely some of the major challenges 115 00:06:54,920 --> 00:06:59,000 that are spread across the entire domain. 116 00:06:59,400 --> 00:07:01,200 Definitely. Thank you so much. 117 00:07:01,200 --> 00:07:04,280 Lourdes, do you want to go next? 118 00:07:06,400 --> 00:07:11,240 You're muted. 119 00:07:11,240 --> 00:07:12,240 Thank you. 120 00:07:12,840 --> 00:07:14,520 Thanks for the invitation. 121 00:07:14,520 --> 00:07:16,920 Good afternoon everyone. I'm Lourdes 122 00:07:16,920 --> 00:07:21,400 Moreno. I work as an associate professor in the computer 123 00:07:21,400 --> 00:07:24,880 science department at the Universidad Carlos III 124 00:07:24,960 --> 00:07:27,880 de Madrid, Spain. 125 00:07:27,880 --> 00:07:29,160 I am an accessibility 126 00:07:29,160 --> 00:07:33,160 expert. I have been working in the area of technology 127 00:07:33,160 --> 00:07:38,080 for disability for 20 years. 128 00:07:38,600 --> 00:07:41,200 I have previously worked on 129 00:07:41,200 --> 00:07:44,120 sensory disability, but currently 130 00:07:44,640 --> 00:07:47,080 I work on cognitive accessibility.
131 00:07:47,640 --> 00:07:50,560 In my research work I combine 132 00:07:50,880 --> 00:07:53,920 methods from the human-computer interaction 133 00:07:54,320 --> 00:07:57,720 and natural language processing areas 134 00:07:58,280 --> 00:08:00,480 to obtain accessible solutions 135 00:08:00,960 --> 00:08:03,560 from the point of view of readability 136 00:08:03,560 --> 00:08:07,120 and understandability of the language in user interfaces. 137 00:08:08,080 --> 00:08:09,080 As for 138 00:08:09,080 --> 00:08:15,320 the question: currently, natural language research is being driven 139 00:08:15,440 --> 00:08:18,360 by large language models. In recent years, 140 00:08:18,680 --> 00:08:20,960 there have been many advances 141 00:08:22,080 --> 00:08:24,320 due to increasing resources, 142 00:08:24,720 --> 00:08:28,680 such as large datasets and cloud platforms 143 00:08:29,240 --> 00:08:32,320 that allow the training of large models. 144 00:08:33,120 --> 00:08:36,480 But the most crucial factor is the use 145 00:08:36,480 --> 00:08:41,200 of Transformer technology and the use of transfer learning. 146 00:08:41,200 --> 00:08:47,640 These are methods based on deep learning to create language models 147 00:08:48,160 --> 00:08:50,280 based on neural networks. 148 00:08:51,440 --> 00:08:53,400 They are universal models 149 00:08:53,400 --> 00:08:57,160 that support different natural language 150 00:08:57,160 --> 00:09:03,240 processing tasks, such as question answering, translation, 151 00:09:03,240 --> 00:09:06,920 summarization, speech recognition and more. 152 00:09:07,840 --> 00:09:13,840 The most extensively used models are GPT 153 00:09:13,840 --> 00:09:17,280 from OpenAI and BERT from Google.
154 00:09:17,760 --> 00:09:22,680 But new and bigger models continually appear 155 00:09:23,360 --> 00:09:25,720 that outperform previous ones, 156 00:09:25,720 --> 00:09:31,200 because their performance continues to scale 157 00:09:31,400 --> 00:09:37,440 as more parameters and more data are added to the models. 158 00:09:38,640 --> 00:09:42,920 However, despite these great advances, 159 00:09:43,440 --> 00:09:46,160 there are issues 160 00:09:46,160 --> 00:09:51,240 and challenges to address within the accessibility scope. 161 00:09:51,240 --> 00:09:55,280 One of them is bias. 162 00:09:55,280 --> 00:09:59,000 Language models have different types of bias, 163 00:09:59,440 --> 00:10:03,240 such as gender, race and disability bias, 164 00:10:03,840 --> 00:10:07,680 but while gender and race 165 00:10:07,760 --> 00:10:11,040 biases are highly analyzed, 166 00:10:11,640 --> 00:10:14,640 that is not the case 167 00:10:14,720 --> 00:10:16,840 with disability biases. 168 00:10:17,960 --> 00:10:22,240 They have been relatively underexplored. 169 00:10:23,040 --> 00:10:27,600 There are studies on this matter: for example, 170 00:10:27,600 --> 00:10:32,880 in one work on sentiment analysis of text, 171 00:10:33,400 --> 00:10:37,200 the terms related to disability 172 00:10:37,200 --> 00:10:40,320 have a negative value, 173 00:10:40,320 --> 00:10:45,120 and in another work, a model used to moderate conversations 174 00:10:45,120 --> 00:10:48,840 classified texts mentioning disability 175 00:10:49,080 --> 00:10:52,360 as more toxic. 176 00:10:53,200 --> 00:10:57,600 That is, algorithms are trained to produce results 177 00:10:57,600 --> 00:11:00,040 that can be offensive and cause 178 00:11:00,040 --> 00:11:03,000 disadvantage to individuals with disabilities. 179 00:11:04,240 --> 00:11:07,320 So, investigation is necessary 180 00:11:07,320 --> 00:11:13,000 to study models in depth to reduce biases.
181 00:11:13,000 --> 00:11:16,640 We cannot just take these language models and directly 182 00:11:16,640 --> 00:11:19,520 use their outcomes. 183 00:11:20,240 --> 00:11:22,520 Another problem with these models 184 00:11:23,520 --> 00:11:26,880 is that there aren't many datasets 185 00:11:27,360 --> 00:11:30,000 related to the accessibility area. 186 00:11:30,000 --> 00:11:34,000 For instance, there are few labeled corpora 187 00:11:34,400 --> 00:11:36,760 to be used in training simplification 188 00:11:37,200 --> 00:11:41,280 algorithms, for lexical or syntactic simplification, 189 00:11:41,880 --> 00:11:43,600 in natural language processing. 190 00:11:43,600 --> 00:11:50,720 I work in cognitive accessibility in Spanish, simplifying text 191 00:11:50,720 --> 00:11:57,960 into plain language and easy-to-read language. To carry out this work, 192 00:11:57,960 --> 00:12:02,680 we had to create a corpus with an expert in easy reading, 193 00:12:03,000 --> 00:12:05,680 with the participation of older people 194 00:12:06,360 --> 00:12:10,680 and of people with intellectual disabilities, 195 00:12:11,480 --> 00:12:14,480 because the current corpora had been created 196 00:12:14,480 --> 00:12:19,520 by non-experts in disability and non-experts in plain language, 197 00:12:19,960 --> 00:12:24,000 and they haven't taken people with disabilities into account. 198 00:12:25,560 --> 00:12:30,360 Also, efforts devoted to solving this 199 00:12:30,360 --> 00:12:34,040 scarcity of resources are required 200 00:12:34,040 --> 00:12:37,200 for languages with few resources. 201 00:12:37,680 --> 00:12:42,600 English is the most developed language, with many natural language processing resources, 202 00:12:42,600 --> 00:12:46,200 but others, such as Spanish, have hardly 203 00:12:46,200 --> 00:12:49,280 any resources. We need systems 204 00:12:49,400 --> 00:12:54,720 trained for the English language to work for Spanish as well.
205 00:12:54,720 --> 00:12:58,680 And finally, with the proliferation of GPT models, 206 00:12:58,680 --> 00:13:03,160 with applications such as ChatGPT, 207 00:13:03,480 --> 00:13:06,320 another problem to address 208 00:13:06,360 --> 00:13:09,760 is the regulation of ethical aspects of 209 00:13:10,800 --> 00:13:14,920 artificial intelligence. 210 00:13:15,840 --> 00:13:17,240 Okay, thank you so much, 211 00:13:17,240 --> 00:13:22,280 Lourdes, definitely some very relevant challenges in there. 212 00:13:23,600 --> 00:13:26,160 Vikas, I'll end this first round with you. 213 00:13:27,440 --> 00:13:30,360 Thank you, Carlos. I'm Vikas Ashok 214 00:13:30,440 --> 00:13:34,600 from Old Dominion University in Virginia, United States. 215 00:13:35,160 --> 00:13:37,680 So I have been researching 216 00:13:37,680 --> 00:13:41,480 in the area of accessible computing for about ten years now. 217 00:13:42,000 --> 00:13:46,920 And my specialty focus area is people with visual disabilities. 218 00:13:47,320 --> 00:13:50,840 I have mostly concentrated on their accessibility 219 00:13:50,840 --> 00:13:54,000 as well as usability needs when it comes to interacting 220 00:13:54,000 --> 00:13:56,280 with computer applications. 221 00:13:57,080 --> 00:14:01,560 So, on the topic at hand, which is accessible communication: 222 00:14:01,600 --> 00:14:04,560 one of the projects that I'm currently looking at 223 00:14:04,560 --> 00:14:10,080 is the understandability of social media content for people 224 00:14:10,600 --> 00:14:13,920 who listen to content, such as, you know, people who are blind. 225 00:14:14,640 --> 00:14:18,000 So listening to social media 226 00:14:18,000 --> 00:14:21,640 text is not the same as looking at it.
227 00:14:22,120 --> 00:14:25,320 So, even though the social media text 228 00:14:25,320 --> 00:14:31,160 is accessible, it's not necessarily understandable because of the presence 229 00:14:31,160 --> 00:14:34,880 of a lot of nonstandard language 230 00:14:34,880 --> 00:14:37,880 content on social media such as Twitter. 231 00:14:37,920 --> 00:14:39,600 People create their own words. 232 00:14:39,600 --> 00:14:44,040 They're very inventive there, so they hardly follow any grammar. 233 00:14:44,880 --> 00:14:48,920 So text-to-speech systems, such as those used in screen 234 00:14:48,920 --> 00:14:52,920 readers, cannot necessarily pronounce 235 00:14:52,960 --> 00:14:56,480 these out-of-vocabulary words in the right way, 236 00:14:56,880 --> 00:14:59,920 because most of these words, even though they're in text form, 237 00:15:00,360 --> 00:15:05,200 are mostly intended for visual consumption, such as some kind of exaggeration 238 00:15:05,560 --> 00:15:10,680 where the letters are duplicated just for some kind of additional effect. 239 00:15:11,000 --> 00:15:14,800 Sometimes even emotions are attached to the text itself without any, 240 00:15:15,160 --> 00:15:17,480 you know, emoticons or anything else. 241 00:15:17,960 --> 00:15:22,640 And sometimes they try to phonetically match a word, 242 00:15:22,640 --> 00:15:27,080 use a different spelling for it just for fun purposes. 243 00:15:27,240 --> 00:15:33,320 So this kind of playful communication has increased tremendously on social media, 244 00:15:33,320 --> 00:15:39,000 and people are depending on social media to get news, even; 245 00:15:39,400 --> 00:15:42,760 you know, when some kind of disaster or something happens anywhere, 246 00:15:42,760 --> 00:15:46,680 people first flock to social media to get it.
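The exaggerated spellings Vikas describes, where letters are duplicated for visual effect, suggest one concrete normalization step before text-to-speech. A minimal sketch, assuming a later dictionary or language-model pass would finish the normalization:

```python
import re

# Collapse letter runs of three or more ("coooool") down to two,
# since English rarely repeats a letter more than twice. This is
# only the first step of a real pipeline, which would also need
# slang lexicons and phonetic matching for invented spellings.

def collapse_elongations(text):
    """Reduce any character repeated 3+ times to a double."""
    return re.sub(r"(.)\1{2,}", r"\1\1", text)
```

For instance, `collapse_elongations("soooo coooool!!!!")` yields `"soo cool!!"`: close enough for a dictionary lookup to recover "so cool", while legitimate doubles like the "oo" in "cool" are left alone.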
So 247 00:15:47,680 --> 00:15:48,880 people who listen 248 00:15:48,880 --> 00:15:53,760 to content should also be able to easily understand it, so I'm focusing on that area: 249 00:15:53,760 --> 00:15:57,480 how to use NLP to make this possible. 250 00:15:58,360 --> 00:16:02,360 This is not exactly a question 251 00:16:02,440 --> 00:16:08,400 of accessibility in the conventional sense; it's more like accessibility 252 00:16:08,400 --> 00:16:12,960 in terms of being able to understand already accessible content. 253 00:16:12,960 --> 00:16:14,920 So that's one of the things. 254 00:16:14,920 --> 00:16:18,760 The other thing that we're looking at, which is related to this panel, 255 00:16:18,800 --> 00:16:22,600 is related to the bias, the disability 256 00:16:22,600 --> 00:16:27,720 bias of natural language models, especially those large language models. 257 00:16:28,480 --> 00:16:33,960 Unfortunately, these models are reflective of the data they're trained on, 258 00:16:34,600 --> 00:16:40,720 because in most of the data, words that are used to describe people 259 00:16:40,720 --> 00:16:45,080 with disabilities somehow end up having a negative connotation. 260 00:16:45,360 --> 00:16:47,680 They're used in negative contexts. 261 00:16:48,040 --> 00:16:51,760 Nobody's telling these models to learn it that way, 262 00:16:52,320 --> 00:16:56,640 except that the documents, the text corpus that these models 263 00:16:56,640 --> 00:17:01,240 are looking at, inherently put these words, which are, 264 00:17:01,680 --> 00:17:06,200 you know, many times not offensive, into the negative category. 265 00:17:07,440 --> 00:17:10,080 So I'm looking at how we can counter 266 00:17:10,080 --> 00:17:14,160 this. One example is toxicity detection 267 00:17:14,320 --> 00:17:19,280 in discussion forums; online discussion forums are very popular.
268 00:17:19,280 --> 00:17:22,280 People go there, sometimes anonymously, 269 00:17:22,280 --> 00:17:24,720 post content, and interact with each other. 270 00:17:25,320 --> 00:17:28,440 And, you know, some of the posts get flagged 271 00:17:28,560 --> 00:17:32,360 as toxic or get filtered out 272 00:17:32,680 --> 00:17:35,800 even if they're not toxic, 273 00:17:36,160 --> 00:17:41,520 because of the use of certain words to describe disabilities or something. 274 00:17:41,520 --> 00:17:43,760 So we want to avoid that. 275 00:17:43,760 --> 00:17:47,920 So how can we use NLP to not do that? 276 00:17:48,520 --> 00:17:52,920 So these two projects are pretty much what's closely related to the panel, 277 00:17:53,400 --> 00:17:55,160 so I'll stick to these 278 00:17:55,160 --> 00:17:58,840 for this session. 279 00:17:58,840 --> 00:17:59,240 Thank you, Vikas. 280 00:18:00,920 --> 00:18:05,720 I'll follow up on what you've mentioned, and Lourdes 281 00:18:05,800 --> 00:18:08,960 has also previously highlighted 282 00:18:08,960 --> 00:18:14,320 the disability bias, and I'm wondering 283 00:18:14,320 --> 00:18:18,240 if you have any ideas and suggestions on how 284 00:18:19,560 --> 00:18:22,320 NLP tools can 285 00:18:22,920 --> 00:18:24,960 address such issues. 286 00:18:24,960 --> 00:18:27,960 I'm thinking, for instance, of text summarization tools, 287 00:18:27,960 --> 00:18:30,680 but also other NLP tools. 288 00:18:30,680 --> 00:18:33,840 How can they help us address issues of disability bias, 289 00:18:33,840 --> 00:18:37,920 but also, how can they explore other aspects 290 00:18:37,920 --> 00:18:41,280 like accountability or personalization? 291 00:18:41,280 --> 00:18:44,320 In the case of 292 00:18:44,320 --> 00:18:45,480 text summaries, 293 00:18:45,480 --> 00:18:49,200 how can I personalize a summary 294 00:18:49,200 --> 00:18:53,080 for specific audiences, for the needs of specific people?
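The over-flagging of non-toxic posts that Vikas describes can at least be measured with a counterfactual probe: score a post, score the same post with the disability term swapped for a neutral one, and flag the gap that the term alone causes. The scorer below is a toy lexicon stand-in for a real toxicity classifier, and its word weights are invented; only the probing pattern is the point.

```python
# Counterfactual probe for disability bias in a toxicity scorer.
# TOY_TOXICITY_LEXICON stands in for a learned model and mimics the
# problem discussed above: neutral disability words carry weight.
TOY_TOXICITY_LEXICON = {"stupid": 0.9, "blind": 0.6, "deaf": 0.6}

def toy_toxicity_score(text):
    """Stand-in classifier: mean lexicon weight over all tokens."""
    tokens = text.lower().split()
    return sum(TOY_TOXICITY_LEXICON.get(t, 0.0) for t in tokens) / len(tokens)

def counterfactual_gap(sentence, term, neutral="sighted"):
    """Toxicity-score gap attributable to the disability term alone."""
    swapped = sentence.replace(term, neutral)
    return toy_toxicity_score(sentence) - toy_toxicity_score(swapped)
```

A clearly positive gap on a benign sentence like "I am a blind programmer" is evidence the word, not the content, drives the flag; aggregated over a test set, such gaps can quantify the bias before any mitigation is attempted.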
295 00:18:54,200 --> 00:18:56,640 I'll start with you now, Lourdes. 296 00:18:56,640 --> 00:18:57,600 OK. 297 00:18:58,680 --> 00:19:00,080 Text summarization is 298 00:19:00,080 --> 00:19:02,680 a natural language processing task. 299 00:19:02,680 --> 00:19:04,920 It is a great resource, 300 00:19:06,720 --> 00:19:08,360 because it improves cognitive 301 00:19:08,360 --> 00:19:14,080 accessibility, helping people with disabilities to process long 302 00:19:14,080 --> 00:19:16,880 and dense text. 303 00:19:17,400 --> 00:19:20,280 Also, in the Web Content Accessibility Guidelines, 304 00:19:20,760 --> 00:19:25,200 under success criterion 3.1.5 Reading Level, 305 00:19:25,200 --> 00:19:29,720 a readable summary is a 306 00:19:30,160 --> 00:19:33,000 recommended resource. 307 00:19:34,640 --> 00:19:37,920 But this task has challenges, 308 00:19:38,320 --> 00:19:42,680 such as bias, disability biases, 309 00:19:43,040 --> 00:19:46,000 and whether the summaries generated 310 00:19:46,000 --> 00:19:49,320 are understandable 311 00:19:50,000 --> 00:19:54,520 for people with disabilities. 312 00:19:55,080 --> 00:20:00,560 Therefore, some aspects must be taken into account: it is necessary 313 00:20:00,560 --> 00:20:06,320 to approach this task with summarization of the extractive type, 314 00:20:07,520 --> 00:20:12,240 where the extracted sentences can be modified with paraphrasing 315 00:20:12,320 --> 00:20:17,400 resources that help the understandability and readability of the text. 316 00:20:18,520 --> 00:20:22,400 To summarize text, different inputs are required: 317 00:20:22,960 --> 00:20:28,440 not only knowledge about the sequences of words or other 318 00:20:28,800 --> 00:20:32,600 features of sentences; 319 00:20:33,000 --> 00:20:37,320 knowledge about the target audience is also important.
320 00:20:37,440 --> 00:20:40,480 Different types of users 321 00:20:40,480 --> 00:20:44,760 require different types of personalization of summaries. 322 00:20:46,400 --> 00:20:50,520 Also, 323 00:20:50,760 --> 00:20:53,320 I think that 324 00:20:53,760 --> 00:20:56,280 it would be advisable to include a 325 00:20:56,480 --> 00:21:00,720 readability metric in the summary generation process 326 00:21:01,240 --> 00:21:06,560 to ensure that the resulting summary is minimally readable. 327 00:21:08,400 --> 00:21:09,000 For instance, 328 00:21:09,000 --> 00:21:10,680 if 329 00:21:10,680 --> 00:21:15,880 we are in the context of a system that provides summaries of public 330 00:21:15,960 --> 00:21:18,560 administration information for older people, 331 00:21:19,000 --> 00:21:22,880 it's necessary to take into account that the summary 332 00:21:23,160 --> 00:21:26,600 must be in plain language; 333 00:21:26,760 --> 00:21:30,520 therefore, in addition to extracting the relevant sentences 334 00:21:30,520 --> 00:21:35,760 and paraphrasing, it will be necessary to include knowledge about guidelines 335 00:21:35,760 --> 00:21:39,720 for plain language to make the text easier to read. 336 00:21:40,560 --> 00:21:45,480 And finally, the corpora used 337 00:21:46,240 --> 00:21:50,280 to train natural language processing systems shall be tested 338 00:21:50,280 --> 00:21:54,880 with users in order to attain useful solutions.
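The readability gate Lourdes suggests could be wired into summary generation roughly as in this sketch, which scores candidates with the Flesch reading ease formula and a naive vowel-group syllable counter. The threshold of 60 is an arbitrary illustration, and for Spanish a readability index adapted to the language (such as Fernández Huerta's) would replace the English formula.

```python
import re

def count_syllables(word):
    """Rough heuristic: approximate syllables as runs of vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text):
    """Flesch reading ease: higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))

def readable_enough(summary, threshold=60.0):
    """Gate a candidate summary on a minimum readability score."""
    return flesch_reading_ease(summary) >= threshold
```

A generator could rerank or reject candidate summaries failing `readable_enough`, which is one concrete way to "ensure the resulting summary is minimally readable" before it ever reaches the user.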
339 00:21:55,520 --> 00:21:59,520 Only then will it be possible to obtain understandable summaries 340 00:21:59,520 --> 00:22:03,440 for the whole of society, including the elderly. 341 00:22:03,720 --> 00:22:07,920 And with respect to accountability, 342 00:22:09,120 --> 00:22:13,600 as with every artificial intelligence algorithm, 343 00:22:13,680 --> 00:22:16,200 these must be explainable, 344 00:22:18,000 --> 00:22:21,120 so it is necessary to answer 345 00:22:21,680 --> 00:22:25,160 questions such as how the processing 346 00:22:25,160 --> 00:22:29,440 is actually performed, the limitations of the dataset 347 00:22:29,880 --> 00:22:34,960 used to train and test the algorithms, and the outcomes of the model. 348 00:22:36,040 --> 00:22:37,480 Therefore, good 349 00:22:37,480 --> 00:22:41,640 data management and machine learning model 350 00:22:41,640 --> 00:22:42,640 training practices 351 00:22:42,640 --> 00:22:46,320 shall be promoted to ensure quality results. 352 00:22:49,080 --> 00:22:51,000 And nothing else. 353 00:22:51,000 --> 00:22:52,880 Thank you, Lourdes. 354 00:22:53,800 --> 00:22:58,920 Vikas, do you want to... even though, from what I understood, 355 00:22:58,920 --> 00:23:03,480 you don't work directly with text summarization, still other 356 00:23:03,480 --> 00:23:07,720 aspects of disability bias, accountability, 357 00:23:07,720 --> 00:23:11,040 and personalization impact what you're doing. 358 00:23:12,080 --> 00:23:16,760 Yeah, I mean, I use a lot of text summarization, so I can add to it. 359 00:23:16,760 --> 00:23:21,520 So, to add to what Lourdes said, 360 00:23:22,440 --> 00:23:26,560 simplification is also as important as summarization, 361 00:23:26,920 --> 00:23:32,240 because sometimes it's not just summarizing or shortening the content 362 00:23:32,400 --> 00:23:35,840 to be consumed, but also making it understandable, like I said.
363 00:23:36,280 --> 00:23:40,040 So that means that certain complex sentence structures 364 00:23:40,040 --> 00:23:41,680 and some exotic words, 365 00:23:41,680 --> 00:23:44,880 we need to replace them with equivalent and easier 366 00:23:44,880 --> 00:23:48,080 to understand, more frequently used words. 367 00:23:48,240 --> 00:23:54,080 So there is some work that has gone into text simplification. 368 00:23:54,120 --> 00:23:57,840 It's like summarization in a special case: 369 00:23:58,160 --> 00:24:01,640 it goes from text to text within the same language. 370 00:24:01,840 --> 00:24:05,800 So the input is text in the same language as the output text, 371 00:24:06,000 --> 00:24:09,720 except that the output text is more readable, more understandable. 372 00:24:10,320 --> 00:24:12,360 So that is extremely important. 373 00:24:12,360 --> 00:24:13,440 The other thing is 374 00:24:14,400 --> 00:24:15,240 summarization. 375 00:24:15,240 --> 00:24:19,240 Most systems tend to rely on extractive summarization, wherein 376 00:24:19,560 --> 00:24:24,080 they just pick certain sentences from the original piece of text 377 00:24:24,760 --> 00:24:26,840 so that they don't have to worry about 378 00:24:27,560 --> 00:24:30,800 grammatical correctness and proper sentence structures, 379 00:24:31,320 --> 00:24:34,400 because they rely on the humans 380 00:24:34,400 --> 00:24:37,320 who have written the text in order to generate the summaries. 381 00:24:37,760 --> 00:24:40,080 So I can speak about 382 00:24:40,440 --> 00:24:44,840 how summarization needs to be personalized in a certain way for certain groups, 383 00:24:44,960 --> 00:24:47,560 especially for people with visual disabilities.
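The extractive approach described here, picking existing sentences rather than generating new ones, can be sketched with simple word-frequency scoring. Real systems add stopword handling, sentence-position features, and redundancy control; this is only the bare mechanism.

```python
import re
from collections import Counter

def extractive_summary(text, k=2):
    """Keep the k sentences whose words are most frequent in the
    whole document, preserving their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freqs = Counter(w.lower() for w in re.findall(r"\w+", text))

    def score(sentence):
        words = re.findall(r"\w+", sentence.lower())
        return sum(freqs[w] for w in words) / max(1, len(words))

    top = sorted(sentences, key=score, reverse=True)[:k]
    return [s for s in sentences if s in top]
```

Because every output sentence is copied verbatim from the input, grammaticality is inherited from the human author, which is exactly the trade-off noted above: the method is safe but cannot rephrase, so it cannot by itself produce the non-visual rewordings discussed next.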
384 00:24:47,920 --> 00:24:51,720 So what I have noticed in some of my studies is that 385 00:24:52,640 --> 00:24:56,480 even though they can hear the text, they don't necessarily understand it, 386 00:24:56,480 --> 00:24:59,840 because the writing is sort of visual, in the sense that 387 00:24:59,840 --> 00:25:03,040 it needs you to be visually imaginative. 388 00:25:03,600 --> 00:25:06,560 So what is the non-visual alternative 389 00:25:07,240 --> 00:25:09,480 for such kind of text? 390 00:25:09,840 --> 00:25:15,080 How do you summarize text that includes a lot of visual elements? 391 00:25:15,280 --> 00:25:18,080 How do you convert it into 392 00:25:18,840 --> 00:25:21,520 equivalent non-visual explanations? 393 00:25:21,640 --> 00:25:26,000 This necessarily goes beyond extractive summarization. 394 00:25:26,000 --> 00:25:27,760 You cannot just pick and choose; 395 00:25:27,760 --> 00:25:31,760 you need to replace the wordings in the sentence 396 00:25:31,760 --> 00:25:36,120 with other wordings that they can understand. And some of the text, 397 00:25:36,120 --> 00:25:37,960 you know, these days, especially 398 00:25:37,960 --> 00:25:42,240 news articles and all, they don't come purely as text. 399 00:25:42,240 --> 00:25:44,920 They're sort of multi-modal, in the sense that 400 00:25:45,240 --> 00:25:48,000 there are pictures, there are GIFs, everything. 401 00:25:48,000 --> 00:25:51,200 And the text sort of refers to these pictures, 402 00:25:52,120 --> 00:25:56,840 so this is another problem, because then it becomes highly visual. 403 00:25:56,840 --> 00:26:00,240 So you have to take some of the visual elements 404 00:26:00,240 --> 00:26:03,920 of the picture, probably through computer vision techniques or something, 405 00:26:03,920 --> 00:26:08,760 and then inject them into the text in order to make it more self-sufficient 406 00:26:08,760 --> 00:26:12,800 and understandable for people who cannot see the images.
407 00:26:13,760 --> 00:26:17,240 So that's my take on it. 408 00:26:17,280 --> 00:26:23,240 Yeah, that's a very good point about the multimedia information 409 00:26:23,240 --> 00:26:26,320 and how we summarize everything into text. 410 00:26:26,320 --> 00:26:28,920 Yeah, that's a great point. 411 00:26:28,920 --> 00:26:31,560 Chaohai, your take on this? 412 00:26:31,560 --> 00:26:33,240 Oh yes. Yeah. 413 00:26:33,240 --> 00:26:36,880 We don't have much experience in text summarization. 414 00:26:36,880 --> 00:26:42,640 Most of our research is focused on AAC symbol interlinking 415 00:26:42,680 --> 00:26:46,600 and AAC generation, but we do have one project that 416 00:26:47,040 --> 00:26:49,200 involved some text summarization. 417 00:26:50,040 --> 00:26:53,240 We constructed a knowledge graph 418 00:26:53,640 --> 00:26:56,520 for an e-learning platform, and we needed to 419 00:26:57,680 --> 00:26:58,880 extract 420 00:26:59,280 --> 00:27:04,520 text summaries from lecture notes to make them easier 421 00:27:04,520 --> 00:27:07,680 and accessible for people, for students 422 00:27:07,680 --> 00:27:11,160 with disabilities. So, 423 00:27:11,160 --> 00:27:15,440 based on that project, 424 00:27:15,440 --> 00:27:19,440 what we learned is that text summarization is a very difficult task 425 00:27:20,880 --> 00:27:24,480 in NLP, because it is highly dependent on the text's 426 00:27:25,600 --> 00:27:27,240 context, domain, 427 00:27:27,240 --> 00:27:31,040 target audience and even the goal of the summary. 428 00:27:31,040 --> 00:27:36,680 For example, in our scenario, we want to have a summary of 429 00:27:37,960 --> 00:27:39,880 each lecture's notes, 430 00:27:39,880 --> 00:27:43,480 but we have very long transcripts for each lecture.
431 00:27:43,920 --> 00:27:46,280 So we used a few 432 00:27:47,480 --> 00:27:49,680 text summarization models to generate 433 00:27:52,360 --> 00:27:55,200 the summaries, but the outcome was not good 434 00:27:56,360 --> 00:27:56,920 in some cases. 435 00:27:56,920 --> 00:27:59,120 It is mainly, as 436 00:28:00,280 --> 00:28:03,520 Vikas just said, that some of the text 437 00:28:03,520 --> 00:28:07,880 summarization just picks some of the text and replaces 438 00:28:07,880 --> 00:28:12,960 some of the words, and some of it doesn't make sense. 439 00:28:13,360 --> 00:28:16,520 So that's 440 00:28:16,520 --> 00:28:20,640 one problem we identified in text summarization. 441 00:28:21,120 --> 00:28:24,000 We also needed a 442 00:28:26,000 --> 00:28:28,200 method to 443 00:28:28,920 --> 00:28:29,800 personalize, 444 00:28:31,440 --> 00:28:33,360 because the 445 00:28:33,360 --> 00:28:36,400 project is 446 00:28:37,240 --> 00:28:41,280 related to adaptive learning for individual students. 447 00:28:41,280 --> 00:28:44,080 We need personalization for each student. 448 00:28:45,200 --> 00:28:47,920 Personalization could be 449 00:28:47,920 --> 00:28:51,760 customized, adapted to user needs, 450 00:28:52,240 --> 00:28:55,760 and this actually 451 00:28:55,760 --> 00:28:58,920 can be improved with 452 00:29:01,000 --> 00:29:04,000 users’ personal preferences 453 00:29:04,400 --> 00:29:06,720 or feedback, 454 00:29:08,200 --> 00:29:11,040 and also by allowing the user to set a 455 00:29:12,600 --> 00:29:14,800 summary goal. 456 00:29:14,800 --> 00:29:20,120 Also, simplification is very important, because some students 457 00:29:20,360 --> 00:29:25,320 may have a cognitive disability or another type of disability. 458 00:29:25,320 --> 00:29:28,200 They need simplified 459 00:29:28,760 --> 00:29:33,720 or plain language.
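[Editor's note: for contrast, the "pick and choose" approach the panelists say is often insufficient can be sketched in a few lines. Word-frequency scoring is an assumption made for brevity; real extractive systems use much richer models, and the example text is invented.]

```python
from collections import Counter
import re

def extractive_summary(text, n_sentences=2):
    """Score each sentence by the frequency of its words in the whole
    document and return the top-scoring sentences in original order.
    This only selects existing sentences; it cannot rephrase or simplify
    them, which is the limitation discussed above."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: sum(freq[w] for w in re.findall(r"[a-z']+", sentences[i].lower())),
        reverse=True,
    )
    keep = sorted(ranked[:n_sentences])  # restore document order
    return " ".join(sentences[i] for i in keep)

lecture = ("AAC symbols support communication. Summarization is hard. "
           "AAC symbols can be interlinked. The weather is nice.")
print(extractive_summary(lecture))
```

Because the output is always a verbatim subset of the input, any sentence that needs rewording for a particular audience comes through unchanged; that is why abstractive and simplification models matter for accessibility.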
460 00:29:33,720 --> 00:29:35,760 Yeah, I think that's mainly what we 461 00:29:37,200 --> 00:29:40,840 have for text summarization. 462 00:29:40,840 --> 00:29:41,520 Thank you, Chaohai. 463 00:29:41,600 --> 00:29:43,920 Thank you. 464 00:29:43,920 --> 00:29:48,480 Okay, so we started with the challenges, 465 00:29:48,480 --> 00:29:53,240 and now I would like to move on to the future perspectives. 466 00:29:53,240 --> 00:29:58,440 What are the breakthroughs that you see happening, 467 00:29:58,960 --> 00:30:01,280 promoted by the use of NLP, 468 00:30:02,080 --> 00:30:04,320 for accessible communication? 469 00:30:04,680 --> 00:30:06,920 And we'll start with you now, Vikas. 470 00:30:10,960 --> 00:30:12,840 So my 471 00:30:12,840 --> 00:30:16,200 perspective is that there are plenty of NLP 472 00:30:16,200 --> 00:30:20,120 tools out there already that haven’t been exploited 473 00:30:20,120 --> 00:30:24,760 to the fullest extent to address accessibility and usability issues. 474 00:30:25,280 --> 00:30:27,960 The growth in NLP techniques 475 00:30:27,960 --> 00:30:33,080 and methods has been extremely steep in recent years, 476 00:30:33,080 --> 00:30:37,080 and the rest of us in different fields are trying to catch up. 477 00:30:37,560 --> 00:30:39,840 I mean, still, there is a lot to be explored 478 00:30:40,400 --> 00:30:43,320 as to how they can be used to address 479 00:30:43,320 --> 00:30:45,880 real-world accessibility problems, 480 00:30:46,360 --> 00:30:51,120 and we are in the process of doing that, I would say. So 481 00:30:51,120 --> 00:30:53,320 text summarization is one thing 482 00:30:54,240 --> 00:30:57,000 that we discussed already, which can be exploited 483 00:30:57,000 --> 00:31:01,840 in a lot of scenarios to improve the efficiency 484 00:31:01,840 --> 00:31:05,880 of computer interaction for people with disabilities.
485 00:31:06,240 --> 00:31:09,840 But the main problem, as we discussed not only in this panel 486 00:31:09,840 --> 00:31:12,120 but also in other panels, is the data. 487 00:31:12,120 --> 00:31:16,560 So for some languages there is enough of a corpus 488 00:31:16,560 --> 00:31:19,960 and the translation is good, because the translation 489 00:31:19,960 --> 00:31:23,280 essentially depends on how much data you have trained on. 490 00:31:23,640 --> 00:31:27,200 But for some pairs of languages it may not be that 491 00:31:28,200 --> 00:31:28,800 easy. 492 00:31:28,800 --> 00:31:31,680 Or even if it exists, it may not be that accurate. 493 00:31:31,680 --> 00:31:33,800 So that may be a problem. 494 00:31:33,800 --> 00:31:36,920 And then the biggest area I see, 495 00:31:37,480 --> 00:31:41,160 which can be very useful for solving 496 00:31:41,160 --> 00:31:46,560 many accessibility problems, is the improvement in dialog systems. 497 00:31:46,560 --> 00:31:49,560 So natural language dialog is 498 00:31:49,560 --> 00:31:52,000 a really intuitive interface for many 499 00:31:53,360 --> 00:31:56,840 users, including many people with disabilities: 500 00:31:57,360 --> 00:32:00,240 those who have physical impairments which 501 00:32:01,280 --> 00:32:05,040 prevent them from conveniently using the keyboard or the mouse, 502 00:32:05,040 --> 00:32:08,080 and those who are blind and have to use screen readers, 503 00:32:08,080 --> 00:32:11,040 which is known to be time consuming. 504 00:32:11,040 --> 00:32:15,080 So dialog assistants are, I would say, 505 00:32:16,320 --> 00:32:17,720 under-explored... people are still exploring them. 506 00:32:17,720 --> 00:32:21,840 We see that commercialization is going on, like on smartphones and all, 507 00:32:22,080 --> 00:32:26,240 but it's still at the level of high-level interactions 508 00:32:26,240 --> 00:32:30,840 like setting alarms or turning on lights and answering some questions.
509 00:32:31,280 --> 00:32:34,760 But what about using that to interact with applications, 510 00:32:34,760 --> 00:32:36,760 in the context of an application? 511 00:32:37,360 --> 00:32:39,480 So if I say, 512 00:32:41,040 --> 00:32:41,760 "add a user 513 00:32:41,760 --> 00:32:44,760 comment to this particular document", 514 00:32:44,760 --> 00:32:49,280 and it's in Word or Google Docs, can a spoken 515 00:32:49,280 --> 00:32:52,200 dialog assistant understand that in an automated 516 00:32:53,360 --> 00:32:55,440 way? This kind of automation 517 00:32:56,240 --> 00:33:01,240 will, I feel, address many of the issues 518 00:33:01,240 --> 00:33:04,640 that people face interacting with digital content. 519 00:33:04,640 --> 00:33:09,080 So that's one of the things I would say we can use NLP for. 520 00:33:09,480 --> 00:33:15,120 The other thing is the increased availability of large language 521 00:33:15,120 --> 00:33:20,280 models, pre-trained models like the one Lourdes mentioned, GPT, 522 00:33:20,840 --> 00:33:25,680 which is essentially a transformer decoder, a generator-based model. 523 00:33:25,680 --> 00:33:28,040 Then there's also BERT, which is encoder based. 524 00:33:28,440 --> 00:33:32,040 So these help us, you know, 525 00:33:32,120 --> 00:33:36,920 in a way that we don't need large amounts of data to solve problems, 526 00:33:36,920 --> 00:33:40,320 because they're already pre-trained on a large amount of data. 527 00:33:40,800 --> 00:33:44,560 So what we would need are kind of small 528 00:33:44,760 --> 00:33:49,800 datasets that are more fine-tuned towards the problem we are addressing. 529 00:33:50,520 --> 00:33:53,960 So for the datasets, the accessibility datasets,
530 00:33:53,960 --> 00:33:57,600 I think there needs to be a little bit more investment. 531 00:33:58,480 --> 00:34:02,960 It doesn't have to be that big, because the large language models 532 00:34:02,960 --> 00:34:06,640 already take care of most of the language complexity. 533 00:34:06,760 --> 00:34:08,200 It's more like fine-tuning 534 00:34:09,160 --> 00:34:10,560 for the problem at hand. 535 00:34:10,560 --> 00:34:14,400 So that's where I think some effort should go. 536 00:34:14,400 --> 00:34:18,800 And once we do that, obviously we can fine-tune and solve the problems. 537 00:34:18,800 --> 00:34:23,640 And then there is a tremendous enhancement or advancement 538 00:34:23,640 --> 00:34:27,920 in transfer learning techniques, which we can exploit as well, 539 00:34:28,440 --> 00:34:32,280 in order to not do stuff from scratch, and instead borrow 540 00:34:32,280 --> 00:34:35,640 some things that are already there for something different, 541 00:34:36,320 --> 00:34:38,280 I mean, a similar problem. 542 00:34:38,280 --> 00:34:43,880 So there is a lot to be explored, but we haven't done that yet. 543 00:34:43,880 --> 00:34:45,920 So there's plenty of opportunity 544 00:34:45,920 --> 00:34:48,960 for research using NLP expertise for 545 00:34:49,560 --> 00:34:53,400 problems in accessible communication, especially. 546 00:34:53,400 --> 00:34:56,160 Yes, definitely some exciting avenues there. 547 00:34:57,120 --> 00:35:00,680 So, Chaohai, can we have your take on this? 548 00:35:01,120 --> 00:35:04,080 What are your breakthroughs? 549 00:35:04,200 --> 00:35:04,960 OK. 550 00:35:04,960 --> 00:35:08,560 I just listened to Vikas, and I totally agree with 551 00:35:08,880 --> 00:35:09,240 him, 552 00:35:11,160 --> 00:35:13,200 with all his opinions. 553 00:35:13,200 --> 00:35:15,320 For my research, 554 00:35:15,720 --> 00:35:18,480 I've mainly worked on AAC, so, currently, 555 00:35:19,160 --> 00:35:22,160 I will take AAC as an example.
556 00:35:22,920 --> 00:35:26,120 So the future perspective for AAC, 557 00:35:27,040 --> 00:35:29,200 for NLP for AAC, I think the first 558 00:35:29,240 --> 00:35:32,920 will be personalized, adaptive 559 00:35:34,160 --> 00:35:37,480 communication for each individual, because 560 00:35:39,240 --> 00:35:42,520 each individual has their own way of communication, 561 00:35:43,640 --> 00:35:45,760 their own way to communicate with others, 562 00:35:46,120 --> 00:35:49,200 and NLP techniques can be used to make 563 00:35:50,320 --> 00:35:52,720 this communication more accessible, 564 00:35:52,920 --> 00:35:56,480 more personalized and adaptive, based on their 565 00:35:58,000 --> 00:36:01,200 personal preferences and feedback. 566 00:36:02,440 --> 00:36:06,520 So this can be used 567 00:36:06,840 --> 00:36:10,320 to personalize AAC symbols. 568 00:36:11,320 --> 00:36:14,440 Currently, AAC users 569 00:36:14,440 --> 00:36:19,080 just use some standard AAC symbol set for 570 00:36:19,160 --> 00:36:24,480 their daily communications. So how can we use NLP 571 00:36:25,680 --> 00:36:29,520 and generative AI models 572 00:36:30,240 --> 00:36:34,720 to create more customized, personalized AAC symbols 573 00:36:36,080 --> 00:36:37,440 that 574 00:36:37,440 --> 00:36:40,800 have the ability to adapt to an 575 00:36:42,720 --> 00:36:45,240 individual's unique cultural and social needs? 576 00:36:45,560 --> 00:36:48,240 I think that's one way to potentially 577 00:36:49,280 --> 00:36:52,280 contribute to AAC users.
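[Editor's note: a toy illustration of the personalization idea, not Chaohai's actual system. A next-symbol predictor can adapt to an individual simply by counting which symbol pairs that user actually produces; the symbol names and usage history below are invented for the example.]

```python
from collections import Counter, defaultdict

class PersonalSymbolModel:
    """Bigram model over one user's own AAC symbol history: suggests the
    symbols this user most often selects after the current one, so the
    interface adapts to their personal way of communicating."""

    def __init__(self):
        self.following = defaultdict(Counter)

    def observe(self, symbol_sequence):
        # Count each adjacent (previous symbol, next symbol) pair.
        for prev, nxt in zip(symbol_sequence, symbol_sequence[1:]):
            self.following[prev][nxt] += 1

    def suggest(self, current_symbol, k=2):
        return [s for s, _ in self.following[current_symbol].most_common(k)]

model = PersonalSymbolModel()
# Hypothetical usage history for one user.
model.observe(["I", "want", "drink", "water"])
model.observe(["I", "want", "play", "outside"])
model.observe(["I", "want", "drink", "juice"])
print(model.suggest("want"))  # → ['drink', 'play']
```

The same counting trick per user is the simplest form of the preference-and-feedback adaptation described above; a production system would combine it with a general language model rather than rely on counts alone.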
578 00:36:52,800 --> 00:36:56,160 The second one will be accessible 579 00:36:56,320 --> 00:36:59,440 multimodal communication, 580 00:37:00,120 --> 00:37:01,720 because 581 00:37:02,000 --> 00:37:03,920 NLP techniques 582 00:37:03,920 --> 00:37:07,560 have the potential to enhance 583 00:37:08,520 --> 00:37:11,600 accessible communication by improving 584 00:37:12,960 --> 00:37:16,320 the interoperability of training data 585 00:37:17,000 --> 00:37:22,160 between verbal language, sign language and AAC. So data 586 00:37:22,160 --> 00:37:26,280 interoperability could provide more high-quality training data for these 587 00:37:28,120 --> 00:37:28,960 languages and 588 00:37:28,960 --> 00:37:31,040 symbol sets, and 589 00:37:32,200 --> 00:37:35,400 additionally, 590 00:37:35,400 --> 00:37:39,720 it can provide the ability to translate between 591 00:37:40,280 --> 00:37:43,200 different communication modalities 592 00:37:43,800 --> 00:37:46,280 and make communication more accessible and inclusive. 593 00:37:47,480 --> 00:37:50,280 So in AAC, we can have 594 00:37:51,440 --> 00:37:55,080 multiple AAC symbol sets that can be linked, 595 00:37:55,640 --> 00:37:58,200 mapped and interlinked by NLP models, 596 00:37:58,680 --> 00:38:03,200 and this can contribute to translation from AAC to AAC, 597 00:38:03,520 --> 00:38:07,800 AAC to text, AAC to sign language and vice versa. 598 00:38:08,480 --> 00:38:13,880 Yeah, that's the second aspect I think about. 599 00:38:13,880 --> 00:38:15,240 And then the third one is 600 00:38:17,240 --> 00:38:19,720 AI-assisted communication; Vikas 601 00:38:19,720 --> 00:38:23,760 just talked about ChatGPT. 602 00:38:23,760 --> 00:38:26,760 So 603 00:38:26,760 --> 00:38:30,080 these large language models have been trained 604 00:38:30,080 --> 00:38:32,840 by these big companies 605 00:38:33,960 --> 00:38:35,680 and they have been spreading widely on social media.
606 00:38:35,680 --> 00:38:38,560 So how 607 00:38:38,560 --> 00:38:40,680 can we use these 608 00:38:40,680 --> 00:38:42,920 trained large 609 00:38:43,400 --> 00:38:48,360 language models, incorporate them into other applications, and then use them 610 00:38:48,360 --> 00:38:53,560 for more accessible communication to help people with disabilities? 611 00:38:54,000 --> 00:38:56,800 That's another 612 00:38:57,000 --> 00:38:57,560 future 613 00:38:57,560 --> 00:38:59,200 we are looking toward. 614 00:38:59,200 --> 00:39:01,320 The last one that I'm going to talk about 615 00:39:01,560 --> 00:39:04,880 is more regarding AAC, because it's quite expensive. 616 00:39:05,160 --> 00:39:07,960 So affordability is very important, 617 00:39:08,640 --> 00:39:13,080 and it can be improved by NLP 618 00:39:13,320 --> 00:39:18,120 or AI. As I mentioned, we are currently looking into 619 00:39:18,640 --> 00:39:23,320 how to turn images into symbols and how to generate 620 00:39:24,640 --> 00:39:28,200 AAC symbols automatically by using 621 00:39:29,520 --> 00:39:33,720 generative image AI models like Stable Diffusion. 622 00:39:34,840 --> 00:39:37,960 So that's another future direction we are looking at: 623 00:39:37,960 --> 00:39:41,280 how to reduce the cost of accessible communication. 624 00:39:42,000 --> 00:39:44,280 Yeah. Thank you. 625 00:39:44,280 --> 00:39:44,640 Thank you, 626 00:39:44,640 --> 00:39:47,520 Chaohai. Definitely a relevant point. 627 00:39:47,520 --> 00:39:52,040 Reducing the costs of getting data and all of that. 628 00:39:52,320 --> 00:39:54,120 That's important everywhere. 629 00:39:54,120 --> 00:39:57,120 So, Lourdes, what are you looking for 630 00:39:57,360 --> 00:40:01,840 in the near future? 631 00:40:01,840 --> 00:40:05,880 And you are muted.
632 00:40:05,880 --> 00:40:11,920 So, as we have mentioned before, there are two trends: 633 00:40:12,360 --> 00:40:16,680 the appearance of new language models, better than the previous ones, 634 00:40:17,120 --> 00:40:19,680 and working with these new models 635 00:40:20,120 --> 00:40:25,080 to reduce disability biases. 636 00:40:25,080 --> 00:40:29,040 Also, I am going to list specific natural language processing 637 00:40:29,040 --> 00:40:31,800 tasks and applications 638 00:40:32,080 --> 00:40:34,320 that I will work on in the coming years. 639 00:40:35,480 --> 00:40:39,960 One of them is accessibility for domain-specific 640 00:40:39,960 --> 00:40:41,680 tasks, such as health. 641 00:40:42,760 --> 00:40:44,720 The 642 00:40:44,720 --> 00:40:47,960 health domain is a highly demanded need, 643 00:40:48,960 --> 00:40:51,800 but patients have problems understanding 644 00:40:51,800 --> 00:40:57,240 information about their health condition, diagnosis, treatment, 645 00:40:57,520 --> 00:41:02,560 and natural language processing methods could improve their understanding 646 00:41:02,560 --> 00:41:05,440 of health-related documents. 647 00:41:06,400 --> 00:41:10,800 Similar problems appear in legal and financial documents, 648 00:41:10,800 --> 00:41:14,880 the language of administration, e-government... 649 00:41:15,480 --> 00:41:18,960 Current natural language processing technology 650 00:41:18,960 --> 00:41:27,240 that simplifies and summarizes these texts could help in this roadmap.
651 00:41:27,240 --> 00:41:29,480 And another 652 00:41:31,000 --> 00:41:33,720 line is speech to text. 653 00:41:35,040 --> 00:41:37,800 Speech to text will be a relevant area 654 00:41:37,800 --> 00:41:40,800 of research in the field of virtual meetings, 655 00:41:41,720 --> 00:41:46,440 in order to facilitate accessible communication by the generation 656 00:41:46,440 --> 00:41:50,120 of summaries of meetings, as well as minutes, 657 00:41:51,280 --> 00:41:53,640 in plain language. 658 00:41:53,960 --> 00:41:56,320 Another topic is the 659 00:41:57,240 --> 00:42:00,600 integration of natural language processing 660 00:42:00,600 --> 00:42:03,960 methods into the design 661 00:42:03,960 --> 00:42:07,320 and development of multimedia user interfaces. 662 00:42:08,400 --> 00:42:12,120 It is necessary to approach accessible communication 663 00:42:12,120 --> 00:42:17,000 from a multidisciplinary perspective, across different areas 664 00:42:17,000 --> 00:42:20,520 such as human-computer interaction, software engineering 665 00:42:20,840 --> 00:42:24,360 and natural language processing. 666 00:42:24,360 --> 00:42:29,280 Finally, another issue is advancing 667 00:42:29,280 --> 00:42:33,400 applications of smart assistants and natural language processing 668 00:42:33,920 --> 00:42:37,440 methods to support people with disabilities and the elderly, 669 00:42:38,520 --> 00:42:41,480 assist them in their daily tasks 670 00:42:41,760 --> 00:42:46,560 and promote active living. 671 00:42:46,560 --> 00:42:47,920 Okay, thank you so much, Lourdes, 672 00:42:47,920 --> 00:42:52,120 and every one of you for those perspectives. 673 00:42:52,520 --> 00:42:55,480 I guess we still have 5 minutes 674 00:42:55,480 --> 00:43:00,520 more in this session, so I will risk another question, 675 00:43:00,520 --> 00:43:05,040 and I will ask you to try to be brief on this one.
676 00:43:05,040 --> 00:43:09,720 But the need for data 677 00:43:09,720 --> 00:43:13,720 was common across all your interventions. 678 00:43:13,720 --> 00:43:17,480 And if we go back to the previous panel, also, 679 00:43:17,480 --> 00:43:21,360 it was brought up by every panelist. 680 00:43:21,360 --> 00:43:23,760 So yeah, we need data. 681 00:43:25,000 --> 00:43:26,880 What are your thoughts? 682 00:43:26,880 --> 00:43:29,600 How can we make it easier 683 00:43:30,120 --> 00:43:32,600 to collect more data 684 00:43:32,880 --> 00:43:37,120 for the specific aspect of accessible communication? 685 00:43:37,120 --> 00:43:38,960 Because we communicate a lot, right? 686 00:43:38,960 --> 00:43:41,280 Technology has 687 00:43:41,680 --> 00:43:43,720 allowed us, and opened up 688 00:43:44,320 --> 00:43:46,760 several channels where we can communicate 689 00:43:46,960 --> 00:43:49,960 even when we're not co-located. 690 00:43:50,520 --> 00:43:52,600 So yeah, every one of us is at a 691 00:43:53,160 --> 00:43:56,200 different point of the planet and we are communicating right now. 692 00:43:56,360 --> 00:44:01,440 Technology improved that possibility a lot. 693 00:44:01,880 --> 00:44:04,680 However, 694 00:44:04,680 --> 00:44:07,320 we always hear this: we need more data. 695 00:44:07,320 --> 00:44:08,520 We can't get data. 696 00:44:08,520 --> 00:44:11,760 So how do you think we can get more data? 697 00:44:13,160 --> 00:44:15,240 And of course we need 698 00:44:15,240 --> 00:44:17,760 the data to train these models, but 699 00:44:18,280 --> 00:44:21,760 can't we also rely on these models to generate data? 700 00:44:22,960 --> 00:44:25,320 So let me just 701 00:44:25,320 --> 00:44:28,560 drop this on you now, and 702 00:44:30,120 --> 00:44:31,320 do any of you want 703 00:44:31,320 --> 00:44:37,120 to go first? 704 00:44:37,120 --> 00:44:40,160 I can go first. Okay. Yeah. Yeah.
705 00:44:40,160 --> 00:44:43,320 We actually worked on open 706 00:44:43,320 --> 00:44:46,240 data four years ago, before, I mean, 707 00:44:47,440 --> 00:44:50,680 the AI and data science wave. When I started my PhD, 708 00:44:50,840 --> 00:44:55,720 we worked on open data, and there is an open data initiative in the UK. 709 00:44:56,040 --> 00:44:59,720 So we wanted to open our data: government data, 710 00:45:01,120 --> 00:45:05,120 public transport data. And I have 711 00:45:05,120 --> 00:45:09,200 long been working on public transportation for accessibility needs. 712 00:45:09,200 --> 00:45:11,400 So there was a lack of data 713 00:45:11,520 --> 00:45:13,240 at the beginning of my PhD, 714 00:45:13,240 --> 00:45:17,320 and a few years later, there is still a lack of accessibility information 715 00:45:18,000 --> 00:45:19,240 in this data. 716 00:45:19,240 --> 00:45:23,400 So how can we, 717 00:45:23,760 --> 00:45:28,920 I mean, in the accessibility area, how can we get such data 718 00:45:29,320 --> 00:45:32,000 to train our models? 719 00:45:32,000 --> 00:45:36,320 I mean, the first approach, what I used to do 720 00:45:36,320 --> 00:45:40,560 with public transport data, is that I mapped all the 721 00:45:41,520 --> 00:45:42,760 available data 722 00:45:42,760 --> 00:45:46,400 into a larger dataset. That incurred 723 00:45:46,480 --> 00:45:50,120 a lot of labor, like data cleaning, data integration 724 00:45:50,120 --> 00:45:54,560 and all these methods to make the data available. 725 00:45:55,120 --> 00:45:57,560 That's the first approach.
726 00:45:57,840 --> 00:46:00,960 The second is to think about how we can 727 00:46:02,200 --> 00:46:05,360 contribute to 728 00:46:05,360 --> 00:46:08,160 a data repository, something 729 00:46:08,160 --> 00:46:13,120 like ImageNet or WordNet, where we can collaborate 730 00:46:13,160 --> 00:46:15,520 to contribute identified 731 00:46:16,520 --> 00:46:19,320 data related to accessibility research. 732 00:46:20,160 --> 00:46:25,680 I think that, as a community, we can create such a universal 733 00:46:26,000 --> 00:46:28,320 repository or, 734 00:46:30,160 --> 00:46:32,200 yeah, some kind of data initiative 735 00:46:32,200 --> 00:46:35,480 for working on accessibility research. 736 00:46:36,240 --> 00:46:38,920 And the third approach is, definitely, 737 00:46:39,000 --> 00:46:42,440 that we can generate data based on the small data we have. 738 00:46:42,440 --> 00:46:45,400 We can use a 739 00:46:45,840 --> 00:46:48,720 generative AI model to generate more, 740 00:46:49,840 --> 00:46:53,000 but then the question is, is that data reliable? 741 00:46:53,080 --> 00:46:58,240 Is the generated data good enough, or is it biased? 742 00:46:58,640 --> 00:47:01,600 So yeah, that's my conclusion. 743 00:47:01,960 --> 00:47:02,680 Thank you. 744 00:47:03,400 --> 00:47:04,440 Yes, exactly. 745 00:47:04,440 --> 00:47:07,440 That's the big question mark, 746 00:47:07,440 --> 00:47:08,200 right? 747 00:47:08,360 --> 00:47:11,520 Is that synthetic data reliable or not? So, 748 00:47:12,000 --> 00:47:14,440 Vikas or Lourdes, do you want to add something? 749 00:47:15,480 --> 00:47:18,000 So yeah, I mean, I have used synthetic data 750 00:47:18,120 --> 00:47:21,000 before, based on 751 00:47:21,040 --> 00:47:24,240 a little bit of real data, and in some cases 752 00:47:24,240 --> 00:47:26,440 you can generate synthetic data.
753 00:47:26,680 --> 00:47:29,640 So one of the things I had to do was extract 754 00:47:30,400 --> 00:47:32,440 user comments from documents. 755 00:47:33,280 --> 00:47:37,480 Most of these word processing applications allow you to post comments 756 00:47:37,960 --> 00:47:42,800 to the right, for your collaborators to look at and then, you know, address them. 757 00:47:43,240 --> 00:47:47,360 So to automatically extract those, I had to generate synthetic data, 758 00:47:47,360 --> 00:47:51,800 because obviously you have few documents with collaborative comments. 759 00:47:52,320 --> 00:47:54,800 The appearance there is like, okay, 760 00:47:54,840 --> 00:47:58,360 comments will appear somewhere on the right side, in the right corner, 761 00:47:59,080 --> 00:48:02,400 and will have some text in them, a few sentences. 762 00:48:02,680 --> 00:48:04,680 So there are some characteristics. 763 00:48:04,680 --> 00:48:07,640 So in those cases we generated synthetic data. 764 00:48:07,640 --> 00:48:09,960 We trained the machine learning model. 765 00:48:09,960 --> 00:48:13,720 It was pretty accurate on this data, which was real data. 766 00:48:14,520 --> 00:48:16,200 So, 767 00:48:16,200 --> 00:48:19,320 in some cases you can exploit the 768 00:48:19,680 --> 00:48:23,840 way data will appear and then generate synthetic data. 769 00:48:23,920 --> 00:48:27,240 But in many cases it may not be possible. 770 00:48:27,480 --> 00:48:30,480 Like for the project I mentioned on social media, 771 00:48:30,480 --> 00:48:34,000 where text contains a lot of nonstandard words. 772 00:48:34,600 --> 00:48:39,800 Simply replacing the nonstandard words with synonyms may not do the job, 773 00:48:39,880 --> 00:48:43,920 because then you take the fun aspect away from social media. 774 00:48:44,280 --> 00:48:47,320 It should be as fun and entertaining 775 00:48:47,320 --> 00:48:51,120 when you listen to social media text as it is when you look at it.
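[Editor's note: a toy version of the synthetic-data trick Vikas describes. The page width, position thresholds and sentence counts are invented for illustration; the idea is simply that when you know the characteristic appearance of a comment box, you can generate labeled examples instead of collecting many real documents.]

```python
import random

PAGE_WIDTH = 800  # assumed page width in pixels, for illustration only

def synth_example(rng):
    """Produce one labeled region: either a comment box hugging the right
    edge with a few short sentences, or a main-body text region."""
    is_comment = rng.random() < 0.5
    if is_comment:
        x = rng.randint(int(PAGE_WIDTH * 0.75), PAGE_WIDTH - 120)  # right margin
        n_sentences = rng.randint(1, 3)
    else:
        x = rng.randint(0, int(PAGE_WIDTH * 0.5))  # main text column
        n_sentences = rng.randint(4, 12)
    return {"x": x, "n_sentences": n_sentences, "label": int(is_comment)}

def make_dataset(n, seed=0):
    rng = random.Random(seed)
    return [synth_example(rng) for _ in range(n)]

def predict(region):
    # Stand-in for a trained model: a rule matching the generated layout.
    # Real documents are noisier, which is why a learned model is used instead.
    return int(region["x"] > PAGE_WIDTH * 0.7 and region["n_sentences"] <= 3)

data = make_dataset(1000)
accuracy = sum(predict(r) == r["label"] for r in data) / len(data)
print(f"accuracy on synthetic data: {accuracy:.2f}")
```

The toy is deliberately cleanly separable; the transferable point is the generator, which encodes the known characteristics of the target layout so a classifier can be trained without a large hand-labeled corpus.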
776 00:48:51,640 --> 00:48:54,360 So you have to do some kind of clever, 777 00:48:55,360 --> 00:48:56,880 you know, replacement. 778 00:48:56,880 --> 00:49:01,040 And for that you need some kind of human expert going in and 779 00:49:03,000 --> 00:49:04,200 doing that. 780 00:49:04,200 --> 00:49:07,920 So crowdsourcing, I think, is one way to get data 781 00:49:07,920 --> 00:49:11,040 quickly, and it's pretty reliable. 782 00:49:11,440 --> 00:49:14,200 And I've seen, in the NLP community, 783 00:49:14,240 --> 00:49:19,560 NLP papers that appear in ACL, and they rely heavily on Amazon 784 00:49:20,160 --> 00:49:23,280 Mechanical Turk and other online, 785 00:49:26,160 --> 00:49:29,120 incentivized data collection mechanisms. 786 00:49:29,560 --> 00:49:31,720 So that, I think, is one thing. 787 00:49:31,720 --> 00:49:35,640 The other thing I do, you know, in my classes 788 00:49:35,640 --> 00:49:39,840 especially, is get the students to help each other out and collect data, 789 00:49:40,160 --> 00:49:43,560 so it doesn't have to be that intensive. 790 00:49:43,560 --> 00:49:47,400 If even one student collects, like, ten data points 791 00:49:47,880 --> 00:49:52,560 over the semester, that would be enough data for a lot of things. 792 00:49:52,560 --> 00:49:56,960 So, you know, across each other's projects, by the end of the course 793 00:49:57,120 --> 00:50:00,320 they'll pretty much have a lot of data for research. So, 794 00:50:01,320 --> 00:50:02,120 you know, 795 00:50:02,120 --> 00:50:06,040 everybody can contribute in a way, and students 796 00:50:06,040 --> 00:50:10,200 especially are much more reliable, because they are familiar 797 00:50:10,200 --> 00:50:14,840 with the mechanisms of how to label and collect data. 798 00:50:14,840 --> 00:50:18,520 And they can also understand how things work as well. 799 00:50:18,520 --> 00:50:22,760 So it’s like a win-win.
800 00:50:22,760 --> 00:50:25,000 Okay, yeah, thanks for that contribution. 801 00:50:25,480 --> 00:50:26,880 Good suggestion. 802 00:50:26,880 --> 00:50:31,040 And, Lourdes, we are really running out of time, 803 00:50:31,040 --> 00:50:34,120 but if you still want to intervene, 804 00:50:34,120 --> 00:50:36,200 I can give you a couple of minutes. 805 00:50:37,240 --> 00:50:37,760 Okay. 806 00:50:37,760 --> 00:50:40,720 Only that, I think, also, 807 00:50:40,720 --> 00:50:46,080 we need data, but my vision 808 00:50:46,080 --> 00:50:49,240 is also negative, because obtaining 809 00:50:49,240 --> 00:50:52,640 these datasets is expensive. 810 00:50:52,680 --> 00:50:56,760 In accessible communication, I work on simplification, and 811 00:50:57,960 --> 00:50:59,880 these data must be prepared 812 00:50:59,880 --> 00:51:01,960 by experts in accessibility. 813 00:51:03,360 --> 00:51:05,720 It is important that these data 814 00:51:07,240 --> 00:51:11,400 are validated by people with disabilities 815 00:51:11,680 --> 00:51:16,160 and that they use plain language resources, 816 00:51:16,880 --> 00:51:20,040 and so it is a problem 817 00:51:20,520 --> 00:51:25,640 to obtain data with quality. 818 00:51:26,880 --> 00:51:28,840 Okay, thank you so much, 819 00:51:28,840 --> 00:51:29,320 Lourdes. 820 00:51:29,320 --> 00:51:34,800 And a very big thank you to the three of you, Chaohai, 821 00:51:34,800 --> 00:51:36,680 Vikas and Lourdes. It 822 00:51:36,760 --> 00:51:38,760 was a really interesting panel; 823 00:51:39,480 --> 00:51:42,040 thank you so much for your availability and