1 00:00:00,000 --> 00:00:02,720 Hello, everyone, and welcome back 2 00:00:03,720 --> 00:00:05,880 to the second panel. 3 00:00:05,880 --> 00:00:08,800 And I'm now joined by 4 00:00:10,360 --> 00:00:14,080 Chaohai Ding from the University of Southampton, 5 00:00:14,760 --> 00:00:17,840 Lourdes Moreno from the Universidad 6 00:00:17,880 --> 00:00:20,880 Carlos III de Madrid in Spain, 7 00:00:21,120 --> 00:00:24,120 and Vikas Ashok from 8 00:00:24,320 --> 00:00:26,960 Old Dominion University in the US. 9 00:00:27,760 --> 00:00:31,120 So thank you all for your availability. 10 00:00:31,120 --> 00:00:33,840 It's great to have you here. 11 00:00:33,840 --> 00:00:37,800 And, as I said before, 12 00:00:37,800 --> 00:00:42,000 let's bring back the topic of natural language processing. We 13 00:00:43,160 --> 00:00:44,640 addressed it 14 00:00:44,640 --> 00:00:49,160 yesterday, but now from the perspective of 15 00:00:49,160 --> 00:00:54,240 how it can be used to enhance accessible communication on the Web. 16 00:00:54,560 --> 00:00:57,680 And here, I guess 17 00:00:58,160 --> 00:01:01,120 once again, similar to what I've done 18 00:01:01,800 --> 00:01:04,720 an hour ago in the first panel: 19 00:01:04,720 --> 00:01:07,920 you've been working on different aspects of this 20 00:01:08,160 --> 00:01:11,400 large domain of accessible communication, 21 00:01:11,840 --> 00:01:13,440 and you've 22 00:01:13,720 --> 00:01:16,520 pursued advances in machine 23 00:01:16,520 --> 00:01:20,000 translation, in sign language, in 24 00:01:20,400 --> 00:01:24,400 AAC. So, from your perspective 25 00:01:24,400 --> 00:01:27,120 and from the focus of your work, 26 00:01:27,640 --> 00:01:32,160 what are the current challenges that you've been facing 27 00:01:32,160 --> 00:01:34,200 and that are preventing 28 00:01:35,920 --> 00:01:38,080 the next breakthrough, I guess.
29 00:01:38,080 --> 00:01:43,560 And I would also like to ask you, for your first intervention, to do 30 00:01:44,080 --> 00:01:47,680 a brief introduction of yourself and of what you've been doing. 31 00:01:48,240 --> 00:01:54,080 Okay, so I can start with you, Chaohai. 32 00:01:54,120 --> 00:01:55,920 Thank you for having me today. 33 00:01:55,920 --> 00:02:00,720 My name is Chaohai Ding and I'm a senior research fellow at the University of Southampton. 34 00:02:01,160 --> 00:02:05,560 My research interest is AI and inclusion, 35 00:02:06,200 --> 00:02:09,440 which includes using data science and AI 36 00:02:10,400 --> 00:02:12,000 techniques to enhance 37 00:02:12,000 --> 00:02:15,440 accessible learning, traveling and 38 00:02:16,520 --> 00:02:17,920 communication. 39 00:02:17,920 --> 00:02:20,600 So, yes, 40 00:02:21,120 --> 00:02:22,400 NLP has been widely 41 00:02:22,400 --> 00:02:26,160 used in our research to support accessible communication. 42 00:02:27,200 --> 00:02:31,480 Currently, we are working on several projects focused on AAC. 43 00:02:31,480 --> 00:02:33,520 So, for example, 44 00:02:33,600 --> 00:02:36,280 we applied ConceptNet, 45 00:02:36,280 --> 00:02:39,960 a knowledge graph, 46 00:02:40,080 --> 00:02:42,600 to interlink AAC 47 00:02:43,920 --> 00:02:46,000 symbols from different symbol sets. 48 00:02:47,200 --> 00:02:49,120 This can be used 49 00:02:49,120 --> 00:02:51,800 for symbol-to-symbol translation. 50 00:02:52,640 --> 00:02:57,840 And we also developed an NLP model to translate 51 00:02:57,840 --> 00:03:02,760 an AAC symbol sequence into a spoken text sequence.
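The symbol-to-symbol translation described here can be sketched in miniature: two symbol sets are linked through shared concept labels, standing in for a knowledge graph such as ConceptNet. The symbol IDs and concept mappings below are invented for illustration; ARASAAC and Mulberry are real AAC symbol sets, but a real system would resolve concepts through the graph, not hand-written dictionaries.

```python
# Toy sketch of interlinking AAC symbols from different symbol sets
# through shared concept labels. All IDs and mappings are invented.

# Each symbol set maps its own symbol IDs to concept labels.
ARASAAC = {"ar_2462": "drink", "ar_7114": "house"}
MULBERRY = {"mb_0913": "drink", "mb_1180": "school"}

def build_concept_index(symbol_set):
    """Invert a symbol set: concept label -> symbol ID."""
    return {concept: sid for sid, concept in symbol_set.items()}

def translate_symbol(symbol_id, source_set, target_set):
    """Translate a symbol ID between sets via the shared concept,
    returning None when the target set has no symbol for it."""
    concept = source_set.get(symbol_id)
    if concept is None:
        return None
    return build_concept_index(target_set).get(concept)
```

For example, `translate_symbol("ar_2462", ARASAAC, MULBERRY)` maps the hypothetical "drink" symbol from one set to the other; the `None` case is exactly the interoperability gap discussed below, where one set simply has no symbol for a concept.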
52 00:03:03,400 --> 00:03:07,200 So those are the two projects we're working on currently, 53 00:03:07,920 --> 00:03:12,000 and we're also working on an accessible e-learning project 54 00:03:12,360 --> 00:03:14,880 where we applied machine translation 55 00:03:16,040 --> 00:03:18,480 to provide transcripts 56 00:03:18,480 --> 00:03:20,600 from English to 57 00:03:20,600 --> 00:03:23,400 other languages for our international users. 58 00:03:24,000 --> 00:03:27,680 So that's another scenario where we are working with machine 59 00:03:27,680 --> 00:03:30,800 translation for accessible communication. 60 00:03:30,800 --> 00:03:35,360 Now, there are a few challenges we have identified in our research. 61 00:03:36,240 --> 00:03:38,640 The first one is always the data: 62 00:03:38,640 --> 00:03:42,640 data availability and data accessibility. 63 00:03:43,160 --> 00:03:48,360 As we know, an NLP model is normally trained on a large amount of data. 64 00:03:49,360 --> 00:03:52,200 Especially for AAC, 65 00:03:53,720 --> 00:03:57,200 one of the biggest challenges 66 00:03:57,200 --> 00:04:00,480 is that we lack 67 00:04:00,480 --> 00:04:03,360 data, like 68 00:04:04,240 --> 00:04:07,720 user data, AAC data, and also 69 00:04:08,360 --> 00:04:12,120 data on how users interact with AAC. 70 00:04:12,520 --> 00:04:14,600 And 71 00:04:16,080 --> 00:04:17,600 also, 72 00:04:17,600 --> 00:04:20,400 we have several different 73 00:04:21,000 --> 00:04:25,800 AAC symbol sets used by different individuals, 74 00:04:26,040 --> 00:04:30,600 which makes it very difficult to develop NLP models as well, 75 00:04:30,840 --> 00:04:33,760 because the AAC symbols 76 00:04:33,760 --> 00:04:38,040 are separate for each symbol set. So 77 00:04:38,080 --> 00:04:40,720 another challenge is the lack of data 78 00:04:41,480 --> 00:04:45,080 interoperability across AAC symbol sets.
79 00:04:46,320 --> 00:04:51,000 The third challenge we identified is inclusion, 80 00:04:51,240 --> 00:04:54,960 because we are working on AAC symbol sets 81 00:04:55,720 --> 00:04:59,000 from Arabic, English and Chinese. 82 00:04:59,280 --> 00:05:05,040 There are cultural and social differences in AAC symbols, so it is 83 00:05:06,760 --> 00:05:08,240 important to 84 00:05:08,240 --> 00:05:13,880 consider the needs of different end user groups regarding cultural and social 85 00:05:14,160 --> 00:05:18,240 factors, and for them to be involved in the development 86 00:05:18,360 --> 00:05:20,960 of NLP models for AAC. 87 00:05:22,440 --> 00:05:24,160 The next one is 88 00:05:24,400 --> 00:05:28,120 data privacy and safety, and this has been 89 00:05:29,760 --> 00:05:33,360 identified in our web application for 90 00:05:33,360 --> 00:05:38,760 translating AAC symbols to spoken text. 91 00:05:38,760 --> 00:05:42,600 If we want a more accurate 92 00:05:42,600 --> 00:05:44,960 or more specific, 93 00:05:45,960 --> 00:05:47,400 personalized 94 00:05:47,960 --> 00:05:48,840 application, 95 00:05:48,840 --> 00:05:51,000 we need the user's information. 96 00:05:51,000 --> 00:05:54,000 So the challenge is 97 00:05:54,000 --> 00:05:57,160 how we can store this 98 00:05:57,520 --> 00:05:59,040 personal information, 99 00:05:59,040 --> 00:06:03,000 how to prevent data misuse and breaches, 100 00:06:03,400 --> 00:06:06,200 and how to make the tradeoff 101 00:06:06,200 --> 00:06:08,760 between the user's 102 00:06:09,240 --> 00:06:12,000 information and the model's performance. 103 00:06:13,720 --> 00:06:17,040 And the last one 104 00:06:17,040 --> 00:06:20,360 is always the accessible user interface.
105 00:06:20,360 --> 00:06:22,480 That is, 106 00:06:22,480 --> 00:06:26,040 how to make these AI-powered, NLP-powered 107 00:06:26,280 --> 00:06:30,840 tools accessible for end users. 108 00:06:32,680 --> 00:06:34,040 And also, 109 00:06:34,040 --> 00:06:36,760 there are more generic issues in 110 00:06:37,760 --> 00:06:42,400 AI, like accountability and explainability. So yes, I think 111 00:06:42,400 --> 00:06:46,200 those are the challenges we have identified in our research. 112 00:06:47,760 --> 00:06:48,480 Thank you. 113 00:06:48,720 --> 00:06:49,000 Thank you, Chaohai. 114 00:06:49,000 --> 00:06:54,920 That is a great summary of definitely some of the major challenges 115 00:06:54,920 --> 00:06:59,000 that are spread across the entire domain. 116 00:06:59,400 --> 00:07:01,200 Definitely. Thank you so much. 117 00:07:01,200 --> 00:07:04,280 Lourdes, do you want to go next? 118 00:07:06,400 --> 00:07:11,240 You're muted. 119 00:07:11,240 --> 00:07:12,240 Thank you. 120 00:07:12,840 --> 00:07:14,520 Thanks for the invitation. 121 00:07:14,520 --> 00:07:16,920 Good afternoon everyone. I'm Lourdes 122 00:07:16,920 --> 00:07:21,400 Moreno. I work as an associate professor in the computer 123 00:07:21,400 --> 00:07:24,880 science department at the Universidad Carlos III 124 00:07:24,960 --> 00:07:27,880 de Madrid, Spain. 125 00:07:27,880 --> 00:07:29,160 I am an accessibility 126 00:07:29,160 --> 00:07:33,160 expert. I have been working in the area of technology 127 00:07:33,160 --> 00:07:38,080 for disability for 20 years. 128 00:07:38,600 --> 00:07:41,200 I have previously worked on 129 00:07:41,200 --> 00:07:44,120 sensory disability, but currently 130 00:07:44,640 --> 00:07:47,080 I work on cognitive accessibility.
131 00:07:47,640 --> 00:07:50,560 In my research work I combine 132 00:07:50,880 --> 00:07:53,920 methods from the human-computer interaction 133 00:07:54,320 --> 00:07:57,720 and natural language processing areas 134 00:07:58,280 --> 00:08:00,480 to obtain accessible solutions 135 00:08:00,960 --> 00:08:03,560 from the point of view of readability 136 00:08:03,560 --> 00:08:07,120 and understandability of the language in user interfaces. 137 00:08:08,080 --> 00:08:09,080 As for 138 00:08:09,080 --> 00:08:15,320 the question: currently, natural language research is being driven 139 00:08:15,440 --> 00:08:18,360 by large language models. In recent years, 140 00:08:18,680 --> 00:08:20,960 there have been many advances 141 00:08:22,080 --> 00:08:24,320 due to increasing resources, 142 00:08:24,720 --> 00:08:28,680 such as large datasets and cloud platforms 143 00:08:29,240 --> 00:08:32,320 that allow the training of large models. 144 00:08:33,120 --> 00:08:36,480 But the most crucial factor is the use 145 00:08:36,480 --> 00:08:41,200 of Transformer technology and the use of transfer learning. 146 00:08:41,200 --> 00:08:47,640 These are methods based on deep learning to create language models 147 00:08:48,160 --> 00:08:50,280 based on neural networks. 148 00:08:51,440 --> 00:08:53,400 They are universal models 149 00:08:53,400 --> 00:08:57,160 that support different natural language 150 00:08:57,160 --> 00:09:03,240 processing tasks, such as question answering, translation, 151 00:09:03,240 --> 00:09:06,920 summarization, speech recognition and more. 152 00:09:07,840 --> 00:09:13,840 The most extensively used models are GPT 153 00:09:13,840 --> 00:09:17,280 from OpenAI and BERT from Google.
154 00:09:17,760 --> 00:09:22,680 But new and bigger models continually appear 155 00:09:23,360 --> 00:09:25,720 that outperform previous ones, 156 00:09:25,720 --> 00:09:31,200 because their performance continues to scale 157 00:09:31,400 --> 00:09:37,440 as more parameters and more data are added to the models. 158 00:09:38,640 --> 00:09:42,920 However, despite these great advances, 159 00:09:43,440 --> 00:09:46,160 there are issues 160 00:09:46,160 --> 00:09:51,240 and challenges to address within the accessibility scope. 161 00:09:51,240 --> 00:09:55,280 One of them is bias. 162 00:09:55,280 --> 00:09:59,000 Language models have different types of bias, 163 00:09:59,440 --> 00:10:03,240 such as gender, race and disability bias, 164 00:10:03,840 --> 00:10:07,680 but while gender and race 165 00:10:07,760 --> 00:10:11,040 biases are highly analyzed, 166 00:10:11,640 --> 00:10:14,640 that is not the case 167 00:10:14,720 --> 00:10:16,840 with disability biases. 168 00:10:17,960 --> 00:10:22,240 They have been relatively underexplored. 169 00:10:23,040 --> 00:10:27,600 There are studies on this matter: for example, 170 00:10:27,600 --> 00:10:32,880 in one work on sentiment analysis of text, 171 00:10:33,400 --> 00:10:37,200 the terms related to disability 172 00:10:37,200 --> 00:10:40,320 have a negative value, 173 00:10:40,320 --> 00:10:45,120 and in another work, a model used to moderate conversations 174 00:10:45,120 --> 00:10:48,840 classified texts mentioning disability 175 00:10:49,080 --> 00:10:52,360 as more toxic. 176 00:10:53,200 --> 00:10:57,600 That is, algorithms are trained to produce results 177 00:10:57,600 --> 00:11:00,040 that can be offensive and cause 178 00:11:00,040 --> 00:11:03,000 disadvantage to individuals with disabilities. 179 00:11:04,240 --> 00:11:07,320 So, investigation is necessary 180 00:11:07,320 --> 00:11:13,000 to study models in depth to reduce biases.
181 00:11:13,000 --> 00:11:16,640 We cannot just take these language models and directly 182 00:11:16,640 --> 00:11:19,520 use their outcomes. 183 00:11:20,240 --> 00:11:22,520 Another problem with these models 184 00:11:23,520 --> 00:11:26,880 is that there aren't many datasets 185 00:11:27,360 --> 00:11:30,000 related to the accessibility area. 186 00:11:30,000 --> 00:11:34,000 For instance, there are few labeled corpora 187 00:11:34,400 --> 00:11:36,760 to be used in training simplification 188 00:11:37,200 --> 00:11:41,280 algorithms, for lexical or syntactic simplification, 189 00:11:41,880 --> 00:11:43,600 in natural language processing. 190 00:11:43,600 --> 00:11:50,720 I work in cognitive accessibility in Spanish, simplifying text 191 00:11:50,720 --> 00:11:57,960 into plain language and easy-to-read language. To carry out this work, 192 00:11:57,960 --> 00:12:02,680 we had to create a corpus with an expert in easy reading, 193 00:12:03,000 --> 00:12:05,680 with the participation of older people 194 00:12:06,360 --> 00:12:10,680 and of people with intellectual disabilities, 195 00:12:11,480 --> 00:12:14,480 because the current corpora had been created 196 00:12:14,480 --> 00:12:19,520 by non-experts in disability and non-experts in plain language, 197 00:12:19,960 --> 00:12:24,000 and they haven't taken people with disabilities into account. 198 00:12:25,560 --> 00:12:30,360 Also, efforts devoted to solving this 199 00:12:30,360 --> 00:12:34,040 scarcity of resources are required 200 00:12:34,040 --> 00:12:37,200 for languages with few resources. 201 00:12:37,680 --> 00:12:42,600 English is the most developed language, with many natural language processing resources, 202 00:12:42,600 --> 00:12:46,200 but others, such as Spanish, have hardly 203 00:12:46,200 --> 00:12:49,280 any resources. We need systems 204 00:12:49,400 --> 00:12:54,720 trained for the English language to work for Spanish as well.
205 00:12:54,720 --> 00:12:58,680 And finally, with the proliferation of GPT models, 206 00:12:58,680 --> 00:13:03,160 with applications such as ChatGPT, 207 00:13:03,480 --> 00:13:06,320 another problem to address 208 00:13:06,360 --> 00:13:09,760 is the regulation of ethical aspects of 209 00:13:10,800 --> 00:13:14,920 artificial intelligence. 210 00:13:15,840 --> 00:13:17,240 Okay, thank you so much, 211 00:13:17,240 --> 00:13:22,280 Lourdes, definitely some very relevant challenges in there. 212 00:13:23,600 --> 00:13:26,160 Vikas, I'll end this first round with you. 213 00:13:27,440 --> 00:13:30,360 Thank you, Carlos. I'm Vikas Ashok 214 00:13:30,440 --> 00:13:34,600 from Old Dominion University in Virginia, United States. 215 00:13:35,160 --> 00:13:37,680 So I have been researching 216 00:13:37,680 --> 00:13:41,480 in the area of accessible computing for about ten years now. 217 00:13:42,000 --> 00:13:46,920 And my specialty focus area is people with visual disabilities. 218 00:13:47,320 --> 00:13:50,840 I have mostly concentrated on their accessibility 219 00:13:50,840 --> 00:13:54,000 as well as usability needs when it comes to interacting 220 00:13:54,000 --> 00:13:56,280 with computer applications. 221 00:13:57,080 --> 00:14:01,560 So, on the topic at hand, which is accessible communication: 222 00:14:01,600 --> 00:14:04,560 one of the projects that I'm currently looking at 223 00:14:04,560 --> 00:14:10,080 is the understandability of social media content for people 224 00:14:10,600 --> 00:14:13,920 who listen to content, such as, you know, people who are blind. 225 00:14:14,640 --> 00:14:18,000 So listening to social media 226 00:14:18,000 --> 00:14:21,640 text is not the same as looking at it.
227 00:14:22,120 --> 00:14:25,320 So, even though the social media text 228 00:14:25,320 --> 00:14:31,160 is accessible, it's not necessarily understandable because of the presence 229 00:14:31,160 --> 00:14:34,880 of a lot of nonstandard language 230 00:14:34,880 --> 00:14:37,880 content on social media such as Twitter. 231 00:14:37,920 --> 00:14:39,600 People create their own words. 232 00:14:39,600 --> 00:14:44,040 They're very inventive there, so they hardly follow any grammar. 233 00:14:44,880 --> 00:14:48,920 So text-to-speech systems, such as those used in screen 234 00:14:48,920 --> 00:14:52,920 readers, cannot necessarily pronounce 235 00:14:52,960 --> 00:14:56,480 these out-of-vocabulary words in the right way, 236 00:14:56,880 --> 00:14:59,920 because most of these words, even though they're in text form, 237 00:15:00,360 --> 00:15:05,200 are mostly intended for visual consumption, such as some kind of exaggeration 238 00:15:05,560 --> 00:15:10,680 where the letters are duplicated just for some kind of additional effect. 239 00:15:11,000 --> 00:15:14,800 Sometimes even emotions are attached to the text itself without any, 240 00:15:15,160 --> 00:15:17,480 you know, emoticons or anything else. 241 00:15:17,960 --> 00:15:22,640 And sometimes they try to phonetically match a word, 242 00:15:22,640 --> 00:15:27,080 use a different spelling for it just for fun purposes. 243 00:15:27,240 --> 00:15:33,320 So this kind of playful communication has increased tremendously on social media, 244 00:15:33,320 --> 00:15:39,000 and people are depending on social media to get news, even; 245 00:15:39,400 --> 00:15:42,760 you know, when some kind of disaster or something happens anywhere, 246 00:15:42,760 --> 00:15:46,680 people first flock to social media to get it.
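The exaggerated spellings Vikas describes, where letters are duplicated for visual effect, suggest one concrete normalization step before text-to-speech. A minimal sketch, assuming a later dictionary or language-model pass would finish the normalization:

```python
import re

# Collapse letter runs of three or more ("coooool") down to two,
# since English rarely repeats a letter more than twice. This is
# only the first step of a real pipeline, which would also need
# slang lexicons and phonetic matching for invented spellings.

def collapse_elongations(text):
    """Reduce any character repeated 3+ times to a double."""
    return re.sub(r"(.)\1{2,}", r"\1\1", text)
```

For instance, `collapse_elongations("soooo coooool!!!!")` yields `"soo cool!!"`: close enough for a dictionary lookup to recover "so cool", while legitimate doubles like the "oo" in "cool" are left alone.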
So 247 00:15:47,680 --> 00:15:48,880 people who listen 248 00:15:48,880 --> 00:15:53,760 to content should also be able to easily understand it, so I'm focusing on that area: 249 00:15:53,760 --> 00:15:57,480 how to use NLP to make this possible. 250 00:15:58,360 --> 00:16:02,360 This is not exactly a question 251 00:16:02,440 --> 00:16:08,400 of accessibility in the conventional sense; it's more like accessibility 252 00:16:08,400 --> 00:16:12,960 in terms of being able to understand already accessible content. 253 00:16:12,960 --> 00:16:14,920 So that's one of the things. 254 00:16:14,920 --> 00:16:18,760 The other thing that we're looking at, which is related to this panel, 255 00:16:18,800 --> 00:16:22,600 is related to the bias, the disability 256 00:16:22,600 --> 00:16:27,720 bias of natural language models, especially those large language models. 257 00:16:28,480 --> 00:16:33,960 Unfortunately, these models are reflective of the data they're trained on, 258 00:16:34,600 --> 00:16:40,720 because in most of the data, words that are used to describe people 259 00:16:40,720 --> 00:16:45,080 with disabilities somehow end up having a negative connotation. 260 00:16:45,360 --> 00:16:47,680 They're used in negative contexts. 261 00:16:48,040 --> 00:16:51,760 Nobody's telling these models to learn it that way, 262 00:16:52,320 --> 00:16:56,640 except that the documents, the text corpus that these models 263 00:16:56,640 --> 00:17:01,240 are looking at, inherently put these words, which are, 264 00:17:01,680 --> 00:17:06,200 you know, many times not offensive, into the negative category. 265 00:17:07,440 --> 00:17:10,080 So I'm looking at how we can counter 266 00:17:10,080 --> 00:17:14,160 this. One example is toxicity detection 267 00:17:14,320 --> 00:17:19,280 in discussion forums; online discussion forums are very popular.
268 00:17:19,280 --> 00:17:22,280 People go there, sometimes anonymously, 269 00:17:22,280 --> 00:17:24,720 post content, and interact with each other. 270 00:17:25,320 --> 00:17:28,440 And, you know, some of the posts get flagged 271 00:17:28,560 --> 00:17:32,360 as toxic or get filtered out 272 00:17:32,680 --> 00:17:35,800 even if they're not toxic, 273 00:17:36,160 --> 00:17:41,520 because of the use of certain words to describe disabilities or something. 274 00:17:41,520 --> 00:17:43,760 So we want to avoid that. 275 00:17:43,760 --> 00:17:47,920 So how can we use NLP to not do that? 276 00:17:48,520 --> 00:17:52,920 So these two projects are pretty much what's closely related to the panel, 277 00:17:53,400 --> 00:17:55,160 so I'll stick to these 278 00:17:55,160 --> 00:17:58,840 for this session. 279 00:17:58,840 --> 00:17:59,240 Thank you, Vikas. 280 00:18:00,920 --> 00:18:05,720 I'll follow up on what you've mentioned, and Lourdes 281 00:18:05,800 --> 00:18:08,960 has also previously highlighted 282 00:18:08,960 --> 00:18:14,320 the disability bias, and I'm wondering 283 00:18:14,320 --> 00:18:18,240 if you have any ideas and suggestions on how 284 00:18:19,560 --> 00:18:22,320 NLP tools can 285 00:18:22,920 --> 00:18:24,960 address such issues. 286 00:18:24,960 --> 00:18:27,960 I'm thinking, for instance, of text summarization tools, 287 00:18:27,960 --> 00:18:30,680 but also other NLP tools. 288 00:18:30,680 --> 00:18:33,840 How can they help us address issues of disability bias, 289 00:18:33,840 --> 00:18:37,920 but also, how can they explore other aspects 290 00:18:37,920 --> 00:18:41,280 like accountability or personalization? 291 00:18:41,280 --> 00:18:44,320 In the case of 292 00:18:44,320 --> 00:18:45,480 text summaries, 293 00:18:45,480 --> 00:18:49,200 how can I personalize a summary 294 00:18:49,200 --> 00:18:53,080 for specific audiences, for the needs of specific people?
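The over-flagging of non-toxic posts that Vikas describes can at least be measured with a counterfactual probe: score a post, score the same post with the disability term swapped for a neutral one, and flag the gap that the term alone causes. The scorer below is a toy lexicon stand-in for a real toxicity classifier, and its word weights are invented; only the probing pattern is the point.

```python
# Counterfactual probe for disability bias in a toxicity scorer.
# TOY_TOXICITY_LEXICON stands in for a learned model and mimics the
# problem discussed above: neutral disability words carry weight.
TOY_TOXICITY_LEXICON = {"stupid": 0.9, "blind": 0.6, "deaf": 0.6}

def toy_toxicity_score(text):
    """Stand-in classifier: mean lexicon weight over all tokens."""
    tokens = text.lower().split()
    return sum(TOY_TOXICITY_LEXICON.get(t, 0.0) for t in tokens) / len(tokens)

def counterfactual_gap(sentence, term, neutral="sighted"):
    """Toxicity-score gap attributable to the disability term alone."""
    swapped = sentence.replace(term, neutral)
    return toy_toxicity_score(sentence) - toy_toxicity_score(swapped)
```

A clearly positive gap on a benign sentence like "I am a blind programmer" is evidence the word, not the content, drives the flag; aggregated over a test set, such gaps can quantify the bias before any mitigation is attempted.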
295 00:18:54,200 --> 00:18:56,640 I'll start with you now, Lourdes. 296 00:18:56,640 --> 00:18:57,600 OK. 297 00:18:58,680 --> 00:19:00,080 Text summarization is 298 00:19:00,080 --> 00:19:02,680 a natural language processing task. 299 00:19:02,680 --> 00:19:04,920 It is a great resource, 300 00:19:06,720 --> 00:19:08,360 because it improves cognitive 301 00:19:08,360 --> 00:19:14,080 accessibility, helping people with disabilities to process long 302 00:19:14,080 --> 00:19:16,880 and dense text. 303 00:19:17,400 --> 00:19:20,280 Also, in the Web Content Accessibility Guidelines, 304 00:19:20,760 --> 00:19:25,200 under success criterion 3.1.5 Reading Level, 305 00:19:25,200 --> 00:19:29,720 a readable summary is a 306 00:19:30,160 --> 00:19:33,000 recommended resource. 307 00:19:34,640 --> 00:19:37,920 But this task has challenges, 308 00:19:38,320 --> 00:19:42,680 such as bias, disability biases, 309 00:19:43,040 --> 00:19:46,000 and whether the summaries generated 310 00:19:46,000 --> 00:19:49,320 are understandable 311 00:19:50,000 --> 00:19:54,520 for people with disabilities. 312 00:19:55,080 --> 00:20:00,560 Therefore, some aspects must be taken into account: it is necessary 313 00:20:00,560 --> 00:20:06,320 to approach this task with summarization of the extractive type, 314 00:20:07,520 --> 00:20:12,240 where the extracted sentences can be modified with paraphrasing 315 00:20:12,320 --> 00:20:17,400 resources that help the understandability and readability of the text. 316 00:20:18,520 --> 00:20:22,400 To summarize text, different inputs are required: 317 00:20:22,960 --> 00:20:28,440 not only knowledge about the sequences of words or other 318 00:20:28,800 --> 00:20:32,600 features of sentences; 319 00:20:33,000 --> 00:20:37,320 knowledge about the target audience is also important.
320 00:20:37,440 --> 00:20:40,480 Different types of users 321 00:20:40,480 --> 00:20:44,760 require different types of personalization of summaries. 322 00:20:46,400 --> 00:20:50,520 Also, 323 00:20:50,760 --> 00:20:53,320 I think that 324 00:20:53,760 --> 00:20:56,280 it would be advisable to include a 325 00:20:56,480 --> 00:21:00,720 readability metric in the summary generation process 326 00:21:01,240 --> 00:21:06,560 to ensure that the resulting summary is minimally readable. 327 00:21:08,400 --> 00:21:09,000 For instance, 328 00:21:09,000 --> 00:21:10,680 if 329 00:21:10,680 --> 00:21:15,880 we are in the context of a system that provides summaries of public 330 00:21:15,960 --> 00:21:18,560 administration information for older people, 331 00:21:19,000 --> 00:21:22,880 it's necessary to take into account that the summary 332 00:21:23,160 --> 00:21:26,600 must be in plain language; 333 00:21:26,760 --> 00:21:30,520 therefore, in addition to extracting the relevant sentences 334 00:21:30,520 --> 00:21:35,760 and paraphrasing, it will be necessary to include knowledge about guidelines 335 00:21:35,760 --> 00:21:39,720 for plain language to make the text easier to read. 336 00:21:40,560 --> 00:21:45,480 And finally, the corpora used 337 00:21:46,240 --> 00:21:50,280 to train natural language processing systems shall be tested 338 00:21:50,280 --> 00:21:54,880 with users in order to attain useful solutions.
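The readability gate Lourdes suggests could be wired into summary generation roughly as in this sketch, which scores candidates with the Flesch reading ease formula and a naive vowel-group syllable counter. The threshold of 60 is an arbitrary illustration, and for Spanish a readability index adapted to the language (such as Fernández Huerta's) would replace the English formula.

```python
import re

def count_syllables(word):
    """Rough heuristic: approximate syllables as runs of vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text):
    """Flesch reading ease: higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))

def readable_enough(summary, threshold=60.0):
    """Gate a candidate summary on a minimum readability score."""
    return flesch_reading_ease(summary) >= threshold
```

A generator could rerank or reject candidate summaries failing `readable_enough`, which is one concrete way to "ensure the resulting summary is minimally readable" before it ever reaches the user.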
339 00:21:55,520 --> 00:21:59,520 Only then will it be possible to obtain understandable summaries 340 00:21:59,520 --> 00:22:03,440 for the whole of society, including the elderly. 341 00:22:03,720 --> 00:22:07,920 And with respect to accountability, 342 00:22:09,120 --> 00:22:13,600 as with every artificial intelligence algorithm, 343 00:22:13,680 --> 00:22:16,200 these must be explainable, 344 00:22:18,000 --> 00:22:21,120 so it is necessary to answer 345 00:22:21,680 --> 00:22:25,160 questions such as how the processing 346 00:22:25,160 --> 00:22:29,440 is actually performed, the limitations of the dataset 347 00:22:29,880 --> 00:22:34,960 used to train and test the algorithms, and the outcomes of the model. 348 00:22:36,040 --> 00:22:37,480 Therefore, good 349 00:22:37,480 --> 00:22:41,640 data management and machine learning model 350 00:22:41,640 --> 00:22:42,640 training practices 351 00:22:42,640 --> 00:22:46,320 shall be promoted to ensure quality results. 352 00:22:49,080 --> 00:22:51,000 And nothing else. 353 00:22:51,000 --> 00:22:52,880 Thank you, Lourdes. 354 00:22:53,800 --> 00:22:58,920 Vikas, do you want to... even though, from what I understood, 355 00:22:58,920 --> 00:23:03,480 you don't work directly with text summarization, still other 356 00:23:03,480 --> 00:23:07,720 aspects of disability bias, accountability, 357 00:23:07,720 --> 00:23:11,040 and personalization impact what you're doing. 358 00:23:12,080 --> 00:23:16,760 Yeah, I mean, I use a lot of text summarization, so I can add to it. 359 00:23:16,760 --> 00:23:21,520 So, to add to what Lourdes said, 360 00:23:22,440 --> 00:23:26,560 simplification is also as important as summarization, 361 00:23:26,920 --> 00:23:32,240 because sometimes it's not just summarizing or shortening the content 362 00:23:32,400 --> 00:23:35,840 to be consumed, but also making it understandable, like I said.
363 00:23:36,280 --> 00:23:40,040 So that means that certain complex sentence structures 364 00:23:40,040 --> 00:23:41,680 and some exotic words, 365 00:23:41,680 --> 00:23:44,880 we need to replace them with equivalent and easier 366 00:23:44,880 --> 00:23:48,080 to understand, more frequently used words. 367 00:23:48,240 --> 00:23:54,080 So there is some work that has gone into text simplification. 368 00:23:54,120 --> 00:23:57,840 It's like summarization in a special case: 369 00:23:58,160 --> 00:24:01,640 it goes from text to text within the same language. 370 00:24:01,840 --> 00:24:05,800 So the input is text in the same language as the output text, 371 00:24:06,000 --> 00:24:09,720 except that the output text is more readable, more understandable. 372 00:24:10,320 --> 00:24:12,360 So that is extremely important. 373 00:24:12,360 --> 00:24:13,440 The other thing is 374 00:24:14,400 --> 00:24:15,240 summarization. 375 00:24:15,240 --> 00:24:19,240 Most systems tend to rely on extractive summarization, wherein 376 00:24:19,560 --> 00:24:24,080 they just pick certain sentences from the original piece of text 377 00:24:24,760 --> 00:24:26,840 so that they don't have to worry about 378 00:24:27,560 --> 00:24:30,800 grammatical correctness and proper sentence structures, 379 00:24:31,320 --> 00:24:34,400 because they rely on the humans 380 00:24:34,400 --> 00:24:37,320 who have written the text in order to generate the summaries. 381 00:24:37,760 --> 00:24:40,080 So I can speak about 382 00:24:40,440 --> 00:24:44,840 how summarization needs to be personalized in a certain way for certain groups, 383 00:24:44,960 --> 00:24:47,560 especially for people with visual disabilities.
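The extractive approach described here, picking existing sentences rather than generating new ones, can be sketched with simple word-frequency scoring. Real systems add stopword handling, sentence-position features, and redundancy control; this is only the bare mechanism.

```python
import re
from collections import Counter

def extractive_summary(text, k=2):
    """Keep the k sentences whose words are most frequent in the
    whole document, preserving their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freqs = Counter(w.lower() for w in re.findall(r"\w+", text))

    def score(sentence):
        words = re.findall(r"\w+", sentence.lower())
        return sum(freqs[w] for w in words) / max(1, len(words))

    top = sorted(sentences, key=score, reverse=True)[:k]
    return [s for s in sentences if s in top]
```

Because every output sentence is copied verbatim from the input, grammaticality is inherited from the human author, which is exactly the trade-off noted above: the method is safe but cannot rephrase, so it cannot by itself produce the non-visual rewordings discussed next.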
384 00:24:47,920 --> 00:24:51,720 So what I have noticed in some of my studies is that 385 00:24:52,640 --> 00:24:56,480 even though they can hear the text, they don't necessarily understand it, 386 00:24:56,480 --> 00:24:59,840 because the writing is sort of visual, in the sense that 387 00:24:59,840 --> 00:25:03,040 it needs you to be visually imaginative. 388 00:25:03,600 --> 00:25:06,560 So what is the non-visual alternative 389 00:25:07,240 --> 00:25:09,480 for such kind of text? 390 00:25:09,840 --> 00:25:15,080 How do you summarize text that includes a lot of visual elements? 391 00:25:15,280 --> 00:25:18,080 How do you convert it into 392 00:25:18,840 --> 00:25:21,520 equivalent non-visual explanations? 393 00:25:21,640 --> 00:25:26,000 This necessarily goes beyond extractive summarization. 394 00:25:26,000 --> 00:25:27,760 You cannot just pick and choose; 395 00:25:27,760 --> 00:25:31,760 you need to replace the wordings in the sentence 396 00:25:31,760 --> 00:25:36,120 with other wordings that they can understand. And some of the text, 397 00:25:36,120 --> 00:25:37,960 you know, these days, especially 398 00:25:37,960 --> 00:25:42,240 news articles and all, they don't come purely as text. 399 00:25:42,240 --> 00:25:44,920 They're sort of multi-modal, in the sense that 400 00:25:45,240 --> 00:25:48,000 there are pictures, there are GIFs, everything. 401 00:25:48,000 --> 00:25:51,200 And the text sort of refers to these pictures, 402 00:25:52,120 --> 00:25:56,840 so this is another problem, because then it becomes highly visual. 403 00:25:56,840 --> 00:26:00,240 So you have to take some of the visual elements 404 00:26:00,240 --> 00:26:03,920 of the picture, probably through computer vision techniques or something, 405 00:26:03,920 --> 00:26:08,760 and then inject them into the text in order to make it more self-sufficient 406 00:26:08,760 --> 00:26:12,800 and understandable for people who cannot see the images.
407 00:26:13,760 --> 00:26:17,240 So that's my take on it. 408 00:26:17,280 --> 00:26:23,240 Yeah, that's a very good point about the multimedia information 409 00:26:23,240 --> 00:26:26,320 and how we summarize everything into text. 410 00:26:26,320 --> 00:26:28,920 Yeah, that's a great point. 411 00:26:28,920 --> 00:26:31,560 Chaohai, your take on this? 412 00:26:31,560 --> 00:26:33,240 Oh yes. Yeah. 413 00:26:33,240 --> 00:26:36,880 We don't have much experience in text summarization. 414 00:26:36,880 --> 00:26:42,640 Most of our research is focused on AAC symbol interlinking 415 00:26:42,680 --> 00:26:46,600 and AAC generation, but we do have one project that 416 00:26:47,040 --> 00:26:49,200 involved some text summarization. 417 00:26:50,040 --> 00:26:53,240 We constructed a knowledge graph 418 00:26:53,640 --> 00:26:56,520 for an e-learning platform, and we needed to 419 00:26:57,680 --> 00:26:58,880 extract 420 00:26:59,280 --> 00:27:04,520 text summaries from lecture notes to make them easier 421 00:27:04,520 --> 00:27:07,680 and accessible for people, for students 422 00:27:07,680 --> 00:27:11,160 with disabilities. So, 423 00:27:11,160 --> 00:27:15,440 based on that project, 424 00:27:15,440 --> 00:27:19,440 what we learned is that text summarization is a very difficult task 425 00:27:20,880 --> 00:27:24,480 in NLP, because it is highly dependent on the text's 426 00:27:25,600 --> 00:27:27,240 context, domain, 427 00:27:27,240 --> 00:27:31,040 target audience and even the goal of the summary. 428 00:27:31,040 --> 00:27:36,680 For example, in our scenario, we want to have a summary of 429 00:27:37,960 --> 00:27:39,880 each lecture's notes, 430 00:27:39,880 --> 00:27:43,480 but we have very long transcripts for each lecture.
431 00:27:43,920 --> 00:27:46,280 So we used a few 432 00:27:47,480 --> 00:27:49,680 text summarization models to generate 433 00:27:52,360 --> 00:27:55,200 the summaries, but the outcome was not good 434 00:27:56,360 --> 00:27:56,920 in some cases. 435 00:27:56,920 --> 00:27:59,120 It is mainly, as 436 00:28:00,280 --> 00:28:03,520 Vikas just said, that some of the text 437 00:28:03,520 --> 00:28:07,880 summarization just picks some of the text and replaces 438 00:28:07,880 --> 00:28:12,960 some of the words, and some of it doesn't make sense. 439 00:28:13,360 --> 00:28:16,520 So that's 440 00:28:16,520 --> 00:28:20,640 one problem we identified in text summarization. 441 00:28:21,120 --> 00:28:24,000 We also needed a 442 00:28:26,000 --> 00:28:28,200 method to 443 00:28:28,920 --> 00:28:29,800 personalize, 444 00:28:31,440 --> 00:28:33,360 because the 445 00:28:33,360 --> 00:28:36,400 project is 446 00:28:37,240 --> 00:28:41,280 related to adaptive learning for individual students. 447 00:28:41,280 --> 00:28:44,080 We need personalization for each student. 448 00:28:45,200 --> 00:28:47,920 Personalization could be 449 00:28:47,920 --> 00:28:51,760 customized, adapted to user needs, 450 00:28:52,240 --> 00:28:55,760 and this actually 451 00:28:55,760 --> 00:28:58,920 can be improved with 452 00:29:01,000 --> 00:29:04,000 users’ personal preferences 453 00:29:04,400 --> 00:29:06,720 or feedback, 454 00:29:08,200 --> 00:29:11,040 and also by allowing the user to set a 455 00:29:12,600 --> 00:29:14,800 summary goal. 456 00:29:14,800 --> 00:29:20,120 Also, simplification is very important, because some students 457 00:29:20,360 --> 00:29:25,320 may have a cognitive disability or another type of disability. 458 00:29:25,320 --> 00:29:28,200 They need simplified 459 00:29:28,760 --> 00:29:33,720 or plain language.
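[Editor's note: for contrast, the "pick and choose" approach the panelists say is often insufficient can be sketched in a few lines. Word-frequency scoring is an assumption made for brevity; real extractive systems use much richer models, and the example text is invented.]

```python
from collections import Counter
import re

def extractive_summary(text, n_sentences=2):
    """Score each sentence by the frequency of its words in the whole
    document and return the top-scoring sentences in original order.
    This only selects existing sentences; it cannot rephrase or simplify
    them, which is the limitation discussed above."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: sum(freq[w] for w in re.findall(r"[a-z']+", sentences[i].lower())),
        reverse=True,
    )
    keep = sorted(ranked[:n_sentences])  # restore document order
    return " ".join(sentences[i] for i in keep)

lecture = ("AAC symbols support communication. Summarization is hard. "
           "AAC symbols can be interlinked. The weather is nice.")
print(extractive_summary(lecture))
```

Because the output is always a verbatim subset of the input, any sentence that needs rewording for a particular audience comes through unchanged; that is why abstractive and simplification models matter for accessibility.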
460 00:29:33,720 --> 00:29:35,760 Yeah, I think that's mainly what we 461 00:29:37,200 --> 00:29:40,840 have for text summarization. 462 00:29:40,840 --> 00:29:41,520 Thank you, Chaohai. 463 00:29:41,600 --> 00:29:43,920 Thank you. 464 00:29:43,920 --> 00:29:48,480 Okay, so we started with the challenges, 465 00:29:48,480 --> 00:29:53,240 and now I would like to move on to the future perspectives. 466 00:29:53,240 --> 00:29:58,440 What are the breakthroughs that you see happening, 467 00:29:58,960 --> 00:30:01,280 promoted by the use of NLP, 468 00:30:02,080 --> 00:30:04,320 for accessible communication? 469 00:30:04,680 --> 00:30:06,920 And we'll start with you now, Vikas. 470 00:30:10,960 --> 00:30:12,840 So my 471 00:30:12,840 --> 00:30:16,200 perspective is that there are plenty of NLP 472 00:30:16,200 --> 00:30:20,120 tools out there already that haven’t been exploited 473 00:30:20,120 --> 00:30:24,760 to the fullest extent to address accessibility and usability issues. 474 00:30:25,280 --> 00:30:27,960 The growth in NLP techniques 475 00:30:27,960 --> 00:30:33,080 and methods has been extremely steep in recent years, 476 00:30:33,080 --> 00:30:37,080 and the rest of us in different fields are trying to catch up. 477 00:30:37,560 --> 00:30:39,840 I mean, still, there is a lot to be explored 478 00:30:40,400 --> 00:30:43,320 as to how they can be used to address 479 00:30:43,320 --> 00:30:45,880 real-world accessibility problems, 480 00:30:46,360 --> 00:30:51,120 and we are in the process of doing that, I would say. So 481 00:30:51,120 --> 00:30:53,320 text summarization is one thing 482 00:30:54,240 --> 00:30:57,000 that we discussed already, which can be exploited 483 00:30:57,000 --> 00:31:01,840 in a lot of scenarios to improve the efficiency 484 00:31:01,840 --> 00:31:05,880 of computer interaction for people with disabilities.
485 00:31:06,240 --> 00:31:09,840 But the main problem, as we discussed not only in this panel 486 00:31:09,840 --> 00:31:12,120 but also in other panels, is the data. 487 00:31:12,120 --> 00:31:16,560 So for some languages there is enough of a corpus 488 00:31:16,560 --> 00:31:19,960 and the translation is good, because the translation 489 00:31:19,960 --> 00:31:23,280 essentially depends on how much data you have trained on. 490 00:31:23,640 --> 00:31:27,200 But for some pairs of languages it may not be that 491 00:31:28,200 --> 00:31:28,800 easy. 492 00:31:28,800 --> 00:31:31,680 Or even if it exists, it may not be that accurate. 493 00:31:31,680 --> 00:31:33,800 So that may be a problem. 494 00:31:33,800 --> 00:31:36,920 And then the biggest area I see, 495 00:31:37,480 --> 00:31:41,160 which can be very useful for solving 496 00:31:41,160 --> 00:31:46,560 many accessibility problems, is the improvement in dialog systems. 497 00:31:46,560 --> 00:31:49,560 So natural language dialog is 498 00:31:49,560 --> 00:31:52,000 a really intuitive interface for many 499 00:31:53,360 --> 00:31:56,840 users, including many people with disabilities: 500 00:31:57,360 --> 00:32:00,240 those who have physical impairments which 501 00:32:01,280 --> 00:32:05,040 prevent them from conveniently using the keyboard or the mouse, 502 00:32:05,040 --> 00:32:08,080 and those who are blind and have to use screen readers, 503 00:32:08,080 --> 00:32:11,040 which is known to be time consuming. 504 00:32:11,040 --> 00:32:15,080 So dialog assistants are, I would say, 505 00:32:16,320 --> 00:32:17,720 under-explored... people are still exploring them. 506 00:32:17,720 --> 00:32:21,840 We see that commercialization is going on, like on smartphones and all, 507 00:32:22,080 --> 00:32:26,240 but it's still at the level of high-level interactions 508 00:32:26,240 --> 00:32:30,840 like setting alarms or turning on lights and answering some questions.
509 00:32:31,280 --> 00:32:34,760 But what about using that to interact with applications, 510 00:32:34,760 --> 00:32:36,760 in the context of an application? 511 00:32:37,360 --> 00:32:39,480 So if I say, 512 00:32:41,040 --> 00:32:41,760 "add a user 513 00:32:41,760 --> 00:32:44,760 comment to this particular document", 514 00:32:44,760 --> 00:32:49,280 and it's in Word or Google Docs, can a spoken 515 00:32:49,280 --> 00:32:52,200 dialog assistant understand that in an automated 516 00:32:53,360 --> 00:32:55,440 way? This kind of automation 517 00:32:56,240 --> 00:33:01,240 will, I feel, address many of the issues 518 00:33:01,240 --> 00:33:04,640 that people face interacting with digital content. 519 00:33:04,640 --> 00:33:09,080 So that's one of the things I would say we can use NLP for. 520 00:33:09,480 --> 00:33:15,120 The other thing is the increased availability of large language 521 00:33:15,120 --> 00:33:20,280 models, pre-trained models like the one Lourdes mentioned, GPT, 522 00:33:20,840 --> 00:33:25,680 which is essentially a transformer decoder, a generator-based model. 523 00:33:25,680 --> 00:33:28,040 Then there's also BERT, which is encoder based. 524 00:33:28,440 --> 00:33:32,040 So these help us, you know, 525 00:33:32,120 --> 00:33:36,920 in a way that we don't need large amounts of data to solve problems, 526 00:33:36,920 --> 00:33:40,320 because they're already pre-trained on a large amount of data. 527 00:33:40,800 --> 00:33:44,560 So what we would need are kind of small 528 00:33:44,760 --> 00:33:49,800 datasets that are more fine-tuned towards the problem we are addressing. 529 00:33:50,520 --> 00:33:53,960 So for the datasets, the accessibility datasets,
530 00:33:53,960 --> 00:33:57,600 I think there needs to be a little bit more investment. 531 00:33:58,480 --> 00:34:02,960 It doesn't have to be that big, because the large language models 532 00:34:02,960 --> 00:34:06,640 already take care of most of the language complexity. 533 00:34:06,760 --> 00:34:08,200 It's more like fine-tuning 534 00:34:09,160 --> 00:34:10,560 for the problem at hand. 535 00:34:10,560 --> 00:34:14,400 So that's where I think some effort should go. 536 00:34:14,400 --> 00:34:18,800 And once we do that, obviously we can fine-tune and solve the problems. 537 00:34:18,800 --> 00:34:23,640 And then there is a tremendous enhancement or advancement 538 00:34:23,640 --> 00:34:27,920 in transfer learning techniques, which we can exploit as well, 539 00:34:28,440 --> 00:34:32,280 in order to not do stuff from scratch, and instead borrow 540 00:34:32,280 --> 00:34:35,640 some things that are already there for something different, 541 00:34:36,320 --> 00:34:38,280 I mean, a similar problem. 542 00:34:38,280 --> 00:34:43,880 So there is a lot to be explored, but we haven't done that yet. 543 00:34:43,880 --> 00:34:45,920 So there's plenty of opportunity 544 00:34:45,920 --> 00:34:48,960 for research using NLP expertise for 545 00:34:49,560 --> 00:34:53,400 problems in accessible communication, especially. 546 00:34:53,400 --> 00:34:56,160 Yes, definitely some exciting avenues there. 547 00:34:57,120 --> 00:35:00,680 So, Chaohai, can we have your take on this? 548 00:35:01,120 --> 00:35:04,080 What are your breakthroughs? 549 00:35:04,200 --> 00:35:04,960 OK. 550 00:35:04,960 --> 00:35:08,560 I just listened to Vikas, and I totally agree with 551 00:35:08,880 --> 00:35:09,240 him, 552 00:35:11,160 --> 00:35:13,200 with all his opinions. 553 00:35:13,200 --> 00:35:15,320 For my research, 554 00:35:15,720 --> 00:35:18,480 I've mainly worked on AAC, so, currently, 555 00:35:19,160 --> 00:35:22,160 I will take AAC as an example.
556 00:35:22,920 --> 00:35:26,120 So the future perspective for AAC, 557 00:35:27,040 --> 00:35:29,200 for NLP for AAC, I think the first 558 00:35:29,240 --> 00:35:32,920 will be personalized, adaptive 559 00:35:34,160 --> 00:35:37,480 communication for each individual, because 560 00:35:39,240 --> 00:35:42,520 each individual has their own way of communication, 561 00:35:43,640 --> 00:35:45,760 their own way to communicate with others, 562 00:35:46,120 --> 00:35:49,200 and NLP techniques can be used to make 563 00:35:50,320 --> 00:35:52,720 this communication more accessible, 564 00:35:52,920 --> 00:35:56,480 more personalized and adaptive, based on their 565 00:35:58,000 --> 00:36:01,200 personal preferences and feedback. 566 00:36:02,440 --> 00:36:06,520 So this can be used 567 00:36:06,840 --> 00:36:10,320 to personalize AAC symbols. 568 00:36:11,320 --> 00:36:14,440 Currently, AAC users 569 00:36:14,440 --> 00:36:19,080 just use some standard AAC symbol set for 570 00:36:19,160 --> 00:36:24,480 their daily communications. So how can we use NLP 571 00:36:25,680 --> 00:36:29,520 and generative AI models 572 00:36:30,240 --> 00:36:34,720 to create more customized, personalized AAC symbols 573 00:36:36,080 --> 00:36:37,440 that 574 00:36:37,440 --> 00:36:40,800 have the ability to adapt to an 575 00:36:42,720 --> 00:36:45,240 individual's unique cultural and social needs? 576 00:36:45,560 --> 00:36:48,240 I think that's one way to potentially 577 00:36:49,280 --> 00:36:52,280 contribute to AAC users.
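[Editor's note: a toy illustration of the personalization idea, not Chaohai's actual system. A next-symbol predictor can adapt to an individual simply by counting which symbol pairs that user actually produces; the symbol names and usage history below are invented for the example.]

```python
from collections import Counter, defaultdict

class PersonalSymbolModel:
    """Bigram model over one user's own AAC symbol history: suggests the
    symbols this user most often selects after the current one, so the
    interface adapts to their personal way of communicating."""

    def __init__(self):
        self.following = defaultdict(Counter)

    def observe(self, symbol_sequence):
        # Count each adjacent (previous symbol, next symbol) pair.
        for prev, nxt in zip(symbol_sequence, symbol_sequence[1:]):
            self.following[prev][nxt] += 1

    def suggest(self, current_symbol, k=2):
        return [s for s, _ in self.following[current_symbol].most_common(k)]

model = PersonalSymbolModel()
# Hypothetical usage history for one user.
model.observe(["I", "want", "drink", "water"])
model.observe(["I", "want", "play", "outside"])
model.observe(["I", "want", "drink", "juice"])
print(model.suggest("want"))  # → ['drink', 'play']
```

The same counting trick per user is the simplest form of the preference-and-feedback adaptation described above; a production system would combine it with a general language model rather than rely on counts alone.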
578 00:36:52,800 --> 00:36:56,160 The second one will be accessible 579 00:36:56,320 --> 00:36:59,440 multimodal communication, 580 00:37:00,120 --> 00:37:01,720 because 581 00:37:02,000 --> 00:37:03,920 NLP techniques 582 00:37:03,920 --> 00:37:07,560 have the potential to enhance 583 00:37:08,520 --> 00:37:11,600 accessible communication by improving 584 00:37:12,960 --> 00:37:16,320 the interoperability of training data 585 00:37:17,000 --> 00:37:22,160 between verbal language, sign language and AAC. So data 586 00:37:22,160 --> 00:37:26,280 interoperability could provide more high-quality training data for these 587 00:37:28,120 --> 00:37:28,960 languages and 588 00:37:28,960 --> 00:37:31,040 symbol sets, and 589 00:37:32,200 --> 00:37:35,400 additionally, 590 00:37:35,400 --> 00:37:39,720 it can provide the ability to translate between 591 00:37:40,280 --> 00:37:43,200 different communication modalities 592 00:37:43,800 --> 00:37:46,280 and make communication more accessible and inclusive. 593 00:37:47,480 --> 00:37:50,280 So in AAC, we can have 594 00:37:51,440 --> 00:37:55,080 multiple AAC symbol sets that can be linked, 595 00:37:55,640 --> 00:37:58,200 mapped and interlinked by NLP models, 596 00:37:58,680 --> 00:38:03,200 and this can contribute to translation from AAC to AAC, 597 00:38:03,520 --> 00:38:07,800 AAC to text, AAC to sign language and vice versa. 598 00:38:08,480 --> 00:38:13,880 Yeah, that's the second aspect I think about. 599 00:38:13,880 --> 00:38:15,240 And then the third one is 600 00:38:17,240 --> 00:38:19,720 AI-assisted communication; Vikas 601 00:38:19,720 --> 00:38:23,760 just talked about ChatGPT. 602 00:38:23,760 --> 00:38:26,760 So 603 00:38:26,760 --> 00:38:30,080 these large language models have been trained 604 00:38:30,080 --> 00:38:32,840 by these big companies 605 00:38:33,960 --> 00:38:35,680 and they have been spreading widely on social media.
606 00:38:35,680 --> 00:38:38,560 So how 607 00:38:38,560 --> 00:38:40,680 can we use these 608 00:38:40,680 --> 00:38:42,920 trained large 609 00:38:43,400 --> 00:38:48,360 language models, incorporate them into other applications, and then use them 610 00:38:48,360 --> 00:38:53,560 for more accessible communication to help people with disabilities? 611 00:38:54,000 --> 00:38:56,800 That's another 612 00:38:57,000 --> 00:38:57,560 future 613 00:38:57,560 --> 00:38:59,200 we are looking toward. 614 00:38:59,200 --> 00:39:01,320 The last one that I'm going to talk about 615 00:39:01,560 --> 00:39:04,880 is more regarding AAC, because it's quite expensive. 616 00:39:05,160 --> 00:39:07,960 So affordability is very important, 617 00:39:08,640 --> 00:39:13,080 and it can be improved by NLP 618 00:39:13,320 --> 00:39:18,120 or AI. As I mentioned, we are currently looking into 619 00:39:18,640 --> 00:39:23,320 how to turn images into symbols and how to generate 620 00:39:24,640 --> 00:39:28,200 AAC symbols automatically by using 621 00:39:29,520 --> 00:39:33,720 generative image AI models like Stable Diffusion. 622 00:39:34,840 --> 00:39:37,960 So that's another future direction we are looking at: 623 00:39:37,960 --> 00:39:41,280 how to reduce the cost of accessible communication. 624 00:39:42,000 --> 00:39:44,280 Yeah. Thank you. 625 00:39:44,280 --> 00:39:44,640 Thank you, 626 00:39:44,640 --> 00:39:47,520 Chaohai. Definitely a relevant point. 627 00:39:47,520 --> 00:39:52,040 Reducing the costs of getting data and all of that. 628 00:39:52,320 --> 00:39:54,120 That's important everywhere. 629 00:39:54,120 --> 00:39:57,120 So, Lourdes, what are you looking for 630 00:39:57,360 --> 00:40:01,840 in the near future? 631 00:40:01,840 --> 00:40:05,880 And you are muted.
632 00:40:05,880 --> 00:40:11,920 So, as we have mentioned before, there are two trends: 633 00:40:12,360 --> 00:40:16,680 the appearance of new language models, better than the previous ones, 634 00:40:17,120 --> 00:40:19,680 and working with these new models 635 00:40:20,120 --> 00:40:25,080 to reduce disability biases. 636 00:40:25,080 --> 00:40:29,040 Also, I am going to list specific natural language processing 637 00:40:29,040 --> 00:40:31,800 tasks and applications 638 00:40:32,080 --> 00:40:34,320 that I will work on in the coming years. 639 00:40:35,480 --> 00:40:39,960 One of them is accessibility for domain-specific 640 00:40:39,960 --> 00:40:41,680 tasks, such as health. 641 00:40:42,760 --> 00:40:44,720 The 642 00:40:44,720 --> 00:40:47,960 health domain is a highly demanded need, 643 00:40:48,960 --> 00:40:51,800 but patients have problems understanding 644 00:40:51,800 --> 00:40:57,240 information about their health condition, diagnosis, treatment, 645 00:40:57,520 --> 00:41:02,560 and natural language processing methods could improve their understanding 646 00:41:02,560 --> 00:41:05,440 of health-related documents. 647 00:41:06,400 --> 00:41:10,800 Similar problems appear in legal and financial documents, 648 00:41:10,800 --> 00:41:14,880 the language of administration, e-government... 649 00:41:15,480 --> 00:41:18,960 Current natural language processing technology 650 00:41:18,960 --> 00:41:27,240 that simplifies and summarizes these texts could help in this roadmap.
651 00:41:27,240 --> 00:41:29,480 And another 652 00:41:31,000 --> 00:41:33,720 line is speech to text. 653 00:41:35,040 --> 00:41:37,800 Speech to text will be a relevant area 654 00:41:37,800 --> 00:41:40,800 of research in the field of virtual meetings, 655 00:41:41,720 --> 00:41:46,440 in order to facilitate accessible communication by the generation 656 00:41:46,440 --> 00:41:50,120 of summaries of meetings, as well as minutes, 657 00:41:51,280 --> 00:41:53,640 in plain language. 658 00:41:53,960 --> 00:41:56,320 Another topic is the 659 00:41:57,240 --> 00:42:00,600 integration of natural language processing 660 00:42:00,600 --> 00:42:03,960 methods into the design 661 00:42:03,960 --> 00:42:07,320 and development of multimedia user interfaces. 662 00:42:08,400 --> 00:42:12,120 It is necessary to approach accessible communication 663 00:42:12,120 --> 00:42:17,000 from a multidisciplinary perspective, across different areas 664 00:42:17,000 --> 00:42:20,520 such as human-computer interaction, software engineering 665 00:42:20,840 --> 00:42:24,360 and natural language processing. 666 00:42:24,360 --> 00:42:29,280 Finally, another issue is advancing 667 00:42:29,280 --> 00:42:33,400 applications of smart assistants and natural language processing 668 00:42:33,920 --> 00:42:37,440 methods to support people with disabilities and the elderly, 669 00:42:38,520 --> 00:42:41,480 assist them in their daily tasks 670 00:42:41,760 --> 00:42:46,560 and promote active living. 671 00:42:46,560 --> 00:42:47,920 Okay, thank you so much, Lourdes, 672 00:42:47,920 --> 00:42:52,120 and every one of you for those perspectives. 673 00:42:52,520 --> 00:42:55,480 I guess we still have 5 minutes 674 00:42:55,480 --> 00:43:00,520 more in this session, so I will risk another question, 675 00:43:00,520 --> 00:43:05,040 and I will ask you to try to be brief on this one.
676 00:43:05,040 --> 00:43:09,720 But the need for data 677 00:43:09,720 --> 00:43:13,720 was common across all your interventions. 678 00:43:13,720 --> 00:43:17,480 And if we go back to the previous panel, also, 679 00:43:17,480 --> 00:43:21,360 it was brought up by every panelist. 680 00:43:21,360 --> 00:43:23,760 So yeah, we need data. 681 00:43:25,000 --> 00:43:26,880 What are your thoughts? 682 00:43:26,880 --> 00:43:29,600 How can we make it easier 683 00:43:30,120 --> 00:43:32,600 to collect more data 684 00:43:32,880 --> 00:43:37,120 for the specific aspect of accessible communication? 685 00:43:37,120 --> 00:43:38,960 Because we communicate a lot, right? 686 00:43:38,960 --> 00:43:41,280 Technology has 687 00:43:41,680 --> 00:43:43,720 allowed us, and opened up 688 00:43:44,320 --> 00:43:46,760 several channels where we can communicate 689 00:43:46,960 --> 00:43:49,960 even when we're not co-located. 690 00:43:50,520 --> 00:43:52,600 So yeah, every one of us is at a 691 00:43:53,160 --> 00:43:56,200 different point of the planet and we are communicating right now. 692 00:43:56,360 --> 00:44:01,440 Technology improved that possibility a lot. 693 00:44:01,880 --> 00:44:04,680 However, 694 00:44:04,680 --> 00:44:07,320 we always hear this: we need more data. 695 00:44:07,320 --> 00:44:08,520 We can't get data. 696 00:44:08,520 --> 00:44:11,760 So how do you think we can get more data? 697 00:44:13,160 --> 00:44:15,240 And of course we need 698 00:44:15,240 --> 00:44:17,760 the data to train these models, but 699 00:44:18,280 --> 00:44:21,760 can't we also rely on these models to generate data? 700 00:44:22,960 --> 00:44:25,320 So let me just 701 00:44:25,320 --> 00:44:28,560 drop this on you now, and 702 00:44:30,120 --> 00:44:31,320 do any of you want 703 00:44:31,320 --> 00:44:37,120 to go first? 704 00:44:37,120 --> 00:44:40,160 I can go first. Okay. Yeah. Yeah.
705 00:44:40,160 --> 00:44:43,320 We actually worked on open 706 00:44:43,320 --> 00:44:46,240 data four years ago, before, I mean, 707 00:44:47,440 --> 00:44:50,680 the AI and data science wave. When I started my PhD, 708 00:44:50,840 --> 00:44:55,720 we worked on open data, and there is an open data initiative in the UK. 709 00:44:56,040 --> 00:44:59,720 So we wanted to open our data: government data, 710 00:45:01,120 --> 00:45:05,120 public transport data. And I have 711 00:45:05,120 --> 00:45:09,200 long been working on public transportation for accessibility needs. 712 00:45:09,200 --> 00:45:11,400 So there was a lack of data 713 00:45:11,520 --> 00:45:13,240 at the beginning of my PhD, 714 00:45:13,240 --> 00:45:17,320 and a few years later, there is still a lack of accessibility information 715 00:45:18,000 --> 00:45:19,240 in this data. 716 00:45:19,240 --> 00:45:23,400 So how can we, 717 00:45:23,760 --> 00:45:28,920 I mean, in the accessibility area, how can we get such data 718 00:45:29,320 --> 00:45:32,000 to train our models? 719 00:45:32,000 --> 00:45:36,320 I mean, the first approach, what I used to do 720 00:45:36,320 --> 00:45:40,560 with public transport data, is that I mapped all the 721 00:45:41,520 --> 00:45:42,760 available data 722 00:45:42,760 --> 00:45:46,400 into a larger dataset. That incurred 723 00:45:46,480 --> 00:45:50,120 a lot of labor, like data cleaning, data integration 724 00:45:50,120 --> 00:45:54,560 and all these methods to make the data available. 725 00:45:55,120 --> 00:45:57,560 That's the first approach.
726 00:45:57,840 --> 00:46:00,960 The second is to think about how we can 727 00:46:02,200 --> 00:46:05,360 contribute to 728 00:46:05,360 --> 00:46:08,160 a data repository, something 729 00:46:08,160 --> 00:46:13,120 like ImageNet or WordNet, where we can collaborate 730 00:46:13,160 --> 00:46:15,520 to contribute identified 731 00:46:16,520 --> 00:46:19,320 data related to accessibility research. 732 00:46:20,160 --> 00:46:25,680 I think that, as a community, we can create such a universal 733 00:46:26,000 --> 00:46:28,320 repository or, 734 00:46:30,160 --> 00:46:32,200 yeah, some kind of data initiative 735 00:46:32,200 --> 00:46:35,480 for working on accessibility research. 736 00:46:36,240 --> 00:46:38,920 And the third approach is, definitely, 737 00:46:39,000 --> 00:46:42,440 that we can generate data based on the small data we have. 738 00:46:42,440 --> 00:46:45,400 We can use a 739 00:46:45,840 --> 00:46:48,720 generative AI model to generate more, 740 00:46:49,840 --> 00:46:53,000 but then the question is, is that data reliable? 741 00:46:53,080 --> 00:46:58,240 Is the generated data good enough, or is it biased? 742 00:46:58,640 --> 00:47:01,600 So yeah, that's my conclusion. 743 00:47:01,960 --> 00:47:02,680 Thank you. 744 00:47:03,400 --> 00:47:04,440 Yes, exactly. 745 00:47:04,440 --> 00:47:07,440 That's the big question mark, 746 00:47:07,440 --> 00:47:08,200 right? 747 00:47:08,360 --> 00:47:11,520 Is that synthetic data reliable or not? So, 748 00:47:12,000 --> 00:47:14,440 Vikas or Lourdes, do you want to add something? 749 00:47:15,480 --> 00:47:18,000 So yeah, I mean, I have used synthetic data 750 00:47:18,120 --> 00:47:21,000 before, based on 751 00:47:21,040 --> 00:47:24,240 a little bit of real data, and in some cases 752 00:47:24,240 --> 00:47:26,440 you can generate synthetic data.
753 00:47:26,680 --> 00:47:29,640 So one of the things I had to do was extract 754 00:47:30,400 --> 00:47:32,440 user comments from documents. 755 00:47:33,280 --> 00:47:37,480 Most of these word processing applications allow you to post comments 756 00:47:37,960 --> 00:47:42,800 to the right, for your collaborators to look at and then, you know, address them. 757 00:47:43,240 --> 00:47:47,360 So to automatically extract those, I had to generate synthetic data, 758 00:47:47,360 --> 00:47:51,800 because obviously you have few documents with collaborative comments. 759 00:47:52,320 --> 00:47:54,800 The appearance there is like, okay, 760 00:47:54,840 --> 00:47:58,360 comments will appear somewhere on the right side, in the right corner, 761 00:47:59,080 --> 00:48:02,400 and will have some text in them, a few sentences. 762 00:48:02,680 --> 00:48:04,680 So there are some characteristics. 763 00:48:04,680 --> 00:48:07,640 So in those cases we generated synthetic data. 764 00:48:07,640 --> 00:48:09,960 We trained the machine learning model. 765 00:48:09,960 --> 00:48:13,720 It was pretty accurate on this data, which was real data. 766 00:48:14,520 --> 00:48:16,200 So, 767 00:48:16,200 --> 00:48:19,320 in some cases you can exploit the 768 00:48:19,680 --> 00:48:23,840 way data will appear and then generate synthetic data. 769 00:48:23,920 --> 00:48:27,240 But in many cases it may not be possible. 770 00:48:27,480 --> 00:48:30,480 Like for the project I mentioned on social media, 771 00:48:30,480 --> 00:48:34,000 where text contains a lot of nonstandard words. 772 00:48:34,600 --> 00:48:39,800 Simply replacing the nonstandard words with synonyms may not do the job, 773 00:48:39,880 --> 00:48:43,920 because then you take the fun aspect away from social media. 774 00:48:44,280 --> 00:48:47,320 It should be as fun and entertaining 775 00:48:47,320 --> 00:48:51,120 when you listen to social media text as it is when you look at it.
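[Editor's note: a toy version of the synthetic-data trick Vikas describes. The page width, position thresholds and sentence counts are invented for illustration; the idea is simply that when you know the characteristic appearance of a comment box, you can generate labeled examples instead of collecting many real documents.]

```python
import random

PAGE_WIDTH = 800  # assumed page width in pixels, for illustration only

def synth_example(rng):
    """Produce one labeled region: either a comment box hugging the right
    edge with a few short sentences, or a main-body text region."""
    is_comment = rng.random() < 0.5
    if is_comment:
        x = rng.randint(int(PAGE_WIDTH * 0.75), PAGE_WIDTH - 120)  # right margin
        n_sentences = rng.randint(1, 3)
    else:
        x = rng.randint(0, int(PAGE_WIDTH * 0.5))  # main text column
        n_sentences = rng.randint(4, 12)
    return {"x": x, "n_sentences": n_sentences, "label": int(is_comment)}

def make_dataset(n, seed=0):
    rng = random.Random(seed)
    return [synth_example(rng) for _ in range(n)]

def predict(region):
    # Stand-in for a trained model: a rule matching the generated layout.
    # Real documents are noisier, which is why a learned model is used instead.
    return int(region["x"] > PAGE_WIDTH * 0.7 and region["n_sentences"] <= 3)

data = make_dataset(1000)
accuracy = sum(predict(r) == r["label"] for r in data) / len(data)
print(f"accuracy on synthetic data: {accuracy:.2f}")
```

The toy is deliberately cleanly separable; the transferable point is the generator, which encodes the known characteristics of the target layout so a classifier can be trained without a large hand-labeled corpus.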
776 00:48:51,640 --> 00:48:54,360 So you have to do some kind of clever, 777 00:48:55,360 --> 00:48:56,880 you know, replacement. 778 00:48:56,880 --> 00:49:01,040 And for that you need some kind of human expert going in and 779 00:49:03,000 --> 00:49:04,200 doing that. 780 00:49:04,200 --> 00:49:07,920 So crowdsourcing, I think, is one way to get data 781 00:49:07,920 --> 00:49:11,040 quickly, and it's pretty reliable. 782 00:49:11,440 --> 00:49:14,200 And I've seen, in the NLP community, 783 00:49:14,240 --> 00:49:19,560 NLP papers that appear in ACL, and they rely heavily on Amazon 784 00:49:20,160 --> 00:49:23,280 Mechanical Turk and other online, 785 00:49:26,160 --> 00:49:29,120 incentivized data collection mechanisms. 786 00:49:29,560 --> 00:49:31,720 So that, I think, is one thing. 787 00:49:31,720 --> 00:49:35,640 The other thing I do, you know, in my classes 788 00:49:35,640 --> 00:49:39,840 especially, is get the students to help each other out and collect data, 789 00:49:40,160 --> 00:49:43,560 so it doesn't have to be that intensive. 790 00:49:43,560 --> 00:49:47,400 If even one student collects, like, ten data points 791 00:49:47,880 --> 00:49:52,560 over the semester, that would be enough data for a lot of things. 792 00:49:52,560 --> 00:49:56,960 So, you know, across each other's projects, by the end of the course 793 00:49:57,120 --> 00:50:00,320 they'll pretty much have a lot of data for research. So, 794 00:50:01,320 --> 00:50:02,120 you know, 795 00:50:02,120 --> 00:50:06,040 everybody can contribute in a way, and students 796 00:50:06,040 --> 00:50:10,200 especially are much more reliable, because they are familiar 797 00:50:10,200 --> 00:50:14,840 with the mechanisms of how to label and collect data. 798 00:50:14,840 --> 00:50:18,520 And they can also understand how things work as well. 799 00:50:18,520 --> 00:50:22,760 So it’s like a win-win.
800 00:50:22,760 --> 00:50:25,000 Okay, yeah, thanks for that contribution. 801 00:50:25,480 --> 00:50:26,880 Good suggestion. 802 00:50:26,880 --> 00:50:31,040 And, Lourdes, we are really running out of time, 803 00:50:31,040 --> 00:50:34,120 but if you still want to intervene, 804 00:50:34,120 --> 00:50:36,200 I can give you a couple of minutes. 805 00:50:37,240 --> 00:50:37,760 Okay. 806 00:50:37,760 --> 00:50:40,720 Only that, I think, also, 807 00:50:40,720 --> 00:50:46,080 we need data, but my vision 808 00:50:46,080 --> 00:50:49,240 is also negative, because obtaining 809 00:50:49,240 --> 00:50:52,640 these datasets is expensive. 810 00:50:52,680 --> 00:50:56,760 In accessible communication, I work on simplification, and 811 00:50:57,960 --> 00:50:59,880 these data must be prepared 812 00:50:59,880 --> 00:51:01,960 by experts in accessibility. 813 00:51:03,360 --> 00:51:05,720 It is important that these data 814 00:51:07,240 --> 00:51:11,400 are validated by people with disabilities 815 00:51:11,680 --> 00:51:16,160 and that they use plain language resources, 816 00:51:16,880 --> 00:51:20,040 and so it is a problem 817 00:51:20,520 --> 00:51:25,640 to obtain data with quality. 818 00:51:26,880 --> 00:51:28,840 Okay, thank you so much, 819 00:51:28,840 --> 00:51:29,320 Lourdes. 820 00:51:29,320 --> 00:51:34,800 And a very big thank you to the three of you, Chaohai, 821 00:51:34,800 --> 00:51:36,680 Vikas and Lourdes. It 822 00:51:36,760 --> 00:51:38,760 was a really interesting panel; 823 00:51:39,480 --> 00:51:42,040 thank you so much for your availability and