CARLOS: I'm going to move on to introducing Shari Trewin. She is an engineering manager at Google, leading a team that develops assistive technologies. So I'm really looking forward to your vision of what's next, what the future holds for us in assistive AI. And, as we had yesterday, at the end of the keynote Jutta will join us and we'll have an even more interesting conversation between Shari and Jutta, making a really appetizing end to the keynote. So, Shari, the floor is yours.

SHARI TREWIN: All right. Thank you very much, Carlos. Okay, can you hear me okay?

CARLOS: Yes. Good.

SHARI TREWIN: All right. What a pleasure it is to participate in this symposium and hear from our opening keynote, Jutta, and all our panelists over the last two days. Thank you so much for inviting me. It's my privilege to finish us up now.

So yesterday Jutta grounded us all in the need to do no harm and talked about some of the ways we can think about detecting and avoiding harm. Today, I'm going to focus on digital accessibility applications of AI in general and ask: where next for assistive AI? Can you see my screen?

CARLOS: Yes, we can.

SHARI TREWIN: It just didn't show on mine. All good. So my name is Shari Trewin. I'm an engineering manager in Google's accessibility team. I'm also the past chair of the ACM's SIGACCESS, the Special Interest Group on Accessible Computing. My background is computer science and AI, and I've been thinking about the ways that AI plays into accessibility for many years. Much of my work and thinking on AI and AI fairness was done when I worked at IBM, where I was program director for IBM Accessibility. So shout out to any IBM friends in the audience.

At Google, my team has a focus on developing new assistive capabilities, and as we've been discussing for the last few days, AI has an important role to play.

So there's been a lot of buzz in the news lately, both excitement and alarm, about generative AI, especially these large language models.
So for example, the ChatGPT model from OpenAI has been in the news quite a bit. In case you haven't played with it yet, here's an example. I asked ChatGPT: how will AI change digital accessibility? Let's try to get it to write my talk for me. And it responded with a pretty positive viewpoint. It said AI has the potential to significantly improve digital accessibility for people with disabilities, and here are a few ways that AI could contribute to this goal.

It went on to list four examples of transformative AI. All of these have been major topics at this symposium, and for each one it gave a one or two sentence explanation of what it was and who it's helpful for. Finally, it concluded that AI has the potential to make digital content and devices more accessible to people with disabilities, allowing them to fully participate in the digital world.

It seems pretty convincing and well written. Perhaps I should just end here and let AI have the last word. But, you know, it's kind of mind-blowing, even though it tells pretty terrible jokes, and this is what it can do without explicitly being connected to any source of truth. But it does sometimes get things flat-out wrong, and there's a risk of bias in the training data being reflected in the predictions. And this limits the ways that we can apply this technology today. But it also gives us a glimpse into the future.

I'm not going to take medical advice from a generative AI model yet, but as we get better at connecting this level of language fluency with knowledge, improving accuracy, detecting and removing bias, this opens up so many new possibilities for interaction models and ways to find and consume information in the future. So I'll come back to that later.

For today's talk, I'm going to slice the topic a little bit differently. I want to focus on some of the general research directions that I see as being important for moving digital accessibility forward with AI.

So in our opening keynote, Jutta laid out some of the risks that can be associated with AI
if it's not created and applied with equity and safety in mind, and it's important to keep these considerations in mind as we move forward with AI. When the benefits of AI do outweigh the risks in enabling digital access, we still have a way to go in making these benefits available to everyone, in fact, to make them accessible. So I'll start by talking about some current efforts in the direction of making assistive AI itself more inclusive.

The second topic I want to cover is where we choose to apply AI, focusing on what I call AI at source. And finally, web accessibility work emphasizes the need to shift left, that is, to bake accessibility in as early as possible in the development of a digital experience. So I'll discuss some of the places where AI can help with that shift left, and highlight both opportunities and important emerging challenges that we have for web accessibility.

So we know that AI has already changed the landscape of assistive technology. One research direction is: how do we make these AI models more inclusive?

And I want to start with a little story about captions. In 2020, I was accessibility chair for a very large virtual conference. We provided a human captioner who was live transcribing the sessions in a separate live feed. So I'm showing an image of a slide from a presentation here, with a transcription window to the right.

I spoke with a hard of hearing attendee during the conference who used captions to supplement what he could hear, and he told me that the live feed had quite a delay. So he was also using automated captions that were being streamed through the conference provider. Let's add them to this view, highlighted in green. These had a little less delay, but they had accuracy problems, especially for non-native speakers or people with atypical speech, and especially for people's names or technical terms. The important parts.

So he also turned on the automated captions in his browser, which used a different speech-to-text engine.
I've added those to the screen too. And he supplemented that with an app on his phone, using a third, different speech recognition engine, capturing the audio as it was played from his computer and transcribing it. So that's four sources of captions to read. None of them was perfect, but he combined them to triangulate interpretations where the transcripts seemed to be wrong.

So we could say AI-powered captions were helping him to access the conference, no doubt about it. But it wasn't a very usable experience. He was empowered, but he also had a huge burden in managing his own accessibility. And there were still gaps. As Michael Cooper pointed out yesterday, imperfect captions and descriptions can provide agency, but can also mislead users and waste their time.

I also want to point out that this particular user was in a really privileged position, because he knows about all these services, he has devices powerful enough to stream all these channels, he has good Internet access, he has a smartphone, and he has the cognitive ability to make sense of this incredible information overload. This really isn't equitable access, right? And the captions themselves weren't providing equitable representation of the conference speakers: those with atypical speech were at a disadvantage in having their message communicated clearly. So there's an important gap to be filled.

One of the current limitations of automated captions is poor transcription of people with atypical speech, especially when they're using technical or specialized language. So for example, Dimitri Kanevsky is a Google researcher and inventor. He's an expert in optimization and algebraic geometry, among many other topics. He's Russian and deaf, both of which affect his English speech. I'm going to play a short video clip of Dimitri.

So, Dimitri said Google has very good general speech recognition, but if you do not sound like most people, it will not understand you. And on the screen a speech engine transcribed that last part of the sentence as
"but if you look at most of people it will look and defend you."

So, people with disabilities that impact speech, such as cerebral palsy, stroke, Down syndrome, Parkinson's, ALS, are also impacted by lack of access to speech recognition, whether it's for controlling a digital assistant, communicating with others, or creating accessible digital content. Oh, I want to go to the next slide.

So Google's Project Euphonia set out to explore whether personalized speech recognition models can provide accurate speech recognition for people like Dimitri with atypical speech. And this is a great example of the way research can help to move the state of the art forward. The first challenge, as many people have mentioned already today, was a lack of suitable speech data. Project Euphonia collected over a million utterances from individuals with speech impairments. And then the researchers built individual models for 432 people and compared them to state-of-the-art general models.

They found that the personalized models could significantly reduce word error rates: the error rates went from something like 31% with the general models down to 4.6%. So it's not just a significant improvement, it's enough of an improvement to get to a high enough point to make the technology practically useful. And in fact, they even found that these personalized models could sometimes perform better than human transcribers for people with more severely disordered speech.

So here's an example of Dimitri using his personal speech recognition model. You can see the... So the transcription this time is "make all voice interactive devices be able to understand any person speak to them." It's not perfect, but it's much, much more useful.

Project Euphonia started in English, but it's now expanding to include Hindi, French, Spanish and Japanese.

So that project demonstrated how much better speech recognition technology could be. But the original data wasn't shareable outside Google, and that limited the benefits of all that data gathering effort.
So the Speech Accessibility Project at the University of Illinois is an example of what we might do about that problem. It's an initiative to make a dataset for broader research purposes. It was launched in 2022, and it's a coalition of technologists, academic researchers and community organizations. The goal is to collect a diverse speech dataset for training speech recognition models to do better at recognizing atypical speech.

It's building on some of the lessons learned in Project Euphonia, paying attention to ethical data collection. So individuals are paid for participating, and their samples are de-identified to protect privacy. The dataset is private. It is managed by UIUC and will be made available for research purposes, and this effort is backed by very broad cross-industry support from Amazon, Apple, Google, Meta and Microsoft. It's going to enable both academic researchers and industry partners to make progress.

Although the current work is focused on speech data, this is in general a model that could be used for other data that's needed to make models more inclusive. We could think of touch data, and there are already significant efforts going on to gather sign language video data for sign language translation.

And Project Relate is an example of the kind of app that can be developed with this kind of data. It's an Android app that provides individuals with the ability to build their own personalized speech models and then use them for text-to-speech, for communication, and for communicating with home assistants.

Personalized speech models look really promising, and potentially a similar approach could be taken to build personalized models for other things like gesture recognition, touch screen interactions, or interpreting inaccurate typing. I think there's a world of opportunity there that we haven't really begun to explore.

So now that we know we can build effective personal models from just a few hundred utterances, can we learn from this how to build more inclusive general models? That would be a really important goal.
Can we improve the performance even further by drawing on a person's frequently used vocabulary? Can we prime models with vocabulary from the current context? And as Shivam Singh mentioned yesterday, we're beginning to be able to combine text, image and audio sources to provide a richer context for AI to use. So there's very fast progress happening in all of these areas. Just as another example, the best student paper at the ASSETS 2022 conference used vocabularies that were generated automatically from photographs to prime the word prediction component of a communication system, for a more efficient conversation around those photographs.

Finally, bring your own model. I really agree with Shaomei Wu when she said yesterday that use cases of media creation are under-investigated. We can apply personalized models in content creation. Think about plugging in your personal speech model to contribute captions for your livestreamed audio for this meeting. The potential is huge, and web standards might need to evolve to support some of these kinds of use cases.

Okay, next part. When we talk about assistive AI, we're often talking about technologies that are being applied at the point of consumption, helping an individual to overcome accessibility barriers in digital content or in the world. And I want to focus this section on AI at source and why that is so important. Powerful AI tools in the hands of users don't mean that authors can forget about accessibility, and we've been talking about many examples of this through this symposium. But here are a few that appealed to me.

So I'm showing a figure from a paper. The figure is captioned "User response time by authentication condition", and the figure itself is a box plot. It shows response times from an experiment for six different experimental conditions. So it's a pretty complex figure. And if I'm going to publish this in my paper and make my paper accessible, I need to provide a description of this image. There's so much information here.
When faced with this task, about 50% of academic authors resort to simply repeating the caption of the figure. And this is really no help at all to a blind scholar; they can already read the caption, which is in text. Usually the caption is saying what information you'll find in the figure, but it's not giving you the actual information that's in the figure.

Now, as we discussed in yesterday's panel, that blind scholar reading my paper could use AI to get a description of the figure, but AI doesn't really have the context to generate a good description. Only the author knows what's important to convey. At the same time, most authors aren't familiar with the guidelines for describing images like this, and writing a description can seem like a chore. That's why I really love the ideas that Amy Pavel shared yesterday for ways that a tool could help content creators with their own description tasks, perhaps by generating an overall structure or an initial attempt that a person can edit. I mean, there are existing guidelines for describing different kinds of chart. Why not teach AI how to identify different kinds of chart and generate a beginning description? And Shivam Singh was talking yesterday as well about recent progress in this area.

So ideally, AI could refine its text in an interactive dialog with the author, and then the resulting description would be provided in the paper, and anyone could access it whether or not they had their own AI. So that's what I mean by applying AI at source: there's a person with the context to make sure the description is appropriate, and that can provide a better description.

Of course, it can only provide one description. There is also an important role for image understanding that can support personalized exploration of images, so that a reader could query information that wasn't available in a short description, like what were the maximum and minimum response times for the gesture condition in this experiment. I'm not saying that AI at source is the only solution, but it's important, and perhaps it's an underdeveloped piece.
Here's the second example. I love examples. As we were just talking about in the earlier panel, text transformations can make written content more accessible. So, for example, using literal language is preferable for cognitive accessibility. An idiom like "she was in for a penny, in for a pound" can be hard to spot if you're not familiar with that particular idiom, and it can be very confusing if you try to interpret it literally. Content authors might use this kind of language without realizing.

Language models could transform text to improve accessibility in many ways, and one is by replacing idioms with more literal phrasing. So here I asked a language model to rephrase the sentence without the idiom, and it came up with a very sensible, although a little complex, literal replacement: "She decided to fully commit to the situation no matter the cost."

Again, this could be applied as a user tool and as a tool for authors, to help them identify where their writing could be misinterpreted. The one puts the onus on the consumer to bring their own solution, apply it, and be alert for potential mistakes. The other fixes the potential access problems at source, where the author can verify accuracy. And as I mentioned earlier, because today's large language models are not connected to a ground truth and they do have a tendency to hallucinate, applying them at source is one way to reap the benefits much more quickly without risking harm to vulnerable users. Once we can connect language models to facts, or connect speech to the domain of discourse, we will really see a huge leap in performance, reliability and trustworthiness.

So in the previous two examples, AI could be applied at source. What about when the AI has to be on the consumer side, like when using text-to-speech to read out text on the web? On the screen here is the start of the Google information sidebar about Edinburgh, the capital city of Scotland. There's a heading, a subheading and the main text paragraph.
Text-to-speech is making huge advances, with more and more natural-sounding voices becoming available and the capability of more expressive speech, which itself makes comprehension easier. And expressiveness can include things like adjusting the volume and the prosody. When reading a heading, maybe I would naturally read it a little louder and pause afterwards.

For a TTS service to do the best job reading out text on the web, it helps to have the semantics explicitly expressed, for example the use of heading markup on "Edinburgh" in this passage. It's also important that domain-specific terms and people's names or place names are pronounced correctly. Many people not from the UK, on first sight, will mispronounce Edinburgh.

Web standards, if they're applied properly, can mark up the semantics, like headings, and the pronunciation of specialized or unusual words, helping the downstream AI to perform better. In fact, AI could also be used to identify the intended structure and compare it against the markup, or to identify unusual words or acronyms where pronunciation information could be helpful. And then the passage can be read appropriately by your preferred text-to-speech voice, at your preferred speed and pitch. The same information can also be used by a speech-to-text model, to marry the vocabulary on the page with what you're saying as you interact with the page using voice controls.

So I'm showing this example to illustrate that web accessibility standards work together with assistive AI techniques to enable the best outcome, and many uses of assistive technology can benefit from this information. So thinking about applying AI at source, there's an important role here for AI that makes sure that the visual and structural DOM representations are aligned.

So I just want to reiterate the powerful benefits of applying AI at authoring time that these examples illustrate. First, it removes the burden from people with disabilities to supply their own tools to bridge gaps.
Secondly, it benefits more people, including those who don't have access to the AI tools, people with low-end devices, poor internet connectivity, or less technology literacy. Thirdly, content creators can verify the accuracy and safety of suggestions, mitigating harms from bias or errors, because they have the context. And AI can also potentially mitigate harms in other ways, for example by flagging videos, images or animations that might trigger adverse health consequences for some people, like flashing lights.

So AI at source is likely to reach more people than AI provided by end users. I think this is how we'll get the most benefit for the least harm. It's also a huge opportunity to make accessibility easier to achieve. AI can make it much quicker and easier to generate accessibility information like captions or image descriptions, as we've discussed, and lowering the barrier to entry with assistive tools is one way to encourage good accessibility practice. AI can proactively identify where accessibility work is needed and evaluate designs before even a line of code has been written.

But perhaps the biggest opportunity, and the greatest need for our attention, is the use of AI to generate code, which brings us to the final section of this talk.

So in the previous section we talked about ways that AI can be applied in content creation to help build accessibility in, but AI itself is also impacting the way websites are designed and developed, independent of accessibility. So in this section, let's think about how this change will impact our ability to bake accessibility in, and whether we can use AI to help us.

As accessibility advocates, we have long been pushing the need to shift left, and by that we mean paying attention to accessibility right from the start of a project: when you're understanding the market potential, when you're gathering requirements, when you're understanding and evaluating risks, developing designs, and developing the code that implements those designs.
In a reactive approach to accessibility, which is too often what happens, the first attention to accessibility comes when automated tools are run on an already implemented system. Even then, such tools don't find all issues, and may not even find the most significant ones, which can lead teams to prioritize poorly. So with that reactive approach, teams can be overwhelmed with hundreds or even thousands of issues late in their process and have difficulty tackling them. It makes accessibility seem much harder than it needs to be.

In this morning's panel, we discussed ways that AI can be used in testing to help find accessibility problems. AI is also already being used earlier in the process by designers and developers. In development, for example, GitHub Copilot is an AI model that makes code completion predictions, and GitHub claims that in files where it's turned on, nearly 40% of code is being written by GitHub Copilot in popular coding languages. There are also systems that generate code from design wireframes, from high-resolution mockups, or even from text prompts.

So it's incumbent on us to ask what data those systems are trained on. In the case of Copilot, it's trained on GitHub open source project code. So what's the probability that this existing code is accessible? We know that we still have a lot of work to do to make digital accessibility the norm on the web; today it's the exception.

And many of you probably know WebAIM does an annual survey of the top million website home pages. It runs an automated tool and reports the issues that it found. Almost 97% of the million pages had accessibility issues, and that's only the automatically detectable ones. They found an average of 50 issues per page, and they also found that page complexity is growing significantly. Over 80% of the pages they looked at had low-contrast text issues. More than half had alternative text missing for images, and almost half had missing form labels.
So even though these are issues that are easy to find with the automated tools we have today, they're still not being addressed. These are very basic accessibility issues, and they're everywhere. So we know what this will surely mean for AI models learning from today's web. And here's an example of how this might be playing out already.

Code snippets are one of the most common things that developers search for, and a large language model can come up with pretty decent code snippets. This is a game changer for developers, and it's already happening. So let's say a developer is new to Flutter. Flutter is Google's open source mobile app development platform. They want to create a button labeled with an icon, known as an icon button. So on this slide is the code that ChatGPT produced when asked for Flutter code for an icon button. Along with the code snippet, it also provided some explanation, and it even linked to the documentation page. So it's pretty useful. And the code it gave for an icon button includes a reference to what icon to use and a function to execute when the button is pressed.

There's really just one important difference between this example generated by ChatGPT and the example given in the Flutter documentation. ChatGPT didn't include a tooltip, which means there's no text label associated with this button. That's an accessibility problem. Let's give it credit: ChatGPT did mention that it's possible to add a tooltip, but developers look first at the code example. If it's not in the example, it's easily missed. But in the training data here, it seems the tooltip was not present enough of the time for it to surface as an essential component of an icon button.

So, you know, there's lots of example code available online, but how much of that code demonstrates accessible coding practices? Given the state of web accessibility, it's likely the answer is not much. So our AI models are not going to learn to generate accessible code. It's just like the societal bias of the past being entrenched in training sets of today.
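To make the tooltip difference concrete, here is a minimal Dart sketch of an icon button with the label included. This is not the exact snippet from ChatGPT or the Flutter documentation; the icon, tooltip text and handler are placeholders chosen for illustration.

```dart
import 'package:flutter/material.dart';

// Minimal sketch of an accessible Flutter icon button.
// The tooltip gives the button a text label that screen readers announce
// and that sighted users see on hover or long-press; omitting it leaves
// the button unlabeled, which is the accessibility gap described above.
class VolumeUpButton extends StatelessWidget {
  const VolumeUpButton({super.key});

  @override
  Widget build(BuildContext context) {
    return IconButton(
      icon: const Icon(Icons.volume_up), // placeholder icon
      tooltip: 'Increase volume',        // the one-line label that is easy to miss
      onPressed: () {
        // placeholder action
      },
    );
  }
}
```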
The past lack of accessibility could be propagated into the future. So here we have an opportunity and a potential risk. AI can help to write accessible code, but it needs to be trained on accessible code, or augmented with tools that can correct accessibility issues.

And I think it's important to point out as well that I deliberately used an example in a framework rather than an HTML example, because that's what developers are writing in these days. They're not writing raw HTML, they're writing in frameworks, and there are many, many different frameworks, each with their own levels of accessibility and ways to incorporate accessibility.

So one thing is that the theme of this morning, about data being really essential, comes up here again. Do we have training data to train a code prediction model, perhaps with transfer learning, to generate more accessible code? Do we even have test sets with which we can test code generation for its ability to produce accessible code? So when we're developing datasets for either training or testing, we have to think in terms of the diversity of frameworks and methods that developers are actually working with, if we want to catch those issues at the point of creation.

Again, where AI is generating code for a whole user interface based on a visual design, we need to be thinking about what semantics that design tool should capture to support the generation of code with the right structure, the right roles for each area, the basic fundamentals of accessibility.

So a final call to action for the community here is to think about what we need to do here, whether it is advocacy, awareness raising, research, data gathering, standards, or refining models to write accessible code. This technology is still really young. It has a lot of room for improvement. This is a perfect time for us to define how accessibility should be built in and to experiment with different ways.
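As an aside, here is a small sketch of the kind of framework-specific accessibility support mentioned above: Flutter lets authors attach or suppress semantics explicitly. The widget names are real Flutter APIs, but the labels and asset paths are hypothetical, chosen only for illustration.

```dart
import 'package:flutter/material.dart';

// Sketch of framework-level accessibility hooks in Flutter.
// Semantics attaches an explicit label for assistive technologies;
// ExcludeSemantics hides purely decorative content from them.
class FigurePanel extends StatelessWidget {
  const FigurePanel({super.key});

  @override
  Widget build(BuildContext context) {
    return Column(
      children: [
        Semantics(
          label: 'Box plot of user response time by authentication condition',
          child: Image.asset('assets/response_times.png'), // hypothetical asset
        ),
        ExcludeSemantics(
          child: Image.asset('assets/divider.png'), // decorative only
        ),
      ],
    );
  }
}
```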
And, you know, in my opinion, this, perhaps more than anything, is the trend that we need to get in front of as an accessibility community, before the poor practices of the past are entrenched in the automated code generators of the future. AI is already shifting left, so we must make sure accessibility goes with it.

So to summarize: we can broaden access to assistive AI through personalization. To get the benefits of AI-based empowerment to all users, we should make sure that AI integration with authoring tools and processes is applied where it can be, to make it easier to meet accessibility standards and to improve the overall standard. Born accessible is still our goal, and AI can help us get there if we steer it right. As a community we have a lot of work to do, but I'm really excited about the potential here.

So thank you all for listening. Thanks to my Google colleagues and the IBM accessibility team for feedback and ideas and great conversations. And now I want to invite Jutta to join, and let's have the conversation.

JUTTA: Thank you, Shari. I really appreciate your coverage of authoring and the prevention of barriers, and the emphasis on timely, proactive measures. There may be an opportunity, actually, to relook at authoring environments, etc., within W3C.

SHARI TREWIN: Yeah, actually, just to respond to that really quickly, I do wonder: should we be focusing on evaluating frameworks more than evaluating individual pages? You know, would we get more bang for our buck if that was where we paid attention?

JUTTA: Yes, exactly. And especially as these tools are now also assisting authors, which was part of what the authoring tool standards were looking at: prompting, providing the necessary supports, and making it possible for individuals with disabilities to also become authors of code and to produce code. The greater participation of the community, I think, will create some of that culture shift. So thank you very much for covering this.
So in terms of the questions that we were going to talk about, you had suggested that we might start with one of the thorny questions that was asked yesterday that we didn't get time to respond to. The question was: do you think that big companies such as Google and Meta driving research in AI can be problematic with respect to societal issues which don't necessarily garner the highest revenue? And if so, how do you think we can approach this?

SHARI TREWIN: Yeah, thank you, Jutta, and thank you to the person who asked that question too. It's true that company goals and society can pull in different directions. I do think there are benefits to having big companies working on these core models, because they often have better access to very large datasets, and that can bring breakthroughs that others can then share, which can help raise the tide that lifts all boats, in a way. But advocacy and policy definitely have an important role to play in guiding the application of AI, the direction of AI research, and the ways that it's applied. Also, I wanted to say one approach here could be through initiatives like the Speech Accessibility Project that I talked about. That's an example of big tech working together with advocacy groups and academia to create data that can be applied to many different research projects, and that's a model that we could try to replicate.

JUTTA: I mean, you've talked quite a bit about the opportunity for personalization, and of course one of the biggest issues here is that large companies are looking for the largest population and the largest profit, which means the largest customer base, which tends to push them away from thinking about minorities, diversity, etc. But the training models and the personalization strategies that you've talked about are things that are emerging possibilities within large language models. We have the opportunity to take what has already been done generally and apply more personalized, smaller data sets, etc.
Do you think there's a role for the large companies to prepare the ground, and then for the remaining issues to piggyback on that with new training sets? Or do you think even there we're going to have both cost and availability issues?

SHARI TREWIN: Well, you know, I think that the model that you described is already happening, in places like the Speech Accessibility Project. The ultimate goal would be to have one model that can handle more diverse datasets, and it takes a concerted effort to gather that data. But if a community gathered the data and it was possible to contribute that data, then that's another direction in which we can influence the larger models that are trained on large data. But personalization is, I think, going to be very important for tackling some of that tail end. Personalization is not just an accessibility benefit. There are lots of tail populations, small populations that add up to a large number and a lot of people. So I think the big companies benefit greatly by exploring these smaller populations and learning how to adapt models to different populations. And then, as I mentioned, the ultimate goal would be to learn how to fold that back in to a larger model without it being lost in the process.

JUTTA: Yeah, we have the dilemma that the further you are from the larger model, the more you need to work to shift it in your direction. That is something I think will need to be addressed: whatever personalization happens, the people that need the personalization the most will have the greatest difficulty with the personalization. Do you think there are any strategies that might be available for us to use to address that particular dilemma?
718 00:47:41,040 --> 00:47:43,800 Yeah, you're touching my heart with that question, 719 00:47:43,800 --> 00:47:46,640 because I really... 720 00:47:47,680 --> 00:47:51,560 that's been an ongoing problem in accessibility forever, 721 00:47:52,040 --> 00:47:55,840 and not just in the context of AI: 722 00:47:56,000 --> 00:47:59,040 the people who would benefit the most 723 00:47:59,040 --> 00:48:04,920 from personalization may be in a position that makes it hard to discover 724 00:48:04,920 --> 00:48:09,080 and activate even personalization that's already available. 725 00:48:09,920 --> 00:48:12,200 So one approach that works, I think, in 726 00:48:12,200 --> 00:48:15,840 some contexts is dynamic adaptation, 727 00:48:16,160 --> 00:48:21,440 where instead of a person needing to adapt to a system, 728 00:48:21,440 --> 00:48:25,320 the system can flexibly adapt to the person that's using it. 729 00:48:26,040 --> 00:48:28,960 And I think that works in situations where 730 00:48:30,480 --> 00:48:32,560 the person doesn't need to behave any differently 731 00:48:32,600 --> 00:48:35,120 to take advantage of that adaptation. 732 00:48:36,800 --> 00:48:38,320 It doesn't work so well where 733 00:48:38,320 --> 00:48:40,600 there's maybe a specific 734 00:48:41,440 --> 00:48:44,360 input method that you might want to use, 735 00:48:44,360 --> 00:48:47,520 that would be beneficial, where you need to do something different. 736 00:48:47,840 --> 00:48:52,240 So for language models, maybe we can imagine 737 00:48:53,200 --> 00:48:55,520 an overarching "uber" language model that 738 00:48:56,560 --> 00:49:01,400 first recognizes, "Oh, this person's speech is closest to this sub-model 739 00:49:01,400 --> 00:49:06,200 that I have learned, and I'm going to use that model for this person." 740 00:49:06,200 --> 00:49:10,120 And you could think of that in terms of decreasing the 741 00:49:10,120 --> 00:49:10,960 distance, yeah. 742 00:49:11,440 --> 00:49:12,920 Yeah, yeah. 743 00:49:13,920 --> 00:49:15,480 So that's one idea. 744 00:49:15,480 --> 00:49:17,800 What do you think? 745 00:49:17,800 --> 00:49:19,320 Yeah. 746 00:49:19,320 --> 00:49:22,320 I'm wondering whether there is an opportunity, 747 00:49:22,600 --> 00:49:26,120 or if an opportunity will ever be taken, 748 00:49:26,560 --> 00:49:30,960 to rethink just how we design, 749 00:49:31,280 --> 00:49:35,720 what design decisions we make, and how we develop 750 00:49:35,720 --> 00:49:39,160 and bring these systems to market, 751 00:49:39,560 --> 00:49:42,680 such that there is the opportunity for greater 752 00:49:43,880 --> 00:49:45,640 democratization or 753 00:49:45,640 --> 00:49:48,320 access to the tools, and that 754 00:49:50,000 --> 00:49:55,160 we don't begin with the notion of let's 755 00:49:56,480 --> 00:49:58,880 design first for the majority and then think about... 756 00:49:59,160 --> 00:50:01,760 I mean, this is an inflection point. 757 00:50:01,760 --> 00:50:06,160 There is an opportunity for small data sets, zero-shot training, 758 00:50:06,160 --> 00:50:10,080 etc., transfer and transformation. 759 00:50:10,080 --> 00:50:14,800 Is this a time when we can have a strategic push to say, 760 00:50:15,440 --> 00:50:17,840 let's think about other ways 761 00:50:17,840 --> 00:50:22,240 of actually developing these tools and releasing these tools?
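A minimal sketch of the sub-model routing idea Shari describes above (an overarching model that first recognizes which adapted sub-model a person's speech is closest to, and then uses that model). All names here, such as embed_speaker and SubModel, are hypothetical placeholders rather than a real API, and the distance threshold is an assumption.

    # Hypothetical sketch: route a speaker to the adapted sub-model whose training
    # population their voice is closest to, falling back to a general model otherwise.
    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class SubModel:
        name: str
        centroid: np.ndarray  # mean speaker embedding of the population this model was adapted for
        recognizer: object    # the adapted speech recognizer itself (placeholder)

    def embed_speaker(audio: np.ndarray) -> np.ndarray:
        # Stand-in for a real speaker-embedding extractor (e.g. a d-vector model).
        raise NotImplementedError

    def pick_submodel(audio: np.ndarray, submodels: list, general: SubModel,
                      max_distance: float = 1.0) -> SubModel:
        # "Decreasing the distance": pick the closest adapted sub-model, but keep
        # the general model when no adapted population is a good match.
        v = embed_speaker(audio)
        best = min(submodels, key=lambda m: float(np.linalg.norm(v - m.centroid)))
        return best if float(np.linalg.norm(v - best.centroid)) <= max_distance else general

The design choice this illustrates is that the person does not have to do anything differently: the routing happens from their ordinary speech, which is what makes it a form of dynamic adaptation.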
762 00:50:23,120 --> 00:50:25,760 Maybe that's a little too idealistic; 763 00:50:26,560 --> 00:50:29,480 I don't know what you're thinking. Is there? 764 00:50:29,480 --> 00:50:32,600 Yeah, I think especially 765 00:50:32,720 --> 00:50:36,200 if you're in a domain where you've identified that there's 766 00:50:37,400 --> 00:50:40,440 a real and strong risk of bias, 767 00:50:41,000 --> 00:50:45,480 it should be a part of the design process 768 00:50:45,800 --> 00:50:51,560 to include people who would be outliers, people 769 00:50:51,560 --> 00:50:55,760 who are going to test the boundaries of what your solution can do, 770 00:50:56,560 --> 00:50:58,080 people that are going to help you 771 00:50:59,720 --> 00:51:02,240 understand the problems that it might introduce. 772 00:51:02,640 --> 00:51:06,680 So it's 773 00:51:07,720 --> 00:51:08,720 what should happen, 774 00:51:08,720 --> 00:51:12,560 I think, in the design of any system: 775 00:51:12,560 --> 00:51:16,400 especially if you're working in AI, you need to think about 776 00:51:16,880 --> 00:51:21,280 the risks that you might be introducing, and you can't really think about that 777 00:51:21,320 --> 00:51:25,160 without having the right people involved. 778 00:51:25,160 --> 00:51:25,480 Right... 779 00:51:25,480 --> 00:51:29,000 Whether that's by... somebody yesterday, I think, mentioned 780 00:51:29,000 --> 00:51:30,960 something about 781 00:51:32,000 --> 00:51:33,840 teaching designers 782 00:51:33,840 --> 00:51:37,280 and developers more about accessibility. 783 00:51:37,280 --> 00:51:40,640 And I think that's a really important point too. 784 00:51:41,120 --> 00:51:44,800 Building diverse teams is really important. 785 00:51:44,920 --> 00:51:48,720 Getting more diversity into computer science is really important. 786 00:51:49,120 --> 00:51:53,880 But teaching the people who are already there building things is also important. 787 00:51:54,440 --> 00:51:57,000 And I don't meet very many 788 00:51:57,000 --> 00:52:00,160 people who say, "Oh, I don't care about accessibility. 789 00:52:00,320 --> 00:52:01,880 It's not important." 790 00:52:01,880 --> 00:52:07,120 It's more that it's still too difficult to do. 791 00:52:07,120 --> 00:52:11,680 And that's one place where I think AI can really, really help. 792 00:52:11,680 --> 00:52:16,600 And some of the tools that people have talked about today are examples of that, 793 00:52:17,040 --> 00:52:21,200 where if we can make it easy enough and lower that barrier, 794 00:52:21,800 --> 00:52:25,040 and take advantage of these creation points 795 00:52:25,040 --> 00:52:28,760 to teach people about accessibility as well. 796 00:52:28,760 --> 00:52:29,200 So not 797 00:52:30,480 --> 00:52:31,480 always 798 00:52:31,480 --> 00:52:36,360 to fix everything for them, but to fix things with them, 799 00:52:36,920 --> 00:52:41,880 so that they can learn going forward and grow. 800 00:52:41,880 --> 00:52:44,760 I think that's a really exciting area. 801 00:52:44,760 --> 00:52:46,840 Yes, and a great way to support 802 00:52:46,840 --> 00:52:50,840 born accessible, so accessible by default, 803 00:52:50,920 --> 00:52:54,720 with respect to the tools that are used to create it.
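As one concrete illustration of the "fix things with them, not for them" idea above, here is a minimal sketch of an authoring-time check that flags images with no alt text, shows an AI-suggested description, and asks the author to decide whether the image is decorative or informative instead of silently filling it in. The suggest_caption parameter is a hypothetical stand-in for a captioning model call, not a real API.

    # Hypothetical sketch: surface missing alt text, offer a suggestion, and ask
    # the author to decide, rather than fixing the page silently.
    from html.parser import HTMLParser

    class MissingAltCollector(HTMLParser):
        """Collects the src of every <img> tag that has no alt attribute."""
        def __init__(self):
            super().__init__()
            self.missing = []

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            if tag == "img" and "alt" not in attrs:
                self.missing.append(attrs.get("src", "<unknown>"))

    def review_alt_text(html, suggest_caption=lambda src: "(AI-suggested description)"):
        # suggest_caption is a placeholder for a call to a captioning model.
        collector = MissingAltCollector()
        collector.feed(html)
        for src in collector.missing:
            print(f"{src}: no alt text found.")
            print(f"  Suggested description: {suggest_caption(src)}")
            print('  Is this image decorative (use alt="") or informative (edit the suggestion)?')

For example, review_alt_text('<img src="chart.png">') would prompt the author about chart.png rather than writing a description for them, which is the teaching moment being described.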
804 00:52:55,640 --> 00:53:01,000 You contributed some questions that you would love to discuss, 805 00:53:01,440 --> 00:53:06,480 and one of the first ones is: is AI's role mostly considered 806 00:53:06,600 --> 00:53:10,800 as improving assistive technology, or digital accessibility in general? 807 00:53:10,800 --> 00:53:15,080 And of course, this gets to this idea of not creating 808 00:53:15,080 --> 00:53:20,280 a segregated set of innovations that specifically address 809 00:53:20,280 --> 00:53:24,000 people with disabilities, but also making sure that 810 00:53:25,160 --> 00:53:27,200 the innovations that are brought 811 00:53:27,200 --> 00:53:30,920 about by addressing the needs of people 812 00:53:31,280 --> 00:53:35,760 who face barriers can benefit the population at large. 813 00:53:36,440 --> 00:53:39,560 So what do you think? 814 00:53:40,040 --> 00:53:42,600 What is the future direction? 815 00:53:43,520 --> 00:53:45,800 Yeah, this was a question that came from 816 00:53:45,800 --> 00:53:49,280 an attendee, I think, that was put in during the registration process. 817 00:53:49,640 --> 00:53:51,680 And I 818 00:53:51,680 --> 00:53:54,760 do think it's really important to view AI 819 00:53:54,760 --> 00:53:58,120 as a tool for digital accessibility in general, 820 00:53:58,200 --> 00:54:01,120 and not to just think about the end- 821 00:54:01,200 --> 00:54:04,120 user applications, although those personal AI 822 00:54:04,160 --> 00:54:08,320 technologies are really important and life changing, and they can do things 823 00:54:08,320 --> 00:54:13,240 that aren't achievable in any other way. But AI 824 00:54:13,240 --> 00:54:16,240 is already a part of the development process, 825 00:54:16,240 --> 00:54:19,280 and accessibility needs to be a part of that. 826 00:54:19,280 --> 00:54:22,960 And we have so many challenges to solve there. 827 00:54:22,960 --> 00:54:23,800 I think it's 828 00:54:25,640 --> 00:54:28,440 an area that we need to pay more attention to. 829 00:54:28,640 --> 00:54:33,040 So not just applying AI to detect accessibility problems, 830 00:54:33,040 --> 00:54:37,160 but engaging with those mainstream development tools 831 00:54:37,160 --> 00:54:39,600 to make sure that accessibility is considered. 832 00:54:41,560 --> 00:54:43,680 One sort of associated 833 00:54:43,760 --> 00:54:46,640 piece to this that came to mind... 834 00:54:47,160 --> 00:54:50,360 and I'm going to take the privilege of being the person asking the questions. 835 00:54:50,640 --> 00:54:54,200 I mean, the focus of most AI innovation 836 00:54:54,200 --> 00:54:58,400 has been on replicating and potentially replacing human intelligence, 837 00:54:58,800 --> 00:55:05,080 as opposed to augmenting or thinking about other forms of intelligence. 838 00:55:05,080 --> 00:55:11,320 And I wonder whether our experiences in assistive technology, 839 00:55:11,320 --> 00:55:15,520 and how technology can become an accompaniment 840 00:55:15,520 --> 00:55:19,080 or an augmentation rather than a replacement, 841 00:55:19,600 --> 00:55:22,840 might have some insights to give in this 842 00:55:23,480 --> 00:55:26,920 improvement of digital inclusion. 843 00:55:26,920 --> 00:55:29,080 Yeah, I think you're absolutely right. 844 00:55:29,080 --> 00:55:34,520 It's human-AI cooperation and collaboration 845 00:55:34,760 --> 00:55:39,160 that's going to get us the best results.
846 00:55:43,000 --> 00:55:45,440 And the 847 00:55:45,640 --> 00:55:47,280 language models 848 00:55:47,280 --> 00:55:51,080 that we have, the promise that they have of more interactive, 849 00:55:51,120 --> 00:55:53,640 dialog-like interactions, 850 00:55:54,120 --> 00:55:57,280 are heading 851 00:55:57,280 --> 00:55:59,320 in a direction where they're going to support 852 00:55:59,920 --> 00:56:03,920 much more natural human-AI dialog. 853 00:56:03,920 --> 00:56:06,920 And accessibility is such a complex 854 00:56:06,920 --> 00:56:09,960 topic, where 855 00:56:10,120 --> 00:56:11,920 it's not always obvious 856 00:56:11,920 --> 00:56:14,840 what I'm trying to convey with this image, 857 00:56:14,840 --> 00:56:18,000 how important the thing is. You know, it's 858 00:56:18,000 --> 00:56:20,360 not necessarily easy 859 00:56:21,960 --> 00:56:24,120 to decide what exactly is the 860 00:56:25,480 --> 00:56:28,400 correct alternative for something, 861 00:56:28,400 --> 00:56:30,720 and there are plenty of other examples 862 00:56:32,200 --> 00:56:34,920 where 863 00:56:35,640 --> 00:56:39,480 the combination of an AI that has been trained on 864 00:56:39,960 --> 00:56:44,160 some of the general principles of good accessibility practice, 865 00:56:44,400 --> 00:56:48,040 and a person who may not be as familiar but really understands the domain 866 00:56:48,040 --> 00:56:50,720 and the context of this particular application... 867 00:56:51,160 --> 00:56:53,640 it's when you put those two things together 868 00:56:54,400 --> 00:56:56,920 that things are going to start to work. 869 00:56:57,080 --> 00:57:02,600 So AI can support the person, not replace the person. 870 00:57:02,600 --> 00:57:03,760 And of course, 871 00:57:05,280 --> 00:57:06,000 the 872 00:57:06,000 --> 00:57:10,280 one thorny issue that we need to overcome 873 00:57:10,520 --> 00:57:14,120 with respect to AI is 874 00:57:14,720 --> 00:57:18,440 the challenge of addressing more qualitative, 875 00:57:18,440 --> 00:57:23,040 non-quantitative values and ideas, etc., 876 00:57:23,920 --> 00:57:27,240 so it'll be interesting to see what happens there. 877 00:57:28,800 --> 00:57:29,440 Yeah. Yeah. 878 00:57:29,440 --> 00:57:33,200 I thought Yeliz had a very good suggestion this morning: 879 00:57:33,760 --> 00:57:35,760 perhaps we should pay attention to 880 00:57:36,920 --> 00:57:39,320 how people are making these judgments, how 881 00:57:40,040 --> 00:57:42,680 the accessibility experts 882 00:57:42,680 --> 00:57:45,840 make these judgments, what the principles are, 883 00:57:45,960 --> 00:57:49,160 and whether we can articulate those better 884 00:57:49,480 --> 00:57:52,040 than we do 885 00:57:52,600 --> 00:57:56,440 now and communicate them better too. 886 00:57:56,480 --> 00:57:57,080 Right. 887 00:57:57,080 --> 00:58:02,800 And there's been this notion of thick data, which includes the context, 888 00:58:02,800 --> 00:58:07,160 because frequently we've isolated the data from the actual context, 889 00:58:07,480 --> 00:58:11,400 and many of these things are very contextually bound. 890 00:58:11,400 --> 00:58:14,560 And so do you see that there might be 891 00:58:14,920 --> 00:58:17,920 a reinvestigation of 892 00:58:18,680 --> 00:58:22,800 where that data came from, what the context of the data was, etc.?
893 00:58:24,200 --> 00:58:27,560 I think there might be a 894 00:58:28,040 --> 00:58:30,800 rise in methods that 895 00:58:32,600 --> 00:58:35,200 bring in the whole context, 896 00:58:35,200 --> 00:58:38,120 bring in more of the context, multimodal inputs. 897 00:58:38,720 --> 00:58:43,000 You know, even for 898 00:58:44,360 --> 00:58:48,280 speech recognition, it's doing what it does 899 00:58:48,280 --> 00:58:52,200 without even really knowing the domain that it's working in, 900 00:58:53,400 --> 00:58:56,760 and that's pretty mind-blowing, really, 901 00:58:57,440 --> 00:59:02,400 but where it breaks down is when there are technical terms, 902 00:59:02,400 --> 00:59:05,760 when you're talking about a domain that is 903 00:59:07,480 --> 00:59:09,200 less frequently talked about, 904 00:59:09,200 --> 00:59:12,400 less represented, and 905 00:59:12,400 --> 00:59:15,200 bringing in that domain knowledge, 906 00:59:15,200 --> 00:59:17,400 I think, is going to be huge. 907 00:59:17,760 --> 00:59:21,680 And similarly, in terms of 908 00:59:22,720 --> 00:59:26,640 helping to create text alternatives for things, 909 00:59:26,640 --> 00:59:31,160 the domain knowledge will help to 910 00:59:33,000 --> 00:59:36,560 give a better kind of base suggestion from the AI. 911 00:59:36,920 --> 00:59:39,480 And perhaps with dialog 912 00:59:40,120 --> 00:59:45,440 we can prompt people with the right questions to help them decide: 913 00:59:46,080 --> 00:59:48,600 is this actually a decorative image, 914 00:59:48,600 --> 00:59:51,760 or is it important for me to describe what's in this image? 915 00:59:51,760 --> 00:59:56,280 You know, that's not actually always a trivial question to answer. 916 00:59:56,280 --> 00:59:59,720 And of course, that brings in the issue of classification and labeling, 917 01:00:00,280 --> 01:00:04,960 and the need to box or classify specific things. 918 01:00:04,960 --> 01:00:08,680 And many of these things are very fuzzy contexts, 919 01:00:08,680 --> 01:00:13,160 and classifiers are also determined hierarchically. 920 01:00:13,160 --> 01:00:15,200 And there's... yes. 921 01:00:15,800 --> 01:00:20,960 So yeah, maybe we don't need a perfect classifier, but we need 922 01:00:22,280 --> 01:00:26,760 a good dialog, where 923 01:00:27,080 --> 01:00:29,520 the system knows what questions to ask 924 01:00:30,040 --> 01:00:32,520 to help the person decide. 925 01:00:32,520 --> 01:00:34,160 Right. 926 01:00:34,160 --> 01:00:37,280 Oh, and I just saw a message from 927 01:00:37,280 --> 01:00:41,720 Carlos saying to end the discussion. 928 01:00:42,440 --> 01:00:45,600 Carlos, I'm wondering, can we fit in one more question? 929 01:00:46,520 --> 01:00:49,960 I actually have to stop at the top of the hour. So... 930 01:00:50,640 --> 01:00:51,920 Okay. 931 01:00:52,560 --> 01:00:56,960 So we will have an opportunity to answer the questions 932 01:00:56,960 --> 01:01:00,240 that people have submitted in the question-and-answer dialog. 933 01:01:00,920 --> 01:01:03,360 And we have access to those. So 934 01:01:04,440 --> 01:01:06,600 Shari will be able to respond 935 01:01:06,920 --> 01:01:09,360 to some of these additional questions that have been asked, 936 01:01:10,960 --> 01:01:15,480 and apologies that we went a little over time, 937 01:01:15,480 --> 01:01:17,960 Carlos. Okay, I'll turn it back over to you. 938 01:01:18,520 --> 01:01:20,560 Okay. Thank you. Thank you, Shari. 939 01:01:20,840 --> 01:01:23,040 Thank you.
940 01:01:23,040 --> 01:01:23,480 Thank you. 941 01:01:23,480 --> 01:01:24,560 Thank you, Shari and Jutta. 942 01:01:24,560 --> 01:01:27,280 I was loving this discussion, 943 01:01:27,880 --> 01:01:31,960 so it's really unfortunate that we have to stop now. 944 01:01:32,880 --> 01:01:35,320 But thank you. Thank you so much for 945 01:01:37,400 --> 01:01:38,640 your presentations. 946 01:01:38,640 --> 01:01:40,280 Thank you all. 947 01:01:40,280 --> 01:01:43,880 Thank you also to all the panelists yesterday 948 01:01:43,880 --> 01:01:46,640 and today for making this 949 01:01:47,840 --> 01:01:49,040 a great symposium. 950 01:01:49,040 --> 01:01:52,400 Lots of interesting and thought-provoking ideas. 951 01:01:54,320 --> 01:01:56,400 And thank you all for attending.