Some people rely on ‘screen readers’ (software that reads the text on the screen out loud, for example as you move your finger over it) to browse content on Lemmy. Some screen readers can read text in images (I know Apple’s does, not sure about Android), but they can obviously make mistakes, and a lot of the time the context is missing. Hence the transcriptions.
There are also a couple of other benefits. The post is more likely to appear in search results if someone searches for text included in the transcription. And if the image fails to load for whatever reason, or the image host deletes it, you can get the gist from the transcription.
Lol (sh.itjust.works)
Transcription. A four-panel meme....