Clinical linguists would like to use analyses of spontaneous language sessions with patients to diagnose language deficits or abnormalities (e.g. aphasia). Until now, such sessions have been analyzed entirely by hand. Because this is very labor-intensive, the analysis is often done only partially or not at all.

Together with the Dutch Association for Clinical Linguistics (VKL), Utrecht University – led by prof. dr. Jan Odijk – is investigating whether this process can be partially automated to make it more efficient. To this end, the Digital Humanities Lab is developing a research prototype of a new application (SASTA) for the semi-automatic assessment of language development in accordance with different assessment methods (TARSP, STAP and ASTA).

SASTA parses patient transcripts with the Alpino dependency parser. A set of queries is formulated and run against these dependency trees to generate a report on the scoring parameters of the assessment method.
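
The query step can be illustrated with a minimal sketch. It assumes the dependency trees are available as Alpino-style XML and that the queries are XPath expressions; neither the query language nor the query names below come from SASTA itself. The sample tree is hand-written and simplified, and `QUERIES` and `score_utterance` are hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified tree in the style of Alpino's XML output
# for the utterance "ik zie de hond" (not real parser output).
SAMPLE_PARSE = """
<alpino_ds version="1.3">
  <node cat="top" rel="top" begin="0" end="4">
    <node cat="smain" rel="--" begin="0" end="4">
      <node rel="su" pt="vnw" word="ik" lemma="ik" begin="0" end="1"/>
      <node rel="hd" pt="ww" word="zie" lemma="zien" begin="1" end="2"/>
      <node cat="np" rel="obj1" begin="2" end="4">
        <node rel="det" pt="lid" word="de" lemma="de" begin="2" end="3"/>
        <node rel="hd" pt="n" word="hond" lemma="hond" begin="3" end="4"/>
      </node>
    </node>
  </node>
  <sentence>ik zie de hond</sentence>
</alpino_ds>
"""

# Hypothetical scoring parameters, each defined by an XPath expression.
# These are illustrative only; they are not the actual TARSP, STAP or ASTA measures.
QUERIES = {
    "subject": ".//node[@rel='su']",
    "verb": ".//node[@rel='hd'][@pt='ww']",
    "noun_phrase": ".//node[@cat='np']",
}


def score_utterance(parse_xml: str, queries: dict[str, str]) -> dict[str, int]:
    """Count how many nodes in one dependency tree match each query."""
    root = ET.fromstring(parse_xml.strip())
    return {name: len(root.findall(xpath)) for name, xpath in queries.items()}


if __name__ == "__main__":
    for name, count in score_utterance(SAMPLE_PARSE, QUERIES).items():
        print(f"{name}: {count}")
```

Summing such counts over all utterances in a transcript would give a per-parameter tally of the kind a report could be built from. The standard library's `ElementTree` supports only a subset of XPath, so more complex queries would need a fuller XPath engine.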

Initial results are promising: many of the queries give outcomes comparable to (and in some cases better than) those of human analysts, in a fraction of the time.

Additional work is being done to handle commonly occurring errors and intricacies of spontaneous language, which would otherwise cause incorrect dependency parses. This is expected to further improve the results.