We generated this dataset to train a machine learning model that automatically generates psychiatric case notes from doctor-patient conversations. Since we did not have access to real doctor-patient conversations, we used transcripts from two different sources to generate audio recordings of enacted conversations between a doctor and a patient. We employed eight students who worked in pairs to generate these recordings. Six of the transcripts that we used to produce these recordings were hand-written by Cheryl Bristow, and the rest were adapted from Alexander Street transcripts, which were generated from real doctor-patient conversations. Our study requires recording the doctor and the patient(s) in separate channels, which is the primary reason we generated our own audio recordings of the conversations.
We used the Google Cloud Speech-To-Text API to transcribe the enacted recordings. These newly generated transcripts are produced entirely by AI-powered automatic speech recognition, whereas the source transcripts are either hand-written or, in the case of the Alexander Street transcripts, fine-tuned by human transcribers.
We provided the generated transcripts back to the students and asked them to write case notes. The students worked independently using software that we had developed earlier for this purpose. The students had prior experience writing case notes, and we let them write the notes as they normally would, without any training or instructions from us.
- Kazi, Nazmul
- Kuntz, Matt
- Kanewala, Upulee
- Kahanda, Indika
- Bristow, Cheryl
- Arzubi, Eric
Index | Category | Abbr. |
---|---|---|
0 | Client Details | CD |
1 | Chief Complaint | CC |
2 | History of Present Illness | HPI |
3 | Past Psychiatric History | PPH |
4 | History of Substance Use | HSU |
5 | Social History | SH |
6 | Family History | FH |
7 | Review of Systems | RS |
Directory | Description |
---|---|
transcripts/source | Source transcripts that are used to generate the audio recordings. |
recordings | Audio recordings of the enacted doctor-patient conversations. |
transcripts/transcribed | Transcripts generated from the audio recordings using Google Cloud Speech-To-Text API. |
casenotes | Casenotes written by the students, i.e. annotators. |
[
{
"speaker" : 1,
"dialogue" : ["sentence 1", "sentence 2", ...]
},
...
]
Term | Definition |
---|---|
speaker | Speaker of the current dialogue turn. |
dialogue | Sentence(s) spoken by the speaker in the current dialogue turn. |
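As a minimal sketch of working with this format, the snippet below parses a transcript and groups sentences by speaker. It assumes the transcript files are plain JSON; the inline `SAMPLE` string and the helper name `sentences_by_speaker` are illustrative and not part of the dataset.

```python
import json
from collections import defaultdict

# Placeholder transcript in the format described above (not real data).
SAMPLE = """
[
  {"speaker": 1, "dialogue": ["sentence 1", "sentence 2"]},
  {"speaker": 2, "dialogue": ["sentence 3"]},
  {"speaker": 1, "dialogue": ["sentence 4"]}
]
"""

def sentences_by_speaker(turns):
    """Collect every sentence per speaker, preserving transcript order."""
    grouped = defaultdict(list)
    for turn in turns:
        grouped[turn["speaker"]].extend(turn["dialogue"])
    return dict(grouped)

turns = json.loads(SAMPLE)
grouped = sentences_by_speaker(turns)
for speaker, sentences in grouped.items():
    print(f"speaker {speaker}: {len(sentences)} sentence(s)")
```

To read a real file instead, `json.load` on an opened transcript file should yield the same list-of-turns structure, assuming the files follow the schema above.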
[
{
"categoryId" : "0",
"sourceId" : "0",
"formalText" : "formal text"
},
...
]
Term | Definition |
---|---|
categoryId | Index of the case note category, e.g. 5 = Social History, in which this sentence is used. This property is zero-indexed. |
sourceId | Index of the source sentence in the transcript. This property is zero-indexed. |
formalText | Modified version of the sentence as it is used in the case note. |
NOTE: If a sentence is used in multiple casenote categories, a record will appear for each use. "sourceId":"n" refers to the sentence whose index is n in the whole transcript, counted across dialogue turns, because multiple sentences can belong to the same dialogue turn. In the following transcript, "sourceId":"3" refers to sentence_d:
[
{
"speaker" : 1,
"dialogue" : ["sentence_a", "sentence_b"]
},
{
"speaker" : 2,
"dialogue" : ["sentence_c", "sentence_d", "sentence_e"]
}
]
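The indexing rule in the note can be sketched as follows: flatten the dialogue turns into one sentence list, then look up each record's `sourceId` and `categoryId`. The `casenote` record and the helper name `resolve_sources` are hypothetical; the category names are taken from the category table in this README.

```python
# Category names in index order, copied from the category table.
CATEGORIES = [
    "Client Details", "Chief Complaint", "History of Present Illness",
    "Past Psychiatric History", "History of Substance Use",
    "Social History", "Family History", "Review of Systems",
]

transcript = [
    {"speaker": 1, "dialogue": ["sentence_a", "sentence_b"]},
    {"speaker": 2, "dialogue": ["sentence_c", "sentence_d", "sentence_e"]},
]

# Hypothetical case note record, shaped like the casenote format above.
casenote = [
    {"categoryId": "5", "sourceId": "3", "formalText": "formal text"},
]

def resolve_sources(transcript, casenote):
    """Attach the source sentence, its speaker, and the category name to each record."""
    # Sentence indices run across the whole transcript, not per dialogue turn.
    sentences = [(turn["speaker"], sentence)
                 for turn in transcript
                 for sentence in turn["dialogue"]]
    resolved = []
    for record in casenote:
        # Both ids are stored as strings, so convert before indexing.
        speaker, sentence = sentences[int(record["sourceId"])]
        resolved.append({
            "category": CATEGORIES[int(record["categoryId"])],
            "speaker": speaker,
            "source": sentence,
            "formalText": record["formalText"],
        })
    return resolved

resolved = resolve_sources(transcript, casenote)
print(resolved[0]["source"])
```

Because a sentence used in several categories appears as several records, the resolved list may contain the same source sentence more than once.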
This is a pickle file (protocol version 4) containing all the transcribed transcripts and the casenotes for quick and easy access to the data using Python.
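A minimal loading sketch, under stated assumptions: this README does not describe the layout of the pickled object, so the code only loads it and reports its top-level type. The helper names `load_dataset` and `describe` and any file path passed to them are hypothetical. Pickle protocol 4 requires Python 3.4 or later, and pickle files should only be loaded from trusted sources.

```python
import pickle

def load_dataset(path):
    """Load the pickled dataset and return the top-level object unchanged."""
    with open(path, "rb") as handle:
        return pickle.load(handle)

def describe(obj):
    """Summarize the top-level object without assuming its internal layout."""
    size = len(obj) if hasattr(obj, "__len__") else None
    return {"type": type(obj).__name__, "size": size}
```

Calling `describe(load_dataset(path))` with the path to the pickle file is a reasonable first step before relying on any particular keys or nesting.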
The audio recordings are also available on the Oxiago Int. website.
This project is funded by the CATalyst Gap fund, Fall 2019.
D0420-S2-T01 D0420-S3-T02 D0420-S3-T03 D0420-S4-T01 D0420-S4-T02 D0421-S1-T01 D0421-S1-T02 D0421-S1-T03 D0421-S1-T04 D0421-S1-T05 D0421-S2-T01 D0421-S2-T02 D0421-S3-T01 D0421-S3-T02 D0421-S3-T03 D0421-S3-T04 D0421-S3-T05 D0422-S1-T01 D0422-S1-T02 D0422-S1-T03 D0422-S1-T04 D0422-S2-T01 D0422-S2-T02 D0422-S3-T01 D0422-S3-T02 D0422-S3-T03 D0422-S3-T04 D0422-S3-T05 D0422-S3-T06 D0422-S4-T01 D0422-S4-T02 D0422-S4-T03 D0422-S4-T04 D0422-S4-T05 D0423-S1-T01 D0423-S1-T02 D0423-S1-T03 D0423-S2-T01 D0423-S2-T02 D0423-S2-T03 D0424-S1-T01 D0424-S1-T02 D0424-S1-T03 D0424-S2-T01 D0424-S2-T02 D0424-S2-T03 D0424-S2-T04 D0424-S2-T05 D0424-S2-T06 D0424-S3-T01 D0424-S3-T02 D0424-S3-T03 D0424-S3-T04 D0425-S1-T01 D0425-S1-T02 D0425-S1-T03 D0425-S2-T01 D0425-S2-T02 D0425-S2-T03 D0425-S2-T04 D0425-S3-T01 D0425-S3-T02 D0425-S3-T03 D0425-S3-T04 D0425-S3-T05
This dataset is provided "As Is" without warranty of any kind. In no event shall the authors or copyright holders be liable for any claim, damages, or other liability. The source transcripts may not have been enacted word for word as they appear in the transcript. Similarly, we used automatic speech recognition to transcribe the recordings, so the transcribed transcripts may not exactly match the words spoken in the audio recordings.
This work is licensed under a Creative Commons Attribution 4.0 International License.