Including more features apart from diagnostic codes #8

sarwart · 2020-07-30T12:45:56Z

Hi Retain team,

I am interested in using RETAIN for my research. How should I prepare the data when there are more features, such as vital signs or medications, in addition to diagnostic codes? Do I need to concatenate all the features together? For example, for a patient with diagnostic codes (c1 to c3) and vital signs (v1 to v3), should the input for a single visit be [c1,c2,c3,v1,v2,v3]?

mp2893 · 2020-07-30T14:31:25Z

Hi Tabinda,

Yes, that’s how you use additional features. Note, however, that the contributions of continuous features (e.g. blood pressure) are calculated a bit differently from those of discrete features. (See the original paper for the actual equation.)

Best
Ed
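
A minimal sketch of the concatenation described above, assuming a hypothetical vocabulary of three codes and three vital signs. The names `CODE_INDEX`, `VITAL_NAMES`, and `encode_visit` are illustrative and are not part of the RETAIN code base, and the dense-vector layout shown here is not necessarily the input file format the repository expects. Codes are multi-hot encoded and the continuous values are appended after them, so the input dimension is the number of codes plus the number of continuous features.

```python
# Hypothetical vocabularies; a real dataset would define its own.
CODE_INDEX = {"c1": 0, "c2": 1, "c3": 2}
VITAL_NAMES = ["v1", "v2", "v3"]

# Total number of features per visit (codes + continuous measures).
INPUT_DIM = len(CODE_INDEX) + len(VITAL_NAMES)


def encode_visit(codes, vitals):
    """Return one dense visit vector: multi-hot codes followed by vital values."""
    x = [0.0] * INPUT_DIM
    for code in codes:
        x[CODE_INDEX[code]] = 1.0
    offset = len(CODE_INDEX)
    for i, name in enumerate(VITAL_NAMES):
        # Assumption: a missing vital defaults to 0.0; real data needs a
        # deliberate imputation and normalization strategy.
        x[offset + i] = float(vitals.get(name, 0.0))
    return x


visit = encode_visit(["c1", "c3"], {"v1": 0.5, "v2": -1.2, "v3": 0.0})
print(visit)  # [1.0, 0.0, 1.0, 0.5, -1.2, 0.0]
```

Continuous values would normally be standardized before being appended, since they share an embedding matrix with the 0/1 code indicators.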

sarwart · 2020-07-31T01:06:59Z

Dear Ed,

Thank you for your reply. Yes, I noticed, but the paper mentions that continuous features can be used:

"In case of learning to diagnose (L2D) [27], the input vector x_i consists of continuous clinical measures."

When you say "contributions", are you talking about the interpretation of the attention model? What exactly do I need to change to include the continuous features?

Tabinda

mp2893 · 2020-07-31T01:32:31Z

Contributions in RETAIN are different from simply studying the attention. RETAIN allows users to decompose the prediction value into the contributions of individual features in each visit, using Eq. 2 in the paper.
You don't need to change anything; you can simply append more features to the input vector.
I also recommend using the TensorFlow version implemented by Optum (https://github.com/Optum/retain-keras), which is better maintained than this one.
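
A minimal sketch of the decomposition mentioned above, assuming the contribution of feature k in visit j takes the form alpha_j * w · (beta_j ⊙ W_emb[:, k]) * x[j][k] for a single output logit; check this form against the paper before relying on it. In the real model alpha and beta come from two RNNs, whereas here they are fixed toy numbers, and all function and variable names are hypothetical. The sketch also suggests why continuous features behave differently: for a binary code x[j][k] is 0 or 1, while for a continuous feature the contribution scales with the measured value.

```python
def contributions(x, W_emb, alpha, beta, w):
    """Per-visit, per-feature contributions to the output logit.

    x:     visits x features (dense input vectors)
    W_emb: m x features (column k is the embedding of feature k)
    alpha: one scalar attention weight per visit
    beta:  visits x m (vector attention weight per visit)
    w:     length-m output weights for a single logit
    """
    m = len(w)
    result = []
    for j, visit in enumerate(x):
        row = []
        for k, value in enumerate(visit):
            coef = alpha[j] * sum(w[d] * beta[j][d] * W_emb[d][k] for d in range(m))
            row.append(coef * value)
        result.append(row)
    return result


def logit_without_bias(x, W_emb, alpha, beta, w):
    """Forward pass: w . sum_j alpha_j * (beta_j * (W_emb x_j)), bias omitted."""
    m = len(w)
    context = [0.0] * m
    for j, visit in enumerate(x):
        v = [sum(W_emb[d][k] * visit[k] for k in range(len(visit))) for d in range(m)]
        for d in range(m):
            context[d] += alpha[j] * beta[j][d] * v[d]
    return sum(w[d] * context[d] for d in range(m))


# Toy values (assumptions): two visits, two binary codes plus one continuous feature.
x = [[1.0, 0.0, 0.7], [0.0, 1.0, -0.3]]
W_emb = [[0.2, -0.1, 0.5], [0.4, 0.3, -0.2]]
alpha = [0.6, 0.4]
beta = [[0.5, -0.5], [0.9, 0.1]]
w = [1.0, -2.0]

contrib = contributions(x, W_emb, alpha, beta, w)
total = sum(sum(row) for row in contrib)
logit = logit_without_bias(x, W_emb, alpha, beta, w)
print(contrib)
print(total, logit)  # the two values agree up to floating-point error
```

The contributions sum to the pre-bias logit, which is what makes them a decomposition of the prediction rather than a plain inspection of the attention.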

sarwart · 2020-07-31T01:37:10Z

Thank you Ed

lucasliu0928 · 2020-08-12T08:03:13Z

Hi Retain team,
I am really interested in this work and am trying to understand the framework. I have a couple of questions:

How does the model deal with patients who have different numbers of visits, and with visits that have different numbers of codes/features? Do you pad zeros at the end of each patient's visit list and do the same for the codes within each visit?
If we only use continuous features, what would the "inputDimSize" argument be? It is 942 in your example, covering all 3-digit ICD9 codes. Would it be the number of features?

mp2893 · 2020-08-17T05:15:34Z

Hi Lucas,

Thanks for taking an interest in this work.

In principle you are correct. This RETAIN code was implemented in Theano, which is a really old library, so I didn't do it exactly the way you describe, but you should follow your logic when implementing RETAIN with modern libraries such as TensorFlow or PyTorch. However, don't forget to use proper masks to deal with the padded elements.
Yes, it would be the total number of features.

Best,
Ed
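
A minimal sketch of the padding and masking idea discussed above, written in plain Python rather than TensorFlow or PyTorch. It assumes each visit is already a dense vector of length `input_dim` (so no padding is needed inside a visit); `pad_visits` and `masked_softmax` are hypothetical helpers and are not taken from the RETAIN repository. The mask marks real visits with 1.0 and padded ones with 0.0, and the masked softmax shows one place where the mask matters: padded visits should receive zero attention weight.

```python
import math


def pad_visits(patients, input_dim):
    """Pad every patient's visit list with zero vectors up to the longest one.

    Returns (padded, mask), where mask[p][t] is 1.0 for a real visit
    and 0.0 for a padded one.
    """
    max_len = max(len(visits) for visits in patients)
    padded, mask = [], []
    for visits in patients:
        n_pad = max_len - len(visits)
        padded.append([list(v) for v in visits] + [[0.0] * input_dim for _ in range(n_pad)])
        mask.append([1.0] * len(visits) + [0.0] * n_pad)
    return padded, mask


def masked_softmax(scores, mask_row):
    """Softmax over visits that ignores padded positions.

    Assumes at least one position in mask_row is real.
    """
    top = max(s for s, m in zip(scores, mask_row) if m > 0)
    exps = [math.exp(s - top) if m > 0 else 0.0 for s, m in zip(scores, mask_row)]
    total = sum(exps)
    return [e / total for e in exps]


# Two patients with 1 and 3 visits; input_dim = 2 features per visit.
patients = [
    [[1.0, 0.0]],
    [[0.0, 1.0], [1.0, 1.0], [0.5, 0.5]],
]
padded, mask = pad_visits(patients, input_dim=2)
print(mask)  # [[1.0, 0.0, 0.0], [1.0, 1.0, 1.0]]

# Attention scores for the first patient; the last two positions are padding.
weights = masked_softmax([0.3, 5.0, -2.0], mask[0])
print(weights)  # [1.0, 0.0, 0.0]
```

With only continuous features, `input_dim` here plays the role of the total number of features, matching the answer above about "inputDimSize".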

lucasliu0928 · 2020-08-17T19:12:39Z

Hi Ed,
Thank you for the explanations!

Best,
Lucas

sarwart · 2020-08-17T23:10:32Z

Hi Lucas,

If you implement RETAIN for continuous features, it would be great if you could share the code on GitHub.

Tabinda

sarwart changed the title ~~Including extra features apart from diagnostic codes~~ Including more features apart from diagnostic codes Jul 30, 2020
