Bipolar mood disorder is a severe mental condition characterized by recurring episodes of two types: manic or depressive. These phases can lead patients to become hyperactive, hypersexual, or lethargic, or even to commit suicide, all of which seriously impair their quality of life. Predicting these phases would help patients manage their lives better and improve our ability to apply medical interventions. Traditionally, interviews are conducted in the evening to predict potential episodes in the following days. While machine learning approaches have been used successfully before, the data was limited to a few self-reported parameters measured each day. Biometrics recorded at short intervals over many months present a new opportunity for machine learning approaches. However, phases of unrest and hyperactivity, which might be predictive signals, are not only often experienced long before the onset of manic or depressive phases but are also separated from them by several uneventful days. This delay and its aperiodic occurrence are a challenge for deep learning. In this thesis, a fictional dataset that mimics long and irregular delays is created and used to test the effects of such long delays and rare events. RNNs, GRUs, and LSTMs are the go-to models for deep learning in this situation. However, they differ in how well they can be trained over long time spans. As their name suggests, LSTMs are believed to be easier to train and to remember better than their simpler RNN counterparts. GRUs represent a compromise in complexity between RNNs and LSTMs. Here, I show that, contrary to the common assumption, LSTMs are surprisingly forgetful and that RNNs have a much better ability to generalize over longer delays with shorter sequences. At the same time, I confirm that LSTMs are easily trained on tasks that have more prolonged delays.
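
To make the idea of a synthetic dataset with long and irregular delays concrete, the following is a minimal sketch and not the dataset used in this thesis. It assumes a binary task in which a single cue is hidden in a noisy sequence and the label has to be predicted at the final time step, so the delay between cue and prediction varies from sequence to sequence. The function name `make_delayed_cue_dataset` and all parameters and default values are hypothetical and chosen only for illustration.

```python
import random


def make_delayed_cue_dataset(n_sequences=200, seq_len=120, min_delay=20,
                             max_delay=100, noise_std=0.1, seed=0):
    """Build noisy sequences in which a rare cue precedes the label by a random delay.

    Hypothetical illustration: every name and default here is an assumption.
    """
    if not 0 <= min_delay <= max_delay < seq_len:
        raise ValueError("require 0 <= min_delay <= max_delay < seq_len")

    rng = random.Random(seed)
    sequences, labels, delays = [], [], []
    for _ in range(n_sequences):
        # Background: uneventful time steps containing only Gaussian noise.
        seq = [rng.gauss(0.0, noise_std) for _ in range(seq_len)]
        label = rng.randint(0, 1)                   # 1 = an episode follows
        delay = rng.randint(min_delay, max_delay)   # irregular gap, in steps
        if label == 1:
            # Place one cue `delay` steps before the final time step.
            seq[seq_len - 1 - delay] += 1.0
        sequences.append(seq)
        labels.append(label)
        delays.append(delay)
    return sequences, labels, delays


if __name__ == "__main__":
    seqs, labels, delays = make_delayed_cue_dataset(n_sequences=5)
    for label, delay in zip(labels, delays):
        print(f"label={label} delay={delay}")
```

Under these assumptions, a recurrent model reads each sequence step by step and is scored on the label at the last step. Training on one range of delays and evaluating on a longer one is one way to probe how well a model generalizes over delays.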