phonedata <- readRDS(file = "clusterdata.RDS")
phonedata <- phonedata[complete.cases(phonedata$gender), ]
phonedata <- phonedata[, c(1:1821, 1823, 1837)]
Practical Exercise I: Performance Evaluation with K-fold Cross-Validation
Compute In-sample Performance Estimate
In our first practical exercise, we will demonstrate how to use mlr3 to fit a standard linear regression model to the PhoneStudy dataset by predicting the sociability personality trait score based on all variables of aggregated smartphone usage behavior. Then we compare in-sample and out-of-sample predictive performance based on \(R^2\) and \(RMSE\).
First, we load the PhoneStudy dataset and remove some administrative variables we do not use in our tutorial. We also remove 4 participants who did not report their gender. You can download the dataset here. Note that the dataset is not a .csv file but an .RDS file that can be loaded into R with the readRDS function.
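As a rough illustration of the two steps just described, the following base-R sketch round-trips a small data frame through an .RDS file and then drops the rows without a reported gender. The data frame `toy`, its column names, and its values are made up for this example and are not part of the PhoneStudy dataset.

```r
# Hypothetical toy stand-in for the PhoneStudy data
toy <- data.frame(
  id     = 1:6,
  usage  = c(3.2, 1.5, NA, 4.8, 2.1, 0.7),
  gender = c("f", "m", NA, "f", NA, "m")
)

# Round trip through an .RDS file; the R object is restored as it was saved
path <- tempfile(fileext = ".RDS")
saveRDS(toy, file = path)
toy_loaded <- readRDS(file = path)

# Keep only rows with a reported gender, then select columns by position
toy_complete <- toy_loaded[complete.cases(toy_loaded$gender), ]
toy_subset   <- toy_complete[, c(1, 3)]

print(identical(toy, toy_loaded))
print(nrow(toy_complete))
print(names(toy_subset))
```

Selecting columns by position, as in the tutorial code, is compact but fragile: it silently picks the wrong variables if the column order of the file ever changes.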
We load the mlr3verse R package (Lang and Schratz 2021), which conveniently loads mlr3 and the most important companion packages. Then we create a task object with a unique id (Sociability_Regr), which is mlr3’s way of storing the raw data along with some meta-information for modeling. In mlr3, a task defines a prediction problem, here supervised regression with the sociability trait score (named E2.Sociableness in our dataset) as the target.¹
library(mlr3verse)
Loading required package: mlr3
task_Soci <- as_task_regr(phonedata, id = "Sociability_Regr",
  target = "E2.Sociableness")
The meta-data can be displayed by printing the task object (type task_Soci). When training a model on a task, mlr3 by default uses all variables except the target as features. We do not want to use gender as a feature, although we want to check our models for gender fairness in a later module. Therefore, we remove gender from the set of features but keep it within the task object.
task_Soci$set_col_roles("gender", remove_from = "feature")
We recommend always double-checking which variables are actually used as features (you can get the full list of feature names with task_Soci$col_roles$feature), because including the wrong variables is a common source of embarrassing mistakes that can invalidate the whole analysis.
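A minimal sketch of such a check, assuming only that the mlr3 package is installed. The data frame `toy`, the task id `Toy_Regr`, and all column names are hypothetical stand-ins; the point is simply to compare the feature list before and after changing the column role.

```r
library(mlr3)

# Hypothetical toy data: one target, two usage features, and gender
set.seed(1)
toy <- data.frame(
  score  = rnorm(20),
  calls  = rpois(20, lambda = 5),
  apps   = rpois(20, lambda = 30),
  gender = factor(rep(c("f", "m"), each = 10))
)

task_toy <- as_task_regr(toy, target = "score", id = "Toy_Regr")

# By default, every non-target column is a feature
features_before <- task_toy$feature_names

# Drop gender from the feature role; the column stays in the task's data
task_toy$set_col_roles("gender", remove_from = "feature")
features_after <- task_toy$feature_names

print(features_before)
print(features_after)
print("gender" %in% task_toy$backend$colnames)
```

On the tutorial task, the analogous check would be to confirm that gender no longer appears in task_Soci$col_roles$feature.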
Next, we create a learner object to specify an ML model to apply later. mlr3 does not implement its own ML models but links to available implementations in other R packages. For example, the id regr.lm links to the ordinary lm function in the stats package. You can find a list of mlr3 ids for the most popular ML models in the mlr3 e-book (Becker et al. 2022).
lm <- lrn("regr.lm")
We try to train (i.e., estimate model parameters) the learner on the task. In mlr3, objects have “abilities” (also called methods) that can be applied with the $ syntax (here, the train method of the learner object is used to train the learner on a specified task).
lm$train(task = task_Soci)
Error: Task 'Sociability_Regr' has missing values in column(s) 'AR_num_calls_in1', 'AR_num_calls_in12', 'AR_num_calls_in24', 'AR_num_calls_in4', 'AR_num_calls_in8', 'AR_num_calls_out1', 'AR_num_calls_out12', 'AR_num_calls_out24', 'AR_num_calls_out4', 'AR_num_calls_out8', 'AR_num_calls_ring1', 'AR_num_calls_ring12', 'AR_num_calls_ring24', 'AR_num_calls_ring4', 'AR_num_calls_ring8', 'IVI_call_in', 'IVI_call_in_week', 'IVI_call_in_weekend', 'IVI_call_miss', 'IVI_call_miss_week', 'IVI_call_miss_weekend', 'IVI_call_out', 'IVI_call_out_week', 'IVI_call_out_weekend', 'IVI_call_ring', 'IVI_call_ring_week', 'IVI_call_ring_weekend', 'IVI_call_week', 'IVI_call_weekend', 'IVI_calls', 'IV_Academia', 'IV_Artistic_Hobby', 'IV_Beauty', 'IV_Betting_Risk', 'IV_Calculator', 'IV_Calendar_Apps', 'IV_Calling', 'IV_Camera', 'IV_Checkup_Monitoring', 'IV_ComicsBooks', 'IV_Dating_Mating', 'IV_E_Mail', 'IV_Eating', 'IV_Education', 'IV_Emergency_Warning', 'IV_Entertainment', 'IV_Financial', 'IV_Gallery', 'IV_Gaming_Action', 'IV_Gaming_Adventure', 'IV_Gaming_Casual', 'IV_Gaming_Knowledge', 'IV_Gaming_Logic', 'IV_Gaming_Role_Playing', 'IV_Gaming_Simulation', 'IV_Gaming_Sports', 'IV_Gaming_Strategy', 'IV_Gaming_Tools_Community', 'IV_Health_SelfMonitoring', 'IV_Image_And_Video_Editing', 'IV_Internet_Browser', 'IV_Jobs_Additional_Income', 'IV_Language_Learning', 'IV_Messaging', 'IV_Music_Audio_Radio', 'IV_News_Magazines', 'IV_Note_Apps', 'IV_Office_Tools', 'IV_Organisation', 'IV_Orientation', 'IV_Personalization', 'IV_Private_Transportation', 'IV_Provider_Services', 'IV_Religion_Spirituality_Esotersim', 'IV_Security', 'IV_Settings', 'IV_Shared_Transportation', 'IV_Sharing_Cloud', 'IV_Shop_Sell_Rent', 'IV_Shopping', 'IV_Sleep', 'IV_Social_Media_Tools', 'IV_Social_Networks', 'IV_Sport_News', 'IV_Sports', 'IV_System_App', 'IV_TVVideo_Apps', 'IV_TV_Film_Guide', 'IV_Timer_Clocks', 'IV_Tools', 'IV_Travel', 'IV_Weather', 'IV_Womens_Apps', 'IV_Workout', 'IV_apps', 'IV_unlock', 
'Qn_dur_.app.me.phonestudy_restructured', 'Qn_dur_.canvasm.myo2', 'Qn_dur_.cc.dict.dictcc', 'Qn_dur_.ch.threema.app', 'Qn_dur_.cn.wps.moffice_eng', 'Qn_dur_.com.adobe.reader', 'Qn_dur_.com.airbnb.android', 'Qn_dur_.com.amazon.avod.thirdpartyclient', 'Qn_dur_.com.amazon.mShop.android.shopping', 'Qn_dur_.com.amazon.mp3', 'Qn_dur_.com.android.bluetooth', 'Qn_dur_.com.android.browser', 'Qn_dur_.com.android.calculator2', 'Qn_dur_.com.android.calendar', 'Qn_dur_.com.android.chrome', 'Qn_dur_.com.android.contacts', 'Qn_dur_.com.android.deskclock', 'Qn_dur_.com.android.dialer', 'Qn_dur_.com.android.email', 'Qn_dur_.com.android.launcher3', 'Qn_dur_.com.android.mediacenter', 'Qn_dur_.com.android.phone', 'Qn_dur_.com.android.printspooler', 'Qn_dur_.com.android.settings', 'Qn_dur_.com.android.soundrecorder', 'Qn_dur_.com.android.systemui', 'Qn_dur_.com.android.vending', 'Qn_dur_.com.antivirus', 'Qn_dur_.com.appseleration.android.selfcare', 'Qn_dur_.com.audible.application', 'Qn_dur_.com.avast.android.mobilesecurity', 'Qn_dur_.com.avira.android', 'Qn_dur_.com.baviux.pillreminder', 'Qn_dur_.com.bitstrips.imoji', 'Qn_dur_.com.cleanmaster.mguard', 'Qn_dur_.com.cleanmaster.security', 'Qn_dur_.com.comuto', 'Qn_dur_.com.devuni.flashlight', 'Qn_dur_.com.dn.drivenow', 'Qn_dur_.com.doodle.android', 'Qn_dur_.com.dropbox.android', 'Qn_dur_.com.duolingo', 'Qn_dur_.com.ebay.kleinanzeigen', 'Qn_dur_.com.ebay.mobile', 'Qn_dur_.com.estrongs.android.pop', 'Qn_dur_.com.evernote', 'Qn_dur_.com.example.android.notepad', 'Qn_dur_.com.facebook.katana', 'Qn_dur_.com.facebook.orca', 'Qn_dur_.com.google.android.GoogleCamera', 'Qn_dur_.com.google.android.apps.docs', 'Qn_dur_.com.google.android.apps.docs.editors.docs', 'Qn_dur_.com.google.android.apps.docs.editors.sheets', 'Qn_dur_.com.google.android.apps.genie.geniewidget', 'Qn_dur_.com.google.android.apps.magazines', 'Qn_dur_.com.google.android.apps.maps', 'Qn_dur_.com.google.android.apps.messaging', 'Qn_dur_.com.google.android.apps.paidtasks', 
'Qn_dur_.com.google.android.apps.photos', 'Qn_dur_.com.google.android.apps.plus', 'Qn_dur_.com.google.android.apps.translate', 'Qn_dur_.com.google.android.calculator', 'Qn_dur_.com.google.android.calendar', 'Qn_dur_.com.google.android.deskclock', 'Qn_dur_.com.google.android.gm', 'Qn_dur_.com.google.android.googlequicksearchbox', 'Qn_dur_.com.google.android.keep', 'Qn_dur_.com.google.android.music', 'Qn_dur_.com.google.android.play.games', 'Qn_dur_.com.google.android.talk', 'Qn_dur_.com.google.android.youtube', 'Qn_dur_.com.groupon', 'Qn_dur_.com.here.app.maps', 'Qn_dur_.com.hm', 'Qn_dur_.com.htc.Weather', 'Qn_dur_.com.htc.album', 'Qn_dur_.com.htc.android.worldclock', 'Qn_dur_.com.htc.calendar', 'Qn_dur_.com.htc.camera', 'Qn_dur_.com.htc.contacts', 'Qn_dur_.com.htc.launcher', 'Qn_dur_.com.htc.music', 'Qn_dur_.com.htc.sense.browser', 'Qn_dur_.com.htc.sense.mms', 'Qn_dur_.com.htc.video', 'Qn_dur_.com.huawei.android.launcher', 'Qn_dur_.com.huawei.android.totemweather', 'Qn_dur_.com.huawei.camera', 'Qn_dur_.com.huawei.hidisk', 'Qn_dur_.com.huawei.hwvplayer', 'Qn_dur_.com.huawei.vassistant', 'Qn_dur_.com.infraware.polarisviewer4', 'Qn_dur_.com.infraware.polarisviewer5', 'Qn_dur_.com.instagram.android', 'Qn_dur_.com.intsig.camscanner', 'Qn_dur_.com.king.candycrushsaga', 'Qn_dur_.com.king.candycrushsodasaga', 'Qn_dur_.com.mdv.AVVCompanion', 'Qn_dur_.com.mdv.companion', 'Qn_dur_.com.melodis.midomiMusicIdentifier.freemium', 'Qn_dur_.com.microsoft.office.excel', 'Qn_dur_.com.microsoft.office.outlook', 'Qn_dur_.com.microsoft.office.powerpoint', 'Qn_dur_.com.microsoft.office.word', 'Qn_dur_.com.microsoft.skydrive', 'Qn_dur_.com.mobisystems.fileman', 'Qn_dur_.com.mobisystems.office', 'Qn_dur_.com.motorola.cameraone', 'Qn_dur_.com.netbiscuits.kicker', 'Qn_dur_.com.netflix.mediaclient', 'Qn_dur_.com.nianticlabs.pokemongo', 'Qn_dur_.com.ninegag.android.app', 'Qn_dur_.com.nintendo.zaca', 'Qn_dur_.com.oneplus.camera', 'Qn_dur_.com.paypal.android.p2pmobile', 
'Qn_dur_.com.picsart.studio', 'Qn_dur_.com.pinterest', 'Qn_dur_.com.pons.onlinedictionary', 'Qn_dur_.com.popularapp.periodcalendar', 'Qn_dur_.com.runtastic.android', 'Qn_dur_.com.samsung.android.app.galaxyfinder', 'Qn_dur_.com.samsung.android.app.memo', 'Qn_dur_.com.samsung.android.app.scrollcapture', 'Qn_dur_.com.samsung.android.calendar', 'Qn_dur_.com.samsung.android.contacts', 'Qn_dur_.com.samsung.android.email.provider', 'Qn_dur_.com.samsung.android.incallui', 'Qn_dur_.com.samsung.android.lool', 'Qn_dur_.com.samsung.android.messaging', 'Qn_dur_.com.samsung.android.qconnect', 'Qn_dur_.com.samsung.android.sm', 'Qn_dur_.com.samsung.android.themestore', 'Qn_dur_.com.samsung.android.video', 'Qn_dur_.com.samsung.android.weather', 'Qn_dur_.com.sec.android.app.camera', 'Qn_dur_.com.sec.android.app.clockpackage', 'Qn_dur_.com.sec.android.app.controlpanel', 'Qn_dur_.com.sec.android.app.fm', 'Qn_dur_.com.sec.android.app.launcher', 'Qn_dur_.com.sec.android.app.memo', 'Qn_dur_.com.sec.android.app.music', 'Qn_dur_.com.sec.android.app.myfiles', 'Qn_dur_.com.sec.android.app.popupcalculator', 'Qn_dur_.com.sec.android.app.samsungapps', 'Qn_dur_.com.sec.android.app.sbrowser', 'Qn_dur_.com.sec.android.app.shealth', 'Qn_dur_.com.sec.android.app.taskmanager', 'Qn_dur_.com.sec.android.app.videoplayer', 'Qn_dur_.com.sec.android.app.voicenote', 'Qn_dur_.com.sec.android.app.voicerecorder', 'Qn_dur_.com.sec.android.app.wallpaperchooser', 'Qn_dur_.com.sec.android.emergencylauncher', 'Qn_dur_.com.sec.android.gallery3d', 'Qn_dur_.com.sec.android.mimage.photoretouching', 'Qn_dur_.com.sec.android.wallpapercropper2', 'Qn_dur_.com.sec.android.widgetapp.ap.hero.accuweather', 'Qn_dur_.com.sec.android.widgetapp.diotek.smemo', 'Qn_dur_.com.shazam.android', 'Qn_dur_.com.shpock.android', 'Qn_dur_.com.skype.raider', 'Qn_dur_.com.snapchat.android', 'Qn_dur_.com.socialnmobile.dictapps.notepad.color.note', 'Qn_dur_.com.sonyericsson.advancedwidget.weather', 'Qn_dur_.com.sonyericsson.album', 
'Qn_dur_.com.sonyericsson.android.camera', 'Qn_dur_.com.sonyericsson.android.socialphonebook', 'Qn_dur_.com.sonyericsson.conversations', 'Qn_dur_.com.sonyericsson.extras.liveware', 'Qn_dur_.com.sonyericsson.home', 'Qn_dur_.com.sonyericsson.music', 'Qn_dur_.com.sonyericsson.organizer', 'Qn_dur_.com.sonyericsson.photoeditor', 'Qn_dur_.com.sonyericsson.video', 'Qn_dur_.com.sonymobile.android.contacts', 'Qn_dur_.com.sonymobile.android.dialer', 'Qn_dur_.com.sonymobile.calendar', 'Qn_dur_.com.sonymobile.entrance', 'Qn_dur_.com.soundcloud.android', 'Qn_dur_.com.spotify.music', 'Qn_dur_.com.starfinanz.mobile.android.pushtan', 'Qn_dur_.com.starfinanz.smob.android.sfinanzstatus', 'Qn_dur_.com.supercell.clashofclans', 'Qn_dur_.com.supercell.clashroyale', 'Qn_dur_.com.surpax.ledflashlight.panel', 'Qn_dur_.com.tellm.android.app', 'Qn_dur_.com.tinder', 'Qn_dur_.com.tumblr', 'Qn_dur_.com.twitter.android', 'Qn_dur_.com.urbandroid.sleep', 'Qn_dur_.com.valvesoftware.android.steam.community', 'Qn_dur_.com.viber.voip', 'Qn_dur_.com.vlingo.midas', 'Qn_dur_.com.wetter.androidclient', 'Qn_dur_.com.whatsapp', 'Qn_dur_.com.wunderkinder.wunderlistandroid', 'Qn_dur_.com.xing.android', 'Qn_dur_.com.yahoo.mobile.client.android.mail', 'Qn_dur_.com.yopeso.lieferando', 'Qn_dur_.com.zdf.android.mediathek', 'Qn_dur_.de.amazon.mShop.android', 'Qn_dur_.de.axelspringer.yana.zeropage', 'Qn_dur_.de.bfv.android', 'Qn_dur_.de.burgerking.kingfinder', 'Qn_dur_.de.cellular.focus', 'Qn_dur_.de.cellular.tagesschau', 'Qn_dur_.de.eplus.mappecc.client.android.alditalk', 'Qn_dur_.de.fiducia.smartphone.android.banking.vr', 'Qn_dur_.de.flixbus.app', 'Qn_dur_.de.gmx.mobile.android.mail', 'Qn_dur_.de.hafas.android.db', 'Qn_dur_.de.hafas.android.sbm', 'Qn_dur_.de.is24.android', 'Qn_dur_.de.kicktipp.mbookmark', 'Qn_dur_.de.kleiderkreisel', 'Qn_dur_.de.lieferheld.android', 'Qn_dur_.de.lineas.lit.ntv.android', 'Qn_dur_.de.lmuroomfinder.release', 'Qn_dur_.de.mensaplan.app.android.muenchen', 'Qn_dur_.de.mobile.android.app', 
'Qn_dur_.de.motain.iliga', 'Qn_dur_.de.payback.client.android', 'Qn_dur_.de.pixelhouse', 'Qn_dur_.de.schildbach.oeffi', 'Qn_dur_.de.sde.mobile', 'Qn_dur_.de.spiegel.android.app.spon', 'Qn_dur_.de.swm.mvgfahrinfo.muenchen', 'Qn_dur_.de.tagesschau', 'Qn_dur_.de.telekom.mds.mbp', 'Qn_dur_.de.tum.in.tumcampus', 'Qn_dur_.de.tvspielfilm', 'Qn_dur_.de.web.mobile.android.mail', 'Qn_dur_.de.wetteronline.wetterapp', 'Qn_dur_.de.zalando.mobile', 'Qn_dur_.de.zeit.online', 'Qn_dur_.flipboard.boxer.app', 'Qn_dur_.kik.android', 'Qn_dur_.net.lovoo.android', 'Qn_dur_.org.leo.android.dict', 'Qn_dur_.org.mozilla.firefox', 'Qn_dur_.org.telegram.messenger', 'Qn_dur_.org.thoughtcrime.securesms', 'Qn_dur_.org.wikipedia', 'Qn_dur_.pm.lamm.myandroidlogger', 'Qn_dur_.se.feomedia.quizkampen.de.lite', 'Qn_dur_.se.feomedia.quizkampen.de.premium', 'Qn_dur_.tunein.player', 'Qn_dur_.tv.peel.app', 'Qn_dur_.tv.peel.smartremote', 'Qn_dur_.tv.twitch.android.app', 'Qn_dur_.uk.amazon.mShop.android', 'Qn_dur_Academia', 'Qn_dur_Activism_Charity', 'Qn_dur_Artistic_Hobby', 'Qn_dur_Beauty', 'Qn_dur_Betting_Risk', 'Qn_dur_Calculator', 'Qn_dur_Calendar_Apps', 'Qn_dur_Calling', 'Qn_dur_Camera', 'Qn_dur_Checkup_Monitoring', 'Qn_dur_ComicsBooks', 'Qn_dur_Dating_Mating', 'Qn_dur_E_Mail', 'Qn_dur_Eating', 'Qn_dur_Education', 'Qn_dur_Emergency_Warning', 'Qn_dur_Entertainment', 'Qn_dur_Financial', 'Qn_dur_Gallery', 'Qn_dur_Gaming_Action', 'Qn_dur_Gaming_Adventure', 'Qn_dur_Gaming_Casual', 'Qn_dur_Gaming_Knowledge', 'Qn_dur_Gaming_Logic', 'Qn_dur_Gaming_Role_Playing', 'Qn_dur_Gaming_Simulation', 'Qn_dur_Gaming_Sports', 'Qn_dur_Gaming_Strategy', 'Qn_dur_Gaming_Tools_Community', 'Qn_dur_Group_Activity', 'Qn_dur_Health_SelfMonitoring', 'Qn_dur_Image_And_Video_Editing', 'Qn_dur_Internet_Browser', 'Qn_dur_Jobs_Additional_Income', 'Qn_dur_Language_Learning', 'Qn_dur_Messaging', 'Qn_dur_Music_Audio_Radio', 'Qn_dur_News_Magazines', 'Qn_dur_Note_Apps', 'Qn_dur_Office_Tools', 'Qn_dur_Organisation', 'Qn_dur_Orientation', 
'Qn_dur_Personalization', 'Qn_dur_Private_Transportation', 'Qn_dur_Provider_Services', 'Qn_dur_Public_Events', 'Qn_dur_Religion_Spirituality_Esotersim', 'Qn_dur_Security', 'Qn_dur_Settings', 'Qn_dur_Shared_Transportation', 'Qn_dur_Sharing_Cloud', 'Qn_dur_Shop_Sell_Rent', 'Qn_dur_Shopping', 'Qn_dur_Sleep', 'Qn_dur_Social_Media_Tools', 'Qn_dur_Social_Networks', 'Qn_dur_Sport_News', 'Qn_dur_Sports', 'Qn_dur_System_App', 'Qn_dur_TVVideo_Apps', 'Qn_dur_TV_Film_Guide', 'Qn_dur_Timer_Clocks', 'Qn_dur_Tools', 'Qn_dur_Travel', 'Qn_dur_Weather', 'Qn_dur_Womens_Apps', 'Qn_dur_Workout', 'Qn_dur_call_ring', 'Qn_firstevent', 'Qn_firstevent_weekdays', 'Qn_firstevent_weekends', 'Qn_lastevent', 'Qn_lastevent_weekdays', 'Qn_lastevent_weekend', 'Qn_rog', 'Qn_rog_weekdays', 'Qn_rog_weekends', 'Responses_calls', 'Responses_sms', 'SDD', 'SDD_daytime', 'SDD_nighttime', 'SDD_sat', 'SDD_weekday', 'SDD_weekend', 'app_simi_dayNight', 'app_simi_weekWeekend', 'contact_simi_call_inOut', 'contact_simi_call_weekWeekend', 'contact_simi_smsPhone', 'contact_simi_sms_inOut', 'daily_huber_homevisits', 'daily_huber_homevisits_weekday', 'daily_huber_homevisits_weekend', 'daily_mean_duration_music', 'daily_mean_duration_music_weekdays', 'daily_mean_duration_music_weekend', 'daily_mean_elev_change', 'daily_mean_elev_change_weekdays', 'daily_mean_elev_change_weekend', 'daily_mean_neg_elev_change', 'daily_mean_num_clusters', 'daily_mean_num_clusters_week', 'daily_mean_num_clusters_weekend', 'daily_mean_pos_elev_change', 'daily_sd_duration_music', 'daily_sd_elev_change', 'daily_sd_homevisits', 'daily_sd_homevisits_weekday', 'daily_sd_homevisits_weekend', 'daily_sd_num_song', 'daily_sd_num_uniq_alb', 'daily_sd_num_uniq_art', 'daily_sd_num_uniq_song', 'daily_sd_sum_intereventall', 'durationHome', 'entropy_contacts_call_in', 'entropy_contacts_call_miss', 'entropy_contacts_call_out', 'entropy_contacts_call_ring', 'entropy_contacts_sms_in', 'entropy_contacts_sms_sent', 'entropy_duration_clusters', 
'excess_music_acousticness', 'excess_music_danceability', 'excess_music_energy', 'excess_music_instrumentalness', 'excess_music_liveness', 'excess_music_loudness', 'excess_music_popularity', 'excess_music_speechiness', 'excess_music_tempo', 'excess_music_valence', 'fav1_acousticness', 'fav1_daily_mean_duration', 'fav1_daily_mean_num', 'fav1_danceability', 'fav1_energy', 'fav1_instrumentalness', 'fav1_liveness', 'fav1_loudness', 'fav1_popularity', 'fav1_speechiness', 'fav1_tempo', 'fav1_valence', 'fav2_acousticness', 'fav2_daily_mean_duration', 'fav2_daily_mean_num', 'fav2_danceability', 'fav2_energy', 'fav2_instrumentalness', 'fav2_liveness', 'fav2_loudness', 'fav2_popularity', 'fav2_speechiness', 'fav2_tempo', 'fav2_valence', 'fav3_acousticness', 'fav3_daily_mean_duration', 'fav3_daily_mean_num', 'fav3_danceability', 'fav3_energy', 'fav3_instrumentalness', 'fav3_liveness', 'fav3_loudness', 'fav3_popularity', 'fav3_speechiness', 'fav3_tempo', 'fav3_valence', 'fav4_acousticness', 'fav4_daily_mean_duration', 'fav4_daily_mean_num', 'fav4_danceability', 'fav4_energy', 'fav4_instrumentalness', 'fav4_liveness', 'fav4_loudness', 'fav4_popularity', 'fav4_speechiness', 'fav4_tempo', 'fav4_valence', 'fav5_acousticness', 'fav5_daily_mean_duration', 'fav5_daily_mean_num', 'fav5_danceability', 'fav5_energy', 'fav5_instrumentalness', 'fav5_liveness', 'fav5_loudness', 'fav5_popularity', 'fav5_speechiness', 'fav5_tempo', 'fav5_valence', 'huberM_daily_max_dist_home', 'huberM_daily_max_dist_home_weekday', 'huberM_daily_max_dist_home_weekend', 'huberM_daily_time_spent_home', 'huberM_distance_covered_daily', 'huberM_distance_covered_weekday', 'huberM_distance_covered_weekend', 'huberM_dur_.com.android.deskclock', 'huberM_dur_.com.android.phone', 'huberM_dur_.com.android.settings', 'huberM_dur_.com.android.systemui', 'huberM_dur_.com.estrongs.android.pop', 'huberM_dur_.com.facebook.orca', 'huberM_dur_.com.google.android.calendar', 'huberM_dur_.com.google.android.talk', 
'huberM_dur_.com.huawei.vassistant', 'huberM_dur_.com.netflix.mediaclient', 'huberM_dur_.com.samsung.android.app.galaxyfinder', 'huberM_dur_.com.samsung.android.incallui', 'huberM_dur_.com.samsung.android.messaging', 'huberM_dur_.com.sec.android.app.fm', 'huberM_dur_.com.sec.android.app.taskmanager', 'huberM_dur_.com.sonymobile.entrance', 'huberM_dur_.com.surpax.ledflashlight.panel', 'huberM_dur_.flipboard.app', 'huberM_dur_.tv.peel.smartremote', 'huberM_dur_Calendar_Apps', 'huberM_dur_Checkup_Monitoring', 'huberM_dur_Gaming_Tools_Community', 'huberM_dur_Music_Audio_Radio', 'huberM_dur_Personalization', 'huberM_dur_Settings', 'huberM_dur_Timer_Clocks', 'huberM_dur_Tools', 'huberM_dur_night_Calculator', 'huberM_dur_night_Calendar_Apps', 'huberM_dur_night_Calling', 'huberM_dur_night_Camera', 'huberM_dur_night_Gallery', 'huberM_dur_night_Gaming_Strategy', 'huberM_dur_night_Gaming_Tools_Community', 'huberM_dur_night_Language_Learning', 'huberM_dur_night_Music_Audio_Radio', 'huberM_dur_night_Organisation', 'huberM_dur_night_Private_Transportation', 'huberM_dur_night_Settings', 'huberM_dur_night_TVVideo_Apps', 'huberM_dur_night_Timer_Clocks', 'huberM_dur_night_Tools', 'huberM_firstevent', 'huberM_firstevent_weekdays', 'huberM_firstevent_weekends', 'huberM_lastevent', 'huberM_lastevent_weekdays', 'huberM_lastevent_weekend', 'huberM_max_dist_two_points_daily', 'huberM_max_dist_two_points_weekday', 'huberM_max_dist_two_points_weekend', 'huberM_rog_daily', 'huberM_rog_nightly', 'huberM_rog_weekdays', 'huberM_rog_weekends', 'huberM_time_spent_home', 'huberM_time_spent_home_weekday', 'huberM_time_spent_home_weekend', 'maxDistance', 'max_distance_home', 'max_elevation', 'max_elevation_weekdays', 'max_elevation_weekends', 'max_music_acousticness', 'max_music_danceability', 'max_music_energy', 'max_music_instrumentalness', 'max_music_liveness', 'max_music_loudness', 'max_music_popularity', 'max_music_speechiness', 'max_music_tempo', 'max_music_valence', 'mean_charge_conn', 
'mean_charge_dis', 'mean_dur_wakeLeave', 'mean_dur_wakeLeaveHome', 'mean_dur_wakeLeaveHome_weekday', 'mean_dur_wakeLeaveHome_weekend', 'mean_elevation', 'mean_elevation_weekdays', 'mean_elevation_weekends', 'mean_music_acousticness', 'mean_music_acousticness_weekday', 'mean_music_acousticness_weekend', 'mean_music_danceability', 'mean_music_danceability_weekday', 'mean_music_danceability_weekend', 'mean_music_energy', 'mean_music_energy_weekday', 'mean_music_energy_weekend', 'mean_music_explicit', 'mean_music_explicit_weekday', 'mean_music_explicit_weekend', 'mean_music_instrumentalness', 'mean_music_instrumentalness_weekday', 'mean_music_instrumentalness_weekend', 'mean_music_liveness', 'mean_music_liveness_weekday', 'mean_music_liveness_weekend', 'mean_music_loudness', 'mean_music_loudness_weekday', 'mean_music_loudness_weekend', 'mean_music_mode', 'mean_music_mode_weekday', 'mean_music_mode_weekend', 'mean_music_popularity', 'mean_music_popularity_weekday', 'mean_music_popularity_weekend', 'mean_music_speechiness', 'mean_music_speechiness_weekday', 'mean_music_speechiness_weekend', 'mean_music_tempo', 'mean_music_tempo_weekday', 'mean_music_tempo_weekend', 'mean_music_valence', 'mean_music_valence_weekday', 'mean_music_valence_weekend', 'mean_time_LeaveHome', 'mean_time_LeaveHome_weekday', 'mean_time_LeaveHome_weekend', 'mean_time_callback', 'mean_time_firstLeave', 'mean_time_lastHome', 'mean_time_lastHome_weekday', 'mean_time_lastHome_weekend', 'min_music_acousticness', 'min_music_danceability', 'min_music_energy', 'min_music_instrumentalness', 'min_music_liveness', 'min_music_loudness', 'min_music_popularity', 'min_music_speechiness', 'min_music_tempo', 'min_music_valence', 'perc_music_key_0', 'perc_music_key_0_weekday', 'perc_music_key_0_weekend', 'perc_music_key_1', 'perc_music_key_10', 'perc_music_key_10_weekday', 'perc_music_key_10_weekend', 'perc_music_key_1_weekday', 'perc_music_key_1_weekend', 'perc_music_key_2', 'perc_music_key_2_weekday', 
'perc_music_key_2_weekend', 'perc_music_key_3', 'perc_music_key_3_weekday', 'perc_music_key_3_weekend', 'perc_music_key_4', 'perc_music_key_4_weekday', 'perc_music_key_4_weekend', 'perc_music_key_5', 'perc_music_key_5_weekday', 'perc_music_key_5_weekend', 'perc_music_key_6', 'perc_music_key_6_weekday', 'perc_music_key_6_weekend', 'perc_music_key_7', 'perc_music_key_7_weekday', 'perc_music_key_7_weekend', 'perc_music_key_8', 'perc_music_key_8_weekday', 'perc_music_key_8_weekend', 'perc_music_key_9', 'perc_music_key_9_weekday', 'perc_music_key_9_weekend', 'responserate_calls', 'responserate_sms', 'rog', 'sd_dur_down', 'sd_dur_down_Fri_and_Sat', 'sd_dur_down_Sun_until_Thu', 'sd_dur_wakeLeave', 'sd_dur_wakeLeaveHome', 'sd_dur_wakeLeaveHome_weekday', 'sd_dur_wakeLeaveHome_weekend', 'sd_music_acousticness', 'sd_music_acousticness_weekday', 'sd_music_acousticness_weekend', 'sd_music_danceability', 'sd_music_danceability_weekday', 'sd_music_danceability_weekend', 'sd_music_energy', 'sd_music_energy_weekday', 'sd_music_energy_weekend', 'sd_music_instrumentalness', 'sd_music_instrumentalness_weekday', 'sd_music_instrumentalness_weekend', 'sd_music_liveness', 'sd_music_liveness_weekday', 'sd_music_liveness_weekend', 'sd_music_loudness', 'sd_music_loudness_weekday', 'sd_music_loudness_weekend', 'sd_music_popularity', 'sd_music_popularity_weekday', 'sd_music_popularity_weekend', 'sd_music_speechiness', 'sd_music_speechiness_weekday', 'sd_music_speechiness_weekend', 'sd_music_tempo', 'sd_music_tempo_weekday', 'sd_music_tempo_weekend', 'sd_music_valence', 'sd_music_valence_weekday', 'sd_music_valence_weekend', 'sd_time_LeaveHome', 'sd_time_LeaveHome_weekday', 'sd_time_LeaveHome_weekend', 'sd_time_firstLeave', 'sd_time_lastHome', 'sd_time_lastHome_weekday', 'sd_time_lastHome_weekend', 'skew_music_acousticness', 'skew_music_danceability', 'skew_music_energy', 'skew_music_instrumentalness', 'skew_music_liveness', 'skew_music_loudness', 'skew_music_popularity', 
'skew_music_speechiness', 'skew_music_tempo', 'skew_music_valence', but learner 'regr.lm' does not support this
Unfortunately, this fails because there are missing values in the dataset and regr.lm cannot handle them. We will use mlr3pipelines (Binder et al. 2021) to build a simple analysis pipeline, called a GraphLearner in mlr3 (a learner consisting of several consecutive analysis steps that can be visualized as a graph), that automatically replaces missing values with the median of the respective training set (i.e., median imputation) before fitting our linear model. We do not recommend using mean or median imputation in real applications.² A tutorial on how to build more complex analysis pipelines with the mlr3pipelines package can be found in the mlr3 e-book (Becker et al. 2022).
imputer <- po("imputemedian") # po defines a single pipeline operation
lm <- as_learner(imputer %>>% lm) # combine po and learner into a pipeline
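To make "the median of the respective training set" concrete, here is a base-R sketch of the idea on a single made-up feature. It is not the mlr3pipelines implementation; the vector `x`, the index vectors, and the helper `impute_median` are hypothetical. The key point is that the fill value is learned from the training rows only and then reused for the test rows.

```r
# Hypothetical feature with missing values in both training and test rows
x <- c(2, NA, 4, 10, 6, NA, 8, 1)
train_idx <- 1:5
test_idx  <- 6:8

# Helper (illustrative only): replace missing entries with a given fill value
impute_median <- function(values, fill) {
  values[is.na(values)] <- fill
  values
}

# Learn the median on the training rows only
train_median <- median(x[train_idx], na.rm = TRUE)

# Apply the same training median to training and test rows
x_train <- impute_median(x[train_idx], train_median)
x_test  <- impute_median(x[test_idx], train_median)

print(train_median)
print(x_train)
print(x_test)
```

Using the training median for the test rows keeps information from the test data out of the fitted pipeline, which matters once performance is estimated with resampling.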
Now, training the augmented learner on the task works just fine.
lm$train(task = task_Soci)
The previous line trained the model and automatically stored it inside the learner object. One great advantage of mlr3 is that we can use the same modeling functions for ML models from different R packages without having to remember the peculiarities of their modeling syntax. We can use the trained model to make predictions, which we have to store in a separate object.³
prediction <- lm$predict(task = task_Soci)
We just predicted the same data that we already used for model training, but we could also compute predictions for new observations. In the Sociability task, we did not include four individuals with missing values on the gender variable. Because we do not use gender as a feature here, we can treat these individuals as new data and predict their sociability score with $predict_newdata().
phonedata_new <- readRDS(file = "clusterdata.RDS")
phonedata_new <- phonedata_new[
  !complete.cases(phonedata_new$gender), c(1:1821, 1837)]
lm$predict_newdata(newdata = phonedata_new)$response
[1] 602.66204 -373.82706 -44.80488 -27.41789
attr(,"non-estim")
1 2 3 4
1 2 3 4
With this functionality, it would be possible to use the model in a practical application. However, it would be irresponsible to apply any predictive model for which the expected predictive performance is unknown. Therefore, we now demonstrate how to evaluate predictive performance with mlr3.
If we wanted to compute in-sample performance based on the predictions for all observations included in our task (which we stored in prediction), we could calculate the estimates with the score function and specify the performance measures we are interested in (\(R^2\) and \(RMSE\)) with their respective ids. For an exhaustive list of all performance measures available in mlr3, type as.data.table(mlr_measures) or check out the mlr3 e-book (Becker et al. 2022).
mes <- msrs(c("regr.rsq", "regr.rmse"))
prediction$score(mes)
regr.rsq regr.rmse
1.000000e+00 2.433205e-11
The performance on the training data is almost perfect. \(R^2\) is \(1\) and the \(RMSE\) is numerically indistinguishable from \(0\) (see all.equal(0, 2.433205e-11)). We should always be skeptical when we observe very high in-sample performance, because this can be a sign that the model has overfitted the training data. In general, we should never trust in-sample performance but estimate out-of-sample performance instead.
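The following base-R sketch illustrates why a perfect in-sample fit carries little information when there are more predictors than observations. It uses simulated pure noise rather than the PhoneStudy data, and the sizes `n` and `p` are arbitrary assumptions. \(R^2\) and \(RMSE\) are computed by hand from the fitted values.

```r
set.seed(42)
n <- 20   # observations
p <- 30   # predictors (more than observations)

# Pure noise: the target is unrelated to every predictor
X   <- as.data.frame(matrix(rnorm(n * p), nrow = n))
y   <- rnorm(n)
dat <- cbind(y = y, X)

# lm() can only estimate n coefficients; the remaining ones are set to NA
fit  <- lm(y ~ ., data = dat)
pred <- fitted(fit)

# In-sample performance computed by hand
rmse <- sqrt(mean((y - pred)^2))
rsq  <- 1 - sum((y - pred)^2) / sum((y - mean(y))^2)

print(rsq)
print(isTRUE(all.equal(0, rmse)))
print(sum(is.na(coef(fit))))
```

Even though the target is random noise, the model reproduces the training data essentially exactly, so the in-sample estimates say nothing about how well it would predict new observations.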
Compute Out-of-sample Performance Estimate
Next, we want to use CV to compute an out-of-sample performance estimate. We specify a resampling strategy (here 5-fold CV). You can run as.data.table(mlr_resamplings) to get a table of available resampling strategies.
rdesc <- rsmp("cv", folds = 5)
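A resampling object on its own only describes the splitting scheme; the concrete row assignment is created once it is instantiated on a task. The sketch below shows this on mlr3's built-in mtcars example task rather than on the PhoneStudy data. The object names `task_demo` and `cv_demo` are chosen for this illustration only.

```r
library(mlr3)

task_demo <- tsk("mtcars")        # built-in example task with 32 rows
cv_demo   <- rsmp("cv", folds = 5)

set.seed(1)
cv_demo$instantiate(task_demo)    # fix the random assignment of rows to folds

# Row ids used for training and testing in the first iteration
train_1 <- cv_demo$train_set(1)
test_1  <- cv_demo$test_set(1)

# Across all folds, each row appears in exactly one test set
all_test <- unlist(lapply(1:5, function(i) cv_demo$test_set(i)))

print(length(train_1))
print(length(test_1))
print(length(intersect(train_1, test_1)))
```

Because every observation lands in exactly one test set, each prediction used for the performance estimate comes from a model that did not see that observation during training.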
The resample function randomly splits our dataset according to our resampling description, retrains our learner on each training set, and computes predictions on the corresponding test set. Before running resample, we set an arbitrary seed to make our results reproducible. Next, we compute the out-of-sample performance estimate for our preferred measures, aggregated across our 5 test sets, with aggregate.
set.seed(1)
res <- resample(learner = lm, task = task_Soci, resampling = rdesc)
INFO [11:22:30.262] [mlr3] Applying learner 'imputemedian.regr.lm' on task 'Sociability_Regr' (iter 1/5)
INFO [11:22:42.474] [mlr3] Applying learner 'imputemedian.regr.lm' on task 'Sociability_Regr' (iter 2/5)
INFO [11:22:54.633] [mlr3] Applying learner 'imputemedian.regr.lm' on task 'Sociability_Regr' (iter 3/5)
INFO [11:23:06.931] [mlr3] Applying learner 'imputemedian.regr.lm' on task 'Sociability_Regr' (iter 4/5)
INFO [11:23:19.358] [mlr3] Applying learner 'imputemedian.regr.lm' on task 'Sociability_Regr' (iter 5/5)
res$aggregate(mes)
regr.rsq regr.rmse
-2340.92709 79.31954
When we compare the out-of-sample to the in-sample estimates, we realize that the predictions of our model are expected to be very poor. This might be no surprise to many readers because we used ordinary linear regression with 620 observations and 1822 predictor variables, which results in an unidentified model. As a consequence, the \(RMSE\) is huge: a typical deviation between true and predicted sociability scores is about 79, but the true sociability scores in the dataset range only from -4.5 to 5.64. The negative \(R^2\) also implies that the predictive model should not be used in practice. Remember that, in contrast to the well-known in-sample estimate for linear regression, out-of-sample \(R^2\) can be negative. Negative \(R^2\) indicates that the model performs worse than a simple baseline model that completely ignores all features and merely predicts the mean target value in the test data. The concrete values of negative \(R^2\) do not have any intuitive interpretation. We give a better intuition on why \(R^2\) can become negative in ESM 3.1. The important message here is that with a poorly designed ML model, it is easy to produce worse predictions than simple guessing. The naive notion that using any predictive model might still be better than using no formal predictions at all is wrong. However, estimating predictive performance with resampling can prevent us from applying inappropriate models in practice, without relying on expert knowledge about the specific model class (e.g., identification issues in linear regression).
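As a small illustration of how \(R^2\) behaves out of sample, the base-R sketch below first evaluates the formula on three hand-picked values and then runs a hand-rolled 5-fold CV on simulated noise with as many predictors as observations. This is not the mlr3 implementation; the helper `rsq`, the sizes `n` and `p`, and the simulated data are assumptions made for this example. The baseline in the denominator is the mean of the true values in the respective test set.

```r
# R^2 relative to a baseline that predicts the mean of the true values
rsq <- function(truth, response) {
  1 - sum((truth - response)^2) / sum((truth - mean(truth))^2)
}

# Deterministic check: predictions that reverse the true ordering
# residual sum of squares = 8, baseline sum of squares = 2
rsq_reversed <- rsq(truth = c(1, 2, 3), response = c(3, 2, 1))

# Hand-rolled 5-fold CV on pure noise with as many predictors as observations
set.seed(1)
n <- 40
p <- 40
dat   <- data.frame(y = rnorm(n), matrix(rnorm(n * p), nrow = n))
folds <- sample(rep(1:5, length.out = n))

rsq_per_fold <- sapply(1:5, function(k) {
  train <- dat[folds != k, ]
  test  <- dat[folds == k, ]
  fit   <- lm(y ~ ., data = train)
  # predict() may warn about the rank-deficient fit; predictions are still returned
  pred  <- suppressWarnings(predict(fit, newdata = test))
  rsq(test$y, pred)
})
rsq_cv <- mean(rsq_per_fold)

print(rsq_reversed)
print(rsq_per_fold)
print(rsq_cv)
```

In the hand-picked case, the residual sum of squares is four times the baseline sum of squares, which gives \(R^2 = 1 - 8/2 = -3\). In the simulated case, the per-fold values will typically also be negative, because a model that interpolates noise in the training folds predicts the held-out fold worse than its mean does.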
Practical Exercise II:
Follow up here with Practical Exercise II on how to Train a Random Forest and Estimate Predictive Performance.
References
Footnotes
1. The E2.Sociableness variable is the estimated person parameter of a Partial Credit Model (Masters 1982) for the sociability facet of the personality trait extraversion in the BFSI. For details, see Stachl et al. (2020).
2. In Stachl et al. (2020), a more advanced analysis pipeline and imputation strategy was used than in this tutorial. For a description, see the supplementary information for that paper.
3. R issues a warning that the predictions may be misleading, but they are computed nonetheless.