Imagine a box where you put all of your machine learning stuff. Here it is. [WIP] The structure will be updated.

Bias vs Variance

Metrics

Precision

Recall

Accuracy

F1-score

Cross-Validation

How do you choose which cross-validation technique to use for your project? Think about how your model will be used and how it will interact with the data in a deployed setting. If the dataset is huge, use hold-out, which is basically a single train/test split such as 80-20.
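As a minimal sketch of the hold-out idea, assuming scikit-learn is the library in use (the notes only name its splitters) and using made-up toy data, an 80-20 split could look like this:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (hypothetical): 100 samples, 3 features, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = rng.integers(0, 2, size=100)

# Hold-out: keep 20% of the rows aside for evaluation, train on the other 80%.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

print(X_train.shape, X_test.shape)  # (80, 3) (20, 3)
```

The model is fit once on the training part and scored once on the held-out part, which is why this is cheap enough for very large datasets.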

K fold

Use K-fold if the data points are independent of each other.

If the dataset is imbalanced, use StratifiedKFold, since it is aware of the classes and keeps their proportions roughly the same in every fold. If there is very little data, use ShuffleSplit.
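To make the difference concrete, here is a small sketch comparing plain K-fold with the stratified variant. The labels are a hypothetical toy set (90 negatives, 10 positives), and scikit-learn is assumed:

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

# Toy imbalanced labels (hypothetical): 90 negatives, 10 positives.
y = np.array([0] * 90 + [1] * 10)
X = np.zeros((len(y), 1))  # features do not matter for the split itself

kf = KFold(n_splits=5, shuffle=True, random_state=0)
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Count how many positive samples land in each validation fold.
kfold_positives = [int(y[val].sum()) for _, val in kf.split(X)]
stratified_positives = [int(y[val].sum()) for _, val in skf.split(X, y)]

print("KFold:          ", kfold_positives)
print("StratifiedKFold:", stratified_positives)
```

With plain K-fold the number of positives per fold depends on the shuffle, while the stratified splitter places 2 of the 10 positives in each of the 5 folds.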

Time split

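If time influences how the data is generated, use a time split (TimeSeriesSplit in scikit-learn), so that the model is always validated on data that comes after its training data.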
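A short sketch of a time-based split, assuming scikit-learn's `TimeSeriesSplit` and a hypothetical toy series that is already sorted by time:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Toy data (hypothetical): 12 observations, already in chronological order.
X = np.arange(12).reshape(-1, 1)

tscv = TimeSeriesSplit(n_splits=3)
splits = list(tscv.split(X))

for train_idx, val_idx in splits:
    # Every validation index is later than every training index.
    print("train:", train_idx, "validate:", val_idx)
```

The training window grows with each split and the validation window always lies in the "future", which mirrors how a deployed model only ever sees data from the past.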

GroupKfold

If the data is generated by patients, say n patients each contributing several data points, it is better to use GroupKFold. The groups in this case are the participants, and all data points from one participant stay on the same side of every split.
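The patient scenario can be sketched as follows. The patient IDs and measurements are hypothetical, and scikit-learn's `GroupKFold` is assumed:

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Toy data (hypothetical): 4 patients with 3 measurements each.
groups = np.repeat(["p1", "p2", "p3", "p4"], 3)
X = np.arange(len(groups)).reshape(-1, 1)
y = np.tile([0, 1, 0], 4)

gkf = GroupKFold(n_splits=4)
folds = list(gkf.split(X, y, groups=groups))

for train_idx, val_idx in folds:
    # A patient never appears in both the training and the validation part.
    print("validation patients:", sorted(set(groups[val_idx].tolist())))
```

Because no patient is split across training and validation, the score estimates how the model does on a patient it has never seen, rather than on new samples from a familiar one.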

Stratified GroupKFold

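If the data is grouped and the classes are also skewed, we can use StratifiedGroupKFold, which keeps each group intact while trying to preserve the class proportions in every fold.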