# statistics

## Survival analysis 2: parametric survival models

Contents: Previous topics · Why do we need parametric survival models · Is Time a Variable or a Constant? · Steady change in hazard and survival · Positive exponential change in hazard and survival · Negative exponential change in hazard and survival · B(u)ilding Exponential model … finally 🥳 · How to compute parametric models · Final thoughts · Further readings and references

Previous topics: A good understanding of the Kaplan-Meier method (KM) is a prerequisite for this post, but since you are here, I suppose you are already familiar with it 😉

## Survival analysis 1: a gentle introduction into Kaplan-Meier Curves

Contents: Previous topics · Why do we need survival analysis? · Death is not the only option! Or: “What is an event?” · Censoring · How to calculate Kaplan-Meier survival curve “manually” step by step · Survival probability · How to compute Kaplan-Meier survival curve · Interpretation of Kaplan-Meier Curve · Comparing survival of groups · 2 groups · Interpretation of groups comparison using 4 benchmarks · Log-Rank test · > 2 groups and multiple pairwise (post-hoc) Log-Rank test · Multiple survival curves · Conclusions · What’s next?

## 4. Roc(k) is confusing: or the link between machine learning and epidemiology

Previous topics: 1. Introduction to statistics · 2. Linear regression · 3. Logistic regression

Why do we need ROC curves and confusion matrices? For assessing the predictive accuracy of a classifier (e.g. logistic regression), or for assessing the accuracy of medical tests and calculating lots of medical metrics (e.g. prevalence). Since the confusion matrix works for both classification and medicine, there is a nice link between machine learning and epidemiology. This is the last lecture in my course “Statistics for non statisticians”.

## Let's ROC(k)

Contents: Previous topics · Why do we need ROC curve? · Intro · How to ROC(k) · Middle (default) threshold · Lower threshold · Higher threshold · Connect the dots · ROC curve using logistic regression · ROC curve using random forest (fancy ML! 🥳) · AUC - Area Under the ROC Curve for model comparison · Conclusions · Further readings, videos and references

Previous topics: The Receiver Operating Characteristic (ROC) curve is an offspring of the confusion matrix.
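The threshold steps named in the outline (middle, lower and higher thresholds, then connecting the dots) can be sketched in a few lines of Python. This is only an illustration and not the code from the post: the labels and scores are made-up values, and the function names `roc_points` and `auc` are hypothetical.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    Every distinct score is used as a threshold; a case is predicted
    positive when its score is at or above the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve via the trapezoid rule ("connect the dots")."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up example: 1 = event, 0 = no event, scores = predicted probabilities
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
area = auc(points)
```

Lowering the threshold moves along the curve toward the top right corner: more true positives are caught, at the price of more false positives.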

## Confusion Matrix and its 25 offspring: or the link between machine learning and epidemiology

Contents: Previous topics · Why do we need confusion matrix? · How to get a confusion matrix · The anatomy of a confusion matrix · Offspring of a confusion matrix · True metrics · Predictive metrics · Likelihood metrics · Accuracy metrics · How to compute Confusion Matrix and its 25 offspring · Conclusion, or which performance score is better? · What’s next · Further readings and references

Photo by Markus Spiske on Unsplash.

Previous topics: If you know how to create a confusion matrix, you are good to go.
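As a rough illustration of how a few of these offspring fall out of the four cells of a confusion matrix, here is a minimal Python sketch. It covers only a handful of the metrics, the class name `ConfusionMatrix` is hypothetical, and the counts are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    tp: int  # true positives: event present, predicted present
    fp: int  # false positives: event absent, predicted present
    fn: int  # false negatives: event present, predicted absent
    tn: int  # true negatives: event absent, predicted absent

    @property
    def total(self):
        return self.tp + self.fp + self.fn + self.tn

    def sensitivity(self):  # true positive rate
        return self.tp / (self.tp + self.fn)

    def specificity(self):  # true negative rate
        return self.tn / (self.tn + self.fp)

    def ppv(self):  # positive predictive value (precision)
        return self.tp / (self.tp + self.fp)

    def npv(self):  # negative predictive value
        return self.tn / (self.tn + self.fn)

    def accuracy(self):
        return (self.tp + self.tn) / self.total

    def prevalence(self):
        return (self.tp + self.fn) / self.total


# Invented counts for 200 cases
cm = ConfusionMatrix(tp=40, fp=10, fn=20, tn=130)
```

With these invented counts the accuracy is 0.85, while the sensitivity is only about 0.67, which is one reason a single score rarely tells the whole story.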

## 3. Logistic regression: or what is the probability of success?

Previous topics: 1. Introduction to statistics · 2. Linear regression

Why do we need logistic regression? For predictions, and for studying how things influence other things. Since logistic regression can easily handle both numerical and categorical predictors, it gives you the power to check literally anything for its ability to increase or decrease the probability (or the odds) of success. Providing probabilities makes logistic regression one of the most useful statistical tools for understanding the world.
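Since the excerpt speaks of both probabilities and odds, a small Python sketch of how the two relate (and how log-odds, the scale logistic regression works on, map back to a probability) may help. The function names are hypothetical and the numbers are just examples.

```python
import math


def odds_from_probability(p):
    """Odds of success: probability of success divided by probability of failure."""
    return p / (1 - p)


def log_odds_from_probability(p):
    """Log-odds (logit): the scale on which logistic regression is linear."""
    return math.log(odds_from_probability(p))


def probability_from_log_odds(log_odds):
    """Inverse logit: turns a log-odds value back into a probability."""
    return 1 / (1 + math.exp(-log_odds))


p = 0.8
odds = odds_from_probability(p)          # 4 successes for every failure
logit = log_odds_from_probability(p)     # natural log of the odds
p_back = probability_from_log_odds(logit)
```

A probability of 0.5 corresponds to odds of 1 and log-odds of 0, so positive log-odds mean success is more likely than failure.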

## 2. Linear regression vs. Statistical Tests ⚔ who wins?

Previous topics: Introduction to statistics

Why do we need linear regression? For predictions, and for studying how things influence other things. Regression is a line which tries to be as close as possible to all data points simultaneously, and in this way describes your data using only two numbers: intercept and slope. If there is a relationship between two variables, then you can predict one of those variables by knowing only the value of the other.
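To show what "two numbers, intercept and slope" means in practice, here is a minimal least-squares sketch in plain Python. It assumes the usual ordinary least squares definition of "as close as possible"; the function name `fit_line` and the data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: how much x and y vary together, relative to how much x varies
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # Intercept: the fitted line always passes through (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data lying exactly on the line y = 1 + 2x
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]

intercept, slope = fit_line(x, y)
prediction = intercept + slope * 5  # predict y for a new x value of 5
```

Once the intercept and slope are known, predicting the outcome for a new value of the predictor is a single multiplication and addition.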

## 1. Introduction to statistics: The (small) Big Picture or how to solve 95% of statistical problems

Why do we need statistics? To learn about the world, to do science, and to develop artificial intelligence. The bad news is that statistics is unintuitive, boring and hard to understand; otherwise, you’d already know everything. But the good news is that you don’t need to understand it. You just need to know how to use statistics to get the most out of your data. Think about driving a car for a moment.

## Logistic regression 5: multiple logistic regression with interactions

Contents: Previous topics · Why do we need interactions · Two categorical predictors · Visual interpretation · Post-hoc analysis · Model output interpretation · One numeric and one categorical predictors · Model interpretation · Post-hoc · Two numeric predictors · Multiple logistic regression with higher order interactions · Welcome to a new world of machine learning! · Choosing a model · What’s next · Final thoughts · Further readings and references

Previous topics: A good understanding of four topics is a prerequisite for this post:

## Logistic regression 4: multiple logistic regression

Contents: Previous topics · Why do we need multiple logistic regression · Two categorical predictors · One categorical and one numeric predictors · Multiple logistic regression with 3 variables · Conclusion · When NOT to use a multiple logistic regression · What’s next · Further readings and references

Previous topics: A good understanding of three topics is a prerequisite for this post: odds, log-odds and probabilities; how logistic regression works; simple logistic regression.

Why do we need multiple logistic regression? If one predictor influences another predictor (they are correlated), producing two simple regressions (one with each predictor) might give completely different results compared to the model containing both predictors.
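A rough way to see this without fitting any model is to compare odds ratios from 2x2 tables. Note that this swaps in a simpler technique than the post describes: the crude odds ratio below is what a simple logistic regression with one binary predictor would report, while the stratum-specific odds ratios only approximate what a model containing both predictors adjusts for. All counts are hypothetical, as are the variable names (`treatment`, `severity`).

```python
def odds_ratio(success_treated, fail_treated, success_untreated, fail_untreated):
    """Odds of success with treatment divided by odds of success without it."""
    return (success_treated * fail_untreated) / (fail_treated * success_untreated)


# Hypothetical counts per severity stratum:
# (treated successes, treated failures, untreated successes, untreated failures)
# Treatment and severity are correlated: most treated patients are severe cases.
strata = {
    "mild": (18, 2, 64, 16),
    "severe": (40, 40, 6, 14),
}

# Effect of treatment within each severity level (severity held fixed)
stratum_or = {name: odds_ratio(*counts) for name, counts in strata.items()}

# Effect of treatment when severity is ignored (counts pooled over strata)
pooled = [sum(column) for column in zip(*strata.values())]
crude_or = odds_ratio(*pooled)
```

In this made-up data the treatment raises the odds of success within both severity levels (odds ratios above 2), yet the pooled odds ratio is below 1, because the treated group consists mostly of severe cases.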