Chapter 3:
Linear Regression
Bias-Variance Tradeoff
Prediction of continuous variables

Billionaire says: Wait, that’s not what I meant!
You say: Chill out, dude.
He says: I want to predict a continuous variable from continuous inputs: I want to predict salaries from GPA.
You say: I can regress that…
The regression problem

Instances: ⟨x_j, t_j⟩
Learn: mapping from x to t(x)
Hypothesis space:
  H = { h : h(x) = Σ_i w_i φ_i(x) }
  Given basis functions φ_1,…,φ_k
  Find coefficients w = {w_1,…,w_k}
Why is this called linear regression???
  The model is linear in the parameters w
Precisely, minimize the residual squared error:
  w* = argmin_w Σ_j ( t_j − Σ_i w_i φ_i(x_j) )²
The regression problem in matrix notation

Collect the basis-function values into the N×k design matrix Φ, with Φ_ji = φ_i(x_j), and the targets into the vector t = (t_1,…,t_N)ᵀ. Then the residual squared error is

  w* = argmin_w (Φw − t)ᵀ(Φw − t)
Regression solution = simple matrix operations

Setting the gradient of the squared error to zero gives the normal equations, solved by

  w* = (ΦᵀΦ)⁻¹ Φᵀ t

  where ΦᵀΦ is a k×k matrix and Φᵀt is a k-vector
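The closed-form solution w* = (ΦᵀΦ)⁻¹Φᵀt can be checked numerically. A minimal sketch, assuming NumPy and toy data of my own (one-dimensional input with basis functions 1 and x):

```python
import numpy as np

# Toy data of my own: predict t from one input x with basis
# functions phi_0(x) = 1 and phi_1(x) = x (true w = [2, 3]).
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=50)
t = 2.0 + 3.0 * x + rng.normal(scale=0.1, size=50)

# Design matrix Phi: one column per basis function.
Phi = np.column_stack([np.ones_like(x), x])

# Closed-form solution of the normal equations: (Phi^T Phi) w = Phi^T t.
w_closed = np.linalg.solve(Phi.T @ Phi, Phi.T @ t)

# The same fit via a numerically stabler least-squares routine.
w_lstsq, *_ = np.linalg.lstsq(Phi, t, rcond=None)

print(w_closed)  # close to the true coefficients [2, 3]
print(np.allclose(w_closed, w_lstsq))
```

In practice a least-squares routine such as `np.linalg.lstsq` is preferred over forming ΦᵀΦ explicitly, since the normal equations square the condition number of Φ.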
But, why?

Billionaire (again) says: Why sum squared error???
You say: Gaussians…
Model: prediction is a linear function plus Gaussian noise:
  t(x) = Σ_i w_i φ_i(x) + ε,  ε ~ N(0, σ²)
Maximizing the log-likelihood of the data over w:
Least-squares linear regression is MLE for Gaussians!!!
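To spell out the step from likelihood to squared error: under the Gaussian-noise model, the log-likelihood of N independent observations is

```latex
\ln p(\mathcal{D} \mid \mathbf{w}, \sigma)
  = \sum_{j=1}^{N} \ln \mathcal{N}\!\left(t_j \,\Big|\, \textstyle\sum_i w_i \phi_i(x_j),\ \sigma^2\right)
  = -\frac{N}{2}\ln\!\left(2\pi\sigma^2\right)
    - \frac{1}{2\sigma^2} \sum_{j=1}^{N} \Bigl(t_j - \sum_i w_i \phi_i(x_j)\Bigr)^{2}
```

Only the last term depends on w, and it enters with a negative sign, so maximizing the log-likelihood over w is exactly minimizing the residual squared error.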
Applications Corner 1

Predict stock value over time from:
  past values
  other relevant variables
    e.g., weather, demand, etc.
Applications Corner 2
Measure temperatures at some locations
Predict temperatures throughout the environment
Bias-Variance tradeoff – Intuition

Model too “simple” → does not fit the data well
  A biased solution
Model too complex → small changes to the data change the solution a lot
  A high-variance solution
(Squared) Bias of learner

Given a dataset D with m samples, learn a function h(x)
If you sample different datasets D, you will learn different h(x)
Expected hypothesis: E_D[h(x)]
Bias: difference between what you expect to learn and the truth:
  bias²(x) = ( E_D[h(x)] − g(x) )²
Measures how well you expect to represent the true solution
Decreases with a more complex model
Variance of learner

Given a dataset D with m samples, you learn a function h(x)
If you sample different datasets D, you will learn different h(x)
Variance: difference between what you expect to learn and what you learn from a particular dataset:
  variance(x) = E_D[ ( h(x) − E_D[h(x)] )² ]
Measures how sensitive the learner is to the specific dataset
Decreases with a simpler model
Bias-Variance Tradeoff

Choice of hypothesis class introduces learning bias
  More complex class → less bias
  More complex class → more variance

Collect some data, and learn a function h(x)
What are the sources of prediction error?
Sources of error 1 – noise

What if we have a perfect learner and infinite data?
If our learned solution h(x) satisfies h(x) = g(x),
we still have a remaining, unavoidable error of σ² due to the noise ε
Sources of error 2 – finite data

What if we have an imperfect learner, or only m training examples?
What is our expected squared error per example?
  Expectation taken over random training sets D of size m, drawn from the distribution P(X,T):
  E_D[ ( t − h_D(x) )² ]
Bias-Variance Decomposition of Error

Assume target function: t = f(x) = g(x) + ε

Then the expected squared error over fixed-size training sets D drawn from P(X,T) can be expressed as the sum of three components:

  E_D[ ( t − h_D(x) )² ] = σ² + bias²(x) + variance(x)

Where:
  σ² = E[ε²] is the irreducible noise
  bias²(x) = ( g(x) − E_D[h_D(x)] )²
  variance(x) = E_D[ ( h_D(x) − E_D[h_D(x)] )² ]
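The three-way split can be checked empirically by resampling training sets. A sketch under assumptions of my own: the truth g(x) = x² is quadratic, the learner fits a straight line (so the bias is deliberately nonzero), and everything is evaluated at one test point x0:

```python
import numpy as np

# Empirical check of  E_D[(t - h_D(x))^2] = sigma^2 + bias^2(x) + variance(x)
# at one test point x0, for a misspecified learner (toy setup of my own).
rng = np.random.default_rng(1)
sigma, m, trials = 0.3, 30, 10000
g = lambda x: x ** 2   # true function g(x); the learner fits a line
x0 = 0.8

preds, sq_errors = [], []
for _ in range(trials):
    # Sample a fresh training set D of size m from P(X, T).
    x = rng.uniform(-1, 1, size=m)
    t = g(x) + rng.normal(scale=sigma, size=m)
    # Learn h_D(x) = w0 + w1 * x by least squares.
    w = np.polyfit(x, t, deg=1)
    h_x0 = np.polyval(w, x0)
    preds.append(h_x0)
    # Squared error against a fresh noisy target t = g(x0) + eps.
    t0 = g(x0) + rng.normal(scale=sigma)
    sq_errors.append((t0 - h_x0) ** 2)

preds = np.array(preds)
total = np.mean(sq_errors)
bias2 = (np.mean(preds) - g(x0)) ** 2
variance = np.var(preds)
print(total, sigma ** 2 + bias2 + variance)  # the two sides should nearly match
```

The left side averages actual squared errors; the right side adds up noise, squared bias, and variance estimated separately. Up to Monte Carlo noise, the two agree.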
Bias-Variance Tradeoff

Choice of hypothesis class introduces learning bias
  More complex class → less bias
  More complex class → more variance
Training set error

Given a dataset (training data)
Choose a loss function
  e.g., squared error (L₂) for regression
Training set error: for a particular set of parameters w, the loss function on the training data:
  error_train(w) = (1/N_train) Σ_j ( t_j − Σ_i w_i φ_i(x_j) )²
Training set error as a function of model complexity
Prediction error

Training set error can be a poor measure of the “quality” of the solution
Prediction error: we really care about the error over all possible input points, not just the training data:
  error_true(w) = ∫ ( t(x) − Σ_i w_i φ_i(x) )² p(x) dx
Prediction error as a function of model complexity
Computing prediction error

Computing the prediction error requires a hard integral
May not know t(x) for every x
Monte Carlo integration (sampling approximation):
  Sample a set of i.i.d. points {x_1,…,x_M} from p(x)
  Approximate the integral with the sample average:
  error_true(w) ≈ (1/M) Σ_j ( t(x_j) − Σ_i w_i φ_i(x_j) )²
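The sampling approximation can be illustrated on a toy problem where t(x) is known exactly, so the Monte Carlo estimate can be compared with the true integral. The target, hypothesis, and density below are my own choices:

```python
import numpy as np

# Monte Carlo approximation of the prediction error
#   error_true = integral of (t(x) - h(x))^2 p(x) dx.
# Toy choices of my own: t(x) = sin(x), h(x) = x, p(x) = Uniform(0, 1);
# the exact integral works out to about 0.0037.
rng = np.random.default_rng(2)
t = lambda x: np.sin(x)   # true target function
h = lambda x: x           # some learned hypothesis

# Sample M i.i.d. points from p(x) and average the squared error.
M = 200_000
xs = rng.uniform(0.0, 1.0, size=M)
mc_error = np.mean((t(xs) - h(xs)) ** 2)
print(mc_error)  # close to the exact integral, about 0.0037
```

With M in the hundreds of thousands the sample average is an excellent estimate; the catch in practice is that t(x) is usually not available at arbitrary x, which is what motivates the held-out test set below.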
Why doesn’t training set error approximate prediction error?

Sampling approximation of the prediction error:
  error_true(w) ≈ (1/M) Σ_j ( t(x_j) − Σ_i w_i φ_i(x_j) )², with x_j drawn i.i.d. from p(x)
Training error:
  error_train(w) = (1/N_train) Σ_j ( t_j − Σ_i w_i φ_i(x_j) )²
Very similar equations!!!
Why is the training set a bad measure of prediction error???
Because you cheated! The training error is a good estimate for a fixed w, but you optimized w with respect to the training error, so w is tuned to look good on exactly those samples.
Test set error

Given a dataset, randomly split it into two parts:
  Training data – {x1,…, xNtrain}
  Test data – {x1,…, xNtest}
Use the training data to optimize the parameters w
Test set error: for the final solution w*, evaluate the error using the test data:
  error_test(w*) = (1/N_test) Σ_j ( t_j − Σ_i w_i* φ_i(x_j) )²
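A minimal sketch of the split, assuming NumPy and toy linear data of my own:

```python
import numpy as np

# Train/test split sketch (toy linear data of my own): fit w on the
# training part only, then estimate the error of w* on held-out points.
rng = np.random.default_rng(3)
x = rng.uniform(-1, 1, size=100)
t = 1.0 + 2.0 * x + rng.normal(scale=0.2, size=100)

# Random split: 70 training points, 30 test points.
idx = rng.permutation(100)
train, test = idx[:70], idx[70:]

# Optimize the parameters w on the training data only.
w = np.polyfit(x[train], t[train], deg=1)

# Test set error of the final solution w* on the held-out points;
# with a well-specified model it hovers near the noise variance sigma^2.
test_err = np.mean((t[test] - np.polyval(w, x[test])) ** 2)
print(test_err)
```

Because the test points never influence w, the test error is an unbiased sample-average estimate of the prediction error, exactly the Monte Carlo approximation from the previous slide.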
Test set error as a function of model complexity
Overfitting

Overfitting: a learning algorithm overfits the training data if it outputs a solution w when there exists another solution w’ such that:
  error_train(w) < error_train(w’)  and  error_true(w’) < error_true(w)
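The definition can be made concrete with a small experiment (toy setup of my own: the truth is linear plus noise, and a high-degree polynomial is compared against a straight-line fit on the same training set):

```python
import numpy as np

# Overfitting in action (toy setup of my own): the truth is linear plus
# noise; compare a degree-9 polynomial with a degree-1 fit on the same
# 15 training points.
rng = np.random.default_rng(4)
x = rng.uniform(-1, 1, size=15)
t = x + rng.normal(scale=0.3, size=15)

# Fresh points from the same distribution approximate the true error.
x_new = rng.uniform(-1, 1, size=5000)
t_new = x_new + rng.normal(scale=0.3, size=5000)

def errors(deg):
    w = np.polyfit(x, t, deg=deg)
    train = np.mean((t - np.polyval(w, x)) ** 2)
    true = np.mean((t_new - np.polyval(w, x_new)) ** 2)
    return train, true

train1, true1 = errors(1)   # simple model
train9, true9 = errors(9)   # complex model
# The complex model typically has lower training error but higher true
# error than the simple one: it overfits in exactly the sense defined above.
print(train1, true1)
print(train9, true9)
```

Here w is the degree-9 fit and w’ the degree-1 fit: the former wins on the training data while the latter wins on fresh data, which is precisely the definition of overfitting.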
How many points to use for training/testing?

Very hard question to answer!
  Too few training points → the learned w is bad
  Too few test points → you never know whether you reached a good solution
Bounds, such as Hoeffding’s inequality, can help:
  e.g., for a loss bounded in [0,1], P( |error_true(w) − error_test(w)| ≥ ε ) ≤ 2 exp(−2 N_test ε²)
  More on this later this semester, but still hard to answer
Typically:
  If you have a reasonable amount of data, pick a test set “large enough” for a “reasonable” estimate of error, and use the rest for learning
  If you have little data, then …
Error estimators
Error as a function of the number of training examples for a fixed model complexity
What you need to know

Regression
  Basis functions = features
  Optimizing sum squared error
  Relationship between regression and Gaussians
Bias-variance trade-off
  Play with the applet
True error, training error, test error
  Never learn on the test data
Overfitting