Were going to gain some insight into how logistic regression works by building a model in Microsoft Excel.It is important to appreciate that our goal here is to learn about logistic regression.If all you want are logistic regression results, there are tools, including the Excel Analysis ToolPack, that will take you there directly.We are going spend more time to obtain our logistic regression by brute force, and the reward for this extra effort will be some greater understanding.
![]() In our hypothetical data, this will be admitted to a hypothetical university program. Either you get in or you dont. For nit-pickers out there, Hypothetical U. So far, it looks a lot like a good old-fashioned linear regression. In a linear regression, an economist might be predicting wages or prices. Our input data represents events that have already occurred, and so 0 means didnt get in and 1 means got in. The regression equation we hope to create cannot be linear since the permissible output values must fall in the range from zero to one. Briefly put, the logit is a function that takes a probability of an event as input and returns the logarithm of the odds of that event as output. Trouble is, this is the related to the results we want to predict; how can we enter an Excel formula for results we dont have yet This is where we create our regression equation. We will have four coefficients: one constant and one coefficient for each of our three input values, gre, gpa, and rank. Well give these coefficients some arbitrary value (in this example 0.001) with the understanding that these starting value will be replaced by optimized values. Just like we can give coefficients to any old line in a linear regression, and then minimize the least squares difference later.) In this example, if we do not pick a pretty small value for the coefficients, the exponentiation to follow will make our values blow-up and the model wont work. Briefly, the likelihood function calculates a probability based on the values of the input variables. The overall likelihood will be the product of individual values for each row. Using calculate the log of the likelihood function we can sum over the rows. Our best estimate of the coefficients will be those that maximize the sum of the log-likelihoods over all the rows. Fortunately, we dont have to optimize them ourselves, we can call upon Solver. Although it comes with Excel, it is provided not by Microsoft but by Frontline Systems and is a subset of their more powerful Analytic Solver add-in. Assuming the Solver add-in is already loaded, we need only go to the Data tab on the Excel ribbon and click the Solver button. We do this by changing the values in F2:F5, representing coefficients b 0 -b 4. It is very important to uncheck the box labeled Make Unconstrained Variables Non-Negative.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. Archives
December 2020
Categories |