Abstract: A limitation of ordinary linear models is the requirement that the dependent
variable is numerical rather than categorical. But many interesting variables
are categorical (people may pass or fail, and so on). A range of techniques has
been developed for analyzing data with categorical dependent variables,
including discriminant analysis, probit analysis, log-linear regression and
logistic regression. The various techniques listed above are applicable in
different situations: for example, log-linear regression requires all regressors
to be categorical, whilst discriminant analysis strictly requires them all to be
continuous. Logistic regression is a type of predictive model that can be used
when the target variable is a categorical variable with two categories, and when
we have a mixture of numerical and categorical regressors. Logistic regression
models can also be used for classification. The method therefore applies to a
categorical target variable that has exactly two categories, or to a continuous
target variable whose values lie in the range 0 to 1 and represent
probabilities.
In this paper, we consider a study whose goal is to model students' grades as a
function of the student's gender, using a logistic regression model. The target
(dependent) variable, response, takes the value 1 if the student's grade is
≥ pass (pass, good, very good, or excellent) and 0 if the student's grade is
< pass. The value of response predicted by the model represents the
probability of achieving a grade of pass or higher.
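As a minimal sketch of the model just described, the relationship can be written as follows. The symbols are assumptions introduced here for illustration and do not appear in the original: $p$ is the probability that response equals 1, $x$ is a gender indicator (assumed coded 1 for male and 0 for female), and $\beta_0$, $\beta_1$ are the coefficients to be estimated.

```latex
\begin{align}
  p &= P(\text{response} = 1 \mid x), \\
  \operatorname{logit}(p) &= \ln\frac{p}{1 - p} = \beta_0 + \beta_1 x, \\
  p &= \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}, \\
  \frac{\text{odds}(x = 1)}{\text{odds}(x = 0)} &= e^{\beta_1}.
\end{align}
```

Under this assumed coding, $e^{\beta_1}$ is the ratio of the odds of passing for males to the odds of passing for females.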
For this paper, a systematic sample of students was selected from the students
of the Faculty of Commerce, Tanta University, in 2007. A binary response
logistic regression model is considered for
identifying the relationship between the students’ grades and the students’
gender as a social phenomenon. The statistical software package SPSS is used
for the binary response logistic regression modeling. Based on the
goodness-of-fit tests, the likelihood-ratio test, and the classification table,
the logistic regression model performs as well as or better than the other
regression models. For the Arabic section, our model predicts that the
probability that a student's grade is ≥ pass is 68.6% for females and 54.9%
for males, and that the predicted odds of a grade ≥ pass for males are 0.534
times those for females. Adding the system of study as a predictor variable
significantly improved the model, and the overall success rate in
classification improved from 74.1% to 78.85%. For the English section, the
probability that a student's grade is ≥ pass is 79.8% for females and 67.8%
for males, and the predicted odds of a grade ≥ pass for males are 0.534
times those for females.
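The link between the reported probabilities and the reported odds ratio can be illustrated with a short calculation. The sketch below is not the SPSS analysis itself. It only converts the English-section probabilities quoted above into odds and takes their ratio. The function and variable names are hypothetical and chosen for illustration.

```python
def odds(p: float) -> float:
    """Convert a probability p (0 < p < 1) into odds p / (1 - p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return p / (1.0 - p)


def odds_ratio(p_group: float, p_reference: float) -> float:
    """Odds of the group divided by odds of the reference group."""
    return odds(p_group) / odds(p_reference)


# Predicted probabilities of a grade >= pass, as quoted for the English section.
p_female_english = 0.798
p_male_english = 0.678

# Hypothetical name: odds of passing for males relative to females.
or_male_vs_female = odds_ratio(p_male_english, p_female_english)

print(f"odds (female): {odds(p_female_english):.3f}")
print(f"odds (male):   {odds(p_male_english):.3f}")
print(f"odds ratio, male vs. female: {or_male_vs_female:.3f}")
```

With the rounded probabilities above, the ratio comes out at about 0.533, which agrees with the reported 0.534 up to rounding of the inputs.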
Keywords and phrases: binary data, dichotomous responses, bivariate and multivariate logistic regression, likelihood-ratio test, goodness-of-fit test, Wald test, Hosmer-Lemeshow test, classification table.