<pre>Input:
X_train=train_df.drop("Survived",axis=1)
Y_train=train_df["Survived"]
X_test=test_df.drop("PassengerId",axis=1).copy()
X_train.head()
Y_train.head()
X_test.head()
Output:
Pclass  Sex  Age   Parch  Fare   Embarked
3       0    34.5  0      7.82   2
3       1    47.0  0      7.00   0
2       0    62.0  0      9.68   2
3       0    27.0  0      8.66   0
3       1    22.0  1      12.20  0
Input:
X_train.shape,Y_train.shape,X_test.shape
Output:((891, 7), (891,), (418, 6))
Input: X_train.head()
Output:
Survived  Pclass  Sex  Age  Parch  Fare     Embarked
0         3       0    22   0      7.25     0
1         1       1    38   0      71.2833  1
1         3       1    26   0      7.925    0
1         1       1    35   0      53.1     0
0         3       0    35   0      8.05     0
# Logistic Regression
from sklearn.linear_model import LogisticRegression

logreg = LogisticRegression()
logreg.fit(X_train, Y_train)
Y_pred = logreg.predict(X_test)
acc_log = round(logreg.score(X_train, Y_train) * 100, 2)
acc_log
c:\users\user\appdata\local\programs\python\python37\lib\site-packages\sklearn\linear_model\logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.
FutureWarning)
ValueError Traceback (most recent call last)
<ipython-input-64-5854ca91fc64> in <module>
3 logreg = LogisticRegression()
4 logreg.fit(X_train, Y_train)
----> 5 Y_pred = logreg.predict(X_test)
6 acc_log = round(logreg.score(X_train, Y_train) * 100, 2)
7 acc_log
c:\users\user\appdata\local\programs\python\python37\lib\site-packages\sklearn\linear_model\base.py in predict(self, X)
287 Predicted class label per sample.
288 """
--> 289 scores = self.decision_function(X)
290 if len(scores.shape) == 1:
291 indices = (scores > 0).astype(np.int)
c:\users\user\appdata\local\programs\python\python37\lib\site-packages\sklearn\linear_model\base.py in decision_function(self, X)
268 if X.shape[1] != n_features:
269 raise ValueError("X has %d features per sample; expecting %d"
--> 270 % (X.shape[1], n_features))
271
272 scores = safe_sparse_dot(X, self.coef_.T,
ValueError: X has 6 features per sample; expecting 7
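The error message says the model was fitted on 7 features but predict() received only 6. A quick way to see which column is the odd one out is to diff the column sets of the two frames. This is a minimal sketch with hypothetical empty frames standing in for the real train/test data:

```python
import pandas as pd

# Hypothetical stand-ins mirroring the shapes in the question:
# X_train still carries Survived (7 columns), X_test does not (6 columns).
X_train = pd.DataFrame(
    columns=["Survived", "Pclass", "Sex", "Age", "Parch", "Fare", "Embarked"])
X_test = pd.DataFrame(
    columns=["Pclass", "Sex", "Age", "Parch", "Fare", "Embarked"])

# Columns present in the training frame but missing from the test frame.
extra = set(X_train.columns) - set(X_test.columns)
print(extra)  # {'Survived'} -- the label column never left X_train
```

Printing this diff before calling fit() makes the mismatch visible without waiting for the ValueError.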
What I have tried:
I tried deleting the Survived column from the train DataFrame, but it made no difference.
Moreover, dropping the PassengerId column from the test DataFrame removes it, yet dropping the Survived column from the train DataFrame does not seem to remove it.
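The behaviour described above is what happens when the result of drop() is discarded: by default, DataFrame.drop returns a new frame and leaves the original untouched, so the column only disappears if you reassign the result (as the X_test line in the question does). A minimal sketch with made-up data:

```python
import pandas as pd

# Tiny hypothetical frame standing in for train_df.
train_df = pd.DataFrame({
    "Survived": [0, 1, 1],
    "Pclass": [3, 1, 3],
    "Sex": [0, 1, 1],
})

# drop() returns a NEW DataFrame; without reassignment the
# original train_df keeps its Survived column.
train_df.drop("Survived", axis=1)
print(train_df.shape)  # (3, 3) -- Survived is still there

# Reassigning the result (or using inplace=True) actually removes it.
X_train = train_df.drop("Survived", axis=1)
print(X_train.shape)   # (3, 2) -- Survived is gone
```

So if the model still sees 7 features, the likely explanation is that fit() was run on a frame where the drop was never assigned back.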