Assignment 1, kNN classifier #2

Closed
truehines opened this issue Jul 6, 2018 · 1 comment

Comments

@truehines

Mahan,

I think I have found an error in the k-fold cross-validation snippet of the kNN Jupyter notebook. The second segment of this snippet contains the line:

    train_set = np.concatenate((X_train_folds[:i] + X_train_folds[i+1:]))

I believe that using the "+" operator on these two arrays (X_train_folds[:i] & X_train_folds[i+1:]) will actually add together the array elements instead of concatenating them as you intended. Do you agree with this?

In my own implementation I have the following (the reason for the if-elif-else is that concatenating an empty array gives an error):
    if i == 0:
        X_train_fold = np.concatenate(X_train_folds[(i + 1):num_folds])
        y_train_fold = np.concatenate(y_train_folds[(i + 1):num_folds])
    elif i == (num_folds - 1):
        X_train_fold = np.concatenate(X_train_folds[0:i])
        y_train_fold = np.concatenate(y_train_folds[0:i])
    else:
        X_train_fold = np.concatenate((np.concatenate(X_train_folds[0:i]), np.concatenate(X_train_folds[(i + 1):num_folds])))
        y_train_fold = np.concatenate((np.concatenate(y_train_folds[0:i]), np.concatenate(y_train_folds[(i + 1):num_folds])))
    classifier.train(X_train_fold, y_train_fold)

I am open to suggestions on a cleaner way to implement this...

Your feedback is greatly appreciated -- I don't have someone to discuss this type of thing with....

Regards,
True

@MahanFathi
Owner

Hey,

Sorry for the delayed response.
Slicing a list, i.e. indexing it with the : operator, returns another list. Try this:

    myList = [0, 1, 2, 3]
    myList[:2]  # gives [0, 1]
    myList[:1]  # gives [0]
    myList[:0]  # gives []

Also, the + operator, when applied to lists, concatenates them. Try this:

    myList[:2] + myList[:2]  # gives [0, 1, 0, 1]
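
What decides the behavior of + is the type of its operands: on Python lists it concatenates, on NumPy arrays it adds element-wise. Here is a small sketch of the difference. It assumes the folds are a plain Python list of arrays, as np.array_split returns. The names data, folds, joined and summed are made up for illustration and are not taken from the notebook.

```python
import numpy as np

data = np.arange(8)              # illustrative data: [0, 1, ..., 7]
folds = np.array_split(data, 4)  # a Python *list* holding 4 arrays

# list + list: the two slices are joined into one longer list of arrays
joined = folds[:1] + folds[2:]

# ndarray + ndarray: element-wise addition, which is the behavior
# the question was worried about
summed = folds[0] + folds[1]

print(type(folds).__name__, len(joined), summed)
```

So the concern in the question would apply only if X_train_folds were itself a NumPy array rather than a list of arrays.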

np.concatenate then takes the resulting list of NumPy arrays and joins them into a single array along the first axis, which is used as the training set for that iteration. Notice also that the indexing deliberately omits fold i, which is held out. So I think this is the cleanest way to go. Correct me if I'm wrong.
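
Put together, the fold-omitting step might look like the sketch below. The helper name leave_one_fold_out and the toy data are hypothetical, and the folds are again assumed to come from np.array_split. With two or more folds, the combined list is never empty, so the first and last folds need no special case.

```python
import numpy as np

def leave_one_fold_out(folds, i):
    """Join every fold except fold i along the first axis.

    `folds` is assumed to be a Python list of arrays (hypothetical helper).
    """
    # List slicing plus list `+` builds a list without fold i;
    # np.concatenate then merges that list into one array.
    return np.concatenate(folds[:i] + folds[i + 1:])

# Toy data standing in for the real training set
num_folds = 4
X_train = np.arange(16).reshape(8, 2)
y_train = np.arange(8)

X_train_folds = np.array_split(X_train, num_folds)
y_train_folds = np.array_split(y_train, num_folds)

for i in range(num_folds):
    X_train_fold = leave_one_fold_out(X_train_folds, i)
    y_train_fold = leave_one_fold_out(y_train_folds, i)
    X_val_fold, y_val_fold = X_train_folds[i], y_train_folds[i]
    print(i, X_train_fold.shape, y_train_fold.shape, X_val_fold.shape)
```

This replaces the if-elif-else in the question with a single expression per array. The held-out fold i stays available for validation.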

Best,
Mahan
