add cross-validation feature #2

Closed
sanjayaksaxena opened this issue Jul 9, 2018 · 4 comments

@sanjayaksaxena
Member

No description provided.

@sanjayaksaxena sanjayaksaxena self-assigned this Jul 9, 2018
@sanjayaksaxena
Member Author

Use wink-helpers cross validation instead!

@jtara1

jtara1 commented Dec 6, 2018

Not sure if this is a proper implementation, but I wrote a k-fold cross-validation function:

const _ = require('lodash');
const winkHelpers = require('wink-helpers');

/**
 * Zip data & labels, shuffle them, and partition them into the number of
 * folds specified.
 * @param {Array} data
 * @param {Array} labels
 * @param {number} folds
 * @returns {Array<Array<Array<*>>>} an array of groups of zipped data & labels
 */
module.exports = function kFoldsCrossValidation(data, labels, folds) {
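	if (data.length < 2) throw new Error('at least 2 rows of data are required');
	if (data.length !== labels.length) throw new Error('data and labels must have the same length');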

	let dataAndLabels = _.zip(data, labels);
	dataAndLabels = winkHelpers.array.shuffle(dataAndLabels);

	return partition(dataAndLabels, folds);
};

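function partition(dataAndLabels, numberOfGroups) {
	const groups = [];
	const len = dataAndLabels.length;
	// Spread the remainder over the first groups so exactly numberOfGroups
	// groups come back; a fixed Math.ceil(len / numberOfGroups) size can
	// return fewer (e.g. 9 rows in 4 folds gives only 3 groups).
	const baseSize = Math.floor(len / numberOfGroups);
	const remainder = len % numberOfGroups;

	for (let i = 0, start = 0; i < numberOfGroups; i += 1) {
		const size = baseSize + (i < remainder ? 1 : 0);
		groups.push(dataAndLabels.slice(start, start + size));
		start += size;
	}

	return groups;
}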

// example usage

// made up data & labels
const start = 100;
const end = 128;
const diffSplit = Math.floor((end - start) / 2);

let a = _.range(start, end);
const labels1 = new Array(diffSplit);
const labels2 = new Array(diffSplit);
labels1.fill('a');
labels2.fill('b');

let l = _.concat(labels1, labels2);

const groups = module.exports(a, l, 3);
console.log(groups);

// test group
let [data, labels] = _.unzip(groups[0]);
console.log(data);
console.log(labels);

Or just use jsmlt's trainTestSplit: https://visualml.io/jsmlt/docs/function/index.html#static-function-trainTestSplit

@sanjayaksaxena
Member Author

Hi @jtara1

Thanks a lot for your input!

In fact, wink-helpers has an API for cross validation, which is not yet fully documented. It lets you compute the macro-averaged avgPrecision, avgRecall, and avgFMeasure, along with many more details.

You may want to take a look at it.

We will keep your input in mind when we complete this work.

Best,
Sanjaya
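
For readers unfamiliar with the macro-averaged metrics mentioned above, here is a minimal plain-JavaScript sketch of how they can be computed from actual and predicted labels. It is not the wink-helpers API: `macroAverages` is a hypothetical helper, and it assumes the macro F-measure is the mean of the per-class F-measures, which may differ from what wink-helpers does.

```javascript
// Hypothetical helper, NOT part of wink-helpers: macro-averaged precision,
// recall and F-measure from parallel arrays of actual and predicted labels.
function macroAverages(actual, predicted) {
  if (actual.length === 0 || actual.length !== predicted.length) {
    throw new Error('actual and predicted must be non-empty and of equal length');
  }

  // Every label seen in either array counts as a class.
  const classes = Array.from(new Set([...actual, ...predicted]));
  let precisionSum = 0;
  let recallSum = 0;
  let fMeasureSum = 0;

  classes.forEach((cls) => {
    let tp = 0; // predicted cls, actually cls
    let fp = 0; // predicted cls, actually something else
    let fn = 0; // actually cls, predicted something else
    actual.forEach((a, i) => {
      const p = predicted[i];
      if (p === cls && a === cls) tp += 1;
      else if (p === cls) fp += 1;
      else if (a === cls) fn += 1;
    });

    // Treat an undefined ratio (0 / 0) as 0.
    const precision = tp + fp === 0 ? 0 : tp / (tp + fp);
    const recall = tp + fn === 0 ? 0 : tp / (tp + fn);
    const fMeasure = precision + recall === 0
      ? 0
      : (2 * precision * recall) / (precision + recall);

    precisionSum += precision;
    recallSum += recall;
    fMeasureSum += fMeasure;
  });

  // Macro averaging: every class contributes equally, whatever its size.
  return {
    avgPrecision: precisionSum / classes.length,
    avgRecall: recallSum / classes.length,
    avgFMeasure: fMeasureSum / classes.length,
  };
}

// Made-up labels, for illustration only.
const scores = macroAverages(['a', 'a', 'b', 'b'], ['a', 'b', 'b', 'b']);
console.log(scores);
```

Because each class is weighted equally, a rare class that is predicted badly pulls the macro averages down as much as a common one would.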

@jtara1

jtara1 commented Dec 7, 2018

I was a little confused because it doesn't handle the part of cross validation where you partition the data into training and testing sets. https://en.wikipedia.org/wiki/Cross-validation_(statistics)
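
To make the partitioning step concrete, here is a rough sketch of how folds shaped like the output of kFoldsCrossValidation above could be turned into training and testing sets. `trainTestSplits` and `exampleFolds` are hypothetical names used only for illustration, and nothing here comes from wink-helpers.

```javascript
// Hypothetical helper: hold each fold out once as the test set and
// concatenate all remaining folds into the training set.
function trainTestSplits(folds) {
  return folds.map((testSet, i) => {
    const trainSet = [];
    folds.forEach((fold, j) => {
      if (j !== i) trainSet.push(...fold);
    });
    return { train: trainSet, test: testSet };
  });
}

// Made-up folds; each row is a [datum, label] pair.
const exampleFolds = [
  [[100, 'a'], [101, 'a']],
  [[102, 'b'], [103, 'b']],
  [[104, 'a']],
];

const splits = trainTestSplits(exampleFolds);
splits.forEach(({ train, test }, i) => {
  console.log(`fold ${i}: ${train.length} training rows, ${test.length} test rows`);
});
```

Each row lands in exactly one test set, so averaging a metric over the splits scores every row once on a model that never saw it during training.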

I have another question about this module; I'll just open another issue for it.
