Add a way to speedup dataframe data imputing
jecisc committed Mar 21, 2023
1 parent 752871b commit a43aa18
Showing 2 changed files with 13 additions and 5 deletions.
6 changes: 1 addition & 5 deletions src/AI-DataImputers/AISimpleImputer.class.st
@@ -176,11 +176,7 @@ AISimpleImputer >> transform: aCollection [
 	self ensureIs2D: aCollection.
 	self statistics ifNil: [ self error: '#fit: needs to be called before transforming.' ].

-	^ aCollection collect: [ :subcoll |
-		  subcoll withIndexCollect: [ :elem :index |
-			  elem = self missingValue
-				  ifTrue: [ statistics at: index ]
-				  ifFalse: [ elem ] ] ]
+	^ aCollection copyReplace: missingValue in2DCollectionBy: statistics
 ]

 { #category : #options }
12 changes: 12 additions & 0 deletions src/AI-DataImputers/SequenceableCollection.extension.st
@@ -0,0 +1,12 @@
+Extension { #name : #SequenceableCollection }
+
+{ #category : #'*AI-DataImputers' }
+SequenceableCollection >> copyReplace: missingValue in2DCollectionBy: arrayOfReplacementValues [
+	"I am a 2D collection. Return a copy of me in which every missing value is replaced by a value from my second parameter. The replacement value is the one found at the index that the missing value has in its subcollection."
+
+	^ self collect: [ :subColl |
+		  subColl withIndexCollect: [ :element :index |
+			  element = missingValue
+				  ifTrue: [ arrayOfReplacementValues at: index ]
+				  ifFalse: [ element ] ] ]
+]
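
A minimal usage sketch of the new extension method, assuming it is loaded in a Pharo image. The variable names and sample values (`data`, `replacements`, `result`) are hypothetical and only illustrate that a missing value is replaced by the element at the same column index in the replacement array:

```smalltalk
| data replacements result |
"Hypothetical 2D collection: each inner array is one row, nil marks a missing value."
data := #( #( 1 nil 3 ) #( nil 5 6 ) ).

"Hypothetical replacement values, one per column (for example, column means)."
replacements := #( 10 20 30 ).

"Each nil is replaced by the replacement value at the same column index."
result := data copyReplace: nil in2DCollectionBy: replacements.

"result should be equal to #( #( 1 20 3 ) #( 10 5 6 ) ); data itself is left unchanged."
```

Because the method is built on `collect:` and `withIndexCollect:`, it returns new collections rather than modifying the receiver.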
