forked from leomarquine/php-etl
Aggregator Extractor (leomarquine#25)
* toIterator method
* pipeline consume as a Generator
* new Accumulator extractor
* missingDataExxecption if strict
* fun with chaining
* better doc
* @ArthurHoaro get rid of superfluous if statement
* @ArthurHoaro fix type juggling/casting
* perf, remove unecessary md5 hash
* various cosmetics
* Accumulator tests
* optimization
* documentation
* svg schema update
* missing link in documentation
* svg better
* rename to Aggregator
* incomplete flag
* svg tuning
* fix fusion

Co-authored-by: Nicolas @ remote <[email protected]>
1 parent 9c077a1, commit a2fa8c9
Showing 23 changed files with 1,965 additions and 52 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,93 @@ | ||
# Aggregator | ||
|
||
Merges rows from a list of partial-data iterators by matching them on a common index. | ||
|
||
```php | ||
# user data from one CSV file | ||
$userDataIterator = (new Etl()) | ||
->extract( | ||
new Csv(), | ||
'user_data.csv', | ||
['columns' => ['id','email', 'name']] | ||
) | ||
->toIterator() | ||
; | ||
|
||
# extended info from another source | ||
$extendedInfoIterator = (new Etl()) | ||
->extract( | ||
new Table(), | ||
'extended_info', | ||
['columns' => ['courriel', 'twitter']] | ||
) | ||
# let's rename 'courriel' to 'email' | ||
->transform( | ||
new RenameColumns(), | ||
[ | ||
'columns' => ['courriel' => 'email'] | ||
] | ||
) | ||
->toIterator() | ||
; | ||
|
||
# merge these two data sources | ||
$mergedData = (new Etl()) | ||
->extract( | ||
new Aggregator(), | ||
[ | ||
$userDataIterator, | ||
$extendedInfoIterator, | ||
], | ||
[ | ||
'index' => ['email'], # common matching index | ||
'columns' => ['id','email','name','twitter'] | ||
] | ||
) | ||
->load( | ||
new CsvLoader(), | ||
'completeUserData.csv' | ||
) | ||
->run() | ||
; | ||
``` | ||
|
||
## Options | ||
|
||
### Index (required) | ||
|
||
An array of column names common to all data sources. | ||
|
||
| Type | Default value | | ||
|-------|---------------| | ||
| array | `null` | | ||
|
||
```php | ||
$options = ['index' => ['email']]; | ||
``` | ||
|
||
### Columns (required) | ||
|
||
A `Row` is yielded once all specified columns have been found for the matching index. | ||
|
||
| Type | Default value | | ||
|-------|---------------| | ||
| array | `null` | | ||
|
||
```php | ||
$options = ['columns' => ['id', 'name', 'email']]; | ||
``` | ||
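
As a rough illustration of the matching described above, the standalone sketch below merges partial rows that share the same index values and yields a row once every requested column is present. It does not use the library: `mergeByIndex` is a hypothetical helper, rows are plain arrays rather than `Row` objects, and sources are read one after another, which may differ from how `Aggregator` consumes its iterators.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): merges partial rows that
 * share the same index values and yields each row as soon as all requested
 * columns are present.
 */
function mergeByIndex(iterable $sources, array $index, array $columns): Generator
{
    $pending = []; // partial rows, keyed by their index values

    foreach ($sources as $source) {
        foreach ($source as $partial) {
            // Build a lookup key from the values of the index columns.
            $key = json_encode(array_map(fn ($name) => $partial[$name] ?? null, $index));

            // Fold the new columns into whatever was already collected.
            $pending[$key] = array_merge($pending[$key] ?? [], $partial);

            // The row is complete when no requested column is missing.
            if (array_diff($columns, array_keys($pending[$key])) === []) {
                yield array_intersect_key($pending[$key], array_flip($columns));
                unset($pending[$key]);
            }
        }
    }
}

// Sample data made up for this sketch.
$users = [['id' => 1, 'email' => 'a@example.com', 'name' => 'Ann']];
$extra = [['email' => 'a@example.com', 'twitter' => '@ann']];

$rows = iterator_to_array(
    mergeByIndex([$users, $extra], ['email'], ['id', 'email', 'name', 'twitter']),
    false
);
```

Here `$rows` holds a single merged row, because the `twitter` column from the second source completes the entry started by the first.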
|
||
### Strict | ||
|
||
When all input iterators are fully consumed and some rows are still incomplete: | ||
|
||
- if *true*: throw an `IncompleteDataException` | ||
- if *false*: yield each remaining incomplete `Row`, flagged as `incomplete` | ||
|
||
| Type | Default value | | ||
|---------|---------------| | ||
| boolean | `true` | | ||
|
||
```php | ||
$options = ['strict' => false]; | ||
``` |
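
The two `strict` behaviours can be sketched in isolation as follows. This is only an illustration under stated assumptions: `flushPending` is a hypothetical function, `RuntimeException` stands in for the library's `IncompleteDataException`, and the `incomplete` flag is modelled as a plain array key rather than a property of `Row`.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: decides what happens to rows that are still
 * incomplete once every input has been consumed.
 */
function flushPending(array $pending, bool $strict = true): Generator
{
    if ($pending === []) {
        return; // nothing left over, nothing to do
    }

    if ($strict) {
        // Stand-in for the library's IncompleteDataException.
        throw new RuntimeException(count($pending) . ' incomplete row(s) remaining');
    }

    foreach ($pending as $row) {
        // Non-strict mode: hand the row over, marked as incomplete.
        yield ['data' => $row, 'incomplete' => true];
    }
}

// Sample leftover row made up for this sketch (no 'id' or 'name' was found).
$leftover = ['b@example.com' => ['email' => 'b@example.com', 'twitter' => '@bob']];

// strict = false: the incomplete row is yielded with its flag set.
$lenient = iterator_to_array(flushPending($leftover, false), false);

// strict = true: iterating the generator raises the exception.
$thrown = false;
try {
    iterator_to_array(flushPending($leftover, true), false);
} catch (RuntimeException $e) {
    $thrown = true;
}
```

Because the function is a generator, the exception surfaces only when iteration starts, not when `flushPending` is called.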