This repository has been archived by the owner on Dec 1, 2021. It is now read-only.

Memory is exhausted when used in a long-running process #33

Open
monyxie opened this issue Sep 23, 2019 · 3 comments

monyxie commented Sep 23, 2019

I'm trying to export a table from MySQL to Neo4j using a PHP script. The table contains some 2 million rows. The script aborts every time at around 500k rows with this error:

Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 62914608 bytes) in E:\code\neo4j-demo\vendor\graphaware\neo4j-bolt\src\PackStream\StreamChannel.php on line 74

Call Stack:
    0.4029     407160   1. {main}() E:\code\neo4j-demo\src\index.php:0
    1.9149    4026888   2. Illuminate\Database\Query\Builder->chunkById() E:\code\neo4j-demo\src\index.php:61
  535.8047   68194584   3. {closure:E:\code\neo4j-demo\src\index.php:46-61}() E:\code\neo4j-demo\vendor\illuminate\database\Concerns\BuildsQueries.php:103
  535.8149   68823696   4. GraphAware\Neo4j\Client\Transaction\Transaction->commit() E:\code\neo4j-demo\src\index.php:56
  535.8156   68860616   5. GraphAware\Bolt\Protocol\V1\Transaction->runMultiple() E:\code\neo4j-demo\vendor\graphaware\neo4j-php-client\src\Transaction\Transaction.php:205
  535.8229   69433616   6. GraphAware\Bolt\Protocol\Pipeline->run() E:\code\neo4j-demo\vendor\graphaware\neo4j-bolt\src\Protocol\V1\Transaction.php:161
  536.5241   69989080   7. GraphAware\Bolt\Protocol\V1\Session->run() E:\code\neo4j-demo\vendor\graphaware\neo4j-bolt\src\Protocol\Pipeline.php:62
  536.5253   69991792   8. GraphAware\Bolt\PackStream\Unpacker->unpack() E:\code\neo4j-demo\vendor\graphaware\neo4j-bolt\src\Protocol\V1\Session.php:106
  536.5254   69991824   9. GraphAware\Bolt\PackStream\StreamChannel->read() E:\code\neo4j-demo\vendor\graphaware\neo4j-bolt\src\PackStream\Unpacker.php:73

The script I'm using is as follows:

<?php

include "../vendor/autoload.php";

use GraphAware\Neo4j\Client\ClientBuilder;
use Illuminate\Container\Container;
use Illuminate\Database\Connectors\ConnectionFactory;
use Illuminate\Support\Collection;

// MySQL connection settings (credentials elided).
$config = [
    'driver' => 'mysql',
    'host' => 'localhost',
    'port' => 3306,
    'database' => '...',
    'username' => '...',
    'password' => '...',
    'charset' => 'utf8mb4',
    'collation' => 'utf8mb4_general_ci',
    'prefix' => '',
    'strict' => true,
];

// Stand-alone illuminate/database connection (no full Laravel app).
$container = new Container();
$factory = new ConnectionFactory($container);
$mysql = $factory->make($config);

// Neo4j client over the Bolt protocol.
$client = ClientBuilder::create()
    ->addConnection('bolt', 'bolt://neo4j:new4j@localhost:7687')
    ->build();

$total = $mysql->query()->from('follow')->count();

$startTime = time();

$done = 0;
$mysql->query()
    ->from('follow')
    ->chunkById(
        1000,
        function (Collection $rows) use ($total, &$done, $client) {
            echo "$done/$total {$rows->last()->id}\n";

            // One Neo4j transaction per 1000-row chunk: MERGE both members
            // and the FOLLOWS relationship between them.
            $tx = $client->transaction();
            foreach ($rows as $row) {
                $tx->push(
                    'MERGE (a:Member {uid:$uid}) MERGE (b:Member {uid:$followUid}) MERGE (a)-[:FOLLOWS]->(b)',
                    ['uid' => $row->uid, 'followUid' => $row->followUid]
                );
            }
            $tx->commit();

            $done += count($rows);
        }
    );

echo time() - $startTime;
@betd-claumond

Hello @monyxie

Have you found a solution or a workaround for this issue, please?

On my side, I debugged a long-running process and found that the memory was being consumed by the instance variable $this->bytes of the class GraphAware\Bolt\PackStream\StreamChannel.

In a long-running process, all of the data received since the start of the session is appended to this variable, so it grows without bound...
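Illustratively, the pattern is roughly this (a sketch of the behaviour I observed, not the library's actual code):

class StreamChannel
{
    private $bytes = '';

    public function read($n)
    {
        // Everything received from the socket is appended to $this->bytes
        // and never truncated, so the buffer grows with the total traffic
        // of the session rather than with the size of a single response.
        $this->bytes .= $this->readFromSocket($n); // readFromSocket() is hypothetical
        // ...unpacking of the appended bytes happens elsewhere...
    }
}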

I have not yet found a clean way to restart the session or the client because I'm using Symfony and its dependency injection, but in your case, I guess that recreating a new client from time to time would solve your issue, along the lines of the sketch below.
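A minimal sketch of that idea, assuming it is enough to drop the old client and rebuild it with the same ClientBuilder calls as in your script (the interval of 100 chunks is an arbitrary choice):

$makeClient = function () {
    return ClientBuilder::create()
        ->addConnection('bolt', 'bolt://neo4j:new4j@localhost:7687')
        ->build();
};

$client = $makeClient();
$chunks = 0;

$mysql->query()->from('follow')->chunkById(
    1000,
    function (Collection $rows) use ($makeClient, &$client, &$chunks) {
        // ...push and commit the chunk's transaction as in the script above...

        // Hypothetical workaround: dropping the old client lets its
        // StreamChannel (and the $bytes buffer it holds) be garbage-collected.
        if (++$chunks % 100 === 0) {
            $client = $makeClient();
        }
    }
);

Note that, unlike in the original script, $client must be captured by reference (&$client) so the reassignment is visible to later chunks.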

It would be nice if you could let us know how you solved this on your side.

Best Regards,
Christophe


monyxie commented Mar 31, 2020

@betd-claumond Yep, that's what I found too. I believe I changed the script so that it could pick up the work from the last run and continue, but I don't have a solution for the leak itself. It was just a one-time script. Sorry.
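For reference, one way to get that resume behaviour with Laravel's query builder is to persist the last processed id between runs and skip everything below it on startup. A sketch, with a made-up checkpoint file:

$checkpoint = __DIR__ . '/last_id.txt'; // hypothetical checkpoint file
$lastId = is_file($checkpoint) ? (int) file_get_contents($checkpoint) : 0;

$mysql->query()
    ->from('follow')
    ->where('id', '>', $lastId) // skip rows already exported by earlier runs
    ->chunkById(
        1000,
        function (Collection $rows) use ($checkpoint /* , $client, ... */) {
            // ...export the chunk to Neo4j as in the original script...
            file_put_contents($checkpoint, (string) $rows->last()->id);
        }
    );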

@betd-claumond

@monyxie thanks for the feedback!
