git-p4: auto-size the block

git-p4 originally would fetch changes in one query. On large repos this
could fail because of the limits that Perforce imposes on the number of
items returned and the number of queries in the database.

To fix this, git-p4 learned to query changes in blocks of 512 changes.
However, this can be very slow: with a few million changes and each
block taking about a second, a fetch can take an hour or so.
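
As a rough check of that estimate, the sketch below counts the blocks needed and converts them to minutes. The figure of 2,000,000 changes and the helper name `estimated_minutes` are illustrative assumptions, not taken from the commit; the one-second-per-block cost is the approximation given above.

```python
import math


def estimated_minutes(total_changes, block_size, seconds_per_block=1.0):
    """Rough fetch time: one server query per block, at a fixed cost each."""
    blocks = math.ceil(total_changes / block_size)
    return blocks * seconds_per_block / 60


# Assumed workload: 2,000,000 changes fetched in blocks of 512.
print(round(estimated_minutes(2_000_000, 512)))  # about 65 minutes
```

With a much larger block size the same workload needs only a handful of queries, provided the server accepts them.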

Although it's possible to tune this value manually with the
"--changes-block-size" option, it's far from obvious to ordinary users
that this is what needs doing.

This change alters the block size dynamically by looking for the
specific error messages returned from the Perforce server, and reducing
the block size if the error is seen, either to the limit reported by the
server, or to half the current block size.

That means we can start out with a very large block size, and then let
it automatically drop down to a value that works without error, while
still failing correctly if some other error occurs.
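
The strategy described above can be sketched in isolation as follows. This is a minimal illustration, not the git-p4 code: `RequestTooLarge`, `fake_query` and `fetch_all` are hypothetical names, and the fake server simply rejects any range wider than an assumed limit of 1000 changes.

```python
class RequestTooLarge(Exception):
    """Hypothetical stand-in for a server 'request too large' error."""

    def __init__(self, limit):
        super().__init__("request too large (limit {0})".format(limit))
        self.limit = limit


def fake_query(start, end, server_limit=1000):
    """Hypothetical server: rejects ranges wider than server_limit changes."""
    if end - start + 1 > server_limit:
        raise RequestTooLarge(server_limit)
    return list(range(start, end + 1))


def fetch_all(first, last, block_size=1 << 20, query=fake_query):
    """Fetch changes first..last, shrinking block_size on size errors."""
    changes = []
    start = first
    while start <= last:
        end = min(last, start + block_size - 1)
        try:
            block = query(start, end)
        except RequestTooLarge as e:
            if block_size > e.limit:
                # Drop straight to the limit the server reported.
                block_size = e.limit
            else:
                # Already at or below the reported limit: halve instead.
                block_size = max(2, block_size // 2)
            continue  # retry the same range with the smaller block
        changes.extend(block)
        start = end + 1
    return changes, block_size


changes, final_size = fetch_all(1, 2500)
print(len(changes), final_size)
```

Only the size error is caught, so any other exception raised by the query still propagates to the caller, which matches the intent of failing correctly on unrelated errors.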

Signed-off-by: Luke Diamand <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
luked99 authored and gitster committed Jun 12, 2018
1 parent 8fa0abf commit 3deed5e
Showing 2 changed files with 29 additions and 6 deletions.
27 changes: 21 additions & 6 deletions git-p4.py
@@ -47,8 +47,8 @@ def __str__(self):
 # Only labels/tags matching this will be imported/exported
 defaultLabelRegexp = r'[a-zA-Z0-9_\-.]+$'

-# Grab changes in blocks of this many revisions, unless otherwise requested
-defaultBlockSize = 512
+# The block size is reduced automatically if required
+defaultBlockSize = 1<<20

 p4_access_checked = False

@@ -958,7 +958,8 @@ def p4ChangesForPaths(depotPaths, changeRange, requestedBlockSize):
     changes = set()

     # Retrieve changes a block at a time, to prevent running
-    # into a MaxResults/MaxScanRows error from the server.
+    # into a MaxResults/MaxScanRows error from the server. If
+    # we _do_ hit one of those errors, turn down the block size

     while True:
         cmd = ['changes']
@@ -972,10 +973,24 @@ def p4ChangesForPaths(depotPaths, changeRange, requestedBlockSize):
         for p in depotPaths:
             cmd += ["%s...@%s" % (p, revisionRange)]

+        # fetch the changes
+        try:
+            result = p4CmdList(cmd, errors_as_exceptions=True)
+        except P4RequestSizeException as e:
+            if not block_size:
+                block_size = e.limit
+            elif block_size > e.limit:
+                block_size = e.limit
+            else:
+                block_size = max(2, block_size // 2)
+
+            if verbose: print("block size error, retrying with block size {0}".format(block_size))
+            continue
+        except P4Exception as e:
+            die('Error retrieving changes description ({0})'.format(e.p4ExitCode))
+
         # Insert changes in chronological order
-        for entry in reversed(p4CmdList(cmd)):
-            if entry.has_key('p4ExitCode'):
-                die('Error retrieving changes descriptions ({})'.format(entry['p4ExitCode']))
+        for entry in reversed(result):
             if not entry.has_key('change'):
                 continue
             changes.add(int(entry['change']))
8 changes: 8 additions & 0 deletions t/t9818-git-p4-block.sh
@@ -129,6 +129,7 @@ test_expect_success 'Create a repo with multiple depot paths' '
 '

 test_expect_success 'Clone repo with multiple depot paths' '
+	test_when_finished cleanup_git &&
 	(
 		cd "$git" &&
 		git p4 clone --changes-block-size=4 //depot/pathA@all //depot/pathB@all \
@@ -138,6 +139,13 @@ test_expect_success 'Clone repo with multiple depot paths' '
 	)
 '

+test_expect_success 'Clone repo with self-sizing block size' '
+	test_when_finished cleanup_git &&
+	git p4 clone --changes-block-size=1000000 //depot@all --destination="$git" &&
+	git -C "$git" log --oneline >log &&
+	test_line_count \> 10 log
+'
+
 test_expect_success 'kill p4d' '
 	kill_p4d
 '