Changed tutorial to run without a cogserver #16

Open. andre-senna wants to merge 3 commits into base branch cogserver_tutorials_branch.

andre-senna commented Oct 16, 2018

The tutorial now runs without a cogserver, but a few points are worth noting:


(1) Although this tutorial runs using ECAN-based rules and the ECAN-based GHOST API, the ECAN dynamics themselves are disabled. I could not figure out how to give the rules the STI they need so that GHOST can properly match them. Even when I ran it with a cogserver (and started all ECAN agents), OpenPSI could not properly match my rules.

This may or may not be a problem; I don't know. My strong opinion is that this tutorial should not go into detail about ECAN and attention dynamics. I think it should focus on how to write rules and make them match in a sensible way.


(2) To run the tutorial, 2 files in the Docker container need to be modified:

/usr/local/share/opencog/scm/opencog/openpsi/main.scm
line # 14: (cog-logger-set-level! opl "info")
line # 132: (usleep 10000000)

/usr/local/share/opencog/scm/opencog/ghost/matcher.scm
line # 283: (rule-selected (eval-and-select candidate-rules #t)))

I don't know the proper way to do that, so I left it to someone else. I've seen that the scm files in /usr/local/share/opencog/scm/opencog/ (in the Docker container) differ from the ones in the opencog repo (the repo used inside the same container).

The change in matcher.scm allows the tutorial to run without ECAN dynamics. The changes in main.scm are a hack to work around the problem I reported regarding the Jupyter Notebook crashing after some PSI cycles.

The usleep is the delay between two PSI cycles. The value I suggested (10000000 microseconds) is equivalent to 10 s. That's too much, I know, but with this delay the aforementioned crash happens only with a very low probability.
1 s is enough for a good user experience, but the crash will probably happen less than 1 minute after calling (ghost-run).
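
Since usleep takes its argument in microseconds, the delay values discussed above can be derived as in the small sketch below. `seconds->usec` is a hypothetical helper introduced only for illustration; it is not part of OpenPSI or main.scm.

```scheme
;; Hypothetical helper (not part of OpenPSI): convert a delay in seconds
;; to the microsecond count that usleep expects.
(define (seconds->usec seconds)
  (inexact->exact (round (* seconds 1000000))))

(seconds->usec 10)  ; 10000000, the 10 s delay suggested for line 132
(seconds->usec 1)   ; 1000000, the 1 s delay mentioned above
```

For example, `(usleep (seconds->usec 1))` sleeps for roughly one second.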


(3) I've tried to track down the crash. It happens in ZMQ while Jupyter Notebook is trying to ACCEPT on the socket with the user interface. It seems that the socket suddenly becomes unavailable. The behavior is really odd because:

a) It is surely related to OpenPSI or GHOST. Changing the sleep time also changes the probability of the crash. If the sleep time is set high enough, it never crashes.

b) Unless the logger is misleading me (by not delivering debug messages in a timely manner), the PSI thread in charge of running GHOST is always sleeping when the crash happens.

c) The crash happens only in the Notebook, never in a standalone guile session. Actually, the crash happens in the SCM file called when the Notebook is set up (i.e. in the SCM kernel: simple-zmq.scm).


andre-senna (Author) commented

BTW, I'm not sure this is relevant, but the offending command is obviously (ghost-run): without it the crash never happens. It creates and starts a thread to run PSI. I've noticed that Jupyter Notebook returns this output:

#<thread 140645520672512 (23b2a80)

It's missing the ">" at the end. Does that ring any bells?

This fix is meant to work around the problem reported here:
singnet#16

I've just wrapped the call that was raising the exception in a
catch (with no exception handling; the exception is simply ignored)

andre-senna (Author) commented

I couldn't find out why the Jupyter notebook is crashing, so I've changed the kernel SCM file to catch the exception and move on (no exception handling; it's just ignored).

I've added the changed SCM file to the PR, as well as a file describing all the changes that need to be made to the Docker container files:
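
The catch-and-ignore pattern described above can be sketched roughly as follows. This is only an illustration of the technique, not the actual change made to guile-jupyter-kernel.scm; `flaky-call` and `ignore-errors` are hypothetical names standing in for the kernel call that raises the exception and for the wrapper around it.

```scheme
;; Hypothetical stand-in for the kernel call that raises the exception.
(define (flaky-call)
  (throw 'socket-unavailable "accept failed"))

;; Run THUNK; if it throws anything, swallow the exception and return #f.
;; Passing #t as the key makes catch match every exception key.
(define (ignore-errors thunk)
  (catch #t
    thunk
    (lambda (key . args) #f)))

(ignore-errors flaky-call)           ; returns #f instead of raising
(ignore-errors (lambda () (+ 1 2)))  ; returns 3, normal results pass through
```

Ignoring every exception this way also hides unrelated failures, which is why it is better seen as a stopgap than as a fix.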

sudo vi /usr/local/share/opencog/scm/opencog/openpsi/main.scm
line # 14: (cog-logger-set-level! opl "info")
line # 132: (usleep 100000)

sudo vi /usr/local/share/opencog/scm/opencog/ghost/matcher.scm
line # 283: (rule-selected (eval-and-select candidate-rules #t)))

sudo cp /opencog/notebooks/guile-jupyter-kernel.scm /usr/local/share/jupyter/kernels/guile-kernel/src/guile-jupyter-kernel.scm

The first change is not strictly necessary, but it's recommended, especially the change in the log level. Otherwise the log file will grow quickly and can potentially crash the server with "no space left on device".

The change to the Jupyter Guile kernel should do no harm to the other notebooks.

andre-senna (Author) commented Oct 18, 2018

I've tracked down the bug and it seems to be in the notebook kernel. The kernel crashes whenever one starts a thread in a cell. A notebook with this single cell is enough to crash it:

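;; Loop forever, sleeping 10 microseconds per iteration.
(define (my-func) (while #t (usleep 10)))
;; Run the loop in a new thread; the handler ignores any exception raised there.
(call-with-new-thread my-func (lambda (key . args) #t))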

Any notebook using any opencog feature that uses threads may be affected.
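
To compare behavior between a standalone guile session and a notebook cell, a bounded variant of the cell above may be more convenient, since it terminates on its own. The sketch below is an assumption about what such a check could look like; `bounded-loop` and `counter` are hypothetical names, and whether this variant also triggers the kernel crash has not been verified here.

```scheme
(use-modules (ice-9 threads))

;; Number of iterations completed by the worker thread.
(define counter 0)

;; Same shape as the original loop, but stops after 100 iterations.
(define (bounded-loop)
  (let loop ((i 0))
    (when (< i 100)
      (usleep 10)
      (set! counter (+ counter 1))
      (loop (+ i 1)))))

;; Start the worker in a new thread and wait for it to finish.
(define worker (call-with-new-thread bounded-loop))
(join-thread worker)

counter  ; 100 once the thread has been joined
```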

I've already created an issue in the kernel's GitHub repo.

Meanwhile, the workaround I've described in this PR will prevent the bug from crashing the notebooks. I suggest using it in all notebooks until the kernel's author provides a proper fix.
