Merge pull request #52 from karlicoss/updates
Updates
karlicoss authored May 18, 2020
2 parents 41c5b34 + 02ba71a commit c410daa
Showing 12 changed files with 204 additions and 78 deletions.
19 changes: 18 additions & 1 deletion doc/DEVELOPMENT.org
@@ -1,8 +1,23 @@
* TOC
:PROPERTIES:
:TOC: :include all :depth 3
:END:

:CONTENTS:
- [[#toc][TOC]]
- [[#running-tests][Running tests]]
- [[#ide-setup][IDE setup]]
- [[#linting][Linting]]
- [[#modifyingadding-modules][Modifying/adding modules]]
:END:

* Running tests
I'm using =tox= to run test/lint. You can check out [[file:../.github/workflows/main.yml][Github Actions]] config
and [[file:../scripts/ci/run]] for the up to date info on the specifics.

* IDE setup: make sure my.config is in your package search path
* IDE setup
To benefit from type hinting, make sure =my.config= is in your package search path.

In runtime, ~my.config~ is imported from the user config directory dynamically.

However, Pycharm/Emacs/whatever you use won't be able to figure that out, so you'd need to adjust your IDE configuration.
@@ -43,3 +58,5 @@ Now if you add =my_reddit_overlay= *in the front* of ~PYTHONPATH~, all the downs
This could be useful to monkey patch some behaviours, or dynamically add some extra data sources -- anything that comes to your mind.

I'll put up a better guide on this, in the meantime see [[https://packaging.python.org/guides/packaging-namespace-packages]["namespace packages"]] for more info.
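
A minimal sketch of the overlay mechanism described above. It assumes a hypothetical namespace package called =demo_ns= (standing in for =my=) and throwaway temporary directories, and it prepends to ~sys.path~ instead of setting ~PYTHONPATH~. It only illustrates the general rule that, for namespace packages, the directory that comes first in the search path wins:

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_tree(root: Path, source: str, modules: list) -> None:
    # A namespace package is just a directory *without* an __init__.py.
    pkg = root / 'demo_ns'  # hypothetical package name, stands in for 'my'
    pkg.mkdir(parents=True)
    for name in modules:
        (pkg / f'{name}.py').write_text(f'SOURCE = {source!r}\n')


def demo() -> dict:
    with tempfile.TemporaryDirectory() as td:
        base = Path(td) / 'base'
        overlay = Path(td) / 'overlay'
        make_tree(base, 'base', ['reddit', 'twitter'])
        make_tree(overlay, 'overlay', ['reddit'])  # overrides only 'reddit'

        # Same effect as putting the overlay in the front of PYTHONPATH.
        sys.path[:0] = [str(overlay), str(base)]
        try:
            importlib.invalidate_caches()
            reddit = importlib.import_module('demo_ns.reddit')
            twitter = importlib.import_module('demo_ns.twitter')
            return {'reddit': reddit.SOURCE, 'twitter': twitter.SOURCE}
        finally:
            # Undo the changes so the demo leaves no traces behind.
            for p in (str(overlay), str(base)):
                sys.path.remove(p)
            for name in [n for n in sys.modules if n.split('.')[0] == 'demo_ns']:
                del sys.modules[name]


result = demo()
print(result)
```

The overlay shadows only the module it redefines; everything else still resolves to the base directory.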

# TODO add example with overriding 'all'
53 changes: 37 additions & 16 deletions doc/MODULES.org
@@ -1,11 +1,31 @@
This file is an overview of *documented* modules.
There are many more, see [[file:../README.org::#whats-inside]["What's inside"]] for the full list of modules, I'm progressively working on documenting them.

* TOC
:PROPERTIES:
:TOC: :include all
:END:
:CONTENTS:
- [[#toc][TOC]]
- [[#intro][Intro]]
- [[#configs][Configs]]
- [[#mygoogletakeoutpaths][my.google.takeout.paths]]
- [[#myhypothesis][my.hypothesis]]
- [[#myreddit][my.reddit]]
- [[#mytwittertwint][my.twitter.twint]]
- [[#mytwitterarchive][my.twitter.archive]]
- [[#mylastfm][my.lastfm]]
- [[#myreadingpolar][my.reading.polar]]
- [[#myinstapaper][my.instapaper]]
:END:

* Intro

See [[file:SETUP.org][SETUP]] to find out how to set up your own config.

Some explanations:

- =MY_CONFIG= is whereever you are keeping your private configuration (usually =~/.config/my/=)
- =MY_CONFIG= is the path where you are keeping your private configuration (usually =~/.config/my/=)
- [[https://docs.python.org/3/library/pathlib.html#pathlib.Path][Path]] is a standard Python object to represent paths
- [[https://github.com/karlicoss/HPI/blob/5f4acfddeeeba18237e8b039c8f62bcaa62a4ac2/my/core/common.py#L9][PathIsh]] is a helper type to allow using either =str=, or a =Path=
- [[https://github.com/karlicoss/HPI/blob/5f4acfddeeeba18237e8b039c8f62bcaa62a4ac2/my/core/common.py#L108][Paths]] is another helper type for paths.
@@ -21,12 +41,15 @@ Some explanations:

- if the field has a default value, you can omit it from your private config altogether
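
A rough sketch of how these helper types might be used. =PathIsh= matches the definition in =my/core/common.py=; the definition of =Paths= and the =to_paths= helper are assumptions made for illustration, not the library's actual code:

```python
from pathlib import Path
from typing import List, Sequence, Union

PathIsh = Union[Path, str]                 # as defined in my/core/common.py
Paths = Union[Sequence[PathIsh], PathIsh]  # assumption: a single path or several


def to_paths(pp: Paths) -> List[Path]:
    """Hypothetical helper: normalise a Paths value into a list of Path objects."""
    if isinstance(pp, (str, Path)):
        # A single str/Path is wrapped, so callers always get a list back.
        pp = [pp]
    return [Path(p) for p in pp]


print(to_paths('exports/reddit'))
print(to_paths([Path('a.json'), 'b.json']))
```

Accepting both forms keeps the private config short: a single string is enough for the common case, and a list works when the data lives in several places.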

* Configs

The config snippets below are meant to be modified accordingly and *pasted into your private configuration*, e.g =$MY_CONFIG/my/config.py=.

You don't have to set them up all at once, it's recommended to do it gradually.

#+begin_src python :dir .. :results output drawer :exports result
# TODO hmm. drawer raw means it can output outlines, but then have to manually erase the generated results. ugh.

#+begin_src python :dir .. :results output drawer raw :exports result
# TODO ugh, pkgutil.walk_packages doesn't recurse and find packages like my.twitter.archive??
import importlib
# from lint import all_modules # meh
@@ -63,7 +86,7 @@ for cls, p in modules:
    for x in ['.py', '__init__.py']:
        if Path(mpath + x).exists():
            mpath = mpath + x
    print(f'- [[file:../{mpath}][{p}]]')
    print(f'** [[file:../{mpath}][{p}]]')
    mdoc = m.__doc__
    if mdoc is not None:
        print(indent(mdoc))
@@ -73,18 +96,17 @@ for cls, p in modules:
#+end_src

#+RESULTS:
:results:


- [[file:../my/google/takeout/paths.py][my.google.takeout.paths]]
** [[file:../my/google/takeout/paths.py][my.google.takeout.paths]]

Module for locating and accessing [[https://takeout.google.com][Google Takeout]] data

#+begin_src python
class google:
    takeout_path: Paths # path/paths/glob for the takeout zips
#+end_src
- [[file:../my/hypothesis.py][my.hypothesis]]
** [[file:../my/hypothesis.py][my.hypothesis]]

[[https://hypothes.is][Hypothes.is]] highlights and annotations

@@ -98,10 +120,10 @@ for cls, p in modules:
export_path: Paths

# path to a local clone of hypexport
# alternatively, you can put the repository (or a symlink) in $MY_CONFIG/repos/hypexport
# alternatively, you can put the repository (or a symlink) in $MY_CONFIG/my/config/repos/hypexport
hypexport : Optional[PathIsh] = None
#+end_src
- [[file:../my/reddit.py][my.reddit]]
** [[file:../my/reddit.py][my.reddit]]

Reddit data: saved items/comments/upvotes/etc.

@@ -115,10 +137,10 @@ for cls, p in modules:
export_path: Paths

# path to a local clone of rexport
# alternatively, you can put the repository (or a symlink) in $MY_CONFIG/repos/rexport
# alternatively, you can put the repository (or a symlink) in $MY_CONFIG/my/config/repos/rexport
rexport : Optional[PathIsh] = None
#+end_src
- [[file:../my/twitter/twint.py][my.twitter.twint]]
** [[file:../my/twitter/twint.py][my.twitter.twint]]

Twitter data (tweets and favorites).

@@ -128,15 +150,15 @@ for cls, p in modules:
class twint:
    export_path: Paths # path[s]/glob to the twint Sqlite database
#+end_src
- [[file:../my/twitter/archive.py][my.twitter.archive]]
** [[file:../my/twitter/archive.py][my.twitter.archive]]

Twitter data (uses [[https://help.twitter.com/en/managing-your-account/how-to-download-your-twitter-archive][official twitter archive export]])

#+begin_src python
class twitter:
    export_path: Paths # path[s]/glob to the twitter archive takeout
#+end_src
- [[file:../my/lastfm][my.lastfm]]
** [[file:../my/lastfm][my.lastfm]]

Last.fm scrobbles

@@ -147,7 +169,7 @@ for cls, p in modules:
"""
export_path: Paths
#+end_src
- [[file:../my/reading/polar.py][my.reading.polar]]
** [[file:../my/reading/polar.py][my.reading.polar]]

[[https://github.com/burtonator/polar-books][Polar]] articles and highlights

@@ -159,7 +181,7 @@ for cls, p in modules:
polar_dir: PathIsh = Path('~/.polar').expanduser()
defensive: bool = True # pass False if you want it to fail faster on errors (useful for debugging)
#+end_src
- [[file:../my/instapaper.py][my.instapaper]]
** [[file:../my/instapaper.py][my.instapaper]]

[[https://www.instapaper.com][Instapaper]] bookmarks, highlights and annotations

@@ -172,7 +194,6 @@ for cls, p in modules:
export_path : Paths

# path to a local clone of instapexport
# alternatively, you can put the repository (or a symlink) in $MY_CONFIG/repos/instapexport
# alternatively, you can put the repository (or a symlink) in $MY_CONFIG/my/config/repos/instapexport
instapexport: Optional[PathIsh] = None
#+end_src
:end:
104 changes: 67 additions & 37 deletions doc/SETUP.org
@@ -2,8 +2,36 @@
Please don't be shy and raise issues if something in the instructions is unclear.
You'd be really helping me, I want to make the setup as straightforward as possible!

# update with org-make-toc
* TOC
:PROPERTIES:
:TOC: :include all
:END:

:CONTENTS:
- [[#toc][TOC]]
- [[#few-notes][Few notes]]
- [[#setting-up-the-main-package][Setting up the main package]]
- [[#option-1-install-from-pip][option 1: install from PIP]]
- [[#option-2-local-install][option 2: local install]]
- [[#option-3-use-without-installing][option 3: use without installing]]
- [[#optional-packages][Optional packages]]
- [[#setting-up-the-modules][Setting up the modules]]
- [[#private-configuration-myconfig][private configuration (my.config)]]
- [[#module-dependencies][module dependencies]]
- [[#usage-examples][Usage examples]]
- [[#end-to-end-roam-research-setup][End-to-end Roam Research setup]]
- [[#polar][Polar]]
- [[#google-takeout][Google Takeout]]
- [[#kobo-reader][Kobo reader]]
- [[#orger][Orger]]
- [[#orger--polar][Orger + Polar]]
- [[#demopy][demo.py]]
:END:


* Few notes
I understand people may not be super familiar with Python, PIP or generally unix, so here are some short notes:
I understand people may not be super familiar with Python, PIP or generally unix, so here are some useful notes:

- only python3 is supported, and more specifically, ~python >= 3.6~.
- I'm using ~pip3~ command, but on your system you might only have ~pip~.
@@ -13,7 +41,7 @@ I understand people may not super familiar with Python, PIP or generally unix, s
- similarly, I'm using =python3= in the documentation, but if your =python --version= says python3, it's okay to use =python=

- when you are using ~pip install~, [[https://stackoverflow.com/a/42989020/706389][always pass]] =--user=, and *never install third party packages with sudo* (unless you know what you are doing)
- throughout the guide I'm assuming the config directory is =~/.config=, but it's different on Mac/Windows.
- throughout the guide I'm assuming the user config directory is =~/.config=, but it's *different on Mac/Windows*.

See [[https://github.com/ActiveState/appdirs/blob/3fe6a83776843a46f20c2e5587afcffe05e03b39/appdirs.py#L187-L190][this]] if you're not sure what's your user config dir.

@@ -22,12 +50,12 @@ This is a *required step*

You can choose one of the following options:

** install from [[https://pypi.org/project/HPI][PIP]]
This is the easiest way:
** option 1: install from [[https://pypi.org/project/HPI][PIP]]
This is the *easiest way*:

: pip3 install --user HPI

** local install
** option 2: local install
This is convenient if you're planning to add new modules or change the existing ones.

1. Clone the repository: =git clone git@github.com:karlicoss/HPI.git /path/to/hpi=
@@ -39,7 +67,7 @@ This is convenient if you're planning to add new modules or change the existing

It's *extremely* convenient for developing and debugging.

** use without installing
** option 3: use without installing
This is less convenient, but gives you more control.

1. Clone the repository: =git clone git@github.com:karlicoss/HPI.git /path/to/hpi=
@@ -59,7 +87,7 @@ This is less convenient, but gives you more control.

The benefit of this way is that you get a bit more control, explicitly allowing your scripts to use your data.

** optional packages
* Optional packages
You can also install some optional packages:

: pip3 install 'HPI[optional]'
@@ -69,12 +97,14 @@ They aren't necessary, but improve your experience. At the moment these are:
- [[https://github.com/karlicoss/cachew][cachew]]: automatic caching library, which can greatly speedup data access
- [[https://github.com/metachris/logzero][logzero]]: a nice logging library, supporting colors

* Setting up the modules
This is an *optional step* as some modules might work without extra setup.
* Setting up modules
This is an *optional step* as few modules work without extra setup.
But it depends on the specific module.

See [[file:MODULES.org][MODULES]] to read documentation on specific modules that interest you.

You might also find it interesting to read [[file:CONFIGURING.org][CONFIGURING]], where I'm
elaborating on some rationales behind the current configuration system.
elaborating on some technical rationales behind the current configuration system.

** private configuration (=my.config=)
# TODO write about dynamic configuration
@@ -87,7 +117,7 @@ The config is simply a *python package* (named =my.config=), expected to be in =

Since it's a Python package, generally it's very *flexible* and there are many ways to set it up.

- The simplest and very minimum you need is =~/.config/my/my/config.py=. For example:
- *The simplest and the very minimum* you need is =~/.config/my/my/config.py=. For example:

#+begin_src python
import pytz # yes, you can use any Python stuff in the config
@@ -116,32 +146,6 @@ Since it's a Python package, generally it's very *flexible* and there are many w

- or you can just try running them and fill in the attributes Python complains about!

- My config layout is a bit more complicated:

#+begin_src python :exports results :results output
from pathlib import Path
home = Path("~").expanduser()
pp = home / '.config/my/my/config'
for p in sorted(pp.rglob('*')):
    if '__pycache__' in p.parts:
        continue
    ps = str(p).replace(str(home), '~')
    print(ps)
#+end_src

#+RESULTS:
#+begin_example
~/.config/my/my/config/__init__.py
~/.config/my/my/config/locations.py
~/.config/my/my/config/repos
~/.config/my/my/config/repos/endoexport
~/.config/my/my/config/repos/fbmessengerexport
~/.config/my/my/config/repos/kobuddy
~/.config/my/my/config/repos/monzoexport
~/.config/my/my/config/repos/pockexport
~/.config/my/my/config/repos/rexport
#+end_example

- Another example is in [[file:example_config][example_config]]:

#+begin_src bash :exports results :results output
@@ -183,6 +187,32 @@ Feel free to add other files as well though to organize better, it's a real Pyth
Some things (e.g. links to external packages like [[https://github.com/karlicoss/hypexport][hypexport]]) are specified as *ordinary symlinks* in ~repos~ directory.
That way you get easy imports (e.g. =import my.config.repos.hypexport.model=) and proper IDE integration.

- my own config layout is a bit more complicated:

#+begin_src python :exports results :results output
from pathlib import Path
home = Path("~").expanduser()
pp = home / '.config/my/my/config'
for p in sorted(pp.rglob('*')):
    if '__pycache__' in p.parts:
        continue
    ps = str(p).replace(str(home), '~')
    print(ps)
#+end_src

#+RESULTS:
#+begin_example
~/.config/my/my/config/__init__.py
~/.config/my/my/config/locations.py
~/.config/my/my/config/repos
~/.config/my/my/config/repos/endoexport
~/.config/my/my/config/repos/fbmessengerexport
~/.config/my/my/config/repos/kobuddy
~/.config/my/my/config/repos/monzoexport
~/.config/my/my/config/repos/pockexport
~/.config/my/my/config/repos/rexport
#+end_example

# TODO link to post about exports?
** module dependencies
Dependencies are different for specific modules you're planning to use, so it's hard to specify.
2 changes: 2 additions & 0 deletions my/cfg.py
@@ -27,3 +27,5 @@ def set_repo(name: str, repo: Union[Path, str]) -> None:


# TODO set_repo is still useful, but perhaps move this thing away to core?

# TODO ok, I need to get rid of this, better to rely on regular imports
10 changes: 10 additions & 0 deletions my/core/common.py
@@ -9,6 +9,7 @@
# some helper functions
PathIsh = Union[Path, str]

# TODO only used in tests? not sure if useful at all.
# TODO port annotations to kython?..
def import_file(p: PathIsh, name: Optional[str]=None) -> types.ModuleType:
    p = Path(p)
@@ -33,6 +34,13 @@ def import_from(path: PathIsh, name: str) -> types.ModuleType:
sys.path.remove(path)


def import_dir(path: PathIsh, extra: str='') -> types.ModuleType:
    p = Path(path)
    if p.parts[0] == '~':
        p = p.expanduser() # TODO eh. not sure about this..
    return import_from(p.parent, p.name + extra)
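
The tilde check used by =import_dir= (and by =get_files= further down) can be tried in isolation. The sketch below uses a hypothetical helper, =expand_tilde=, which is not part of the commit. The guard against empty paths is also an addition of the sketch, since =Path('').parts= is an empty tuple and indexing it would raise =IndexError=:

```python
from pathlib import Path
from typing import Union

PathIsh = Union[Path, str]  # same alias as in my/core/common.py


def expand_tilde(path: PathIsh) -> Path:
    """Hypothetical helper: expand a leading '~' and leave other paths untouched."""
    p = Path(path)
    # 'p.parts and ...' guards against empty paths (not present in the commit).
    if p.parts and p.parts[0] == '~':
        p = p.expanduser()
    return p


print(expand_tilde('~/data/reddit'))  # home directory is substituted
print(expand_tilde('data/~/reddit'))  # '~' is not the first part, left as is
```

Note that this check matches only a bare =~=. A path such as =~otheruser/data= has =~otheruser= as its first part, so it would be left unexpanded.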


T = TypeVar('T')
K = TypeVar('K')
V = TypeVar('V')
@@ -124,6 +132,8 @@ def get_files(pp: Paths, glob: str=DEFAULT_GLOB, sort: bool=True) -> Tuple[Path,

    paths: List[Path] = []
    for src in sources:
        if src.parts[0] == '~':
            src = src.expanduser()
        if src.is_dir():
            gp: Iterable[Path] = src.glob(glob)
            paths.extend(gp)