Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Misc. to-dos from hack-a-thon #4

Closed
25 of 30 tasks
willpearse opened this issue Nov 4, 2019 · 34 comments
Closed
25 of 30 tasks

Misc. to-dos from hack-a-thon #4

willpearse opened this issue Nov 4, 2019 · 34 comments
Assignees

Comments

@willpearse
Copy link
Collaborator

willpearse commented Nov 4, 2019

  • Refactor code to use the new wrappers
    • Consider whether you want to write another wrapper for the page option (you're doing the NULL checking the same way a lot)
  • Run through all functions checking return types (we likely want do.call(rbind, RObject) for many of them; see https://github.com/pearselab/Symbiota2/blob/master/R/Checklists_fxns.R#L18 for an example)
  • Run through all the nearly working functions (see https://github.com/pearselab/Symbiota2/blob/master/R/NearlyWorking_fxns.R)
  • (Optional) Run through all the broken functions (see https://github.com/pearselab/Symbiota2/blob/master/R/NotWorking_fxns.R)
  • Make list of API endpoints to add in the future (including: can't get to work easily, and the API endpoint doesn't really exist yet).
  • Store the code for endpoints that don't work yet "somewhere safe" so we can return to it later (in a 'gist'?)
  • Write help-file entries (can be perfunctory)
  • Remove the function calls from the R files
  • Get ROxygen working and add the man files to the repo
  • Now we know how to use vcr to log web calls:
    • Finish re-writing the tests
    • Add VCR to the tests
    • Once you've gone through the triaged not working functions, add tests for those
  • Change repo name to SymbiotaR2 (note the R)
  • Rename files to remove the _fxns thing (to soothe Will's OCD)
  • Add #' @importFrom statements to the .R files
  • Write vignette:
    • Use the web server address on pris for the vignette for the time being and build/test the vignette with that server up. We can then 'cache' the vignette so it will be in the package regardless
    • Convert the vignette preamble to RMarkdown rather than LaTeX
  • Get the following package building and checking routines working:
    • R CMD build
    • R CMD check
    • R CMD check --as-cran
    • Travis-CI continuous integration build
  • Write paper.md using https://github.com/ropensci/suppdata/blob/master/paper.md as a template/starting point
  • Do we want to check the URL each time a function is called?...
  • ROpenSci checklist (create the issue, and make sure everything we need covered is covered)
  • Use goodpractice::gp() to check whether the code is 'well-written'
  • Send to Ben and Curtis for review
@akoontz11 akoontz11 self-assigned this Nov 4, 2019
@akoontz11
Copy link
Member

Hey, I'm suddenly encountering an error whenever trying to call fromJSON (which is for every function): Error in fromJSON(file = dwn_file) : argument "txt" is missing, with no default.

This seems to have cropped up after the code refactoring, but the same error occurs when I try to run the "original" functions I wrote (in the Symbiota2.R file)...still digging into it a bit, but if you have any ideas, let me know.

@willpearse
Copy link
Collaborator Author

willpearse commented Nov 13, 2019 via email

@akoontz11
Copy link
Member

I've been able to cross out a few items on the above checklist. Here's a brief summary:

  1. Refactored fxns (and made a new wrapper fxn for page returns)
  2. All fxns are either functional (in which case, they're listed in their appropriate script division), not functional (in which case, they're typically lacking an API endpoint or encounter a similar issue in the API, and I've grouped them into a gist here), or they're functions with a problematic/unique command format, compared to all other functions (in which case, they're listed in the file Different.R).
  3. I've forgotten how to incorporate the vcr package or use that for the function tests...my bad. If you send me an example/description of what I need to do, I'd be happy to replicate for all of the tests.
  4. I haven't written the vignette yet (<-- hahahahaha) because I'm not totally sure I remember how to use knitr (which is what I was planning to use), but I'm willing to give it a shot sometime later this week.

Let me know if there are any other items you'd like me to take care here before formally "passing this off" (or at least, taking turns on some of these items).

@willpearse
Copy link
Collaborator Author

  1. Amazing well done!
  2. Also amazing!!! I would make a new issue entitled something like "functions that will need work in the future" and then put a link to that gist in there. That way we won't lose what is an awful lot of work, and can easily come back to it. Label the issue as 'wontfix', but don't close it and then we'll be able to find it later.
  3. Basically, wrap each separate API test call a bit like this:
use_cassette(name="something-unique-and-sensible", {
    test_that("name of test", {
    do <- stuff()
    assert_that(something, isgreat)
    }
}

see https://github.com/ropensci/vcr. I'm happy to do this for you, though, if the tests are written (see below though). I do feel I ought to do something here!
4. The best way to start with the vignette would be to write something as if you were writing a GitHub issue. Use the ``` notation to break out blocks of code, and markdown (what you're using to write these issues) for everything else. knitr can turn that into a vignette! (With a little wrapping, which I can show you).

Some other things...
a. I've messed up the lab server, right? So we can't make the tests yet?... :-( Again, I'm sorry for being such a moron...
b. For your unit tests, make each test_that name unique. So instead of having lots of Categories, try having Categories-id, for example. This isn't a priority; change the names when you're doing the vcr stuff (and make the names match.
c. You've checked the return types, yes? If so, are they all reasonably sensible/easy things (e.g., a matrix, a data.frame, or a numeric or something like that)? I ask because before there were some nested lists and things like that, which it would be good to avoid if we can.

@akoontz11
Copy link
Member

akoontz11 commented Nov 22, 2019

  1. Done.
  2. I/We can address this (and updating the test_that names) after we're in a good place with the return types (see below).
  3. Sounds simple enough, I can let you know when I've gotten something like this scripted up.

a. No longer a problem (following another update/restart).
c. I checked the above box on this, but I shouldn't have--there are still some returns that are lists of lists. Categories is an example:

str(Categories(url=url, ID=1))
trying URL 'http://a02235015-6.bluezone.usu.edu/api/collection/categories/1'
downloaded 467 bytes

List of 12
 $ @context        : chr "/api/contexts/Categories"
 $ @id             : chr "/api/collection/categories/1"
 $ @type           : chr "Categories"
 $ id              : num 1
 $ category        : chr "Intermountain Regional Herbaria Network"
 $ icon            : NULL
 $ acronym         : NULL
 $ url             : NULL
 $ inclusive       : num 1
 $ notes           : NULL
 $ initialTimestamp: chr "2019-01-12T09:57:24+00:00"
 $ collectionId    : chr [1:7] "/api/collections/7" "/api/collections/13" "/api/collections/18" "/api/collections/38" ...

Just to make sure I'm understanding this correctly...you'd recommend calling do.call(rbind, test) on this, such that the output is:

 str(do.call(rbind, Categories(url=url, ID=1)))
trying URL 'http://a02235015-6.bluezone.usu.edu/api/collection/categories/1'
downloaded 467 bytes

 chr [1:8, 1:7] "/api/contexts/Categories" "/api/collection/categories/1" "Categories" "1" ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:8] "@context" "@id" "@type" "id" ...
  ..$ : NULL

@willpearse
Copy link
Collaborator Author

This all sounds good, thanks.

The only difficult bit here is (c). I don't necessarily think, in this case, that you should return a different kind of object (and I think rbind wouldn't be a good option here), but I just want to make sure the return format is something that would be intelligible/useful to people. In this case, I can make sense of what's going on, but if a function were currently returning something like this:

$species
$species[[1]]
[1] "a"

$species[[2]]
[1] "b"

$species[[3]]
[1] "c"


$location
$location[[1]]
[1] "bristol"

$location[[2]]
[1] "logan"

$location[[3]]
[1] "bristol"

...then I think it would be better to return something like this:

  species location
1       a  bristol
2       b    logan
3       c  bristol

...does that make some sense? I'm just asking that you make sure that what we're giving the users is vaguely sensible. It doesn't have to be perfect, but if something could easily and obviously be a table, then it should be in a tabular format (matrix or data.frame). Your example above isn't (only collectionID is a vector of length greater than 1) so I don't see that we need to do anything special. Make sense?

@akoontz11
Copy link
Member

Yeah, definitely--thanks for the feedback. My belief is that many (but not all) of the functions are returning objects similar to the Categories example above, such that there's a long list and one of the list items is a vector of length x. But, I can go through and correct any that should be changed to a different format--this should be fairly straightforward.

@willpearse
Copy link
Collaborator Author

willpearse commented Nov 24, 2019 via email

@akoontz11
Copy link
Member

I'm working through finalizing the tests, now (see Checklists tests for an example). 3 things I'd like you to check/confirm for me:

  1. I'm still including the context() line prior to the vcr wrapping for each pair of functions. From what I understand of the context function (which is not a lot...), this is OK, but if it needs to change let me know.
  2. I'm using the same name= argument for the use_cassette and test_that calls. Do these names need to be different?
  3. Each script has some lines in it referring to functions that aren't currently working, just as placeholders. If you feel these need to be removed, let me know and I can drop them.

@willpearse
Copy link
Collaborator Author

willpearse commented Nov 26, 2019 via email

@akoontz11
Copy link
Member

For 3: each collection of test functions has lines in it that refer to functions that aren't working yet (see here for an example (Children) in the Checklists tests--also Dynamic, ProjectCategories, etc.). This isn't a huge deal but maybe something that needs to be removed before any releasing.

@willpearse
Copy link
Collaborator Author

Those functions that aren't working are no longer in the package, right? They're in the gist references in #5 , right?

At any rate, yeah, delete those, they're not doing anything and they're just a bit confusing.

@akoontz11
Copy link
Member

OK, can do.

Just pushed updated versions of the tests using vcr. Two questions and one note:

  1. In the help-SymbiotaR2.R file, I had to specify the directory in which to store the files generated by vcr. Similarly, we'll have to give the users the ability to specify this directory, right? That is, we can't leave this file (help-SymbiotaR2.R) as it currently is.

  2. All that is In the new folder (vcr_outputs) are .yml files generated by each test. We shouldn't expect any other files to be generated from vcr, right? (Hopefully not, because as far as I can tell, there aren't any.)

  3. I included a Notes.txt file in the main directory that I was using to keep track of some (working) functions that I'd like to have someone else look at, at some point (maybe during a "code review", if that's something we're still considering doing in the future). Mainly, I would like someone to check the output of these and give a thumbs up on the formatting.

(Now onto the vignette.)

@willpearse
Copy link
Collaborator Author

willpearse commented Nov 27, 2019 via email

@akoontz11
Copy link
Member

Just pushed up the vignettes/ folder containing the markdown file, which I based off of the equivalent file from suppdata. Could you check this and tell me what you think needs to be altered? I'm not sure I've done the code blocking correctly...and I'd like any input you have on alterations to the text, as well. After this, maybe we can walk through using knitr.

@willpearse
Copy link
Collaborator Author

willpearse commented Nov 27, 2019 via email

@akoontz11
Copy link
Member

Cool, I've updated the vignette header...do I need to remove the other lines, like the begin{document stuff, or is that all fine?

Also, I've included the #'importFrom statements--I believe these are as they need to be.

I'm not totally sure how to build/test the vignette once the server is up, but I think I remember seeing that there might be a keyboard shortcut for this...still digging.

@willpearse
Copy link
Collaborator Author

willpearse commented Nov 28, 2019 via email

@akoontz11
Copy link
Member

Upon calling the lines in the buildscript below:
Rscript -e "require(roxygen2);roxygenize('~/Symbiota2/code/SymbiotaR2/')"
R CMD build ~/Symbiota2/code/SymbiotaR2
Rscript -e "install.packages('SymbiotaR2_0.0.1-tar.gz')"
R CMD check SymbiotaR2_0.0-1.tar.gz

I'm getting an error in the resulting 00install.out file (in the SymbiotaR2.check directory):

** testing if installed package can be loaded
Error: package or namespace load failed for ‘SymbiotaR2’ in namespaceExport(ns, exports):
undefined exports: --ID, numeric;, a, adding, an, argument, be, check, consider, must, rather, string, than
Error: loading failed

@willpearse
Copy link
Collaborator Author

willpearse commented Dec 4, 2019 via email

@akoontz11
Copy link
Member

Here's what I'm working on: during the build command, the function tests are failing. Here are the relevant lines of the generated 00check.log:

* checking examples ... OK
* checking for unstated dependencies in ‘tests’ ... OK
* checking tests ...
  Running ‘test-all.R’
 ERROR
Running the tests in ‘tests/test-all.R’ failed.
Last 13 lines of output:
  ══ testthat results  ═══════════════════════════════════════════════════════════
  OK: 103 SKIPPED: 0 FAILED: 108
  1. Error: ChecklistProjects_ID (@test_Checklist.R#6) 
  2. Error: ChecklistProjects_page (@test_Checklist.R#13) 
  3. Error: Coordinates_ID (@test_Checklist.R#22) 
  4. Error: Coordinates_page (@test_Checklist.R#29) 
  5. Error: TaxaLink_ID (@test_Checklist.R#38) 
  6. Error: TaxaLink_page (@test_Checklist.R#45) 
  7. Error: Checklists_ID (@test_Checklist.R#54) 
  8. Error: Checklists_page (@test_Checklist.R#61) 
  9. Error: Categories_ID (@test_Collection.R#6) 
  1. ...
  
  Error: testthat unit tests failed
  Execution halted

Those line numbers correspond to the fxn calls occurring within the tests; here's an example. I suspect these are throwing errors because the url argument is not recognized as a working url, but I can't say why not, yet...anyway, still digging, let me know if you've any insights.

@willpearse
Copy link
Collaborator Author

What happens when you run the code that generates the error? Does it return something that the tests expect?...

@akoontz11
Copy link
Member

Yes:

> data <- Categories(url=url, ID=1)
trying URL 'http://a02235015-6.bluezone.usu.edu/api/collection/categories/1'
downloaded 467 bytes

> str(data)
List of 12
 $ @context        : chr "/api/contexts/Categories"
 $ @id             : chr "/api/collection/categories/1"
 $ @type           : chr "Categories"
 $ id              : num 1
 $ category        : chr "Intermountain Regional Herbaria Network"
 $ icon            : logi NA
 $ acronym         : logi NA
 $ url             : logi NA
 $ inclusive       : num 1
 $ notes           : logi NA
 $ initialTimestamp: chr "2019-01-12T09:57:24+00:00"
 $ collectionId    : chr [1:7] "/api/collections/7" "/api/collections/13" "/api/collections/18" "/api/collections/38" ...

Because the same tests work when I run the code outside of the builf script is what leads me to believe the issue is with the url argument.

@willpearse
Copy link
Collaborator Author

willpearse commented Dec 10, 2019 via email

@akoontz11
Copy link
Member

Here, within tests/testthat/test_Collection.R

@willpearse
Copy link
Collaborator Author

willpearse commented Dec 10, 2019 via email

@akoontz11
Copy link
Member

From what I've found, the problem with the tests is indeed the url argument being provided...what allowed me to move past the issue was declaring the url argument at the beginning of each set of tests (see here), but this isn't a feasible solution (right?) because the url will need to change between installation instances.

One other thing from the 00check.log: SymbiotaR2_setup is using a path_home() function (see here) which is suposedly "undefined", i.e. not included in any of the loaded libraries or recognized in base. Is this something that needs to be worried about, or no?

@willpearse
Copy link
Collaborator Author

willpearse commented Dec 11, 2019 via email

@akoontz11
Copy link
Member

Yeah, all the tests are wrapped in vcr commands, which generate the .yml files in tests/casettes folder. But...caching the HTTP request is only useful when the exact same request is being made over and over. So, the files generated by these tests are only relevant to our installation, and not something to be included in the package distribution, right?

Similarly, even when the HTTP requests are cached using our particular url, as soon as that url changes, the cached HTTP requests no longer apply and need to be recreated, right?

@willpearse
Copy link
Collaborator Author

willpearse commented Dec 11, 2019 via email

@akoontz11
Copy link
Member

OK, yes, this makes sense. My bad--I was understanding vcr as mainly something to speed up HTTP requests, rather than a tool that shows the testing is robust. However, given that understanding, I don't see why specification of the url argument for each fxn test is what (seemed to) fix the errors I was getting with the tests...perhaps something to be addressed later, though.

Remaining issues:

  1. The package isn't installing: package 'SymbiotaR2_0.0.1-tar.gz' is not available (for R version 3.4.4)
  2. Two warnings, during the package check:
    - Warning: Files in the 'vignettes' directory but no files in 'inst/doc':
    'SymbiotaR2-intro.Rmd'
    - Getting a complaint saying there are LaTeX errors when creating a PDF of the manual, but when I look at the PDF of the manual, it looks OK? So maybe not an issue.

The above points notwithstanding, I'm reaching a point where I'm not sure what else I need to work on (which isn't to say there aren't things to address--just that I'm not sure what they are). So, let me know if you'd like to plan a review, or if we should ping Ben and Curtis.

@willpearse
Copy link
Collaborator Author

willpearse commented Dec 12, 2019 via email

@akoontz11
Copy link
Member

OK, I've pushed the paper.md file up. Let me know if you recommend any changes to it--I believe the formatting is all there.

@akoontz11
Copy link
Member

Closing this issue, as most of the points in this list are now either already addressed in the code or covered by updates recommended per the ROpenSci review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants