Origin/reweighting #185

Open
wants to merge 92 commits into
base: master
Changes from 61 commits
Commits
9932c84
start working on adding reweighting options for hadron
pittlerf Sep 26, 2019
b0ed038
rw_orig added
pittlerf Sep 26, 2019
9b67c8b
reading function for reweighting factor
pittlerf Sep 26, 2019
b678374
Correcting errors
pittlerf Sep 27, 2019
3dc4bfa
correcting errors
pittlerf Sep 27, 2019
c1eb13c
adding possibility to reverse correlation functions
pittlerf Sep 28, 2019
754d287
correcting errors
pittlerf Sep 29, 2019
4b26f9f
renaming and error search
pittlerf Sep 30, 2019
12f0654
corrected errors
pittlerf Sep 30, 2019
add1d1a
after multiplying two reweighting factors, you could not increase sta…
pittlerf Sep 30, 2019
32e577a
correcting error
pittlerf Sep 30, 2019
bed0b43
Implementing reweighting: setting the bootstrap or jackknife samples …
pittlerf Sep 30, 2019
b88a01a
removing testing print statements
pittlerf Sep 30, 2019
95e7ce7
correcting errors
pittlerf Sep 30, 2019
7275044
Performing gauge conf list check in reweighting
pittlerf Oct 1, 2019
c9aa2a5
Including it in jackknife as well
pittlerf Oct 1, 2019
a871557
Typo corrected
pittlerf Oct 1, 2019
6e6664d
Specify dependency on dplyr
pittlerf Oct 2, 2019
ed4879c
function that averages correlation functions just over the stochastic…
pittlerf Oct 2, 2019
b35557c
Comparing two vectors with identical()
pittlerf Oct 2, 2019
1240d30
by mistake I have deleted man/addStat.cf.Rd
pittlerf Oct 2, 2019
191f0ba
Errors for Rd file jackknife corrected
pittlerf Oct 2, 2019
430be22
Errors for Rd file cf_boot corrected
pittlerf Oct 2, 2019
08ef589
Further errors corrected
pittlerf Oct 2, 2019
0bbd5a2
Correct cf_boot
pittlerf Oct 2, 2019
fc6e7dd
additional Rd files uploaded
pittlerf Oct 2, 2019
1cf72f3
removing averaging over stochastic samples
pittlerf Oct 3, 2019
17c3058
correcting the appropriate man file as well
pittlerf Oct 3, 2019
81db664
correcting for the behaviour of is.na in if statement
pittlerf Oct 3, 2019
0dda8b2
correcting testing for icf
pittlerf Oct 3, 2019
548eb99
Simplifying bootstrap_rwcf and jackknife_rwcf
pittlerf Oct 8, 2019
0fb2b42
Using functions from branch icf_support
pittlerf Nov 8, 2019
f17f73a
input parameter types for jackknife_rw.cf
pittlerf Nov 8, 2019
2cf8814
Adding more clarifications and correcting typo
pittlerf Nov 8, 2019
0d5ab14
Merge pull request #195 from HISKP-LQCD/icf_support
pittlerf Nov 8, 2019
865dcaf
Merge remote-tracking branch 'origin/origin/reweighting' into origin/…
pittlerf Nov 12, 2019
74b3f53
try to upload modifications
pittlerf Nov 12, 2019
67ad0c1
Consistent assignments
pittlerf Nov 12, 2019
25e64dc
mixin reintroducted
pittlerf Nov 12, 2019
284c9dd
Example for addStat.cf documentation
pittlerf Nov 12, 2019
dd99b7c
the same explanation for combining replicas for reweighting factors
pittlerf Nov 12, 2019
184f86e
including dplyr in DESCRIPTION
pittlerf Nov 12, 2019
bd71437
for names of columns in reweighting factor made some explanations
pittlerf Nov 12, 2019
dbcbe71
remove default argument for conf.index
pittlerf Nov 12, 2019
567e108
Rd files added
pittlerf Nov 12, 2019
d60a89e
adding red colors for plotting negativ values
pittlerf Nov 12, 2019
f93326d
uploading Rd file
pittlerf Nov 12, 2019
b3d9073
correcting names of columns in tmp
pittlerf Nov 13, 2019
b226aa5
Finalize merging
pittlerf Nov 14, 2019
7e0d32d
trying to fix conflicts
pittlerf Nov 26, 2019
4140e6e
trying to fix conflicts
pittlerf Nov 26, 2019
6d6d7ae
trying to correct conflicts
pittlerf Nov 26, 2019
4f9111e
trying to fix conflicts
pittlerf Nov 26, 2019
d64dec0
resolving conflicts
pittlerf Nov 26, 2019
266da59
resolving conflicts again
pittlerf Nov 26, 2019
603f177
resolve intergration error
pittlerf Nov 26, 2019
a714c51
adding errors in plotting the reweighting factors
pittlerf Nov 26, 2019
ea0b6e6
changing back to roxygen700
pittlerf Nov 26, 2019
b43b5d3
fixing conflicts
pittlerf Nov 26, 2019
aa378e9
resolving integration test further
pittlerf Nov 26, 2019
71bbeb8
Removing str
pittlerf Nov 26, 2019
530a082
read.table instead of readcmidatafiles
pittlerf Dec 23, 2019
b32a9d5
computing the error from stochastic samples on each gauge configuration
pittlerf Dec 23, 2019
54f47d9
Aborting, when there is not an apprioriate monomial id in source file
pittlerf Dec 23, 2019
0c3ece7
some comments for multiplying reweighting factors
pittlerf Dec 23, 2019
f36f656
stop explicitely, when dplyr is not available
pittlerf Dec 23, 2019
c6af092
updating docu files
pittlerf Dec 23, 2019
75367e6
typo corrected
pittlerf Dec 27, 2019
60bdf87
some update
pittlerf Dec 27, 2019
c848a72
some comments added
pittlerf Dec 30, 2019
235d318
documentation updated
pittlerf Dec 30, 2019
300b289
documentation updated
pittlerf Dec 30, 2019
f49b156
documentation updated
pittlerf Dec 30, 2019
95ca6dc
simple tests for reweighting: apply randomly generated factors and it…
pittlerf Dec 30, 2019
922457d
documentation for samplerw, it was generated with Gaussian rng
pittlerf Dec 30, 2019
607a131
documentation updated
pittlerf Dec 30, 2019
1785420
Merge branch 'master' into origin/reweighting
martin-ueding Jan 7, 2020
04b99ed
Also delete generated documentation herre
martin-ueding Jan 7, 2020
69b0f2a
Rd files no longer in repo
urbach Apr 9, 2021
35f7cac
merge master into branch
urbach Apr 9, 2021
5ef9a26
Merge remote-tracking branch 'upstream/master' into reweighting
urbach Apr 9, 2021
9553a55
add missing export
urbach Apr 9, 2021
540a1d2
further fixes
urbach Apr 9, 2021
039d134
How to vignette for reweighting process
pittlerf Apr 11, 2021
710ae7b
sorry, I have not realized this, now it is done
pittlerf Apr 13, 2021
2718d04
including missing examples and returns, in the plotting routine use t…
pittlerf Apr 16, 2021
577becf
creating unit reweighting factor with the appropriate constructor
pittlerf Apr 16, 2021
1ac22f2
try correcting integration test
pittlerf Apr 16, 2021
27b27e8
try correcting integration test
pittlerf Apr 16, 2021
673a119
correcting integration test
pittlerf Apr 20, 2021
04ad406
initializing momonialid in order to avoid warning message
pittlerf May 26, 2021
3fe0426
correcting the example for rw_meta, and break the example for rw_orig…
pittlerf May 26, 2021
1 change: 1 addition & 0 deletions DESCRIPTION
@@ -15,6 +15,7 @@ Maintainer: Carsten Urbach <[email protected]>
SystemRequirements: Gnu Scientific Library version >= 1.8
Description: Toolkit to extract hadronic quantities from Lattice QCD simulations. It contains functionality for IO, plotting, bootstrap and jackknife resampling, fitting, GEVP solving, error and autocorrelation estimation as well as other areas.
Imports:
dplyr,
Member

I think it might be okay to make this an optional dependency as reweighting is a bit of an edge-case.

Member

ping

Contributor Author

I have now included a stop condition for the case that dplyr is not available. The same was done in cyprus_readutils.

Member

still, it would be nice if dplyr were an optional dependency at this stage
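
One way the optional dependency discussed in this thread could look is sketched below. It assumes dplyr is moved from `Imports:` to `Suggests:` in DESCRIPTION; the helper name `filter_monomial` and its arguments are hypothetical and not part of this PR.

```r
# Hypothetical helper, not part of the PR: select the rows of a data frame
# that belong to one monomial. dplyr is used only if it is installed,
# otherwise plain base R subsetting does the same job.
filter_monomial <- function(df, id) {
  stopifnot(is.data.frame(df), "monomialid" %in% names(df))
  if (requireNamespace("dplyr", quietly = TRUE)) {
    return(dplyr::filter(df, monomialid == id))
  }
  df[df$monomialid == id, , drop = FALSE]
}

df <- data.frame(monomialid = c(1L, 2L, 1L),
                 reweightingfactor = c(0.1, 0.2, 0.3))
res <- filter_monomial(df, 1L)
```

With a fallback like this the reader would not need to stop when dplyr is missing; whether the extra branch is worth maintaining is a judgment call for the maintainers.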

abind,
boot,
R6,
2 changes: 2 additions & 0 deletions NAMESPACE
@@ -1,5 +1,7 @@
exportPattern("^cf.*")
exportPattern("*.cf")
exportPattern("^rw.*")
exportPattern("*.rw")
exportPattern("^plot.*")
exportPattern("^analysis_*")
exportPattern("^print.*")
211 changes: 207 additions & 4 deletions R/cf.R
@@ -47,7 +47,8 @@ cf_meta <- function (.cf = cf(), nrObs = 1, Time = NA, nrStypes = 1, symmetrised
return (.cf)
}

#' Bootstrapped CF mixin constructor

#' Bootstrapped CF mixin constructor
#'
#' @param .cf `cf` object to extend.
#' @param boot.R Integer, number of bootstrap samples used.
@@ -108,6 +109,7 @@ cf_boot <- function (.cf = cf(), boot.R, boot.l, seed, sim, cf.tsboot, icf.tsboo
return (.cf)
}


#' Estimates error from jackknife samples
#'
#' Currently this uses the mean over the jackknife samples in order to compute
@@ -374,6 +376,78 @@ gen.block.array <- function(n, R, l, endcorr=TRUE) {
return(list(starts = st, lengths = lens))
}

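#' Computes the bootstrap samples for a reweighted correlation function
#'
#' @param cf `cf` object.
#' @param rw `rw` object.
#' @param boot.R Integer, number of bootstrap samples.
#' @param boot.l Integer, block length.
#' @param seed Integer, seed for the random number generator.
#' @param sim String, passed on to `boot::tsboot` as `sim`.
#' @param endcorr Boolean, passed on to `boot::tsboot` as `endcorr`.
#' @export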
bootstrap_rw.cf <- function(cf, rw, boot.R=400, boot.l=2, seed=1234, sim="geom", endcorr=TRUE) {
stopifnot(inherits(cf, 'cf_orig'))
stopifnot(inherits(rw, 'rw_orig'))
stopifnot(inherits(rw, 'rw_meta'))
stopifnot(inherits(cf, 'cf_indexed'))


## We should also check that the cf object and the rw object contain the same gauge configurations

stopifnot(rw$conf.index == cf$conf.index)
Member

I think this should be wrapped in an all
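
A minimal sketch of a stricter check follows. `stopifnot()` already fails if any element of a logical vector is not `TRUE`, so wrapping the comparison in `all()` mainly makes the intent explicit. What the element-wise comparison does not catch is a length mismatch that is hidden by recycling, so the lengths are compared first. The helper name `same_conf_index` is hypothetical.

```r
# Hypothetical helper: TRUE only if both index vectors have the same
# length and agree element by element.
same_conf_index <- function(a, b) {
  length(a) == length(b) && all(a == b)
}

a <- c(1, 2, 1, 2)
b <- c(1, 2)

# b is recycled to c(1, 2, 1, 2), so the plain comparison passes
# although the lengths differ.
recycled_ok <- all(a == b)

# The helper rejects the pair because the lengths differ.
strict_ok <- same_conf_index(a, b)
```

Inside the function this would read `stopifnot(same_conf_index(rw$conf.index, cf$conf.index))`.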


stopifnot( nrow(cf$cf) == length(rw$conf.index) )

boot.l <- ceiling(boot.l)
boot.R <- floor(boot.R)

stopifnot(boot.l >= 1)
stopifnot(boot.l <= nrow(cf$cf))
stopifnot(boot.R >= 1)

##Construct correlation function for the reweighting samples
rw_cf <- cf
rw_cf$cf <- replicate(ncol(cf$cf), rw$rw)

## we set the seed for reproducibility and correlation
old_seed <- swap_seed(seed)

## now we bootstrap the correlators*reweighting factor
rwcf.tsboot <- boot::tsboot(cf$cf*rw_cf$cf, statistic = function(x){ return(apply(x, MARGIN=2L, FUN=mean))},
R = boot.R, l=boot.l, sim=sim, endcorr=endcorr)


restore_seed(old_seed)

## we set the same seed again for reproducibility and correlation
old_seed <- swap_seed(seed)

## now we bootstrap the reweighting factor
rw.tsboot <- boot::tsboot(rw_cf$cf, statistic = function(x){ return(apply(x, MARGIN=2L, FUN=mean))},
R = boot.R, l=boot.l, sim=sim, endcorr=endcorr)


rwcf.tsboot$t0<- rwcf.tsboot$t0/rw.tsboot$t0
rwcf.tsboot$t <- rwcf.tsboot$t/rw.tsboot$t

cf <- cf_boot(cf,
boot.R = boot.R,
boot.l = boot.l,
seed = seed,
sim = sim,
cf.tsboot = rwcf.tsboot)

class(cf) <- append(class(cf), 'cfrw_boot')

class(cf) <- setdiff(class(cf), 'cf_orig')


restore_seed(old_seed)

return(invisible(cf))
}

bootstrap.cf <- function(cf, boot.R=400, boot.l=2, seed=1234, sim="geom", endcorr=TRUE) {
stopifnot(inherits(cf, 'cf_orig'))

@@ -412,6 +486,7 @@ bootstrap.cf <- function(cf, boot.R=400, boot.l=2, seed=1234, sim="geom", endcor
return(invisible(cf))
}


jackknife.cf <- function(cf, boot.l = 1) {
stopifnot(inherits(cf, 'cf_orig'))

@@ -469,8 +544,83 @@ jackknife.cf <- function(cf, boot.l = 1) {
resampling_method = 'jackknife')

return (invisible(cf))

}

#' Computes the jackknife samples for a reweighted correlation function
#'
#' @param cf `cf` object.
#' @param rw `rw` object.
#' @param boot.l Integer, block length.
#' @export
jackknife_rw.cf <- function(cf, rw, boot.l = 1) {
stopifnot(inherits(cf, 'cf_orig'))
stopifnot(inherits(rw, 'rw_orig'))
stopifnot(inherits(rw, 'rw_meta'))
stopifnot(inherits(cf, 'cf_indexed'))

## Check that the cf object and the rw object contain the same gauge configurations
stopifnot(all(rw$conf.index == cf$conf.index))

stopifnot( nrow(cf$cf) == length(rw$conf.index) )


stopifnot(boot.l >= 1)
boot.l <- ceiling(boot.l)

##Construct correlation function for the reweighting samples
rw_cf <- cf
rw_cf$cf <- replicate(ncol(cf$cf), rw$rw)


## blocking with fixed block length, but overlapping blocks
## number of observations
n <- nrow(cf$cf)
## number of overlapping blocks
N <- n-boot.l+1


numerator <- apply(cf$cf*rw_cf$cf, 2, mean)
denominator <- apply(rw_cf$cf, 2, mean)
t0 <- numerator/denominator

t <- array(NA, dim = c(N, ncol(cf$cf)))
for (i in 1:N) {
## The measurements that we are going to leave out.
ii <- c(i:(i+boot.l-1))
## jackknife replications of the reweighted mean
numerator <- apply(cf$cf[-ii, , drop=FALSE] * rw_cf$cf[-ii, , drop=FALSE], 2L, mean)
denominator <- apply(rw_cf$cf[-ii, , drop=FALSE], 2L, mean)
t[i, ] <- numerator/denominator
}


cf <- invalidate.samples.cf(cf)

cf.tsboot <- list(t = t,
t0 = t0,
R = N,
l = boot.l)


cf <- cf_boot(cf,
boot.R = cf.tsboot$R,
Member
@kostrzewa Jan 6, 2020

this now needs endcorr to be specified explicitly

boot.l = cf.tsboot$l,
seed = 0,
sim = 'geom',
cf.tsboot = cf.tsboot,
resampling_method = 'jackknife')

class(cf) <- append(class(cf), 'cfrw_boot')

class(cf) <- setdiff(class(cf), 'cf_orig')

return (invisible(cf))
}
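
The estimator used in both `bootstrap_rw.cf` and `jackknife_rw.cf` is the ratio of the mean of correlator times reweighting factor to the mean of the reweighting factor. The stand-alone sketch below shows a delete-block jackknife of that ratio on plain matrices, without the `cf` and `rw` classes. The function name `jackknife_reweighted` and its arguments are hypothetical; it is meant as an illustration of the technique, not as the package implementation.

```r
# Hypothetical stand-alone delete-block jackknife of the reweighted mean
# mean(obs * w) / mean(w).
# obs: n x T matrix, one row per gauge configuration
# w:   vector of length n with the reweighting factors
# l:   block length
jackknife_reweighted <- function(obs, w, l = 1) {
  stopifnot(is.matrix(obs), length(w) == nrow(obs), l >= 1)
  n <- nrow(obs)
  stopifnot(n > l)
  N <- n - l + 1                        # number of overlapping blocks
  # obs * w multiplies row i of obs by w[i] (column-major recycling)
  t0 <- colMeans(obs * w) / mean(w)     # estimate on the full ensemble
  t <- matrix(NA_real_, nrow = N, ncol = ncol(obs))
  for (i in seq_len(N)) {
    ii <- i:(i + l - 1)                 # measurements left out in this replication
    keep_obs <- obs[-ii, , drop = FALSE]
    keep_w <- w[-ii]
    t[i, ] <- colMeans(keep_obs * keep_w) / mean(keep_w)
  }
  list(t0 = t0, t = t)
}

obs <- matrix(c(1, 2, 3, 4, 10, 20, 30, 40), nrow = 4)
res <- jackknife_reweighted(obs, w = c(2, 1, 1, 1))
```

Numerator and denominator are computed on the same retained configurations, which keeps their correlation intact in every replication.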
# Gamma method analysis on all time-slices in a 'cf' object
#' uwerr.cf
#' @description
#' Gamma method analysis on all time-slices in a 'cf' object
@@ -513,10 +663,31 @@ addConfIndex2cf <- function(cf, conf.index) {
if(is.null(cf$conf.index)) {
cf$conf.index <- conf.index
}
class(cf) <- append(class(cf), 'cf_indexed')
return(cf)
}

addStat.cf <- function(cf1, cf2) {
#' Combine correlation functions from different replicas
#'
#' @param cf1 `cf` object: correlation function for replicum A
#' @param cf2 `cf` object: correlation function for replicum B
#' @param reverse1 `boolean`. If `TRUE`, the order of the measurements in
#'                 `cf1` is reversed before combining. After the bifurcation
#'                 point one of the replicas (chain of correlation
#'                 functions in simulation time) has to be reversed.
#' @param reverse2 `boolean`. Same as `reverse1`, but for `cf2`.
#'
#' @details
#' Suppose we have correlation functions in replicum A from 0 to 500
#' in steps of 4 and in replicum B from 4 to 500 in steps of 4.
#' To combine the two replicas we have to use
#' `addStat.cf(cf_replicumB, cf_replicumA, TRUE, FALSE)`,
#' which means
#' combined = (cf500 from B, cf496 from B, ..., cf004 from B, cf000 from A, ...,
#' cf500 from A).
#' @export
addStat.cf <- function(cf1, cf2,reverse1=FALSE, reverse2=FALSE) {
stopifnot(inherits(cf1, 'cf'))
stopifnot(inherits(cf2, 'cf'))

@@ -530,15 +701,47 @@ addStat.cf <- function(cf1, cf2) {
stopifnot(inherits(cf1, 'cf_meta'))
stopifnot(inherits(cf2, 'cf_meta'))

##Either both should have an index or none of them
stopifnot(inherits(cf1, 'cf_indexed') == inherits(cf2, 'cf_indexed') )

stopifnot(cf1$Time == cf2$Time)
stopifnot(dim(cf1$cf)[2] == dim(cf2$cf)[2])
stopifnot(cf1$nrObs == cf2$nrObs )
stopifnot(cf1$nrStypes == cf2$nrStypes)

cf <- cf1

cf$cf <- rbind(cf1$cf, cf2$cf)
cf$icf <- rbind(cf1$icf, cf2$icf)
cf1_temp<- cf1$cf
icf1_temp <- cf1$icf
if (reverse1 == TRUE){
apply(cf1_temp,2,rev)
Member

this isn't assigned to anything?

if ( has_icf(cf1)){
apply(icf1_temp,2,rev)
Member

none of these apply calls are assigned to anything...
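
To illustrate the point of the two comments above: `apply()` returns a new object and leaves its input untouched, so the reversal only takes effect if the result is assigned. The sketch below uses plain matrices; the helper name `reverse_rows` is hypothetical.

```r
# Hypothetical helper: reverse the row order of a matrix (rows index the
# gauge configurations) and return the reversed copy.
reverse_rows <- function(m) {
  m[rev(seq_len(nrow(m))), , drop = FALSE]
}

m <- matrix(1:6, nrow = 3)

# Returns a reversed copy; m itself is unchanged because nothing is assigned.
unused <- apply(m, 2, rev)

# The result has to be assigned for the reversal to take effect.
m_reversed <- reverse_rows(m)
```

Applied to the code under review this would mean `cf1_temp <- apply(cf1_temp, 2, rev)` and likewise for `icf1_temp`, `cf2_temp` and `icf2_temp`. Index-based reversal as in `reverse_rows` also keeps the matrix shape when there is only a single row, where `apply()` would return a plain vector.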

}
}
cf2_temp <- cf2$cf
icf2_temp <- cf2$icf
if (reverse2 == TRUE){
apply(cf2_temp,2,rev)
if ( has_icf(cf2)){
apply(icf2_temp,2,rev)
}
}
if (inherits(cf1, 'cf_indexed')){
conflist_temp1 <- cf1$conf.index
if (reverse1 == TRUE){
conflist_temp1 <- rev(conflist_temp1)
}
conflist_temp2 <- cf2$conf.index
if (reverse2 == TRUE){
conflist_temp2 <- rev(conflist_temp2)
}
cf$conf.index <- c(conflist_temp1,conflist_temp2)
}


cf$cf <- rbind(cf1_temp, cf2_temp)
cf$icf <- rbind(icf1_temp, icf2_temp)

cf <- invalidate.samples.cf(cf)

43 changes: 43 additions & 0 deletions R/readutils.R
@@ -347,7 +347,50 @@ readtextcf <- function(file, T=48, sym=TRUE, path="", skip=1, check.t=0, ind.vec

return (invisible(ret))
}
#' @title reading reweighting factors for a list of gauge configurations
#' and random samples from ASCII files
#' @param file_names_to_read list of filenames for the reweighting factors
#' @param gauge_conf_list a list of integers with the indices of the gauge configs
#' @param nsamples number of stochastic samples used for computing the reweighting factors
#' @param monomial_id id of the monomial for which the reweighting factor is selected
read.rw <- function( file_names_to_read, gauge_conf_list, nsamples, monomial_id )
{
stopifnot(length(gauge_conf_list)==length(file_names_to_read))
ret <- rw_meta(conf.index=gauge_conf_list)
tmp <- readcmidatafiles(files=file_names_to_read,skip=0,verbose=TRUE,colClasses=c("integer","integer","numeric","numeric","numeric","numeric","numeric"))
kostrzewa marked this conversation as resolved.
names(tmp)[1] <- "monomialid"
names(tmp)[2] <- "stochastic_index"
names(tmp)[3] <- "kappa_target"
names(tmp)[4] <- "kappa_original"
names(tmp)[5] <- "light_quark_mass_target"
names(tmp)[6] <- "light_quark_mass_original"
names(tmp)[7] <- "reweightingfactor"

# Select the reweighting factor for a particular monomial

dplyr_avail <- requireNamespace("dplyr")
kostrzewa marked this conversation as resolved.
stopifnot(dplyr_avail)

tmp <- dplyr::filter(tmp,monomialid==monomial_id)
kostrzewa marked this conversation as resolved.

# Number of reweighted determinants for each gauge configuration

n_rew_factors <- length(tmp$reweightingfactor)/(nsamples*length(gauge_conf_list))
stopifnot(n_rew_factors == 1)


# Exponentiating and averaging over the stochastic samples

tmp2 <- matrix(tmp$reweightingfactor,nrow=nsamples,ncol=length(gauge_conf_list)*n_rew_factors)
tmp3 <- apply(exp(-tmp2),2,mean)

# Normalize the largest reweighting factor to be one and store this factor;
# this is necessary due to the large value of the reweighting factor
# after exponentiating
tmp4 <- tmp3/max(tmp3)

ret <- rw_orig(ret, rw = tmp4, conf.index=gauge_conf_list, max_value = max(tmp3))

return (invisible(ret))
}
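
The normalisation in `read.rw` is applied after exponentiating, so `exp(-tmp2)` can already overflow for strongly negative exponents. A possible alternative, sketched below and not what the PR implements, subtracts a common shift before exponentiating. The normalised factors are mathematically the same because the common factor cancels in the ratio. The function name `normalised_rw` is hypothetical; the matrix layout (stochastic samples in rows, gauge configurations in columns) follows `tmp2` above.

```r
# Hypothetical helper, not the PR's implementation: average exp(-x) over the
# stochastic samples (rows) for every gauge configuration (columns) and
# normalise the largest factor to one.
normalised_rw <- function(x) {
  stopifnot(is.matrix(x))
  shift <- min(x)                     # the largest exp(-(x - shift)) is exp(0) = 1
  rw <- colMeans(exp(-(x - shift)))   # equals colMeans(exp(-x)) * exp(shift)
  rw / max(rw)                        # the common factor exp(shift) cancels here
}

# Exponents for which exp(-x) overflows in double precision.
big <- matrix(c(-800, -799, -798, -797), nrow = 2)
rw_big <- normalised_rw(big)
```

If the overall normalisation is needed later, its logarithm `log(max(rw)) - shift` could be stored in place of `max_value`, under the same assumptions.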
#' @title reader for Nissa text format correlation functions
#' @param file_basenames_to_read Character vector of file names without the
#' smearing combination suffixes (such as 'll', 'ls', 'sl', 'ss')