rake task for dumping, restoring and anonymizing user data #1013

Open
wants to merge 7 commits into
base: master
63 changes: 63 additions & 0 deletions backend/lib/tasks/db.rake
@@ -0,0 +1,63 @@
namespace :db do
  desc 'replace user sensitive data with placeholders'
  task anonymize_user: :environment do
    if ENV['RAILS_ENV'] == 'production'
      puts 'You do not want to anonymize production data'
    else
      puts 'Anonymizing user data'
      @user_ids = User.ids.shuffle
      Broadcast.find_each do |broadcast|
        broadcast.creator_id = @user_ids.sample
Owner

Hm, I would not do that because it gets confusing. E.g. @ciremoussadia is working on a PR where we update the user's role when they create a broadcast.

Owner

I don't know how to do proper anonymization 😆 Maybe you can research whether the user id itself is a means of de-anonymization and, if so, what the counter-measures are?

Maybe it's not necessary after all? 🤷‍♂️
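
On the id question, one possible counter-measure is to remap ids through a single random permutation, so references such as `creator_id` stay consistent with each other instead of being sampled independently per broadcast. A minimal sketch in plain Ruby, assuming integer ids; `build_id_mapping` and `remap` are hypothetical names, not part of this PR:

```ruby
# Hypothetical helper (not part of this PR): pair every id with a randomly
# chosen id from the same set, so the result is a permutation of the ids.
def build_id_mapping(ids, random: Random.new)
  ids.zip(ids.shuffle(random: random)).to_h
end

# Hypothetical helper: translate a foreign key through the mapping,
# leaving nil (no creator) untouched.
def remap(mapping, id)
  id.nil? ? nil : mapping.fetch(id)
end

# Example ids are made up for illustration.
mapping = build_id_mapping([1, 2, 3, 4], random: Random.new(42))
creator_ids = [1, 1, 3, nil]
remapped = creator_ids.map { |id| remap(mapping, id) }
```

Two broadcasts that shared a creator before the remap still share one afterwards, which keeps role logic testable. Whether this is enough for anonymization depends on what else in the dump correlates with the id.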

        broadcast.save!
      end
      User.find_each do |user|
        new_id = @user_ids.pop

        user.encrypted_password = user.encrypted_password.truncate(8)
        user.auth0_uid = user.auth0_uid.truncate(15) + new_id.to_s if user.auth0_uid
        user.latitude = 50 + 0.001 * new_id
        user.longitude = 10 - 0.001 * new_id
        user.city = "city#{new_id}"
        user.postal_code = 10000 + new_id

        user.email = "user#{user.id}@example.org"
        user.save!
      end
Owner

@dkuku you can probably update all the users in one SQL statement: https://apidock.com/rails/ActiveRecord/Base/update_all/class

User.update_all(encrypted_password: 'xxxxxx')
User.update_all('latitude = 50 + 0.001 * id')
# ... etc
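
For reference, `update_all` accepts either a hash of column values or a SQL fragment for the SET clause, so expressions that depend on each row's id need the string form. A rough sketch, assuming the column names used in the task above and a database that supports `CONCAT`; the `ANONYMIZE_SET_CLAUSE` constant and the `anonymize_users` helper are made up for illustration:

```ruby
# Illustrative only: one SET clause that derives every placeholder from the row id.
ANONYMIZE_SET_CLAUSE = [
  "email = CONCAT('user', id, '@example.org')",
  "city = CONCAT('city', id)",
  'latitude = 50 + 0.001 * id',
  'longitude = 10 - 0.001 * id'
].join(', ')

# Hypothetical helper: `model` is expected to respond to update_all, as an
# ActiveRecord class does. update_all issues a single UPDATE, skips
# validations and callbacks, and returns the number of affected rows.
def anonymize_users(model)
  model.update_all(ANONYMIZE_SET_CLAUSE)
end
```

Inside the rake task this would be called as `anonymize_users(User)`. Because callbacks are skipped, anything the app relies on in `before_save` hooks would not run.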

      Rake::Task['db:dump'].invoke
    end
  end

  desc 'Dumps the database to db/APP_NAME.dump'
  task dump: :environment do
    app, host, db, user = with_config
    cmd = "pg_dump --host #{host} #{user_present?(user)} --verbose --clean --no-owner --no-acl --format=c #{db} > #{Rails.root}/db/#{app}.dump"
    puts cmd
    exec cmd
  end

  desc 'Restores the database dump at db/APP_NAME.dump.'
  task restore: :environment do
    app, host, db, user = with_config
    cmd = "pg_restore --verbose --host #{host} #{user_present?(user)} --clean --no-owner --no-acl --dbname #{db} #{Rails.root}/db/#{app}.dump"
    Rake::Task['db:drop'].invoke
    Rake::Task['db:create'].invoke
    Rake::Task['db:migrate'].invoke
    puts cmd
    exec cmd
  end

  private

  def user_present?(user)
    user ? '--username ' + user : nil
  end

  def with_config
    [
      Rails.application.class.parent_name.underscore,
      ActiveRecord::Base.connection_config[:host],
      ActiveRecord::Base.connection_config[:database],
      ActiveRecord::Base.connection_config[:username]
    ]
  end
end
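
The dump and restore commands above are built by string interpolation and handed to a shell. A possible alternative, sketched here under the assumption that the same config values are available, is to build an argument array so that a host, user name, or path containing spaces or shell metacharacters is passed through untouched. `pg_dump_args` is a hypothetical helper, the example values are placeholders, and `--file` stands in for the shell redirect:

```ruby
# Hypothetical helper (not in this PR): returns the pg_dump invocation as an
# argument list instead of a single shell string.
def pg_dump_args(host:, db:, user:, path:)
  args = ['pg_dump', '--verbose', '--clean', '--no-owner', '--no-acl', '--format=c']
  args += ['--host', host] if host
  args += ['--username', user] if user
  args + ['--file', path, db]
end

# Placeholder values for illustration only.
args = pg_dump_args(host: 'localhost', db: 'app_development', user: nil, path: 'db/app.dump')

# When given several arguments, Kernel#system and Kernel#exec run the program
# directly without going through a shell:
# system(*args)
```

Optional flags are simply left out when the value is nil, which replaces the `user_present?` string helper in this variant.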